How we got a 10x Performance boost at Radio Free Asia

flipmcf · July 25, 2017, 8:13pm

This is basically a "Blog Post".

My sysadmin came in and told me about how slow the site was running now. He got tired of spinning up new instances and tweaking the cache. Our contributors were also complaining about how slow the content adds and edits were going. Those complaints are really bad, because the whole organization will now think Plone is horrible and we need a different CMS.

So, I was tasked with making things faster.

Also, I was given some hints:

Removing the "twitter card" and "open graph" tags from the page template would significantly improve performance.
Requesting and rendering a "Story" (much like the default document type) would also render the main portal page.

This is just weird, and wrong. So I took out the big guns (cProfile and kCacheGrind) and started profiling our TAL templates.

For background, note we use TAL/METAL exclusively for all our templating needs at RFA, but the lessons learned here can certainly be applied to chamelion + diazo implementations.

Without going into the whole "how to profile" subject, which is a talk in itself, I'll jump right to the conclusion and best practice result:

Don't use a TAL path expressions when you can use a python expression
-- Phillip Bauer

and a bit more specific:

Never, ever use a TAL expression in a tal:condition
-- Me

This might be a bit broad, there may be cases where you want to break that rule, but you should know exactly why you are breaking that rule, and have a very strong understanding about how TAL expressions are evaluated before breaking it.

So, how can a simple "twitter card" bring a Plone site to it's knees? That sounds kind of severe, doesn't it?

Consider this handy macro that gets pulled into stories, so we get pretty twitter cards when our stories are shared on twitter:

<metal:macro metal:define-macro="story_twitter">
    <tal:x tal:define="image context/getFeaturedImage | nothing;
                       description context/Description;
		               text context/text">
     <meta property="twitter:card" content="summary_large_image"/>
     <meta property="twitter:site" content="@RadioFreeAsia"/>
     <meta property="twitter:url" content="" 
                     tal:attributes="content context/absolute_url"/>
     <meta property="twitter:domain" content="rfa.org"/>
     <meta property="twitter:title" content="" 
                    tal:attributes="content context/Title"/>
     <meta property="twitter:description" content="" 
                     tal:condition="description"
                     tal:attributes="content description"/>
     <meta property="twitter:description" content="" 
                     tal:condition="not: description"
                     tal:attributes="content text"/>

     <tal:if tal:condition="image">
       <meta property="twitter:image" content="" 
                       tal:attributes="content image/absolute_url"/>
       <meta property="twitter:image:src" content=""
                       tal:attributes="content image/absolute_url"/>
     </tal:if>
   </tal:x>
</metal:macro>

Seems pretty straight forward. The context is a story, we get the story's image, it's descrption, text, and it's url. Then we just fill out the twitter card meta tags.
We also conditionally add an image property, because we might not have an image.
We don't even use a browser view - we just use the context.

The 'image' is an image object, not brains - because it might be scaled. (hint hint)

If we add this macro to our story, it will double the page load time.

What's Going On Here?

How can such simple code cause such a load on the site?

Well, cProfile said that I was spending a whole bunch of time in 'tal:condition'
That doesn't make sense... are you None or are you not? That should be super fast.

cProfile also said I was rendering the main template all over again.

Digging deeper, I was rendering our default view of an image, which I discovered is some strange template somewhere I still haven't found that happily adds a header, footer, featured stories, and all sorts of crap just to show you an image.

If you haven't figured it out yet, %99 of the work is done here:

<tal:if tal:condition="image">

Why?

Because:

It's a TAL expression and
'image' is callable.

if you 'call' and object, you set off a ton of work looking for a default view of that object, finding a template, and rendering that object in HTML.

This line eventually, after a few million CPU cycles, results in this:

<tal:if tal:condition="python:str('<!DOCTYPE HTML PUBLIC ..... and a really long string of HTML')

Which evaluates to True - because it is not an empty string
(cue laughter)

This is the most wonderfully inefficient way to cast an object to a bool I have ever encountered in my career.

This is also the most awesome violation of the 'explicit is better than implicit' philosophy of python: If you want to 'call' something, you should explicitly 'call' it.

TAL takes a different approach: If you don't want to call something, explicitly don't call it: add nocall:

<tal:if tal:condition="nocall:image">

Wow, I just sped up our site by a whole factor of two!

Now back to what Phillip Bauer said, and put it to practice:

<tal:if tal:condition="python:image">

Or even better for those who like to be explicit - like me:

<tal:if tal:condition="python:image is not None">

From here, I cleaned up the entire macro - making everything a python expression where I could.
I then searched through all my templates for such behavior, and found that I have been doing this kind of thing way too often to admit.

Just this one kind of 'bare object token' in TAL - removing this one kind of sin, has sped up our site by over a factor of 10.
I would consider this part of Donald Knuth's "critical 3%" that should not be overlooked when using TALES

Quite embarrassing, but at the same time, quite a relief that it's not Plone's fault - it's my own.

No, I'll blame it on the TAL syntax - that sounds better.

If interested, I'll add another post on how to do performance testing on templates. It's not hard.

zopyx · July 25, 2017, 8:36pm

Your observations are totally right and the issue with the image in a TAL expression is a common mistake. I would not condemn TAL expressions totally..often more readable than a related python expression. python expressions that are often evaluated e.g. in a loop for rendering table rows are accumulated much faster than TAL expressions. General rule of thumb: move your logic into a browser view and prepare everything for rendering..Keep your rendering template as dumb as possible.

-aj

flipmcf · July 25, 2017, 8:42pm

So you're saying it is my fault?

cdw9 · July 25, 2017, 8:44pm

A similar tip: don't call getObject() unless you absolutely need it. I've run across code in the view class for a folder listing, that was calling getObject() twice on all contents of the folder. The page sped up quite a bit when I cleaned those up!

hvelarde · July 25, 2017, 8:44pm

nice post! I'm curious: are you using caching?

just a side advice: don't reinvent the wheel: sc.social.like has been around since Plone 3 and it's still the de-facto standard to include social media metadata and widgets in your site.

your Twitter Card implementation can still be enhanced if you remove all the redundant metadata and just let Twitter use the one from Facebook's Open Graph (we're currently working on that).

also, note that the canonical URL as defined by Facebook has some unintended consequences when you move content around; we tried to address that in latest release, but it still has some bugs (#114, #115) and limitations (#119).

wait for release 2.10.1 later this week if you want to give a try.

flipmcf · July 25, 2017, 8:47pm

Yes. We have a local caching at the origin, and Akamai does the rest.

So our performance hits are mostly seen by the contributor or manager roles. Not the anonymous views.

This kind of practice was all over our site. Twitter cards were just the easiest way to demonstrate it.

Another offender is the 'preoptomization' of looking for TAL variables already set by a callee macro. It's still falls into the category of callables in conditionals:

<tal:x tal:define="foo view/getAnObject">
   <div metal:use-macro='mymacro'/>
</tal:x>

---
<div metal:define-macro='mymacro'>
   <tal:x tal:define="foo foo | view/getAnObject />  <!-- in case the caller forgot to -->

In this case, the 'foo' object set by the calling macro was called and rendered for Truth in the called macro.

hvelarde · July 25, 2017, 8:53pm

you can hide the metadata for the authenticated roles to further enhance performance: they are only useful for Twitter and Facebook's crawlers.

flipmcf · July 25, 2017, 9:13pm

We've tried to keep a cap on that. However, we need to learn how to add metadata to the brain.

For example, our site adds video and audio icons to story listings if the story contains a video or audio.
Unfortunately, to find out if a story has an audio or video can become an expensive call that requires a getObject()

So, any story listing that used to be brain driven now is object driven. (kept in check by a limit of no more than 10 or so objects)

But, what I'd like to do, is create bool metadata on this, and have it returned in the brains.

The project got back burnered. Mostly because the performance lift from such work is not worth it, and would require a reindex once completed - which pisses off sysadmin.

tkimnguyen · July 26, 2017, 2:26am

Sounds like a custom indexer will do this for you.

tkimnguyen · July 26, 2017, 2:27am

Nice work!

djay · July 26, 2017, 3:16am

Actually a custom indexer doesn't help with custom metadata in the brains. They are different things. This is a missing feature. The only way to get a custom Metadata in Plone currently is to have it as an attribute on the object OR via acquisition such as a python script with the same name. ie, if you want has_audio_icon in the brain in a catalog result, add a python script called has_audio_icon to your site root and then add "has_audio_icon" to your portal_catalog metadata.
We need something more flexible right? Something that still works TTW but also via an adapter?

jensens · July 26, 2017, 5:51am

The object is wrapped before it is passed to the ZCatalog, so no, this is the same. The metadata and the indexes are working on the wrapped object and if a custom indexer applies it is effective for metadata as well.

See
https://github.com/zopefoundation/Products.CMFCore/blob/master/Products/CMFCore/CatalogTool.py#L279-L289
for details.

djay · July 26, 2017, 6:07am

@jensens. Thanks. thats good to know.
That means I might be able to make collective.listingviews able to add and remove custom Metadata easily TTW.

pbauer · July 26, 2017, 6:42am

If someone would like to tackle this we could speed up listing quite a it: https://github.com/plone/plone.app.contenttypes/issues/342

tiberiuichim · July 31, 2017, 11:18am

More metadata columns could mean a bigger catalog and less cache hits. For many types of indexes (ex FieldIndex, KeywordIndex) it is possible to read the original value from the index itself, something like this https://play.pixelblaster.ro/blog/2017/01/08/do-you-really-need-that-metadata-column/

pgg · September 19, 2017, 4:57pm

Another hint:
Take a look at your portal_types tool and check, if icon_expr says for some types something like: string:${portal_url}/. What happens with this expression is, that your startpage will be rendered everytime when attempting to get the src for the <img/> tag of the icon. These icons are basically shown multiple times when a logged-in user is browsing. This massively degrades your sites performance!

It might be the case, that you dont use this icon mechanism at all and leave content_icon empty in your types/*.xml. In our setup for instance we use web fonts instead and do not need images. To prevent this problem, refrain from using the deprecrated content_icon as it currently contains a bug (which is gonna be fixed) and use icon_expr instead.

flipmcf · October 10, 2018, 4:18pm

I'm checking this now on my sites, and ELI5'ing your post:

first off, in the ZMI->portal_types and I select a type (say Plone Site - we all should have that one installed) The Icon (Expression) says

string:${portal_url}/site_icon.png

And that's healthy. we have an image at the end of the url.

Now I go and look at 'MyCustomType' and see that the "Icon (Expression)" is string:${portal_url}/

Oops.

And I can see that in MyAddon/profiles/default/types/MyCustomType.xml

<property name="content_icon"></property>

What you are saying is that line needs to be replaced with

<property name="icon_expr"></property>

Correct?

pgg · October 10, 2018, 4:42pm

You are perfectly correct. When either content_icon or icon_expr are empty, you are on the safe side. The main problem comes into your Plone site, when you install your custom type, say MyCustomType, via some Add-on with the file profiles/default/types/MyCustomType.xml.

When there is a line, saying:

<property name="content_icon"></property>

it autogenerates the bad url string:${portal_url}/ which causes the mentioned performance issues. With my fix in GenericSetup this should not happen anymore for future installations, but existing Plone installations might have these urls luring in portal_types. The property icon_expr has not that problem whatsoever and is therefor recommended.