How to analyze Diazo performance issues?

I am current involved with a large Plone 5 project where the customer has a large rules.xml and some large templates. Rendering a page with Diazo takes 20-30 seconds unless you set DIAZO_ALWAYS_CACHE_RULES=1.

Is there any tool for profiling a Diazo theme for figuring out the execution time of each individual rule? Something like PTProfiler? Right now trial & error, binary search etc. seems to be the only appropriate approach for digging into performance issues...but this approach obviously sucks...I need clear numbers...anyone else encountered such huge transformation times with Diazo?

-aj

I had the very same problem some time ago.

FTR: (from my answer at IRC):

First: I added - some time ago - events to the transformchain. Then, I used a hacky subscriber to log the time used for the transform. Plan was to make an addon from it which never happend. See https://github.com/plone/plone.transformchain/blob/master/plone/transformchain/events.py - Meanwhile it is released and part of latest 4.3/5.0/5.1 releases.

Second: I used the diazocompiler to build the XSLT and analysed it by reviewing the output. Its usually far from optimized. A big speed-up (in our case) was to skip using CSS selectors and use as specific as possible XPath rules.

Diazo being a preprocessor can't give much profiling info of course, the real work is done by lxml or more appropriately by libxslt

you could try to setup a breakpoint at the transform point in plone/app/theming/transform.py
Plone code: transformed = transform(result.tree, **params)

then run the transform yourself:

x transformed=transform(result.tree, profile_run=True, **params)
x transformed.xslt_profile.write('/tmp/profile.xml')

you get a file such as

<profile>
 <template rank="1" match="/" name="" mode="r1" calls="1" time="1715" average="1715"/>
 <template rank="2" match="@*|node()" name="" mode="" calls="819" time="541" average="0"/>
 <template rank="3" match="text()" name="" mode="" calls="530" time="464" average="0"/>
 <template rank="4" match="@*|node()" name="" mode="raw" calls="273" time="176" average="0"/>
 <template rank="5" match="text()" name="" mode="raw" calls="168" time="157" average="0"/>
 <template rank="6" match="/" name="" mode="" calls="1" time="71" average="71"/>
 <template rank="7" match="style/text()|script/text()" name="" mode="" calls="1" time="2" average="2"/>
 <template rank="8" match="input[contains(@class, 'searchButton')]" name="" mode="" calls="1" time="2" average="2"/>
 <template rank="9" match="//*[@class and contains(concat(' ', normalize-space(@class), ' '), ' pat-moment ')]/@data-pat-moment" name="" mode="" calls="1" time="0" average="0"/>
</profile>

time are in ms AFAICT.

if you want to see the raw xsl generated by Diazo you can get it by setting a breakpoint at
plone/app/theming/utils.py circa line 610
return etree.XSLT(
compiledTheme,
access_control=accessControl,
)

x compiledTheme.write_output('/tmp/rawxsl.xsl')

It's a bit offtopic regarding the optimization aspect, but I ought to share this thought anyway:
Often times you find yourself situations, where Diazo is the downright inappriate choice to return the payload. For example in AJAX requests, where only small HTML snippets are returned it's not justified to apply the whole Diazo ruleset. More often the contents are put to the right place anyway via JavaScript. So disabling Diazo would massively improves those AJAX-heavy places like folder_contents.

So that's why I thought of another transformation handler, that acts right before Diazo would kick in, and checks if Diazo can be disabled for this request. You can see this gist as proof of concept.

3 Likes

Would you please elaborate more on this? This seems to be an easy way to speed things up. Thanks !

If Ajax calls don't return html, there will not be any attempt at theming (plone/app/theming/transform.py)

def parseTree(self, result):
        contentType = self.request.response.getHeader('Content-Type')
        if contentType is None or not contentType.startswith('text/html'):
             return None

As an exemple when you upload a file to Plone with mockup the status returned by Plone is a small snippet yes, but of Json. No theming at all.

There was a debugger build. not sure if its in there https://shuttlethread.com/blog/cow-tents-a-debugger-for-diazo

But what I used was a plugin that added a response header with timings including that of theming. That helped narrow which pages diazo makes slow, but not which rules. But pretty much any xpath with // near the start is going to be slow. The more specific the faster.

Here is an example implementation for subscribing to all transformation events:

The approach is at least partly useless since it revealed two long running transformations (in particular their handlers). Now can I look deeper using the standard Python profile into both handlers.

The XSLT profile is not really helpful here.

-aj

XPath expressions in general are very fast because the execution engines (lxml here) are highly optimized (also other XSLT/XPath engines like Saxon). In generall google for "css selectors slower than xpath" and find something like

We have made the same experience in a screen scraping projects where CSS selectors are usually slower than XPath...in general I prefer XPath just because XPath is more powerful and more expressive...but also more complex...well..

-aj

A CSS selector (translated by cssselect to a not-that-efficient XPath) usually searches through the whole tree. Even selecting ids seems not to be very fast. Compare to this XPath is highly optimized. Using absolute XPath reduces the sub-tree to search in. The longer the content of the page get, the slower it gets. Compared to this a specific selection speeds up the search for the element.

Unfortunately Diazo duplicates the queries into the tree - dependent on the theme - 100s of times. Here it helps to first assign the result to a xslt variable and use it. This is really (!) a speed-up and I would like to see diazo doing this on its own, because doing the very same query in a loop several times is not very clever.

@pgg to disable diazo for ajax requests, I generally append ajax_load=1 as URL query parameter, add to the themes manifest.cfg

[theme:parameters]
ajax_load = python:request.form.get('ajax_load', None)

and disable diazo in the ruleset via: <notheme if="$ajax_load" />.

In plonetheme.barceloneta there is a <notheme css:if-not-content="#visual-portal-wrapper" /> which should math many ajax requests with HTML response too.

However, I like your approach and there is for sure room for improvement here.

If timing is the problem, we are using Products.LongRequestLogger which works great.

Products.LongRequestLogger is a good option for other performance issues but not when it comes to DIAZO related performance issues.

as long as diazo is part of the zope request (i.e. in plone.app.theming) then yes.

you're running in development mode; in production mode this isn't necessary, according to the documentation.

Hi all. I just figured out changing the etree.LIBXML_COMPILED_VERSION
etree.LIBXML_VERSION
etree.LIBXSLT_COMPILED_VERSION
to (2, 9, 8) (2, 9, 8) (1, 1, 32) made rendering much faster.

Slow values for me where:
(2, 7, 8) (2, 7, 8) (1, 1, 26)

Before: 80 seconds -after 0.4 Seconds

Maybe this helps anybody ? I also wrote an ACCEPTANCE_TEST using ACCEPTANCE_TESTING so this never ever happens to me/us again.

Any pointers where I can insert this test into the plone core are of course more than welcome.

3 Likes

That's interesting - so upgrading the libxml and libxslt helps to get a performance gain of 800x? Incredible.

This is something we may include in the Plone/Diazo install docs. I don't think it makes sense to put these tests in Plone core as we have no control over system libraries installed on the Plone server.

Well, correlation does not imply causation. Maybe something else besides version was also different... When I was using the fast version, I also told buildout to build my own lxml like this:

# lxml should be first in the parts list
parts =
    lxml

[lxml]
recipe = z3c.recipe.staticlxml
egg = lxml

build-libxslt = true
build-libxml2 = true
static-build = true
libxml2-url = http://xmlsoft.org/sources/libxml2-2.9.8.tar.gz
libxslt-url = http://xmlsoft.org/sources/libxslt-1.1.32.tar.gz

Target systems where debian. I discovered it can be faster on my MacOSX where I was able to build a diazorun using Anaconda/2.7.14 ...

Slow setups did not include an lxml part in their buildout.

1 Like

That looks to me like not building with Cython in the build-time environment as lxml wants it vs. having it there appropriately. If you look at the build output, it'll tell you if this is (or is not) the case.

1 Like