So, yes, sometimes Plone is slow. But why?
The following outputs are coming from Py-Spy.
The Own-Time is here of interest.
It is what a function takes without time of the functions it calles.
For simpler later calculations I toggled the output to aggregate by-function instead of the default by-line.
It is a buildout.coredev
on the 6.0
branch as on 2020-02-18.
I let all tests run on my Thinkpad 490s w/o Robot-tests.
After a normal buildout I did pip install py-spy
.
Then I started the tests in py-spy with time py-spy top ./bin/test
.
And had to wait.
Following is the output.
It is test-code, so a running Plone in production has different numbers.
Also the numbers differ from run to run by 1-2%, so take it as rough estimates.
But it gives us hints.
%Own %Total OwnTime TotalTime Function (filename)
0.00% 0.00% 153.9s 153.9s __hash__ (zope/interface/interface.py)
0.00% 0.00% 146.1s 312.6s _lookup (zope/interface/adapter.py)
0.00% 0.00% 55.07s 1470s run_tests (zope/testrunner/runner.py)
0.00% 0.00% 45.54s 45.62s parse (chameleon/astutil.py)
0.00% 0.00% 45.27s 45.27s __hash__ (zope/interface/declarations.py)
0.00% 0.00% 28.79s 121.8s wrapper (chameleon/codegen.py)
0.00% 0.00% 27.33s 65.63s __get__ (plone/dexterity/content.py)
0.00% 0.00% 24.58s 36.95s decorator (plone/dexterity/schema.py)
0.00% 0.00% 24.56s 85.15s visit (chameleon/astutil.py)
0.00% 0.00% 22.08s 55.75s generic_visit (ast.py)
0.00% 0.00% 21.42s 41.05s getState (ZODB/serialize.py)
0.00% 0.00% 17.27s 380.1s queryMultiAdapter (zope/interface/adapter.py)
0.00% 0.00% 17.13s 18.87s rewrap_in_request_container (Products/CMFPlone/patches/gtbn.py)
0.00% 0.00% 16.28s 16.42s build (chameleon/loader.py)
0.00% 0.00% 11.54s 21.26s __call__ (ZODB/utils.py)
0.00% 0.00% 10.99s 11.63s get (weakref.py)
0.00% 0.00% 10.79s 26.47s _dump (ZODB/serialize.py)
0.00% 0.00% 9.69s 9.70s iter_fields (ast.py)
0.00% 0.00% 9.66s 34.84s loadBefore (ZODB/DemoStorage.py)
0.00% 0.00% 9.34s 45.26s __getattr__ (plone/dexterity/content.py)
0.00% 0.00% 9.03s 56.51s adapter_hook (zope/component/hooks.py)
0.00% 0.00% 8.95s 25.43s getFields (zope/schema/_bootstrapfields.py)
0.00% 0.00% 8.76s 57.94s visit (ast.py)
0.00% 0.00% 8.75s 12.07s get (zope/interface/interface.py)
0.00% 0.00% 8.64s 9.54s __iter__ (plone/registry/recordsproxy.py)
0.00% 0.00% 8.57s 8.57s _write (chameleon/astutil.py)
0.00% 0.00% 8.45s 13.06s transformIterable (plone/app/theming/transform.py)
0.00% 0.00% 8.43s 75.38s __init__ (chameleon/astutil.py)
0.00% 0.00% 8.23s 66.34s visit (chameleon/codegen.py)
0.00% 0.00% 7.96s 8.98s names (zope/interface/interface.py)
0.00% 0.00% 7.93s 15.58s persistent_id (ZODB/serialize.py)
0.00% 0.00% 7.88s 18.05s __getitem__ (collections/__init__.py)
0.00% 0.00% 7.69s 7.89s _getOb (OFS/ObjectManager.py)
0.00% 0.00% 7.68s 364.0s __getattr__ (plone/indexer/wrapper.py)
0.00% 0.00% 7.51s 8.76s loadBefore (ZODB/MappingStorage.py)
0.00% 0.00% 6.99s 11.75s lookup_fti (plone/dexterity/schema.py)
0.00% 0.00% 6.69s 19.16s queryUtility (zope/interface/registry.py)
0.00% 0.00% 6.50s 6.91s getPhysicalPath (OFS/Traversable.py)
0.00% 0.00% 6.06s 6.11s load_persistent (ZODB/serialize.py)
0.00% 0.00% 5.77s 10.79s _subscriptions (zope/interface/adapter.py)
0.00% 0.00% 5.69s 6.02s store (ZODB/Connection.py)
0.00% 0.00% 5.36s 5.45s __init__ (ZODB/_compat.py)
0.00% 0.00% 5.30s 32.83s __iter__ (_collections_abc.py)
0.00% 0.00% 5.09s 6.51s dedent (textwrap.py)
0.00% 0.00% 4.88s 4.88s __cmp (zope/interface/interface.py)
0.00% 0.00% 4.80s 4.80s normalize_lang (zope/i18n/negotiator.py)
0.00% 0.00% 4.71s 80.37s setstate (ZODB/Connection.py)
0.00% 0.00% 4.70s 43.60s _store_objects (ZODB/Connection.py)
0.00% 0.00% 4.66s 319.7s _uncached_lookup (zope/interface/adapter.py)
0.00% 0.00% 4.16s 589.1s subscribers (zope/interface/adapter.py)
0.00% 0.00% 4.07s 63.28s __call__ (chameleon/compiler.py)
0.00% 0.00% 3.87s 48.86s setGhostState (ZODB/serialize.py)
0.00% 0.00% 3.75s 3.77s __contains__ (weakref.py)
0.00% 0.00% 3.75s 3.75s records (plone/registry/registry.py)
0.00% 0.00% 3.67s 6.91s rolesForPermissionOn (AccessControl/ImplPython.py)
0.00% 0.00% 3.66s 3.66s _new_line (chameleon/astutil.py)
The output of time:
real 25m3,478s
user 24m36,560s
sys 0m50,579s
Let's reduce this to time in zope.interface:
%Own %Total OwnTime TotalTime Function (filename)
0.00% 0.00% 153.9s 153.9s __hash__ (zope/interface/interface.py)
0.00% 0.00% 146.1s 312.6s _lookup (zope/interface/adapter.py)
0.00% 0.00% 45.27s 45.27s __hash__ (zope/interface/declarations.py)
0.00% 0.00% 17.27s 380.1s queryMultiAdapter (zope/interface/adapter.py)
0.00% 0.00% 8.75s 12.07s get (zope/interface/interface.py)
0.00% 0.00% 7.96s 8.98s names (zope/interface/interface.py)
0.00% 0.00% 6.69s 19.16s queryUtility (zope/interface/registry.py)
0.00% 0.00% 5.77s 10.79s _subscriptions (zope/interface/adapter.py)
0.00% 0.00% 4.88s 4.88s __cmp (zope/interface/interface.py)
0.00% 0.00% 4.66s 319.7s _uncached_lookup (zope/interface/adapter.py)
0.00% 0.00% 4.16s 589.1s subscribers (zope/interface/adapter.py)
This sums up to 405.24s only in zope.interface.
Or ~27.5% of the test runtime! WTF!
Now there were optimizations in zope.interface master done by me and others.
What about using zope.interface master in Plone?
It is not well tested by now with Plone and is highly experimental!
Anyway, there are no unsolved issues blocking us from running the tests.
So I added zope.interface to sources
and auto-checkout.
Then I run py-spy as above, here the reduced output::
%Own %Total OwnTime TotalTime Function (filename)
0.00% 0.00% 96.23s 125.7s _lookup (zope/interface/adapter.py)
0.00% 0.00% 36.35s 36.35s __hash__ (zope/interface/interface.py)
0.00% 0.00% 14.64s 185.2s queryMultiAdapter (zope/interface/adapter.py)
0.00% 0.00% 8.10s 8.58s names (zope/interface/interface.py)
0.00% 0.00% 8.07s 11.29s get (zope/interface/interface.py)
0.00% 0.00% 6.60s 17.36s queryUtility (zope/interface/registry.py)
0.00% 0.00% 4.64s 132.3s _uncached_lookup (zope/interface/adapter.py)
0.00% 0.00% 4.41s 4.41s __cmp (zope/interface/interface.py)
0.00% 0.00% 4.22s 5.19s _subscriptions (zope/interface/adapter.py)
real 21m20,264s
user 20m54,982s
sys 0m43,547s
This sums up to 183.26s or ~14.5% of the tests time.
Much better, in fact 2.2x faster, but not good enough.
Component architecture is a great thing, but not if it slows down the program that hard.
What about implementing _lookup
in C?
I took my 20 years sleeping C knowledge out of some drawers back in my brain and learned about the Python C-API.
The zope.interface branch implement-_lookup-in-C is the outcome.
It works in Python 3.7.9 at least and is far from finished.
Probably someone with better C-Python skills could get more out of it.
Lets see how it performs (reduced output).
%Own %Total OwnTime TotalTime Function (filename)
0.00% 0.00% 63.13s 96.56s _uncached_lookup (zope/interface/adapter.py)
0.00% 0.00% 39.80s 39.80s __hash__ (zope/interface/interface.py)
0.00% 0.00% 15.42s 156.0s queryMultiAdapter (zope/interface/adapter.py)
0.00% 0.00% 9.16s 13.12s get (zope/interface/interface.py)
0.00% 0.00% 9.07s 9.67s names (zope/interface/interface.py)
0.00% 0.00% 7.47s 17.37s queryUtility (zope/interface/registry.py)
0.00% 0.00% 6.83s 6.83s __cmp (zope/interface/interface.py)
0.00% 0.00% 5.10s 6.70s _mergeOrderings (zope/interface/ro.py)
0.00% 0.00% 5.03s 6.13s _flatten (zope/interface/ro.py)
0.00% 0.00% 5.03s 6.18s _subscriptions (zope/interface/adapter.py)
real 23m46,821s
user 23m17,787s
sys 0m52,630s
This sum up to 166s or ~12% of the tests time.
Still a lot, but much better than in current Plone 5.2.1.
More optimizations need to be done!