I've been fiddling with getting our test runner wrapper from https://github.com/4teamwork/opengever.core/blob/master/bin/mtest to run the `buildout.coredev` tests. The original wrapper is by @jone and the current rewrite is mostly mine; the usability improvements and print safety of the current one are in large part thanks to feedback and contributions from @lukasgraf.
Our original problem was very large, long-running layers, so we simply split by module, mapped the modules onto the CPU cores, and made sure each one got a unique `ZSERVER_PORT` (we also used the Jenkins port allocator so multiple builds like that could run in parallel on the same CI server). The issue with the `buildout.coredev` tests is different, but the current wrapper does a half decent job of cutting down their wallclock runtime as well.
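For illustration, the port handling boils down to something like the following. This is a hedged sketch, not the wrapper's actual code: the variable name `ZSERVER_BASE_PORT` and the default base are my own placeholders, and the Jenkins port allocator plugin exports whatever variable name you configure in the job.

```python
import os

DEFAULT_BASE = 55001  # assumption: any otherwise unused port range


def zserver_port(offset, base_var='ZSERVER_BASE_PORT'):
    """Return a unique ZSERVER_PORT for the worker at `offset`.

    On CI, a port allocator can export an allocated base port through
    an environment variable (ZSERVER_BASE_PORT is just this sketch's
    placeholder name). Locally we fall back to a fixed base.
    """
    base = int(os.environ.get(base_var, DEFAULT_BASE))
    return str(base + offset)


# Each parallel runner then gets its own port, e.g.:
# env = dict(os.environ, ZSERVER_PORT=zserver_port(worker_index))
```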
5.0 https://github.com/plone/buildout.coredev/blob/roto-testrunner-wrapper-50/bin/mtest
5.1 https://github.com/plone/buildout.coredev/blob/roto-testrunner-wrapper-51/bin/mtest
5.2 https://github.com/plone/buildout.coredev/blob/roto-testrunner-wrapper-52/bin/mtest
@thet can you give the 5.1 branch a timing run on your server again at this point? Install PhantomJS and run the tests niced.
AFAIK, I'm now rather close to the truth on:

- Discovering all the layers (some layer names have spaces in them!) and tests (a rough sketch of this parsing follows the list):
  - Dexterity tests
  - Archetypes tests
  - Doctests
  - Robot tests
- Detecting whether PhantomJS is installed
- Splitting large layers into multiple test run batches, one per test class
- Assigning a unique `ZSERVER_PORT` to all batches
  - Support for the Jenkins port allocator plugin is already baked in
- If something fails, giving the user the exact command line to copy, paste and run (including single quoting names with spaces in them!)
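To illustrate the discovery and quoting points above, the parsing amounts to roughly this. A hedged sketch with invented function names, not mtest's actual code; it assumes zope.testrunner's `--list-tests` output shape of `Listing <layer> tests:` headers followed by indented test lines, so if the real format differs the regex needs adjusting.

```python
import re
import shlex
import subprocess
from collections import defaultdict


def discover_tests(test_script='bin/test'):
    """Parse `bin/test --list-tests` output into {layer: [test, ...]}.

    Assumes the observed output shape: a "Listing <layer> tests:"
    header per layer, followed by indented test lines. Layer names
    are taken verbatim, so names with spaces survive intact.
    """
    output = subprocess.run(
        [test_script, '--list-tests'],
        capture_output=True, text=True, check=True,
    ).stdout
    layers = defaultdict(list)
    layer = None
    for line in output.splitlines():
        header = re.match(r'^Listing (.+) tests:$', line)
        if header:
            layer = header.group(1)
        elif layer and line.startswith('  ') and line.strip():
            layers[layer].append(line.strip())
    return layers


def rerun_command(layer, test_script='bin/test'):
    # shlex.quote() yields a copy-pasteable command even when the
    # layer name contains spaces.
    return '{} --layer {}'.format(test_script, shlex.quote(layer))
```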
And as a new thing: asserting on an unclean STDERR. This can get annoying if there are mysterious transient failures, like the recursion limits we kept hitting on our opengever.core, which forced us to turn it off there. It has picked up a few curiosities on 5.0 and 5.1, so I'd like people to give this a spin and look at whether those are meaningful or not.
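The concept is simply: a batch counts as failed if anything at all landed on its STDERR, even when the exit code is zero. A minimal sketch with invented helper names, not mtest's internals:

```python
import subprocess


def run_batch(command, env):
    """Run one test batch; treat any STDERR output as a failure,
    even when zope.testrunner itself exits with status 0."""
    result = subprocess.run(command, env=env, capture_output=True, text=True)
    unclean = bool(result.stderr.strip())
    if result.returncode != 0 or unclean:
        # Hand the user the exact command to reproduce the failure.
        print('FAILED:', ' '.join(command))
        if unclean:
            print('Unclean STDERR:\n' + result.stderr)
        return False
    return True
```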
It also commits swift and total infanticide upon `^C`. Thank you @jone for that one.
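For anyone curious what "infanticide" means technically: the general trick (sketched here under POSIX assumptions, not copied from mtest) is to start every batch in its own process group, so that one signal per group reliably takes down the runner plus anything it forked:

```python
import os
import signal
import subprocess

children = []


def spawn(command, env):
    # start_new_session=True puts the child into its own process
    # group, so one signal can later take down the whole group.
    proc = subprocess.Popen(command, env=env, start_new_session=True)
    children.append(proc)
    return proc


def slaughter(signum, frame):
    # On ^C, kill every child's whole process group, then bail out.
    for proc in children:
        try:
            os.killpg(os.getpgid(proc.pid), signal.SIGKILL)
        except ProcessLookupError:
            pass
    raise SystemExit(1)


signal.signal(signal.SIGINT, slaughter)
```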
What is still missing is figuring out the required XML output for Jenkins to parse. @gforcada would you know where that stuff is defined (flushed to where and picked up on what condition)?
For giving it a spin, I recommend the branch based on 5.1, as STDERR on 5.2 is noisy with deprecation warnings (as is to be expected at this point). Check out the `roto-testrunner-wrapper-51` branch and run `bin/mtest`. There is also a `-d` flag for more verbose output (including runtime stats per batch) and a `-j` flag for controlling the parallelism.
Current strategy:

```
time (export ROBOT_BROWSER='phantomjs' && bin/alltests --all; bin/alltests-at --all)
```

Parallel via `zope.testrunner`:

```
time (export ROBOT_BROWSER='phantomjs' && export NPROC="$(getconf _NPROCESSORS_ONLN)" && bin/alltests --all -j"$NPROC"; bin/alltests-at --all -j"$NPROC")
```

Normal mtest:

```
bin/mtest
```

Debug mtest with monotonic logging and streaming debug output:

```
bin/mtest -d
```
So how, why, what? The `bin/alltests` runners run things in groups, and the `-j` flag of `zope.testrunner` only splits by layer, not within layers. This means most of the time we'd be running 1..3 layers in parallel while a group runs, and most groups only have one layer in them. So this works by trading that idle CPU time for wasted, duplicated CPU time and gaining a wallclock runtime improvement in return. The only blocker to running all of the layers in parallel with each other seemed to be the `ZSERVER_PORT` allocation.
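In other words, the core loop is conceptually something like this. A simplified sketch with invented names, not the real mtest: split the suite into per-class `zope.testrunner` invocations, give each one a unique `ZSERVER_PORT` (see the port sketch further up), and map them onto a worker pool. The module and class names in the example batches are purely illustrative.

```python
import os
import subprocess
from multiprocessing.pool import ThreadPool

BASE_PORT = 55001  # assumption: any otherwise unused port range


def run_batch(args):
    index, command = args
    # Every batch gets its own ZServer port, so all layers can run
    # concurrently without fighting over the socket.
    env = dict(os.environ, ZSERVER_PORT=str(BASE_PORT + index))
    result = subprocess.run(command, env=env, capture_output=True, text=True)
    return command, result.returncode, result.stderr


def run_all(batches, jobs=None):
    # Threads suffice as workers: each one just blocks on its subprocess.
    with ThreadPool(jobs or os.cpu_count()) as pool:
        failed = []
        for command, code, stderr in pool.imap_unordered(
                run_batch, enumerate(batches)):
            if code != 0 or stderr.strip():
                failed.append(command)
                print('FAILED:', ' '.join(command))
        return failed


# Example batches, one zope.testrunner invocation per test class:
batches = [
    ['bin/test', '-m', 'plone.app.contenttypes', '-t', 'TestFolder'],
    ['bin/test', '-m', 'plone.app.contenttypes', '-t', 'TestDocument'],
]
run_all(batches)
```

The layer setup being repeated for every batch is exactly the duplicated CPU time traded away above.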
If one runs all four variants from above for comparison, the ratio of wallclock to CPU time visualizes the core concept at play: a `real` time far below the `user` time means many cores were kept busy, while nearly equal `real` and `user` times mean the run was essentially serial.
I'd eventually like to bring parallelization improvements in some form upstream to `zope.testrunner`, but that's still a rather steep uphill climb at this point.
Pain points:

- Layer setup duplication is a lot of unnecessary CPU time
  - There are many ways around this, but all require further exploration
- The test discovery parsing from subprocess output is an ugly idea, but it got me started and this far
- The log timestamps on the normal run are not monotonic, per how the logger flushes (one possible fix is sketched after this list)
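On that last point, the usual fix (a sketch of one possible approach, not necessarily what `mtest -d` does) is to route all worker output through a single writer that stamps lines with a monotonic clock at the moment of writing:

```python
import sys
import time
from multiprocessing import Queue
from threading import Thread

log_queue = Queue()


def log(message):
    # Workers only enqueue; they never touch the shared stream.
    log_queue.put(message)


def writer():
    # A single consumer stamps and flushes, so the printed timestamps
    # can only ever increase, regardless of worker scheduling.
    start = time.monotonic()
    while True:
        message = log_queue.get()
        if message is None:  # sentinel: shut down the writer
            break
        sys.stdout.write('[%8.3f] %s\n' % (time.monotonic() - start, message))
        sys.stdout.flush()


Thread(target=writer, daemon=True).start()
```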
It would seem that even as it is now, having CI runs complete in under 10 minutes should be rather possible, but there is still some work to be done to get to even that point. Feedback on the current status is most welcome.