ZServer vs. WSGI?

This is a followup of an earlier discussion.

Both ZServer and the WSGI server run perfectly fine with Python 3 and Plone 5.2.

Are there any recommendations or advantages to using WSGI over ZServer in a standard production rollout with multiple ZEO clients? For an ongoing project we use 5 clients with 2 ZServer threads each. Moving to WSGI would require running 10 clients with 1 thread (which cannot be changed).

The RAM allocated by 5 instances x 2 threads is likely similar to 10 instances x 1 thread.

However, the baseline memory usage of the instances themselves would double, since there would be twice as many processes.

Thoughts?

Why must the WSGI setup be different from the ZServer one? (I assume you would not be using waitress for WSGI.)

There is no must; I am just asking about any real-world benefits of using WSGI.

But why would WSGI mean "10 clients with 1 thread (which cannot be changed)"? E.g. plone.recipe.zope2instance with waitress seems to have a default of 4 threads (which is probably a good default with waitress, because streaming a blob requires a thread there).

I assume that one benefit of WSGI is the option to use the HTTP server of your choice, but I also assume that the ZODB connection cache limits the number of servers that are practical with Plone.

So, I will continue to follow this thread with interest.

The documentation says

"""
zserver-threads
Specify the number of threads that Zope’s ZServer web server will use to service requests. The recipes default is 2. Used for ZServer only, not WSGI.
"""

threads is not explained in the docs; it is only mentioned in the changelog. So this is a documentation issue.
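
For reference, a minimal buildout sketch of how the two options relate in plone.recipe.zope2instance; the values are illustrative, and normally you would only set the option that matches your server type:

    [instance]
    recipe = plone.recipe.zope2instance
    eggs = Plone
    user = admin:admin
    # ZServer setup: worker threads for Zope's ZServer (the documented option quoted above)
    zserver-threads = 2
    # WSGI setup (Plone 5.2 with waitress): the option that is so far only mentioned in the changelog
    threads = 2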

I see only one connection in the control panel with threads = 2. Shouldn't I see two connections?

You should, eventually.

The ZODB connection pool is separate from the ZServer / WSGI thread limit. I recall that the ZODB connection pool size is not even hard-limited; it just naturally grows to match the worker thread count.

A standard ZServer instance would show one connection per thread.

Well, as Asko said, the connection pool is decoupled from the threads. If you have a ZServer with 4 threads under low load, you may find, e.g., only 2 connections in the pool, because there never were more concurrent requests. The pool will only grow by another connection once three concurrent requests (in other words, three parallel non-idling threads) are being worked on.

So: A standard ZServer instance under full load would show one connection per thread.
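
A minimal sketch of that behaviour with plain ZODB (in-memory storage, no Plone involved), just to show the pool growing with concurrent use; the pool_size value is illustrative:

    from ZODB import DB
    from ZODB.MappingStorage import MappingStorage

    # pool_size is a soft target, not a hard cap: opening more concurrent
    # connections than pool_size simply grows the pool (and logs a warning).
    db = DB(MappingStorage(), pool_size=2)
    conns = [db.open() for _ in range(4)]  # four "threads" busy at once -> four connections
    for conn in conns:
        conn.close()  # closed connections return to the pool for reuse, object cache and all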


Btw, some years ago I wrote this part of the Plone documentation
https://docs.plone.org/manage/deploying/performance/instancesthreads.html
explaining this (and more). With WSGI, the section needs some (but not a major) overhaul.

@zopyx when running multiple threads per ZServer: since the object cache is per thread, whenever something finally does hit the second thread, it hits a cold cache. If the deployment is so busy that you hit the second thread often, you will eventually spend double the RAM, because both threads have then filled up to the cache limit. You also end up with incoherent caches per backend server, so the benefits of frontend-side session pinning somewhat diminish with this approach.
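
For scale, a back-of-the-envelope sketch using the buildout cache option (the option name is from plone.recipe.zope2instance; the numbers are made up):

    [instance]
    recipe = plone.recipe.zope2instance
    zserver-threads = 2
    zodb-cache-size = 30000   # objects cached per ZODB connection, i.e. per worker thread
    # 2 threads x 30000 cached objects each ~= twice the RAM of a single warmed-up
    # thread, and the two caches warm up (and stay) independently of each other.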

You most likely want to run one thread per ZServer and spawn more ZServers to allow for more things to happen in parallel, so long as you have the RAM to spare.

In that light, WSGI and ZServer provisioning is about the same for serving basic sites. The main practical difference is that file uploads and downloads consume a full WSGI worker, whereas ZServer spawns ad-hoc threads to handle them and does not block the server from handling other requests while the file transfer completes.

Some WSGI servers (I know of at least gunicorn) can be configured to do a copy-on-write fork of the application code: unless you mutate the code at runtime, you only pay for the per-process cache as 'extra' memory per worker, and all workers share the same bit of RAM for holding the app.
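
For illustration, a minimal gunicorn config sketch of that copy-on-write behaviour; preload_app is a real gunicorn setting, but the file name, bind address, and worker counts here are assumptions, not taken from this thread:

    # gunicorn.conf.py
    bind = "127.0.0.1:8080"
    workers = 4          # separate processes, each with its own ZODB connection cache
    threads = 1          # single-threaded workers keep the per-thread object cache warm
    preload_app = True   # import the app before forking, so workers share the
                         # application code via copy-on-write memory pages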


So I don't see any real reason for switching from ZServer to WSGI. There is no major advantage that WSGI gives us over ZServer for traditional Plone setups.

This is only the case if ZServer runs on Python 3.

Before the Sorrento sprint a few weeks back, it did not run on Python 3, and no one was expecting it to run on Python 3.

Hindsight brings clarity.

Ah...forgot about this constraint.

Minor correction: no threads are spawned for delivering blobs; because it is only I/O, the main thread's async loop can handle the delivery without blocking. That holds both with Medusa and with Twisted.


In production it doesn't matter, does it? Or have you found a solution to the issue below?
If you use a LB like HAProxy, I've yet to find a way to make effective use of this feature. I.e. if you have 1 thread, you should be setting maxconn to 1, which means that until the client has finished streaming, HAProxy won't send it another request, so the advantage is lost. You can set maxconn to 2 or more, but then you run the risk of a request being queued inside Zope behind some 10-second-long transaction when it could have been handled faster by another client.
The only solution I could think of is if HAProxy supported a mode that treated the first byte of the response (instead of the last byte) as the signal to send another connection. I've requested this; no luck, too niche.
So for the moment we use c.xsendfile and ignore streaming.
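
To make the trade-off concrete, a sketch of an HAProxy backend for single-threaded clients; host names and ports are invented:

    backend plone_clients
        balance leastconn
        # maxconn 1 means "one request at a time per single-threaded client":
        # HAProxy will not dispatch another request to a server until the
        # current response, including any streaming, has fully finished.
        server client1 127.0.0.1:8081 check maxconn 1
        server client2 127.0.0.1:8082 check maxconn 1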

Downloads are easily solved by such hosting / solution-architecture sculpting. It is hard stuff, full of subtleties to get wrong. If this is the desired way for the ecosystem to move, the best-practices documentation should come authoritatively from Plone itself.

How would one handle file uploads on a write-heavy installation, though? Dedicated upload workers with no cache, and separation at the load balancer based on path segments? Sounds like a game of whack-a-mole with regard to adding any add-ons that bring their own upload endpoints.
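
A sketch of what that separation could look like in HAProxy; the path rule, view name, and backend names are purely hypothetical, which is exactly the whack-a-mole problem with add-on-specific endpoints:

    frontend plone_fe
        bind *:8080
        # Hypothetical rule: anything that looks like an upload view goes to
        # dedicated, cache-less upload workers; every add-on with its own
        # upload endpoint would need another rule like this.
        acl is_upload path_end -i /@@upload-file
        use_backend upload_workers if is_upload
        default_backend plone_clients

    backend upload_workers
        server upload1 127.0.0.1:8091 check maxconn 1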

We have 10 pods (instances) with two threads each, with gunicorn serving content from the ZODB (ZSQL methods, ZPTs...).
Throughput is between 50k and 100k pages/day, which is around 1 million ZODB objects. We have made the ZODB immutable, because it only serves code (that's the worst part), and this way we can roll out new versions in parallel (at least we don't have to worry about ZODB migrations; they are done in the Docker pipeline).

It works, but it is very CPU-bound. We also have another part still on ZServer (which we also moved to 2 threads): someone had the clever idea of running long tasks as web requests, so we have to stay with ZServer there. Personally I don't like keeping complex software that is not updated and is full of features I will never use (FTP, WebDAV, XML-RPC...), but until we get the time to convert such long-running tasks from requests into Kubernetes cron jobs, we can't move.

how?

https://www.nginx.com/resources/wiki/start/topics/examples/xsendfile/
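
Roughly, the nginx side of that workaround looks like this (location and path are illustrative): the application answers with an X-Accel-Redirect header pointing into this internal location, and nginx streams the file itself, so the Zope worker is freed immediately.

    location /blobs/ {
        internal;                 # only reachable via X-Accel-Redirect, not directly
        alias /var/blobstorage/;  # hypothetical directory holding the blob files
    }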


I already pointed out that it's the workaround I use, but the original point I was replying to was that you can't take advantage of ZServer streaming in an actual production setup.