@zopyx when running multiple threads per ZServer, the object cache is per thread, so any request that finally lands on the second thread hits a cold cache. If the deployment is busy enough to hit the second thread often, you will eventually spend roughly double the RAM, since both threads fill their cache up to the limit. You also end up with incoherent caches within each backend server, so the benefit of frontend-side session pinning diminishes with this approach.
You most likely want to run one thread per ZServer and spawn more ZServer instances to allow more things to happen in parallel, as long as you have the RAM to spare.
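As a sketch, the one-thread-per-instance setup would look roughly like this in a classic Zope 2 `zope.conf` (`zserver-threads` is the real directive; the port is illustrative, and each additional instance would get its own config with a different port):

```
# One worker thread per instance; scale out by running several
# instances like this behind the load balancer.
zserver-threads 1

<http-server>
  address 8081
</http-server>
```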
In that light, WSGI and ZServer provisioning are about the same for serving basic sites. The main practical difference is that file uploads and downloads tie up a full WSGI worker, whereas ZServer spawns ad-hoc threads to handle them and does not block the server from serving other requests while the transfer completes.
Some WSGI servers (gunicorn at least) can be configured to load the application before forking workers, so the workers share the application code copy-on-write. Unless you mutate the code at runtime, the only 'extra' memory per worker is its own per-process cache, and all workers share the same bit of RAM holding the app itself.
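For gunicorn concretely, this is the real `--preload` flag / `preload_app` setting; the module path `myapp:application` below is a placeholder for your actual WSGI entry point:

```shell
# Load the WSGI app once in the master process, then fork workers
# that share its code pages copy-on-write.
gunicorn --preload --workers 4 myapp:application

# Equivalent gunicorn.conf.py settings:
#   preload_app = True
#   workers = 4
```

Note that preloading means workers inherit anything created at import time, so resources like database connections should be opened per worker (e.g. in a post-fork hook), not at module load.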