But is it good enough? What does it add feature-wise?
For example, I'd be interested in implementing various plone.transformchain adapters as WSGI middlewares, but unless those could be made asynchronous, they would not add much alone.
With Medusa, I could refactor most of those (all the ones without too much ZODB access) to be executed after ZPublisher, so that they would free Zope workers to handle subsequent requests (and save memory by serving more requests with a single instance). How would I do that with WSGI?
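To make the idea concrete, here is a minimal sketch of a transform implemented as WSGI middleware (all names are hypothetical, and the transform is a trivial stand-in). Note the limitation discussed above: with classic synchronous WSGI, the worker thread is still occupied while the transform runs, unlike the post-ZPublisher Medusa approach.

```python
# Minimal sketch (hypothetical names): a WSGI middleware that applies a
# post-publication transform to the response body, roughly what a
# plone.transformchain adapter does inside Zope.
class UppercaseTransform:
    def __init__(self, app):
        self.app = app

    def __call__(self, environ, start_response):
        captured = {}

        def capture(status, headers, exc_info=None):
            # intercept the inner app's status and headers
            captured['status'] = status
            captured['headers'] = headers

        # consume the wrapped app's response so we can transform it
        body = b''.join(self.app(environ, capture))
        body = body.upper()  # stand-in for a real transform

        # drop the stale Content-Length and emit a correct one
        headers = [(k, v) for k, v in captured['headers']
                   if k.lower() != 'content-length']
        headers.append(('Content-Length', str(len(body))))
        start_response(captured['status'], headers)
        return [body]
```

Because the middleware has to buffer the whole body before transforming it, it also defeats streaming for the responses it touches, which is exactly why running transforms asynchronously would matter.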
They have an adapted version of Pyramid compatible with async WSGI. Yet we cannot make Plone fully async, because we need to control the number of open ZODB connections (to optimize caching memory usage).
We could probably use semaphores on async WSGI to allow only 1-2 requests at a time to call ZPublisher, but then execute all the transform code asynchronously.
Well, maybe with a fully externalized catalog a smaller ZODB cache would suffice, and it would be OK to have, say, 10 simultaneous async requests, each with its own ZODB connection.
@jaroel Yes. But you have to be careful not to allow more simultaneous requests than you want active ZODB connections in the connection pool (since each connection carries its own cache).
I was wrong about the Zope WSGI publisher not supporting blob stream iterators. It does (it returns a filehandle-like iterable through the WSGI pipeline, which can be streamed on its own without a database connection). Now I have to check how these behave, and whether returning an iterator lets the WSGI server go on to process another request.
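For context, this is roughly what handing a file handle back through WSGI looks like (a hypothetical standalone app, not the Zope publisher's actual code): the app returns the file wrapped in the server-provided `wsgi.file_wrapper`, so the server can stream it itself, possibly with `sendfile`, after the application and its database connection are done.

```python
import io

# Hypothetical sketch: a WSGI app that streams a "blob" file handle.
def blob_app(environ, start_response):
    fileobj = io.BytesIO(b'x' * 1024)  # stand-in for blob.open('r')
    start_response('200 OK',
                   [('Content-Type', 'application/octet-stream'),
                    ('Content-Length', '1024')])
    wrapper = environ.get('wsgi.file_wrapper')
    if wrapper is not None:
        # let the server stream the handle itself (possibly via sendfile)
        return wrapper(fileobj, 8192)
    # fallback: a plain block iterator over the file
    return iter(lambda: fileobj.read(8192), b'')
```

Whether this actually frees the worker for the next request depends on the server: a threaded server like Gunicorn's default sync worker still ties up the thread until the iterator is exhausted, which may explain the benchmark results further down.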
The answer is: not well. HAProxy will not send the next request until the connection is closed, which underutilizes Zope's streaming support. You can configure HAProxy to allow more connections than Zope has threads, but it's not an ideal solution: if you have some CPU-intensive requests, other requests might be queued in Zope when they could have been handled elsewhere.
I did discuss this issue with the author of HAProxy. I suggested a feature where you could set it to send new requests on the first response byte rather than on connection close. He thought the use case was not common enough to support, however.
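For reference, the "more connections than Zope threads" workaround mentioned above looks roughly like this in an HAProxy backend (a sketch only; the numbers are illustrative, not a tested config):

```
backend zope
    balance leastconn
    # instance has 2 worker threads, but we allow up to 8 requests in
    # flight so cheap streaming downloads are not starved behind them;
    # the trade-off is queueing behind CPU-heavy requests inside Zope
    server instance1 127.0.0.1:8080 maxconn 8
```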
For now I use c.xsendfile, which works well except for scaled images.
I'd be interested if anyone has another solution to this problem. Perhaps another load balancer? I know Zope Corporation was working on a new one.
I'm still not really sure how blob streaming works with WSGI, but judging from benchmarks it doesn't really matter. A single-thread ZServer (with async Medusa for blobs) and a single-thread Gunicorn have similar performance for downloading a blob:
ZServer with 1 worker thread:
```
Server Software:        Zope/(2.13.22,
Server Hostname:        localhost
Server Port:            8080

Document Path:          /Plone4/testimage
Document Length:        2613064 bytes

Concurrency Level:      20
Time taken for tests:   2.243 seconds
Complete requests:      200
Failed requests:        0
Total transferred:      522674200 bytes
HTML transferred:       522612800 bytes
Requests per second:    89.18 [#/sec] (mean)
Time per request:       224.263 [ms] (mean)
Time per request:       11.213 [ms] (mean, across all concurrent requests)
Transfer rate:          227600.44 [Kbytes/sec] received

Connection Times (ms)
              min  mean[+/-sd] median   max
Connect:        0    0   0.1      0       1
Processing:   118  222  24.5    223     290
Waiting:       10   33  22.1     27     128
Total:        118  222  24.5    223     290

Percentage of the requests served within a certain time (ms)
  50%    223
  66%    228
  75%    232
  80%    234
  90%    245
  95%    253
  98%    288
  99%    290
 100%    290 (longest request)
```
GUnicorn with 1 worker thread:
```
Server Software:        gunicorn/19.3.0
Server Hostname:        localhost
Server Port:            8080

Document Path:          /Plone4/testimage
Document Length:        2613064 bytes

Concurrency Level:      20
Time taken for tests:   2.281 seconds
Complete requests:      200
Failed requests:        0
Total transferred:      522679400 bytes
HTML transferred:       522612800 bytes
Requests per second:    87.66 [#/sec] (mean)
Time per request:       228.144 [ms] (mean)
Time per request:       11.407 [ms] (mean, across all concurrent requests)
Transfer rate:          223731.11 [Kbytes/sec] received

Connection Times (ms)
              min  mean[+/-sd] median   max
Connect:        0    0   0.8      0       7
Processing:    15  217  40.2    223     255
Waiting:        7  213  40.2    219     251
Total:         16  218  40.1    223     255

Percentage of the requests served within a certain time (ms)
  50%    223
  66%    233
  75%    236
  80%    238
  90%    244
  95%    249
  98%    252
  99%    255
 100%    255 (longest request)
```
Here is perhaps one reason WSGI might be a good idea.
I'm not 100% sure I understand it, but I believe it lets you dynamically adjust which workers get which kind of requests by giving periodic feedback to the load balancer. You could, for instance, start telling the load balancer about frequent large blob requests, making sure one instance gets them most often.
It's light on documentation, though, so it doesn't say how it decides when to send the next request.