(and I added a script at boot to start these sites).
Breaking news: what is very strange is that it was enough to restart nginx to make these Plone sites available and working again, even the one with "daemon manager not running" status!
And ps aux | grep lone returns only something about the latter:
I could, but the problem is that the sites usually work normally for a few days, and I read somewhere that it is not recommended to run this for sites in production.
Sorry for the delay: I was waiting for the next unavailability of my sites, and here it is.
The problem is, the unavailability generally happens several days after I restarted the sites. Nevertheless, I will this time restart them in foreground mode in hopes of tracking the trouble.
Here are the few last lines of instance.log:
2020-04-14 19:28:39,389 WARNING [waitress.queue:122][MainThread] Task queue depth is 1
2020-04-14 19:28:39,524 WARNING [waitress.queue:122][MainThread] Task queue depth is 2
2020-04-14 19:28:39,800 WARNING [waitress.queue:122][MainThread] Task queue depth is 2
2020-04-14 19:28:39,933 WARNING [waitress.queue:122][MainThread] Task queue depth is 3
2020-04-14 19:28:40,717 WARNING [waitress.queue:122][MainThread] Task queue depth is 3
2020-04-14 19:28:40,733 WARNING [waitress.queue:122][MainThread] Task queue depth is 4
2020-04-14 19:28:40,814 WARNING [waitress.queue:122][MainThread] Task queue depth is 5
2020-04-14 19:28:41,228 WARNING [waitress.queue:122][MainThread] Task queue depth is 6
2020-04-14 19:28:41,919 WARNING [waitress.queue:122][MainThread] Task queue depth is 5
2020-04-14 19:28:41,950 WARNING [waitress.queue:122][MainThread] Task queue depth is 6
2020-04-14 19:28:42,041 WARNING [waitress.queue:122][MainThread] Task queue depth is 7
2020-04-14 19:28:42,222 WARNING [waitress.queue:122][MainThread] Task queue depth is 8
2020-04-14 19:28:42,262 WARNING [waitress.queue:122][MainThread] Task queue depth is 9
2020-04-14 19:28:42,263 WARNING [waitress.queue:122][MainThread] Task queue depth is 10
2020-04-14 19:28:42,428 WARNING [waitress.queue:122][MainThread] Task queue depth is 11
2020-04-14 19:28:42,566 WARNING [waitress.queue:122][MainThread] Task queue depth is 11
2020-04-14 19:28:42,810 WARNING [waitress.queue:122][MainThread] Task queue depth is 12
2020-04-14 19:28:43,077 WARNING [waitress.queue:122][MainThread] Task queue depth is 12
2020-04-14 19:28:43,246 WARNING [waitress.queue:122][MainThread] Task queue depth is 13
2020-04-14 19:28:43,608 WARNING [waitress.queue:122][MainThread] Task queue depth is 14
2020-04-14 19:28:43,948 WARNING [waitress.queue:122][MainThread] Task queue depth is 15
2020-04-14 19:28:44,038 WARNING [waitress.queue:122][MainThread] Task queue depth is 16
2020-04-14 19:28:44,043 WARNING [waitress.queue:122][MainThread] Task queue depth is 17
2020-04-14 19:28:44,102 WARNING [waitress.queue:122][MainThread] Task queue depth is 18
2020-04-14 19:28:45,196 WARNING [waitress.queue:122][MainThread] Task queue depth is 16
2020-04-14 19:28:45,879 WARNING [waitress.queue:122][MainThread] Task queue depth is 16
2020-04-14 19:28:46,011 WARNING [waitress.queue:122][MainThread] Task queue depth is 17
2020-04-14 19:28:48,457 WARNING [waitress.queue:122][MainThread] Task queue depth is 13
2020-04-14 19:28:48,654 WARNING [waitress.queue:122][MainThread] Task queue depth is 14
2020-04-14 19:28:49,372 WARNING [waitress.queue:122][MainThread] Task queue depth is 12
2020-04-14 19:28:49,683 WARNING [waitress.queue:122][MainThread] Task queue depth is 13
2020-04-14 19:28:49,798 WARNING [waitress.queue:122][MainThread] Task queue depth is 14
2020-04-14 19:28:50,272 WARNING [waitress.queue:122][MainThread] Task queue depth is 14
2020-04-14 19:28:50,407 WARNING [waitress.queue:122][MainThread] Task queue depth is 15
2020-04-14 19:28:50,910 WARNING [waitress.queue:122][MainThread] Task queue depth is 15
2020-04-14 19:28:51,171 WARNING [waitress.queue:122][MainThread] Task queue depth is 16
2020-04-14 19:28:51,214 WARNING [waitress.queue:122][MainThread] Task queue depth is 16
2020-04-14 19:28:51,226 WARNING [waitress.queue:122][MainThread] Task queue depth is 17
2020-04-14 19:28:52,838 WARNING [waitress.queue:122][MainThread] Task queue depth is 15
2020-04-14 19:28:53,023 WARNING [waitress.queue:122][MainThread] Task queue depth is 16
2020-04-14 19:28:53,690 WARNING [waitress.queue:122][MainThread] Task queue depth is 17
2020-04-14 19:28:53,701 WARNING [waitress.queue:122][MainThread] Task queue depth is 18
Hi, just some "wild" guessing.
If you have print statements in your Python code, this type of symptom could happen (this should normally also give a clear error message, but maybe in your setup it does not). The symptom you describe, with the "2 days factor" of some kind, could be a symptom of some type of "buffer" in your OS being filled to the top.
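To make the "print statements" guess easier to check, here is a minimal sketch that lists the line numbers of `print(...)` calls in a piece of Python source. It assumes Python 3 syntax (a Python 2 style `print x` statement would make `ast.parse` raise a `SyntaxError`), and the helper name `find_print_calls` is made up for this illustration.

```python
import ast


def find_print_calls(source, filename="<string>"):
    """Return the sorted line numbers of print(...) calls in Python 3 source text."""
    tree = ast.parse(source, filename=filename)
    return sorted(
        node.lineno
        for node in ast.walk(tree)
        # Only plain calls to the name "print" are reported; aliased or
        # attribute-based calls (e.g. builtins.print) are not detected.
        if isinstance(node, ast.Call)
        and isinstance(node.func, ast.Name)
        and node.func.id == "print"
    )
```

You would read each `.py` file of your own add-ons and pass its text to the function; an empty list means no direct `print(...)` call was found in that file.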
If you have print statements in your Python code, this type of symptom could happen (this should normally also give a clear error message, but maybe in your setup it does not).
I'm not sure I understand what you mean by "print statements in your python code". AFAICT, there are no such statements in my sites.
The symptom you describe, with the "2 days factor" of some kind, could be a symptom of some type of "buffer" in your OS being filled to the top.
Currently, while my sites are up, that's not the case:
This indicates that your incoming requests are stacking up because the Plone backend is unable to answer them quickly enough. I have no idea about the nature and setup of your site, but usually you would add more ZEO clients and balance requests between them.
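To see when the backlog builds up, the "Task queue depth" warnings can be summarized per minute. The following is only a sketch that assumes the exact log line format pasted above; the function name `peak_depth_per_minute` is invented for this example.

```python
import re
from collections import defaultdict

# Matches lines such as:
# 2020-04-14 19:28:39,389 WARNING [waitress.queue:122][MainThread] Task queue depth is 1
LINE_RE = re.compile(
    r"^(\d{4}-\d{2}-\d{2} \d{2}:\d{2}):\d{2},\d{3} WARNING "
    r"\[waitress\.queue:\d+\]\[[^\]]*\] Task queue depth is (\d+)"
)


def peak_depth_per_minute(lines):
    """Return {"YYYY-MM-DD HH:MM": highest queue depth logged in that minute}."""
    peaks = defaultdict(int)
    for line in lines:
        match = LINE_RE.match(line)
        if match:  # lines in any other format are simply ignored
            minute, depth = match.group(1), int(match.group(2))
            peaks[minute] = max(peaks[minute], depth)
    return dict(peaks)
```

Passing the open `instance.log` file object to the function gives one peak value per minute, which makes it easier to compare the time of an outage with the time the queue started to grow.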
@zopyx Good to know: thanks! Unfortunately, I didn't take the ZEO route. Though the Plone backend is unable to answer the requests quickly enough, I guess it answers them later: do you think the delay is a problem for the site visitors?
It's not a good sign that there is a delay. You should set up some kind of monitoring to track the behaviour of the site (the entire stack from nginx down to Plone) to be able to observe what is normal and what is not for your particular setup and traffic patterns.
It depends on how you've deployed the site(s). If, say, you are on Amazon AWS, there are tools there that help you monitor sites. Otherwise there are Prometheus, Grafana, Zabbix, and Xymon, or even just a basic "is my site up" check like UptimeRobot.
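As an illustration of the most basic "is my site up" check, here is a small sketch using only the Python standard library. It fetches a URL once and reports the HTTP status and the response time. The function name `check_site`, the 10-second timeout, and the `opener` parameter (there only so the function can be tried without network access) are assumptions of this example, not part of any of the tools named above.

```python
import time
import urllib.error
import urllib.request


def check_site(url, timeout=10.0, opener=urllib.request.urlopen):
    """Fetch url once and return (ok, status, seconds)."""
    start = time.monotonic()
    try:
        with opener(url, timeout=timeout) as response:
            status = response.status
    except urllib.error.HTTPError as err:
        status = err.code   # the server answered, but with an error status
    except OSError:
        status = None       # connection refused, DNS failure, timeout, ...
    elapsed = time.monotonic() - start
    ok = status is not None and 200 <= status < 400
    return ok, status, elapsed
```

Run from cron every few minutes against the public URL served by nginx, and logging the three values, this would already show whether response times creep up in the hours before the sites become unavailable.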