Sites unavailable for no apparent reason

Context

  • Plone 5.2 with Python 3 and Plone 4.3 with Python 2.7
  • Nginx 1.16.1

Problem

My Plone sites are sometimes unavailable for no apparent reason. One of them (Plone 5.2) is reported as down:

$ ~/site-gte/zinstance//bin/plonectl status
instance: daemon manager not running

but not the other one:

$ ~/stage-latex/zinstance/bin/plonectl status
instance: program running; pid=2116

It happened this morning, whereas the sites were working fine yesterday, so a server reboot is not the cause:

$ uptime 
 11:01:02 up 11 days, 18:16,  1 user,  load average: 0,01, 0,00, 0,00

(and I added a script at boot to start these sites).

Breaking news: strangely, restarting nginx was enough to make these Plone sites available and working again, even the one whose status is "daemon manager not running"!

And ps aux | grep lone shows a process only for that site:

$ ps aux | grep lone
root      1906  0.0  0.4 109216 18252 ?        Ssl  févr.27   0:18 /home/gest/site-gte/zinstance/bin/python3.7 /home/gest/site-gte/zinstance/parts/instance/bin/interpreter /home/gest/site-gte/buildout-cache/eggs/zdaemon-4.3-py3.7.egg/zdaemon/zdrun.py -S /home/gest/site-gte/buildout-cache/eggs/plone.recipe.zope2instance-6.3.0-py3.7.egg/plone/recipe/zope2instance/wsgischema.xml -b 10 -d -s /home/gest/site-gte/zinstance/var/instance/zopectlsock -m 0o22 -x 0,2 -z /home/gest/site-gte/zinstance/parts/instance /home/gest/site-gte/zinstance/bin/python3.7 /home/gest/site-gte/zinstance/parts/instance/bin/interpreter /home/gest/site-gte/buildout-cache/eggs/Zope-4.1.1-py3.7.egg/Zope2/Startup/serve.py /home/gest/site-gte/zinstance/parts/instance/etc/wsgi.ini
gest     12566  0.0  0.0  13284   752 pts/0    S+   11:22   0:00 grep --color lone

Well, I'm completely lost... Any help would be much appreciated!

Check your Plone logfile first.

Before I restarted nginx, for the Plone 5.2 site, the last 10 lines of instance.log were:

$ tail -n10 var/log/instance.log 
2020-03-02 21:20:47,100 INFO    [plone.protect:38][waitress] auto rotating keyring _forms
2020-03-02 21:20:47,101 INFO    [plone.protect:38][waitress] auto rotating keyring _anon
2020-03-02 21:20:47,101 INFO    [plone.protect:38][waitress] auto rotating keyring _anon
2020-03-02 22:33:14,138 INFO    [waitress:359][waitress] Client disconnected while serving /VirtualHostBase/https/gte.univ-littoral.fr:443/iutgte-dk/VirtualHostRoot/Members/denis-bitouze/pub/latex/formations/installation-latex.pdf/@@download/file/installation-latex.pdf
2020-03-02 22:33:14,138 INFO    [waitress:359][waitress] Client disconnected while serving /VirtualHostBase/https/gte.univ-littoral.fr:443/iutgte-dk/VirtualHostRoot/Members/denis-bitouze/pub/latex/formations/installation-latex.pdf/@@download/file/installation-latex.pdf
2020-03-05 16:05:38,162 WARNING [waitress.queue:122][MainThread] Task queue depth is 1
2020-03-06 17:07:06,613 WARNING [waitress.queue:122][MainThread] Task queue depth is 1
2020-03-06 19:48:27,307 WARNING [waitress.queue:122][MainThread] Task queue depth is 1
2020-03-07 08:04:06,791 WARNING [waitress.queue:122][MainThread] Task queue depth is 1
2020-03-09 11:16:38,511 WARNING [waitress.queue:122][MainThread] Task queue depth is 1

and the last 10 lines of instance-access.log were:

$ tail -n10 var/log/instance-access.log 
127.0.0.1 - - [10/mars/2020:01:25:16 +0200] "GET /VirtualHostBase/https/gte.univ-littoral.fr%3A443/iutgte-dk/VirtualHostRoot/%2B%2Bplone%2B%2Bproduction/%2B%2Bunique%2B%2B2019-09-18T21%3A49%3A19.093006/default.js HTTP/1.0" 200 528387 "-" "msnbot/2.0b (+http://search.msn.com/msnbot.htm)"
127.0.0.1 - - [10/mars/2020:01:43:07 +0200] "GET /VirtualHostBase/https/gte.univ-littoral.fr%3A443/iutgte-dk/VirtualHostRoot/corrige-du-controle-de-mathematiques-du-18-01-2019-en-ligne HTTP/1.0" 200 20318 "-" "Mozilla/5.0 (Linux; Android 6.0.1; Nexus 5X Build/MMB29P) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/41.0.2272.96 Mobile Safari/537.36 (compatible; Googlebot/2.1; +http://www.google.com/bot.html)"
127.0.0.1 - - [10/mars/2020:02:26:24 +0200] "GET /VirtualHostBase/https/gte.univ-littoral.fr%3A443/iutgte-dk/VirtualHostRoot/plonejsi18n?domain=widgets&language=fr HTTP/1.0" 200 15487 "-" "Mozilla/5.0 (Windows NT 6.1; WOW64) AppleWebKit/534+ (KHTML, like Gecko) BingPreview/1.0b"
127.0.0.1 - - [10/mars/2020:02:42:37 +0200] "GET /VirtualHostBase/https/gte.univ-littoral.fr%3A443/iutgte-dk/VirtualHostRoot/ HTTP/1.0" 200 21859 "-" "Mozilla/5.0 (Windows NT 10.0; Win64; x64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/60.0.3112.113 Safari/537.36"
127.0.0.1 - - [10/mars/2020:02:55:07 +0200] "GET /VirtualHostBase/https/gte.univ-littoral.fr%3A443/iutgte-dk/VirtualHostRoot/futurs-etudiants/licence-professionnelle HTTP/1.0" 200 21893 "-" "Mozilla/5.0 (Linux; Android 6.0.1; Nexus 5X Build/MMB29P) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/41.0.2272.96 Mobile Safari/537.36 (compatible; Googlebot/2.1; +http://www.google.com/bot.html)"
127.0.0.1 - - [10/mars/2020:03:03:50 +0200] "GET /VirtualHostBase/https/gte.univ-littoral.fr%3A443/iutgte-dk/VirtualHostRoot/robots.txt HTTP/1.0" 200 884 "-" "Mozilla/5.0 (Linux; Android 7.0;) AppleWebKit/537.36 (KHTML, like Gecko) Mobile Safari/537.36 (compatible; AspiegelBot)"
127.0.0.1 - - [10/mars/2020:03:06:09 +0200] "GET /VirtualHostBase/https/gte.univ-littoral.fr%3A443/iutgte-dk/VirtualHostRoot/Members/denis-bitouze/pub/latex/divers/traitements-de-texte-stupides-et-inefficaces HTTP/1.0" 200 25128 "-" "Mozilla/5.0 (Linux; Android 6.0.1; Nexus 5X Build/MMB29P) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/80.0.3987.92 Mobile Safari/537.36 (compatible; Googlebot/2.1; +http://www.google.com/bot.html)"
127.0.0.1 - - [10/mars/2020:03:08:09 +0200] "POST /VirtualHostBase/https/gte.univ-littoral.fr%3A443/iutgte-dk/VirtualHostRoot/vendor/phpunit/phpunit/src/Util/PHP/eval-stdin.php HTTP/1.0" 404 26 "-" "Mozilla/5.0 (Windows NT 10.0; Win64; x64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/78.0.3904.108 Safari/537.36"
127.0.0.1 - - [10/mars/2020:03:28:09 +0200] "GET /VirtualHostBase/https/gte.univ-littoral.fr%3A443/iutgte-dk/VirtualHostRoot/ HTTP/1.0" 200 21859 "-" "Mozilla/5.0 (compatible; Nimbostratus-Bot/v1.3.2; http://cloudsystemnetworks.com)"
127.0.0.1 - - [10/mars/2020:03:43:08 +0200] "GET /VirtualHostBase/https/gte.univ-littoral.fr%3A443/iutgte-dk/VirtualHostRoot/Members/denis-bitouze/pub/latex/formations/college-doctoral-lille-nord-de-france/cours HTTP/1.0" 200 24381 "-" "Mozilla/5.0 (Linux; Android 6.0.1; Nexus 5X Build/MMB29P) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/80.0.3987.92 Mobile Safari/537.36 (compatible; Googlebot/2.1; +http://www.google.com/bot.html)"

Nothing suspicious, IMHO.
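
A ten-line tail only shows the end of each log, so a per-level and per-status summary can help confirm that nothing suspicious is hiding further up. The following is a minimal shell sketch, assuming the two log layouts shown above (level as the third field of instance.log, status code right after the quoted request in instance-access.log). The function names count_levels and count_statuses are made up for illustration.

```shell
# Hypothetical helpers; they assume the log layouts shown in this thread.

# Count instance.log entries per level (INFO, WARNING, ...).
# Only lines starting with a 4-digit year are counted, so that
# continuation lines (e.g. tracebacks) are ignored.
count_levels() {
    awk '/^[0-9][0-9][0-9][0-9]-/ { n[$3]++ }
         END { for (level in n) print level, n[level] }' "$1" | LC_ALL=C sort
}

# Count instance-access.log entries per HTTP status code.
# Splitting on double quotes puts the " 200 528387 " part in field 3.
count_statuses() {
    awk -F'"' '{ split($3, f, " "); if (f[1] != "") n[f[1]]++ }
               END { for (s in n) print s, n[s] }' "$1" | LC_ALL=C sort
}
```

Run from the zinstance directory, this would be used as count_levels var/log/instance.log and count_statuses var/log/instance-access.log. Any ERROR or CRITICAL entries, or a burst of 5xx statuses, would be the first thing to look at.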

A guess: kill the processes and restart them (maybe in debug mode?).

I already stopped and restarted the sites:

/bin/plonectl stop && /bin/plonectl start

but the problem reappeared a few days later.

Do you mean with:

/bin/plonectl fg

I could, but the problem is that the sites usually work normally for a few days, and I read somewhere that running this is not recommended for production sites.

Yes.

But I meant that you should do this only to check if it started. If it did, then stop it and start it the normal way.

(Just to check that nothing has 'frozen'.)
