Plone6: nginx-volto-plone-zeo containers not working (backend crashes)

Hi! I wanted to play with the latest Plone and decided to give the containers a go. I picked the 'nginx-volto-plone-zeo' example because it seems to reflect my intended use case best. I added a few entries to /etc/hosts (aliases for 'localhost'), changed the listening port of nginx to an unprivileged one, and spun the ensemble up.

However, the backend appears to be unable to contact the ZEO server, and the entire site only returns 502s. The error message I see in the backend is always a stack trace ending with

raise ClientDisconnected("timed out waiting for connection")

The backend usually dies after that, but I have added a restart policy of 'always' to help with debugging. With 'tcpdump', I couldn't see any packets, and I haven't figured out how to 'tcpdump' the interface for the container bridge.

Can you please help?

Toni Müller via Plone Community wrote at 2023-11-29 23:58 +0000:

...
Can you please help?

Check the logfile of the zeoserver.
Does it tell you that the server is listening on the expected port?

Does "Starting ZEO on port 8100" count? With lsof, I can see that the container has port 8100 open. But there is no explicit statement like "I am now done with my startup and am ready to listen on this port".


Toni Müller via Plone Community wrote at 2023-11-30 23:20 +0000:

Does "Starting ZEO on port 8100" count? With lsof, I can see that the container has port 8100 open. But there is no explicit statement like "I am now done with my startup and am ready to listen on this port".

I am not sure.
In my logfile, I see:

2023-11-29T09:36:14 daemonizing the process
2023-11-29T09:36:14 set current directory: '...'
2023-11-29T09:36:14 daemon manager started
2023-11-29T09:36:14 spawned process pid=6128
2023-11-29T09:36:14 (6128) created PID file '...'
2023-11-29T09:36:14 (6128) opening storage '1' using FileStorage
2023-11-29T09:36:15 (6128) opening storage 'sessions' using FileStorage
2023-11-29T09:36:15 StorageServer created RW with storages: 1:RW:.../p3/var/filestorage/Data.fs, sessions:RW:.../p3/var/filestorage/sessions.fs
2023-11-29T09:36:15 listening on ('localhost', 9088)
2023-11-29T09:36:51 Connected server protocol
2023-11-29T09:36:51 received handshake 'Z5'

You might use a different ZEO server version; therefore, the log entries
might be slightly different.
But you should see "opening" log entries for your storages
and the "StorageServer created" entry -- at least if you log at
level "info" (as I do).

The last 2 log records above represent a client connection.
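To make that check repeatable, here is a minimal sketch that scans a ZEO server log for the three things mentioned above: storages created, a listening address, and at least one client handshake. The marker strings are copied from the sample log in this post; other ZEO versions may word them differently, and the function name `summarize_zeo_log` is made up for illustration.

```python
import re

# Marker strings taken from the sample log above (assumption: your ZEO
# version logs the same wording at level "info").
LISTENING = re.compile(r"listening on \('(?P<host>[^']+)', (?P<port>\d+)\)")
STORAGE_CREATED = "StorageServer created"
CLIENT_HANDSHAKE = "received handshake"


def summarize_zeo_log(lines):
    """Report whether the log shows ready storages, a listener and a client."""
    summary = {"storages_ready": False, "listening": None, "client_seen": False}
    for line in lines:
        if STORAGE_CREATED in line:
            summary["storages_ready"] = True
        match = LISTENING.search(line)
        if match:
            summary["listening"] = (match.group("host"), int(match.group("port")))
        if CLIENT_HANDSHAKE in line:
            summary["client_seen"] = True
    return summary


if __name__ == "__main__":
    sample = [
        "2023-11-29T09:36:15 StorageServer created RW with storages: 1:RW:...",
        "2023-11-29T09:36:15 listening on ('localhost', 9088)",
        "2023-11-29T09:36:51 received handshake 'Z5'",
    ]
    print(summarize_zeo_log(sample))
```

If `listening` is set but `client_seen` stays false, the server is up and the problem is more likely between client and server. Also note the listening host: a server bound to 'localhost' inside a container cannot be reached from other containers.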

Thank you for showing what it should look like. What kind of environment are you using to run those containers, please? Also, what kind of networking setup do you have?
I am running this with podman-compose and a standard networking setup on a bookworm machine. FWIW, I deleted the ZEO image and all "old" interfaces, and started over. However, I had failed to look into the log files, as I was relying on getting the logs via stdout, i.e., 'podman logs '. The logs inside the volume show that the ZEO server does indeed open the storage and listen on the port, but it never receives a connect message. At this point, I suspect a networking problem. I could never see the bridge getting created anywhere, but it might live in some network namespace that I haven't discovered yet. Creating it by hand in the default namespace doesn't solve the problem, though: no packet ever arrives there or on 'lo', and the backend never passes the health check.
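One way to narrow down a suspected networking problem without tcpdump is a plain TCP probe run from inside the backend container, since the backend image ships Python anyway. This is only a sketch: the function name `can_connect` is made up, and the host name and port you pass in depend on your compose file.

```python
import socket


def can_connect(host, port, timeout=3.0):
    """Return True if a TCP connection to (host, port) succeeds within timeout."""
    try:
        # create_connection resolves the name and tries each returned address.
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        # Covers refused connections, timeouts and name resolution failures.
        return False
```

For example, `can_connect("zeo", 8100)` would test the ZEO port, assuming the service is called "zeo" in the compose file and listens on 8100 as in the log line quoted earlier. A False result from inside the backend container, combined with an open port on the ZEO side, points at name resolution or the container network rather than at ZEO itself.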

Toni Müller via Plone Community wrote at 2023-12-2 02:19 +0000:

Thank you for showing what it should look like. What kind of environment are you using to run those containers, please?

I do not run any container: I use the old style "buildout/zdaemon"
based installation.
But this should not influence much how the log files look -- apart
from potentially different configuration.

I made some notes while trying out nginx and plone6 last year. Note that I was only using nginx in a docker container and no ZEO or Volto. Copied verbatim, I hope some may be useful to you:

Updated 20220929: we now use nginx in front of our Plone6 instance...

Pull the Docker image: docker pull nginx

When using the default nginx image, it will start an HTTP web server on port 80. Our goal is to have it proxy requests to the Plone6 backend (on port 8080).

The following two documents are essential in understanding how to provide non default nginx settings at startup:

How to Deploy an NGINX Image with Docker | NGINX

Docker

Complemented with basic instructions on proxying from DigitalOcean and our documentation on deploying www.pnz.de and a practical example.

Our goal is to proxy Plone through an nginx Docker container. While we are developing, Plone is served from the host itself. Later on, we will also migrate Plone to use Docker.

We also want to keep our configuration files in one place so that they are independent from the Docker containers that we deploy, and can be backed up separately.

For this we create a folder in our RAID5 mount:

mkdir -p /mnt/md0/conf

with subfolders for all apps that we are going to deploy, so first:

mkdir -p /mnt/md0/conf/nginx

In this folder, we have the default nginx.conf and other default settings, plus configuration files for all virtual hosts that we may want to deploy... We used the conf on www.pnz.de as the model.

It is possible to map specific folders when starting the Docker container. The examples below tell Docker to bind-mount /var/www and /mnt/md0/conf/nginx from the host into the container, as the default www_root and the default location of the nginx configuration files respectively. A future improvement would be to also store the HTTP access and error logs on the host, e.g. in /var/log/containers/nginx.

We can tell Docker to either use our host's network interface or (default) to expose a particular port used by the container.

docker run --name plone-nginx --mount type=bind,source=/var/www,target=/usr/share/nginx/html,readonly --mount type=bind,source=/mnt/md0/conf/nginx/,target=/etc/nginx/,readonly --network="host" -d nginx

docker run --name plone-nginx --mount type=bind,source=/var/www,target=/usr/share/nginx/html,readonly --mount type=bind,source=/mnt/md0/conf/nginx/,target=/etc/nginx/,readonly -p 80:80 -d nginx

The --network=host mode has the benefit that it is a bit faster and that we can leave the Plone default IP and port as they normally are. The drawback is that any port opened by the app in the container is also open to the outside world. Depending on what we choose, we must update the IP addresses used by Plone and nginx (full example below). For now, we use the Docker default: publish only port 80 and let Docker create its own virtual network at 172.17.0.0/16.

ip a
1: lo: <LOOPBACK,UP,LOWER_UP> mtu 65536 qdisc noqueue state UNKNOWN group default qlen 1000
    link/loopback 00:00:00:00:00:00 brd 00:00:00:00:00:00
    inet 127.0.0.1/8 scope host lo
       valid_lft forever preferred_lft forever
    inet6 ::1/128 scope host 
       valid_lft forever preferred_lft forever
2: enp5s0: <BROADCAST,MULTICAST,UP,LOWER_UP> mtu 1500 qdisc pfifo_fast state UP group default qlen 1000
    link/ether 84:a9:3e:8f:a5:fd brd ff:ff:ff:ff:ff:ff
    inet 10.0.10.32/20 brd 10.0.15.255 scope global noprefixroute enp5s0
       valid_lft forever preferred_lft forever
    inet6 fe80::86a9:3eff:fe8f:a5fd/64 scope link noprefixroute 
       valid_lft forever preferred_lft forever
3: docker0: <NO-CARRIER,BROADCAST,MULTICAST,UP> mtu 1500 qdisc noqueue state DOWN group default 
    link/ether 02:42:e9:24:ab:3a brd ff:ff:ff:ff:ff:ff
    inet 172.17.0.1/16 brd 172.17.255.255 scope global docker0
       valid_lft forever preferred_lft forever
    inet6 fe80::306f:ed20:af7b:a59a/64 scope link noprefixroute 
       valid_lft forever preferred_lft forever
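As a quick sanity check on the addresses in the interface list above, the standard `ipaddress` module can confirm which network an address belongs to. This is just a sketch using the values from the `ip a` output; the helper name `same_network` is made up.

```python
import ipaddress

# Values copied from the "ip a" output above.
docker_bridge = ipaddress.ip_interface("172.17.0.1/16")
host_nic = ipaddress.ip_interface("10.0.10.32/20")


def same_network(address, interface):
    """Return True if address lies in the network the interface is attached to."""
    return ipaddress.ip_address(address) in interface.network


if __name__ == "__main__":
    print(docker_bridge.network)                     # network of docker0
    print(host_nic.network.broadcast_address)        # broadcast of enp5s0
    print(same_network("172.17.0.1", docker_bridge)) # host side of the bridge
```

The point is that 172.17.0.1 is the host's own address on the Docker bridge, so a container on that bridge can reach a service on the host there, provided the service listens on that address and not only on 127.0.0.1.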

The nginx main configuration is in /mnt/md0/conf/nginx/nginx.conf, a copy from www.pnz.de.

We keep our site configuration in the sites-available subfolder, and symlink it to sites-enabled.

upstream plone-backend {
    #server 127.0.0.1:8080;
    server 172.17.0.1:8080;
}

server {
    listen 0.0.0.0:80;
    server_name plone6.pnz.de;
    access_log /var/log/nginx/plone6.pnz.de.access.log;
    error_log /var/log/nginx/plone6.pnz.de.error.log;


    # https://docs.nginx.com/nginx/admin-guide/web-server/compression/
    # text/html is always compressed by HttpGzipModule
    gzip on;
    gzip_types      text/css
                    text/csv
                    text/plain
                    application/javascript
                    application/json
                    application/xml;
                    
    gzip_proxied    no-cache no-store private expired auth;
    gzip_min_length 1000;
    
    # Note that domain name spelling in VirtualHostBase URL matters
    # -> this is what Plone sees as the "real" HTTP request URL.
    # "Plone" in the URL is your site id (case sensitive)
    location / {
        proxy_set_header X-Real-IP $remote_addr;
        proxy_pass http://plone-backend/VirtualHostBase/http/plone6.pnz.de:80/Plone/VirtualHostRoot/;
    }
}
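To see what Plone actually receives from the config above: when proxy_pass carries a URI part, nginx replaces the part of the request path that matched the location with that URI. The following sketch emulates this for `location /` only. The function name `proxied_path` is made up, and this is plain string handling, not nginx itself.

```python
# URI part of the proxy_pass directive from the server block above.
PROXY_URI = "/VirtualHostBase/http/plone6.pnz.de:80/Plone/VirtualHostRoot/"
LOCATION = "/"


def proxied_path(request_path, location=LOCATION, proxy_uri=PROXY_URI):
    """Emulate nginx replacing the matched location prefix with the proxy_pass URI."""
    if not request_path.startswith(location):
        raise ValueError("request does not match this location")
    return proxy_uri + request_path[len(location):]


if __name__ == "__main__":
    print(proxied_path("/news/item-1"))
```

The scheme, host name and port between VirtualHostBase and the site id are what Plone uses when it generates URLs, which is why their spelling matters.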

Note how we use 0.0.0.0 as the listen address instead of the external IP address of the host, 10.0.10.32.

To run in host network mode (--network=host), we replace the upstream address with the one that is commented out and make Plone listen on that address as well.

Below is the relevant part from our Plone server config in /usr/local/Plone/instance/etc/zope.ini

[server:main]
use = egg:waitress#main
#listen = 127.0.0.1:8080
listen = 172.17.0.1:8080
threads = 4
clear_untrusted_proxy_headers = false
max_request_body_size = 1073741824
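If you want to double-check which address Plone will bind to without starting it, the file can be read with the standard `configparser` module. This is a sketch under the assumption that the section looks like the excerpt above; the helper name `listen_address` is made up, and interpolation is switched off because a real zope.ini may contain `%(here)s` placeholders.

```python
import configparser

# Excerpt from the zope.ini shown above.
ZOPE_INI = """
[server:main]
use = egg:waitress#main
#listen = 127.0.0.1:8080
listen = 172.17.0.1:8080
threads = 4
clear_untrusted_proxy_headers = false
max_request_body_size = 1073741824
"""


def listen_address(ini_text, section="server:main"):
    """Return (host, port) from the 'listen' option of the given section."""
    parser = configparser.ConfigParser(interpolation=None)
    parser.read_string(ini_text)
    host, _, port = parser.get(section, "listen").rpartition(":")
    return host, int(port)


if __name__ == "__main__":
    print(listen_address(ZOPE_INI))
```

The host part must match the upstream address in the nginx config, otherwise nginx gets a refused connection and answers with a 502.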

I've made a little progress. The backend no longer crashes, and I can see the website on port 3000, but I can't access it through nginx. Those requests fail due to a problem with CORS. I have added this to the environment of the backend and then restarted it:

CORS_ALLOW_ORIGIN: "*"

This made the "Welcome to Plone 6" page appear for a moment, before being overwritten by an error message:

The backend is not responding, due to a server timeout or a connection problem of your device. Please check your connection and try again.

Thank you.

The request contains a number of CORS related headers. However, one of the requests is not being re-written properly. Where most requests look like

http://plone.localhost:2080/something

there is one URL that reads

http://plone.localhost/++api++/something

ignoring the server port.

My nginx rewrite rule looks like this:

rewrite ^/(\+\+api\+\+\/?)+($|/.*) /VirtualHostBase/http/$server_name:2080/Plone/++api++/VirtualHostRoot/$2 break;
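For reference, here is a small sketch of what this rule produces, using Python's `re` module to stand in for nginx's regex engine. It only shows the output of the rule for a given path, host and port; it does not explain where the port-less URL comes from. The function name `rewrite_api` is made up for illustration.

```python
import re

# Same pattern as the nginx rule; the "\/" escape is not needed in Python.
API_RULE = re.compile(r"^/(\+\+api\+\+/?)+($|/.*)")


def rewrite_api(path, server_name, port, site_id="Plone"):
    """Return the rewritten path for ++api++ requests, or None if the rule does not match."""
    match = API_RULE.match(path)
    if match is None:
        return None
    # Group 2 corresponds to $2 in the nginx replacement string.
    return (
        f"/VirtualHostBase/http/{server_name}:{port}/{site_id}"
        f"/++api++/VirtualHostRoot/{match.group(2)}"
    )


if __name__ == "__main__":
    print(rewrite_api("/++api++/something", "plone.localhost", 2080))
    print(rewrite_api("/something", "plone.localhost", 2080))
```

Since the port is hard-coded after $server_name, URLs generated by Plone for requests that pass through this rule should carry :2080. A request whose URL lacks the port was therefore probably built somewhere other than in this rewrite, though that is a guess.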