Plone on Kubernetes?

Anyone running Plone in production (on a larger scale) on top of Kubernetes in the cloud?

I am currently digging through K8s and wonder how you would organize the ZEO server and clients in a Kubernetes setup - in particular from the perspective of the storage layer (Data.fs and blob storage).

Some thoughts (and configurations) would be very welcome.


We are currently exploring that world as well, and will probably deploy something in the second half of the year. Some things to consider (you might already know them, but they are still worth writing down).

The Plone Docker image supports separating the zeoserver and the clients, so scaling the clients appropriately should be fairly easy. K8s adds auto-scaling tricks on top of that, which could make it even better.
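To make that concrete, here is a minimal sketch of what the client side could look like: a Deployment for the ZEO clients plus a HorizontalPodAutoscaler. All names, the image tag, the ZEO address, and the resource numbers are assumptions to adapt to your own setup.

```yaml
apiVersion: apps/v1
kind: Deployment
metadata:
  name: plone-client
spec:
  replicas: 2
  selector:
    matchLabels:
      app: plone-client
  template:
    metadata:
      labels:
        app: plone-client
    spec:
      containers:
      - name: plone
        image: plone:5.2              # assumed image/tag
        env:
        - name: ZEO_ADDRESS           # the official image uses this to run as a ZEO client
          value: "zeoserver:8080"     # assumed Service name for the zeoserver
        ports:
        - containerPort: 8080
        resources:
          requests:
            cpu: 250m
            memory: 512Mi
          limits:
            memory: 1Gi
---
apiVersion: autoscaling/v2beta2
kind: HorizontalPodAutoscaler
metadata:
  name: plone-client
spec:
  scaleTargetRef:
    apiVersion: apps/v1
    kind: Deployment
    name: plone-client
  minReplicas: 2
  maxReplicas: 6
  metrics:
  - type: Resource
    resource:
      name: cpu
      target:
        type: Utilization
        averageUtilization: 70
```

Setting memory requests/limits matters for the HPA and the scheduler; without them, autoscaling behaves unpredictably.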

At the moment we are experimenting with Rancher and have configured a Storage Class with Rancher's own software, Longhorn. This stores data persistently (and highly available, if the cluster itself is highly available) on the cluster nodes. I haven't tried an external provider like Amazon EBS or other alternatives yet, but it should behave the same, as long as you configure the storage class.
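Once a storage class exists, the Data.fs volume is just a PersistentVolumeClaim against it. A minimal sketch (the claim name, class name, and size are assumptions):

```yaml
apiVersion: v1
kind: PersistentVolumeClaim
metadata:
  name: zeoserver-data
spec:
  accessModes:
  - ReadWriteOnce              # one writer: the zeoserver pod
  storageClassName: longhorn   # or an EBS/cloud storage class
  resources:
    requests:
      storage: 20Gi
```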

One thing I don't know and haven't thought through yet is whether there is a way to scale the zeoserver - and I don't even know whether that is needed, or at what scale it would become necessary.
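For what it's worth, ZEO itself is a single-writer process that owns the Data.fs, so it isn't horizontally scalable; the usual shape on Kubernetes is a single-replica StatefulSet (or a Deployment with a `Recreate` strategy). A hedged sketch, with the image, start command, and claim name as placeholders:

```yaml
apiVersion: apps/v1
kind: StatefulSet
metadata:
  name: zeoserver
spec:
  serviceName: zeoserver
  replicas: 1                  # ZEO is a single writer: do not scale this up
  selector:
    matchLabels:
      app: zeoserver
  template:
    metadata:
      labels:
        app: zeoserver
    spec:
      containers:
      - name: zeoserver
        image: plone:5.2              # assumed image, started in ZEO-server mode
        ports:
        - containerPort: 8080
        volumeMounts:
        - name: zeodata
          mountPath: /data            # the official image keeps Data.fs and blobs here
      volumes:
      - name: zeodata
        persistentVolumeClaim:
          claimName: zeoserver-data   # hypothetical PVC name
```

Scaling reads happens on the client side; if even the single zeoserver becomes the bottleneck, RelStorage on a managed database is the usual next step.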

We tried RelStorage on Postgres in the past, on a cloud architecture based on Marathon+Mesos. It worked pretty well, but I wasn't very involved in the configuration, so I don't know much about it.

I don't have any solid configurations of our own to share, because we only have drafts and also because most of the configuration is done TTW in Rancher, so we didn't actually write any code.


It would be really cool if we could get RelStorage running on top of Yugabyte.
It is actually working, but there is some issue with timeouts and reconnects for transactions that take a while... but as a proof of concept: Yugabyte and RelStorage obviously do work together.


Moving my hosting stuff to Google Kubernetes.
One thing that kind of annoyed me was the VirtualHostMonster...
How do you point to your Plone site within the pod?
I installed an extra Nginx inside the container to point to the site root and have URLs correctly rewritten for http/https.
Isn't there a better way to do this?

We are not on kube yet, but we have been on Rancher Cattle for a number of years. We will convert to kube soon, though we are not sure whether we will keep Rancher with kube or not.
We solve VHM like this:

  • We use VHM only to configure site domains. We add a site in Plone, and no changes need to be made in Rancher.
  • We use https://github.com/collective/collective.zopeconsul to propagate VHM changes to a special haproxy instance that sets headers based on which instances the site should go to (p4 vs p5, etc.).
  • The default Rancher balancer uses those headers to determine which of the different stacks to send requests to, via some custom haproxy config.

Unfortunately it looks like rancher on kube throws away all its use of haproxy so that all needs to be redone some other way :frowning:

We use EFS for blobs and logs, and EBS for the ZEO server's Data.fs.
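EFS maps naturally onto a ReadWriteMany claim that several pods can mount at once, which is what shared blob storage needs. A sketch, assuming the EFS CSI driver is installed and exposes a storage class (the class name here is an assumption):

```yaml
apiVersion: v1
kind: PersistentVolumeClaim
metadata:
  name: plone-blobs
spec:
  accessModes:
  - ReadWriteMany            # EFS allows shared mounts across pods
  storageClassName: efs-sc   # assumed EFS CSI storage class name
  resources:
    requests:
      storage: 50Gi          # EFS ignores the size, but the field is required
```

EBS, by contrast, is ReadWriteOnce, which fits the single zeoserver pod holding the Data.fs.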

For the DB I'm using a managed PostgreSQL with RelStorage.

I do not know much about Google's K8s and its ingress (GKE Ingress, right?). But if there's a chance to use Traefik as the ingress, it is pretty easy using a middleware and applying these labels to the Plone containers:

- traefik.enable=true
- traefik.docker.network=traefik-public
- traefik.constraint-label=traefik-public
# SERVICE
- traefik.http.services.my-project.loadbalancer.server.port=8080
# ZMI
- traefik.http.routers.my-project-zmi.rule=Host(`zmi.my-domain.tld`)
- traefik.http.routers.my-project-zmi.entrypoints=https
- traefik.http.routers.my-project-zmi.tls=true
- traefik.http.routers.my-project-zmi.tls.certresolver=le
- traefik.http.routers.my-project-zmi.service=my-project
- traefik.http.middlewares.my-project-zmi.addprefix.prefix=/VirtualHostBase/https/zmi.my-domain.tld/VirtualHostRoot
- traefik.http.routers.my-project-zmi.middlewares=my-project-zmi
# DOMAIN
- traefik.http.routers.my-project-domain.rule=Host(`www.my-domain.tld`)
- traefik.http.routers.my-project-domain.entrypoints=https
- traefik.http.routers.my-project-domain.tls=true
- traefik.http.routers.my-project-domain.tls.certresolver=le
- traefik.http.routers.my-project-domain.service=my-project
- traefik.http.middlewares.my-project-domain.addprefix.prefix=/VirtualHostBase/https/www.my-domain.tld/Plone/VirtualHostRoot
- traefik.http.routers.my-project-domain.middlewares=my-project-domain

Got it working by doing the same. We added an nginx sidecar in the Plone Pod by creating a very simple dedicated nginx image with a config that looks like this:

upstream plone {
    server localhost:8080;
}

server {

  listen 80;

  location ~ /intranet($|/.*) {
    rewrite ^/intranet($|/.*) /VirtualHostBase/http/example.com:80/intranet/VirtualHostRoot/_vh_intranet$1 break;
    proxy_pass http://plone;
  }
}
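For completeness, the pod spec for such a sidecar could look like the sketch below (image names are placeholders, and the custom nginx image is assumed to contain the config above). Both containers share the pod's network namespace, so nginx reaches Plone on localhost:8080, and the Service should target the nginx port:

```yaml
apiVersion: apps/v1
kind: Deployment
metadata:
  name: plone-intranet
spec:
  replicas: 1
  selector:
    matchLabels:
      app: plone-intranet
  template:
    metadata:
      labels:
        app: plone-intranet
    spec:
      containers:
      - name: plone
        image: plone:5.2                       # assumed image/tag
        ports:
        - containerPort: 8080
      - name: nginx                            # sidecar with the rewrite config baked in
        image: my-registry/plone-nginx:latest  # hypothetical custom image
        ports:
        - containerPort: 80                    # point the Service at this port
```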

This is what my config looks like, added an extra location for the zmi.
I'm also using the Gitlab auto-devops stuff and getting the domain configs from the ci/cd-variables.
I put Nginx and Cloud-SQL-Proxy directly into the container, since I didn't want to customize the auto-devops templates...

# Plone nginx configuration
upstream plone {
    server 127.0.0.1:8080;
}

server {
    listen 5000;
    server_name ${server_names};

    location / {
          proxy_pass http://plone/VirtualHostBase/https/\${host}:443/Plone/VirtualHostRoot\${request_uri};
          include /etc/nginx/proxy_params;
    }
}

server {
    listen 5000;
    # Test for IP Address (kubernetes heartbeat) and manage. subdomains
    server_name localhost manage.* "~^\d{1,3}\.\d{1,3}\.\d{1,3}\.\d{1,3}$";

    location / {
          proxy_pass http://plone/VirtualHostBase/https/\${host}:443\${request_uri};
          include /etc/nginx/proxy_params;
    }
}

@pnicolli any in-production experience with resource (RAM/CPU) management, especially scaling? My Plone instances started "crashing" with the auto-scaling option on, most probably due to missing min/max memory settings for the pod... but I haven't had time yet to investigate further.

I'm also thinking of replacing Nginx with a mini Varnish... any thoughts on that?

Why wouldn't you just put the host values into the VHM in the ZMI, instead of making it more complicated with a templated Varnish or nginx?

No production experience on Kubernetes so far, we are currently on a test environment, we want to make sure to have some experience on the stack before starting with the production ones. That being said, we haven't tried autoscaling yet, and it's not on the radar for now, we'll see how it goes in the future.

On the other hand, studying the subject even more, we found out that our nginx sidecar was useless, since the nginx ingress can be customized to handle the most common configurations (also with a fully custom template if needed, but we didn't explore that).

We have removed the sidecar and are now deploying this Ingress to serve Plone. This is a specific configuration that serves Plone at /api because it's used for Volto, but it's still a good example.

apiVersion: extensions/v1beta1
kind: Ingress
metadata:
  name: my-plone
  namespace: my-namespace
  annotations:
    nginx.ingress.kubernetes.io/rewrite-target: /VirtualHostBase/http/my.plone.it:80/Plone/VirtualHostRoot/_vh_api$1
spec:
  rules:
  - host: my.plone.it
    http:
      paths:
      - backend:
          serviceName: my-plone-svc
          servicePort: 8080
        path: /api($|/.*)
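Two caveats on the example above, for anyone reading later: the `extensions/v1beta1` Ingress API is deprecated in newer clusters, and recent versions of ingress-nginx require the `use-regex` annotation before `$1` capture groups work in `rewrite-target`. A hedged equivalent on the `networking.k8s.io/v1` API would be:

```yaml
apiVersion: networking.k8s.io/v1
kind: Ingress
metadata:
  name: my-plone
  namespace: my-namespace
  annotations:
    nginx.ingress.kubernetes.io/use-regex: "true"   # required for capture groups on newer ingress-nginx
    nginx.ingress.kubernetes.io/rewrite-target: /VirtualHostBase/http/my.plone.it:80/Plone/VirtualHostRoot/_vh_api$1
spec:
  rules:
  - host: my.plone.it
    http:
      paths:
      - path: /api($|/.*)
        pathType: ImplementationSpecific   # regex paths need this pathType
        backend:
          service:
            name: my-plone-svc
            port:
              number: 8080
```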

@djay Thanks for the pointer to the ZMI, I'm actually no expert of Zope/ZMI/VHM so I had no idea. Nevertheless, the Ingress solution above is probably better because it's actually just a couple of extra lines of code away.

Regarding Varnish, it's a topic we are currently exploring. There are multiple solutions that feel a little overengineered for our case, since they involve replacing the whole ingress controller with a Varnish controller. We had no time to check the community support for those, so we are currently trying a simpler solution that uses plone.recipe.varnish. This setup doesn't allow Varnish to be scaled, afaik, but we don't really need that.
Here it is, born these days: there is no README yet, and a lot of testing and configuration is still needed, but it is apparently working. It should be fairly configurable via the env vars found in the docker-initialize.py file.

https://hub.docker.com/r/redturtletech/varnish-plone

