Task queue depth warning and restarting of the containers

Hello community,

My Volto website logs a "Task queue depth" warning, and then the Docker containers restart.

I have been working on this website project for a while using a subdomain, and until now it worked without any problems. I went live today using the "www" domain name. After 10 minutes, I noticed that the website was very slow, and eventually the Docker containers restarted themselves. When I checked the logs, I found the following warnings.
I am also sharing my docker-compose file below.

Has anyone experienced a similar problem, and can you suggest a solution?

Versions:

  • Volto 16.21.3
  • Plone 6.0.5
  • plone.restapi 8.39.1

Logs

`[waitress.queue:114][MainThread] Task queue depth is 90`
`plone_backend.2.5a4tyg4drrdh@ip-172-31-24-182    | 2024-07-03 10:22:58 WARNING [waitress.queue:114][MainThread] Task queue depth is 91`
`plone_backend.2.5a4tyg4drrdh@ip-172-31-24-182    | 2024-07-03 10:22:59 WARNING [waitress.queue:114][MainThread] Task queue depth is 92`
`plone_backend.2.5a4tyg4drrdh@ip-172-31-24-182    | 2024-07-03 10:23:00 WARNING [waitress.queue:114][MainThread] Task queue depth is 93`
`plone_backend.2.5a4tyg4drrdh@ip-172-31-24-182    | 2024-07-03 10:23:00 WARNING [waitress.queue:114][MainThread] Task queue depth is 94`
`plone_backend.2.5a4tyg4drrdh@ip-172-31-24-182    | 2024-07-03 10:23:00 WARNING [waitress.queue:114][MainThread] Task queue depth is 95`
`plone_backend.2.5a4tyg4drrdh@ip-172-31-24-182    | 2024-07-03 10:23:01 WARNING [waitress:277][MainThread] total open connections reached the connection limit, no longer accepting new connections`
`plone_backend.2.5a4tyg4drrdh@ip-172-31-24-182    | 2024-07-03 10:23:01 WARNING [waitress.queue:114][MainThread] Task queue depth is 96`

Docker-compose:

version: "3.3"

services:
  frontend:
    image: intkbv/centraalmuseum-frontend:sha-0e619ca
    environment:
      RAZZLE_INTERNAL_API_PATH: http://backend:8080/Plone
    depends_on:
      - backend
    networks:
      - nw-webserver
      - backend
    ports:
      - 3000:3000
    deploy:
      replicas: 2

  backend:
    image: intkbv/centraalmuseum-backend:sha-de4c000
    environment:
      SITE: Plone
      ZEO_ADDRESS: zeo:8100
    volumes:
      - /var/local/centraalmuseum:/app/logs
      - /var/local/centraalmuseum/import/:/app/import
    networks:
      - nw-webserver
      - backend
    ports:
      - 8080:8080
    deploy:
      replicas: 2

  backend-sync:
    image: intkbv/centraalmuseum-backend:sha-de4c000
    environment:
      SITE: Plone
      ZEO_ADDRESS: zeo:8100
    volumes:
      - /var/local/centraalmuseum:/app/logs
      - /var/local/centraalmuseum/import/:/app/import
    networks:
      - nw-webserver
      - backend
    ports:
      - 8081:8080
    deploy:
      replicas: 1

  zeo:
    image: plone/plone-zeo:latest
    volumes:
      - /var/local/centraalmuseum/data:/data
    networks:
      - backend
      - nw-webserver

volumes:
  vol-traefik-public-certificates:
    driver_opts:
      type: none
      device: /data/traefik/certificates
      o: bind
  vol-traefik-config:
    driver_opts:
      type: none
      device: /data/traefik/config
      o: bind

networks:
  nw-webserver:
    external: true
    driver: overlay
  backend:
    driver: overlay

Cihan Andac via Plone Community wrote at 2024-7-3 11:04 +0000:

...
I have been working on this website project for a while using a subdomain, and until now it worked without any problems. I went live today using the "www" domain name. After 10 minutes, I noticed that the website was very slow, and eventually the Docker containers restarted themselves. When I checked the logs, I found the following warnings.
I am also sharing my docker-compose file below.

Has anyone experienced a similar problem, and can you suggest a solution?
...
Logs

`[waitress.queue:114][MainThread] Task queue depth is 90`
`plone_backend.2.5a4tyg4drrdh@ip-172-31-24-182    | 2024-07-03 10:22:58 WARNING [waitress.queue:114][MainThread] Task queue depth is 91`

Requests are received by waitress (a WSGI server)
and transferred to so-called worker threads for processing.
If no worker is ready, the request is queued.
The "task queue depth" is the number of requests waiting
to be transferred to a worker.

Such a high "task queue depth" indicates a
severe problem with request processing: there are requests
which take a very long time (and maybe never finish).
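
To make the relation between slow requests and queue depth concrete, here is a toy discrete-time model in Python. It is only a sketch and not waitress's actual code: the function name `simulate_queue_depth` and all of its parameters are made up for illustration. It assumes a fixed number of workers and a constant arrival rate.

```python
from collections import deque


def simulate_queue_depth(arrivals_per_tick, workers, service_ticks, ticks):
    """Return the number of waiting requests after each tick.

    Toy model (hypothetical, not waitress internals): every tick a fixed
    number of requests arrive, and each request occupies one worker for
    `service_ticks` ticks.
    """
    waiting = deque()   # requests accepted but not yet handed to a worker
    busy = []           # remaining ticks for each request being processed
    depths = []
    for tick in range(ticks):
        # New requests arrive and are queued.
        for _ in range(arrivals_per_tick):
            waiting.append(tick)
        # Requests with one tick left are done; the others keep their worker.
        busy = [remaining - 1 for remaining in busy if remaining > 1]
        # Free workers take requests from the head of the queue.
        while waiting and len(busy) < workers:
            waiting.popleft()
            busy.append(service_ticks)
        depths.append(len(waiting))
    return depths


if __name__ == "__main__":
    # Fast requests: the workers keep up and nothing waits.
    print(simulate_queue_depth(2, 4, 1, 20))
    # Slow requests: all workers are blocked and the depth keeps growing.
    print(simulate_queue_depth(2, 4, 10, 20))
```

With fast requests the depth stays at zero. Once each request blocks a worker for many ticks, the depth climbs steadily, which is the pattern in the log above.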

To analyse situations like this, I use the "RequestMonitor"
from haufe.requestmonitoring. It supervises request processing
and can log tracebacks for requests taking excessive processing time.
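
The general idea behind such a monitor can be sketched with the Python standard library alone. The following is not the haufe.requestmonitoring API; `SlowRequestWatch` and its methods are hypothetical names for illustration. It assumes the application calls `begin()` and `end()` around each request and relies on `sys._current_frames()` to snapshot the stacks of running threads.

```python
import sys
import threading
import time
import traceback


class SlowRequestWatch:
    """Remember when each thread started its current request (hypothetical helper)."""

    def __init__(self, threshold_seconds):
        self.threshold = threshold_seconds
        self._started = {}              # thread id -> (start time, description)
        self._lock = threading.Lock()

    def begin(self, description):
        # Call at the start of request processing, in the worker thread.
        with self._lock:
            self._started[threading.get_ident()] = (time.monotonic(), description)

    def end(self):
        # Call when the request is finished, in the same worker thread.
        with self._lock:
            self._started.pop(threading.get_ident(), None)

    def report(self, now=None):
        """Return one text report per request running longer than the threshold."""
        now = time.monotonic() if now is None else now
        frames = sys._current_frames()  # snapshot: thread id -> current frame
        with self._lock:
            started = dict(self._started)
        reports = []
        for thread_id, (start, description) in started.items():
            frame = frames.get(thread_id)
            if frame is None or now - start < self.threshold:
                continue
            stack = "".join(traceback.format_stack(frame))
            reports.append(f"{description} running for {now - start:.1f}s\n{stack}")
        return reports
```

A separate monitoring thread could call `report()` every few seconds and write the result to the log. The traceback then shows where a long-running request is stuck.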

Take a look at Volto Waitress Queue

@dieter, Thank you so much for explaining the issue and recommending the tool for monitoring the processes. I will definitely check that.

@wesleybl Thanks for pointing to the related sources. I believe that post recommends setting up a cache server; I will consider this solution.