Server and performance monitoring: what are recommended stacks?

I have to track down performance problems on a Plone site.
There are some potential performance bottlenecks on our side, which are discussed in this great blog post: How we got a 10x Performance boost at Radio Free Asia
But I suspect our server setup also contains some bottlenecks (we use a mixture of Docker, ZFS, NFS, QEMU VMs, etc.).

Previously we were using Munin to monitor our servers.

While working on a new monitoring solution I tried out Prometheus:

The results are promising so far. There is a node_exporter project which exposes a lot of useful system metrics, which Prometheus then collects. Prometheus was also specifically built with cloud setups in mind, so Docker integration is available as well.
But there is a lot more work to do: tracking performance of individual services like Varnish, PostgreSQL/RelStorage, haproxy, ... down to some concrete Plone performance metrics.
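
To make the Plone side of this more concrete, here is a rough sketch of what exposing a custom metric could look like. It only uses the Python standard library and renders counters in the Prometheus text exposition format. The class name `RequestTimer`, the metric name `plone_request_duration_seconds` and the `view` label are made up for illustration; none of this is an existing Plone or Prometheus API.

```python
from collections import defaultdict


def escape_label(value):
    """Escape a label value as the Prometheus text format requires."""
    return value.replace("\\", "\\\\").replace('"', '\\"').replace("\n", "\\n")


class RequestTimer:
    """Hypothetical helper: accumulates request count and total time per view."""

    # Assumed metric name, not something Plone ships with.
    METRIC = "plone_request_duration_seconds"

    def __init__(self):
        self.count = defaultdict(int)
        self.total_seconds = defaultdict(float)

    def observe(self, view, seconds):
        """Record one request for `view` that took `seconds` to render."""
        self.count[view] += 1
        self.total_seconds[view] += seconds

    def render(self):
        """Return all recorded values as Prometheus text exposition output."""
        lines = [
            "# HELP %s Time spent rendering a view." % self.METRIC,
            "# TYPE %s summary" % self.METRIC,
        ]
        for view in sorted(self.count):
            label = '{view="%s"}' % escape_label(view)
            lines.append("%s_count%s %d" % (self.METRIC, label, self.count[view]))
            lines.append("%s_sum%s %r" % (self.METRIC, label, self.total_seconds[view]))
        return "\n".join(lines) + "\n"


if __name__ == "__main__":
    timer = RequestTimer()
    timer.observe("folder_listing", 0.25)
    timer.observe("folder_listing", 0.5)
    print(timer.render(), end="")
```

The idea would be to call `observe()` around each request and serve the output of `render()` on some `/metrics` URL that Prometheus scrapes. Dividing the rate of `_sum` by the rate of `_count` then gives an average response time per view.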

So what are other folks using to monitor their server environments and specifically Plone performance?

We are happily using Datadog in two projects, with a minimal layer on top of haufe.requestsomething for monitoring some application-specific data. I think we get the request response times from the Apache and Nginx integrations.

Hey, thanks for the love!

Don't forget about this post also:

:smirk: