Pros and cons of RelStorage

I'm looking for a good overview of the pros and cons of RelStorage. I'd appreciate any guidance or thoughts.

I've found some articles already:

But I'm still looking for a good big picture treatment.

I don't have time to give a detailed explanation.

We use zeoserver with ZRS.

We experienced bad performance replicating and serving blob data out of an RDBMS. Also, ZRS is pretty simple to set up. Replication for PostgreSQL is not as easy, at least not for me. If you don't want your blobs replicated with RelStorage, you need to set up a network share or something similar, which I am also not interested in doing.

So at the end of the day, if you're heavily using blobs then RelStorage is a bad idea?

Sounds like RelStorage at scale may require cloud storage for assets (S3 or similar).

That conclusion is too hasty, David. The scenario Nathan describes is one where you use the RDBMS to replicate not only the primary data but also the blob storage.

Other possible setups:

  • Primary data + blobs in RDBMS without replication. I.e. no real-time failover, but you can still use backups of course.

  • Primary data in RDBMS, blobs in a non-replicated local shared directory. Works only if all your instances are on the same host with access to the shared blob dir.

  • Primary data in RDBMS, blobs on NFS share, non-replicated. Same as above but allows you to run instances on multiple hosts.

  • Primary data in RDBMS, blobs on NFS share with replication. HA (High Availability) variant of the above. You can use replication and even failover for the primary data. For blobs you can do a poor man's replication via cronned rsync; a more advanced option is DRBD.

  • And then there is primary data + blobs in replicated RDBMS, the variant Nathan described as having performance issues on the blob replication.
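
To make the "poor man's replication via cronned rsync" option a bit more concrete, here is a minimal sketch of what such a cron job could run. The paths and the standby host name are made up for illustration, and the function names are hypothetical, not part of RelStorage or Plone. It only assumes rsync is installed and that the primary can reach the standby over SSH.

```python
import subprocess


def build_blob_rsync_command(blob_dir, standby_host, standby_blob_dir):
    """Return the rsync argument list that copies the blob dir to a standby host.

    All three arguments are placeholders for your own setup.
    """
    # A trailing slash on the source tells rsync to copy the directory's
    # contents rather than the directory itself.
    source = blob_dir.rstrip("/") + "/"
    destination = f"{standby_host}:{standby_blob_dir.rstrip('/')}/"
    # --archive preserves permissions, timestamps and the directory layout.
    # No --delete here: an empty or broken primary blob dir should not
    # wipe the copy on the standby.
    return ["rsync", "--archive", source, destination]


def replicate_blobs(blob_dir, standby_host, standby_blob_dir):
    """Run one replication pass; raises CalledProcessError if rsync fails."""
    command = build_blob_rsync_command(blob_dir, standby_host, standby_blob_dir)
    subprocess.run(command, check=True)


# Example call from a script that cron runs every few minutes
# (hypothetical paths and host):
# replicate_blobs("/srv/plone/var/blobstorage",
#                 "standby.example.org",
#                 "/srv/plone/var/blobstorage")
```

Keep in mind that blobs committed since the last run will not be on the standby yet, so after a failover the database may reference blobs that are missing there. That lag is the "poor man's" part of this approach.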

IMO the issue is not about RelStorage, it's about how to achieve highly available, failsafe data replication for Plone, i.e. failsafe clustering. RelStorage without HA is a no-brainer and works like a charm. HA fault tolerance is intrinsically harder, and you can either use ZRS or use RelStorage to gain access to the replication facilities of PostgreSQL or MySQL, with the primary data in the RelStorage RDBMS and the blobs either in the RelStorage RDBMS or on a replicated share.

See also my talk at PloneConf Brazil
(NB that setup cheats on blob replication by using a NetApp MetroCluster...)

Thanks @gyst... this is comprehensive and opens up a lot for consideration. I agree HA is hard, I'm familiar with DRBD so it is something I might want to look back at.