ZODB out of the pickle jar

@jensens

Again, thanks so much for the effort here!! Super exciting!

I ran it with our stack, and the integration was flawless. It just worked.

Using `getTransferCounts`, I created a simple statistic to count the number of reads and writes in our test suite.

ZODB loads: 372,323

ZODB stores: 414,717
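For others who want to collect the same numbers: a minimal sketch of how such a statistic can be gathered. `StubConnection` here is a stand-in so the snippet is self-contained; on a real ZODB connection you would call `conn.getTransferCounts(clear=True)`, which returns a `(loads, stores)` tuple.

```python
class StubConnection:
    """Stand-in that mimics ZODB's Connection.getTransferCounts()."""

    def __init__(self, loads, stores):
        self._loads = loads
        self._stores = stores

    def getTransferCounts(self, clear=False):
        counts = (self._loads, self._stores)
        if clear:
            # Real ZODB also resets the counters when clear=True.
            self._loads = self._stores = 0
        return counts


def collect_stats(connections):
    """Sum loads/stores across connections, e.g. once per test run."""
    total_loads = total_stores = 0
    for conn in connections:
        loads, stores = conn.getTransferCounts(clear=True)
        total_loads += loads
        total_stores += stores
    return total_loads, total_stores


conns = [StubConnection(100, 40), StubConnection(250, 75)]
print(collect_stats(conns))  # (350, 115)
```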

I can see that the TCP latency to PG (compared to the in-memory ZODB) adds approximately 10%-15% to the runtime of the entire test suite. But that is expected and not an issue at all.

I’ll proceed and run the setup on our K8s cluster with a substantial amount of data.

2 Likes

ZODB's defining feature is transparent persistence - you read/write obj.foo (with foo a persistent object) and ZODB silently loads the object's state from storage via __getattribute__ / _p_activate(). In Python, you cannot await an attribute access. There's no __async_getattribute__. This means the single most important operation in ZODB (ghost -> loaded) is inherently synchronous.
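To make that concrete, here is a toy illustration (not real ZODB code) of transparent activation: a "ghost" object that loads its state from a backing store on first attribute access via `__getattribute__`. Because attribute access cannot be awaited, the load is necessarily synchronous.

```python
# Stand-in for the pickle storage; in real ZODB this would be a
# network/disk round-trip.
FAKE_STORAGE = {"oid-1": {"foo": 42}}


class Ghost:
    def __init__(self, oid):
        object.__setattr__(self, "_oid", oid)
        object.__setattr__(self, "_loaded", False)

    def _p_activate(self):
        # In real ZODB this is a synchronous round-trip to the storage.
        state = FAKE_STORAGE[object.__getattribute__(self, "_oid")]
        for key, value in state.items():
            object.__setattr__(self, key, value)
        object.__setattr__(self, "_loaded", True)

    def __getattribute__(self, name):
        # Transparent activation: load state on first "real" attribute access.
        if not name.startswith("_") and not object.__getattribute__(self, "_loaded"):
            Ghost._p_activate(self)  # there is no way to `await` here
        return object.__getattribute__(self, name)


obj = Ghost("oid-1")
print(obj.foo)  # 42 -- state was loaded transparently on access
```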

A truly async ZODB would require rethinking the programming model - explicit loads instead of transparent activation - which sacrifices the very thing that makes ZODB elegant.

A connection pool of ZODB connections in dedicated threads, fronted by an async API, gives you async application code while keeping ZODB's proven internals intact. Pyramid + ZODB apps do this with gunicorn --worker-class uvicorn.workers.UvicornWorker . It works but it's async-around-ZODB, not native async ZODB.
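The async-around pattern can be sketched in a few lines: the blocking work runs in a worker thread, fronted by a coroutine. `load_object` below is a hypothetical stand-in for a synchronous ZODB load, not a real API.

```python
import asyncio


def load_object(oid):
    """Pretend synchronous storage access (would block the event loop)."""
    return {"oid": oid, "foo": 42}


async def aload_object(oid):
    # Push the blocking call onto the default thread pool, so the event
    # loop stays free while the synchronous ZODB machinery does its work.
    return await asyncio.to_thread(load_object, oid)


async def main():
    obj = await aload_object("oid-1")
    print(obj["foo"])  # 42


asyncio.run(main())
```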

Curious, did you use plone-pgcatalog? It's experimental and still has some rough corners. That said, it works for standard/simple scenarios.

While working on this I found a flaw that led me to create a CMFCore

Having it would help to speed up zodb-pgjsonb / plone-pgcatalog, but also any classic content creation and modification.

And since the solution is that simple, I am pretty sure something is missing and there's a reason for the current behavior. Please comment over there.

1 Like

I did not, but I’m happy to try. The catalog is not super important in our stack, since we use Elasticsearch for the heavy lifting.

As I do in many projects - and I wrote a whole addon stack for this purpose: collective.elastic.plone

But having both in your stack increases the maintenance effort.

My goal with plone-pgcatalog is to change this. PostgreSQL is quite capable of replacing Elasticsearch. And if you don't want to be limited by the catalog's features, you can always build your own queries directly in SQL.

This is…. wow…

1 Like

While at it, I made RelStorage's embedded zodbconvert a standalone tool, zodb-convert (with a dash to avoid conflicts and to confuse you all *eg*).

2 Likes

@jensens

Looking at my first assessment and trying to implement plone-pgcatalog I realized I only applied my “DemoStorage bypass” patch to a subset of tests.

Hint: DemoStorage is entirely in-memory; it was stacked on top of my PGLayer for most tests and is used to roll back the state after every test. To run true e2e tests in this case, you need to patch DemoStorage to route requests to the actual storage.
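The idea of the bypass can be sketched with simplified stand-ins (these are toy classes, not the real ZODB storages): the demo layer normally keeps writes in an in-memory overlay so they can be thrown away after each test, and the patch routes writes through to the base storage instead, so tests hit the real backend.

```python
class ToyBaseStorage:
    """Stand-in for the actual (e.g. PostgreSQL-backed) storage."""

    def __init__(self):
        self.data = {}

    def store(self, oid, value):
        self.data[oid] = value

    def load(self, oid):
        return self.data[oid]


class ToyDemoStorage:
    """Stand-in for DemoStorage: an overlay of changes on top of a base."""

    def __init__(self, base):
        self.base = base
        self.changes = {}

    def store(self, oid, value):
        self.changes[oid] = value  # normally writes never reach the base

    def load(self, oid):
        return self.changes.get(oid, self.base.load(oid))


def patch_demo_passthrough(demo):
    """The 'bypass': route stores straight through to the base storage."""
    demo.store = demo.base.store


base = ToyBaseStorage()
demo = ToyDemoStorage(base)
patch_demo_passthrough(demo)
demo.store("oid-1", "payload")
print(base.data)  # {'oid-1': 'payload'} -- the write reached the base
```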

Well, after I figured that out, the overhead is a bit higher, around 40%, which again is not a problem. I will go back to in-memory DB testing eventually, but for now I think it's a great way to test the integration of the new storage.

This being said, I got 2% failing tests and I'm investigating them right now. I cannot yet tell whether this is an issue with my test setup or with the storage itself.

Once that is resolved, I'll proceed with plone-pgcatalog.

Thanks for pointing out collective.elastic.plone. :eyes:
Maybe you saw this: Collective.elasticsearch ES 8 support, AI and more - I’m not sure if I can just replace ES on my stack. We use many ES features, such as pipelines (data processing), vectors, and sophisticated query- and index-time boosting.

1 Like

@jensens I was able to fix all my errors by fixing a (probably rare) edge case within the ZODB JSON codec.

Merged and released, and a big thanks!

1 Like

Thank you, @jensens! This project looks comparable to the effort of upgrading Plone to support Python 3: the next big step to modernize the stack. Kudos!

I have been doing some tests following the instructions at the plone-pgcatalog repo, and the initial project setup (just using the zodb-pgjsonb package) works as expected.

Nevertheless, each time I run the pgcatalog install, the site breaks with different error messages.

Right now, we are working on a Plone 4 -> Plone 6 (Classic) migration, and today I decided to give the project a try: migrate it to zodb-pgjsonb storage (without any S3 storage setting) and also install plone-pgcatalog.

First things first: without installing plone-pgcatalog, the site works OK. I have an issue with a content type where I am saving external data in an annotation instead of regular fields, and I see that some data that is persisted in the ZODB filestorage is not being persisted in the zodb-pgjsonb storage. The annotation has deeply nested dict and list structures, and some lists are persisted but some dicts are not. I will double-check this, because I may be saving items in the wrong way...

In all other respects the site is working OK: the content views, edit forms, workflows, etc. work as expected.

Second, when I install plone-pgcatalog I start receiving errors like `TypeError: ('Could not adapt', <CatalogSearchResults len=0 actual=0>, <InterfaceClass plone.app.contentlisting.interfaces.IContentListing>)`. We are using content listings here and there, so we may need to add the corresponding adapter from CatalogSearchResults to IContentListing to make those listings work.
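If that diagnosis is right, the missing piece would be an adapter registration. A hypothetical ZCML sketch (the `for` dotted name is an assumption derived from the error message and would need to match wherever pgcatalog defines its `CatalogSearchResults`; the factory is plone.app.contentlisting's existing `ContentListing` class, which wraps a sequence of brains):

```xml
<!-- Hypothetical sketch, not tested configuration: register an adapter so
     IContentListing(results) can be looked up for pgcatalog's result class.
     The `for` path below is a guess and must match the real class. -->
<adapter
    for="plone_pgcatalog.results.CatalogSearchResults"
    provides="plone.app.contentlisting.interfaces.IContentListing"
    factory="plone.app.contentlisting.contentlisting.ContentListing"
    />
```

Whether `ContentListing` can wrap pgcatalog's results directly depends on how closely they mimic classic catalog brains.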

Those are my findings so far. I will keep doing some more tests and report back if I find anything else.

Thank you!

2 Likes

Super nice, Jens!

1 Like

I did some work on it; I'll come up with a post of my own. I tested, pimped, and hardened it a lot. Stay tuned.

2 Likes

Thanks @jensens for updating cookiecutter-zope-instance.
Later today I will update cookieplone-templates to support this as an additional option for persistence.

2 Likes

Can you share what steps you had to do to run your tests with pgcatalog? I tested the catalog and storage in a pretty complex project and found no more problems but I’d love to check if all tests pass.

I’m happy to share the setup I use to let the Plone tests hit the actual Postgres DB and not the DemoStorage.

When I try to plug in pgcatalog, I have various test setup issues. I’m still sorting them out.

Full suite test run - stats via pgAdmin

1 Like

Thanks for the great work @jensens !

I would say that this work is a testament to how well designed ZODB, the ZCatalog, and related Zope packages are.

Do you agree with that or am I stretching it?

1 Like

Neither the ZODB nor the ZCatalog are well-designed components.
The ZODB has always been - a more or less - dumb pickle grave.
My understanding of a database is that you can store and query data.
That's why the ZCatalog has been dumped on top of the ZODB to make data searchable. The separation of indexes is standard, but the concept of brains and metadata is rotten and brain-dead... I have never seen anything like it in a real database. Beyond that, the capabilities of the ZCatalog have always been behind the options of real-world databases... OK, Dieter's AdvancedSomething offered some relief, but not for the architecture and, in particular, the implementation of the catalog.

Well, both lasted very long and served us well. However, this technology became outdated very fast, and both components should have been replaced in Zope and Plone within the first half of the first decade of the new millennium.

2 Likes

Stretching. ZODB was done right: it's pragmatic and solved a problem of its time. The Python code is 90s style; what else to expect?

ZCatalog... well, it does its job, but it should have been replaced by a sane index long ago: many years ago by Solr, in the last 5-10 years by Elasticsearch/OpenSearch. But these days PostgreSQL's features are more than sufficient for 99% of use cases.

1 Like