ZODB out of the pickle jar

A long time ago in a galaxy far, far away

It is a period of agentic coding in the FOSS galaxy. Chats with AI are abused to flood projects with AI slop.

Striking from a cluster of senior developers hidden among the billions of slop-coders of the big corp, agentic craftsmen of the FOSS community won the first victory in a battle with the powerful floods from the outside fleet of coders. The big corp fears that another contribution could bring a thousand more free software projects into the agentic craftsmen's hands, and big corp control over the FOSS communities would be lost forever.

To crush free software once and for all, the big corp is concentrating computing power and draining the markets. Powerful enough to destroy an entire community, its competition spells certain doom for the champions of freedom.

So I got superpowers. As with all powers, they can be used for good or bad. Also, be careful about the source of those powers: whoever controls them controls you. Concentration is a problem, and we need local, distributed models at some point. We are not there yet.

Use the powers for good and support your free software community.

At least for me, agentic coding with Anthropic Claude Opus 4.6 enabled me to try something I never would have without it:

A new PostgreSQL-backed ZODB storage with JSONB transcoding and a replacement for the catalog fully querying PostgreSQL.

I started it a week ago.

It's still somewhere between an early release and experimental for sure, but it works extremely well.

ZODB has served the Zope and Plone community for over two decades. But its storage model — opaque pickle blobs and BTree-based catalog indexes — hasn't aged well. You can't query your data with SQL. You can't inspect object state without unpickling. And the catalog is a black box that lives inside the very database it indexes.

Time to change that. Four modules, one mission — get ZODB out of the pickle jar:

  • zodb-json-codec — A Rust-powered transcoder that turns opaque pickles into queryable JSON. No code execution, no attack surface, just bytes in, JSON out.
  • zodb-pgjsonb — A full ZODB storage that keeps your objects in PostgreSQL JSONB instead of binary blobs. MVCC, undo, history, blobs — the works. Small blobs stay in PostgreSQL, large ones tier out to S3 (optional).
  • plone-pgcatalog — Replaces Plone's ZCatalog with pure PostgreSQL queries. No more BTree indexes in the database — let Postgres do what Postgres does best. Experimental.
  • zodb-s3blobs — Moves ZODB blobs to S3-compatible object storage with a local LRU cache. Works with any base storage.

Together: pickle bytes → JSON → PostgreSQL → queryable, modern, fast.
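To make the "pickle bytes → JSON" step concrete, here is a stdlib-only sketch of the idea. The actual transcoder in zodb-json-codec is a Rust implementation; this just illustrates the safety property using `pickletools`, which statically disassembles a pickle without executing it:

```python
import json
import pickle
import pickletools

# A typical persistent-object state, as ZODB would pickle it.
state = {"title": "Front Page", "portal_type": "Document"}
blob = pickle.dumps(state, protocol=2)

# The raw storage record is opaque bytes -- unreadable, unqueryable:
print(blob[:16])

# pickletools.dis() walks the opcode stream without executing anything,
# which is the same safety property the transcoder relies on:
# bytes in, JSON out, no code execution.
pickletools.dis(blob)

# Once transcoded, the same state is plain JSON -- and stored as
# PostgreSQL JSONB it becomes queryable with SQL.
print(json.dumps(state))
```

The point of the detour through JSON is exactly what L23 complains about: object state stops being a black box and becomes data a database can index.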

Want to try it? Check out

Contributions, feedback, and battle testing are very welcome.

25 Likes

/me checks to see if it is Christmas or April Fool’s Day.

Wow!

2 Likes

Uffff :flushed_face::flushed_face::flushed_face:

OMG, and I was planning to have a slow day today!

Now, let's test it

1 Like

Two things:

  • It seems this repo is not publicly accessible.
  • There is an issue with transaction.savepoint and S3 support. Importing content using plone.exportimport failed with an error message about the savepoint blob not being available (I promise to open a proper issue when I'm back at my place).

Oops, I forgot to publish it. Done!

Alright, I did not test it with plone.exportimport, just created content
TTW. I'll have a look.

1 Like

zodb-s3blobs 1.0.1 released.

loadBlob now checks pending (in-transaction) blobs before the S3/cache lookup and serves them directly, preventing POSKeyError during savepoint commits.

This should solve plone.exportimport problems.
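The fix is an ordering change in the blob lookup. A minimal sketch of the idea (class and attribute names here are illustrative, not the actual zodb-s3blobs API):

```python
class BlobLookup:
    """Resolve a blob path: pending (in-transaction) files first,
    then the local LRU cache, then S3. Sketch only."""

    def __init__(self, pending, cache, s3):
        self.pending = pending   # {(oid, tid): path} staged in this txn
        self.cache = cache       # {(oid, tid): path} local LRU cache
        self.s3 = s3             # remote store; .get() may be slow

    def load_blob(self, oid, tid):
        key = (oid, tid)
        # 1. Blobs staged by the current transaction (e.g. during a
        #    savepoint) are not yet uploaded to S3 -- serve them
        #    directly instead of raising a POSKeyError-style miss.
        if key in self.pending:
            return self.pending[key]
        # 2. Local cache hit avoids a network round-trip.
        if key in self.cache:
            return self.cache[key]
        # 3. Fall back to S3 and populate the cache.
        path = self.s3.get(key)
        if path is None:
            raise KeyError(f"no blob for oid={oid!r} tid={tid!r}")
        self.cache[key] = path
        return path
```

Before the fix, step 1 was missing, so a blob written inside the current transaction looked like a missing key until commit.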

1 Like

Pretty amazing! I wonder if the code really is unencumbered though…

1 Like

Definitely not. It scanned the whole Zope and Plone codebase plus RelStorage and based its plans on the findings.

Maybe I phrased my question badly. What I mean is, who owns that code? Can you, as the person making a PR, state categorically that the contained code does not belong to someone else or isn’t licensed for your inclusion in something else?

Tested with this release and the problem persists. I've created issue #1 with the traceback and my configuration.

1 Like

That's not how models work. There is no copied code; I think that is a misconception that comes up often. The model applies vectors stored in a multidimensional matrix, via inference, in a different context than the one they came from. Even with the same input you would get different output on the next run: goodbye, deterministic computing. The code is boring and straightforward. Nothing you would not expect.

Actually, I cannot guarantee the code doesn't already exist somewhere else, just as I can't when I write code myself. I could produce the exact same lines as someone else, because it's the obvious solution for a problem.

1 Like

Actually, I fixed the wrong bug (I found a related one in zodb-s3blobs), but the staging problem is replicated in zodb-pgjsonb. PR #2 created, please test whether this fixes it.

1 Like

Thanks for fixing it!


:astonished_face: :astonished_face: :astonished_face: :astonished_face: :astonished_face:

Might ruin my wife’s weekend :laughing:

Looks amazing. I’ll try it with our stack ASAP!!

This is very impressive! I love that Pack/GC is supposed to be 25x faster compared to RelStorage. Will try it asap.

Wonderful!

Just a question. In the perf table, it is:

Batch store (100): 1.2× slower (JSONB indexing overhead)

but with plone-pgcatalog this should be absorbed into the cataloguing time, right?

It should, but it needs battle testing. Benchmarks are one thing, real-world project usage is another.
Also, size matters: small sites won't profit as much as large ones.
Next, your PostgreSQL tuning becomes more important than ever. Fast storage, optimized parameters: all things to figure out. That said, I tested against the vanilla official Docker image, no tuning at all.
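For a starting point, the usual PostgreSQL knobs apply. The values below are generic rules of thumb for an SSD-backed host with plenty of RAM, not recommendations shipped with the packages — measure on your own workload:

```ini
# postgresql.conf -- illustrative starting values, tune for your host
shared_buffers = 4GB            # ~25% of RAM is a common rule of thumb
effective_cache_size = 12GB     # planner hint: OS cache + shared_buffers
work_mem = 64MB                 # per sort/hash node; JSONB queries can be memory-hungry
maintenance_work_mem = 1GB      # speeds up VACUUM and index builds
random_page_cost = 1.1          # lower for SSD-backed storage
```

None of this replaces measuring with EXPLAIN ANALYZE against your actual catalog queries.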

1 Like

I did some security hardening and released the packages.

3 Likes

Next request: async ZODB.

1 Like