Best practice documentation on ZODB Debugging

A while ago I started writing some documentation about debugging ZODB issues. I never seemed to find an end so I decided to simply publish the current state on our blog:

I crosspost the whole thing here for 2 reasons:

  1. I plan to make the post into a pull-request for the Plone-docs and would like to collect some feedback and suggestions on how to improve it before doing so. I think discussions in blog-posts are stupid, here is a much better place for that.
  2. Many posts in community.plone.org discuss issues that are covered here and I hope it will help to raise visibility.

Beware: Posts here can't be longer than 32.000 characters so I had to edit out some of the post and will post the end as a response...

And now for some light reading:

The problem

The ZODB contains Python objects serialized as pickles (https://docs.python.org/3.8/library/pickle.html). When an object is loaded/used a pickle is deserialized ("unpickled") into a Python object.

A ZODB can contain objects that cannot be loaded. Reasons for that may be:

  • Code could not be loaded that is required to unpickle the object (e.g. removed packages, modules or classes)
  • Objects are referenced but missing from the database (e.g. a blob is missing)
  • The object contains invalid entries (e.g. a reference to an oid that uses a no longer supported format)
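
The first reason can be illustrated without a ZODB at all. The following is only a minimal sketch using the standard library: the pickle bytes and the module name removed_addon.content are made up for illustration. It shows that unpickling fails as soon as the class a pickle refers to can no longer be imported.

```python
import pickle

# A tiny protocol-2 pickle that only refers to a class via the GLOBAL opcode.
# 'removed_addon.content' is a hypothetical module that is not importable.
data = b"\x80\x02cremoved_addon.content\nThanksPage\nq\x00."


def try_load(raw):
    """Return the unpickled value, or the exception raised while loading."""
    try:
        return pickle.loads(raw)
    except Exception as exc:  # broad on purpose, we want to report any error
        return exc


result = try_load(data)
print(type(result).__name__, result)
```

The error class and message printed here have the same shape as the ones zodbverify reports for records whose code was removed.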

The most frequent issues are caused by:

  • Improperly uninstalled or removed packages (e.g. Archetypes, ATContentTypes, CMFDefault, PloneFormGen etc.)
  • Code has changed but not all objects that rely on that code were updated
  • Code was refactored and old imports are no longer working

You should not blame the migration to Python 3 for these issues! Many issues may already exist in the database before the migration but people usually do not run checks to find them. After a migration to Python 3 most people check their database for the first time. This may be because the documentation on python3-migration recommends running the tool zodbverify.

Real problems may be revealed at that point, e.g. when:

  • Packing the Database fails
  • Features fail

You can check your ZODB for problems using the package zodbverify.
To solve each of the issues you need to be able to answer three questions:

  1. Which object is broken and what is the error?
  2. Where is the object and what uses it?
  3. How do I fix it?

In short these approaches to fixing exist:

  1. Ignore the errors
  2. Add zodbupdate mappings
  3. Patch your python-path to work around the errors
  4. Replace broken objects with dummies
  5. Remove broken objects the hard way
  6. Find out what and where broken objects are and then fix or remove them safely

I will mostly focus on the last approach.

But before you spend a lot of time investigating individual errors it is a good idea to deal with the most frequent problems, especially IntIds and Relations (see the chapter "Frequent Culprits" below). In my experience this usually solves most issues.

Find out what is broken

Check your entire database

Use zodbverify (https://github.com/plone/zodbverify) to verify a ZODB by iterating and loading all records. zodbverify is available as a standalone script and as an addon for plone.recipe.zope2instance. Use the newest version!

In the simplest form run it like this:

$ bin/zodbverify -f var/filestorage/Data.fs

It will return:

  • a list of types of errors
  • the number of occurrences
  • all oids that raise that error on loading

zodbverify is only available for Plone 5.2 and later. For older Plone-Versions use the scripts fstest.py and fsrefs.py from the ZODB package:

$ ./bin/zopepy ./parts/packages/ZODB/scripts/fstest.py var/filestorage/Data.fs
$ ./bin/zopepy ./parts/packages/ZODB/scripts/fsrefs.py var/filestorage/Data.fs

The output of zodbverify might look like this abbreviated example from a medium-sized intranet (1GB Data.fs, 5GB blobstorage) that started with Plone 4 on Archetypes and was migrated to Plone 5.2 on Python 3 and Dexterity:

$ ./bin/zodbverify -f var/filestorage/Data.fs

[...]

INFO:zodbverify:Done! Scanned 163955 records.
Found 1886 records that could not be loaded.
Exceptions, how often they happened and which oids are affected:

ModuleNotFoundError: No module named 'Products.Archetypes': 1487
0x0e00eb 0x0e00ee 0x0e00ef [...]

[...]

ModuleNotFoundError: No module named 'Products.ResourceRegistries': 1
0x3b1311

You can see all different types of errors that appear and which objects are causing them. Objects are referenced by their oid in the ZODB. See the Appendix on how to deal with oids.
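
If you save the summary to a file, the oids can be grouped per error with a few lines of Python. This is only a sketch: it assumes the summary format shown above (an error line ending in ": <count>" followed by a line of oids), the sample text is shortened for illustration, and parse_summary is a hypothetical helper that is not part of zodbverify.

```python
import re

# Shortened sample in the format of the zodbverify summary shown above.
SUMMARY = """\
ModuleNotFoundError: No module named 'Products.Archetypes': 3
0x0e00eb 0x0e00ee 0x0e00ef

ModuleNotFoundError: No module named 'Products.ResourceRegistries': 1
0x3b1311
"""

# An error line ends with ': <number of occurrences>'.
HEADER = re.compile(r"^(?P<error>.+): (?P<count>\d+)$")


def parse_summary(text):
    """Map each error message to the list of oids reported for it."""
    result = {}
    current = None
    for line in text.splitlines():
        line = line.strip()
        if not line:
            continue
        match = HEADER.match(line)
        if match:
            current = match.group("error")
            result[current] = []
        elif current is not None:
            result[current].extend(line.split())
    return result


for error, oids in parse_summary(SUMMARY).items():
    print(error, oids)
```

With such a mapping you can pick one error type and feed its oids to zodbverify one by one.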

You can see that among other issues there are still a lot of references to Archetypes and PloneFormGen (I omitted the complete lists) even though both are no longer used in the site.

Before the summary the log dumps a huge list of errors that contain the pickle and the error:

INFO:zodbverify:
Could not process unknown record 0x376b77 (b'\x00\x00\x00\x00\x007kw'):
INFO:zodbverify:b'\x80\x03cProducts.PloneFormGen.content.thanksPage\nFormThanksPage\n[...]'
INFO:zodbverify:Traceback (most recent call last):
  File "/Users/pbauer/workspace/dipf-intranet/src-mrd/zodbverify/src/zodbverify/verify.py", line 62, in verify_record
    class_info = unpickler.load()
  File "/Users/pbauer/.cache/buildout/eggs/ZODB-5.5.1-py3.8.egg/ZODB/_compat.py", line 62, in find_class
    return super(Unpickler, self).find_class(modulename, name)
ModuleNotFoundError: No module named 'Products.PloneFormGen'

Inspecting a single object

In this case the object with the oid 0x376b77 seems to be a FormThanksPage from Products.PloneFormGen. But wait! You deleted all of these, so where in the site is it?

If the offending object is normal content the solution is mostly simple. You can call obj.getPhysicalPath() to find out where it is. More often than not editing and saving will fix the problem. In other cases you might need to copy the content to a new item and delete the broken object.

But usually it is not simply content but something else. Here are some examples:

  • An annotation on an object or the portal
  • A relationvalue in the relation-catalog
  • An item in the IntId catalog
  • An old revision of content in CMFEditions
  • A configuration-entry in portal_properties or in portal_registry

The hardest part is to find out what and where the broken object actually is before removing or fixing it.

The reason for that is that an entry in the ZODB does not know about its parent. Acquisition finds parents with obj.aq_parent but many items are not Acquisition-aware. Only the parents that reference objects know about them.

An object x could be the attribute some_object on object y but you will not see that by inspecting x. Only y knows that x is y.some_object.

A way to work around this is used by the script fsoids.py from ZODB. It allows you to list all incoming and outgoing references to a certain object.

With this you will see that x is referenced by y. With this information you can then inspect the object y and hopefully see how x is set on y.

More often than not y is again not an object in the content-hierarchy but maybe a BTree of sorts, a pattern that is frequently used for effective storage of many items. Then you need to find out the parent of y to be able to fix x.

And so forth. It can take a couple of steps until you end up at an item that can be identified, e.g. portal_properties or RelationCatalog, and that usually only exists once in a database.

To make the process of finding this path less tedious I extended zodbverify in https://github.com/plone/zodbverify/pull/8 with a feature that will show you all parents and their parents in a way that allows you to see where in the tree it is.

Before we look at the path of 0x376b77 we'll inspect the object.

Pass the oid and the debug-flag -D to zodbverify with ./bin/zodbverify -f var/filestorage/Data.fs -o 0x376b77 -D:

$ ./bin/zodbverify -f var/filestorage/Data.fs -o 0x376b77 -D

INFO:zodbverify:Inspecting 0x376b77:
<persistent broken Products.PloneFormGen.content.thanksPage.FormThanksPage instance b'\x00\x00\x00\x00\x007kw'>
INFO:zodbverify:
Object as dict:
{'__Broken_newargs__': (), '__Broken_state__': {'showinsearch': True, [...]}}
INFO:zodbverify:
The object is 'obj'
[2] > /Users/pbauer/workspace/dipf-intranet/src-mrd/zodbverify/src/zodbverify/verify_oid.py(118)verify_oid()
-> pickle, state = storage.load(oid)
(Pdb++)

Even before you use the provided pdb to inspect it you can see that it is of the class persistent broken, a way of the ZODB to give you access to objects even though their class can no longer be imported.

You can now inspect it:

(Pdb++) obj
<persistent broken Products.PloneFormGen.content.thanksPage.FormThanksPage instance b'\x00\x00\x00\x00\x007kw'>
(Pdb++) pp obj.__dict__
{'__Broken_newargs__': (),
 '__Broken_state__': {'_EtagSupport__etag': 'ts34951147.36',
                      [...]
                      'thanksEpilogue': <persistent broken Products.Archetypes.BaseUnit.BaseUnit instance b'\x00\x00\x00\x00\x007k\xaa'>,
                      'title': 'Danke'}}

If you now choose to continue (by pressing c) zodbverify will try to disassemble the pickle. That is very useful for in-depth debugging but out of the scope of this documentation.

Inspect the path of references

Now you know it is broken but you still don't know where this ominous FormThanksPage actually is.

Continue to let zodbverify find the path to the object:

INFO:zodbverify:Building a reference-tree of ZODB...
[...]
INFO:zodbverify:Created a reference-dict for 163955 objects.

INFO:zodbverify:
This oid is referenced by:

INFO:zodbverify:0x376ada BTrees.IOBTree.IOBucket at level 1
INFO:zodbverify:0x28018c BTrees.IOBTree.IOBTree at level 2
INFO:zodbverify:0x280184 five.intid.intid.IntIds at level 3
INFO:zodbverify:0x1e five.localsitemanager.registry.PersistentComponents at level 4
INFO:zodbverify:0x11 Products.CMFPlone.Portal.PloneSite at level 5
INFO:zodbverify:0x01 OFS.Application.Application at level 6
INFO:zodbverify: 8< --------------- >8 Stop at root objects
[...]

You can see from the logged messages that the FormThanksPage is in an IOBucket which again is in an IOBTree which is in an object of the class five.intid.intid.IntIds which is part of the component-registry in the Plone site.

This means there is a reference to a broken object in the IntId tool. How to solve all these is covered below in the chapter "Frequent Culprits".

Decide how and if to fix it

In this case the solution is clear (remove refs to broken objects from the intid tool). But that is only one approach.

Often the solution is not presented like this (the solution to intid was not obvious to me until I spent considerable time to investigate).

The following six options to deal with these problems exist. Spoiler: Option 6 is the best approach in most cases but the others also have valid use-cases.

Option 1: Ignoring the errors

I do that a lot. Especially old databases that were migrated all the way from Plone 2 or 3 up to the current version have issues. If these issues never appear during operation and if clients have no budget or interest in fixing them you can leave them be. As long as they do not hurt you (i.e. you can still pack your database and no features fail) you can choose to ignore them.

At some point later they might appear and it may be a better time to fix them. I spent many hours fixing issues that will never show during operation.

Option 2: Migrating/Fixing a DB with zodbupdate

Use that when a module or class has moved or was renamed.

Docs: https://github.com/zopefoundation/zodbupdate

You can change objects in DB according to rules:

  • When an import has moved use a rename mapping
  • To specify if an object needs to be decoded use a decode mapping

Examples from Zope/src/OFS/__init__.py:

zodbupdate_decode_dict = {
    'OFS.Image File data': 'binary',
    'OFS.Image Image data': 'binary',

    'OFS.Application Application title': 'utf-8',
    'OFS.DTMLDocument DTMLDocument title': 'utf-8',
    [...]
    'OFS.userfolder UserFolder title': 'utf-8',
}

zodbupdate_rename_dict = {
    'webdav.LockItem LockItem': 'OFS.LockItem LockItem',
}

You can specify your own mappings in your own packages.
These mappings need to be registered in setup.py so zodbupdate will pick them up.
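
As a rough sketch of what such a registration might look like: the package name my.package and its classes are hypothetical, and the entry-point group names zodbupdate and zodbupdate.decode should be checked against the zodbupdate docs linked above before you rely on them.

```python
# my/package/__init__.py (hypothetical package)
# Keys use the 'module ClassName' format, decode keys add the attribute name.
zodbupdate_rename_dict = {
    "my.package.old_module Thing": "my.package.new_module Thing",
}

zodbupdate_decode_dict = {
    "my.package.new_module Thing title": "utf-8",
    "my.package.new_module Thing data": "binary",
}

# setup.py of the same package: pass this as setup(..., entry_points=ENTRY_POINTS)
ENTRY_POINTS = {
    "zodbupdate": [
        "renames = my.package:zodbupdate_rename_dict",
    ],
    "zodbupdate.decode": [
        "decodes = my.package:zodbupdate_decode_dict",
    ],
}
```

After reinstalling the package (e.g. by rerunning buildout) zodbupdate should report the loaded rules in its log output.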

Rename mapping example: https://github.com/zopefoundation/Zope/commit/f677ed7

Decode mapping example: https://github.com/zopefoundation/Products.ZopeVersionControl/commit/138cf39

Option 3: Work around with a patch

You can inject a module to work around missing or moved classes or modules.

The usual reason to do this is that the affected items can then be loaded and safely deleted afterwards. Such patches do not hurt your performance.

Examples in __init__.py:

# -*- coding: utf-8 -*-
from OFS.SimpleItem import SimpleItem
from plone.app.upgrade.utils import alias_module
from plone.app.upgrade import bbb
from zope.interface import Interface

import sys


class IBBB(Interface):
    pass


class BBB(object):
    pass


SlideshowDescriptor = SimpleItem


# Interfaces
try:
    from collective.z3cform.widgets.interfaces import ILayer
except ImportError:
    alias_module('collective.z3cform.widgets.interfaces.ILayer', IBBB)

[...]

# SimpleItem
try:
    from collective.easyslideshow.descriptors import SlideshowDescriptor
except ImportError:
    alias_module('collective.easyslideshow.descriptors.SlideshowDescriptor', SlideshowDescriptor)

try:
    from Products.CMFPlone import UndoTool
except ImportError:
    sys.modules['Products.CMFPlone.UndoTool'] = bbb

More: https://github.com/collective/collective.migrationhelpers/blob/master/src/collective/migrationhelpers/patches.py

Plone has plenty of these (see https://github.com/plone/plone.app.upgrade/blob/master/plone/app/upgrade/__init__.py)
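
The mechanism behind such aliases can be sketched with the standard library alone. This is not how alias_module is implemented, just an illustration of the idea under simplified assumptions: gone_addon.content, ThanksPage and the helper alias_class are made-up names. Once a stub module with the expected class is registered in sys.modules, the pickle loads again.

```python
import pickle
import sys
import types

# Pickle that refers to gone_addon.content.ThanksPage (hypothetical names).
data = b"\x80\x02cgone_addon.content\nThanksPage\nq\x00."


class Stub(object):
    """Placeholder standing in for the removed class."""


def alias_class(dotted_module, class_name, replacement):
    """Register stub modules so dotted_module.class_name can be imported."""
    parts = dotted_module.split(".")
    for i in range(1, len(parts) + 1):
        name = ".".join(parts[:i])
        # Only create a stub if the (parent) module does not exist yet.
        sys.modules.setdefault(name, types.ModuleType(name))
    setattr(sys.modules[dotted_module], class_name, replacement)


alias_class("gone_addon.content", "ThanksPage", Stub)
loaded = pickle.loads(data)
print(loaded is Stub)
```

Without the alias_class call the same pickle.loads raises a ModuleNotFoundError.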

Option 4: Replace broken objects with a dummy

If an object is missing (i.e. you get a POSKeyError) or broken beyond repair you can choose to replace it with a dummy.

from persistent import Persistent
from ZODB.utils import p64
import transaction

app = self.context.__parent__
broken_oids = [0x2c0ab6, 0x2c0ab8]

for oid in broken_oids:
    dummy = Persistent()
    dummy._p_oid = p64(oid)
    dummy._p_jar = app._p_jar
    app._p_jar._register(dummy)
    app._p_jar._added[dummy._p_oid] = dummy
transaction.commit()

You should be aware that the missing or broken object will be gone forever after you did this. So before you choose to go down this path you should try to find out what the object in question actually was.

Option 5: Remove broken objects from db

from ZODB.utils import p64
import transaction

app = self.context.__parent__
broken_oids = [0x2c0ab6, 0x2c0ab8]

for oid in broken_oids:
    del app._p_jar[p64(oid)]
transaction.commit()

I'm not sure if that is an acceptable approach under any circumstance since this will remove the pickle but not all references to the object. It will probably lead to POSKeyErrors.

Option 6: Manual fixing

This is how you should deal with most problems.

The way to go

#. Use zodbverify to get all broken objects
#. Pick one error-type at a time
#. Use zodbverify with -o <OID> -D to inspect one object and find out where that object is referenced
#. If you use fsoids.py follow referenced by until you find where in the tree the object lives. zodbverify will try to do it for you.
#. Remove or fix the object (using an upgrade-step, pdb or a rename mapping)

Find out which items are broken

The newest version of zodbverify has a feature that does the same task discussed in "Example 1 of using fsoids.py" for you.
Until it is merged and released you need to use the branch show_references from the pull-request https://github.com/plone/zodbverify/pull/8

When inspecting an individual oid zodbverify builds a dict of all references for reverse-lookup. Then it recursively follows the trail of references to referencing items up to the root. To prevent irrelevant and recursive entries it aborts after level 600 and at some root-objects because these usually reference a lot and would clutter the result with irrelevant information.
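
The idea of the reverse lookup can be sketched on a toy reference map. This is not the code of zodbverify: the forward dict is hand-written (a real tool reads the references from the storage), the oids merely mirror the example chain in this post, and build_backrefs and trail are hypothetical helpers.

```python
from collections import defaultdict

# Toy forward references: oid -> oids it references.
forward = {
    "0x01": ["0x11"],
    "0x11": ["0x1e"],
    "0x1e": ["0x11c278"],
    "0x11c278": ["0x11c284"],
    "0x11c284": ["0x3b1d01"],
    "0x3b1d01": ["0x3b1d06"],
    "0x3b1d06": [],
}


def build_backrefs(forward):
    """Invert the mapping: oid -> set of oids that reference it."""
    backrefs = defaultdict(set)
    for source, targets in forward.items():
        for target in targets:
            backrefs[target].add(source)
    return backrefs


def trail(oid, backrefs, max_level=600):
    """Return (level, referencing oid) pairs walking up towards the root."""
    result = []
    seen = {oid}  # guards against circular references
    current = [oid]
    level = 0
    while current and level < max_level:
        level += 1
        parents = []
        for child in current:
            for parent in sorted(backrefs.get(child, ())):
                if parent not in seen:
                    seen.add(parent)
                    parents.append(parent)
                    result.append((level, parent))
        current = parents
    return result


backrefs = build_backrefs(forward)
for level, parent in trail("0x3b1d06", backrefs):
    print(parent, "at level", level)
```

The seen set and the level limit play the role of the abort conditions described above.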

The output should give you a pretty good idea where in the object-tree an item is actually located, how to access and fix it.

If 0x3b1d06 is the broken oid inspect it with zodbverify:

$ ./bin/instance zodbverify -o 0x3b1d06 -D

2020-08-24 12:19:32,441 INFO    [Zope:45][MainThread] Ready to handle requests
2020-08-24 12:19:32,442 INFO    [zodbverify:222][MainThread]
The object is 'obj'
The Zope instance is 'app'
[4] > /Users/pbauer/workspace/dipf-intranet/src-mrd/zodbverify/src/zodbverify/verify_oid.py(230)verify_oid()
-> pickle, state = storage.load(oid)

(Pdb++) obj
<BTrees.OIBTree.OITreeSet object at 0x110b97ac0 oid 0x3b1d06 in <Connection at 10c524040>>

(Pdb++) pp [i for i in obj]
[<InterfaceClass OFS.EtagSupport.EtagBaseInterface>,
 [...]
 <class 'webdav.interfaces.IDAVResource'>,
 <InterfaceClass plone.dexterity.interfaces.IDexterityContent>,
 <InterfaceClass plone.app.relationfield.interfaces.IDexterityHasRelations>,
 [...]
 <SchemaClass plone.supermodel.model.Schema>]

The problem now is that obj has no __parent__ so you have no way of knowing what you're actually dealing with.

When you press c for continue zodbverify will proceed and load the pickle:

(Pdb++) c
2020-08-24 12:20:50,784 INFO    [zodbverify:68][MainThread]
Could not process <class 'BTrees.OIBTree.OITreeSet'> record 0x3b1d06 (b'\x00\x00\x00\x00\x00;\x1d\x06'):
2020-08-24 12:20:50,784 INFO    [zodbverify:69][MainThread] b'\x80\x03cBTrees.OIBTree\nOITreeSet\n[...]'
2020-08-24 12:20:50,786 INFO    [zodbverify:70][MainThread] Traceback (most recent call last):
  File "/Users/pbauer/workspace/dipf-intranet/src-mrd/zodbverify/src/zodbverify/verify.py", line 64, in verify_record
    unpickler.load()
  File "/Users/pbauer/.cache/buildout/eggs/ZODB-5.5.1-py3.8.egg/ZODB/_compat.py", line 62, in find_class
    return super(Unpickler, self).find_class(modulename, name)
ModuleNotFoundError: No module named 'webdav.interfaces'; 'webdav' is not a package

    0: \x80 PROTO      3
    2: (    MARK
    3: c        GLOBAL     'OFS.EtagSupport EtagBaseInterface'
   38: q        BINPUT     1
   40: c        GLOBAL     'Acquisition.interfaces IAcquirer'
   74: q        BINPUT     2
   76: c        GLOBAL     'plone.app.dexterity.behaviors.discussion IAllowDiscussion'
 [...]
 2503: q        BINPUT     55
 2505: c        GLOBAL     'plone.dexterity.schema.generated Plone_0_Document'
 2556: q        BINPUT     56
 2558: c        GLOBAL     'plone.supermodel.model Schema'
 2589: q        BINPUT     57
 2591: t        TUPLE      (MARK at 2)
 2592: q    BINPUT     58
 2594: \x85 TUPLE1
 2595: q    BINPUT     59
 2597: \x85 TUPLE1
 2598: q    BINPUT     60
 2600: \x85 TUPLE1
 2601: q    BINPUT     61
 2603: .    STOP
highest protocol among opcodes = 2

If you are into this you can read the pickle now :slight_smile:
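
If you only want to know which classes and interfaces a pickle refers to, the standard library module pickletools can list them without importing anything. A minimal sketch: the sample bytes are a hand-made, much shorter pickle than the record above, and referenced_globals is a hypothetical helper. It only looks at the GLOBAL opcode used by the older pickle protocols, and it stops at the first STOP opcode, so for a complete ZODB record (class information followed by the state) you would have to continue reading after that position.

```python
import pickletools


def referenced_globals(data):
    """Return the 'module name' strings of all GLOBAL opcodes in a pickle."""
    return [
        arg
        for opcode, arg, pos in pickletools.genops(data)
        if opcode.name == "GLOBAL"
    ]


# Hand-made protocol-2 pickle of a tuple of two globals.
data = (
    b"\x80\x02("
    b"cOFS.EtagSupport\nEtagBaseInterface\nq\x01"
    b"cwebdav.interfaces\nIDAVResource\nq\x02"
    b"tq\x03."
)

for name in referenced_globals(data):
    print(name)
```

Comparing this list with what can actually be imported quickly shows which entry breaks the record.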

If you press c again zodbverify will build the reference-tree for this object and inspect it for you:

(Pdb++) c
2020-08-24 12:22:42,596 INFO    [zodbverify:234][MainThread] ModuleNotFoundError: No module named 'webdav.interfaces'; 'webdav' is not a package: 0x3b1d06
2020-08-24 12:22:42,597 INFO    [zodbverify:43][MainThread] Building a reference-tree of ZODB...
2020-08-24 12:22:42,964 INFO    [zodbverify:60][MainThread] Objects: 10000
2020-08-24 12:22:44,167 INFO    [zodbverify:60][MainThread] Objects: 20000
[...]
2020-08-24 12:22:48,665 INFO    [zodbverify:60][MainThread] Objects: 150000
2020-08-24 12:22:48,923 INFO    [zodbverify:60][MainThread] Objects: 160000
2020-08-24 12:22:49,037 INFO    [zodbverify:61][MainThread] Created a reference-dict for 163955 objects.

2020-08-24 12:22:49,386 INFO    [zodbverify:182][MainThread] Save reference-cache as /Users/pbauer/.cache/zodbverify/zodb_references_0x03d7f331f3692266.json
2020-08-24 12:22:49,424 INFO    [zodbverify:40][MainThread] The oid 0x3b1d06 is referenced by:

0x3b1d06 (BTrees.OIBTree.OITreeSet) is referenced by 0x3b1d01 (BTrees.OOBTree.OOBucket) at level 1
0x3b1d01 (BTrees.OOBTree.OOBucket) is referenced by 0x11c284 (BTrees.OOBTree.OOBTree) at level 2
0x11c284 (BTrees.OOBTree.OOBTree) is _reltoken_name_TO_objtokenset for 0x11c278 (z3c.relationfield.index.RelationCatalog) at level 3
0x11c278 (z3c.relationfield.index.RelationCatalog) is relations for 0x1e (five.localsitemanager.registry.PersistentComponents) at level 4
0x1e (five.localsitemanager.registry.PersistentComponents) is referenced by 0x11 (Products.CMFPlone.Portal.PloneSite) at level 5
0x11 (Products.CMFPlone.Portal.PloneSite) is Plone for 0x01 (OFS.Application.Application) at level 6
8< --------------- >8 Stop at root objects
[...]

From this output you can find out that the broken object is (surprise) an item in the RelationCatalog of zc.relation. See the chapter "Frequent Culprits" for information on how to deal with these.

Example 1 of using fsoids.py

In this and the next example I will use the script fsoids.py to find out where a broken object actually sits so I can remove or fix it. The easier approach is to use zodbverify but I discuss this approach here since it was your best option until I extended zodbverify and since it might help you to understand the way references work in the ZODB.

$ ./bin/zodbverify -f var/filestorage/Data.fs

INFO:zodbverify:Done! Scanned 120797 records.
Found 116 records that could not be loaded.
Exceptions and how often they happened:
AttributeError: Cannot find dynamic object factory for module plone.dexterity.schema.generated: 20
AttributeError: module 'plone.app.event.interfaces' has no attribute 'IEventSettings': 3
ModuleNotFoundError: No module named 'Products.ATContentTypes': 4
ModuleNotFoundError: No module named 'Products.Archetypes': 5
ModuleNotFoundError: No module named 'Products.CMFDefault': 20
[...]
ModuleNotFoundError: No module named 'webdav.EtagSupport'; 'webdav' is not a package: 16

Follow the white rabbit...

./bin/zopepy ./parts/packages/ZODB/scripts/fsoids.py var/filestorage/Data.fs 0x35907d

oid 0x35907d BTrees.OIBTree.OISet 1 revision
    tid 0x03c425bfb4d8dcaa offset=282340 2017-12-15 10:07:42.386043
        tid user=b'Plone xxx@xxx.de'
        tid description=b'/Plone/it-service/hilfestellungen-anleitungen-faq/outlook/content-checkout'
        new revision BTrees.OIBTree.OISet at 282469
    tid 0x03d3e83a045dd700 offset=421126 2019-11-19 15:54:01.023413
        tid user=b''
        tid description=b''
        referenced by 0x35907b BTrees.OIBTree.OITreeSet at 911946038

[...]

Follow referenced by ...

./bin/zopepy ./parts/packages/ZODB/scripts/fsoids.py var/filestorage/Data.fs 0x35907b

[...]
referenced by 0x3c5790 BTrees.OOBTree.OOBucket


./bin/zopepy ./parts/packages/ZODB/scripts/fsoids.py var/filestorage/Data.fs 0x3c5790

[...]
referenced by 0x11c284 BTrees.OOBTree.OOBTree
[...]

./bin/zopepy ./parts/packages/ZODB/scripts/fsoids.py var/filestorage/Data.fs 0x11c284

[...]
referenced by 0x3d0bd6 BTrees.OOBTree.OOBucket
[...]

./bin/zopepy ./parts/packages/ZODB/scripts/fsoids.py var/filestorage/Data.fs 0x3d0bd6

[...]
referenced by 0x11c278 z3c.relationfield.index.RelationCatalog
[...]

Found it!!!!!

Example 2 of using fsoids.py

In this example zodbverify found a trace of Products.PloneFormGen even though you think you safely uninstalled the addon (e.g. using https://github.com/collective/collective.migrationhelpers/blob/master/src/collective/migrationhelpers/addons.py#L11)

Then find out where it exists in the tree by following the trail of items that reference it:

./bin/zopepy ./parts/packages/ZODB/scripts/fsoids.py var/filestorage/Data.fs 0x372d00
oid 0x372d00 Products.PloneFormGen.content.thanksPage.FormThanksPage 1 revision
    tid 0x03d3e83a045dd700 offset=421126 2019-11-19 15:54:01.023413
        tid user=b''
        tid description=b''
        new revision Products.PloneFormGen.content.thanksPage.FormThanksPage at 912841984
        referenced by 0x372f26 BTrees.OIBTree.OIBucket at 912930339
        references 0x372e59 Products.Archetypes.BaseUnit.BaseUnit at 912841984
        references 0x372e5a Products.Archetypes.BaseUnit.BaseUnit at 912841984
        [...]
    tid 0x03d40a3e52a41633 offset=921078960 2019-11-25 17:02:19.368976
        tid user=b'Plone pbauer'
        tid description=b'/Plone/rename_file_ids'
        referenced by 0x2c1b51 BTrees.IOBTree.IOBucket at 921653012

Follow referenced by until you find something...

./bin/zopepy ./parts/packages/ZODB/scripts/fsoids.py var/filestorage/Data.fs 0x2c1b51
oid 0x2c1b51 BTrees.IOBTree.IOBucket 1 revision
    [...]

Here I skip the trail of referenced by until I find 0x280184 five.intid.intid.IntIds:

./bin/zopepy ./parts/packages/ZODB/scripts/fsoids.py var/filestorage/Data.fs 0x280184
oid 0x280184 five.intid.intid.IntIds 1 revision
    tid 0x03d3e83a045dd700 offset=421126 2019-11-19 15:54:01.023413
        tid user=b''
        tid description=b''
        new revision five.intid.intid.IntIds at 8579054
        references 0x28018c <unknown> at 8579054
        references 0x28018d <unknown> at 8579054
    tid 0x03d3e90c4d3aed55 offset=915868610 2019-11-19 19:24:18.100824
        tid user=b' adminstarzel'
        tid description=b'/Plone/portal_quickinstaller/installProducts'
        [...]

That is the IntId-Catalog from zope.intid. The problem seems to be that, similar to the zc.relation catalog, references to broken objects stay in the catalog and need to be removed manually.

Here is an example of how to remove all broken objects from the catalog in a pdb-session:

(Pdb++) from zope.intid.interfaces import IIntIds
(Pdb++) from zope.component import getUtility
(Pdb++) intid = getUtility(IIntIds)
(Pdb++) broken_keys = [i for i in intid.ids if 'broken' in repr(i.object)]
(Pdb++) for broken_key in broken_keys: intid.unregister(broken_key)
(Pdb++)
(Pdb++) import transaction
(Pdb++) transaction.commit()

After packing the DB the problem is gone. \o/

Other Options

Use zodbbrowser to inspect the ZODB.
It is a Zope3 app to navigate a ZODB in a browser.
At least I had problems getting it to run with a Plone-ZODB.

Use zc.zodbdgc
This tool can validate distributed databases by starting at their root and traversing to make sure all referenced objects are reachable.
Optionally, a database of reference information can be generated.

Use collective.zodbdebug
A great tool to build and inspect reference-maps and backreference-maps of a ZODB.
So far it does not work with Python 3 yet.
Some of its features are also part of zodbverify.

Frequent Culprits

IntIds and Relations

The IntId-Tool and the relation-catalog are by far the most frequent issues, especially if you migrated from Archetypes to Dexterity.

There may be a lot of RelationValues in these tools that still reference objects that cannot be loaded if these objects were not properly removed.

The following code from collective.relationhelpers cleans up the IntId- and Relation-catalog but keeps relations intact. For large sites it may take a while to run because it also needs to recreate linkintegrity-relations.

from collective.relationhelpers.api import cleanup_intids
from collective.relationhelpers.api import purge_relations
from collective.relationhelpers.api import restore_relations
from collective.relationhelpers.api import store_relations

def remove_relations(context=None):
    # store all relations in an annotation on the portal
    store_relations()
    # empty the relation-catalog
    purge_relations()
    # remove all relationvalues and refs to broken objects from intid
    cleanup_intids()
    # recreate all relations from an annotation on the portal
    restore_relations()

For details see https://github.com/collective/collective.relationhelpers/blob/master/src/collective/relationhelpers/api.py

Annotations

Many addons and features in Plone store data in Annotations on the portal or on content.

It's a good idea to check IAnnotations(portal).keys() after a migration for annotations that you can safely remove.

Here is an example where wicked (the now-removed wiki-style-editing feature of Plone) stored its settings in an annotation:

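from plone import api
from zope.annotation.interfaces import IAnnotations

def cleanup_wicked_annotation(context=None):
    portal = api.portal.get()
    ann = IAnnotations(portal)
    if 'plone.app.controlpanel.wicked' in ann:
        del ann['plone.app.controlpanel.wicked']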

Another example is files from failed uploads stored by plone.formwidget.namedfile in an annotation:

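from plone import api
from zope.annotation.interfaces import IAnnotations

def cleanup_upload_annotation(context=None):
    # remove traces of aborted uploads
    portal = api.portal.get()
    ann = IAnnotations(portal)
    if ann.get('file_upload_map', None) is not None:
        # iterate over a copy of the keys because entries are deleted in the loop
        for uuid in list(ann['file_upload_map'].keys()):
            del ann['file_upload_map'][uuid]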

TODO

Finish, merge and document https://github.com/plone/zodbverify/pull/8

Appendix

Migrating a ZODB from py2 to py3

Since people often encounter issues with their ZODB after migrating here is a quick dive into migrating a ZODB from Python 2 to Python 3.

The migration is basically calling the script zodbupdate in py3 with the parameter --convert-py3.

$ ./bin/zodbupdate --convert-py3

You need to pass it the location of the database, the default encoding (utf8) and a fallback encoding (latin1) for items where decoding to utf8 fails.

Example:

$ ./bin/zodbupdate --convert-py3 --file=var/filestorage/Data.fs --encoding=utf8 --encoding-fallback latin1

Updating magic marker for var/filestorage/Data.fs
Ignoring index for /Users/pbauer/workspace/projectx/var/filestorage/Data.fs
Loaded 2 decode rules from AccessControl:decodes
Loaded 12 decode rules from OFS:decodes
[...]
Committing changes (#1).

After that you should be able to use your ZODB in Python 3.
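
What a fallback encoding means can be shown in a few lines. This is only an illustration of the concept, not the implementation used by zodbupdate, and decode_with_fallback is a hypothetical helper: a byte string is first decoded as utf8 and only if that fails as latin1.

```python
def decode_with_fallback(data, encoding="utf-8", fallback="latin-1"):
    """Decode bytes with the main encoding, use the fallback if that fails."""
    try:
        return data.decode(encoding)
    except UnicodeDecodeError:
        return data.decode(fallback)


# The same word stored once as utf8 and once as latin1 bytes.
print(decode_with_fallback(b"Stra\xc3\x9fe"))
print(decode_with_fallback(b"Stra\xdfe"))
```

Since latin1 can decode any byte sequence, the fallback never fails, but it may produce wrong characters if the data was actually stored in yet another encoding.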

The process in a nutshell:

#. First, run bin/zodbupdate -f var/filestorage/Data.fs
So no python3 convert stuff yet!
This will detect and apply several explicit and implicit rename rules.

#. Then run bin/instance zodbverify.
If this still gives warnings or exceptions, you may need to define more rules and apply them with zodbupdate.
But you can still choose to migrate to py3 if this shows errors.

#. Using Python 3 run bin/zodbupdate --convert-py3 --file=var/filestorage/Data.fs --encoding utf8

#. For good measure, on Python 3 run bin/instance zodbverify.

Read the docs: https://docs.plone.org/manage/upgrading/version_specific_migration/upgrade_zodb_to_python3.html

See also: Zodbverify: Porting Plone with ZopeDB to Python3

Dealing with oids

Transforming oids from int to hex and text and vice versa:

>>> from ZODB.utils import p64
>>> oid = 0x2c0ab6
>>> p64(oid)
b'\x00\x00\x00\x00\x00,\n\xb6'

>>> from ZODB.utils import oid_repr
>>> oid = b'\x00\x00\x00\x00\x00,\n\xb6'
>>> oid_repr(oid)
'0x2c0ab6'

>>> from ZODB.utils import repr_to_oid
>>> oid = '0x2c0ab6'
>>> repr_to_oid(oid)
b'\x00\x00\x00\x00\x00,\n\xb6'
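
If you do not have ZODB importable (e.g. when looking at log output on another machine), roughly the same conversions can be sketched with the standard library. The helper names below are made up and the functions are stand-ins for illustration, not the ZODB.utils functions themselves.

```python
import binascii
import struct


def int_to_oid(value):
    """Pack an integer into the 8-byte big-endian form used for oids."""
    return struct.pack(">Q", value)


def oid_to_int(oid):
    """Unpack an 8-byte oid into an integer."""
    return struct.unpack(">Q", oid)[0]


def oid_to_hex(oid):
    """Render an 8-byte oid like '0x2c0ab6' (leading zero bytes dropped)."""
    as_hex = binascii.hexlify(oid).decode("ascii").lstrip("0")
    if len(as_hex) % 2:
        # keep two hex digits per byte, e.g. '0x01' instead of '0x1'
        as_hex = "0" + as_hex
    return "0x" + (as_hex or "00")


def hex_to_oid(text):
    """Turn a string like '0x2c0ab6' back into the 8-byte oid."""
    return int_to_oid(int(text, 16))


oid = int_to_oid(0x2C0AB6)
print(oid, oid_to_hex(oid), oid_to_int(oid))
```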

Get a path for blobs:

from ZODB.blob import BushyLayout
from ZODB.utils import p64

def blob_path(oid):
    # accepts an int (e.g. 0x2c0ab6) or an 8-byte oid
    if isinstance(oid, int):
        oid = p64(oid)
    return BushyLayout.oid_to_path(None, oid)

Load an object by oid from ZODB in a pdb-prompt:

oid = 0x2c0ab6
from ZODB.utils import p64
app = self.context.__parent__
obj = app._p_jar.get(p64(oid))

Links

Quite a lot of people from the Plone/Zope communities have written about this issue. I learned a lot from these posts:

Thank you Philip,
very nice article.

I wonder if there is something like fstest.py and fsoids.py but for relstorage?

I think it should be rather easy to make zodbverify compatible with relstorage.

Until that is done you could simply transform your database to Filestorage with zodbconvert to be able to use it with zodbverify or the other scripts. See https://relstorage.readthedocs.io/en/latest/zodbconvert.html

A config I used recently is:

<relstorage source>
    shared-blob-dir false
    blob-dir ./blobs
    <postgresql>
        dsn dbname=foo user=bar host=localhost password=baz port=5432
    </postgresql>
</relstorage>

# Convert to Filestorage with blobs
<blobstorage destination>
  blob-dir ./var/blobstorage
  <filestorage>
    path ./var/filestorage/Data.fs
  </filestorage>
</blobstorage>

Call it with ./bin/zodbconvert convert_to_datafs.conf. Converting is really fast and you can easily convert back and forth.

Thank you for quick reply,
I had been thinking about converting but was hoping that there is a better way to check the database.

Thanks @pbauer for doing this great documentation! :bowing_man:

The code snippet to remove an oid though does not seem to work, at least for me :confused:

./bin/instance debug

from zope.component.hooks import setSite
from ZODB.utils import p64

setSite(app.Plone)
broken_oids = [0x0379663b, 0x0945d5bb]
del app._p_jar[p64(broken_oids[0])]
del app._p_jar[p64(broken_oids[1])]

I get a

TypeError: 'Connection' object does not support item deletion

any ideas? :man_shrugging:

I'm trying to delete the offending oid as running ./bin/zodbverify -f Data.fs -o 0x0379663b -D

Yields

INFO:zodbverify:The oid 0x0379663b is referenced by:

i.e. no object seems to reference this oid :thinking:

@gforcada: Yes, option 5 is probably a terrible idea anyway. The point is that during packing the database any objects that are not referenced should be deleted anyway.
What beats me though is why zodbverify tells you that the oid is not referenced. Did you by any chance forget to pack the database before running zodbverify?

Oh, that was it :man_facepalming: the oids were pointing to a couple of plone registry entries. So I deleted them, but a re-run on zodbverify was still reporting them and then without any other object referencing them.

I just packed and re-ran zodbverify and the oids were no longer reported :tada:

Maybe you should update the blog post to add a note about that: if you make changes and delete/fix stuff on the database, be sure to pack to ensure that the stray oids are gone. :+1:
