Zeopack fails raising KeyError

We have a site distributed across two servers: server A runs ZEO as the ZRS master and server B runs ZEO as the ZRS slave.

it seems the zeopack command stopped finishing a couple of weeks ago; I tried it today on both servers and the command fails with a KeyError like this:

$ bin/zeopack 0
Traceback (most recent call last):
  File "bin/zeopack", line 42, in <module>
    sys.exit(plone.recipe.zeoserver.pack.main(host, port, unix, days, username, password, realm, blob_dir, storage))
  File "/opt/plone/example.com/eggs/plone.recipe.zeoserver-1.2.9-py2.7.egg/plone/recipe/zeoserver/pack.py", line 58, in main
    _main(*args, **kw)
  File "/opt/plone/example.com/eggs/plone.recipe.zeoserver-1.2.9-py2.7.egg/plone/recipe/zeoserver/pack.py", line 39, in _main
    cs.pack(wait=True, days=int(days))
  File "/opt/plone/example.com/eggs/ZODB3-3.10.5-py2.7-linux-x86_64.egg/ZEO/ClientStorage.py", line 916, in pack
    return self._server.pack(t, wait)
  File "/opt/plone/example.com/eggs/ZODB3-3.10.5-py2.7-linux-x86_64.egg/ZEO/ServerStub.py", line 155, in pack
    self.rpc.call('pack', t, wait)
  File "/opt/plone/example.com/eggs/ZODB3-3.10.5-py2.7-linux-x86_64.egg/ZEO/zrpc/connection.py", line 768, in call
    raise inst # error raised by server
KeyError: '\x1dx'

any hints?

This looks like a traceback where the ZEO client is re-raising an exception that originally occurred on the server. To find the original traceback you'll need to look in the ZEO server log.
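If the server log is long, a small script can pull out just the error entries. This is only a sketch: the function name error_entries is made up for illustration, and it assumes every log entry starts with a timestamp such as 2017-03-04T17:17:14, with traceback lines following as unprefixed continuation lines.

```python
import re

# Assumed entry prefix, e.g. "2017-03-04T17:17:14 " (adjust to your log format).
TIMESTAMP = re.compile(r"^\d{4}-\d{2}-\d{2}T\d{2}:\d{2}:\d{2} ")


def error_entries(lines):
    """Group log lines into entries and return the ones that mention an error.

    An entry is a timestamped line plus any following lines without a
    timestamp (for example the lines of a traceback).
    """
    entries = []
    current = []
    for raw in lines:
        line = raw.rstrip("\n")
        # A new timestamp closes the entry collected so far.
        if TIMESTAMP.match(line) and current:
            entries.append(current)
            current = []
        current.append(line)
    if current:
        entries.append(current)

    def looks_like_error(entry):
        text = "\n".join(entry).lower()
        return "error" in text or "traceback" in text

    return [entry for entry in entries if looks_like_error(entry)]
```

To use it, pass an open file object for the ZEO server log (the location depends on your buildout) and print each returned entry.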

thanks, David, this is what I found on the master:

2017-03-04T17:02:07 (::ffff: pack(time=1488571327.685818) started...
2017-03-04T17:17:14 (1190) Error raised in delayed method
2017-03-04T17:17:14 (::ffff: disconnected
2017-03-04T18:00:58 Unexpected error
Traceback (most recent call last):
  File "/home/cartacapital/cartacapital.portal.buildout/eggs/ZODB3-3.10.5-py2.7-linux-x86_64.egg/ZODB/ConflictResolution.py", line 234, in tryToResolveConflict
    inst = klass.__new__(klass, *newargs)
TypeError: object.__new__(X): X is not a type object (BadClass)

on the slave:

2017-03-04T17:01:06 (::ffff: pack(time=1488571266.607441) started...
2017-03-04T17:16:00 (33903) Error raised in delayed method

the TypeError doesn't seem to be related to the pack.

is the fsrecover.py script an option here?

I don't know.

This looks like it might be related to trying to resolve a ConflictError when the class that conflicted is not present in the ZEO server's Python environment. Unfortunately the traceback doesn't indicate which class is involved (BadClass is a substitute created by the conflict-resolution code). So you may need to add some debug logging in ConflictResolution.py. Or it may be enough to make sure that your project-specific package is included in the zeoserver's eggs.
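Once you have a candidate class name, one way to check whether the ZEO server's environment can import it is a small helper like the one below, run with an interpreter that has the same eggs on its path as the ZEO server. This is just a sketch: resolve_class is a made-up name, and the dotted names you pass in are whatever your own packages define.

```python
import importlib


def resolve_class(dotted_name):
    """Return the object named by 'package.module.ClassName', or None.

    None means the module could not be imported in this environment,
    or it imported but has no attribute with that name.
    """
    module_name, _, class_name = dotted_name.rpartition(".")
    if not module_name:
        # No dot at all, so there is no module part to import.
        return None
    try:
        module = importlib.import_module(module_name)
    except ImportError:
        return None
    return getattr(module, class_name, None)
```

If this returns None for one of your persistent classes, the package that defines it is probably missing from the zeoserver's eggs.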

I don't know much about the ZODB packing algorithm, so I don't know why that would trigger a ConflictError. If you ask on the ZODB list, Jim may be able to point you in the right direction.

thank you very much, David; I opened a thread on the ZODB list and, according to Jim, it seems some objects are missing from the database.

he suggested adding pack-gc false to the <filestorage> section of the ZEO server configuration to disable garbage collection.
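for reference, the change looks roughly like this in the generated zeo.conf; the section name and the path below are placeholders, so keep whatever your existing section already has and only add the pack-gc line:

```
<filestorage 1>
  # placeholder path; keep the one already in your configuration
  path /path/to/var/filestorage/Data.fs
  # skip garbage collection when packing
  pack-gc false
</filestorage>
```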

I did so, and I was able to pack the ZODB from 9GB down to 7GB.

I was wondering if it's possible to add that to the buildout configuration so I don't lose it on the next update.

Jim also suggested the following packages for trying the garbage collection and seeing whether we can get more information on the missing objects:

but they lack decent end-user documentation; has anybody here used them?

UPDATE: @alert just told me to set pack-gc = false in my ZEO server part, as supported by plone.recipe.zeoserver.
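in buildout terms that would look something like the fragment below; the part name zeoserver is an assumption (use the name of your own ZEO part) and the rest of the part's options stay as they are:

```ini
[zeoserver]
recipe = plone.recipe.zeoserver
# disable garbage collection while packing
pack-gc = false
```

after re-running buildout the setting is written into the generated ZEO configuration, so it is not lost on the next update.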

Here is a good reference on zc.zodbdgc: http://www.zodb.org/en/latest/articles/multi-zodb-gc.html

thank you very much, that document is awesome!