Upgrade to Plone 5.0.4 from 4.3.9 fails due to zombie pdfpeek

Long ago, I made the mistake of trying out the collective.pdfpeek add-on. I have never been able to fully remove it, and it has plagued every Plone upgrade since.

I recently added wildcard.fixpersistentutilities to my Plone 4 site, and after messing around with it (removing anything with "pdfpeek" in its name), I got to the point where my Plone 4 site would load without the collective.pdfpeek product installed. It seemed like I had fully eradicated it!

However, I then imported a copy of that cluster's filestorage and blobstorage into a new Plone 5 instance, and tried to run the migration for this site. It fails with the following error:

Upgrade aborted. Error:
Traceback (most recent call last):
  File "/opt/plone/buildout-cache/eggs/Products.CMFPlone-5.0.4-py2.7.egg/Products/CMFPlone/MigrationTool.py", line 268, in upgrade
    step['step'].doStep(setup)
  File "/opt/plone/buildout-cache/eggs/Products.GenericSetup-1.8.2-py2.7.egg/Products/GenericSetup/upgrade.py", line 166, in doStep
    self.handler(tool)
  File "/opt/plone/buildout-cache/eggs/plone.app.upgrade-1.3.24-py2.7.egg/plone/app/upgrade/v40/alphas.py", line 377, in cleanUpSkinsTool
    transaction.savepoint(optimistic=True)
  File "/opt/plone/buildout-cache/eggs/transaction-1.1.1-py2.7.egg/transaction/_manager.py", line 101, in savepoint
    return self.get().savepoint(optimistic)
  File "/opt/plone/buildout-cache/eggs/transaction-1.1.1-py2.7.egg/transaction/_transaction.py", line 260, in savepoint
    self._saveAndRaiseCommitishError() # reraises!
  File "/opt/plone/buildout-cache/eggs/transaction-1.1.1-py2.7.egg/transaction/_transaction.py", line 257, in savepoint
    savepoint = Savepoint(self, optimistic, *self._resources)
  File "/opt/plone/buildout-cache/eggs/transaction-1.1.1-py2.7.egg/transaction/_transaction.py", line 690, in __init__
    savepoint = savepoint()
  File "/opt/plone/buildout-cache/eggs/ZODB3-3.10.5-py2.7-linux-x86_64.egg/ZODB/Connection.py", line 1123, in savepoint
    self._commit(None)
  File "/opt/plone/buildout-cache/eggs/ZODB3-3.10.5-py2.7-linux-x86_64.egg/ZODB/Connection.py", line 623, in _commit
    self._store_objects(ObjectWriter(obj), transaction)
  File "/opt/plone/buildout-cache/eggs/ZODB3-3.10.5-py2.7-linux-x86_64.egg/ZODB/Connection.py", line 658, in _store_objects
    p = writer.serialize(obj)  # This calls __getstate__ of obj
  File "/opt/plone/buildout-cache/eggs/ZODB3-3.10.5-py2.7-linux-x86_64.egg/ZODB/serialize.py", line 422, in serialize
    return self._dump(meta, obj.__getstate__())
  File "/opt/plone/buildout-cache/eggs/ZODB3-3.10.5-py2.7-linux-x86_64.egg/ZODB/serialize.py", line 431, in _dump
    self._p.dump(state)
PicklingError: Can't pickle <class 'collective.pdfpeek.async.IQueue'>: import of module collective.pdfpeek.async failed
End of upgrade path, main migration has finished.
The upgrade path did NOT reach current version.
Migration has failed

Why does Plone still think collective.pdfpeek is in use? I removed all traces I could find of it in the Plone 4 site. Where is it hiding? How can I remove it? Has anyone else had this problem?

Put a pdb breakpoint into the _dump() method and inspect the object that is about to be pickled.

-aj
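
If stepping through pdb by hand gets tedious, a non-interactive alternative is to walk the state and report which parts refuse to pickle. This is only a sketch: `find_unpicklable` is a hypothetical helper, and it uses plain `pickle.dumps`, whereas ZODB pickles persistent sub-objects as references, so results on real ZODB state may differ.

```python
import pickle


def find_unpicklable(obj, path="state"):
    """Return (path, error) pairs for the smallest parts of obj that fail to pickle."""
    try:
        pickle.dumps(obj)
        return []
    except Exception as exc:
        error = exc
    # Only dicts, lists and tuples are descended into; anything else is a leaf.
    if isinstance(obj, dict):
        children = []
        for key, value in obj.items():
            children.append((key, "%s<key %r>" % (path, key)))
            children.append((value, "%s[%r]" % (path, key)))
    elif isinstance(obj, (list, tuple)):
        children = [(item, "%s[%d]" % (path, i)) for i, item in enumerate(obj)]
    else:
        children = []
    problems = []
    for child, child_path in children:
        problems.extend(find_unpicklable(child, child_path))
    # If no child is to blame, the object itself is the culprit.
    return problems or [(path, error)]


# Made-up state: a lambda cannot be pickled by reference.
state = {"ok": 1, "bad": (lambda: None, u"", None)}
for problem_path, problem in find_unpicklable(state):
    print(problem_path, problem)
```

From the pdb prompt inside _dump() you could call it on the `state` argument to narrow down which key or value is the offender.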

This is an interface.
Relation catalogs store Interface objects.
Apart from that, what zopyx says makes sense.

And storing interfaces persistently is evil, because it turns upgrades, package removals and so on into one painful headache. It would be much better to store just the dotted path, resolve it when needed, and handle the error politely if that fails.
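
To make that concrete, here is a minimal sketch of the dotted-path idea. `resolve_dotted_name` is a hypothetical helper name, not an existing Plone or zope.interface API; it relies only on `importlib` from the standard library and returns a default instead of raising when the package has been removed.

```python
import importlib


def resolve_dotted_name(dotted_name, default=None):
    """Resolve 'package.module.Name' to an object, or return default on failure."""
    module_name, _, attr_name = dotted_name.rpartition(".")
    if not module_name:
        return default
    try:
        module = importlib.import_module(module_name)
    except ImportError:
        # The package is gone: degrade politely instead of breaking the caller.
        return default
    return getattr(module, attr_name, default)


# The stored value is just a string, so it survives removal of the add-on.
# 'no_such_addon_xyz' is a made-up package name used for illustration.
stored = "no_such_addon_xyz.async.IQueue"
iface = resolve_dotted_name(stored)
if iface is None:
    print("interface %s is no longer importable, skipping" % stored)
```

The point of the design is that the database only ever holds a string, so nothing in it depends on the add-on still being importable.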

The Zope Base package used by the plone relations package says so too.

If it gets 'too complicated' to get rid of it, you could try:

Thanks all for the suggestions. I haven't had time to get back to this yet, but when I do, I guess I'll see if I can figure out what is being pickled here and where it comes from.

(Fwiw, I already tried using wildcard.fixmissing to remove this interface. This is after having done that.)

@nturner Maybe you can check some of the ideas in the topic "We need to stop using an add-on that registered a browserlayer. Which is the best way to do it and avoid pickling errors between upgradeSteps?", especially the "module alias" hack.

This kind of hack is not exactly "pretty", but at least it lets you try the upgrade to Plone 5.
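
Roughly, the idea is to register stand-in modules in `sys.modules` so that the dotted name in the pickle resolves again. The sketch below is my own illustration, not code from that topic: `install_stub_module` is a hypothetical helper, and the stand-in `IQueue` is a plain class, whereas on a real site you would presumably subclass `zope.interface.Interface`.

```python
import pickle
import sys
import types


def install_stub_module(dotted_name):
    """Register empty stand-in modules for dotted_name and all of its parents."""
    parts = dotted_name.split(".")
    parent = None
    module = None
    for i, part in enumerate(parts):
        name = ".".join(parts[: i + 1])
        # setdefault: never clobber a real module that is already imported.
        module = sys.modules.setdefault(name, types.ModuleType(name))
        if parent is not None and not hasattr(parent, part):
            setattr(parent, part, module)
        parent = module
    return module


stub = install_stub_module("collective.pdfpeek.async")

# Assumption: a plain class stands in for the missing interface here.
IQueue = type("IQueue", (object,), {"__module__": stub.__name__})
stub.IQueue = IQueue

# Classes are pickled by reference (module + name), so this now round-trips.
restored = pickle.loads(pickle.dumps(IQueue))
print(restored is IQueue)
```

This has to run before the migration touches the database, for example from a small package loaded at startup, and it only papers over the problem; the stale registration is still in the Data.fs.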

OK, I started my Plone 5 instance using python -m pdb bin/instance fg, after hacking an exception handler (import pdb; pdb.set_trace(); raise) into the relevant part of /opt/plone/buildout-cache/eggs/ZODB3-3.10.5-py2.7-linux-x86_64.egg/ZODB/serialize.py. Not sure if that's what folks had in mind.

From pdb at that breakpoint, I can see that the state argument to _dump contains a dict with an entry that looks like this:

(<class 'collective.pdfpeek.async.IQueue'>, 'collective.pdfpeek.conversion_aardvarque-site'): (<persistent broken collective.pdfpeek.async.Queue instance '\x00\x00\x00\x00\x00\x02\xd1v'>, u'', None),

I guess I need to figure out what in my Data.fs is causing this and how to get rid of it? I assume this involves using some kind of raw ZODB manipulation tool? Any clues?
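
For what it's worth, the entry has the shape (interface, name): (component, info, factory), which looks to me like a utility registration mapping in a component registry. That is an assumption on my part; nothing in the traceback confirms it. If it holds, the cleanup would amount to deleting the matching keys, something like the sketch below, where `drop_registrations` is a hypothetical helper and the classes are stand-ins so the example runs without Plone.

```python
def drop_registrations(registrations, module_prefix):
    """Delete entries whose interface lives under module_prefix; return removed keys."""
    removed = []
    for key in list(registrations):  # copy the keys, we mutate while looping
        provided, name = key
        module = getattr(provided, "__module__", "")
        if module == module_prefix or module.startswith(module_prefix + "."):
            del registrations[key]
            removed.append(key)
    return removed


# Stand-ins for illustration only; the real keys would be interface objects.
class IQueueStandIn(object):
    __module__ = "collective.pdfpeek.async"


class IOther(object):
    __module__ = "some.other.package"


registrations = {
    (IQueueStandIn, "collective.pdfpeek.conversion_aardvarque-site"): (object(), u"", None),
    (IOther, ""): (object(), u"", None),
}
removed = drop_registrations(registrations, "collective.pdfpeek")
print(len(removed), len(registrations))
```

On a real site the owning persistent object would also need `_p_changed = True` and a `transaction.commit()`, and the registry probably keeps the same utility in other lookup structures that need the same treatment, so I'd only try this on a copy of the Data.fs.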

Look at wildcard.fixmissing, which @espenmn linked. I'm pretty sure you can remove that interface with it; I did that a few days ago.

wildcard.fixmissing is a bit outdated but its basic features still work.


EDIT: wildcard.fixpersistentutilities is outdated.

Thanks for trying to help @pcdummy. If you read back through this topic, you may notice that I mention in 2 different places that I tried that before posting here.

[Edit: I am mistaken --- I was confusing wildcard.fixmissing with wildcard.fixpersistentutilities. I'll have a look at the former. Thanks.]