I moved from Archetypes to Dexterity long ago, but some lingering artifacts remain in my ZODB that I would like to purge. (I know these might not need to be purged, but I'll explain later why I think they could still be a problem.) I've already put a lot of work into eliminating this content from my databases, but a few tricky objects are difficult to find.
First, here is the zodbverify result for one of my dbs after I did some cleanup:
INFO:zodbverify:Done! Scanned 146823 records.
Found 280 records that could not be loaded.
Exceptions and how often they happened:
ImportError: No module named Archetypes.BaseUnit: 8
ImportError: No module named Product: 2
AttributeError: 'module' object has no attribute 'IPersistentExtra': 102
ImportError: No module named Archetypes.ReferenceEngine: 31
ImportError: No module named ATContentTypes.tool.metadata: 126
ImportError: No module named ResourceRegistries.interfaces.settings: 9
ImportError: No module named ATContentTypes.content.document: 2
This is after I ran a script that cleans up three areas:
- Removes any object from the Zope version control repository in portal_historiesstorage that raises a POSKeyError or is broken (instances of Removed are only logged and skipped)
- Removes any object from portal_historyidhandler shadow storage that cannot be retrieved, is broken, or removed
- Removes any broken persistent utilities (same behavior as wildcard.fixpersistentutilities)
The script:

import logging

import plone.api
from Products.CMFCore.interfaces import IMetadataTool
from Products.CMFEditions.ZVCStorageTool import Removed
from ZODB.POSException import POSKeyError
from ZODB.broken import PersistentBroken, BrokenModified
from zope.component import getSiteManager

logger = logging.getLogger(__name__)


def remove_bad_utilities(context=None):
    """Drop IMetadataTool registrations from the local site manager."""
    if context:
        sm = context.getSiteManager()
    else:
        sm = getSiteManager()

    subscribers = sm.utilities._subscribers[0]
    adapters = sm.utilities._adapters[0]
    if IMetadataTool in subscribers:
        logger.info('deleting subscriber: {}'.format(subscribers[IMetadataTool]))
        del subscribers[IMetadataTool]
        # Reassign so the change is persisted
        sm.utilities._subscribers = [subscribers]
    if IMetadataTool in adapters:
        logger.info('deleting adapter: {}'.format(adapters[IMetadataTool]))
        del adapters[IMetadataTool]
        sm.utilities._adapters = [adapters]


def remove_bad_histories():
    """Remove unloadable histories from the ZVC repo and shadow storage."""
    bad_repo_ids = set()
    tool = plone.api.portal.get_tool('portal_historiesstorage')
    for sequence in tool.zvc_repo._histories:
        for version in tool.zvc_repo[sequence]._versions:
            try:
                obj = tool.zvc_repo[sequence].getVersionById(version)._data._object.object
            except POSKeyError:
                bad_repo_ids.add(sequence)
                logger.warning('POSKey error on bad object from zvc repo: %s' % sequence)
            except AttributeError:
                if isinstance(tool.zvc_repo[sequence].getVersionById(version)._data._object, Removed):
                    logger.warning('Ignore removed object: %s' % sequence)
                else:
                    logger.warning('Unknown error: %s' % sequence)
            else:
                if hasattr(obj, 'aq_base'):
                    obj = obj.aq_base
                if isinstance(obj, PersistentBroken):
                    bad_repo_ids.add(sequence)
                    logger.warning('Removing broken object from zvc repo: %s' % obj.__module__)
    # Delete after the loop so _histories is not modified while iterating
    for bid in bad_repo_ids:
        del tool.zvc_repo._histories[bid]

    # remove deleted items of deprecated class objects from shadow storage
    deleted = []
    total_hids = []
    hidhandler = plone.api.portal.get_tool('portal_historyidhandler')
    for hid in tool._getShadowStorage(autoAdd=False)._storage:
        workingCopy = hidhandler.unrestrictedQueryObject(hid)
        if not workingCopy:
            try:
                tool.retrieve(hid).object.object
            except KeyError:
                logger.warning('Could not retrieve history id %s, removing from shadow storage' % hid)
                deleted.append(hid)
            except BrokenModified:
                logger.warning('Broken history id %s, removing from shadow storage' % hid)
                deleted.append(hid)
            except AttributeError:
                logger.warning('Removed object %s, removing from shadow storage' % hid)
                deleted.append(hid)
        total_hids.append(hid)
    logger.warning('Removing %d out of %d history ids in ZVC storage' % (len(deleted), len(total_hids)))
    for hid in deleted:
        del tool._getShadowStorage(autoAdd=False)._storage[hid]

This cleans up quite a bit, and at this point the remaining objects do not seem to be accessible through the ZMI. How can I go about finding and removing them? Here is a sample record related to ATContentTypes that could not be loaded:
INFO:zodbverify:
Could not process unknown record '\x00\x00\x00\x00\x00\x01C\xb6':
INFO:zodbverify:'cProducts.ATContentTypes.tool.metadata\nMetadataTool\nq\x01.}q\x02(U\x12__ac_local_roles__q\x03}q\x04U\nadmin_ericq\x05]q\x06U\x05Ownerq\x07asU\x04DCMIq\x08(U\x08\x00\x00\x00\x00\x00\x01C\xfacProducts.ATContentTypes.tool.metadata\nMetadataSchema\nq\ttq\nQU\x05titleq\x0bU0Controls metadata like keywords, copyrights, etcq\x0cu.'
INFO:zodbverify:Traceback (most recent call last):
  File "/sprj/btp_zope_plone5/plone-btp-dev-02/buildouts/eggs/zodbverify-1.0.1-py2.7.egg/zodbverify/verify.py", line 58, in verify_record
    class_info = unpickler.load()
ImportError: No module named ATContentTypes.tool.metadata
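For what it's worth, a record like this can be inspected without importing any of the missing classes. The sketch below is my own illustration (written for Python 3, stdlib only), assuming the record is two pickles back to back (class info, then state) as the dump above suggests. `describe_record` is a hypothetical helper name, and the way it pairs an 8-byte string with the most recent class to form a persistent reference is a heuristic that fits this record's layout, not a general parser.

```python
import io
import pickletools

# Raw record copied from the zodbverify output above, as a bytes literal.
RECORD = (
    b'cProducts.ATContentTypes.tool.metadata\nMetadataTool\nq\x01.'
    b'}q\x02(U\x12__ac_local_roles__q\x03}q\x04U\nadmin_ericq\x05]q\x06'
    b'U\x05Ownerq\x07asU\x04DCMIq\x08(U\x08\x00\x00\x00\x00\x00\x01C\xfa'
    b'cProducts.ATContentTypes.tool.metadata\nMetadataSchema\nq\ttq\nQ'
    b'U\x05titleq\x0bU0Controls metadata like keywords, copyrights, etc'
    b'q\x0cu.'
)


def describe_record(data):
    """Return (record_class, persistent_refs) without unpickling anything.

    persistent_refs is a list of (oid_hex, class_name) pairs.
    """
    stream = io.BytesIO(data)
    classes = []      # every class named by a GLOBAL opcode, in order
    refs = []
    last_oid = None   # most recent 8-byte string, a candidate oid
    for _ in range(2):  # first the class pickle, then the state pickle
        for opcode, arg, _pos in pickletools.genops(stream):
            if opcode.name == 'GLOBAL':
                # arg looks like "module.path ClassName"
                classes.append(arg.replace(' ', '.'))
            elif opcode.name == 'SHORT_BINSTRING' and len(arg) == 8:
                last_oid = arg.encode('latin-1')
            elif opcode.name == 'BINPERSID' and last_oid is not None:
                # Heuristic: assume the reference is (last oid, last class)
                refs.append((last_oid.hex(), classes[-1]))
                last_oid = None
    return classes[0], refs


record_class, persistent_refs = describe_record(RECORD)
print(record_class)
print(persistent_refs)
```

This only reads opcodes, so it never triggers the ImportError. For this record it shows the object is a MetadataTool holding a reference to a MetadataSchema at another oid.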
The portal_metadata tool would be the obvious answer, but it has already been removed from the ZMI (for all 10 sites on this database) and the database has been packed. Something in the persistent utilities also referenced it, but that has been deleted as well. Where else might this object live?
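Since the record survived a pack, which as far as I understand usually discards unreachable records, something presumably still references it. One crude way to look for that something is to search every raw record for the 8-byte oid. The following is only a sketch under assumptions: `find_referrers` is a hypothetical helper, `records` is a plain mapping of oid to raw record bytes (how you build it, for example by iterating the storage, is left out), and the sample data is made up apart from the oid taken from the zodbverify output. A raw byte search can also produce false positives.

```python
def find_referrers(records, target_oid):
    """Return the oids of records whose raw bytes contain target_oid.

    records: hypothetical mapping of oid (bytes) -> raw record (bytes).
    """
    return sorted(
        oid for oid, data in records.items()
        if oid != target_oid and target_oid in data
    )


# oid of the unloadable MetadataTool record from the zodbverify output
TARGET = b'\x00\x00\x00\x00\x00\x01C\xb6'

# Made-up sample records, for illustration only
sample_records = {
    TARGET: b'cProducts.ATContentTypes.tool.metadata\nMetadataTool\nq\x01.',
    b'\x00\x00\x00\x00\x00\x00\x00\x11': b'...U\x08' + TARGET + b'...',
    b'\x00\x00\x00\x00\x00\x00\x00\x12': b'...no reference here...',
}

referrers = find_referrers(sample_records, TARGET)
print([oid.hex() for oid in referrers])
```

Each hit is then a candidate parent to inspect, and repeating the search on the parent's oid walks one step closer to the root.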
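I also tried using the alias_module function from plone.app.upgrade to render it harmless:

from OFS.SimpleItem import SimpleItem
from plone.app.upgrade.utils import alias_module

try:
    from Products.ATContentTypes.tool.metadata import MetadataTool
except ImportError:
    # Point the missing dotted name at SimpleItem so old pickles can load
    alias_module('Products.ATContentTypes.tool.metadata.MetadataTool', SimpleItem)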
Unfortunately, this seems to cause the ZCML condition "installed Products.ATContentTypes" to evaluate as True, which leads to a host of other problems.
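My understanding, which I have not verified against the source, is that alias_module registers stub modules in sys.modules for every missing parent of the dotted name, and that the "installed" condition simply attempts an import. If both hold, the side effect can be reproduced with the stdlib alone. In this sketch `stub_module_chain` and `is_importable` are hypothetical helpers and the package name is invented.

```python
import importlib
import sys
import types


def stub_module_chain(dotted_name):
    """Register an empty stub module for every missing prefix of dotted_name.

    Returns the names that were added to sys.modules.
    """
    added = []
    parts = dotted_name.split('.')
    parent = None
    for i in range(1, len(parts) + 1):
        name = '.'.join(parts[:i])
        if name not in sys.modules:
            module = types.ModuleType(name)
            sys.modules[name] = module
            if parent is not None:
                setattr(parent, parts[i - 1], module)
            added.append(name)
        parent = sys.modules[name]
    return added


def is_importable(name):
    """Roughly what an 'is this package installed' check boils down to."""
    try:
        importlib.import_module(name)
    except ImportError:
        return False
    return True


PKG = 'hypothetical_missing_pkg'  # invented name, stands in for the real package

before = is_importable(PKG)
added = stub_module_chain(PKG + '.tool.metadata')
after = is_importable(PKG)

# Clean up so the stubs do not leak into the rest of the process
for name in added:
    del sys.modules[name]

print(before, after, added)
```

If that is what is happening, an alias cannot bring back one class without also making the top-level package look installed to anything that checks by importing it.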
I know that this section of the migration guide says that the site may still work with these warnings. My concern is that some of these objects turned out to be problems that were not immediately noticeable. For instance, the history storage did not appear to be an issue until I went to portal_historiesstorage or attempted to edit certain pages. I'd much rather clean up the ZODB than risk leaving in stealth bugs.