Saving old revisions of content when migrating to Plone 5

When you migrate from Plone 4 to Plone 5 and your default content types are converted from Archetypes to Dexterity (plone.app.contenttypes), all history information and previous revisions are deleted.

For some organisations this can be a deal breaker. Has anybody looked into preserving (part of) the history on objects? I'm well aware that this is technically very difficult: CMFEditions stores a pickled version of the old content object, and restoring these is not something you want. But the changelog is also lost (who changed what, and when), and I think most of the value of the revisions is in the text/WYSIWYG fields anyway. Relations etc. are an even bigger problem.

A best-effort solution could be to store a 'light' DX version of at least the primary/text fields from the older revisions of an object, skipping references, and convert it into a Dexterity type so that the portal difftool can still show the changes made, while strictly disallowing reverting to the older object/revision. The migration time would also double, triple, etc. with every extra older revision you want to store.
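
A rough sketch of the 'light' idea, with nothing Plone-specific in it: each old revision is reduced to its text fields plus the changelog metadata, and everything else (references, files) is dropped. The revision dict layout, the field names and `light_revision` are made up for illustration; getting the real revisions out of CMFEditions is the part not shown here.

```python
# Hypothetical sketch: reduce an old revision to text fields plus changelog data.
# The dict layout and the field names are assumptions, not a CMFEditions format.

TEXT_FIELDS = ("title", "description", "text")
CHANGELOG_KEYS = ("principal", "timestamp", "comment")


def light_revision(revision, text_fields=TEXT_FIELDS):
    """Return a copy of `revision` holding only text fields and changelog keys."""
    light = {}
    for name in text_fields:
        if name in revision:
            light[name] = revision[name]
    for key in CHANGELOG_KEYS:
        if key in revision:
            light[key] = revision[key]
    return light


old = {
    "title": "Annual report",
    "text": "<p>First draft</p>",
    "relatedItems": ["/site/other-doc"],  # references are skipped on purpose
    "principal": "jane",
    "timestamp": "2015/03/01 10:15:00 GMT",
    "comment": "Initial version",
}
light = light_revision(old)
```

Keeping only plain text values is what would make such a revision cheap to store and safe to diff, at the price of never being able to revert to it.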

I've only looked at the included migrations of the @@atct_migrator. Is there any experience with transmogrifier migrations that do preserve history/revisions?

I made a custom version of collective.jsonify that exports a separate item for each version of each Document in the original site. When the JSON is imported into the new site, the history is then automatically rebuilt. It will work without any custom blueprints, but won't show the correct metadata unless you add a custom blueprint (I have an example of that if you need it).
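
To illustrate the idea (this is not the actual collective.jsonify code), the export side could be pictured as a generator that turns one item and its stored versions into several items sharing the same `_path`. The function name `split_versions`, the `_version` key and the shape of the version dicts are assumptions made up for this sketch.

```python
# Hypothetical sketch: one export item per stored version, plus the current one.
# `_version` and the shape of `versions` are assumptions for illustration only.


def split_versions(item, versions):
    """Yield one item per old version (in the order given), then the current item."""
    for number, version in enumerate(versions):
        versioned = dict(item)      # start from the current item
        versioned.update(version)   # overwrite with the old field values
        versioned["_version"] = number
        yield versioned
    yield dict(item)


current = {"_path": "/site/doc", "title": "Doc", "text": "<p>Final</p>"}
older = [{"text": "<p>Draft</p>"}, {"text": "<p>Review</p>"}]
exported = list(split_versions(current, older))
```

Because every exported item points at the same path, importing them in order edits the same object repeatedly, and the target site's versioning creates a revision for each edit.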

Tried a couple times to figure out how to write history, with no luck. Would love to see the example!

With the content export that includes a separate JSON file for each revision, the history should be automatically rebuilt on import. Is this the part you are having trouble with?

The custom blueprint is just for fixing the metadata on the revisions after they've been imported (to show the correct date and owner).

Here's the blueprint. It needs to be run after the constructor.

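import time
from datetime import datetime

from collective.transmogrifier.interfaces import ISection
from collective.transmogrifier.interfaces import ISectionBlueprint
from plone import api
from Products.CMFPlone.utils import safe_unicode
from zope.interface import classProvides
from zope.interface import implements


class FixVersionHistory(object):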
    """Properly update the version history
       so @@historyview shows the correct info
    """

    classProvides(ISectionBlueprint)
    implements(ISection)

    def __init__(self, transmogrifier, name, options, previous):
        self.previous = previous
        self.context = transmogrifier.context

    def __iter__(self):
        for item in self.previous:
            obj = self.context.unrestrictedTraverse(
                safe_unicode(item['_path'].lstrip('/')).encode('utf-8'),
                None)
            if obj is None:
                yield item
                continue

            if '_history' not in item:
                yield item
                continue
            repo_tool = api.portal.get_tool("portal_repository")
            history_metadata = repo_tool.getHistoryMetadata(obj)
            if not history_metadata:
                yield item
                continue
            retrieve = history_metadata.retrieve
            # get the last revision
            step_version = history_metadata.getLength(countPurged=False) - 1
            # get the _history step related to the current item
            item_history = item['_history']
            item_step = None
            for version in item_history:
                # we don't want the one that matches, but the previous step
                if item_step:
                    item_step = version
                    break
                if 'modification_date' not in item:
                    item_step = version
                    break
                if version['timestamp'][:16] == item['modification_date'][:16]:
                    item_step = version
            if item_step:
                # get the obj's last revision
                obj_step = retrieve(step_version, countPurged=False)['metadata']['sys_metadata']
                obj_step['principal'] = item_step['principal']
                ts = time.mktime(datetime.strptime(
                    item_step['timestamp'], "%Y/%m/%d %H:%M:%S %Z").timetuple())
                obj_step['timestamp'] = ts
                obj_step['comment'] = item_step['comment']
                obj_step['review_state'] = item_step['review_state']

            yield item
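
The step selection and the timestamp conversion in the blueprint can be tried outside Plone. The sketch below mirrors that logic in two standalone helpers; the names `pick_step` and `to_epoch` and the sample history are made up, and it assumes that `_history` is ordered newest first (which is what the "previous step" comment suggests).

```python
import time
from datetime import datetime


def pick_step(item_history, modification_date=None):
    """Return the entry after the one matching modification_date (to the minute).

    Assumes newest-first ordering. Without a modification date the first
    entry is used; if the match is the last entry, the match itself is returned.
    """
    matched = None
    for version in item_history:
        if matched:
            return version
        if modification_date is None:
            return version
        if version["timestamp"][:16] == modification_date[:16]:
            matched = version
    return matched


def to_epoch(timestamp):
    """Convert an exported 'YYYY/MM/DD HH:MM:SS TZ' string to epoch seconds."""
    parsed = datetime.strptime(timestamp, "%Y/%m/%d %H:%M:%S %Z")
    return time.mktime(parsed.timetuple())


history = [
    {"timestamp": "2016/05/02 09:30:00 GMT", "principal": "bob"},
    {"timestamp": "2016/05/01 08:00:00 GMT", "principal": "alice"},
]
step = pick_step(history, "2016/05/02 09:30:12 GMT")
seconds = to_epoch(step["timestamp"])
```

Note that `time.mktime` interprets the time tuple as local time, so the resulting number depends on the timezone of the machine running the import.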

Oh ok, I meant which Python methods to use to write an entry to a history (content or workflow history). So I should probably dig into collective.transmogrifier instead to find that part, right?

There is a blueprint for workflow history in collective.blueprint.jsonmigrator.
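
For the workflow side, the underlying storage is simple: DCWorkflow keeps the history on the object in a `workflow_history` mapping, keyed by workflow id, with a tuple of entry dicts per workflow. A minimal sketch of writing to it follows; `set_workflow_history` and the `FakeContent` stand-in are hypothetical, and this is not the code of the blueprint mentioned above. On a real site the mapping is persistent and the `time` values would normally be Zope `DateTime` objects rather than strings.

```python
# Hypothetical sketch of writing workflow history entries onto an object.
# FakeContent only stands in for a real content object in this example.


class FakeContent(object):
    def __init__(self):
        self.workflow_history = {}


def set_workflow_history(obj, workflow_id, entries):
    """Replace the history of one workflow; return False if obj has no history."""
    history = getattr(obj, "workflow_history", None)
    if history is None:
        return False
    # Store a tuple of copied dicts, one per transition.
    history[workflow_id] = tuple(dict(entry) for entry in entries)
    return True


doc = FakeContent()
ok = set_workflow_history(doc, "simple_publication_workflow", [
    {"action": None, "actor": "jane", "comments": "",
     "review_state": "private", "time": "2015/03/01 10:15:00 GMT"},
    {"action": "publish", "actor": "bob", "comments": "Looks good",
     "review_state": "published", "time": "2015/03/02 09:00:00 GMT"},
])
```

The content (revision) history is a different store: that one lives in CMFEditions and is reached through `portal_repository`, as in the blueprint above.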

Aaah, now I get it, that's nice, gonna give it a try. Thank you!