Plone 4 -> Plone 5 migrations: Data.fs significantly larger

In the meantime we have migrated several Plone sites, ranging from small to large, from Plone 4 to Plone 5.2/Python 3.
In all cases the size of the packed Data.fs has grown by at least a factor of 2 to 3, while the blobstorage size stayed almost constant. Has anyone had similar experiences?

Was this done with an ETL migration or with the in-place migration?

Always full export - import (UGent :slight_smile: )

We did an 80 GB migration from Plone 4.3 (Archetypes) to Plone 5.2.1 (Dexterity).
We did the migration with export and import (ETL).
We are at 19 GB now.

We aggressively removed unused legacy stuff (so the schemas are smaller now), so the comparison is not 1:1.
We also reduced the number of portal_catalog indexes to make the system faster on saves and deletes. The number of objects is unchanged.

Of course your situation is different, but I am still sharing this since we ended up lower.

I asked about in-place vs ETL because you could have a portal_historiesstorage that was not emptied correctly, where all the 'old' history objects from the Archetypes versions are still dangling. But I think @pbauer solved this somewhere between 5.0 and 5.1 in plone.app.contenttypes.

We have had these 'where are those big content items in our Plone site' questions a few times in the last ten years. For Plone 3 we wrote mr.inquisition, but it was AT only.

I tried to patch it for Dexterity items, but the default size reporting of Plone itself in folder-contents is also rather broken. The size column relies/relied on a very low-level, WebDAV-related Zope get_size method, which might even have been removed by now in Plone 5.2. You could override get_size and build custom size calculators for all the different content types, where you know which fields can take up space.
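
To illustrate the idea of a per-type size calculator, here is a minimal sketch in plain Python. It is not tied to any Plone API: `FIELDS_BY_TYPE`, `value_size`, `estimate_size` and the stand-in objects are hypothetical names for illustration only. In a real site you would take the field names from the content type's schema and would have to handle blobs and annotations separately.

```python
from types import SimpleNamespace

# Hypothetical mapping: the attributes that can take up space, per content type.
FIELDS_BY_TYPE = {
    "Document": ("title", "description", "text"),
    "File": ("title", "description", "file_data"),
}


def value_size(value):
    """Return an approximate size in bytes for a single field value."""
    if value is None:
        return 0
    if isinstance(value, bytes):
        return len(value)
    if isinstance(value, str):
        return len(value.encode("utf-8"))
    # Fallback for other values: size of their text representation.
    return len(str(value).encode("utf-8"))


def estimate_size(obj):
    """Sum the approximate sizes of the known fields of one object."""
    fields = FIELDS_BY_TYPE.get(getattr(obj, "portal_type", None), ())
    return sum(value_size(getattr(obj, name, None)) for name in fields)


# Stand-in objects (assumption: real content objects expose similar attributes).
doc = SimpleNamespace(portal_type="Document", title="Hello",
                      description="", text="x" * 1000)
upload = SimpleNamespace(portal_type="File", title="Report",
                         description=None, file_data=b"\x00" * 2048)

for item in (doc, upload):
    print(item.portal_type, estimate_size(item))
```

Sorting all items by such an estimate would at least point to the content types worth a closer look, even though it ignores pickle overhead.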

Another source of storage use is annotations, like the image scales stored on items. And then there are 'Zope'-level objects like the catalog, as Niels mentions, and the history storage.

@zopyx I have not experienced this in any of my migrations.

@pbauer

Original site: 11.955.980 objects in main DB
Migrated site: 3.942.537 objects in main DB

Both sites have about 65.000 content objects in portal_catalog
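
For what it is worth, the numbers above can be put in relation to each other with a few lines of Python. This is only arithmetic on the figures quoted in this post; the variable and function names are made up, and 65.000 is the approximate catalog count given above.

```python
# Figures quoted above (thousands separators removed).
ORIGINAL_DB_OBJECTS = 11_955_980
MIGRATED_DB_OBJECTS = 3_942_537
CONTENT_ITEMS = 65_000  # "about 65.000" content objects in both sites


def objects_per_content_item(db_objects, content_items):
    """Average number of ZODB objects stored per cataloged content object."""
    return db_objects / content_items


original_ratio = objects_per_content_item(ORIGINAL_DB_OBJECTS, CONTENT_ITEMS)
migrated_ratio = objects_per_content_item(MIGRATED_DB_OBJECTS, CONTENT_ITEMS)
reduction_factor = ORIGINAL_DB_OBJECTS / MIGRATED_DB_OBJECTS

print(f"original: {original_ratio:.1f} objects per content item")
print(f"migrated: {migrated_ratio:.1f} objects per content item")
print(f"object count reduced by a factor of {reduction_factor:.2f}")
```

So the migrated database holds roughly a third of the objects for the same amount of content. If the Data.fs is nevertheless larger, the growth would have to come from larger records on average rather than from more records.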