Flame alert: the Plone Python 3 upgrade story

I've been hearing how the Django community is trying to make its upgrades "super slick easy" (and they're already pretty easy) and I feel a bit sad about our own Py3 story. The process is complicated and full of black magic. I say this knowing how much it has been worked on by so many of you.

Forcing our installed base to go through this in-place upgrade process is not very encouraging.

I think we should kill the in-place upgrade story and go with transmogrifier only.

/me stands back to avoid being burned

4 Likes

:popcorn:

6 Likes

Both solutions have their pros and cons.

The in-place migration is likely better suited to "smaller" sites where you want to preserve the existing configuration and content and where nothing changes in functionality or code. The main issue with in-place migration is the major pain in the a** factor in Plone projects called ZODB. The ZODB is neither a decent database solution nor a reasonable database when it comes to migration. In a recent Plone 4.3 -> 5.2 migration we have high migration costs directly related to the ZODB. One part of the cost is related to an inefficient AT -> DX migration of Dexterity, the other part is related to the ZODB format migration. No offense is intended with this statement. I appreciate all the work and effort done here, in particular many thanks to Philipp.

An export-import migration is better suited when you have to reorganize your site anyway or are making large changes.

The overall problem (which is related to efforts and costs) is performance. I am currently helping with the migration of the University Ghent website. 90.000 content objects, 60 GB of data. Some would call it "big data", I call this peanuts. Importing the data takes 36 hours. It might be possible to bring the duration down with some optimizations, but the out-of-the-box performance without trickery is far from what you would expect of an "enterprise-level" CMS. Importing the same data (based on their JSON representation) into a document database takes 10 minutes.
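To put those numbers side by side, here is a quick back-of-the-envelope calculation. It uses only the figures quoted above (90.000 objects, 36 hours, 10 minutes); the variable names are purely illustrative.

```python
# Figures taken from the post above; everything else is plain arithmetic.
objects = 90_000
plone_import_seconds = 36 * 60 * 60   # 36 hours
docdb_import_seconds = 10 * 60        # 10 minutes

# Average cost per object and throughput for each import.
seconds_per_object = plone_import_seconds / objects
plone_rate = objects / plone_import_seconds
docdb_rate = objects / docdb_import_seconds
speedup = plone_import_seconds / docdb_import_seconds

print(f"Plone import:   {seconds_per_object:.2f} s per object ({plone_rate:.2f} objects/s)")
print(f"Document store: {docdb_rate:.0f} objects/s")
print(f"Ratio:          {speedup:.0f}x")
```

So the import averages roughly 1.44 seconds per object, against about 150 objects per second for the document database.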

So the decision of in-place vs. export-import migration depends on various factors. There is likely room for optimizations with the in-place migration and its complexity...but an export-import migration also is a complex operation.

3 Likes

I'd take simple-but-slow over complex-and-possibly-faster

I tend to agree that the official way should be the lowest-risk way. And the lowest-risk way is not in-place, unfortunately. We haven't migrated any reasonably sized site in place. There are just too many things that can bite you.
But transmogrifier is not simple either or, as pointed out, fast. I suspect simplicity and speed could be improved, but I'm not sure by whom.
Things like turning off creation events and delaying indexing and other ways to simplify bulk creating objects might speed things up for large imports.
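A minimal sketch of the "delay indexing" idea, using a toy catalog rather than any real Plone API. `ToyCatalog`, `deferred()` and `bulk_import()` are hypothetical names invented for this illustration; the point is only that queuing index work and writing it once at the end replaces thousands of small index updates with a single pass.

```python
from contextlib import contextmanager


class ToyCatalog:
    """Stand-in for a catalog. This is NOT the portal_catalog API."""

    def __init__(self):
        self.index = {}
        self.index_calls = 0      # how many times the index was written
        self._pending = None      # list of queued objects while deferring

    def index_object(self, obj):
        # While deferring, only queue the object instead of indexing it.
        if self._pending is not None:
            self._pending.append(obj)
            return
        self._write([obj])

    def _write(self, objs):
        self.index_calls += 1
        for obj in objs:
            self.index[obj["id"]] = obj["title"]

    @contextmanager
    def deferred(self):
        """Queue all indexing inside the block and flush it once on exit."""
        self._pending = []
        try:
            yield
        finally:
            pending, self._pending = self._pending, None
            if pending:
                self._write(pending)


def bulk_import(catalog, records):
    # Hypothetical importer: create everything first, index once at the end.
    with catalog.deferred():
        for record in records:
            catalog.index_object(record)


catalog = ToyCatalog()
bulk_import(catalog, [{"id": i, "title": f"Doc {i}"} for i in range(1000)])
print(catalog.index_calls, len(catalog.index))
```

Whether the same trick pays off in a real import depends on how the catalog and event subscribers are wired up, so treat this as a picture of the idea, not a recipe.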

1 Like

I think it's a bit like comparing two tomatoes to a fruit salad. Does that make sense? Maybe not :stuck_out_tongue:

Django is a low-level framework, Plone an 'end product'. If you migrate a Django project, the 'Django Project' has a much smaller responsibility footprint: they have/had to upgrade/fix their ORM, but the data is in a relational database, which was chosen by the end developers using Django in their project. Also, everything built on top of the Django foundation in any Django project is the end developers' task. Maybe you should compare the migration stories of DjangoCMS, FeinCMS and other CMSes built with Django.

Plone manages and is responsible to a very high level for your data with its own database.
And as Andreas correctly remarks, there is a lot of room for improvement there which we cannot share with other projects.

Out of curiosity: how fast is an in-place migration if you already use Relstorage with Postgres/Mysql? Would it be much faster?

3 Likes

The complete Relstorage thing just moves the pain or burden down the street to a different layer. I would assume that a migration on Relstorage is slower due to a higher overhead. Nothing beats a ZEO server in simplicity, efficiency and stupidity.

3 Likes

Could I assume that converting an SQL (Relstorage) DB to filestorage will speed up the migration?

unlikely in a significant way...

The migration to Python 3 is neither complicated nor full of magic. It consists of one command that you need to run:

./bin/zodbupdate --convert-py3 --file=var/filestorage/Data.fs --encoding=utf8 --encoding-fallback latin1

The migration itself is really fast and should not be a problem even for large Databases. If your database before the migration is ok it should be ok afterwards. There are some edge-cases that are covered in https://github.com/collective/collective.migrationhelpers/blob/master/src/collective/migrationhelpers/post_python3_fixes.py.

What you are really complaining about is three other issues:

  1. The ZODB is not forgiving if you change or remove code and do not properly update objects that reference that code. It's not simple to know all the possible places where references happen (IntIds and RelationCatalog!). https://www.starzel.de/blog/zodb-debugging discusses these problems in detail.

  2. The migration from Archetypes to Dexterity (especially for folders with a lot of content) can take a very long time. But the code for that has been around and in use since 2013. Complaining about it in 2020 is a little weird. There are alternative options to do these migrations if an in-place migration is not required. In these cases I usually choose an export/import using restapi serializers and deserializers.

  3. Migrations between Plone-versions can be hard if you have addons that are no longer supported or have changed a lot. The reason is customization and addons that are used and written without properly thinking about uninstalling and upgrades. https://github.com/collective/collective.migrationhelpers contains helpers to deal with most issues that might arise there.
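The first point can be illustrated without Plone at all, because ZODB stores objects as pickles and a pickle only records where to find the class, not the class itself. The sketch below uses plain stdlib `pickle` and a throwaway module called `hypothetical_addon` (an invented name standing in for an add-on). Plain `pickle` fails hard when the code is gone. As far as I know ZODB instead hands you a broken placeholder object, but the underlying problem is the same.

```python
import pickle
import sys
import types

# Build a throwaway module that stands in for an installed add-on.
addon = types.ModuleType("hypothetical_addon")


class Setting:
    def __init__(self, value):
        self.value = value


# Make the class look as if it lived in the add-on module.
Setting.__module__ = "hypothetical_addon"
Setting.__qualname__ = "Setting"
addon.Setting = Setting
sys.modules["hypothetical_addon"] = addon

# The pickle stores "hypothetical_addon.Setting" plus the instance state.
data = pickle.dumps(Setting(42))
assert b"hypothetical_addon" in data

# "Uninstall" the add-on without cleaning up the stored objects.
del sys.modules["hypothetical_addon"]

try:
    pickle.loads(data)
    load_error = None
except ImportError as exc:  # ModuleNotFoundError is a subclass of ImportError
    load_error = exc

print(type(load_error).__name__, load_error)
```

Every persistent object that still points at removed code carries this kind of dangling reference, which is why the cleanup has to happen before the code is removed.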

Imagine you want to migrate a Plone site including configured Content-Rules, all users' dashboard configuration, custom workflow variables created in the ZMI and everything else that might be in the site. Then your best and only option is an in-place migration. If at the same time you want to update from Plone 4 to 5.2, from AT to DX and from Py2 to Py3, then you should hire someone who knows what to do (wink), because you are asking to upgrade your car from diesel to electric and from red to yellow while the passengers are sitting in it.

8 Likes

Well, you would, I did. Migration on RelStorage is pretty fast compared to those I did on ZEO.

1 Like

No. Conversion is slow. Working on RelStorage is faster.

If you plan to convert a RelStorage from Python 2 to 3, prepare a DB configuration snippet like so (this one is meant to be used in my collective.recipe.template section):

%import relstorage
<relstorage>
    blob-dir ${:blobstorage}
    keep-history ${:history}
    <postgresql>
        dsn dbname='${:dbname}' host='${:dbhost}' user='${:dbuser}' password='${:dbpassword}'
    </postgresql>
</relstorage>

  • Pack your DB in Python 2, then immediately stop the instance.
  • Run buildout with Python 3, but never start the instance.
  • Then run
    ./bin/zodbupdate --pack --convert-py3 --encoding utf8 -c relstorage.cfg
    

That's it.
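If you want to script the last step, for example to repeat it for several databases, a minimal sketch could look like this. The helper `zodbupdate_command` is a hypothetical name; the flags are exactly the ones from the command above, and `relstorage.cfg` is assumed to be the rendered snippet.

```python
import shlex


def zodbupdate_command(config="relstorage.cfg", encoding="utf8", pack=True):
    """Build the argument list for the conversion command shown above."""
    cmd = ["./bin/zodbupdate"]
    if pack:
        cmd.append("--pack")
    cmd += ["--convert-py3", "--encoding", encoding, "-c", config]
    return cmd


cmd = zodbupdate_command()
# Print the command for review before actually running it.
print(shlex.join(cmd))
```

To execute it, pass the list to `subprocess.run(cmd, check=True)` from the buildout directory, after the pack in Python 2 and the Python 3 buildout run are done.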

4 Likes

My case: an organization with just one Plone site. But it has 100.000 content items and a lot of custom add-ons. We have to migrate to Plone 5 / Python 3 without losing any functionality.

I think in this case the in-place migration path is a better recommendation than transmogrifier.

In-place migration has its advantages and should not be killed. Please :grimacing:

1 Like

I do agree with Rafael. In-place migration is way better than Transmogrifier when you need to keep all the existing features, security settings, user information and so on.

@jensens, are you using Postgres or MySQL for Relstorage? Also, which Relstorage version?

Same here with a huge intranet, it works. But it was a lot of work. I have automated the migration from 2.5 to 5.2.2/py3, all in-place. 2 weeks of trial and error, with many hacks and many things that I learned.

1 Like

Any migration scenario that does not include all of the above is like saying that your car can be upgraded, but only if it has not left the assembly line yet.

The issue, in my opinion, is not the technical challenge, but how to price such a migration for the customer so that a) the migration project does not cost more than a brand new site and b) I don't end up working for free. In the end, we should aim to bring as many of the remaining existing Plone sites up to date as possible and that means eliminating as many roadblocks to the upgrade as possible, including cost.

It pains me to say this, but I have a couple of WordPress sites continuously running hands-off for almost 15 years, with lots of user accounts, and sometimes I forget I have them because they keep auto-upgrading themselves and I only ever had one or two very minor security (mostly spam) issues.

1 Like

The key point is risk.
In-place migration works until it doesn't and some weird leftover data starts screwing things up.
I.e. starting with a clean site and a known set of data (just content + users) will be more work to get back to the same functionality, but the chance of it biting you in the ass is lower.
This also goes to @fulv's point about quoting. Risk turns into a higher price for any fixed-price quote. Upgrades are already hard to justify. You don't want to lose money on them as well.
Maybe it's not a matter of "getting rid" of in-place but of having an officially supported external migration story that is the more foolproof, recommended way for newbies? I.e. you can have a painless, risk-free migration of content only. Everything else you have to rebuild.

1 Like

are you using Postgres or MySQL for Relstorage? Also, which Relstorage version?

  • Postgres (I had a bad experience with mysql and relstorage in the past, but that's been a while)
  • I used 1.6.x for Plone 4.3 then 2.x for 5.0/5.1 and now I am using 3.x for Plone 5.2.2
1 Like