Advice on a Plone 5.1 migration and upgrade

We have a Plone 5.1 installation running on AWS Elastic Container Service (ECS) using the official Docker image. For data storage we're using an EFS mount; several Plone containers run as ZEO clients and one runs as the ZEO server. We've been asked to move just Plone to another AWS account.

This means moving both the infrastructure and the data. I've recreated the infrastructure in the new account just fine; the problem is the data. I can log on to the containers and run the /plone/instance/bin/backup script without issues; the trouble is the restore. Since ZEO is running as a container on the backend, I can't just log into the container and run the restore, because shutting down ZEO means killing the container. I've tried creating an AWS Batch process to run the restore command in its own container, but Batch is misbehaving.

To make things even more complicated, it's time to move to Plone 5.2 (along with a couple of add-ons), meaning that even if I find a way to restore the data so that ZEO can read it, I then have to figure out how to upgrade.

Is there any other way to back up and restore? Should I just abandon the idea of using containers entirely and switch to a plain old server? The number of installation steps looks really daunting, and the container made it really easy to test changes to Plone itself and then just redeploy. I'm reluctant to move back to a traditional server setup, but I'm running out of ideas. Any suggestions would be appreciated. Thank you.

You are describing several problems at once in one post. I don't really understand which parts of Plone or of a normal ZEO setup are actually part of the issue. It seems to me you are struggling with Docker.

A container running ZEO should also have its data.fs and blobstorage mounted on a volume, so that its data doesn't get lost when the container is terminated. This is the same as when you run a single container with a Zope instance.
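A minimal sketch of that with plain docker commands, assuming the official image keeps its filestorage and blobstorage under /data and starts a ZEO server when given the zeo argument (the volume and container names are just examples, so adjust to match your ECS task definition):

```bash
# Create a named volume (or use an EFS/host bind mount instead) so the
# ZEO data survives container restarts and redeployments.
docker volume create plone-data

# Start the ZEO server with that volume mounted where the image keeps
# its Data.fs and blobstorage -- adjust /data if your image differs.
docker run -d --name zeo \
  -v plone-data:/data \
  plone zeo
```

The important part is that nothing under the data path lives only in the container's own writable layer.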

Is your problem that you have started Zope/ZEO in a container without persisting the data and the data is now in an ‘ephemeral’ layer and will be destroyed when the container is terminated?

The problem is that the data is persisted on a Docker volume, and now I need to move it to another ECS service. Then I need to upgrade. I guess my question is, what's the best way to do this?

My hack here is to start the container with /bin/sleep 1000000000, then log in, perform the actions needed, log out, and stop and restart the container without the hack.
In Docker Compose or Docker Swarm you can add this to your YAML file as command: /bin/sleep 1000000000.
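With plain docker commands the same trick looks roughly like this; the container, volume, and script names are only illustrative, on ECS you would put the sleep command into the task definition instead, and if your image's entrypoint doesn't pass arbitrary commands through you may need to override the entrypoint as well:

```bash
# Stop the normal ZEO task first, then start a throwaway container on
# the same data volume with sleep as its command, so nothing holds the
# database open.
docker run -d --name zeo-maintenance \
  -v plone-data:/data \
  plone /bin/sleep 1000000000

# Log in and do the maintenance work (restore, copy files, etc.).
docker exec -it zeo-maintenance bash
# ... inside the container: /plone/instance/bin/restore ...

# Throw the maintenance container away and start ZEO again normally.
docker rm -f zeo-maintenance
```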

In fact I did exactly this: upgraded a Plone 5.1 / Python 2.7 site to 5.2.2 / Python 3.8. With my Docker Swarm setup it was not a big deal. There was no AWS involved, so I had full access to the storage and copying data around was probably easier, but docker cp should work on AWS as well.
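For reference, the copy step with docker cp might look like this (container names and the paths inside the image are examples; point them at wherever your Data.fs and blobstorage actually live):

```bash
# Pull the database and blobs out of the old container onto the host.
docker cp old-zeo:/data/filestorage/Data.fs ./Data.fs
docker cp old-zeo:/data/blobstorage ./blobstorage

# ...move the files to the new environment however you like, then push
# them into the new (stopped or sleeping) ZEO container.
docker cp ./Data.fs new-zeo:/data/filestorage/Data.fs
docker cp ./blobstorage new-zeo:/data/
```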

The upgrade path itself is always the same and independent of the hosting: first make everything work on Plone 5.2.2 with Python 2.7, without any traces of Archetypes. Then make everything work on Python 3 (without the data). Next, upgrade the database. Now everything runs on Python 3. Test it and then go live. The whole process is documented:
https://docs.plone.org/manage/upgrading/version_specific_migration/upgrade_to_python3.html
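For the "upgrade the database" step, the conversion command from that guide is essentially the following; it assumes a standard buildout layout with the zodbupdate script generated in bin/, and it must run while no ZEO server or client has the database open (ideally against a copy first):

```bash
# Convert the ZODB from Python 2 pickles to Python 3, as described in
# the linked upgrade guide. Back up Data.fs before running this.
bin/zodbupdate --convert-py3 --encoding utf8 \
  --file var/filestorage/Data.fs
```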

If any problems come up, ask here.

I finally managed to figure out how to get AWS Batch to restore the database onto the EFS mount it was using. It's a complex setup: log into the old container, back up the data with /plone/instance/bin/backup, copy the backup to S3, log into an instance in the new account with EFS mounted, copy the data down from S3, and then run an AWS Batch job that starts a Plone container which mounts the EFS share and runs /plone/instance/bin/restore. I like your idea better: although it's more hacky, it doesn't involve nearly as much infrastructure.
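In case it helps anyone else, the transfer part of that pipeline boils down to something like this. The bucket name, mount points, and backup directories are made up, so substitute wherever bin/backup actually writes its repozo backups in your image, and it assumes the aws CLI is available wherever the copy commands run:

```bash
# In the old account, inside (or exec'd into) the old container:
/plone/instance/bin/backup
aws s3 cp /data/backups s3://my-plone-migration/filestorage/ --recursive
aws s3 cp /data/blobstoragebackups s3://my-plone-migration/blobstorage/ --recursive

# In the new account, on a host (or Batch job) with the EFS share mounted:
aws s3 cp s3://my-plone-migration/filestorage/ /mnt/efs/backups/ --recursive
aws s3 cp s3://my-plone-migration/blobstorage/ /mnt/efs/blobstoragebackups/ --recursive

# Finally run /plone/instance/bin/restore in a container that mounts
# the EFS share (the Batch job, or the sleep hack above).
```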

Regarding the upgrade... would you recommend the same procedure for upgrading the database? Create a container, force it to sleep, log in, then run zodbupdate?

We use host volumes: EFS for blobs and EBS for the data.fs.
Much simpler, and if everything falls apart with the containers you know you still have the data, kept separately.
The one exception is the Zope startup issue whereby, if the data.fs is missing but the blobs are there, Zope deletes all your blobs. That comes about partly because we use two different mounts for blobs and data.fs.
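A rough sketch of that layout with plain docker flags, using made-up host mount points (the EFS share at /mnt/efs, the EBS volume at /mnt/ebs) and assuming the image keeps its data under /data:

```bash
# Blobs on EFS (can be shared with the client containers), Data.fs on
# EBS (only the ZEO server writes it). Because the two mounts are
# separate, guard against starting with a missing data.fs while the
# blobs are present -- that's the startup issue mentioned above.
docker run -d --name zeo \
  -v /mnt/ebs/filestorage:/data/filestorage \
  -v /mnt/efs/blobstorage:/data/blobstorage \
  plone zeo
```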

That's how I did it.