Script to change image (resolve) URLs (when migrating)

I am building a new Plone 5 site to replace one that currently runs on Plone 4 (and originally came from Plone 3).

There are not too many pages that I need to bring over, but there are a few hundred images. Porting the images went smoothly.

Now, if I want to copy the documents, their links to images are using resolveuid.

Is there a smart way to deal with converting the image paths?

PS: The catalogs will tell me that <img src="097622a974a2943d177bda0792f0af75"> is actually 'imagexx.jpg'

097622a974a2943d177bda0792f0af75 * [/path/folder/imagexx.jpg]

What is the question? Converting UIDs to paths or vice versa?

UIDs to paths. Then I could just copy and paste the documents and add only the images that are actually used.

Query the UID index. But why do you want to go from UIDs to paths? Why don't you preserve the original UID?
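A minimal sketch of that lookup, assuming a standard Plone catalog with its UID index. The helper name uid_to_path is made up for illustration, and the catalog is passed in so the function does not depend on how you obtain it. Note that a plain catalog call only returns objects the current user is allowed to see.

```python
def uid_to_path(catalog, uid):
    """Return the path of the object with this UID, or None if not found.

    `catalog` is expected to behave like Plone's portal_catalog: calling it
    with UID=... returns a sequence of brains that offer getPath().
    """
    brains = catalog(UID=uid)  # query the UID index
    if not brains:
        return None
    return brains[0].getPath()


# Inside a Plone site the catalog would typically be obtained with:
#   from plone import api
#   catalog = api.portal.get_tool("portal_catalog")
#   uid_to_path(catalog, "097622a974a2943d177bda0792f0af75")
```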

I just want to keep only some of the pages (two folders) and, hopefully, just the images that are used there. In particular, I could not find another way to get rid of the (far too many) images that are not used.

The two are not mutually exclusive. We also preserve the UIDs throughout a migration while omitting old or stale content.

How do you find out which images are not used (anymore), and how do you get rid of them?

Because we use ArangoDB as our migration database, it is just a database query (after enriching the output of collective.jsonify a bit).

Depending on how you are porting it over you may end up with new UIDs for the images. Like, if you have a script that is programmatically recreating these images in the new location instead of actually copying the db over, they will have new UIDs. So you may need to make some mapping of UID->path from the old database and use that to get the new UID from the path for the new location. But you definitely want to use UIDs to link to the images.

Side note: when you create an internal link in the WYSIWYG editor, it is saved as src="resolveuid/097622a974a2943d177bda0792f0af75" or something like that. You don't normally see a link that looks like this when viewing the page, because a transform rewrites it under the hood. If you mouse over the link on the view page, you will see something like "/path/folder/imagexx.jpg" instead. But that's just prettiness on the surface; it's not actually saved that way. It's saved by UID because that's the only way to do link integrity.
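The UID->path mapping idea could be applied to the stored HTML roughly like this. This is only a sketch under some assumptions: the links are stored in the "resolveuid/<32 hex characters>" form, the image paths are the same on the old and the new site, and you have already exported the two dictionaries yourself. The names remap_uids, old_uid_to_path and path_to_new_uid are hypothetical.

```python
import re

# Matches "resolveuid/<32 hex chars>" as stored in the document body.
RESOLVEUID_RE = re.compile(r"resolveuid/([0-9a-f]{32})")


def remap_uids(html, old_uid_to_path, path_to_new_uid):
    """Rewrite resolveuid links so they use the UIDs of the new site."""

    def replace(match):
        old_uid = match.group(1)
        path = old_uid_to_path.get(old_uid)    # old UID -> path
        new_uid = path_to_new_uid.get(path)    # path -> new UID
        if new_uid is None:
            return match.group(0)  # unknown image: leave the link untouched
        return "resolveuid/" + new_uid

    return RESOLVEUID_RE.sub(replace, html)
```

Anything after the UID, such as an image scale suffix, is left as it is, and links whose UID is not in the mapping are not changed, so they can be found and fixed by hand afterwards.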

Removing images that aren't used sounds like an unrelated question. But you could probably write a script to get all images and find out whether any of them have references, using zc.relation.
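If querying the relation catalog is too much hassle, a cruder alternative is to scan the document bodies for resolveuid references and compare them with the list of image UIDs. The sketch below does not use zc.relation at all; it is a plain text scan, and it assumes you have already collected the image UIDs and the HTML of the pages you want to keep. The function name unused_image_uids is made up. Images that are referenced in some other way (by path, from a field other than the body text, from CSS) would wrongly show up as unused, so check the result before deleting anything.

```python
import re

# Matches "resolveuid/<32 hex chars>" as stored in the document body.
RESOLVEUID_RE = re.compile(r"resolveuid/([0-9a-f]{32})")


def unused_image_uids(image_uids, html_bodies):
    """Return the set of image UIDs that none of the HTML bodies link to."""
    used = set()
    for html in html_bodies:
        # Collect every UID referenced through a resolveuid link.
        used.update(RESOLVEUID_RE.findall(html))
    return set(image_uids) - used
```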

I did not know 'the correct way', so I took a copy of the site, then deleted all the images and ran a 'link checker' which gave me all the missing images.

After restoring the site (images), I deleted every image whose URL was not in that list.