Database size increase and Plone 6

Hi all!

I have an interesting "bug": since we updated from Plone 5.2.14 to 6.0.7, our PostgreSQL database using RelStorage (which started at 77 GB) has been growing by roughly 1 GB per day.

Note that the database is configured in history-free mode.

We moved to RelStorage back in November last year, and it was working fine, see graph below.

Notice that when we moved to Plone 6 in early March this year, the disk usage trend skyrockets :rocket:

Mid-March we started packing the database (that's the small drop in disk usage), but nonetheless it keeps growing :confounded:

We have lots of database conflicts regarding image scales, but a conflict should not add anything to the database, since the conflicting transaction never gets committed...

Are there any debugging tools to see what is being added to the database on a daily basis? :thinking:

Debugging the ZODB is still one of my weak points :sweat:
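
The only idea I have so far is to group the object_state rows by the day encoded in their tid, roughly like below (a sketch, not something I have fully run yet; it assumes the standard history-free RelStorage schema on PostgreSQL and psycopg2 being available):

```python
# Sketch: sum object_state row sizes per day, using the timestamp encoded
# in each tid. In history-free mode this shows when the *current* copy of
# each object was last written, which is enough to spot daily churn.
from datetime import date

import psycopg2
from persistent.timestamp import TimeStamp
from ZODB.utils import p64

def tid_to_date(tid):
    # A tid is a 64-bit ZODB TimeStamp; timeTime() is seconds since epoch.
    return date.fromtimestamp(TimeStamp(p64(tid)).timeTime())

conn = psycopg2.connect("dbname=plone user=plone")  # adjust the DSN
per_day = {}
with conn, conn.cursor(name="scan") as cur:  # server-side cursor
    cur.execute("SELECT tid, state_size FROM object_state")
    for tid, size in cur:
        day = tid_to_date(tid)
        per_day[day] = per_day.get(day, 0) + size

for day in sorted(per_day):
    print(day, f"{per_day[day] / 1024 ** 2:.1f} MB")
```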

Any help will be highly appreciated! :bowing_man:t4:

Disclaimer: I do not have any RelStorage site in production right now...

This rings a bell; things I would check:

  • IIRC the tables object_ref and object_refs_added will grow quite a lot.
  • I think it is quite common to truncate them from time to time.
  • They will be recreated anyway at the next pack (adding some extra time to the packing process).
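
From memory, something along these lines (just a sketch; double-check the table names against your schema and try it on a copy first):

```python
# Sketch: empty the pack bookkeeping tables; RelStorage rebuilds them
# during the next pack run.
import psycopg2

conn = psycopg2.connect("dbname=plone user=plone")  # adjust the DSN
with conn, conn.cursor() as cur:
    cur.execute("TRUNCATE object_ref, object_refs_added")
```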

1 Like

Thanks! I'm already truncating some tables, otherwise the daily packing was not even working :person_shrugging:t4:

1 Like

This does not look like a packer problem; it looks more like some objects are still being referenced even though they were deleted. While history-free mode does not keep old revisions after an object is updated, it still needs packing to get rid of deleted objects. But if some reference to an object is left, that object cannot be deleted. I would look in this direction first.

Yes, that's what I'm thinking as well; the question is how to spot that.

I'm trying to use the relstorage_packer, but it still hasn't finished building the inverted index :sweat_smile:

Any tips on finding objects that still hold references to things that should have been deleted?
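
The only alternative I can think of is to walk every pickle in object_state and record which OIDs it references, so I can look up who still points at a supposedly deleted object. Roughly (a sketch, untested at this scale; it will be slow and memory-hungry on a database this size):

```python
# Sketch: build a "who references whom" map by extracting the OIDs each
# stored pickle points to. The OID queried at the end is a placeholder.
from collections import defaultdict

import psycopg2
from ZODB.serialize import referencesf
from ZODB.utils import u64

conn = psycopg2.connect("dbname=plone user=plone")  # adjust the DSN
referenced_by = defaultdict(set)

with conn, conn.cursor(name="scan") as cur:  # server-side cursor
    cur.execute("SELECT zoid, state FROM object_state")
    for zoid, state in cur:
        if state is None:
            continue
        for ref in referencesf(bytes(state)):
            referenced_by[u64(ref)].add(zoid)

# Who still points at OID 12345 (placeholder) even though it should be gone?
print(sorted(referenced_by.get(12345, ())))
```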

Given that this started to happen with the Plone 6 upgrade, I'm guessing (no proof, of course) that the new image_scales column might be involved?

Another interesting point: an export of the database is currently 50 GB, but listing the tables' sizes, the biggest one is less than 5 GB... so where are the other 45 GB coming from? :thinking:

@gforcada my 5 cents: do you have the blobstorage within the database?

When you look at the size of the tables you are probably not counting the PostgreSQL large objects where the blob data actually lives; the tables themselves only store OID pointers to them...

So I think the increase in size may be related, as you assumed, to the scales of the images stored in the database.
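
To check, it should be enough to compare the "visible" RelStorage tables with pg_largeobject, where PostgreSQL actually keeps the large-object data. A quick sketch (assuming the standard RelStorage table names):

```python
# Sketch: print the total on-disk size of the main RelStorage tables and
# of pg_largeobject, which holds the actual blob bytes.
import psycopg2

conn = psycopg2.connect("dbname=plone user=plone")  # adjust the DSN
with conn, conn.cursor() as cur:
    for table in ("object_state", "blob_chunk", "pg_largeobject"):
        cur.execute(
            "SELECT pg_size_pretty(pg_total_relation_size(%s))", (table,)
        )
        print(table, cur.fetchone()[0])
```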

1 Like

relstorage_packer does not help here. I wrote it once to get around RelStorage's limitation of building the reverse reference tree in memory. This was a bad idea for the project we had: with a large database it took ages and died with an out-of-memory error (we had 24 GB).

Meanwhile RelStorage has evolved in several respects, so packing is no longer done in memory. Instead it grows the database by filling a table with that information (which can be truncated if you do not want to keep it: TRUNCATE object_refs_added;).

I agree with you and @mamico: have a look at the image scale handling, since there were some changes between 5.2 and 6.0. Storing blobs in the ZODB is fine with PostgreSQL and the latest RelStorage; it has advantages in terms of transaction timing. Anyway, if there is no other add-on with possible problems, this would be my first place to look too.

The image_scales column is just some data, nothing binary involved. This should not blow up the database that much. :man_shrugging:

If you export/zodbconvert the database to FileStorage with a separate blobstorage, what is the ratio of filestorage to blobs there?
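
From memory, the zodbconvert config would look roughly like this (the DSN and paths are placeholders, adjust them; then run zodbconvert on the file and compare the size of Data.fs with the blobs directory):

```
%import relstorage

<relstorage source>
  keep-history false
  # blobs live inside the database, so no shared blob dir on the source
  shared-blob-dir false
  blob-dir ./blob-cache
  <postgresql>
    dsn dbname='plone' user='plone' host='localhost'
  </postgresql>
</relstorage>

<filestorage destination>
  path ./Data.fs
  blob-dir ./blobs
</filestorage>
```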

Yes, blobs are inside the database.

A compressed export of the database is 50 GB, so there have to be a lot of images in there...

Oh, so then I can stop the relstorage_packer? :sweat_smile:

Good point about exporting/zodbconvert-ing the database. I will set that up and report back :crossed_fingers:t4:

Disk Usage - PostgreSQL wiki has some queries that will show what is taking up all the space. That should give you some pointers on where to look next.

1 Like

Thanks! :star_struck:

Indeed, there is a suspicious table named pg_largeobject with 67 GB.

It seems I cannot do a simple SELECT 1 FROM pg_largeobject to poke at what's inside :confused:

Looking into how the image scales are being generated seems to be the way to go :pick: let's dig :smile:

1 Like

pg_largeobject is where the blobs are stored.

Some progress has been made: :four_leaf_clover:

So, what did we do?

  • removed the image scales that we don't use and tweaked the srcset generation in @@imaging-controlpanel, which also listed scales we don't use (see the sketch after this list)
  • removing old image scales (see below) gave a ~10 GB drop in database size
  • VACUUM FULL ANALYZE gave a ~30 GB drop in database size
  • updating to the latest Plone release (from 6.0.7 to 6.0.10.1) might also have helped? :thinking:
  • updated to the latest plone.scale version (4.1.2)
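
For the first bullet point, the scale cleanup boils down to trimming a registry record; roughly this (a sketch, not our exact values, and the srcset/picture variants we changed through @@imaging-controlpanel itself):

```python
# Sketch: keep only the scales we actually use. "plone.allowed_sizes" is
# the stock Plone 6 registry record behind @@imaging-controlpanel; the
# list below is just an example, not our real configuration.
import transaction
from plone import api

api.portal.set_registry_record(
    "plone.allowed_sizes",
    [
        "huge 1600:65536",
        "large 800:65536",
        "preview 400:65536",
        "mini 200:65536",
        "thumb 128:128",
    ],
)
transaction.commit()
```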

I very much expect the last two bullet points to have also helped with our ConflictError rate: we had around 1k such errors per day, almost 99% of them about plone.scale.storage.ScalesDict, and my guess is that plone.scale==4.1.2 is the hero here :tada: @wesleybl thanks a lot!! :star_struck:

As for "old image scales", I meant that if you look at an object's annotations, IAnnotations(obj)['plone.scale'], you can find three different kinds of keys there:

  • new keys (FIELD-WIDTH-HASH): like image-960-1d61aa218da10b6dcd06d6421c04746e
  • old keys (HASH): like 1d61a-a21-8da10b-6dcd06d-6421c04746e (I added the dashes off the top of my head; I no longer have a valid example, sorry, but you get the idea :sweat:)
  • tuples of the form (('width', XXX), ('height', XXX), ...): quite verbose, but you get the idea as well

If I understood it correctly, starting with Plone 6 all image scales are stored under the new keys, but the old ones were not removed, which basically means that all image scales got duplicated :boom:

Crawling the website to remove those old scales, then packing and vacuuming the database, claimed that extra storage back :crossed_fingers:t4:
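
For the record, the annotation side of that cleanup boils down to something like the script below (a sketch, not literally what we ran; our actual crawl went through HTTP, and you should try any direct deletion on a copy of the database first):

```python
# Sketch: drop every 'plone.scale' annotation key that does not follow
# the new FIELD-WIDTH-HASH naming (old plain hashes and tuple keys).
import re

import transaction
from plone import api
from zope.annotation.interfaces import IAnnotations

# e.g. "image-960-1d61aa218da10b6dcd06d6421c04746e"; adjust the pattern
# if your field names contain other characters.
NEW_KEY = re.compile(r"^[a-z0-9_]+-\d+-[0-9a-f]{32}$")

def purge_old_scales():
    catalog = api.portal.get_tool("portal_catalog")
    for count, brain in enumerate(catalog.unrestrictedSearchResults(), 1):
        obj = brain.getObject()
        scales = IAnnotations(obj).get("plone.scale")
        if not scales:
            continue
        stale = [
            key for key in list(scales)
            if not (isinstance(key, str) and NEW_KEY.match(key))
        ]
        for key in stale:
            del scales[key]
        if count % 500 == 0:
            transaction.savepoint(optimistic=True)
    transaction.commit()
```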

I think we still have some extra storage to be dropped, but I'm not sure where exactly to look now :thinking:

If I have new findings, I will let you know :smile:

8 Likes