Database size increase and Plone 6

Hi all!

I have an interesting "bug": since we updated from Plone 5.2.14 to 6.0.7 we see that our PostgreSQL database using RelStorage (starting at 77Gb) is increasing on a daily basis 1Gb.

Note that the database is configured in history-free mode.

We moved to RelStorage back in November last year, and it was working fine, see graph below.

Notice that when we moved to Plone 6 in early March this year, the disk usage trend sky rockets :rocket:

Mid-March we started packing the database (that's the small drop on disk usage) but nonetheless, it still keeps growing :confounded:

We have lots of database conflicts regarding image scales, but a conflict should not add anything on the database, as it conflicted...

Is there any debugging tools to see what's being added on a daily basis to the database? :thinking:

Debugging the ZODB is still one of my weak points :sweat:

Any help will be highly appreciated! :bowing_man:t4:

Disclaimer: I do not have any relstorage site on production right now...

This rings me a bell, try to check:

IIRC the tables object_ref object_refs_added will grow quite a lot.
I think it is quite usual to truncate them from time to time.
They will be anyway recreated at the next pack (adding some more time to the packing process).

1 Like

Thanks! I'm already truncating some tables, otherwise the daily packing was not even working :person_shrugging:t4:

1 Like

This looks not like a packer problem, but more like some objects are still holding references, even if they were deleted. While history free mode does not keep the old version after an update of an object it still needs to pack, to get rid of deleted objects. But if some reference is left it is not possible to delete it. I would first look in this direction.

Yes, that's what I'm thinking as well, the question is how to notice that?

I'm trying to use the relstorage_packer, but it still hasn't finished building the inverted index :sweat_smile:

Any tips on finding objects that still hold references? Though they should be deleted?

Given that this started to happen with the Plone 6 upgrade, I'm guessing, no proof of course, that the new image_scales column might be involved?

Another interesting point is that an export of the database is 50 GB currently, but listing the tables' sizes, the biggest one is less than 5 Gb... so where are all those 45 GB coming from? :thinking:

@gforcada my 5cents, do you have the blobstorage within the database?

When you look at the size of the tables you are probably not counting the oid field of postgresql (the blob files), you are just measuring the file pointers...

So I think the increase in size may be related, as you assumed, to the scales of the images stored in the database.

1 Like

relstorage_packer does not help here. I made it once to get rid of the limitation of Relstorage to build the reverse reference tree in memory. This was a bad idea with our the project we had, with a large database it took ages and stopped with out of memory (we had 24GB).

Meanwhile Relstorage evolved in several aspects, so the packing is now no longer in memory. It now blows up by filling a table with the information (which can be truncated if you do not want to keep it: TRUNCATE object_refs_added;).

I agree with you and @mamico to have a look at image scale handling since there were some changes between 5.2 and 6.0. Storing blobs in ZODB is fine with Postgresql and latest RelStorage, it has advantages transaction timing wise. Anyway, if there is no other add-on with possible problems: this would be my first place too look.

The image_scales column is just some data, nothing binary involved. This should not blow up the database that much. :man_shrugging:

If you export/zodbconvert the database to filestorage with blobstorage separate, what is the ratio of fs/blobs there?

Yes, blobs are inside the database.

A compressed export of the database is 50Gb, so there has to be a lot of images there...

Oh, so then, I can stop the relstorage_packer ? :sweat_smile:

Good point of exporting/zodbconvert the database. I will set that up and report back :crossed_fingers:t4:

Disk Usage - PostgreSQL wiki has some queries that'll show what is taking up all the space. Should give you some pointers on where to look next.

1 Like

Thanks! :star_struck:

Indeed, there is a suspicious table named pg_largeobject with 67Gb.

Seems I can not do a simple SELECT 1 FROM pg_largeobject to poke what's inside :confused:

Looking at the image_scales being generated seems to be what I should look into :pick: let's dig :smile:

1 Like

pg_largeobject is where the blobs are stored.