Hi,
I have some Plone 4.3.x websites with a very large Data.fs (> 30 GB). After analysing it, it turns out that most of the volume is taken up by images.
I thought that images were stored in the blobstorage, no?
For example:
plone.app.imaging.scale.ImageScale 57.3%
OFS.Image.Pdata 39.1%
Is there a way to reduce the Data.fs volume?
Thanks
yurj
(Yuri)
July 8, 2020, 8:59am
2
Did you try to pack it, maybe locally, using: ?
zopyx
(Andreas Jung)
July 8, 2020, 9:19am
3
We use this script to purge all image scales as part of a migration:

```python
import transaction
from zope.annotation import IAnnotations
from zope.component.hooks import setSite

KEY = 'plone.scale'

site = app.plone_portal
setSite(site)
catalog = site.portal_catalog
query = dict(portal_type=['Image'])
for brain in catalog(**query):
    obj = brain.getObject()
    annos = IAnnotations(obj)
    if KEY in annos:
        print(obj.absolute_url(), 'SCALE REMOVED')
        del annos[KEY]
    else:
        print(obj.absolute_url(), 'no SCALES found')
transaction.commit()
```
espenmn
(Espen)
July 8, 2020, 10:38am
4
I don't think they always do (there are different 'field types' in use).
Personally, I consider it a bug if it is possible to have an 'Image' saved as a blob, but its scales stored as part of Data.fs.
Is there any use for that? Maybe for icons and thumbnails?
djay
(Dylan Jay)
July 8, 2020, 10:52am
5
I believe a newer version does store scales as blobs.
zopyx
(Andreas Jung)
July 8, 2020, 11:20am
6
This is correct (at least in Plone 5.2).
Plone 4 clearly used to store scales within the ZODB. But I do not recall at which point the storage of scales switched to blobs.
zopyx
(Andreas Jung)
July 8, 2020, 11:21am
7
jihaisse:
OFS.Image.Pdata 39.1%
This looks like some very old cruft....
rafaelbco
(Rafael Oliveira)
July 10, 2020, 10:57am
8
Yes, they are (at least on Plone 4.3+). But the objects which reference the actual image blobs are kept in the Data.fs. How many instances of ImageScale do you have?
Also, you can use collective.zodbdebug to see if the image scale blobs are in fact stored in the blobstorage.
And, for reference you may also check these issues:
opened 11:41PM - 17 Sep 18 UTC
bug
I was checking annotations on one cover object and I found the following:
```python
>>> len([k for k in IAnnotations(obj) if k.startswith('plone.tiles.data')])
85
>>> len([k for k in IAnnotations(obj) if k.startswith('plone.tiles.configuration')])
61
>>> len([k for k in IAnnotations(obj) if k.startswith('plone.tiles.scale')])
200
```
that means we have annotations for 200 scales but only 28 of them seem to be related to existing tiles:
```pypdb
>>> scales = {k.split('.')[3] for k in IAnnotations(obj) if k.startswith('plone.tiles.scale')}
>>> tiles = {k.split('.')[3] for k in IAnnotations(obj) if k.startswith('plone.tiles.data')}
>>> len(scales & tiles)
28
```
opened 03:51AM - 22 Sep 18 UTC
type: question
I have an application where image scales are heavily used, and once they are generated they are stored forever, even when they are not useful anymore.
So I'm thinking about implementing a new `IImageScaleStorage` implementation, where scales are stored only in RAM. If memory were infinite it would be easy. However, I would like to store the scales in a cache, where the least used scales would eventually get purged (probably Zope's RAM cache, but that's just an implementation detail). The problem is when a client of the storage requests an image by its UID and it is not in the cache anymore. Example: when a user requests the scale by its UID-based URL.
To solve this the storage itself could re-generate the scale, but it doesn't know what the scaling parameters are (width, height, etc.). Unless the parameters are contained in the UID!
In the README I found this statement:
> image scaling parameters should not be part of the generated URL. Since the number of parameters can change and new parameters may be added in the future this would create overly complex URLs and URL parsing.
However I thought of an easy way to solve the problems mentioned in the statement. I could encode the parameters as JSON and apply base64 encoding. For example:
```
>>> parameters = "{width:100,height:100,direction:'down',quality:88}"
>>> uid = base64(parameters)
e3dpZHRoOjEwMCxoZWlnaHQ6MTAwLGRpcmVjdGlvbjonZG93bicscXVhbGl0eTo4OH0K
>>> len(uid)
68
```
The URL would not be that big (68 chars in the example), parsing and encoding are straightforward, and it is future-proof regarding inclusion of new parameters.
So what do you think? Do you see any caveats I am missing? Do you think such a scale storage would be generally useful and worth including in this package? I could open a PR for that.
PS: This is actually neither an issue nor a question; it is a request for thoughts about this idea. I thought about posting it on the Plone Community forum, but I think here I have a better chance of reaching people who know more about scaling specifics. I hope this is OK.
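The encoding idea from that issue can be sketched in plain Python. Note that `encode_scale_uid` and `decode_scale_uid` are hypothetical helper names, not a real plone.scale API; this only demonstrates that JSON plus URL-safe base64 gives a compact, reversible token carrying the scaling parameters.

```python
import base64
import json

def encode_scale_uid(**params):
    # Hypothetical helper: serialize the scaling parameters to JSON, then
    # URL-safe base64 (trailing padding stripped so the token is URL-friendly).
    raw = json.dumps(params, sort_keys=True).encode('utf-8')
    return base64.urlsafe_b64encode(raw).decode('ascii').rstrip('=')

def decode_scale_uid(uid):
    # Re-add the stripped padding before decoding back to a dict.
    raw = base64.urlsafe_b64decode(uid + '=' * (-len(uid) % 4))
    return json.loads(raw)

uid = encode_scale_uid(width=100, height=100, direction='down', quality=88)
params = decode_scale_uid(uid)
```

The round trip recovers the exact parameters, so a storage receiving only the UID could re-generate a purged scale, which is the property the issue is after.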
Hi,
Thanks for all your replies.
Finally, we found another solution: removing orphaned revisions.
This reduced the Data.fs from 30 GB to 3 GB!
rafaelbco
(Rafael Oliveira)
July 16, 2020, 12:19pm
10
How did you do that? I might have to do this too.
You can remove orphaned revisions manually in portal_historiesstorage, I think. But it is easier with collective.revisionmanager: install this as an add-on and you get a nice control panel.
Also, for image scales, this script, run with bin/instance run on the command line, can help.
Note that when you do any of this, you will not see the Data.fs or blobstorage size decrease yet. You first have to pack, keeping zero days of history.