Update the timestamp while getting an existing entry in the zope.ramcache cache

Hi,

I intended to post this as an "issue" for zope.ramcache, but it is probably better to discuss it here first.

For now we patched the existing Storage, but here are my observations:
When the cache is cleaned (at the given cleanupInterval), removeStaleEntries checks maxAge and removes every entry older than maxAge seconds. The "problem" is that the timestamp of an entry is the time it was first computed and stored in the cache, even though it may have been last accessed only 3 seconds before the cache is cleaned...

So we changed the behavior by updating the timestamp of the entry in getEntry, in addition to incrementing the access count (which is already the case). This way, a cleanup removes entries that are really stale (not accessed for maxAge seconds) and keeps cached values that are still correct but were computed a long time ago...
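To make the difference concrete, here is a minimal sketch of the idea. It is a toy model, not the actual zope.ramcache code: the class SketchStorage, the refreshOnAccess flag and the injectable clock are hypothetical names used only for illustration, and the real Storage also indexes entries per object.

```python
import time


class SketchStorage:
    """Toy model of a RAM cache storage (not zope.ramcache's real class)."""

    def __init__(self, maxAge=3600, refreshOnAccess=False, clock=time.time):
        self.maxAge = maxAge
        # Hypothetical opt-in flag: when True, a read renews the timestamp.
        self.refreshOnAccess = refreshOnAccess
        self._clock = clock
        self._data = {}  # key -> [value, timestamp, access_count]

    def setEntry(self, key, value):
        self._data[key] = [value, self._clock(), 0]

    def getEntry(self, key):
        entry = self._data[key]  # raises KeyError on a cache miss
        entry[2] += 1  # count the access
        if self.refreshOnAccess:
            entry[1] = self._clock()  # proposed change: renew the timestamp
        return entry[0]

    def removeStaleEntries(self):
        # Drop every entry whose timestamp is older than maxAge seconds.
        limit = self._clock() - self.maxAge
        stale = [key for key, entry in self._data.items() if entry[1] < limit]
        for key in stale:
            del self._data[key]
```

With refreshOnAccess left at False, an entry read a few seconds before the cleanup is still removed once maxAge has passed since it was stored. With the flag set to True, it is kept until maxAge has passed since its last read.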

Does that make sense to people who know zope.ramcache?

I think we cannot change the current behavior, but we could probably add a new parameter to Storage.__init__ to handle this?

Or is it a bad idea?

Thank you!

Gauthier

Hi!

With this approach, some items might never be refreshed. I had a similar problem in a Python project. My strategy was:

If more than X minutes (an interval like maxAge) have passed since the last check, use a subrequest (actually it was a thread) to get the new value and update the cache, but return the last cached value immediately.

So the user will get a fast response, and next time, the user will find the updated value. This strategy is ok for resources with frequent access.

You can also have a floor value: if more time than the floor value has passed, get the uncached value, cache it, and return it to the user. This way, you are sure the returned value is never too old, even if no user has accessed the resource for some time. For example, the first access in the morning will get the updated value if nobody has accessed the resource within the floor time. That situation is quite uncommon on the web right now, so I think it is not so important to implement a floor limit.
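The strategy described above can be sketched roughly as follows. This is only an illustration under stated assumptions, not the code from that project: the class RefreshingCache, its parameters refresh_after and floor, and the helper wait_for_refreshes are all hypothetical names.

```python
import threading
import time


class RefreshingCache:
    """Serve the cached value at once and refresh it in a background thread."""

    def __init__(self, compute, refresh_after, floor, clock=time.time):
        self._compute = compute              # function key -> fresh value
        self._refresh_after = refresh_after  # age that triggers a refresh
        self._floor = floor                  # age beyond which we must wait
        self._clock = clock
        self._lock = threading.Lock()
        self._entries = {}     # key -> (value, stored_at)
        self._refreshing = {}  # key -> running refresh thread

    def _store(self, key):
        value = self._compute(key)
        with self._lock:
            self._entries[key] = (value, self._clock())
            self._refreshing.pop(key, None)
        return value

    def get(self, key):
        now = self._clock()
        with self._lock:
            entry = self._entries.get(key)
            if entry is not None:
                value, stored_at = entry
                age = now - stored_at
                if age <= self._refresh_after:
                    return value  # fresh enough
                if age <= self._floor:
                    # Old but usable: start one refresh, answer immediately.
                    if key not in self._refreshing:
                        worker = threading.Thread(
                            target=self._store, args=(key,), daemon=True
                        )
                        self._refreshing[key] = worker
                        worker.start()
                    return value
        # Missing, or older than the floor: compute synchronously.
        return self._store(key)

    def wait_for_refreshes(self):
        """Block until the refreshes running right now are done (for tests)."""
        with self._lock:
            workers = list(self._refreshing.values())
        for worker in workers:
            worker.join()
```

The user who triggers the refresh still gets the old value; the next call sees the new one. Only an entry older than the floor makes the caller wait for a fresh computation.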

@yurj thank you for your response, but I am not sure why some items could never be refreshed: if the data is not fetched through getEntry, the date is not updated, and the cleanup will remove the entry once it is older than maxAge.

Today, plone.memoize handles this by fixing maxAge at 86400 seconds (24 hours). As I understand it, the date is not updated, so cleaning the cache relies mainly on the number of entries (maxEntries).

But what we want here, and what we think is more objective, is to set maxAge to 1800 seconds (30 minutes) and cleanupInterval to 600 seconds (10 minutes), but maxEntries to 50,000 instead of the default 1000. This way the cache is full of entries that are really used, and a stale entry is cleaned after 30 minutes at most.

When using ram.cache intensively (relying on performance tests) with a lot of content (our biggest client has 1,000,000 (1 million) objects in the catalog and 1200 users), keeping maxEntries at 1000 and cleanupInterval at 300 (5 minutes), the default values, means that 95% of the useful cache is cleaned every 5 minutes... So the application remains more or less permanently under load.
Here is the setup we are testing:

from plone.memoize.ram import global_cache

global_cache.update(maxEntries=100000, maxAge=1800, cleanupInterval=1200)

Thank you for your time,

Gauthier