Automating ZODB Pack

I have a raw Zope5 running, no Plone, no ZEOServer, and am trying to figure out how I can automate ZODB Packs via CRON. With Zope2 there was the zodb-pack script but that no longer exists with Zope5.

I cannot use something like curl -X POST to the URL, for two reasons:

  1. The username and password would be in plain text in the crontab
  2. I have certificate authentication set up for the Zope Admin

I need to be able to do this from the command line.

Thanks!

kittonian via Plone Community wrote at 2023-2-12 16:48 +0000:

I have a raw Zope5 running, no Plone, no ZEOServer, and am trying to figure out how I can automate ZODB Packs via CRON. With Zope2 there was the zodb-pack script but that no longer exists with Zope5.

You can use an External Method (from package Products.ExternalMethod)
and there use:

from Zope2 import DB
DB.pack()

Alternatively, you can install ZEO, access your storage via ZEO
and pack with zeopack.
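
A minimal sketch of such an External Method follows. The function name pack_database, the days parameter and the optional db argument are illustrative assumptions, not part of the original reply; db exists only so the function can be exercised outside a Zope process.

```python
def pack_database(days=1, db=None):
    """Pack the main ZODB, keeping `days` days of history."""
    if db is None:
        # Zope2.DB is only set once Zope has started, so import it lazily.
        from Zope2 import DB as db
    days = float(days)  # values passed in a request arrive as strings
    if days < 0:
        raise ValueError("days must not be negative")
    # ZODB's DB.pack accepts a `days` keyword argument.
    db.pack(days=days)
    return "packed, keeping %g day(s) of history" % days
```

Whether the External Method object that wraps this function requires authentication is decided by its security settings in Zope.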

Dieter Maurer wrote at 2023-2-12 18:38 +0100:

You can do this also in a view (registered via ZCML).

Appreciated, but neither of those solutions works from the command line; both require accessing a URL, which I cannot do because of client certificate authentication. I need a straight command line function to do this. Also, this is Zope5 as I mentioned, not Zope2.

kittonian via Plone Community wrote at 2023-2-12 18:35 +0000:

Appreciated, but neither of those solutions works from the command line; both require accessing a URL, which I cannot do because of client certificate authentication.

You decide whether the ExternalMethod/View requires authentication.

I need a straight command line function to do this. Also, this is Zope5 as I mentioned, not Zope2.

Zope2 is a Zope [4|5] subpackage; the name does not mean Zope 2.

If you insist on a command line approach, you
can either use ZEO or shut Zope down and then execute
the code sequence shown for the ExternalMethod/view
in a zconsole run script.

There is an additional possibility when you want packing on
a strict schedule: start a thread during Zope startup
which calls pack (as shown for the ExternalMethod/view)
periodically.
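
A rough sketch of the periodic-thread idea, using only the standard library. The helper name start_periodic_pack is hypothetical, and the pack callable is left as a parameter; inside Zope it could be something like `lambda: Zope2.DB.pack(days=1)`, which is an assumption about how it would be wired up.

```python
import logging
import threading

log = logging.getLogger(__name__)

def start_periodic_pack(pack, interval_seconds, stop_event=None):
    """Call `pack()` every `interval_seconds` from a daemon thread."""
    stop_event = stop_event or threading.Event()

    def loop():
        # Event.wait returns False on timeout and True once the event is set,
        # so the loop sleeps between packs and exits promptly on stop.
        while not stop_event.wait(interval_seconds):
            try:
                pack()
            except Exception:
                # Log and keep going so one failed pack does not end the schedule.
                log.exception("periodic pack failed")

    thread = threading.Thread(target=loop, name="zodb-pack", daemon=True)
    thread.start()
    return thread, stop_event
```

Setting the returned event stops the loop. Because the thread runs inside the Zope process, it uses the already opened storage and does not conflict with the FileStorage lock.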

There is also z3c.offlinepack (on PyPI), which I was using before switching to ZEO. It has the disadvantage that Zope must be stopped before packing can start. So I'd suggest looking into installing ZEO to be able to pack while Zope is running.

I'm confused. I don't need to stop Zope when packing. I just go to control panel and click pack.

As for "insisting" on the command line: this is for an automated backup script where the Zope DB needs to be packed before the script commences.

Also, my Zope, running Waitress, is in a Docker container and is proxied to via Apache. Apache is what mandates the client certificates, so literally no URL will be allowed to access the website, unless the security measures are in place. Certainly I can lessen the security but I'm not looking to do that.

I don't need an offline pack solution. I was just looking for how to make this happen from the command line instead of the ZMI.

One idea I had, but wasn't sure was correct, was:

  1. source /zope5/bin/activate
  2. Create a python script like
import time
from ZODB.config import databaseFromString
from ZODB.serialize import referencesf

# "<zodb_config>" stands for the text of the ZODB configuration section
db = databaseFromString("<zodb_config>")
storage = db.storage
# pack(t, referencesf): discard revisions that were superseded before time t
storage.pack(time.time(), referencesf)
  3. python scriptname

I'd really like the ability to specify how many days so I can pack to 1 day, 0 days, 3 days, etc.
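
On the question of days: ZODB's DB.pack takes a days argument (for example db.pack(days=3)), so the day count can be passed wherever the pack is eventually run. The sketch below shows how a day count maps to a pack timestamp; pack_cutoff and pack_keeping_days are hypothetical helper names.

```python
import time

SECONDS_PER_DAY = 86400

def pack_cutoff(days, now=None):
    """Return the timestamp before which old revisions may be discarded."""
    if days < 0:
        raise ValueError("days must not be negative")
    now = time.time() if now is None else now
    return now - days * SECONDS_PER_DAY

def pack_keeping_days(db, days):
    """Pack `db` (any object with a ZODB-style pack(t) method)."""
    db.pack(pack_cutoff(days))
```

With days=0 the cutoff is the current time, so all superseded revisions become eligible for removal.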

A pure filestorage database - as I suppose you run it - can be opened by only one process, which is the WSGI Server (or ZServer in former times/older installations). Once opened it is locked for good reasons.

Given the above, you cannot run the pack in a separate script without stopping the server.

As mentioned above there are two possibilities to overcome this without stopping the server:

  1. Send a request to the server to perform the pack
  2. Use ZEO, to be able to connect with several clients to the database server.

If that is true, how is it possible to run a pack through the ZMI while Zope is running without bringing it down?

Jens W. Klein via Plone Community wrote at 2023-2-13 16:59 +0000:

A pure filestorage database - as I suppose you run it - can be opened by only one process, which is the WSGI Server (or ZServer in former times/older installations). Once opened it is locked for good reasons.

Given the above, you cannot run the pack in a separate script without stopping the server.

As mentioned above there are two possibilities to overcome this without stopping the server:

  1. Send a request to the server to perform the pack

There is a variant to 1 -- already outlined in a previous comment:
arrange that the server performs the pack either periodically on its own
or based on an external request.
To do this, a special thread could be started on server startup
(e.g. in an event handler for the "database opened" event)
which handles the packing.

One implementation (out of a huge number) for an external pack request:
An external pack request could have the form `touch <do_pack>`
where "<do_pack>" is a communication file. The thread could monitor
this file and start a pack as soon as the file gets a new modification date
(by the `touch`).
If it is important to know when the packing is finished, this could
be achieved with a second communication file.
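
The communication-file variant can be sketched as follows. This is only one possible shape under stated assumptions: watch_trigger_file is a hypothetical name, the file is polled rather than watched through OS notifications, and the pack callable is supplied by the caller (inside Zope it might wrap `Zope2.DB.pack`).

```python
import logging
import os
import threading

log = logging.getLogger(__name__)

def watch_trigger_file(path, pack, poll_seconds=5.0, stop_event=None):
    """Call `pack()` whenever the modification time of `path` changes."""
    stop_event = stop_event or threading.Event()

    def mtime():
        try:
            return os.stat(path).st_mtime_ns
        except FileNotFoundError:
            return None  # the file has not been touched yet

    def loop(last_seen):
        while not stop_event.wait(poll_seconds):
            current = mtime()
            if current is not None and current != last_seen:
                last_seen = current
                try:
                    pack()
                except Exception:
                    log.exception("triggered pack failed")

    # Record the starting mtime before the thread runs, so a touch that
    # happens right after this call is not missed.
    thread = threading.Thread(
        target=loop, args=(mtime(),), name="zodb-pack-watch", daemon=True
    )
    thread.start()
    return thread, stop_event
```

A cron job would then only need to touch the file. Signalling completion through a second file, as suggested above, could be added inside the pack callable.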

kittonian via Plone Community wrote at 2023-2-13 18:29 +0000:

If that is true, how is it possible to run a pack through the ZMI while Zope is running without bringing it down?

Because this is the same process:
Accessing a storage requires synchronisation (to ensure
that modifications from different activities are not merged in an uncontrolled way).
FileStorage uses "in process" synchronization;
therefore, it cannot allow different processes to access it concurrently.

Right. I'm not looking to start another process to access it concurrently. I'm simply asking how to trigger it from the command line. If it can be triggered from the ZMI via clicking Pack, I would have to believe it's possible to trigger it from the command line because the Pack button is calling a function.

kittonian via Plone Community wrote at 2023-2-13 19:38 +0000:

Right. I'm not looking to start another process to access it concurrently. I'm simply asking how to trigger it from the command line. If it can be triggered from the ZMI via clicking Pack, I would have to believe it's possible to trigger it from the command line because the Pack button is calling a function.

The Zope process processes HTTP requests sent to it, among
them HTTP requests corresponding to a press of the "Pack" button.
Therefore, when you press the "Pack" button,
it is the Zope process which performs the packing.

On the command line, you are in a different process.
You must communicate with the Zope process to tell it that
it should call the "pack" function.

You can use HTTP for this -- but you do not want this.

You can enhance the Zope process such that it supports
different forms of communication (example in my previous reply) --
but apparently, you do not want this either.

You can use ZEO instead of directly using FileStorage.
But apparently, you do not want this either.

You can shut down the Zope process and then perform
the packing in a new process. But apparently, you do not want this, either.

This was my last attempt. If you do not get it, I cannot help you.

I understand exactly what you're saying. I also understand about processes. I think what we're missing here is that any time there is a button on a website, that button runs a function. I was trying to find a way to run that function without using a separate process, and without calling a URL. Basically initiating the process manually from the command line just like we used to be able to do with the zodb-pack script.

With regards to your "you don't want to do this", it's not that I don't "want" to, I can't.

  1. I cannot use curl because there is a client certificate involved and I would have to hard code the username and password authorized for this process which is a huge security risk. I would not encourage anyone to do anything like this if they are the least concerned about security.

  2. Perhaps I misunderstood what you were suggesting because this sounds like it could be a promising solution. What I understood is you suggesting an external method that wouldn't require a username/password and still calling to it via curl. If that's the case, refer to my answer to #1. If not, please explain in more detail.

  3. I am not interested in redoing our entire Zope install for ZEO when this solution has been in place for a number of years and is working flawlessly. The only thing I would gain, is the ability to pack the db.

  4. I don't know why anyone would want to shut down Zope in order to do a pack. You are taking all users offline and causing major disruptions. This is not a workable solution if you have a userbase.

Please let me know if I misunderstood your suggestion referenced in #2 above. Thanks.

kittonian via Plone Community wrote at 2023-2-13 21:36 +0000:

...

  1. I cannot use curl because there is a client certificate involved and I would have to hard code the username and password authorized for this process which is a huge security risk. I would not encourage anyone to do anything like this if they are the least concerned about security.

If I remember correctly, the client certificate is checked by
the web server in front of Zope.

For internal purposes, you typically access Zope directly,
i.e. not via a web server in front of Zope.
In this case, you might avoid the client certificate.

But even if not: you can issue HTTP requests with a client certificate.
The certificate must be accessible on the host.
But, almost all hosts have sensitive data and there are ways to
protect them.

  2. Perhaps I misunderstood what you were suggesting because this sounds like it could be a promising solution. What I understood is you suggesting an external method that wouldn't require a username/password and still calling to it via curl. If that's the case, refer to my answer to #1. If not, please explain in more detail.

Again: you can hide sensitive information (such as login/password
combinations) in files only readable
by very few (and trusted) people.
There is no need to put those directly in a crontab.
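
As an illustration of both points (credentials kept in a protected file, and a request that presents a client certificate), here is a hedged standard-library sketch. The function names, the "user:password" file format and the form field name "days" are assumptions; the pack URL depends on the installation and is left as a parameter.

```python
import base64
import os
import ssl
import stat
import urllib.parse
import urllib.request

def read_credentials(path):
    """Read 'user:password' from a file that only its owner may access."""
    mode = stat.S_IMODE(os.stat(path).st_mode)
    if mode & (stat.S_IRWXG | stat.S_IRWXO):
        raise PermissionError("%s must not be accessible by group/others" % path)
    with open(path, encoding="utf-8") as f:
        user, _, password = f.read().strip().partition(":")
    return user, password

def build_pack_request(url, user, password, days):
    """Build a POST request carrying HTTP Basic credentials."""
    raw = ("%s:%s" % (user, password)).encode("utf-8")
    token = base64.b64encode(raw).decode("ascii")
    # "days" as the form field name is an assumption for illustration.
    data = urllib.parse.urlencode({"days": days}).encode("ascii")
    request = urllib.request.Request(url, data=data, method="POST")
    request.add_header("Authorization", "Basic " + token)
    return request

def send_with_client_cert(request, certfile, keyfile):
    """Send the request, presenting a client certificate to the server."""
    context = ssl.create_default_context()
    context.load_cert_chain(certfile=certfile, keyfile=keyfile)
    with urllib.request.urlopen(request, context=context) as response:
        return response.status
```

The crontab entry then only names the script; the credentials file and the certificate key stay readable by the account that runs the job and nobody else.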

  3. I am not interested in redoing our entire Zope install for ZEO when this solution has been in place for a number of years and is working flawlessly. The only thing I would gain, is the ability to pack the db.

The difference is small.
The time of this discussion would have been enough for you
to learn how to use ZEO and set it up.

  4. I don't know why anyone would want to shut down Zope in order to do a pack

Likely, he does not want to; but he must if he does not like the alternatives.
You will likely find that out yourself.

You cannot enter a running process. You can attach to it (basically stopping it at a particular step and letting you continue from there).

This topic is a few months old, but I would consider setting up ZEO for this purpose. It comes with a zeopack executable that you can easily run from the command line. If you run just a single Zope/WSGI client connected to the ZEO server, then the added overhead is minimal: you don't have to worry about configuring load balancing or anything. This way you can also run zeopack while your single Zope instance is running.