Content item missing? (But not really?) (BTree weirdness)

Hi there!

I've come across an issue that (to me at least) is pretty bizarre, and I was wondering if anyone here had some input?

First off, for context, this is a large site, that has existed for quite a while, and has been migrated from Plone 4 to 5 (and to Castle). However, the item in question has only existed since the site was moved to Castle.

I have a document, lets call it "doc".
Doc lives at /path/to/doc

If you navigate to /path/to/doc, it shows up, as expected.

If you enter a debug console, and do something like "Plone['path']['to']['doc']", you get the document back, as expected.

However, if you run "Plone['path']['to'].contentItems()", doc isn't listed. (or .items() or .objectIds() )

If you navigate the browser to /path/to/folder_contents, doc does not show up in the content listing.

If you attempt to go to /path/to/doc/manage, OR /path/to/manage, you get a "Index out of range" error.

If you browse the indexes in portal_catalog, you can find doc, listed as /path/to/doc

Doing a simple catalog query, like catalog(path={'query': '/path/to/'}) returns doc, along with the other folder contents.

Since this issue seems potentially catalog related, I did a site-wide catalog rebuild, and everything shows up exactly the same.

Theories? Options to try next?

Thanks in advance!

It might be a problem with the catalog, try a reindex of the object or if nothing helps the whole catalog, but that might take a while for a big site.

Check what Plone[‘path’][‘to’]._objects contains. The OFS methods use _objects to check which object should be there, but somehow this got messed up.

Also see https://github.com/zopefoundation/Zope/blob/master/src/OFS/ObjectManager.py#L410

Could it be that the rebuild did not finish ?

Perhaps the doc in question has an illegal id? Perhaps in the catalog it is doc/ and not just doc? This was my experience with a client site before. I just deleted and recreated the item to resolve the issue.

So, tried this, and it's returning ()...despite there still being several documents living in the folder, and showing up correctly? Hmm...maybe the issue is with the folder itself?

Eh, possibly, but I'm a bit skeptical. This happened both before and after the rebuild. I even tried manually reindexing all the content items involved.

Unfortunately, I tried both of these options, and it didn't change anything :frowning:

Is there anything that separates these content items from others ? (Exclude from navigation enabled , some behavior on them, that they once were migrated form ATContentTypes, some behavior on the 'folder'?

Does the same content-type work properly elsewhere in the portal ?

with f = folder at Plone[‘path’][‘to’] use
[x for x in f.objectIds(ordered=False)]
to work around a possible index problem.

BTW I think it's normal you get () for _objects, I see the same thing with an instance without any mysterious problem.

Might be a btreefolder? I think it's parent._tree in that case.

Check the parent.__dict__ as well. Maybe somehow the docs ended up in there, so traversal and indexing would still work, but 'doc' in obj won't (I guess).

1 Like

After Reading your post again, it seems that it has nothing to do with the catalog, as objectIds() does not use the catalog. I would inspect the container obj and it's attributes. Maybe there is something odd. If you get the obj with restricted traverse also check there is no redirect involved. But I think it's most likely something on the container and it's attributes.

So, not following you 100% (not really familiar with how btrees are used in Plone)....but, looked, and 'doc' is in parent._tree. For what it's worth 'doc' in obj did work too? Not sure what to make of any of this :slight_smile:

Try to pbp into parent.objectIds() to see what it is doing.
Though, an easy way out should be to just assign the doc to parent['doc']. My guess is that it would reset the underlying datastructure.

Initial thoughts:

  • Are you able to create a new Document with the same id? Or delete the document and recreate it?
  • Was there anything deleted from the parent folder that did not actually get completely deleted? Sometimes you can see these items in the ZMI. I've seen this happen, but it usually produces a KeyError with the name of the half-deleted page. Any KeyErrors in the error log?

After some more poking around, it seems there is a discrepancy between BTreeFolder2 and plone.folder in terms of how the number (and IDs) of contained items is tracked. In one way of looking at the container, there are 6 items; in another way there are 7. The BTreeFolder2 code contains comments that leave one less than reassured about the reliability of the implementation... Trying to call things like _setOb() start to work but fail in another layer that is notified of the change. There are way too many layers in there!

Dug in a little further. it appears that the "notifyAdded" event didn't get fired in plone.folder.ordered: https://github.com/plone/plone.folder/blob/master/src/plone/folder/ordered.py#L63

I went into a debug client, and manually ran that, with the id of the missing document, commited, and now everything is working as expected again!

The "Index out of range" errors were occuring because, for whatever reason, some areas of Plone rely on plone.folder.ordered.objectIds for the folder listing, but they pull the folder content count from the underlying BTree (which isn't related, and was correct, in this instance) so there was a mismatch between the expected # of content items, and the listing that was being provided.

As far as WHY this happened, I haven't the slightest clue. If anyone has any theories, I'd gladly hear them, but for now I'm counting this as "good enough".

Thanks for the help everyone!

1 Like

Ok, so the behaviour makes sense given what happend :slight_smile:

If the item was migrated, does the migration code perhaps add the item directly into the btree, instead of using _setOb?
Is the container class an OrderedBTreeFolderBase, actually?
I'm just wondering what could've happen here.

It wouldn't seem that it was migrated, as it is a relatively recently created item.

For what it's worth (aka to get Kim off my back :stuck_out_tongue_winking_eye: ), here's the script I ran to find any further broken folders. Hopefully someone can make use of it in the future.

from zope.component.hooks import setSite

import argparse
import json
import transaction


parser = argparse.ArgumentParser(
    description='...')
parser.add_argument('--site-id', dest='site_id', default='Plone')
args, _ = parser.parse_known_args()

user = app.acl_users.getUser('admin')  # noqa
newSecurityManager(None, user.__of__(app.acl_users))  # noqa
site = app[args.site_id]  # noqa
setSite(site)

cat = site.portal_catalog
folders = cat(portal_type='Folder')
for folder in folders:
    ob = folder.getObject()
    a = ob.objectIds()
    b = [x for x in ob._tree]

    if set(a) != set(b):
        print(ob.__dict__)
        import pdb; pdb.set_trace()

    else:
        print(ob.absolute_url())
2 Likes