Creating custom indexes on-the-fly...is it possible?

Question: Is it possible to register a single custom indexer to multiple adapters, and then, when the custom indexer is called, for it to detect which adapter caused it to be called?

Background: We built a Plone add-on with a Dexterity type with a few simple fields and then a jsonRepr field that stores all the object's information in JSON. This allows us to import any arbitrary "object" without having to create a new Dexterity type for each type of object. We'd like to be able to create custom indexes based on fields within the JSON field without having to write any new code specific to those new fields.

Scenario: Imagine that after importing some new objects, we display a list of all the JSON keys and let the (admin) user choose which need to be indexed.

I know how to use plone.indexer to create "virtual" field indexes on fields within the JSON field, like this:

import json

from plone.indexer import indexer

@indexer(IOurDataObject)
def our_zipcode(object):
    metadata = json.loads(object.jsonRepr or u'{}')
    return metadata.get('zip_code', u'')

But that approach still means that I have to create a new @indexer for each "virtual" field that we need. So...

I'm imagining a solution where I create a generic custom indexer that gets mapped to any "virtual" field in the jsonRepr field that we need indexed. Something like this:

@indexer(IOurDataObject)
def our_indexer(object):
    metadata = json.loads(object.jsonRepr or u'{}')
    #
    # ??? detect which adapter called this and use that to choose the "desired_field" to pull out of the JSON
    #
    return metadata.get(desired_field, u'')

I just don't know how to perform that important mystery step in the middle.

Yes, I'd still need to manually put lines for each new index into my configure.zcml, but I suspect I can solve that hurdle by registering the adapters in Python code. That's a separate challenge and I think I pretty much have a handle on that part.

Finally, please feel free to tell me I'm crazy and that I should be solving this in another fashion. I'm all ears!

Thanks in advance!

If I understood properly, you would like to be able to resolve the name that was used in the configure.zcml adapter registration (<adapter name="desired_field" factory=".indexers.our_indexer" />).

I'm afraid that the name is not made available in any sensible way. Technically, it is possible to figure it out by inspecting the Python interpreter stack (see "inspect — Inspect live objects" in the Python 2.7.18 documentation) and going back enough frames to see the name used in the adapter lookup, but this would be specific to the Python version and would need to be reimplemented when you migrate to Python 3.

Would you like to elaborate on this requirement a bit more? Is your goal to be able to register new indexers (for new indexes) without writing new code or restarting the site? (It is technically possible, but belongs to the area of "I know what I am doing", because of all the possible maintenance traps along the way.)

Thanks for your reply, Asko!

Yes, you understood my "adapter name" question perfectly. Thanks for the stack inspecting link. Hopefully I won't have to go down that road but good to know that it may be possible.

You also understand my goal. I want to register new indexers without writing new code or restarting the site.

Maybe there's another approach? I noticed that some of the custom indexer examples had a kw argument like this:

@indexer(IOurDataObject)
def our_indexer(object, **kw):

Do you have any idea if that kw is still used and what it's used for? For example, could it be used to pass an argument into the generic custom indexer of my dreams?

Thanks!

I have no idea about **kw in your indexers. plone.indexer certainly does not pass any keyword arguments.

But you could probably achieve the indexer of your dreams by working one level above those indexers:

  1. Implement a simple behavior that just adds a custom marker interface to your content types.

  2. Subclass the indexing wrapper from plone.indexer and register it for objects providing that interface (see plone/indexer/wrapper.py in the plone/plone.indexer repository on GitHub, at commit 3d9036e51c6ce4552fec165a2d86caed6645ed20).

Assuming your "on-the-fly" indexes had a known naming pattern, you could detect them in your custom indexing wrapper's __getattr__, return whatever you want for them dynamically, and call the superclass __getattr__ for everything else to keep the default behavior.
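The idea above can be sketched without Plone at all. In this minimal illustration, BaseWrapper is only a stand-in for the real plone.indexer wrapper (whose constructor and lookup logic differ), and the 'json_' prefix and the DemoObject class are hypothetical names chosen for the example:

```python
import json

PREFIX = 'json_'  # hypothetical naming pattern for on-the-fly indexes


class BaseWrapper(object):
    """Stand-in for the plone.indexer indexing wrapper (an assumption)."""

    def __init__(self, obj):
        self._obj = obj

    def __getattr__(self, name):
        # Only called when normal lookup fails; never proxy private names.
        if name.startswith('_'):
            raise AttributeError(name)
        return getattr(self._obj, name)


class JsonAwareWrapper(BaseWrapper):
    def __getattr__(self, name):
        if name.startswith(PREFIX):
            # Index names matching the pattern are answered from the JSON field.
            raw = getattr(self._obj, 'jsonRepr', None) or u'{}'
            return json.loads(raw).get(name[len(PREFIX):], u'')
        # Everything else keeps the default behavior.
        return super(JsonAwareWrapper, self).__getattr__(name)


class DemoObject(object):
    title = u'Demo'
    jsonRepr = u'{"zip_code": "90210"}'


wrapped = JsonAwareWrapper(DemoObject())
print(wrapped.json_zip_code)  # 90210
print(wrapped.title)          # Demo
```

With a wrapper along these lines, an index named json_zip_code would need no indexer of its own; the wrapper answers for any name that matches the pattern.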

Thank you again, Asko!

I have a feeling that your suggestion is a wise solution, but I have to admit that it might be a bit beyond my current Python abilities. (Give me a few more weeks. :slight_smile:)

Since my last post, I came up with a solution that I think is working. It admittedly is a slightly different direction than my original question, but it solves the same problem.

Instead of having a single custom indexer, I created a custom indexer factory like this:

def our_indexer_factory(key):
    @indexer(IOurDataObject)
    def our_indexer(object):
        metadata = json.loads(object.jsonRepr or u'{}')
        return metadata.get(key, u'')

    return our_indexer

Then, every time I need to create an on-the-fly index, I create one and use it like this:

provideAdapter(our_indexer_factory(desired_field), name=desired_field)

So, I end up with multiple dynamically generated custom indexers. It seems to be working in my initial tests, but I'm still putting it to use.

When I get more Python-proficient I'll try your likely more elegant solution.

Thanks!

I need exactly the same thing for c.ambidexterity and c.listingviews. I did notice that in some cases adapter.__name__ was set to the name, but I haven't found the code that did that, so I'm not sure that's reliable. It would be a super useful thing to have added to the ZCA.

Dynamically created indexers will work only as long as you run a single Plone server instance (not a ZEO-based setup) and you recreate and re-register them after every restart.

Thanks for those warnings, Asko. I may tackle your solution sooner rather than later!

I recall that Grok did this for grokked adapters. But with plone.indexer even that would not help, because the eventually created adapter instance is out of the scope of the indexer function.

To be honest, Plone also supports persistent (dynamic) adapter registrations (including indexers) that get shared between ZEO instances, but even that could not persist dynamically created functions (as in your example), and using "dynamically created persistent adapters" is usually a fast lane to a broken site unless you really know what you are doing. :slight_smile:

The ZCA does not set the name on the adapter object -- and it cannot do so in the general case: Note that in some cases, the adapter can be an elementary object (an int, tuple, str, etc) which does not support attribute assignment.

In some cases, the returned adapter has a __name__ attribute, e.g. when the adapter is a function. This is the object's __name__ (e.g. the function name), not the registered name.
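A small plain-Python illustration of that point, using a dict as a hypothetical stand-in for an adapter registry (the real ZCA registry is not involved here): functions produced by a factory all carry the inner function's name, whatever name they are registered under.

```python
def our_indexer_factory(key):
    def our_indexer(obj):
        return obj.get(key, u'')
    return our_indexer


# Hypothetical stand-in for a registry: registered name -> callable.
registry = {
    'zip_code': our_indexer_factory('zip_code'),
    'city': our_indexer_factory('city'),
}

# __name__ comes from the "def" statement, not from the registration.
names = sorted(set(func.__name__ for func in registry.values()))
print(names)  # ['our_indexer']
```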

Dynamically created functions are (typically) not persistent - but there is often a simple workaround: Instead of a dynamically created function, use a class instance the __call__ method of which would do what the dynamically created function should have done. In order for this to resolve the persistency issue, the class must be defined at module level (and thus, not dynamically).
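A minimal sketch of that workaround, assuming the JSON lives in a jsonRepr attribute as in the earlier posts. JsonKeyIndexer and FakeDataObject are hypothetical names, and how such an instance would be wired into plone.indexer or registered persistently is not shown:

```python
import json


class JsonKeyIndexer(object):
    """Module-level callable that extracts one key from an object's JSON field."""

    def __init__(self, key, attribute='jsonRepr'):
        # The instance state is plain data: the key and the attribute name.
        self.key = key
        self.attribute = attribute

    def __call__(self, obj):
        raw = getattr(obj, self.attribute, None) or u'{}'
        return json.loads(raw).get(self.key, u'')


class FakeDataObject(object):
    """Stand-in for a content object (for illustration only)."""
    jsonRepr = u'{"zip_code": "90210"}'


zip_indexer = JsonKeyIndexer('zip_code')
print(zip_indexer(FakeDataObject()))  # 90210
```

Unlike a closure returned by a factory, the instance holds only data, and its class can be found again by its import path, which is what pickling relies on.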

Note: You'll still need to add the indexes to the catalog at some point.

Given that ZCatalog indexes an attribute, why not implement __getattr__ instead?
Something like:

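import plone.dexterity.content


class JsonrepDX(plone.dexterity.content.Item):
    def __getattr__(self, key):
        try:
            # Try the regular Dexterity attribute lookup first.
            return super().__getattr__(key)
        except AttributeError:
            # Fall back to the data held in the JSON storage field.
            return getattr(self, 'json_storage_fieldname', {}).get(key)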

Agreed, this is the simplest answer.

Roel, thank you so much. I love your brilliant solution but I have to admit that I can't seem to get it to work. I'm sure I'm doing something wrong. I suspect I'm putting the __getattr__ definition in the wrong place.

I added your __getattr__() to our IOurDataObject, but if I understand correctly, IOurDataObject is simply used as the schema for plone.dexterity.content.Item (in the associated XML file), so no actual IOurDataObject instances are created, right?

<property name='schema'>our.app.ourdataobject.IOurDataObject</property>
<property name='klass'>plone.dexterity.content.Item</property>

So, are you suggesting that I extend plone.dexterity.content.Item (or maybe plone.dexterity.content as you show above), and then change my XML file to include the following?

<property name='klass'>JsonrepDX</property>

Thanks!

Yep, create a custom class like mine, add the __getattr__ to that class as in the example, and put the class's dotted path into klass.
This would be something like our.app.ourdataobject.JsonrepDX

Don't forget to add an index to the catalog, or the data won't be indexed.
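For reference, the fallback pattern can be tried outside Plone. In this sketch BaseItem is only a stand-in for plone.dexterity.content.Item, and JsonBackedItem is a hypothetical name. One design choice here is an assumption rather than part of the suggestion above: names starting with an underscore, and keys missing from the JSON, still raise AttributeError instead of returning None, so that hasattr() and similar checks behave as usual.

```python
import json


class BaseItem(object):
    """Stand-in for plone.dexterity.content.Item (for illustration only)."""

    def __getattr__(self, name):
        raise AttributeError(name)


class JsonBackedItem(BaseItem):
    jsonRepr = None  # JSON text, as stored by the schema field

    def __getattr__(self, name):
        try:
            return super(JsonBackedItem, self).__getattr__(name)
        except AttributeError:
            # Leave private and special names alone.
            if name.startswith('_'):
                raise
            data = json.loads(self.jsonRepr or u'{}')
            if name not in data:
                raise
            return data[name]


item = JsonBackedItem()
item.jsonRepr = u'{"zip_code": "90210"}'
print(item.zip_code)                 # 90210
print(hasattr(item, '__conform__'))  # False
```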

I implemented something like this in Plomino - https://github.com/plomino/Plomino/blob/ac48b02e53a480d3a0fcde54af41159ff982de18/src/Products/CMFPlomino/document.py#L1102

Hi Roel, thanks again for your suggestion. In the last few weeks my dynamically created indexers have been working fine and I have them get recreated at restart, but I really want to do it the "right" __getattr__ way you suggested so I'm tackling that now.

I think I'm really close but I get an error when I try to manually create an object using my new custom class.

I created a custom class like this (basically what you wrote except I added json.loads):

import json

from plone.dexterity.content import Item

class OurItem(Item):
    def __getattr__(self, key):
        try:
            return super(OurItem, self).__getattr__(key)
        except AttributeError:
            return json.loads(getattr(self, 'jsonRepr', {})).get(key)

Then I updated our our.app.ourdataobject.xml to include:

<property name='schema'>our.app.ourdataobject.IOurDataObject</property>
<property name='klass'>our.app.ouritem.OurItem</property>

Unfortunately, when I try to manually add an OurItem to the /Plone/debug/test/test2/ container, it throws this error:

  Module z3c.form.datamanager, line 91, in set
  Module z3c.form.datamanager, line 66, in adapted_context
TypeError: ('Could not adapt', <OurItem at /Plone/debug/test/test2/>, <SchemaClass our.app._base.IAbstractDataObject>)

Note that the error mentions "IAbstractDataObject" because that is the base interface of IOurDataObject.

Any idea what I'm missing that causes the dreaded "Could not adapt" error?

Thanks so much for any pointers anyone can give.

More debugging info (insights???) on my "Could not adapt" error. I added logging code like this:

class OurItem(Item):
    def __getattr__(self, key):
        _logger.info('Inside __getattr__ with ' + key)
        try:
            attr = super(OurItem, self).__getattr__(key)
            _logger.info('Just got super attr ' + key + ' = ' + attr)
            return attr
        except AttributeError:
            _logger.info('Except __getattr__ with ' + key)
            attr = json.loads(getattr(self, 'jsonRepr', {})).get(key)
            _logger.info('Just got our attr ' + key + ' = ' + attr)
            return attr

and the log showed this before throwing the error:

Inside __getattr__ with __provides__
Except __getattr__ with __provides__
Inside __getattr__ with jsonRepr
Inside __getattr__ with __provides__
Except __getattr__ with __provides__
Inside __getattr__ with jsonRepr
Inside __getattr__ with _v__providedBy__
Except __getattr__ with _v__providedBy__
Inside __getattr__ with jsonRepr
Inside __getattr__ with __conform__
Except __getattr__ with __conform__
Inside __getattr__ with jsonRepr
Inside __getattr__ with _v__providedBy__
Except __getattr__ with _v__providedBy__
Inside __getattr__ with jsonRepr
Inside __getattr__ with _v__providedBy__
Except __getattr__ with _v__providedBy__
Inside __getattr__ with jsonRepr

I'm guessing the repeats are because the add form has 8 fields???

So, since there are no "Just got" lines, I assume that means that the attempts to get the attribute in both the "try" and the "except" are both failing, right?

Any ideas?

Thanks!

The error has nothing to do with your __getattr__ logic. Instead, it is caused by a missing adapter registration or a missing implements/implementer declaration.

The error occurs when the infrastructure (z3c.form) tries to adapt your OurItem to IAbstractDataObject. Apparently, OurItem neither declares that it implements IAbstractDataObject nor is a corresponding adapter registered.

I assume that your class definition lacks the intended implements(IAbstractDataObject) (but better use the implementer class decorator to be compatible with upcoming Python 3). Should your class not implement this interface, then you would need to register a corresponding adapter.