Problems with "api.content.create" and indexing

I am working on a synchronization script that takes data from an API and creates objects from it.

My script created the objects successfully in a newly created instance, but when I ran it in the live instance (which already has many more imported objects), it raised an error. I want to import about 6000 objects, and each of them has multiple images.

The current error occurs when creating the image files, but the same error is also raised when creating the artwork objects (a custom content type). This is the error I get when I run the script:

Module Products.PluginIndexes.unindex, line 213, in insertForwardIndexEntry
TypeError: '<' not supported between instances of 'int' and 'str'

The error seems to be related to the indexing of the database. Even though I deleted the previously created objects (the same custom type), there seems to be a conflict which causes it to give errors.

When I try to clear and rebuild the catalog using the Plone UI, Plone restarts itself. When I try to reindex, it also crashes.

I tried re-cataloging in debug mode, but it keeps giving the same warning for all of the objects:
WARNING:plone.app.contenttypes.indexers:Lookup of PrimaryField failed for Plone/nl/collectie-onderzoek/collectie/kunstwerken/nathan-phillips-square-a-winters-night-skating/456c0b48d32bfd7c01ade92b46b54a9c558efe7b-jpg If renaming or importing please reindex!

Has anyone seen this error before? If it is indeed about the indexing: Is there a better way to fix the indexing of the database other than using Plone UI? Here is the script I use for importing the images:

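import time

import requests
from plone import api
from plone.folder.interfaces import IExplicitOrdering
from plone.namedfile.file import NamedBlobImage
from requests.exceptions import RequestException

# IMAGE_BASE_URL and HEADERS are module-level settings defined elsewhere in the script.

def import_images(container, images):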
    MAX_RETRIES = 3
    DELAY_SECONDS = 5

    # Delete the existing images inside the container
    for obj in api.content.find(context=container, portal_type='Image'):
        api.content.delete(obj=obj.getObject())

    for image in images:
        primaryDisplay = image.get('PrimaryDisplay')
        retries = 0
        success = False

        # Tries MAX_RETRIES times and then raise exception
        while retries < MAX_RETRIES:
            try:
                with requests.get(
                    url=f"{IMAGE_BASE_URL}/{image.text}", stream=True, verify=False, headers=HEADERS
                ) as req:  # noqa
                    req.raise_for_status()
                    data = req.raw.read()

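                    # An HTML page came back instead of image data. Count it as a
                    # failed attempt; a bare `continue` would retry forever.
                    if "DOCTYP" in str(data[:10]):
                        retries += 1
                        continue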
                
                    imagefield = NamedBlobImage(
                        # TODO: are all images jpegs?
                        data=data,
                        contentType="image/jpeg",
                        filename=image.text,
                    )
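                    # Use a separate name for the created object so that `image`
                    # still refers to the source element (image.text is used below).
                    image_obj = api.content.create(
                        type="Image",
                        title=image.text,
                        image=imagefield,
                        container=container,
                    )

                    if primaryDisplay == '1':
                        ordering = IExplicitOrdering(container)
                        ordering.moveObjectsToTop([image_obj.getId()])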
                    
                    success = True
                    break

            except RequestException as e:
                retries += 1
                if retries < MAX_RETRIES:
                    time.sleep(DELAY_SECONDS)
                else:
                    print(f"Failed to fetch image {image.text} after {MAX_RETRIES} attempts: {e}")

        if not success:
            print(f"Skipped image {image.text} due to repeated fetch failures.")

    return f"Images {images} created successfully"

Here is the full version of the errors:


Versions:

  • Plone 6.0.2 (6013)
  • Volto 16.21.0
  • CMF 2.7.0
  • Zope 5.8
  • Python 3.11.5 (main, Aug 24 2023, 12:23:19) [Clang 15.0.0 (clang-1500.0.40.1)]
  • PIL 9.4.0 (Pillow)
  • WSGI: On
  • Server: waitress 2.1.2

Maybe this thread can give a hint. It seems you get the traceback in the Zope BTree "get" function. Maybe you have keys of different types in the BTree. If you can reproduce the live instance, you can put a breakpoint just before line 213 of Products.PluginIndexes.unindex, then inspect the keys of the _index BTree and the key you're trying to get, and check whether they're of the same type.
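
A minimal sketch of that check, assuming you can run it from a debug session on a copy of the live site. The helper name `key_types` is hypothetical; it only counts the Python types of whatever keys you pass in. The commented-out catalog lines assume the index is an `UnIndex`-based index that keeps its forward index in `_index`, and `my_index` is a placeholder for the real index name.

```python
from collections import Counter


def key_types(keys):
    """Count how many keys of each Python type an index holds."""
    return Counter(type(key).__name__ for key in keys)


# Stand-in for the keys of an index that received both ints and strings.
sample_keys = [1, 2, 3, "4", "5"]
print(key_types(sample_keys))

# Against a real site (assumption: debug session, placeholder index name):
# catalog = api.portal.get_tool("portal_catalog")
# index = catalog._catalog.getIndex("my_index")
# print(key_types(index._index.keys()))
```

If more than one type shows up for a single index, that index is a likely candidate for the TypeError.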

Another test is to try to get a traceback when you run "Clear and Rebuild" instead of a restart. You can also try to reindex a single index at a time to find which index is causing the problem, then delete it and recreate it.
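
Reindexing one index at a time could be scripted roughly like this. It is only a sketch: `find_failing_indexes` and `fake_reindex` are hypothetical names, and the runnable part uses a stand-in reindex function. The commented lines at the end assume `portal_catalog` exposes `indexes()` and `reindexIndex(name, REQUEST)`; please check them against your ZCatalog version before relying on them.

```python
def find_failing_indexes(index_names, reindex):
    """Reindex each index separately and collect the ones that raise."""
    failures = {}
    for name in index_names:
        try:
            reindex(name)
        except Exception as exc:  # broad on purpose: we only want the index name
            failures[name] = repr(exc)
    return failures


# Stand-in reindex function that fails for one hypothetical index name.
def fake_reindex(name):
    if name == "broken_index":
        raise TypeError("'<' not supported between instances of 'int' and 'str'")


print(find_failing_indexes(["Title", "broken_index", "portal_type"], fake_reindex))

# Against a real site (assumption, untested here):
# catalog = api.portal.get_tool("portal_catalog")
# failures = find_failing_indexes(
#     catalog.indexes(), lambda name: catalog.reindexIndex(name, None)
# )
```

The indexes that end up in the result are the ones worth clearing, or deleting and recreating.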

Busy, so did not read all, but YES: I have seen this before.

Basically, it is when your field type is not the same when added via api as it is when added 'the normal way'.

So your field is a string, and you write an int from the api

so change somevalue to str(somevalue).


If that is not the case, it is the index that was created 'as int' and then changed to string later.

If so, clear and rebuild your catalog

Cihan Andac via Plone Community wrote at 2023-10-6 10:52 +0000:

...
Module Products.PluginIndexes.unindex, line 213, in insertForwardIndexEntry
TypeError: '<' not supported between instances of 'int' and 'str'

(Unlike Python 2), Python 3 can no longer compare objects of different
types (unless they are "coercible" to a common type).

The error above tells you that you try to add different types
("int" and "str") into an index. With Python 3, this no longer works.
Ensure all values for an indexed field have the same type.
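
A small illustration of the point above, as a sketch rather than the catalog's actual code: ordering mixed `int` and `str` values fails in Python 3, and coercing every value to one type before it reaches the index avoids that. The helper name `as_index_value` is hypothetical, and converting to `str` is just one possible choice of common type.

```python
def as_index_value(value):
    """Coerce a field value to str so every entry in the index has one type."""
    return None if value is None else str(value)


# Mixed types cannot be ordered in Python 3, which is what the index trips over.
try:
    sorted([1, "2"])
except TypeError as exc:
    print(f"TypeError: {exc}")

# After coercion the same kind of values can be ordered without errors.
print(sorted(as_index_value(v) for v in [3, "2", 1]))
```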

Thank you so much for your answer. I was able to locate the problematic index value by following your advice and reindexing the fields one by one.

Thank you @espenmn and @dieter for the great explanation of the problem. It helped me a lot to understand why it happened in the first place.

For anyone reading this in the future: it turns out I had created some of the objects with an integer field value, and later, when I changed the code, I converted the value to a string before creating the objects. That mix of types caused the error. I found the problematic index, cleared it and then reindexed it. Now everything is working fine.

The strange thing is the unexpected restart of the instance on the catalog "clear and rebuild", while deleting the single index worked. If it is reproducible, it would be good to post an issue in the zcatalog issue tracker.

It is indeed strange behavior. It would be very hard to write down steps to reproduce this problem, because I took over this project from another developer and I am also in the dark about how some things got here. But if I ever face this problem again, I will post it to the zcatalog issue tracker as you suggested.

I have had this several times. If I remember right, it is enough to define a field as int, create some content, change the field to text and add more content.

I think I had a choice field with 1, 2, 3, ... and then added 1, 2, 3 from the api (I used the choice field to show the editor '1 urgent', '2 medium' etc.)

We have the same issue with a migration import into Plone 6 with an export from Plone 4.
This error is truly annoying and should/must not happen with a fresh site.

I agree with you, it is really annoying. @yurj's advice worked like a charm for me. In case you haven't seen it: I cleared and reindexed all the fields one by one, until I found the problematic one.

The problem in our case is that the customer has a field order in various content-types, used for different purposes. Some definitions are a TextLine, others are an Int. So the mismatch of types is explained (and can be fixed).

The fundamental problem is ZCatalog. This module is almost 25 years old (I also had my fingers in the code for many years). At the time, the implementation was cool and working (but we did not know better). But with the eyes of today, the implementation is just broken, badly designed and just a mess. Unfortunately, nobody had the courage to replace this mess with a better implementation. ZCatalog stinks on a high level.

But it works out of the box, is super easy to use, and solves 99% of the use cases for a site.

We need a better model. For example facet support is missing. Facet queries can be done with the actual model but it is not trivial and not standard.

Maybe starting to list what is missing could help. For example, autocompletion is missing OOTB. Text indexing could be improved, and language-specific searches are missing as well.

Completely off topic, but for some cases I have made indexers to update two indexes (from one field), saving it as an int in index 1 if it is an int and something else in index 2 if it is not an int.
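
Roughly what I mean, as a sketch only: one helper decides which of the two indexes a value belongs to, so each index only ever sees one type. The name `split_for_indexes` is made up for this example, and the wiring into real indexers (for example with `plone.indexer`) is left out and would be an assumption about your setup.

```python
def split_for_indexes(value):
    """Return (int_value, other_value); at most one of them is not None."""
    # bool is a subclass of int, so it is deliberately treated as "not an int".
    if isinstance(value, int) and not isinstance(value, bool):
        return value, None
    return None, value


for raw in [7, "7", "urgent"]:
    print(raw, "->", split_for_indexes(raw))
```

Index 1 would store the first element when it is set, index 2 the second, so neither index gets a mix of int and str.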