Add an index for the new field

Is the documentation for adding an index to new fields on Dexterity objects (16. Behaviors – Mastering Plone 6 development — Plone Training 2023 documentation) still valid? If so, when you add a field that holds an uploaded file (such as a PDF), what index type do you assign to it so that the contents of the PDF are indexed? E.g.,

<?xml version="1.0"?>
<object name="portal_catalog">
  <!--<column value="my_meta_column"/>-->
</object>

... to ...

<?xml version="1.0"?>
<object name="portal_catalog">
    <index name="file" meta_type="??????????">
      <indexed_attr value="featured"/>
    </index>
</object>

Incidentally, can you do the same thing by adding a new index at (root)/portal_catalog/manage_catalogIndexes?

Short version:

  • use plone.indexer to provide a custom indexer to Plone
  • inside your indexer, you can use portal_transforms to convert the PDF to plain text with the built-in conversion pipeline, and return the result as the value of your indexer
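
The two points above can be sketched roughly as follows. This is only a minimal sketch, not a tested add-on: the helper names (pdf_to_text, index_file_text), the field name "file", the index name "file_text" and the interface IMyContentType are assumptions made for illustration. The helpers take the transform tool as an argument so they can be tried outside a Plone site; the commented part at the bottom shows how they would probably be hooked up with plone.indexer. Whether a PDF-to-text transform is available depends on your installation (as far as I know it relies on the pdftotext command-line tool being installed on the server).

```python
def pdf_to_text(transforms, pdf_bytes, filename="document.pdf"):
    """Convert PDF bytes to plain text using a portal_transforms-like tool.

    `transforms` is expected to offer convertTo(target_mimetype, data, **kw),
    as the portal_transforms tool does. Returns "" when nothing is converted.
    """
    if not pdf_bytes:
        return ""
    stream = transforms.convertTo(
        "text/plain",
        pdf_bytes,
        mimetype="application/pdf",  # source MIME type of the data
        filename=filename,
    )
    if stream is None:
        # No transform chain from PDF to plain text is available.
        return ""
    text = stream.getData()
    if isinstance(text, bytes):
        text = text.decode("utf-8", errors="replace")
    return text.strip()


def index_file_text(obj, transforms, field_name="file"):
    """Return the text to index for the file stored on `obj`.

    Assumes the field value has `data` and `filename` attributes,
    as a NamedBlobFile does. `field_name` is a hypothetical default.
    """
    value = getattr(obj, field_name, None)
    if value is None:
        return ""
    return pdf_to_text(transforms, value.data, value.filename or "document.pdf")


# Inside a Plone add-on, the wiring would look roughly like this
# (IMyContentType is a hypothetical schema interface):
#
#   from plone import api
#   from plone.indexer import indexer
#
#   @indexer(IMyContentType)
#   def file_text(obj):
#       transforms = api.portal.get_tool("portal_transforms")
#       return index_file_text(obj, transforms)
#
# and be registered in ZCML as a named adapter, where the name
# matches the name of the catalog index:
#
#   <adapter name="file_text" factory=".indexers.file_text" />
```

The adapter name is what the catalog looks up, so the index added in catalog.xml (or through the ZMI) has to carry the same name as the adapter, not the name of the field.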