Hi, I have installed collective.documentviewer on Plone 5.1. I was under the impression that it performs OCR, extracts text from documents, and so on, in order to index the files in the catalog.
PDF and DOC files are indexed. ODT (LibreOffice) files are not.
I have visited portal_catalog/manage_objectInformation?rid= for both a DOC file and an ODT file: the DOC has SearchableText filled with the document text, while the ODT entry contains only the filename.
If I run the commands:
/usr/local/bin/docsplit pdf /tmp/tmpO4OgEW/dump.odt --output /tmp/tmpO4OgEW
/usr/local/bin/docsplit text /tmp/tmpO4OgEW/dump.pdf --language eng --no-ocr --pages all --output /tmp/tmpO4OgEW/text
I get a file whose content matches the text of the ODT, so the conversion itself works. But this text does not end up in the catalog.
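For anyone who wants to repeat the manual check on other files, the two commands can be scripted. This is only a minimal sketch: the helper names `docsplit_commands` and `run_docsplit` are made up for illustration, the docsplit path and flags are copied from the commands above, and it assumes the intermediate PDF is named after the source file (dump.odt becomes dump.pdf, as in my run). It does not touch Plone or the catalog.

```python
import os
import subprocess

# Path copied from the manual commands; adjust for your system.
DOCSPLIT = "/usr/local/bin/docsplit"


def docsplit_commands(source_path, workdir, docsplit=DOCSPLIT):
    """Build the two argument lists: convert to PDF, then extract text.

    Hypothetical helper: it assumes the PDF produced by the first step
    is named after the source file (dump.odt -> dump.pdf).
    """
    base = os.path.splitext(os.path.basename(source_path))[0]
    pdf_path = os.path.join(workdir, base + ".pdf")
    to_pdf = [docsplit, "pdf", source_path, "--output", workdir]
    to_text = [
        docsplit, "text", pdf_path,
        "--language", "eng", "--no-ocr", "--pages", "all",
        "--output", os.path.join(workdir, "text"),
    ]
    return to_pdf, to_text


def run_docsplit(source_path, workdir):
    """Run both steps in order; raises CalledProcessError on failure."""
    for cmd in docsplit_commands(source_path, workdir):
        subprocess.check_call(cmd)


if __name__ == "__main__":
    # Only print the commands here; call run_docsplit() to execute them.
    for cmd in docsplit_commands("/tmp/tmpO4OgEW/dump.odt", "/tmp/tmpO4OgEW"):
        print(" ".join(cmd))
```

Running the script as-is only prints the two command lines, so it is safe to try on a machine without docsplit installed.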
There is an old issue from 2015, https://github.com/collective/collective.documentviewer/issues/57, which may or may not suggest that the add-on is supposed to add the extracted text to the catalog.
Should I expect collective.documentviewer to index the ODT into the catalog correctly?
Versions:
collective.documentviewer = 5.0.4
repoze.catalog = 0.8.3
Products.ZCatalog = 3.0.3
ZEO = 5.1.1