Getting the search engine to see private docs

Hello,

I was reading the documentation and from what I understand, once you have a folder in "private" mode, including all its contents, the search engine does not index the files and pages you create in that folder. But if this is the case and I'm trying to build an intranet whereby documents are visible ONLY to those that are logged on and that are visible to the search engine, how do I go about creating such an environment?

Many thanks

Everything is indexed, independent of its state. The state of the parent container is irrelevant for indexing.
Filtering is applied at query time, based on your current security context.

"private" is related to workflow (often publication workflow). Something in "private" mode is work under development, not yet ready to be seen by "normal" users. Typically, when the author has finished his work, he "publishes" the object. Depending on the workflow in effect, this may not directly lead to publication but instead trigger a "review" step (involving editors/reviewers), after which the object is either published (typically viewable by everyone) or rejected.

As Andreas has pointed out, every object is indexed (independent of its workflow state/mode). What objects are returned by searches depends on the current user's "View" permissions. The workflow typically is set up such that only very few users (typically the "Owner", maybe "Manager"s and "Site Administrator"s) have "View" permission on objects in "private" state (or mode).
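To make the "index everything, filter at query time" idea concrete, here is a minimal sketch in plain Python. It is a toy model, not the real Plone catalog API: the `IndexedDoc` class, the `search` function, the paths and the `user:` token format are all assumptions made up for illustration.

```python
# Toy model of query-time security filtering; NOT the real Plone catalog API.
from dataclasses import dataclass


@dataclass(frozen=True)
class IndexedDoc:
    path: str
    state: str
    # Roles and users holding "View", stored when the object was indexed.
    allowed_roles_and_users: frozenset


def principals_for(user_id, roles):
    """Build the set of tokens a user is matched against at query time."""
    # Anything granted to "Anonymous" is effectively granted to everyone.
    tokens = set(roles) | {"Anonymous"}
    if user_id is not None:
        tokens.add("user:" + user_id)
    return tokens


def search(catalog, user_id, roles):
    """Return the paths of all indexed documents the given user may view."""
    tokens = principals_for(user_id, roles)
    return [doc.path for doc in catalog if doc.allowed_roles_and_users & tokens]


# Both documents are indexed, whatever their workflow state (hypothetical data).
CATALOG = [
    IndexedDoc("/procedures/howto", "published", frozenset({"Anonymous"})),
    IndexedDoc("/finance/invoice-1", "private", frozenset({"Manager", "user:user1"})),
]

anonymous_hits = search(CATALOG, None, [])
owner_hits = search(CATALOG, "user1", ["Member"])
```

In this sketch the private invoice is in the index all along; the anonymous visitor simply never gets it back, while its owner does.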

Read the documentation about "workflow" to learn more.

Cheers zopyx and dieter. I read the workflow docs but I'm not sure I have digested it all. So in my example, where I have USER1, USER2, USER3 on my intranet, all belonging to GROUP1, if I make my top hierarchy folder "viewable" to GROUP1, then they will all be able to search within that folder, correct? But because the folder remains in "private" mode, although those in GROUP1 can see the content in the searches, the wider world can't because it's not published.

I guess what I'm trying to understand is how to make all my content visible to GROUP1 without exposing it to non-logged-in users on the intranet. So in essence, what I'm trying to achieve is that everyone can view some content on the intranet, independently of whether they are logged in or not, and GROUP1 can view a whole other bunch of additional content, but only when they are logged in (for security purposes, i.e. I want to expose procedures to the wider intranet but invoices only to the finance team). Is there a clever way of doing this without leaving the folders/content for GROUP1 indefinitely in "private" mode? (This is the only way I have worked out to keep invoices visible to GROUP1 but not to everyone else.)

Thanks

Sorry, I cannot follow what you are doing. If in doubt, use the Intranet workflow for your content, where you can "publish internally" and "publish externally".

-aj

Please take a closer look at https://docs.plone.org/working-with-content/collaboration-and-workflow/index.html and https://training.plone.org/5/workflow/index.html. As @zopyx already suggested, it would be better to change the workflow to "Intranet", which gives you way more options than the default workflow and may better fit your use case.

Ah, so quite clearly I've missed the bit where it refers to "publish internally" and "publish externally", which appears to be exactly what I'm looking for. Is this https://docs.plone.org/adapt-and-extend/config/content-settings.html?highlight=advanced%20workflow where I set it up? Yes, I accept that it sounded a bit convoluted, but I think you understood where I was trying to get to.

So I would set up the intranet/extranet workflow as the "default" workflow, whereby the "published" status is limited to internal only? And then create a new workflow (still need to figure out how I create a new "default2" workflow) that allows me to publish externally and internally. Wow, this is incredibly flexible. I was banging my head trying to work this out when someone has already invented the wheel ...

Cheers

The search looks at the permissions on the object itself, not at those of ancestor objects. Thus, even if an ancestor is "private", a search may return (non-private) objects below it.

In addition, workflow is usually set up to control some permissions (especially the "View" permission) on the objects controlled by the workflow. Special definitions are required to ensure that the viewability of an ancestor is inherited by the objects further down the hierarchy; in particular, a state change on the ancestor must ensure that the index related to the "View" permission ("allowed_roles_and_users", or similarly spelled) is updated for the contained content objects. I am not sure that Plone's "Folder Workflow" is properly set up for this (in fact, I have my doubts). It is possible to set up a workflow which behaves as you seem to want it, but it is not trivial.
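The reason the index update matters can be shown with a small toy model in plain Python. This is only a sketch under stated assumptions: `Content`, `ToyCatalog` and `reindex_security` are hypothetical names invented here, and the model simply assumes that the catalog keeps a snapshot of who may view each object, taken at indexing time.

```python
# Toy model: the catalog keeps a snapshot of who may view each object.
# All class and method names here are hypothetical, not Plone APIs.
class Content:
    def __init__(self, path, view_roles, children=()):
        self.path = path
        self.view_roles = set(view_roles)
        self.children = list(children)


def walk(obj):
    """Yield the object itself and then all of its descendants."""
    yield obj
    for child in obj.children:
        yield from walk(child)


class ToyCatalog:
    def __init__(self):
        self.allowed = {}  # path -> frozenset of roles, copied at index time

    def reindex_security(self, obj):
        """Refresh the snapshot for obj and everything below it."""
        for item in walk(obj):
            self.allowed[item.path] = frozenset(item.view_roles)

    def search(self, roles):
        roles = set(roles)
        return sorted(p for p, allowed in self.allowed.items() if allowed & roles)


invoice = Content("/finance/invoice-1", {"Manager"})
folder = Content("/finance", {"Manager"}, [invoice])
catalog = ToyCatalog()
catalog.reindex_security(folder)

# A hypothetical state change opens the folder and its contents to "Member".
for item in walk(folder):
    item.view_roles.add("Member")

stale_hits = catalog.search(["Member"])  # snapshot has not been refreshed yet
catalog.reindex_security(folder)
fresh_hits = catalog.search(["Member"])  # now includes folder and invoice
```

Until the snapshot is refreshed for the contained objects too, searches keep answering from the old permissions, which is the pitfall described above.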

You might be interested in Products.CMFPlacefulWorkflow. It allows you to have "local" workflow associations. For "vanilla Plone", the workflow association is global (for the whole Plone portal); the above extension allows you to give subhierarchies their own workflow associations.

You could achieve this with a specialized workflow or, more precisely, a specialized definition for the "View" permission. This definition would grant "GROUP1" the "View" permission independent of the workflow state, but grant it to "Anonymous" only in the "Published" state. The workflow definition works with "roles", not groups. Thus, you would need to create a new role and give your group this role.
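As a rough illustration of that idea, the following plain-Python sketch models a per-state "View" mapping. It is not a real workflow definition: the role name "FinanceViewer", the two dictionaries and the helper functions are assumptions chosen only to show how a role attached to GROUP1 would behave.

```python
# Toy model of a per-state "View" permission mapping; not a real workflow
# definition. "FinanceViewer" is a hypothetical role name.
VIEW_ROLES_BY_STATE = {
    "private": {"Owner", "Manager", "FinanceViewer"},
    "published": {"Owner", "Manager", "FinanceViewer", "Anonymous"},
}

# The workflow only knows roles, so the group is given the new role.
GROUP_ROLES = {"GROUP1": {"FinanceViewer"}}


def roles_of(groups, logged_in):
    """Collect the roles a visitor effectively has."""
    # Anything granted to "Anonymous" is effectively granted to everyone.
    roles = {"Anonymous"}
    if logged_in:
        roles.add("Authenticated")
        for group in groups:
            roles |= GROUP_ROLES.get(group, set())
    return roles


def can_view(state, groups=(), logged_in=False):
    """True if the visitor holds any role that has "View" in this state."""
    return bool(VIEW_ROLES_BY_STATE[state] & roles_of(groups, logged_in))


finance_sees_private = can_view("private", ["GROUP1"], logged_in=True)
anonymous_sees_private = can_view("private")
anonymous_sees_published = can_view("published")
```

With a mapping like this, invoices can stay in whatever state suits them: GROUP1 members see them once logged in, and everyone else only sees what has been published.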