I've got a use case in a project where the site managers want to have Files published in Plone, but not findable by any means: not through external search engines like Google, and not through the internal search page either. (We link to individual PDFs through QR codes, and the info is only relevant for the product that has the QR code on its packaging.)
That is not trivial in Plone. You can hide the folder listing in various ways so that visitors cannot browse to the individual Files. What I had forgotten, though, is that we almost always have sitemap.xml(.gz) activated, and the sitemap still happily lists the individual files for Google. Bummer.
Fortunately, robots.txt takes precedence over sitemap.xml.gz, so we excluded the offending folders by adding them to robots.txt. Google Search Console now reports the items as found but not indexed (yay), but Files last indexed in March 2020 were still showing up.
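For anyone wanting to double-check such rules before waiting on Google, here is a minimal sketch using Python's standard `urllib.robotparser`. The folder paths and the example.org URLs are made up for illustration; they are not the real folders from this project.

```python
from urllib.robotparser import RobotFileParser

# Hypothetical folder paths: replace with the folders holding the QR-linked Files.
ROBOTS_TXT = """\
User-agent: *
Disallow: /products/qr-docs/
Disallow: /manuals/qr-docs/
"""


def is_crawlable(robots_txt, url, agent="Googlebot"):
    """Return True if `agent` may fetch `url` according to the given robots.txt text."""
    parser = RobotFileParser()
    parser.parse(robots_txt.splitlines())
    return parser.can_fetch(agent, url)
```

Calling `is_crawlable(ROBOTS_TXT, "https://example.org/products/qr-docs/leaflet.pdf")` should return `False` with these rules. Keep in mind that `Disallow` only stops well-behaved crawlers from fetching the URL; it does not remove anything that is already in an index.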
The bazooka solution: exclude the subfolders with a 'block' from within Search Console, but this only lasts for 6 months.
Now the internal search page. Luckily I found this thread on community from 4 years ago with a workaround trick (Hide folder from search): give every file/item you want to hide an expiration date in the past. OK, done. The only other option was to hide all Files from the search results, but there is a lot of information in PDFs that we WANT to be found, just not the items from 2-3 folders.
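If there are many items, the expiration trick can be scripted. Below is a rough sketch, not a tested recipe: it assumes each item offers `setExpirationDate()` and `reindexObject()` as Plone content objects do, that `setExpirationDate()` accepts a Python `datetime` (a Zope `DateTime` may be the safer choice on a real site), and that the catalog has the default `expires` and `effectiveRange` indexes. The function name `expire_in_past` is my own.

```python
from datetime import datetime, timedelta, timezone


def expire_in_past(items, days_ago=1):
    """Give every item an expiration date in the past and reindex it.

    `items` is any iterable of content objects, e.g. the Files collected
    from the 2-3 folders that should disappear from the search results.
    """
    expired_on = datetime.now(timezone.utc) - timedelta(days=days_ago)
    for item in items:
        # Assumption: the setter accepts a datetime; convert if your site needs DateTime.
        item.setExpirationDate(expired_on)
        # Assumption: default Plone index names; call reindexObject() without
        # arguments if your catalog is set up differently.
        item.reindexObject(idxs=["expires", "effectiveRange"])
    return expired_on
```

The items stay published and reachable by direct URL (so the QR codes keep working); they only drop out of catalog searches for users who are not allowed to see expired content.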
So, mission accomplished, but I'm wondering if we need more functionality for this in core or through an add-on. The expiration trick feels like a workaround, since the content isn't actually expired, and it still leaves the question of how to remove items from sitemap.xml.gz. When I started this quest I had the idle hope that ticking 'exclude from navigation' would also remove items from the sitemap, as that is a means of navigation too. That could be labelled a bug.
But an 'exclude from search and sitemap' boolean on the settings tab of content items, provided through a behavior, seems like a good solution.
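To make the sitemap half of that idea concrete, here is a small sketch of the effect such a flag would have. It is plain standard-library Python that post-processes an existing sitemap.xml.gz, not a Plone API or an existing add-on; the function name `filter_sitemap` and the idea of passing URL prefixes are assumptions for illustration only.

```python
import gzip
import xml.etree.ElementTree as ET

SITEMAP_NS = "http://www.sitemaps.org/schemas/sitemap/0.9"


def filter_sitemap(gz_bytes, excluded_prefixes):
    """Return a gzipped sitemap without the <url> entries under the given URL prefixes."""
    # Keep the sitemap namespace as the default namespace when serializing.
    ET.register_namespace("", SITEMAP_NS)
    root = ET.fromstring(gzip.decompress(gz_bytes))
    prefixes = tuple(excluded_prefixes)
    # Iterate over a copy, because entries are removed from the tree while looping.
    for url in list(root.findall(f"{{{SITEMAP_NS}}}url")):
        loc = url.findtext(f"{{{SITEMAP_NS}}}loc", default="").strip()
        if loc.startswith(prefixes):
            root.remove(url)
    xml_bytes = ET.tostring(root, encoding="utf-8", xml_declaration=True)
    return gzip.compress(xml_bytes)
```

A real implementation would more likely filter inside the sitemap view based on the boolean from the behavior, so that the excluded items never make it into the file in the first place.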
Other input, use cases for this?