Plone root traversal other site root

iwolf · April 22, 2020, 7:25am

Hi all, I have a Plone installation with multiple projects/websites.

/Plone
/Plone1

If I access the first one via http://localhost:8180/Plone (directly against Plone, without a proxy or web server), the website is displayed. If I enter http://localhost:8180/Plone/Plone1, the second website is displayed. How can I avoid this behavior? I would expect to see a "Site not found" error message. When I create a page at http://localhost:8180/Plone/Plone1 (Plone1 as a Document), the message is displayed.

From my point of view this could be a nightmare for SEO if Google for some reason picks up http://localhost:8180/Plone/Plone1.

I am using Plone 5.1.5.

Any help would be much appreciated.

Best regards
Ingo

dieter · April 22, 2020, 8:22am

This is "normal" Zope behavior (called acquisition) and is considered a feature. It allows resources, e.g. style sheets or JavaScript resources, to be made available higher up in a hierarchy and nevertheless be accessed via simple relative links from lower down in the hierarchy.

You have already found one way to prevent this: objects at the correct place (in your case Plone1 inside Plone) take precedence over "acquired" objects.
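
To make the lookup order concrete, here is a toy model in plain Python. It is only a sketch: the names Item, lookup and traverse are made up for illustration, and the real machinery works with acquisition wrappers rather than a simple parent pointer. It shows why /Plone/Plone1 resolves to the sibling site, and why a Document with the id Plone1 inside Plone takes precedence.

```python
class Item:
    """Toy stand-in for a Zope object; not the real Acquisition machinery."""

    def __init__(self, name, parent=None):
        self.name = name
        self.parent = parent      # physical container, None for the root
        self.children = {}
        if parent is not None:
            parent.children[name] = self

    def lookup(self, name):
        """Return a direct child if there is one, else search the containers above."""
        node = self
        while node is not None:
            if name in node.children:
                return node.children[name]
            node = node.parent
        raise KeyError(name)


def traverse(root, path):
    """Resolve a URL path one segment at a time, starting at the root."""
    obj = root
    for segment in path.strip("/").split("/"):
        obj = obj.lookup(segment)
    return obj


root = Item("")
plone = Item("Plone", root)
plone1 = Item("Plone1", root)

# Plone has no child called Plone1, so the name is found in the root instead.
acquired = traverse(root, "/Plone/Plone1")

# A Document with the same id inside Plone is found first and hides the site.
shadow_doc = Item("Plone1", plone)
shadowed = traverse(root, "/Plone/Plone1")
```

Here acquired is the second site itself, while shadowed is the Document created inside Plone.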

You can use a SiteAccess "access rule" to implement a more general solution. Be warned, however, that such access rules are potentially dangerous; it is easy to prevent access you actually do not want to prevent.

iwolf · April 22, 2020, 8:58am

Thanks for the reply. I agree that this is a feature. But when two "Plone Sites" are accessed via two domains belonging to two different customers (opponents), I would prefer to prevent this behavior at root level.
Site one Id: domainone
Site two Id: domaintwo

When accessing the following URLs, each domain would show the website of the opponent:
www.domainone.org/domaintwo
www.domaintwo.org/domainone

I came across this issue when "googling" a test page: Google had somehow become aware of a URL like the ones above, and the search result showed the content of the other site.

Do you by any chance have an example or hint of how such an access rule could be written so that acquisition stays within the Plone site?

dieter · April 22, 2020, 9:45am

Nowadays, there may be other (more modern) ways to handle such use cases. Some objects (I think, Products.CMFCore.Folder.Folder objects and therefore Plone site objects) look for IBeforePublishing (or similarly spelled) subscription adapters and call them (look at the Folder source code). This has the advantage (over an "access rule") that a single subscription adapter registration can handle all your sites.

Whether you use a subscription adapter or an "AccessRule", I would register a post traversal hook (--> request.post_traverse, defined in ZPublisher.BaseRequest.BaseRequest). Such a hook is called after the traversal is finished. In this hook, I would (essentially) verify that request["PUBLISHED"] is in the context of the initial portal (--> aq_inContextOf, defined (and documented) in Acquisition). Note that the real published object may be a method; in this case, you cannot use the object itself for the "in context check" but must use its __self__.
There is no guarantee that the recipe above works as expected. If you use a subscription adapter, then commenting out the registration and restarting would allow you to recover.
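
The check itself can be sketched in plain Python without any Zope imports. This is only an illustration of the logic, not working hook code: Node, in_context_of and check_published are hypothetical names, in_context_of is a rough stand-in for aq_inContextOf that simply walks the containment chain, and the registration via request.post_traverse is left out. A real hook would presumably raise a "not found" error instead of returning False.

```python
class Node:
    """Toy stand-in for an object that knows its physical container."""

    def __init__(self, name, container=None):
        self.name = name
        self.container = container

    def view(self):
        """Stands in for a published method (a view, script, ...)."""
        return "rendered " + self.name


def in_context_of(obj, site):
    """Rough analogue of aq_inContextOf: walk up the containment chain of obj."""
    while obj is not None:
        if obj is site:
            return True
        obj = obj.container
    return False


def check_published(published, site):
    """Return True if the published object belongs to the initial site."""
    # A bound method has no containment chain of its own, so test its owner.
    target = getattr(published, "__self__", published)
    return in_context_of(target, site)


root = Node("")
domainone = Node("domainone", root)
domaintwo = Node("domaintwo", root)
page_one = Node("front-page", domainone)

own_page_ok = check_published(page_one, domainone)
own_method_ok = check_published(page_one.view, domainone)
other_site_ok = check_published(domaintwo, domainone)
other_method_ok = check_published(domaintwo.view, domainone)
```

With this setup the two checks against the site's own page pass, and the two checks against the other site (object and bound method) fail.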

There used to be an envvar to disable "AccessRules", but, apparently, this is no longer the case. This means that if you do something wrong with an "AccessRule", you may need to use non-standard ways (e.g. bin/instance debug and the low-level API) to get rid of the broken "AccessRule".

mauritsvanrees · April 22, 2020, 10:21am

You could try the collective.explicitacquisition add-on, which should disallow access on Plone/Plone1. I have not tried it. There might be side effects.

You could also do a redirect from Plone/Plone1 to Plone1 in your front end web server (Apache, nginx).

zopyx · April 22, 2020, 10:29am

Why and how would you traverse into a different Plone site using your given URL? If you reference a different Plone site, then reference it using its proper canonical URL instead of relying on acquisition magic and traversal trickery.

iwolf · April 22, 2020, 11:27am

Thanks a lot for the replies. I am currently testing collective.siteisolation (which uses a post traversal hook), and it seems to do the job. I just have to rename some of my Plone site IDs to make sure there are no naming conflicts with other URLs, but the results are promising.
@zopyx: I don't want to, and I don't want a customer or Google to do it by mistake. I want to make sure that www.domaintwo.org/domainone results in a 404.

zopyx · April 22, 2020, 12:51pm

If you want to avoid such URLs, block them in Apache (or whatever front end you use) with a rewrite rule.
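
The decision such a rule has to make is small. Sketched in Python purely to show the logic (the real rule would be written in the web server's own configuration syntax, which is not shown here, and is_cross_site_path is a made-up name): reject a request when its first path segment is the ID of another site in the same Zope root.

```python
def is_cross_site_path(path, other_site_ids):
    """True if the first path segment names another site in the same Zope root."""
    segments = [s for s in path.split("/") if s]
    return bool(segments) and segments[0] in other_site_ids


# Requests arriving at www.domaintwo.org; the only other site ID is "domainone".
others = {"domainone"}
blocked = is_cross_site_path("/domainone/news", others)
allowed = is_cross_site_path("/news", others)
```

Because the test looks only at the first segment, a site ID that is also a legitimate content ID in the other site would be blocked as well, which is why the site IDs should not clash with other URLs.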

djay · April 24, 2020, 7:16am

If that doesn't work, there are two more plugins mentioned in the thread "Site acquisition issues - collective.siteisolation not working".