I had to monkeypatch plone.app.layout.viewlets.common.TitleViewlet.page_title to remove the use of html.escape that resulted in weird html-quoted titles in the browser. Is html.escape still required (in 2022)?
IIRC it is to avoid CSRF. But prove me wrong.
I did not know. I just read that escaping is useful to avoid CSRF in situations with submitted data (forms, XHR), but for something like a title tag I doubt it's useful.
Title is in a list of "Safe HTML Attributes":
Does it mean that title tags should be escaped, or because they are "safe", no escaping is required?
There's an example of how to abuse the title tag:
Maybe sanitizing the text of a title tag would be a better method than escaping it?
Sanitizing is always error-prone. But maybe our safe_html transform is good enough?
Best open an issue and ping the security team for an opinion.
From what I see in plone.app.layout used in Plone 5.2, we escape the page title in Python code, but then in the title.pt template use structure to show it. This should mean that a & in the title gets escaped to &amp; in Python but gets turned back to & when the page is displayed. So it should be safe against any nastiness, while still showing the title as you would want it.
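The escape-then-structure behaviour described above can be sketched with the standard library alone. This is only an illustration, not Plone's actual code: `render_title` is a hypothetical stand-in for the template step, assuming that `structure` inserts the value verbatim and that the template engine escapes the value itself otherwise.

```python
from html import escape


def render_title(value: str, structure: bool) -> str:
    """Hypothetical stand-in for the template step (not Plone's code).

    structure=True inserts the value verbatim; structure=False mimics
    the template engine's default escaping of inserted text.
    """
    body = value if structure else escape(value)
    return f"<title>{body}</title>"


raw_title = "Tom & Jerry <script>alert(1)</script>"

# Step 1: the title is escaped in Python code.
escaped_in_python = escape(raw_title)

# Step 2a: inserted with structure, the markup is escaped exactly once,
# so the browser displays the original characters.
once = render_title(escaped_in_python, structure=True)

# Step 2b: inserted without structure, the value is escaped a second
# time, so the browser displays literal "&amp;" text.
twice = render_title(escaped_in_python, structure=False)

print(once)
print(twice)
```

If the already-escaped value ends up in a place that escapes again (case 2b), the visible result is the kind of HTML-quoted title mentioned in the first post.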
There may be other places where this is not the case, and which may be tricky to fix without reintroducing security problems. See this bug report:
I chose to use Products.PortalTransforms.transforms.safe_html as a sanitizer; it should be good enough. More advice is welcome.
Edit: I switched to BeautifulSoup for sanitizing, using get_text().
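A minimal sketch of the get_text() approach, assuming the beautifulsoup4 package is installed and using the standard library's "html.parser" backend. The helper name `plain_title` is made up for this example and is not part of Plone or BeautifulSoup.

```python
from html import escape

from bs4 import BeautifulSoup


def plain_title(value: str) -> str:
    """Hypothetical helper: drop all markup and keep only the text."""
    return BeautifulSoup(value, "html.parser").get_text()


# Tags are removed and entities are decoded to plain characters.
title = plain_title("<b>Tom</b> &amp; <i>Jerry</i>")
print(title)

# Attribute-based payloads disappear together with their tag.
print(plain_title("<img src=x onerror=alert(1)>News"))

# The result is plain text, not HTML: it still has to be escaped
# (by the template or explicitly) when it is written into a page.
print(escape(title))
```

Note that get_text() decodes entities, so its output should be treated as untrusted plain text rather than as safe markup.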