I am migrating a website from FrontPage 2003.
On the old site, there are thousands of folders, each containing an index.htm file.
When migrating to Plone, these become the default views of their folders (which is good).
Unfortunately, the relative links then 'miss one folder', so if the link in the index.htm is
It will instead point to
Is there any option other than renaming the index.htm files (and not making them the default view)?
Not really; getting the text from HTML files into Plone is already 'confusing'. I am importing them into Plone 5, but I might upgrade, and then all types will be folderish anyway (?), so I don't want to 'use more add-ons' than necessary.
(There are thousands of internal links, and links from other websites, to these index.html pages, but a rewrite rule could probably fix that.)
from pathlib import Path
from lxml import html

directory = '/path/to/dir'
# note: use '*.htm' instead if your files end in .htm
for path in Path(directory).rglob('*.html'):
    with open(path, "r") as input_file:
        page: str = input_file.read()
    tree: html.HtmlElement = html.fromstring(page)
    for a in tree.xpath("//a"):
        # check if the href is valid/broken
        # possibly convert absolute to relative paths or vice versa
        # replace the a.attrib['href'] with your modified href
        # a.attrib['href'] = "your/modified/href"
        pass
    # write the modified tree to file (open in "w" mode, not "r")
    # with open(path, "w") as output_file:
    #     output_file.write(html.tostring(tree, encoding="unicode"))
The example above replaces the files on the file system. I'd consider doing this on the file system before you import the files into Plone.
If you write to the files, run this on a copy of your data! Otherwise the originals will be overwritten.
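One possible way to fill in the href step of the loop above is to turn each relative link into a root-relative one, so it no longer depends on whether the page is served as the folder's default view. This is only a sketch using the standard library: the helper name `to_root_relative` and the `page_path` argument (the page's original path on the site, such as `/docs/a/index.htm`) are made up for illustration and are not part of lxml or Plone.

```python
from urllib.parse import urljoin, urlsplit


def to_root_relative(href: str, page_path: str) -> str:
    """Resolve a relative href against the original path of the page.

    page_path is the (hypothetical) site path the page had on the old
    site, for example '/docs/a/index.htm'.
    """
    parts = urlsplit(href)
    # Leave external links, mailto:/javascript: links and pure
    # '#fragment' links untouched.
    if parts.scheme or parts.netloc or not parts.path:
        return href
    # urljoin drops the last segment of page_path (the file name) and
    # resolves href, including any '../' parts, against the folder.
    return urljoin(page_path, href)


if __name__ == "__main__":
    print(to_root_relative("b/page.htm", "/docs/a/index.htm"))
    print(to_root_relative("../c.htm", "/docs/a/index.htm"))
```

Inside the loop you could then assign the result back to `a.attrib['href']`, with `page_path` derived from the file's location relative to `directory`. Whether root-relative links suit your site depends on where it is mounted, so treat this as a starting point.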
This should work with the least amount of effort, assuming those index.html files are just the content listing.
You can use plone.api to grab all content objects called 'index.html' and set the default view on each one's parent folder.