Migrating default-zpublisher-encoding from iso-8859-1 to utf-8

I have successfully migrated to Zope 4 and now the real nightmare begins: Migration to Python 3.

It was horrible enough for a Zope instance that was already using pure unicode/utf-8, but now I have to migrate an instance with default-zpublisher-encoding iso-8859-1. Some tests suggest I should not even try to keep latin-1 under Python 3, and should migrate the instance to unicode/utf-8 first:

I am using Zope 4.1.3 and successfully ran zodbupdate and zodbverify for Python 3. zope.conf starts with "default-zpublisher-encoding iso-8859-1".

-REQUEST.form contains unicode entries

-Database connections with "Connection character set" not set return latin1 encoded bytes. That's good, but the documentation says this is not supported

-Any literal string not marked with b will now be a unicode string (naturally, I guess)

-And, finally, DTML-Templates in latin-1 cannot even be loaded in the ZMI:

2020-01-29 13:58:34 ERROR [waitress:363][waitress] Exception while serving /xxx/Db/TabellenErgebnis/manage_main
Traceback (most recent call last):
  File "/home/zope/z4.1.3/lib/python3.5/site-packages/waitress/channel.py", line 356, in service
  File "/home/zope/z4.1.3/lib/python3.5/site-packages/waitress/task.py", line 172, in service
  File "/home/zope/z4.1.3/lib/python3.5/site-packages/waitress/task.py", line 440, in execute
    app_iter = self.channel.server.application(environ, start_response)
  File "/home/zope/z4.1.3/lib/python3.5/site-packages/ZPublisher/httpexceptions.py", line 30, in __call__
    return self.application(environ, start_response)
  File "/home/zope/z4.1.3/lib/python3.5/site-packages/paste/translogger.py", line 69, in __call__
    return self.application(environ, replacement_start_response)
  File "/home/zope/z4.1.3/lib/python3.5/site-packages/ZPublisher/WSGIPublisher.py", line 338, in publish_module
    response = _publish(request, new_mod_info)
  File "/home/zope/z4.1.3/lib/python3.5/site-packages/ZPublisher/WSGIPublisher.py", line 256, in publish
  File "/home/zope/z4.1.3/lib/python3.5/site-packages/ZPublisher/mapply.py", line 85, in mapply
    return debug(object, args, context)
  File "/home/zope/z4.1.3/lib/python3.5/site-packages/ZPublisher/WSGIPublisher.py", line 62, in call_object
    return obj(*args)
  File "/home/zope/z4.1.3/lib/python3.5/site-packages/Shared/DC/Scripts/Bindings.py", line 335, in __call__
    return self._bindAndExec(args, kw, None)
  File "/home/zope/z4.1.3/lib/python3.5/site-packages/Shared/DC/Scripts/Bindings.py", line 372, in _bindAndExec
    return self._exec(bound_data, args, kw)
  File "/home/zope/z4.1.3/lib/python3.5/site-packages/App/special_dtml.py", line 214, in _exec
  File "/home/zope/z4.1.3/lib/python3.5/site-packages/DocumentTemplate/_DocumentTemplate.py", line 145, in render_blocks
    render_blocks_(blocks, rendered, md, encoding)
  File "/home/zope/z4.1.3/lib/python3.5/site-packages/DocumentTemplate/_DocumentTemplate.py", line 246, in render_blocks_
    block = block(md)
  File "/home/zope/z4.1.3/lib/python3.5/site-packages/DocumentTemplate/DT_With.py", line 85, in render
    return render_blocks(self.section, md, encoding=self.encoding)
  File "/home/zope/z4.1.3/lib/python3.5/site-packages/DocumentTemplate/_DocumentTemplate.py", line 145, in render_blocks
    render_blocks_(blocks, rendered, md, encoding)
  File "/home/zope/z4.1.3/lib/python3.5/site-packages/DocumentTemplate/_DocumentTemplate.py", line 167, in render_blocks_
    t = md[t]
  File "/home/zope/z4.1.3/lib/python3.5/site-packages/DocumentTemplate/_DocumentTemplate.py", line 376, in __getitem__
    return self.getitem(name, call=1)
  File "/home/zope/z4.1.3/lib/python3.5/site-packages/DocumentTemplate/_DocumentTemplate.py", line 388, in getitem
    e = e[key]
  File "/home/zope/z4.1.3/lib/python3.5/site-packages/DocumentTemplate/_DocumentTemplate.py", line 299, in __getitem__
    return str(self.inst)
  File "/home/zope/z4.1.3/lib/python3.5/site-packages/DocumentTemplate/DT_HTML.py", line 234, in __str__
    return self.quotedHTML()
  File "/home/zope/z4.1.3/lib/python3.5/site-packages/DocumentTemplate/DT_HTML.py", line 226, in quotedHTML
    if text.find(reg) >= 0:
TypeError: a bytes-like object is required, not 'str'

So how do I convert the ZODB so I can change default-zpublisher-encoding from iso-8859-1 to utf-8 while still running Zope with Python 2?

Hi, some suggestions:

That's not true: if the literal starts with b it is bytes. If it has no marker it is str in both Python 2 and Python 3, but str has a different meaning. In Python 2 bytes is an alias for str, but in Python 3 it is a different data type.
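To make the distinction concrete, here is a minimal sketch for Python 3. The sample word and the variable names are only illustrative and do not come from the instance in question:

```python
# Python 3: an unprefixed literal is text (str), a b-prefixed literal is bytes.
text = "Grüße"
raw = b"Gr\xfc\xdfe"  # the same word, encoded as latin-1

is_text = isinstance(text, str)
is_bytes = isinstance(raw, bytes)

# bytes only become text through an explicit decode with a known encoding.
decoded = raw.decode("latin-1")
same_word = decoded == text
```

Under Python 2 the unprefixed literal would have been a byte string as well, which is why code written there never needed the explicit decode.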

It seems that DT_HTML.String.raw needs to be str, not bytes, because reg is str.
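The failing call can be reproduced outside Zope. This is only a sketch: the template source and the search pattern below are made-up stand-ins for the stored raw value and for reg, not the actual data.

```python
# Stand-in for a template source that is still bytes after the conversion.
raw_bytes = b"<dtml-var title>"
# Stand-in for the str pattern that the DTML code searches for.
needle = "<"

try:
    raw_bytes.find(needle)  # bytes.find() with a str argument
    mixed_ok = True
except TypeError:
    # Python 3 refuses to search bytes for a str pattern.
    mixed_ok = False

# Once the source is decoded, both operands are str and the search works.
position = raw_bytes.decode("latin-1").find(needle)
```

So the error goes away as soon as the stored source is text rather than bytes, whichever way that is achieved.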

How did you convert your ZODB to Python 3? (Did you use Python 3 to run the conversion? (There is an option for a default and even a fallback encoding when using Python 3!))
Maybe DTMLTemplate needs a conversion dict like https://github.com/zopefoundation/Products.PythonScripts/blob/570777294b1eeb4e36edbcab43ed6973fd84ac3a/setup.py#L68-L72. But this would not help you because it uses utf-8 and there is already a conversion rule here: https://github.com/zopefoundation/Zope/blob/aa7eebe58887e545bd96fe95b5deec64c9fec53d/src/OFS/init.py#L8

What class is self in "DocumentTemplate/DT_HTML.py", line 234, in __str__: return self.quotedHTML()?

I hope this points you in the right direction.

I know - I was talking about Python 3 and about whether it makes sense to continue using latin1 with it. I don't think it should be done, because mixing str and bytes in Python 3 is not possible, so every literal in the Python scripts of a latin1 instance would have to be prefixed with b. Also, the MySQL connection objects seem to lack support for latin1 with Python 3.
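A small sketch of what I mean. The helper to_text and the sample value are hypothetical and only stand in for whatever a latin1 database connection would return:

```python
def to_text(value, encoding="latin-1"):
    """Hypothetical helper: return str for either bytes or str input."""
    if isinstance(value, bytes):
        return value.decode(encoding)
    return value

# Stand-in for a latin-1 encoded value returned by a database adapter.
db_value = b"M\xfcller"
label = "Name: "

try:
    label + db_value  # str + bytes
    concat_ok = True
except TypeError:
    # Python 3 does not coerce between str and bytes.
    concat_ok = False

# Decoding at the boundary keeps the rest of the script in str.
result = label + to_text(db_value)
```

Without such a decode step at every boundary, each literal in every script would need the b prefix instead.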

../z4.1.3/bin/zodbupdate --pack -f var/Data.fs --convert-py3 --encoding utf-8 --encoding-fallback latin1
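As I understand the two options, the idea is to try the main encoding first and use the fallback only when that fails. The function below is a sketch of that idea under my own assumptions, not zodbupdate's actual implementation; the name decode_with_fallback is made up.

```python
def decode_with_fallback(data, encoding="utf-8", fallback="latin-1"):
    """Decode bytes with a strict first attempt and a fallback encoding."""
    try:
        return data.decode(encoding)
    except UnicodeDecodeError:
        # latin-1 maps every possible byte, so this second attempt cannot fail.
        return data.decode(fallback)

from_utf8 = decode_with_fallback(b"M\xc3\xbcller")   # valid utf-8
from_latin1 = decode_with_fallback(b"M\xfcller")     # not valid utf-8
```

Both calls give the same text, which is what I want for a database that contains a mix of both encodings.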


I tried it the other way round. In that case the conversion fails.

I am sorry for misleading you about my main question. I am convinced I should convert my Zope instance to unicode/utf8 first and my main question was about how to start:

After the migration to Python 3 all Python 2 str (besides the binary ones) should be converted to Python 3 str (aka unicode) so the problem should dissolve in the migration.

Thank you. So I will just use default-zpublisher-encoding utf-8 with Python 3. This means I have to take two steps at once (migration to unicode and to Python 3), but it should be manageable.