Language-tag casing in Plone, Nick and Volto: lessons learned from a multilingual Nick + Volto implementation

*I'll start with my own mistake. In our style guide I proposed using uppercase for country/region codes: the BCP-47 canonical form, zh-CN, pt-BR. It reads as "the correct form," and for display it is. But when I carried that rule everywhere, including the language-root ids on our Nick backend, multilingual pages started returning 404s. Investigating, I found I'd misunderstood the issue and, frankly, overlooked RFC 4647 §2: "Matching of language tags to language ranges MUST be done in a case-insensitive manner." The fix wasn't more uppercase; it was realising that a language tag wears a different casing on different surfaces, and that the one hard requirement, case-insensitive matching, is exactly the one I'd skipped. Here's the whole picture, from standing up multilingual on Nick + Volto, in case it spares you the same detour. A per-surface cheat-sheet is attached.*


The symptom

On a multilingual site, clicking the site logo (or visiting /) from a Chinese or Brazilian Portuguese page returned a 404, while the English and Swedish homes worked perfectly. The tell: only languages with a region subtag (zh-CN, pt-BR) broke; single-subtag languages (en, sv) never did.

That pattern is the whole story in miniature: the problem only appears where a tag has a second subtag whose case can differ between systems.

The half-truth: "tag case doesn't matter"

It's a common — and correct — observation that language-tag case carries no meaning: zh-CN and zh-cn are the same tag. The trap is reading "doesn't matter" as "you can ignore it." The standards say something more precise and more demanding:

RFC 5646 §2.1.1: "language tags and their subtags … are to be treated as case insensitive: there exist conventions for the capitalization of some of the subtags, but these MUST NOT be taken to carry meaning."

RFC 4647 §2 (the matching standard): "Matching of language tags to language ranges MUST be done in a case-insensitive manner."

So "case doesn't matter" doesn't free you to ignore it — it obligates you to match case-insensitively. The recommended canonical casing (language lowercase, region UPPER, script Title-case) is, per RFC 5646, a presentation recommendation, not a requirement.
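
To make the MUST concrete, here is a minimal Python sketch of RFC 4647 basic filtering (§3.3.1): a range matches a tag when, compared case-insensitively, it equals the tag or is a prefix of it that ends at a "-" boundary. The function name `matches` is my own, chosen for illustration; it is not taken from Plone, Nick or Volto.

```python
def matches(language_range: str, tag: str) -> bool:
    """Basic filtering per RFC 4647 section 3.3.1, case-insensitive."""
    rng = language_range.lower()
    candidate = tag.lower()
    if rng == "*":
        # The wildcard range matches every tag.
        return True
    # Exact match, or a prefix match that stops at a subtag boundary.
    return candidate == rng or candidate.startswith(rng + "-")


if __name__ == "__main__":
    print(matches("zh-cn", "zh-CN"))  # same tag, different casing
    print(matches("ZH", "zh-Hans-CN"))  # prefix on a subtag boundary
    print(matches("zh-C", "zh-CN"))  # prefix that splits a subtag
```

Lowercasing both sides before comparing is the whole trick; everything else is the prefix rule.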

The part that's easy to miss: it's four surfaces, not one

A single language ends up wearing several casings, each correct in its own surface. Conflating them is what bites:

| Surface | Form | Why |
|---|---|---|
| Display / declaration tag: <html lang>, react-intl, supportedLanguages, your DB language column | zh-CN, pt-BR | BCP-47 recommended casing (region UPPER, script Title); a presentation context |
| gettext locale directory: locales/<x>/LC_MESSAGES/*.po | zh_CN, pt_BR | POSIX/gettext convention (underscore, region UPPER) |
| Plone language-root id / URL: the LRF folder, the path Volto navigates to | zh-cn, pt-br | Plone's own id convention: lowercase-hyphen |
| Matching: resolving a requested path/tag to content | (any case) | case-insensitive (RFC 4647 MUST) |

The third row is the surprising one and worth pinning down, because it's where the 404 lives. The lowercase language-root id is the Plone convention, not an arbitrary choice — it's visible at three layers:

  • plone.i18n's combined language vocabulary keys are lowercase-hyphen (pt-br, zh-cn).

  • plone.app.multilingual creates the Language Root Folder with folderId = str(code) — i.e. exactly that lowercase code.

  • Volto codified it deliberately: toBackendLang() lowercases precisely so the frontend speaks the backend's id form (pt-br), while toReactIntlLang() keeps pt-BR for react-intl.

So a single language legitimately appears as zh-CN (display), zh_CN (locale dir), and zh-cn (root id) — all correct, all the same language.
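
The three spellings can be derived mechanically from one input. The sketch below is illustrative only: `TagForms` and `forms_for` are hypothetical names of mine, and it assumes plain language or language-region tags (no script subtag).

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class TagForms:
    display: str     # <html lang>, react-intl
    locale_dir: str  # gettext locale directory name
    root_id: str     # language-root id / URL segment


def forms_for(tag: str) -> TagForms:
    """Derive the per-surface spellings of a language[-region] tag.

    Assumption: at most two subtags; script subtags are out of scope here.
    """
    # Accept either separator on input, then split off the region.
    lang, _, region = tag.replace("_", "-").partition("-")
    lang = lang.lower()
    if not region:
        # Single-subtag languages look the same on every surface.
        return TagForms(lang, lang, lang)
    return TagForms(
        display=f"{lang}-{region.upper()}",
        locale_dir=f"{lang}_{region.upper()}",
        root_id=f"{lang}-{region.lower()}",
    )


if __name__ == "__main__":
    print(forms_for("zh-cn"))
    print(forms_for("sv"))
```

Note how `sv` comes out identical on every surface, which is why single-subtag languages never expose a casing mismatch.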

Where it bit us (Nick + Volto)

We created our Nick language-root ids in the BCP-47 display form, zh-CN, which is a perfectly valid tag. But Volto's root redirect sends the browser to /${toBackendLang(lang)}, which evaluates to /zh-cn, and the backend resolved that path with a case-sensitive lookup (WHERE id = 'zh-cn'), which didn't match the zh-CN folder → 404. Single-subtag languages had no region to re-case, so they never tripped it.

In other words: we'd put a display-cased tag into an identifier slot, and the resolver was case-sensitive — a direct brush with the RFC 4647 MUST. Two fixes are valid, and they sit at different layers:

  1. Align the id to the convention — create/rename region-subtag roots in the lowercase form (zh-cn). Matches stock Plone; smallest change.

  2. Make the resolver case-insensitive — the RFC-4647-pure fix; resolves any casing and is future-proof for every region/script variant at once.

We went with (1) for the immediate fix; (2) is the more standards-complete answer if you're touching the resolver anyway.
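
Both fixes can be reproduced in a few lines. This sketch uses an in-memory SQLite database purely as a stand-in: the `document` table, its columns and the function names are hypothetical, and it does not reflect Nick's actual schema or query code. It only shows the shape of the failure and of the two remedies.

```python
import sqlite3
from typing import Optional


def make_db() -> sqlite3.Connection:
    """Create a toy store with one root minted in display casing."""
    conn = sqlite3.connect(":memory:")
    conn.execute("CREATE TABLE document (id TEXT PRIMARY KEY, title TEXT)")
    conn.executemany(
        "INSERT INTO document VALUES (?, ?)",
        [("en", "English"), ("zh-CN", "Chinese")],
    )
    return conn


def resolve_strict(conn: sqlite3.Connection, segment: str) -> Optional[str]:
    """Case-sensitive lookup: the behaviour that produced the 404."""
    row = conn.execute(
        "SELECT id FROM document WHERE id = ?", (segment,)
    ).fetchone()
    return row[0] if row else None


def resolve_insensitive(conn: sqlite3.Connection, segment: str) -> Optional[str]:
    """Fix 2: compare lowercased forms on both sides."""
    row = conn.execute(
        "SELECT id FROM document WHERE lower(id) = lower(?)", (segment,)
    ).fetchone()
    return row[0] if row else None


def rename_roots_to_lowercase(conn: sqlite3.Connection) -> None:
    """Fix 1: align stored ids with the lowercase-hyphen convention."""
    conn.execute("UPDATE document SET id = lower(id) WHERE id <> lower(id)")


if __name__ == "__main__":
    db = make_db()
    print(resolve_strict(db, "zh-cn"))       # no match for the display-cased id
    print(resolve_insensitive(db, "zh-cn"))  # finds the zh-CN row
    rename_roots_to_lowercase(db)
    print(resolve_strict(db, "zh-cn"))       # strict lookup now succeeds
```

One caveat with the rename approach: if two ids differ only by case, lowercasing them would collide on the primary key, so it is worth checking for such pairs first.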

One edge worth flagging for script subtags

If you support script-subtagged languages (sr-Cyrl, sr-Latn, zh-Hans/zh-Hant), note that Volto's toReactIntlLang currently uppercases the second subtag unconditionally and keeps only the first two subtags. That turns sr-Cyrl into sr-CYRL (a script subtag should be Title-case per ISO 15924) and drops the third subtag of sr-Cyrl-RS. Region subtags (the common case) are unaffected; script subtags are. Worth keeping in mind — and probably worth a small upstream fix — if your language set includes them.
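
To illustrate the difference, the sketch below contrasts the behaviour described above with a script-aware alternative. Both functions are my own Python paraphrases: `naive_display` only mimics the description (uppercase the second subtag, keep two subtags) and is not Volto's source, and `script_aware_display` is a hypothetical shape for a fix that follows the RFC 5646 §2.1.1 length-based casing convention.

```python
def naive_display(tag: str) -> str:
    """Mimics the described behaviour: uppercase subtag 2, drop the rest."""
    parts = tag.split("-")
    if len(parts) == 1:
        return parts[0]
    return f"{parts[0]}-{parts[1].upper()}"


def script_aware_display(tag: str) -> str:
    """Recommended casing: 4-letter subtags Title-case, 2-letter UPPER.

    Per the convention, subtags after a singleton (such as "x") stay lowercase.
    """
    parts = tag.split("-")
    out = [parts[0].lower()]
    after_singleton = False
    for sub in parts[1:]:
        if after_singleton or len(sub) not in (2, 4):
            out.append(sub.lower())
        elif len(sub) == 4:
            out.append(sub.capitalize())  # script, e.g. Cyrl
        else:
            out.append(sub.upper())       # region, e.g. RS
        if len(sub) == 1:
            after_singleton = True
    return "-".join(out)


if __name__ == "__main__":
    for tag in ("pt-br", "sr-cyrl", "sr-cyrl-rs"):
        print(tag, naive_display(tag), script_aware_display(tag))
```

For region-only tags the two agree, which is why the issue stays invisible until a script subtag enters the language set.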

Takeaways

  • "Case is non-significant" is a requirement, not a permission — it means match case-insensitively (RFC 4647 MUST).

  • A language wears different casings on different surfaces — display tag (zh-CN), locale dir (zh_CN), Plone root id (zh-cn). Keep them straight per surface rather than picking one "winner."

  • The lowercase root id is the Plone convention, not an oddity — confirmed in plone.i18n, plone.app.multilingual, and Volto's own helpers.

  • When something 404s only for region/script-subtagged languages, suspect a case-sensitive match of an id that was minted in the wrong surface's casing.