Read l10n .properties files as UTF-8 with ISO-8859-1 fallback (#4883) #4885

Merged: shai-almog merged 1 commit into master from utf8-properties-l10n-4883 on May 8, 2026

Conversation

@shai-almog
Collaborator

Summary

  • Fixes #4883 ([BUG] UTF-8 characters are displayed incorrectly): UTF-8 characters in Bundle_*.properties files were displayed incorrectly (e.g. "à" rendered as "Ã" followed by a non-breaking space) because java.util.Properties.load(InputStream) decodes bytes as ISO-8859-1.
  • New PropertiesUtil.loadUtf8WithFallback reads .properties files as UTF-8 first and falls back to ISO-8859-1 only when the bytes are not valid UTF-8, mirroring the JDK 9+ PropertyResourceBundle behavior from JEP 226. Existing native2ascii (pure ASCII + \uXXXX) and legacy Latin-1 files keep working.
  • Wired into the three places that load locale .properties files: CN1CSSCLI (Maven build / CSS pipeline), L10NTask (Ant resource builder), and L10nEditor (Designer GUI import).
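
The strict-UTF-8-then-Latin-1 strategy described above can be sketched roughly as follows. This is a minimal illustration, not the code merged in this PR: the class name `Utf8FallbackSketch`, the helper `readAll`, and the method signature are assumptions, and BOM handling is left out.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.InputStream;
import java.io.StringReader;
import java.nio.ByteBuffer;
import java.nio.charset.CharacterCodingException;
import java.nio.charset.CodingErrorAction;
import java.nio.charset.StandardCharsets;
import java.util.Properties;

// Hypothetical stand-in for PropertiesUtil; the real class and signature may differ.
final class Utf8FallbackSketch {

    static Properties loadUtf8WithFallback(InputStream in) throws IOException {
        byte[] bytes = readAll(in);
        String text;
        try {
            // REPORT makes the decoder throw on invalid bytes instead of
            // silently substituting the replacement character.
            text = StandardCharsets.UTF_8.newDecoder()
                    .onMalformedInput(CodingErrorAction.REPORT)
                    .onUnmappableCharacter(CodingErrorAction.REPORT)
                    .decode(ByteBuffer.wrap(bytes))
                    .toString();
        } catch (CharacterCodingException notUtf8) {
            // Not valid UTF-8: treat the file as legacy ISO-8859-1,
            // which maps every byte to a character and cannot fail.
            text = new String(bytes, StandardCharsets.ISO_8859_1);
        }
        Properties props = new Properties();
        // The Reader overload still decodes backslash-u escapes.
        props.load(new StringReader(text));
        return props;
    }

    // Buffers the whole stream so the bytes can be decoded a second time on fallback.
    static byte[] readAll(InputStream in) throws IOException {
        ByteArrayOutputStream out = new ByteArrayOutputStream();
        byte[] buf = new byte[8192];
        int n;
        while ((n = in.read(buf)) != -1) {
            out.write(buf, 0, n);
        }
        return out.toByteArray();
    }

    public static void main(String[] args) throws IOException {
        byte[] utf8 = "greeting=caf\u00e9\n".getBytes(StandardCharsets.UTF_8);
        byte[] latin1 = "greeting=caf\u00e9\n".getBytes(StandardCharsets.ISO_8859_1);
        System.out.println(loadUtf8WithFallback(new ByteArrayInputStream(utf8)).getProperty("greeting"));
        System.out.println(loadUtf8WithFallback(new ByteArrayInputStream(latin1)).getProperty("greeting"));
    }
}
```

Decoding the whole file up front keeps the decision all-or-nothing: a file is read either entirely as UTF-8 or entirely as ISO-8859-1, never as a mix. Pure-ASCII native2ascii output is valid UTF-8, so it takes the first path unchanged.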

Test plan

  • CSSLocalizationTest extended with three new sub-tests (run via its main):
    • UTF-8 round-trip using the exact accented Italian string from the issue
    • Legacy ISO-8859-1 file (café written as Latin-1 bytes) — verifies fallback
    • \uXXXX-escaped native2ascii-style file — verifies escape decoding still works
  • mvn -pl designer compile -Plocal-dev-javase clean
  • Manual verification with the project from [BUG] UTF-8 characters are displayed incorrectly #4883 once merged

🤖 Generated with Claude Code

java.util.Properties.load(InputStream) reads files as ISO-8859-1, so
UTF-8-encoded bundles like Bundle_it.properties get mis-decoded -- "à"
(0xC3 0xA0) shows up as "Ã ". Users had to run native2ascii to work
around this.
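
The mis-decoding can be reproduced with the JDK alone. The sketch below is illustrative only: the class name `LegacyLoadDemo` and the key/value pair are made up for the demonstration and are not taken from the issue.

```java
import java.io.ByteArrayInputStream;
import java.io.IOException;
import java.nio.charset.StandardCharsets;
import java.util.Properties;

// Illustrative demo class; not part of the PR.
final class LegacyLoadDemo {

    // Loads the bytes the way the old code path did: Properties.load(InputStream),
    // which always decodes as ISO-8859-1.
    static String loadLegacy(byte[] fileBytes, String key) throws IOException {
        Properties props = new Properties();
        props.load(new ByteArrayInputStream(fileBytes));
        return props.getProperty(key);
    }

    public static void main(String[] args) throws IOException {
        // "Citt" + a-grave, saved as UTF-8: the a-grave becomes the two bytes 0xC3 0xA0.
        byte[] utf8File = "city=Citt\u00e0\n".getBytes(StandardCharsets.UTF_8);
        String value = loadLegacy(utf8File, "city");
        // Each of the two bytes is turned into its own character.
        for (char ch : value.toCharArray()) {
            System.out.printf("U+%04X%n", (int) ch);
        }
    }
}
```

The loaded value ends in U+00C3 (Ã) followed by U+00A0 (non-breaking space) instead of the single character U+00E0.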

New PropertiesUtil.loadUtf8WithFallback mirrors JDK 9+
PropertyResourceBundle (JEP 226): try strict UTF-8 first, fall back to
ISO-8859-1 if the bytes are not valid UTF-8 so legacy native2ascii /
Latin-1 files keep working. \uXXXX escapes are honored in either mode.
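
Escapes survive in either mode because `Properties.load(Reader)` performs the same escape processing as the `InputStream` overload; only the byte-to-character step differs. A small check of that JDK behavior, using an illustrative class name (`EscapeInReaderDemo`) and made-up keys:

```java
import java.io.IOException;
import java.io.StringReader;
import java.util.Properties;

// Illustrative demo class; not part of the PR.
final class EscapeInReaderDemo {

    // Parses already-decoded text, as a UTF-8-aware loader would after decoding.
    static Properties parse(String text) throws IOException {
        Properties props = new Properties();
        props.load(new StringReader(text));
        return props;
    }

    public static void main(String[] args) throws IOException {
        // "raw" holds a literal e-acute character; "escaped" holds the
        // six ASCII characters of a native2ascii-style escape for it.
        Properties props = parse("raw=caf\u00e9\nescaped=caf\\u00e9\n");
        System.out.println(props.getProperty("raw").equals(props.getProperty("escaped")));
    }
}
```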

Wired into the three places that load locale .properties files:
- CN1CSSCLI (Maven build / CSS pipeline)
- L10NTask (Ant resource-builder task)
- L10nEditor (Designer GUI properties import)

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
@shai-almog merged commit 90acf24 into master on May 8, 2026
7 checks passed
