Opened 3 months ago
#24381 new defect
Undesirable normalization of whitespace characters
Reported by: | | Owned by: | team |
---|---|---|---|
Priority: | normal | Milestone: | |
Component: | Core | Version: | |
Keywords: | template_report | Cc: |
Description
JOSM normalizes all whitespace in tags during upload (see FixDataSpace in FixDataHook.java). This results in some undesirable changes:
- Line breaks are replaced by spaces. This was already reported in #24014. (Note: Even though #24014 was supposedly fixed, the fix is incomplete: it disables normalization when editing through a preset, but the normalization is still applied during upload.)
- Non-breaking spaces are replaced by regular spaces. In the Czech Republic, some users use non-breaking spaces in street names in accordance with local typographic conventions. Normalizing whitespace when a user merely changes geometry tends to cause inconsistencies in the name between different parts of a single street.
I believe these sorts of changes should never be done automatically when the user did not intend to change the value at all. What I would find acceptable is at most stripping leading and trailing whitespace. (But even then, silently changing something the user did not touch seems dubious.)
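To make the difference concrete, here is a minimal sketch contrasting the two behaviors: collapsing every run of Unicode whitespace into one ASCII space (what is described above) versus stripping only leading and trailing whitespace (what I would find acceptable at most). This is not JOSM's actual code. The class and method names (`WhitespaceSketch`, `collapseAll`, `stripEdges`) and the street name are made up for illustration.

```java
import java.util.regex.Pattern;

// Illustrative sketch only; none of these names exist in JOSM.
class WhitespaceSketch {
    // One or more characters with the Unicode White_Space property.
    // This includes the line feed (U+000A) and the non-breaking space (U+00A0).
    private static final Pattern WS_RUN = Pattern.compile("\\p{IsWhite_Space}+");

    // Whitespace runs anchored at the very start or the very end of the input.
    private static final Pattern WS_EDGES =
            Pattern.compile("\\A\\p{IsWhite_Space}+|\\p{IsWhite_Space}+\\z");

    /** Replaces every run of Unicode whitespace with a single ASCII space. */
    public static String collapseAll(String value) {
        return WS_RUN.matcher(value).replaceAll(" ");
    }

    /** Removes leading and trailing whitespace, leaves the interior untouched. */
    public static String stripEdges(String value) {
        return WS_EDGES.matcher(value).replaceAll("");
    }

    public static void main(String[] args) {
        // Hypothetical street name with a non-breaking space after the preposition.
        String name = "K\u00A0Lesu";
        String note = "first line\nsecond line";

        // Compare against the original to see whether the value was altered.
        System.out.println(collapseAll(name).equals(name)); // false: NBSP became a space
        System.out.println(collapseAll(note).equals(note)); // false: line break became a space
        System.out.println(stripEdges(name).equals(name));  // true: interior is preserved
    }
}
```

With `stripEdges`, both the non-breaking space and the line break survive, so a geometry-only edit would not alter the value.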
What steps will reproduce the problem?
- Download an object which has a tag containing Unicode whitespace characters other than ASCII space (e.g. line breaks or non-breaking spaces)
- Make any change to the object (e.g. move to a different position)
- Upload changes
What is the expected result?
Tags that the user did not edit are not changed in any way.
What happens instead?
The tag value is silently changed just before upload. Any sequence of Unicode whitespace characters is replaced by a single ASCII space.
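One possible way to get the expected result is to compare each tag against the value that was originally downloaded and normalize only the values that differ. The sketch below is purely an assumption about how that could look. `PreserveUntouchedTagsSketch` and `tagsForUpload` are hypothetical names, and the two maps stand in for whatever JOSM keeps internally.

```java
import java.util.LinkedHashMap;
import java.util.Map;
import java.util.Objects;
import java.util.regex.Pattern;

// Illustrative sketch only; this is not how FixDataHook is implemented.
class PreserveUntouchedTagsSketch {
    // One or more characters with the Unicode White_Space property.
    private static final Pattern WS_RUN = Pattern.compile("\\p{IsWhite_Space}+");

    /**
     * Builds the tag map to upload. Values identical to the downloaded ones
     * pass through byte for byte; only new or edited values are normalized.
     */
    public static Map<String, String> tagsForUpload(Map<String, String> downloaded,
                                                    Map<String, String> current) {
        Map<String, String> result = new LinkedHashMap<>();
        for (Map.Entry<String, String> tag : current.entrySet()) {
            String value = tag.getValue();
            boolean untouched = Objects.equals(downloaded.get(tag.getKey()), value);
            result.put(tag.getKey(),
                    untouched ? value : WS_RUN.matcher(value).replaceAll(" "));
        }
        return result;
    }

    public static void main(String[] args) {
        // Hypothetical data: the name was downloaded with a non-breaking space,
        // the user only added a new "surface" tag.
        Map<String, String> downloaded = Map.of("name", "K\u00A0Lesu");
        Map<String, String> current = new LinkedHashMap<>();
        current.put("name", "K\u00A0Lesu");
        current.put("surface", "paving\u00A0stones");

        Map<String, String> upload = tagsForUpload(downloaded, current);
        System.out.println(upload.get("name").equals("K\u00A0Lesu"));       // true
        System.out.println(upload.get("surface").equals("paving stones")); // true
    }
}
```

Whether edited values should be normalized at all is a separate question. The point here is only that untouched values are never rewritten.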
Please provide any additional information below. Attach a screenshot if possible.
Note: This ticket is specifically about the normalization performed during upload. I'm not asking for the ability to edit tags containing Unicode whitespace; preserving existing data is enough for me.
Relative:URL: ^/trunk
Repository:UUID: 0c6e7542-c601-0410-84e7-c038aed88b3b
Last:Changed Date: 2025-06-26 08:18:38 +0200 (Thu, 26 Jun 2025)
Revision:19418
Build-Date:2025-06-27 01:31:13
URL:https://josm.openstreetmap.de/svn/trunk

Identification: JOSM/1.5 (19418 cs) Linux Arch Linux
Memory Usage: 376 MB / 11408 MB (266 MB allocated, but free)
Java version: 24.0.1, Arch Linux, OpenJDK 64-Bit Server VM
Look and Feel: javax.swing.plaf.metal.MetalLookAndFeel
Screen: :0.0 1920x1080x[Multi depth]@60Hz (scaling 1.00×1.00)
Maximum Screen Size: 1920×1080
Best cursor sizes: 16×16→16×16, 32×32→32×32
Environment variable LANG: cs_CZ.UTF-8
System property file.encoding: UTF-8
System property sun.jnu.encoding: UTF-8
Locale info: cs_CZ
Numbers with default locale: 1234567890 -> 1234567890
Desktop environment: GNOME
VM arguments: [-Djosm.restart=true, -Djava.net.useSystemProxies=true, -XX:MaxRAMPercentage=75.0, --add-exports=java.base/sun.security.action=ALL-UNNAMED, --add-exports=java.desktop/com.sun.imageio.plugins.jpeg=ALL-UNNAMED, --add-exports=java.desktop/com.sun.imageio.spi=ALL-UNNAMED]
Dataset consistency test: No problems found

Plugins:
+ changeset-viewer (1746100587)
+ imagery_offset_db (36438)
+ openqa (113)

Map paint styles:
+ https://josm.openstreetmap.de/josmfile?page=Styles/OsmcSKCZPL&zip=1
+ ${HOME}/bicycle_routes.mapcss
- https://josm.openstreetmap.de/josmfile?page=Styles/ConscriptionStreetnumber&zip=1
- https://josm.openstreetmap.de/josmfile?page=Styles/Lane_and_Road_Attributes&zip=1
- https://josm.openstreetmap.de/josmfile?page=Styles/LessObtrusiveNodes&zip=1
- https://josm.openstreetmap.de/josmfile?page=Styles/SidewalksAndFootways&zip=1

OSM API: https://api06.dev.openstreetmap.org/api