Opened 3 months ago

#24381 new defect

Undesirable normalization of whitespace characters

Reported by: filip.hejsek@…
Owned by: team
Priority: normal
Milestone:
Component: Core
Version:
Keywords: template_report
Cc:

Description

JOSM normalizes all whitespace in tags during upload (see FixDataSpace in FixDataHook.java). This results in some undesirable changes:

  • Line breaks are replaced by spaces. This was already reported in #24014. (Note: Although #24014 was supposedly fixed, the fix does not work: it disables normalization when editing through a preset, but normalization is still applied during upload.)
  • Non-breaking spaces are replaced by regular spaces. In the Czech Republic, some users use non-breaking spaces in street names in accordance with local typographic conventions. Normalizing whitespace when a user merely changes geometry tends to cause inconsistencies in the name between different parts of a single street.

I believe these sorts of changes should never be done automatically when the user did not intend to change the value at all. What I would find acceptable is at most stripping leading and trailing whitespace. (But even then, silently changing something the user did not touch seems dubious.)
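
To make the difference concrete, here is a minimal sketch contrasting the two behaviours. It is not JOSM's actual code: the class and method names are made up for illustration, and I am assuming the current normalization behaves like a Unicode-aware `\s+` replacement, which matches what I observe but may differ from the real implementation in FixDataHook.java.

```java
import java.util.regex.Pattern;

// Illustrative only: hypothetical names, not taken from JOSM.
final class WhitespaceSketch {
    // UNICODE_CHARACTER_CLASS makes \s match every character with the
    // Unicode White_Space property, including the non-breaking space U+00A0.
    private static final Pattern WS_RUN =
            Pattern.compile("\\s+", Pattern.UNICODE_CHARACTER_CLASS);
    private static final Pattern WS_ENDS =
            Pattern.compile("\\A\\s+|\\s+\\z", Pattern.UNICODE_CHARACTER_CLASS);

    // Assumed current behaviour: every whitespace run becomes one ASCII space,
    // and leading/trailing whitespace is dropped.
    static String collapseAll(String value) {
        return WS_RUN.matcher(value).replaceAll(" ").trim();
    }

    // The most I would find acceptable: strip the ends, keep the interior as-is.
    static String stripEnds(String value) {
        return WS_ENDS.matcher(value).replaceAll("");
    }

    public static void main(String[] args) {
        // A value with a non-breaking space and a line break inside it.
        String name = " U\u00A0Lesa\nII ";
        System.out.println(collapseAll(name).equals("U Lesa II"));
        System.out.println(stripEnds(name).equals("U\u00A0Lesa\nII"));
    }
}
```

With the first method the non-breaking space and the line break are lost; with the second only the outer spaces are removed and the interior characters survive.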

What steps will reproduce the problem?

  1. Download an object that has a tag containing Unicode whitespace characters other than the ASCII space (e.g. line breaks or non-breaking spaces)
  2. Make any change to the object (e.g. move to a different position)
  3. Upload changes

What is the expected result?

Tags that the user did not edit are not changed in any way.

What happens instead?

The tag value is silently changed just before upload. Any sequence of Unicode whitespace characters is replaced by a single ASCII space.

Please provide any additional information below. Attach a screenshot if possible.

Note: This ticket is specifically about normalization performed during upload. I'm not asking for the ability to edit tags containing Unicode whitespace; preserving existing data is enough for me.
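
One possible way to get that preservation, sketched under assumptions: compare each tag against the value that was originally downloaded and normalize only values that actually differ. The class, the method names, and the idea of passing two plain maps are hypothetical and do not reflect how JOSM stores primitives or runs its upload hooks.

```java
import java.util.LinkedHashMap;
import java.util.Map;
import java.util.regex.Pattern;

// Illustrative only: hypothetical names, not taken from JOSM.
final class PreserveUntouchedTagsSketch {
    private static final Pattern WS_RUN =
            Pattern.compile("\\s+", Pattern.UNICODE_CHARACTER_CLASS);

    // Stand-in for whatever normalization is applied at upload time.
    static String normalize(String value) {
        return WS_RUN.matcher(value).replaceAll(" ").trim();
    }

    // downloaded: tags as received from the server (assumed to be available)
    // current:    tags of the same object at upload time
    static Map<String, String> tagsForUpload(Map<String, String> downloaded,
                                             Map<String, String> current) {
        Map<String, String> result = new LinkedHashMap<>();
        for (Map.Entry<String, String> e : current.entrySet()) {
            // A tag counts as untouched if its value is identical to the downloaded one.
            boolean untouched = e.getValue().equals(downloaded.get(e.getKey()));
            result.put(e.getKey(), untouched ? e.getValue() : normalize(e.getValue()));
        }
        return result;
    }
}
```

Under this sketch, moving a node of a street whose name contains a non-breaking space would upload the name unchanged, while a newly typed value with doubled spaces would still be cleaned up.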

Relative:URL: ^/trunk
Repository:UUID: 0c6e7542-c601-0410-84e7-c038aed88b3b
Last:Changed Date: 2025-06-26 08:18:38 +0200 (Thu, 26 Jun 2025)
Revision:19418
Build-Date:2025-06-27 01:31:13
URL:https://josm.openstreetmap.de/svn/trunk

Identification: JOSM/1.5 (19418 cs) Linux Arch Linux
Memory Usage: 376 MB / 11408 MB (266 MB allocated, but free)
Java version: 24.0.1, Arch Linux, OpenJDK 64-Bit Server VM
Look and Feel: javax.swing.plaf.metal.MetalLookAndFeel
Screen: :0.0 1920x1080x[Multi depth]@60Hz (scaling 1.00×1.00)
Maximum Screen Size: 1920×1080
Best cursor sizes: 16×16→16×16, 32×32→32×32
Environment variable LANG: cs_CZ.UTF-8
System property file.encoding: UTF-8
System property sun.jnu.encoding: UTF-8
Locale info: cs_CZ
Numbers with default locale: 1234567890 -> 1234567890
Desktop environment: GNOME
VM arguments: [-Djosm.restart=true, -Djava.net.useSystemProxies=true, -XX:MaxRAMPercentage=75.0, --add-exports=java.base/sun.security.action=ALL-UNNAMED, --add-exports=java.desktop/com.sun.imageio.plugins.jpeg=ALL-UNNAMED, --add-exports=java.desktop/com.sun.imageio.spi=ALL-UNNAMED]
Dataset consistency test: No problems found

Plugins:
+ changeset-viewer (1746100587)
+ imagery_offset_db (36438)
+ openqa (113)

Map paint styles:
+ https://josm.openstreetmap.de/josmfile?page=Styles/OsmcSKCZPL&zip=1
+ ${HOME}/bicycle_routes.mapcss
- https://josm.openstreetmap.de/josmfile?page=Styles/ConscriptionStreetnumber&zip=1
- https://josm.openstreetmap.de/josmfile?page=Styles/Lane_and_Road_Attributes&zip=1
- https://josm.openstreetmap.de/josmfile?page=Styles/LessObtrusiveNodes&zip=1
- https://josm.openstreetmap.de/josmfile?page=Styles/SidewalksAndFootways&zip=1

OSM API: https://api06.dev.openstreetmap.org/api
