Ticket #3290 (new defect)
Make sure an XML parser fully supporting UTF-16 is used by JOSM
| Reported by: | Gubaer | Owned by: | team |
|---|---|---|---|
| Priority: | major | Component: | Core |
| Version: | | Keywords: | |
| Cc: | | | |
Description
See the discussion on dev and josm-dev:
- Some XML parsers don't handle UTF-16 surrogate pairs correctly. Apparently, they insert duplicates of surrogate code points into OSM keys or values while parsing, so after a couple of I/O operations even small OSM files or fragments can become very large. In OSM the problem was spotted because of Gothic characters (which lie outside the Basic Multilingual Plane and therefore need surrogate pairs in UTF-16) in name:got tags.
- Xerces-J 2.6.2 seems to be affected
- Xerces-J 2.9.1 seems to be OK
- Woodstox StAX XML parser seems to be OK
JOSM should either ship a compliant parser with its distribution or check/enforce on startup that a known compliant parser is on the classpath.
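A startup check along these lines could be sketched as follows. This is only an illustration, not JOSM code: the class name `XmlParserCheck`, its methods, and the probe document are hypothetical, and it uses whatever SAX parser `SAXParserFactory` finds on the classpath. It parses a tiny OSM-like tag containing two Gothic letters and verifies that the attribute value comes back unchanged. Such a short probe may not trigger every failure mode of a broken parser, so it is better seen as a smoke test than as proof of compliance.

```java
import java.io.ByteArrayInputStream;
import java.nio.charset.StandardCharsets;
import javax.xml.parsers.SAXParserFactory;
import org.xml.sax.Attributes;
import org.xml.sax.helpers.DefaultHandler;

// Hypothetical helper, not part of JOSM.
class XmlParserCheck {
    // U+10330 and U+10331 (Gothic letters), each a surrogate pair in UTF-16.
    static final String SAMPLE = "\uD800\uDF30\uD800\uDF31";

    /** Parses a tiny OSM-like fragment and returns the "v" attribute as the parser reports it. */
    static String parseTagValue(String value) {
        String xml = "<?xml version=\"1.0\" encoding=\"UTF-8\"?>"
                + "<tag k=\"name:got\" v=\"" + value + "\"/>";
        final StringBuilder seen = new StringBuilder();
        try {
            // Uses the default parser found on the classpath.
            SAXParserFactory.newInstance().newSAXParser().parse(
                new ByteArrayInputStream(xml.getBytes(StandardCharsets.UTF_8)),
                new DefaultHandler() {
                    @Override
                    public void startElement(String uri, String localName,
                                             String qName, Attributes atts) {
                        if ("tag".equals(qName)) {
                            seen.append(atts.getValue("v"));
                        }
                    }
                });
        } catch (Exception e) {
            throw new IllegalStateException("XML parser failed on the probe document", e);
        }
        return seen.toString();
    }

    /** True if the supplementary characters survive parsing without duplication or loss. */
    static boolean parserHandlesSurrogatePairs() {
        return SAMPLE.equals(parseTagValue(SAMPLE));
    }

    public static void main(String[] args) {
        if (!parserHandlesSurrogatePairs()) {
            System.err.println("The XML parser on the classpath mangles surrogate pairs.");
            System.exit(1);
        }
        System.out.println("XML parser passed the surrogate pair probe.");
    }
}
```

The check compares whole strings rather than lengths, so both duplicated and dropped surrogates would be detected.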
Change History
Here are the links to the two threads: