Opened 16 years ago
Last modified 10 years ago
#3290 closed defect
Make sure an XML parser fully supporting UTF-16 is used by JOSM — at Initial Version
| Reported by: | Gubaer | Owned by: | team |
|---|---|---|---|
| Priority: | major | Milestone: | |
| Component: | Core | Version: | |
| Keywords: | javabug 9 xml unicode stax | Cc: | |
Description
See the discussion on dev and josm-dev:
- there's a problem with XML parsers which don't handle UTF-16 correctly. Apparently, they insert duplicates of surrogate code points in OSM keys or values while parsing. After a couple of I/O operations, even small OSM files/fragments can become very large. In OSM the problem was spotted because of Gothic code points in name:got tags.
- Xerces-J 2.6.2 seems to be affected
- Xerces-J 2.9.1 seems to be OK
- Woodstox StAX XML parser seems to be OK
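For background, Gothic letters lie outside the Basic Multilingual Plane, so Java (and UTF-16 in general) stores each one as a pair of surrogate code units. The following is only an illustrative sketch; the class name `SurrogateDemo` and the helper `utf16Units` are made up for this example and are not part of JOSM.

```java
public class SurrogateDemo {
    // U+10330 GOTHIC LETTER AHSA, a code point outside the Basic Multilingual Plane.
    static final int GOTHIC_AHSA = 0x10330;

    /** Returns the UTF-16 code units of one code point as space-separated hex (hypothetical helper). */
    static String utf16Units(int codePoint) {
        StringBuilder sb = new StringBuilder();
        for (char c : Character.toChars(codePoint)) {
            if (sb.length() > 0) {
                sb.append(' ');
            }
            sb.append(String.format("%04X", (int) c));
        }
        return sb.toString();
    }

    public static void main(String[] args) {
        String s = new String(Character.toChars(GOTHIC_AHSA));
        System.out.println(s.length());                      // number of UTF-16 code units: 2
        System.out.println(s.codePointCount(0, s.length())); // number of code points: 1
        System.out.println(utf16Units(GOTHIC_AHSA));         // D800 DF30
    }
}
```

A parser that duplicates one half of such a pair produces a longer and malformed string, which would be consistent with the file growth described above.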
JOSM should either ship a compliant parser with its distribution or check/enforce on startup that a known compliant parser is on the classpath.
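A startup check could look roughly like the sketch below. It parses a tiny OSM-like fragment containing two Gothic letters with whatever SAX parser JAXP selects from the classpath, and compares the attribute value it gets back with the original. The class name `XmlParserCheck`, its methods, and the sample fragment are assumptions for illustration, not existing JOSM code. It is also not confirmed that this exact input triggers the bug in Xerces-J 2.6.2, so the test data may need adjusting.

```java
import java.io.ByteArrayInputStream;
import java.nio.charset.StandardCharsets;
import javax.xml.parsers.SAXParser;
import javax.xml.parsers.SAXParserFactory;
import org.xml.sax.Attributes;
import org.xml.sax.helpers.DefaultHandler;

public class XmlParserCheck {
    // Two Gothic letters (U+10330, U+10331), each written as a UTF-16 surrogate pair.
    static final String SAMPLE = "\uD800\uDF30\uD800\uDF31";

    /** Parses a small fragment and returns the value of the "v" attribute as the parser reports it. */
    static String roundTrip(String value) throws Exception {
        String xml = "<?xml version=\"1.0\" encoding=\"UTF-8\"?>"
                + "<osm><tag k=\"name:got\" v=\"" + value + "\"/></osm>";
        final StringBuilder seen = new StringBuilder();
        // Uses whichever SAX implementation JAXP finds on the classpath.
        SAXParser parser = SAXParserFactory.newInstance().newSAXParser();
        parser.parse(new ByteArrayInputStream(xml.getBytes(StandardCharsets.UTF_8)),
                new DefaultHandler() {
                    @Override
                    public void startElement(String uri, String localName,
                                             String qName, Attributes atts) {
                        if ("tag".equals(qName)) {
                            seen.append(atts.getValue("v"));
                        }
                    }
                });
        return seen.toString();
    }

    /** Returns true if the parsed value is identical to the input; false on mismatch or parse failure. */
    static boolean parserPreservesSurrogates() {
        try {
            return SAMPLE.equals(roundTrip(SAMPLE));
        } catch (Exception e) {
            return false;
        }
    }

    public static void main(String[] args) {
        if (!parserPreservesSurrogates()) {
            System.err.println("XML parser on the classpath mangles surrogate pairs");
            System.exit(1);
        }
        System.out.println("XML parser check passed");
    }
}
```

A behavioural check like this avoids keeping a list of known-good parser versions, but it only covers the inputs it tests. Shipping a known compliant parser remains the more predictable option.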


