Opened 18 years ago

Last modified 12 years ago

#518 closed defect

[PATCH] Unicode normalization — at Version 3

Reported by: moyogo@…
Owned by: framm
Priority: minor
Milestone: 14.01
Component: Core
Version: latest
Keywords:
Cc:

Description (last modified by Don-vip)

JOSM should normalize strings at input. They should also be normalized when searching.

For example, inputting name="Rue de l'École" should end up the same as name="Rue de l'École". The first has "É" as U+0045 LATIN CAPITAL LETTER E + U+0301 COMBINING ACUTE ACCENT, while the second has "É" as U+00C9 LATIN CAPITAL LETTER E WITH ACUTE.
Searching for one should match the other.

See http://unicode.org/faq/normalization.html for more info.

java.text.Normalizer.normalize(string, java.text.Normalizer.Form.NFC) can be used when required.
NFC is probably the better choice: for legacy reasons it is more widely supported than NFD.
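
A minimal sketch of the idea, assuming NFC is the chosen form. The class and helper names (NormalizationSketch, toNfc, equalsNormalized) are hypothetical and are not taken from the JOSM code base or the attached patch; only java.text.Normalizer is the real API mentioned above.

```java
import java.text.Normalizer;
import java.util.Objects;

public class NormalizationSketch {

    // Hypothetical helper: normalize a tag key or value to NFC,
    // for instance at the point where user input is accepted.
    static String toNfc(String s) {
        return s == null ? null : Normalizer.normalize(s, Normalizer.Form.NFC);
    }

    // Hypothetical helper: compare two strings after normalizing both,
    // as a search would need to do if stored data is not yet normalized.
    static boolean equalsNormalized(String a, String b) {
        return Objects.equals(toNfc(a), toNfc(b));
    }

    public static void main(String[] args) {
        // "E" followed by U+0301 COMBINING ACUTE ACCENT
        String decomposed = "Rue de l'E\u0301cole";
        // U+00C9 LATIN CAPITAL LETTER E WITH ACUTE
        String precomposed = "Rue de l'\u00C9cole";

        // Raw comparison sees two different code point sequences.
        System.out.println(decomposed.equals(precomposed));            // false
        // After NFC normalization both forms compare equal.
        System.out.println(equalsNormalized(decomposed, precomposed)); // true
    }
}
```

Normalizing once at input keeps stored values canonical, while normalizing both operands at search time also covers data entered before normalization was in place.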

Change History (4)

by moyogo@…, 18 years ago

Attachment: josm-normalization.patch added

Normalizes strings before comparison in SearchCompiler, and the value in PropertiesDialog

comment:1 by stoecker, 17 years ago

Summary: Unicode normalizatin → [PATCH] Unicode normalization

comment:2 by stoecker, 17 years ago

Resolution: fixed
Status: new → closed

Fixed in r1155.

comment:3 by Don-vip, 12 years ago

Description: modified (diff)
Milestone: 14.01
Priority: trivial → minor
Resolution: fixed
Status: closed → reopened

Normalization was reverted in r1168 and only partially restored in r3556.

Entering name="Rue de l'École" and name="Rue de l'École" does not produce the same result, even though both are found by the search compiler.
