Opened 7 years ago
Closed 7 years ago
#14858 closed enhancement (fixed)
"Similarly named ways" Should also detect names with accent and case variations
Reported by: | naoliv | Owned by: | team |
---|---|---|---|
Priority: | normal | Milestone: | 17.06 |
Component: | Core validator | Version: | |
Keywords: | Cc: |
Description (last modified by )
Have two highways, one named Rua São João and another named Rua Sao Joao.
Validate the data and see a proper warning about Similarly named ways.
Now change the second highway's name from Rua Sao Joao to Rua SAO JOAO and validate again.
There is no warning anymore (which is wrong).
It seems that the Similarly named ways test should also catch this kind of variation (accents and case).
JOSM:
URL:http://josm.openstreetmap.de/svn/trunk
Repository:UUID: 0c6e7542-c601-0410-84e7-c038aed88b3b
Last:Changed Date: 2017-05-29 16:19:58 +0200 (Mon, 29 May 2017)
Build-Date:2017-05-29 14:25:03
Revision:12275
Relative:URL: ^/trunk
Identification: JOSM/1.5 (12275 en) Linux Debian GNU/Linux 9 (stretch)
Memory Usage: 247 MB / 10206 MB (84 MB allocated, but free)
Java version: 1.8.0_131-8u131-b11-2-b11, Oracle Corporation, OpenJDK 64-Bit Server VM
Screen: :0.0 1600x900, :0.1 1280x1024
Maximum Screen Size: 1600x1024
Java package: openjdk-8-jre:amd64-8u131-b11-2
Java ATK Wrapper package: libatk-wrapper-java:all-0.33.3-13
VM arguments: [-Dawt.useSystemAAFontSettings=on]
Program arguments: [--language=en]
Dataset consistency test: No problems found
Attachments (0)
Change History (6)
comment:1 by , 7 years ago
Description: | modified (diff) |
---|---|
Summary: | "Similarly named ways" should probably be case insensitive → Should also detect names with accent and case variations |
comment:2 by , 7 years ago
Summary: | Should also detect names with accent and case variations → "Similarly named ways" Should also detect names with accent and case variations |
---|
comment:3 by , 7 years ago
Description: | modified (diff) |
---|
comment:4 by , 7 years ago
Comparing strings is a complex subject. Currently we raise a warning if the Levenshtein distance is 1 or 2, with some normalization rules.
Basically it's the number of characters that differ (more precisely, the minimum number of single-character insertions, deletions or substitutions needed to turn one name into the other). So it works well with your first example (two letters differ) but not with the second one, because five letters differ.
Maybe we could add a normalization rule that handles the case where the difference concerns only the character case.
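To make the numbers above concrete, here is a minimal Java sketch of the idea. It is not the validator's actual code: the class name `SimilarNamesSketch`, the helpers `levenshtein`, `normalize` and `isSimilar`, and the choice to strip accents as well as fold case are all assumptions made for illustration. The names are written with Unicode escapes (`\u00e3` is ã) so the example does not depend on the source file encoding.

```java
import java.text.Normalizer;
import java.util.Locale;

// Hypothetical sketch, not JOSM's implementation.
class SimilarNamesSketch {

    /** Levenshtein distance: insertion, deletion and substitution each cost 1. */
    static int levenshtein(String a, String b) {
        int[] prev = new int[b.length() + 1];
        int[] curr = new int[b.length() + 1];
        for (int j = 0; j <= b.length(); j++) {
            prev[j] = j; // distance from the empty prefix of a
        }
        for (int i = 1; i <= a.length(); i++) {
            curr[0] = i; // distance to the empty prefix of b
            for (int j = 1; j <= b.length(); j++) {
                int cost = a.charAt(i - 1) == b.charAt(j - 1) ? 0 : 1;
                curr[j] = Math.min(Math.min(curr[j - 1] + 1, prev[j] + 1),
                                   prev[j - 1] + cost);
            }
            int[] tmp = prev;
            prev = curr;
            curr = tmp;
        }
        return prev[b.length()];
    }

    /** Assumed normalization rule: decompose, drop combining marks, fold case. */
    static String normalize(String name) {
        String decomposed = Normalizer.normalize(name, Normalizer.Form.NFD);
        return decomposed.replaceAll("\\p{M}", "").toLowerCase(Locale.ROOT);
    }

    /** Assumed rule: names that are not identical but at most 2 edits apart once normalized. */
    static boolean isSimilar(String a, String b) {
        return !a.equals(b) && levenshtein(normalize(a), normalize(b)) <= 2;
    }

    public static void main(String[] args) {
        String accented = "Rua S\u00e3o Jo\u00e3o"; // "Rua São João"
        String plain = "Rua Sao Joao";
        String upper = "Rua SAO JOAO";

        System.out.println(levenshtein(accented, plain)); // raw distance, first example
        System.out.println(levenshtein(accented, upper)); // raw distance, second example
        System.out.println(isSimilar(accented, upper));   // with the assumed normalization
    }
}
```

With raw strings, the first pair is 2 edits apart and the second is 5, which matches the behaviour described in the ticket. After the assumed normalization, all three names collapse to the same string, so a check like `isSimilar` would flag both pairs.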
comment:5 by , 7 years ago
Milestone: | → 17.06 |
---|
(I promise I will review what I write more carefully before filing a ticket next time)