Opened 6 months ago

Last modified 5 months ago

#20916 new defect

The SimilarNamedWays test reports false positive on Arabic street names

Reported by: selimachour@…
Owned by: team
Priority: normal
Milestone:
Component: Core validator
Version:
Keywords: SimilarNamedWays Arabic i18n
Cc:

Description

What steps will reproduce the problem?

  1. Assign two made-up names to two streets: first ["name"="نهج الشمس"] (Sun St.) and second ["name"="نهج القمر"] (Moon St.)
  2. Run validation

What is the expected result?

No warnings, as the names are totally different ... if you can read Arabic :)

What happens instead?

I get a warning about two similarly named ways: "نهج الشمس", "نهج القمر"

Please provide any additional information below. Attach a screenshot if possible.

URL:https://josm.openstreetmap.de/svn/trunk
Repository:UUID: 0c6e7542-c601-0410-84e7-c038aed88b3b
Last:Changed Date: 2021-04-27 20:35:33 +0200 (Tue, 27 Apr 2021)
Build-Date:2021-04-27 21:58:39
Revision:17833
Relative:URL: ^/trunk

Identification: JOSM/1.5 (17833 en) Linux Manjaro Linux
Memory Usage: 485 MB / 3536 MB (261 MB allocated, but free)
Java version: 1.8.0_292-b10, Oracle Corporation, OpenJDK 64-Bit Server VM
Look and Feel: javax.swing.plaf.metal.MetalLookAndFeel
Screen: :0.0 1920×1080 (scaling 1.00×1.00)
Maximum Screen Size: 1920×1080
Best cursor sizes: 16×16→16×16, 32×32→32×32
Environment variable LANG: en_US.utf8
System property file.encoding: UTF-8
System property sun.jnu.encoding: UTF-8
Locale info: en_US
Numbers with default locale: 1234567890 -> 1234567890
Desktop environment: X-Cinnamon
VM arguments: [-Djosm.restart=true]
Dataset consistency test: No problems found

Plugins:
+ FixAddresses (35640)
+ Mapillary (1.5.37.6)
+ apache-commons (35524)
+ apache-http (35589)
+ continuosDownload (91)
+ jna (35662)
+ mapwithai
+ utilsplugin2 (35691)

Map paint styles:
+ https://josm.openstreetmap.de/josmfile?page=Styles/Coloured_Streets&zip=1
- https://josm.openstreetmap.de/josmfile?page=Styles/Maxspeed&zip=1

Last errors/warnings:
- 00044.536 E: Failed to locate image 'regulatory--dual-lanes-cyclists-and-pedestrians--g1'
- 00044.981 E: Failed to locate image 'regulatory--texts--g1'
- 00044.981 E: Failed to locate image 'regulatory--texts--g2'
- 00045.058 E: Failed to locate image 'void--car-mount'
- 00045.058 E: Failed to locate image 'void--dynamic'
- 00045.059 E: Failed to locate image 'void--ego-vehicle'
- 00045.059 E: Failed to locate image 'void--ground'
- 00045.059 E: Failed to locate image 'void--static'
- 00045.152 E: Failed to locate image 'warning--kangaroo-crossing--g1'
- 00971.709 E: Invalid setting (Icon missing): org.openstreetmap.josm.plugins.fixAddresses.FixAddressesPreferences

Attachments (3)

20916_worksforme.osm (900 bytes) - added by Don-vip 5 months ago.
similar_named_ways.osm (131.3 KB) - added by selimachour@… 5 months ago.
20916.osm (7.4 KB) - added by Don-vip 5 months ago.
minimal test data to reproduce

Change History (10)

comment:1 Changed 6 months ago by skyper

Keywords: Arabic added
Summary: changed from "The SimilarNamedWays plugins reports false positive on Arabic street names" to "The SimilarNamedWays test reports false positive on Arabic street names"

comment:2 Changed 5 months ago by Don-vip

Keywords: i18n added

Changed 5 months ago by Don-vip

Attachment: 20916_worksforme.osm added

comment:3 Changed 5 months ago by Don-vip

Owner: changed from team to selimachour@…
Status: changed from new to needinfo

Can't reproduce with the dataset I created based on your examples. Can you please provide a small sample showing the problem?

Changed 5 months ago by selimachour@…

Attachment: similar_named_ways.osm added

comment:4 Changed 5 months ago by anonymous

Strangely, starting from a blank layer and adding two streets with the example names didn't reproduce the bug, but downloading some OSM data still leads to the same problem.
I attached the .osm file for which my version 17919 still warns about similarly named ways.

comment:5 Changed 5 months ago by anonymous

Owner: changed from selimachour@… to team
Status: changed from needinfo to new

comment:6 Changed 5 months ago by Don-vip

Milestone: 21.07
Owner: changed from team to Don-vip
Status: changed from new to assigned

comment:7 Changed 5 months ago by Don-vip

Milestone: 21.07
Owner: changed from Don-vip to team
Status: changed from assigned to new

Thanks, I can reproduce, but unfortunately I have no idea how to fix that. This is not linked to the name being in Arabic; the test works as expected.

1st name: U+0627 U+0644 U+0634 U+0645 U+0633
2nd name: U+0627 U+0644 U+0642 U+0645 U+0631

The Levenshtein distance is only 2, which is the threshold used to detect a similarity.
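
For illustration, the distance between the two code point sequences above can be checked with a short standalone sketch. This is a textbook dynamic-programming Levenshtein written in Python, not JOSM's own code; the function name `levenshtein` and the variable names are made up for this example.

```python
def levenshtein(a: str, b: str) -> int:
    """Edit distance (insert, delete, substitute) over Unicode code points."""
    # prev[j] holds the distance between the processed prefix of a and b[:j].
    prev = list(range(len(b) + 1))
    for i, ca in enumerate(a, start=1):
        curr = [i]  # distance between a[:i] and the empty string
        for j, cb in enumerate(b, start=1):
            cost = 0 if ca == cb else 1
            curr.append(min(
                prev[j] + 1,         # delete ca
                curr[j - 1] + 1,     # insert cb
                prev[j - 1] + cost,  # substitute ca with cb, or keep it
            ))
        prev = curr
    return prev[-1]


# The two sequences quoted in this comment.
first = "\u0627\u0644\u0634\u0645\u0633"
second = "\u0627\u0644\u0642\u0645\u0631"

distance = levenshtein(first, second)
print(distance)  # 2
```

The two names differ only by single-character substitutions at the third and fifth positions (U+0634 vs. U+0642 and U+0633 vs. U+0631), which is why the distance comes out as 2.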

Last edited 5 months ago by Don-vip

Changed 5 months ago by Don-vip

Attachment: 20916.osm added

minimal test data to reproduce