Opened 5 years ago

Closed 5 years ago

Last modified 3 years ago

#10862 closed defect (fixed)

"URL contains non-ascii characters" warning?

Reported by: anonymous
Owned by: team
Priority: normal
Milestone: 14.12
Component: Core validator
Version:
Keywords: template_report
Cc:

Description

What steps will reproduce the problem?

  1. Download object http://www.openstreetmap.org/node/3191949637
  2. modify any value
  3. Upload data to force a validation

What is the expected result?

No warning

What happens instead?

'contact:website': URL contains non-ascii characters (1)

Please provide any additional information below. Attach a screenshot if possible.

contact:website=http://золотаяцепь.рф is valid. The website really does open in a browser.

See: http://www.w3.org/TR/charmod/#URIs

Revision: 7777
Repository Root: http://josm.openstreetmap.de/svn
Relative URL: ^/trunk
Last Changed Author: Don-vip
Last Changed Date: 2014-12-09 18:14:49 +0100 (Tue, 09 Dec 2014)
Build-Date: 2014-12-09 17:17:36
URL: http://josm.openstreetmap.de/svn/trunk
Repository UUID: 0c6e7542-c601-0410-84e7-c038aed88b3b
Last Changed Rev: 7777

Identification: JOSM/1.5 (7777 en) Mac OS X 10.10.1
Memory Usage: 321 MB / 910 MB (105 MB allocated, but free)
Java version: 1.8.0_25, Oracle Corporation, Java HotSpot(TM) 64-Bit Server VM
Dataset consistency test: No problems found

Plugins:
- ImageWayPoint (30737)
- OpeningHoursEditor (30737)
- PicLayer (30762)
- cadastre-fr (30762)
- geotools (30762)
- gpxfilter (30738)
- imagery_offset_db (30808)
- jts (30762)
- measurement (30737)
- notes (v0.9.5)
- opendata (30806)
- reverter (30737)
- scripting (30702)
- tag2link (30719)
- undelete (30762)
- utilsplugin2 (30762)
- wikipedia (30780)

Last errors/warnings:
- W: Could not get presets icon Icon_raa.png
- E: Failed to locate image 'Icon_npu.png'
- W: Could not get presets icon Icon_npu.png
- E: Failed to locate image 'Icon_uzkb.png'
- W: Could not get presets icon Icon_uzkb.png

Attachments (0)

Change History (10)

comment:1 Changed 5 years ago by stoecker

NOTE: We don't support Non-ASCII URL's ATM. Real URL is this: xn--80akeqobjv1b0d3a.xn--p1ai .

Last edited 5 years ago by stoecker

comment:2 in reply to:  1 Changed 5 years ago by bastiK

Replying to stoecker:

NOTE: We don't support Non-ASCII URL's ATM.

Who is "we" and what do you mean by "support"? If the URL validation test is outdated and therefore broken beyond repair, it has to be removed.

Real URL is this: xn--80akeqobjv1b0d3a.xn--p1ai .

Nope, the real URL is the one with Cyrillic script. What you state is just the compatibility encoding currently necessary because the protocols are not designed to cope with non-ascii characters.

Anyway, http://xn--80akeqobjv1b0d3a.xn--p1ai/ results in validation warning 'URL contains an invalid authority: xn--80akeqobjv1b0d3a.xn--p1ai'.

comment:3 Changed 5 years ago by stoecker

Nope, the real URL is the one with Cyrillic script.

No. As far as I know you register the xn-- coded URL, not the other one. At least this was the case when I registered my domains. The encoding is client side work.

comment:4 in reply to:  3 Changed 5 years ago by bastiK

Replying to stoecker:

Nope, the real URL is the one with Cyrillic script.

No. As far as I know you register the xn-- coded URL, not the other one. At least this was the case when I registered my domains. The encoding is client side work.

Okay, technically yes. But the encoded domain name is quite cryptic and not very helpful for the user. If one insists on ASCII-only URLs, then JOSM and all other clients of OSM data would need a system to display the encoded domain name.

But I think it is more practical to have a relaxed definition of a URL. There is a well-defined algorithm to convert from the ASCII form to Unicode and back. Every browser and downloader (wget, curl) can resolve this, so why bother with the cryptic xn-- form?
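
For illustration, the conversion between the two forms can be sketched with the JDK's `java.net.IDN` class, which implements IDNA 2003 (RFC 3490). This is only a minimal sketch: the class name `IdnRoundTrip` and its helper methods are hypothetical and are not part of JOSM.

```java
import java.net.IDN;

// Hypothetical demo class, not JOSM code.
class IdnRoundTrip {
    /** Converts a host name to its ASCII-compatible (xn--) form. */
    static String toAscii(String host) {
        return IDN.toASCII(host);
    }

    /** Converts an ASCII-compatible host name back to its Unicode form. */
    static String toUnicode(String host) {
        return IDN.toUnicode(host);
    }

    public static void main(String[] args) {
        // The encoded host name quoted earlier in this ticket.
        String ascii = "xn--80akeqobjv1b0d3a.xn--p1ai";
        String unicode = toUnicode(ascii);
        System.out.println(unicode);          // Unicode (Cyrillic) form
        System.out.println(toAscii(unicode)); // back to the xn-- form
    }
}
```

Because the mapping is mechanical in both directions, a client could store or display either form and derive the other on demand.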

As this is a tagging issue, in the end, it has to be decided by the OSM community, not us.

comment:5 Changed 5 years ago by Don-vip

Component: changed from Core to Core validator
Milestone: set to 14.12

comment:6 Changed 5 years ago by Don-vip

There's a ticket on the Apache side (https://issues.apache.org/jira/browse/VALIDATOR-290), but it's not active, so I'm working on a proper validation in JOSM.

comment:7 Changed 5 years ago by Don-vip

Resolution: set to fixed
Status: changed from new to closed

In 7824/josm:

fix #10862 - proper validation of IDN (Internationalized Domain Name) URLs, both in their Unicode and ASCII form => patch of Apache DomainValidator routine
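
The following is not the actual change made in 7824, which patched Apache's DomainValidator routine. It is only a rough sketch of the general idea of accepting a host name in both its Unicode and ASCII form: normalize to ASCII with `java.net.IDN`, then apply the usual label rules. The class `IdnHostCheck`, the method `isValidHost`, and the exact label rules are assumptions made for illustration.

```java
import java.net.IDN;
import java.util.regex.Pattern;

// Hypothetical sketch, not the JOSM or Apache implementation.
class IdnHostCheck {
    // One DNS label: letters, digits and hyphens, no leading or trailing
    // hyphen, at most 63 characters.
    private static final Pattern LABEL =
            Pattern.compile("[A-Za-z0-9](?:[A-Za-z0-9-]{0,61}[A-Za-z0-9])?");

    static boolean isValidHost(String host) {
        if (host == null || host.isEmpty()) {
            return false;
        }
        String ascii;
        try {
            // Unicode labels become xn-- labels; ASCII labels pass through.
            ascii = IDN.toASCII(host);
        } catch (IllegalArgumentException e) {
            return false; // input cannot be converted under RFC 3490
        }
        if (ascii.length() > 253) {
            return false;
        }
        // Keep empty labels (limit -1) so that "a..b" or a trailing dot is rejected.
        for (String label : ascii.split("\\.", -1)) {
            if (!LABEL.matcher(label).matches()) {
                return false;
            }
        }
        return true;
    }
}
```

Under these assumptions, both the Cyrillic host name from the report and its xn-- equivalent go through the same ASCII label check, so neither form needs special treatment.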

comment:8 Changed 5 years ago by stoecker

Did you submit this upstream?

comment:9 Changed 3 years ago by anonymous

Website tag containing non-ascii chars is still not accepted (there is a warning in the core validator). But it is real and can be used directly in a browser. JOSM Version 9329.

comment:10 Changed 3 years ago by Don-vip

In 10472/josm:

see #10862 - https://issues.apache.org/jira/browse/VALIDATOR-290 has been fixed, remove workaround
