
Opened 11 days ago

Closed 8 days ago

Last modified 8 days ago

#18360 closed defect (fixed)

validator: wikipedia language codes are not up to date

Reported by: pyrog
Owned by: Don-vip
Priority: normal
Milestone: 19.11
Component: Core validator
Version:
Keywords: template_report wikipedia
Cc:

Description

Hi,

The static list of language codes differs from the list returned by the Wikimedia Commons API (which lists the wiki sites that actually exist and their languages).

wikipedia.mapcss allows 127 languages that do not yet have a Wikipedia site, and 8 languages are missing.
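
A comparison like the one described above can be sketched with two set differences. This is only an illustration: the function name `compare_prefixes` and the short sample lists are hypothetical and are not the real contents of wikipedia.mapcss or of the API response.

```python
def compare_prefixes(static_list, live_list):
    """Return (stale, missing).

    stale:   codes accepted by the static list but absent from the live list
    missing: codes present in the live list but absent from the static list
    """
    static_set, live_set = set(static_list), set(live_list)
    stale = sorted(static_set - live_set)
    missing = sorted(live_set - static_set)
    return stale, missing


# Illustrative samples only (assumption): not the actual lists.
static_sample = ["de", "en", "fr", "sje", "de-formal"]
live_sample = ["de", "en", "fr", "szy"]

stale, missing = compare_prefixes(static_sample, live_sample)
print("stale:", stale)
print("missing:", missing)
```

With the real lists, `stale` would hold the 127 codes and `missing` the 8 codes mentioned above.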

What steps will reproduce the problem?

  1. Add the following tag: wikipedia=sje:Dummy
  2. Press the validation button

What is the expected result?

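The validator should show the warning "la clé wikipedia contient un préfixe de langue inconnu" (in English: "the wikipedia key contains an unknown language prefix").
(Arguably this should be an error, since the wiki website cannot be reached.)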
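
The expected behaviour can be sketched as a prefix check on the tag value. This is not the MapCSS rule from wikipedia.mapcss; `KNOWN_PREFIXES` is a shortened, hypothetical stand-in for the real list, and `check_wikipedia_tag` and its messages are made up for illustration.

```python
import re

# Hypothetical, shortened stand-in for the validator's list of prefixes.
KNOWN_PREFIXES = {"de", "en", "fr", "zh-min-nan"}

# A value is expected to look like "<prefix>:<title>".
VALUE_RE = re.compile(r"^([^:]+):(.+)$")


def check_wikipedia_tag(value, known=KNOWN_PREFIXES):
    """Return a warning string, or None when the value looks fine."""
    match = VALUE_RE.match(value)
    if match is None:
        return "wikipedia value is not in 'lang:Title' form"
    prefix = match.group(1)
    if prefix not in known:
        return f"unknown language prefix '{prefix}'"
    return None


print(check_wikipedia_tag("sje:Dummy"))
print(check_wikipedia_tag("en:Dummy"))
```

With an up-to-date list, `sje:Dummy` is flagged because no sje.wikipedia.org exists, which is the result expected from the reproduction steps.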

What happens instead?

No warning or error message is displayed by the validator.
When opening the link (with the tag2link plugin), the web browser cannot open https://sje.wikipedia.org/wiki/Dummy

If the Wikipedia plugin is installed, it displays this error: "[Wiki] Unknown Wikipedia language prefix 'sje'!"

Please provide any additional information below. Attach a screenshot if possible.

https://josm.openstreetmap.de/browser/josm/trunk/data/validator/wikipedia.mapcss#L13

For information, the Wikipedia plugin dynamically downloads the up-to-date list of Wikipedia languages:
https://gitlab.com/JOSM/plugin/wikipedia/blob/master/src/main/java/org/wikipedia/WikipediaApp.java#L58
https://gitlab.com/JOSM/plugin/wikipedia/blob/master/src/main/java/org/wikipedia/validator/WikipediaValueFormat.java#L97

BR,
Yves

URL:https://josm.openstreetmap.de/svn/trunk
Repository:UUID: 0c6e7542-c601-0410-84e7-c038aed88b3b
Last:Changed Date: 2019-11-24 21:23:35 +0100 (Sun, 24 Nov 2019)
Build-Date:2019-11-25 02:31:03
Revision:15541
Relative:URL: ^/trunk

Identification: JOSM/1.5 (15541 fr) Mac OS X 10.14.6
OS Build number: Mac OS X 10.14.6 (18G95)
Memory Usage: 596 MB / 1820 MB (126 MB allocated, but free)
Java version: 1.8.0_231-b11, Oracle Corporation, Java HotSpot(TM) 64-Bit Server VM
Screen: Display 69732928 1280x800
Maximum Screen Size: 1280x800
VM arguments: [-Djava.security.policy=file:<java.home>/lib/security/javaws.policy, -DtrustProxy=true, -Djnlpx.home=<java.home>/bin, -Djava.security.manager, -Djnlpx.origFilenameArg=${HOME}/Library/Application Support/Oracle/Java/Deployment/cache/6.0/31/583aa85f-4a297e61, -Djnlpx.remove=false, -Dsun.awt.warmup=true, -Djava.util.Arrays.useLegacyMergeSort=true, -Djnlpx.heapsize=NULL,2048m, -Dmacosx.jnlpx.dock.name=JOSM (development version), -Dmacosx.jnlpx.dock.icon=${HOME}/Library/Application Support/Oracle/Java/Deployment/cache/6.0/25/4c122699-72a21903.icns, -Djnlp.application.href=https://josm.openstreetmap.de/download/josm-latest.jnlp , -Djnlpx.jvm="<java.home>/bin/java"]
Dataset consistency test: No problems found

Plugins:
+ CADTools (1008)
+ PicLayer (35104)
+ SeaMapEditor (34908)
+ apache-commons (35092)
+ apache-http (34908)
+ cadastre-fr (35194)
+ ejml (35122)
+ geotools (35169)
+ jaxb (35014)
+ jna (34908)
+ jts (35122)
+ opendata (35179)
+ reverter (35226)
+ tag2link (35149)
+ utilsplugin2 (35230)
+ wikipedia (1.1.3)

Tagging presets:
+ https://josm.openstreetmap.de/josmfile?page=Presets/Towers&zip=1
+ https://raw.githubusercontent.com/OpenNauticalChart/josm/master/INT-1-preset.xml
+ https://josm.openstreetmap.de/josmfile?page=Presets/Telecom&zip=1

Validator rules:
+ https://github.com/Jungle-Bus/transport_mapcss/raw/gh-pages/transport.validator.zip
+ ${HOME}/Downloads/Rules_Pictures.validator.mapcss

Last errors/warnings:
- W: Identifiant de territoire inconnu: JA
- W: Identifiant de territoire inconnu: JA
- W: Identifiant de territoire inconnu: JA
- W: Identifiant de territoire inconnu: JA
- W: Identifiant de territoire inconnu: JA
- W: Identifiant de territoire inconnu: JA
- W: java.net.SocketTimeoutException: connect timed out
- W: Already here java.net.SocketException: Network is unreachable (connect failed)
- E: java.net.SocketTimeoutException: connect timed out
- W: org.openstreetmap.josm.io.OsmTransferException: Impossible de joindre le serveur. Veuillez vérifier votre connexion Internet.. Cause : java.net.SocketTimeoutException: connect timed out

Attachments (1)

wikipedia language code.csv (7.2 KB) - added by pyrog 11 days ago.


Change History (10)

comment:1 Changed 11 days ago by pyrog

I can't attach a .CSV file that compares both lists.

comment:2 in reply to:  1 Changed 11 days ago by stoecker

Replying to pyrog:

I can't attach a .CSV file that compares both lists.

Either gzip/zip it or leave the URLs out (or make them non-URLs). Better still would be to provide a patch containing the necessary changes.

Changed 11 days ago by pyrog

Attachment: wikipedia language code.csv added

comment:3 Changed 11 days ago by pyrog

I removed the URLs, thanks 😀

Writing a patch would be better, but I'm not familiar with Trac and don't have a Java toolchain.
The best approach may be not to edit the regex manually, but to generate it (semi-)automatically. That is a design issue and needs JOSM/Java specialists 😉
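
As a rough sketch of the (semi-)automatic approach suggested above, the alternation could be generated from a list of codes instead of being edited by hand. The helper name `build_prefix_regex` and the sample codes are assumptions for illustration; the real rule in wikipedia.mapcss may be shaped differently.

```python
import re


def build_prefix_regex(codes):
    """Build a pattern matching '<code>:' at the start of a tag value."""
    # De-duplicate and sort so regenerated output produces small, stable diffs.
    # re.escape keeps characters such as '-' literal.
    alternation = "|".join(re.escape(code) for code in sorted(set(codes)))
    return f"^({alternation}):"


# Illustrative input only (assumption), not the full list of Wikipedias.
pattern = build_prefix_regex(["en", "de", "zh-min-nan", "be-x-old", "de"])

for value in ("en:Dummy", "zh-min-nan:Dummy", "sje:Dummy", "de-formal:Dummy"):
    print(value, bool(re.match(pattern, value)))
```

Because the pattern requires the colon right after the code, `de-formal:Dummy` does not match even though `de` is in the list.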

comment:4 Changed 11 days ago by Klumbumbus

Component: Core → Core validator

comment:5 Changed 10 days ago by floscher

As of today, these are the available language prefixes:

ab|ace|ady|af|ak|als|am|an|ang|ar|arc|arz|as|ast|atj|av|ay|az|azb|ba|ban|bar|bat-smg|bcl|be|be-x-old|bg|bh|bi|bjn|bm|bn|bo|bpy|br|bs|bug|bxr|ca|cbk-zam|cdo|ce|ceb|ch|chr|chy|ckb|co|cr|crh|cs|csb|cu|cv|cy|da|de|din|diq|dsb|dty|dv|dz|ee|el|eml|en|eo|es|et|eu|ext|fa|ff|fi|fiu-vro|fj|fo|fr|frp|frr|fur|fy|ga|gag|gan|gcr|gd|gl|glk|gn|gom|gor|got|gu|gv|ha|hak|haw|he|hi|hif|hr|hsb|ht|hu|hy|hyw|ia|id|ie|ig|ik|ilo|inh|io|is|it|iu|ja|jam|jbo|jv|ka|kaa|kab|kbd|kbp|kg|ki|kk|kl|km|kn|ko|koi|krc|ks|ksh|ku|kv|kw|ky|la|lad|lb|lbe|lez|lfn|lg|li|lij|lmo|ln|lo|lrc|lt|ltg|lv|mai|map-bms|mdf|mg|mhr|mi|min|mk|ml|mn|mnw|mr|mrj|ms|mt|mwl|my|myv|mzn|na|nah|nap|nds|nds-nl|ne|new|nl|nn|no|nov|nqo|nrm|nso|nv|ny|oc|olo|om|or|os|pa|pag|pam|pap|pcd|pdc|pfl|pi|pih|pl|pms|pnb|pnt|ps|pt|qu|rm|rmy|rn|ro|roa-rup|roa-tara|ru|rue|rw|sa|sah|sat|sc|scn|sco|sd|se|sg|sh|shn|si|simple|sk|sl|sm|sn|so|sq|sr|srn|ss|st|stq|su|sv|sw|szl|szy|ta|tcy|te|tet|tg|th|ti|tk|tl|tn|to|tpi|tr|ts|tt|tum|tw|ty|tyv|udm|ug|uk|ur|uz|ve|vec|vep|vi|vls|vo|wa|war|wo|wuu|xal|xh|xmf|yi|yo|za|zea|zh|zh-classical|zh-min-nan|zh-yue|zu

Plus these prefixes, but these Wikipedias are closed:

aa|cho|ho|hz|ii|kj|kr|mh|mus|ng

They should not be extracted from the URL mentioned in https://josm.openstreetmap.de/browser/josm/trunk/data/validator/wikipedia.mapcss?rev=15473#L11 but rather from https://www.wikidata.org/w/api.php?action=sitematrix&formatversion=2 .
The former lists every language available on Wikimedia sites, but not all of those languages have their own Wikipedia (e.g. de-formal is a language that you can select in the language picker, but there is no Wikipedia for it). The latter URL lists all languages for which a Wikipedia exists.
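
A minimal sketch of extracting the prefixes from a sitematrix response could look like the following. It works on an already decoded JSON object and does no network access. The field names (`site`, `code`, `url`, `closed`) and the numeric top-level keys reflect my understanding of the response shape and should be checked against the live output; `SAMPLE` is a tiny hand-made stand-in, and the `xx-example` entry in it is invented.

```python
from urllib.parse import urlparse


def wikipedia_prefixes(sitematrix, include_closed=False):
    """Collect the subdomain of every Wikipedia found in a sitematrix object."""
    prefixes = set()
    for key, entry in sitematrix.items():
        # Assumption: per-language entries use numeric keys; other keys
        # such as "count" and "specials" are skipped.
        if not key.isdigit():
            continue
        for site in entry.get("site", []):
            if site.get("code") != "wiki":  # assumption: "wiki" marks a Wikipedia
                continue
            if site.get("closed") and not include_closed:
                continue
            host = urlparse(site["url"]).hostname
            prefixes.add(host.split(".")[0])
    return sorted(prefixes)


# Hand-made sample (assumption), not real API output.
SAMPLE = {
    "count": 3,
    "0": {"code": "aa", "site": [
        {"url": "https://aa.wikipedia.org", "code": "wiki", "closed": True}]},
    "1": {"code": "de", "site": [
        {"url": "https://de.wikipedia.org", "code": "wiki"},
        {"url": "https://de.wiktionary.org", "code": "wiktionary"}]},
    "2": {"code": "xx-example", "site": [
        {"url": "https://xx-example.wiktionary.org", "code": "wiktionary"}]},
    "specials": [],
}

open_only = wikipedia_prefixes(SAMPLE)
with_closed = wikipedia_prefixes(SAMPLE, include_closed=True)
print(open_only, with_closed)
```

Taking the prefix from the site URL rather than from the language code is a deliberate choice here: the tag prefix has to match the subdomain that the browser will open.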

comment:6 Changed 8 days ago by Don-vip

Keywords: wikipedia added
Milestone: 19.11
Owner: changed from team to Don-vip
Status: new → assigned

comment:7 Changed 8 days ago by Don-vip

Resolution: fixed
Status: assigned → closed

In 15545/josm:

fix #18360 - fix list of wikipedia language prefixes

comment:8 Changed 8 days ago by floscher

At https://josm.openstreetmap.de/changeset/15545/josm it looks to me as if there are encoding issues in the file: e.g. абвабв
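
The ticket does not say what caused the problem. One common way for Cyrillic text to end up garbled is that UTF-8 bytes are decoded with a single-byte encoding; the sketch below only illustrates that general mechanism and is an assumption, not a diagnosis of changeset 15545.

```python
original = "абв"

# Decoding UTF-8 bytes as Latin-1 turns each Cyrillic letter into two characters.
garbled = original.encode("utf-8").decode("latin-1")

# As long as nothing was lost, reversing the two steps restores the text.
repaired = garbled.encode("latin-1").decode("utf-8")

print(len(original), len(garbled))
print(repaired == original)
```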

comment:9 Changed 8 days ago by Don-vip

In 15546/josm:

see #18360 - fix encoding issue
