Modify

Opened 3 months ago

Closed 3 weeks ago

Last modified 7 days ago

#19180 closed enhancement (fixed)

false positives from tagchecker with single letter differences

Reported by: GerdP Owned by: team
Priority: normal Milestone: 20.07
Component: Core validator Version:
Keywords: tagchecker single letter Cc: Klumbumbus

Description (last modified by Klumbumbus)

Follow up for #17055

I've done the steps described in ticket:17055#comment:55 again on updated taginfo data to get this list:

Value 'y' for key 'x' is unknown, maybe 'z' is meant?

tag know value tagcount decision
access=customercustomers1024|102|910|12✔️ added to deprecated.mapcss and ignorelist
addr:inclusion=estimatedestimate1301|0|1301|0✔️ added to deprecated.mapcss and ignorelist
building=apartmentapartments2368|84|2284|0✔️ added to deprecated.mapcss and ignorelist
building=constructieconstruction3949|0|3949|0✖️ it's a right positive (even if these buildings are not under construction for real) and it appears only on one spot overpass turbo
building=homehotel, house1462|706|756|0✔️ added to ignorelist
building=partbarn, farm1192|322|828|42✔️ added to deprecated.mapcss and ignorelist
building=ruinruins1680|8|1672|0✖️ it's a right positive
building=tankbank2902|11|2890|1✔️ added to ignorelist
building=yes,yes2136|0|2136|0✔️ only in two villages, fixed in the osm database
generator:type=solar_photovoltaic_panelssolar_photovoltaic_panel6702|8|6679|15✔️ added to deprecated.mapcss and ignorelist
historic=farmfort2125|316|1632|177✔️ added to ignorelist
lamp_mount=bent mastbent_mast10347|10334|13|0✔️ added to deprecated.mapcss and ignorelist
lamp_mount=straight maststraight_mast13124|13108|16|0✔️ added to deprecated.mapcss and ignorelist
lamp_type=electricalelectric3678|3678|0|0✔️ added to deprecated.mapcss and ignorelist
lanes=-111820|0|1820|0✔️ already in numerical.mapcss, added to ignorelist
man_made=water_tankwater_tap4243|2433|1807|3✔️ already in deprecated.mapcss, added to ignorelist
resource=claycoal1111|2|1088|21✔️ added to presets
sport=skiingskating5857|1157|4502|198✔️ already in deprecated.mapcss, added to ignorelist
toilets:disposal=pit_latrinepitlatrine3993|3993|0|0✔️ was already fixed in the osm database (taghistory)
water=poolpond2609|0|2583|26✔️ added to ignorelist

I think these tags should be treated in presets or ignoretags.cfg:
historic=farm
resource=clay
sport=skiing

Maybe add a mapcss check for these which proposes a different tag?
building=part -> building:part=yes
building=tank -> man_made=storage_tank or water_tank or ?
water=pool -> amenity=swimming_pool

The building='yes,' objects are from hotosm projects, I've commented some CS.

Attachments (3)

19180-check-deprecated.patch (9.7 KB) - added by GerdP 3 weeks ago.
19180-check-deprecated.2.patch (10.6 KB) - added by GerdP 3 weeks ago.
handle also the "Key 'x' looks like 'y' " message
19180-perf.patch (4.3 KB) - added by GerdP 7 days ago.
performance improvement

Download all attachments as: .zip

Change History (51)

comment:1 Changed 3 months ago by GerdP

Cc: Klumbumbus added
Component: CoreCore validator

comment:2 Changed 3 months ago by skyper

Keywords: tagchecker single letter added
Summary: false positives from Validatorfalse positives from tagchecker with single letter differences

comment:3 Changed 5 weeks ago by Klumbumbus

Milestone: 20.07

comment:4 Changed 4 weeks ago by Klumbumbus

Description: modified (diff)

(convert into table)

comment:5 Changed 4 weeks ago by Klumbumbus

Description: modified (diff)

comment:6 Changed 4 weeks ago by Klumbumbus

(accidently commited 3 rules in r16752)

comment:7 Changed 4 weeks ago by Klumbumbus

Hm, I thought if a tag has an explizit deprecation warning in deprecated.mapcss then it would be removed from the "Value 'y' for key 'x' is unknown, maybe 'z' is meant" warning, but it is not, so we need to add it to the ignore list too.

comment:8 in reply to:  7 Changed 4 weeks ago by GerdP

Replying to Klumbumbus:

Hm, I thought if a tag has an explizit deprecation warning in deprecated.mapcss then it would be removed from the "Value 'y' for key 'x' is unknown, maybe 'z' is meant" warning

I think the validator rules in *.mapcss are ignored. No idea if it would be possible to use them, at least it would require new code.
TagChecker learns from presets, words.cfg and ignoretags.cfg. So if it suggest a deprecated value these sources should be checked.

comment:9 Changed 4 weeks ago by Klumbumbus

I always thought if a deprecation warning exists it was automatically removed from the "Value y for key x not in presets" info-warning, but it seems I'm wrong here too. Well it is a very minor problem, only duplicate warnings in severity warning and other.

comment:10 Changed 3 weeks ago by Klumbumbus

Description: modified (diff)

comment:11 Changed 3 weeks ago by Klumbumbus

In 16760/josm:

see #19180 - deprecate access=customer, addr:inclusion=estimated, building=apartment, lamp_mount="bent mast", lamp_mount="straight mast", lamp_type=electrical (common typos) and add some tags to ignore list

comment:12 Changed 3 weeks ago by Klumbumbus

Description: modified (diff)

comment:13 Changed 3 weeks ago by Klumbumbus

Description: modified (diff)

comment:14 Changed 3 weeks ago by Klumbumbus

In 16762/josm:

see #19180 - Add most tags from deprecated.mapcss to ignorelist to clean up a bit the severity other group of validator results

comment:15 Changed 3 weeks ago by Klumbumbus

Description: modified (diff)

comment:16 in reply to:  14 Changed 3 weeks ago by skyper

Replying to Klumbumbus:

In 16762/josm:

see #19180 - Add most tags from deprecated.mapcss to ignorelist to clean up a bit the severity other group of validator results

Mmh, did you check how this works if you have some tests/groups one the ignore list or even disabled in preferences. I always thought it is a good idea to only have one warning for one issue but lately, I started to change my mind and rather have duplicates instead of missing some cases.

comment:17 Changed 3 weeks ago by GerdP

Ah, come on, if you like duplicates you don't put something into the ignore list.

comment:18 Changed 3 weeks ago by Klumbumbus

Yes, also the other way round, if someone disables tests or puts them on ignore list it is his intention to not get a warning so he probably does not like the info-warning too. Also it is clear if you disable/ignore tests you don't get the "full validator support".

comment:19 Changed 3 weeks ago by Klumbumbus

But we have another problem I was not aware before. The ignorelist contains good and bad tags/keys. And the test "Value 'y' for key 'x' is unknown, maybe 'z' is meant?" takes all entries from the ignore list as good values. So if you tag highway=fard the validator says Value 'fard' for key 'highway' is unknown, maybe 'ford' is meant?" but highway=ford is deprecated.
r16762 made this problem even worse.

Could we remove ignoretag.cfg as dictionary for good values or what was the reason for that?

comment:21 Changed 3 weeks ago by GerdP

ignoretags.cfg should contain those tags which are often used but not contained in presets. Maybe we should implement a new key D: for deprecated tags so that TagChecker can show a different message like
Value 'y' for key 'x' is unknown, maybe deprecated 'z' is meant?

comment:22 in reply to:  21 ; Changed 3 weeks ago by Klumbumbus

Replying to GerdP:

a new key D: for deprecated tags

That sounds like a solution.

so that TagChecker can show a different message like
Value 'y' for key 'x' is unknown, maybe deprecated 'z' is meant?

I think the valdator should not suggest a deprecated value. Better would be in this case if tagchecker falls back to the message "Value y for key x not in presets".

Value fard for key highway not in presets
is better than
Value 'fard' for key 'highway' is unknown, maybe deprecated 'ford' is meant?

comment:23 Changed 3 weeks ago by GerdP

OK, I'll try to code this.

comment:24 in reply to:  22 Changed 3 weeks ago by Klumbumbus

Replying to Klumbumbus:

Replying to GerdP:

a new key D: for deprecated tags

That sounds like a solution.

Ah, no "bad" entries need the distinction between K E S and F too. See e.g. the deprecated entries:

K:highway=ford
E:power_source
S:color:
F::color

comment:25 Changed 3 weeks ago by Klumbumbus

We could define 4 new letters or create an own file for the deprecated entries?

comment:26 Changed 3 weeks ago by GerdP

Looking at the rules in deprecated.mapcss there are also cases where a tag is only deprecated when it is not used with another tag.

*[place_name][!name] {
  throwWarning: tr("{0} should be replaced with {1}", "{0.key}", "{1.key}");
  fixChangeKey: "place_name => name";
}

We cannot code that with the simple format used in igoretags.cfg. Maybe I can change the code so that TagChecker can call its own instance of MapCssTagChecker to verify that the suggested tag(s) is(are) not flagged as deprecated.

comment:27 in reply to:  26 Changed 3 weeks ago by Klumbumbus

Replying to GerdP:

Looking at the rules in deprecated.mapcss there are also cases where a tag is only deprecated when it is not used with another tag.

Yes I noticed too that this requires some "human thinking". Some rules are split (e.g. with and without autofix) or not for all object types. Some rules use e.g. node as object selector because the tag is only deprecated on nodes and some because it appears only on nodes (* was not used as selector there for performance reasons).
Thats why I added the hand-picked list in r16762.

Maybe I can change the code so that TagChecker can call its own instance of MapCssTagChecker to verify that the suggested tag(s) is(are) not flagged as deprecated.

If that would be possible that would be good. Else I think we can live with a hand-picked list?

comment:28 Changed 3 weeks ago by GerdP

BTW: Your latest changes contain some typos:

2020-07-14 16:40:23.123 SEVERE: Invalid line in resource://data/validator/ignoretags.cfg : 'generator:location' does not contain '='
2020-07-14 16:40:23.124 SEVERE: Invalid line in resource://data/validator/ignoretags.cfg : 'monitoring:river_level' does not contain '='
2020-07-14 16:40:23.124 SEVERE: Invalid line in resource://data/validator/ignoretags.cfg : 'fire_hydrant:flow_capacity' does not contain '='
2020-07-14 16:40:23.124 SEVERE: Invalid line in resource://data/validator/ignoretags.cfg : 'fire_hydrant:water_source' does not contain '='
2020-07-14 16:40:23.125 SEVERE: Invalid line in resource://data/validator/ignoretags.cfg : 'is_in' does not contain '='
2020-07-14 16:40:23.125 SEVERE: Invalid line in resource://data/validator/ignoretags.cfg : 'building:type' does not contain '='
2020-07-14 16:40:23.125 SEVERE: Invalid line in resource://data/validator/ignoretags.cfg : 'recycling:metal' does not contain '='

comment:29 Changed 3 weeks ago by Klumbumbus

In 16763/josm:

see #19180 - fix typos in ignorelist, remove police=yes from ignorelist (has warning but is also in presets for an unknown police facility), remove deprecated vending=photos from presets

Changed 3 weeks ago by GerdP

comment:30 Changed 3 weeks ago by GerdP

The patch is not complete yet, it only suppresses the "Value 'y' for key 'x' is unknown, maybe 'z' is meant?" messages.
I'll do the same for "Key 'x' looks like 'y'."

Last edited 3 weeks ago by GerdP (previous) (diff)

Changed 3 weeks ago by GerdP

handle also the "Key 'x' looks like 'y' " message

comment:31 Changed 3 weeks ago by GerdP

  • For highway=fard we now get Info - Presets do not contain property value - Value 'fard' for key 'highway' not in presets.
  • For hihgway=ford we now get Info - Presets do not contain property key - Key 'hihgway' not in presets. instead of Warning - Misspelled property key - Key 'hihgway' looks like 'highway'.
  • For hihgway=stop we get still get Warning Misspelled property key - Key 'hihgway' looks like 'highway'.

I think I prefer the 1st patch which would produce the warning also for hihgway=ford

comment:32 in reply to:  31 Changed 3 weeks ago by Klumbumbus

Replying to GerdP:

I think I prefer the 1st patch which would produce the warning also for hihgway=ford

Yes that's OK as it doesn't directly suggest a deprecated tag.

comment:33 Changed 3 weeks ago by GerdP

Wait a minute. Why do we have highway=ford in ignoredtags.cfg when we also have a test in deprecated.mapcss?
I think my patch is completely unnecessary, we just have to remove the entries from ignoredtags.cfg.

comment:34 in reply to:  33 Changed 3 weeks ago by Klumbumbus

Replying to GerdP:

Why do we have highway=ford in ignoredtags.cfg when we also have a test in deprecated.mapcss?

To suppress the unnecessary info-warning "value ford for key highway not in presets".

comment:35 Changed 3 weeks ago by GerdP

Well, with my patch and the entry in ignoretags.cfg we still have this warning. I coded it like that because of your suggestion
in comment:22

Better would be in this case if tagchecker falls back to the message "Value y for key x not in presets".

So, best solution would be to produce no message for a tag which is flagged as deprecated, right?
Means, we have to change the logic of the patch and we can remove deprecated tags from ignoretags.cfg

comment:36 Changed 3 weeks ago by GerdP

No, sorry, I should not make assumptions that late in the night. Tested again the 1st patch applied on r16780 with all 4 variants:

  • highway=ford gives Warning - deprecated tagging - highway=ford is deprecated, use ford=yes instead
  • highway=fard gives Other - Presets do not contain property value - Value 'fard' for key 'highway' not in presets.
  • hihgway=ford gives Warning - Misspelled property key - Key 'hihgway' looks like 'highway'.
  • hihgway=fard also gives Warning - Misspelled property key - Key 'hihgway' looks like 'highway'.

Without the patch:

  • highway=ford same
  • highway=fard gives wrong Warning - Unknown property value - Value 'fard' for key 'highway' is unknown, maybe 'ford' is meant?
  • hihgway=ford same
  • hihgway=fard same

Without highway=ford in ignoredtags.cfg we see two warnings for highway=ford

  • OK: deprecated tagging - highway=ford is deprecated, use ford=yes instead
  • KO: Unknown property value - Value 'ford' for key 'highway' is unknown, maybe 'road' is meant?

The other variants are as with the patch.

So, yes, the 1st patch seems to be the best solution so far. Commit it?

comment:37 Changed 3 weeks ago by Klumbumbus

I just noticed, "hihgway" might not be the best example mispelled key for testing as it is in words.cfg too.

other example which is not in words.cfg:

with patch v1 highwoy=ford gives Other - Presets do not contain property key - Key 'highwoy' not in presets. which is fine.

So I think patch v1 is fine to commit.

comment:38 Changed 3 weeks ago by GerdP

In 16784/josm:

see #19180: false positives from tagchecker with single letter differences

  • avoid to produce "Value 'y' for key 'x' is unknown, maybe 'z' is meant?" when x=z produces a deprecated message. It uses all *.mapcss rules which are stored in files ending with deprecated.mapcss, so this will fail if user changed the name to something else, but it should work with e.g. my_deprecated.mapcss

comment:39 in reply to:  38 Changed 3 weeks ago by skyper

Replying to GerdP:

In 16784/josm:

see #19180: false positives from tagchecker with single letter differences

  • avoid to produce "Value 'y' for key 'x' is unknown, maybe 'z' is meant?" when x=z produces a deprecated message. It uses all *.mapcss rules which are stored in files ending with deprecated.mapcss, so this will fail if user changed the name to something else, but it should work with e.g. my_deprecated.mapcss

The correct ending of a local validator file is .validator.mapcss. At least, that is the ending in the file manager when adding a new rule.

comment:40 Changed 3 weeks ago by GerdP

I mentioned this because for a developer it is comfortable to configure the path to the local svn repo. This means that a change to the file in the repo triggers the reload of the rules.
If one changes the local file name to something like deprecated_experimental.mapcss it may cause confusion.

comment:41 Changed 3 weeks ago by Klumbumbus

Description: modified (diff)

comment:42 Changed 3 weeks ago by Klumbumbus

Description: modified (diff)

comment:43 Changed 3 weeks ago by Klumbumbus

In 16787/josm:

see #19180

  • deprecate generator:type=solar_photovoltaic_panels and building=part
  • ignore building=home, building=tank, water=pool, historic=farm
  • add resource=clay
  • adjust fixme icons to avoid possible rendering artifacts

comment:44 Changed 3 weeks ago by Klumbumbus

Description: modified (diff)

comment:45 Changed 3 weeks ago by Klumbumbus

Resolution: fixed
Status: newclosed

comment:46 Changed 7 days ago by GerdP

I've just noticed that r16784 introduced a performance problem when data contains some ways with e.g. natural=shoal.
I work on a fix which requires no changes in i18n strings. Is it OK to commit that before new JOSM is released?

Last edited 7 days ago by GerdP (previous) (diff)

Changed 7 days ago by GerdP

Attachment: 19180-perf.patch added

performance improvement

comment:47 Changed 7 days ago by GerdP

The patch

  • avoids to modify the original object as this requires a writeLock and probably causes trouble with locked files
  • avoids to call the rather complex filterDeprecatedTags() method by checking LEVENSHTEIN_DISTANCE first

comment:48 Changed 7 days ago by GerdP

Committed with r16812, see #19598.

Modify Ticket

Change Properties
Set your email in Preferences
Action
as closed The owner will remain team.
as The resolution will be set.
The resolution will be deleted.

Add Comment


E-mail address and name can be saved in the Preferences.

 
Note: See TracTickets for help on using tickets.