#11774 closed enhancement (fixed)
[Patch] Warn about obvious misspelled tag keys
Reported by: | mdk | Owned by: | team |
---|---|---|---|
Priority: | normal | Milestone: | 15.09 |
Component: | Core validator | Version: | latest |
Keywords: | Cc: | Klumbumbus |
Description
Inspired by http://www.openstreetmap.org/user/marczoutendijk/diary/35512, mentioned in the current Wochennotiz Nr. 263, I have now extended the validator to check tag keys, as I already did for tag values (see #11498):
A lot of the (faulty) keys I found are of the uppercase/lowercase type: `Name` when `name` was meant, for instance. Almost any regular key (`amenity`, `shop`, `tourism`, `highway`, `landuse` etc.) appears in a misspelled version in the database (`tourims`, `land-use` etc.). Added punctuation (`name;` or `name,` or `name-`) also accounts for quite a number of those one-time-only keys.
My patch doesn't cover all typos, but it covers all uppercase/lowercase cases and some of the cases with additional or misplaced punctuation, like `land-use`, `name;`, `name,` or `name-`.
More generally, I first normalize all keys found in the presets:
- convert to lower case
- replace `-`, `:` and SPACE with `_`
- remove all leading and trailing `-`, `_`, `;`, `:` and `,`
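The three normalization steps can be sketched roughly as follows. This is only an illustration of the list above, not the code from the attached patches; the class name `KeyHarmonizer` and the method name `harmonizeKey` are assumptions made for this example.

```java
import java.util.Locale;

/** Hypothetical helper illustrating the three normalization steps. */
final class KeyHarmonizer {

    /** Characters stripped from both ends of a key (taken from the list above). */
    private static final String TRIM_CHARS = "-_;:,";

    private KeyHarmonizer() {
        // static helper only
    }

    static String harmonizeKey(String key) {
        // Step 1: convert to lower case, independent of the default locale
        String k = key.toLowerCase(Locale.ENGLISH);
        // Step 2: replace '-', ':' and space with '_'
        k = k.replace('-', '_').replace(':', '_').replace(' ', '_');
        // Step 3: remove all leading and trailing '-', '_', ';', ':' and ','
        int start = 0;
        int end = k.length();
        while (start < end && TRIM_CHARS.indexOf(k.charAt(start)) >= 0) {
            start++;
        }
        while (end > start && TRIM_CHARS.indexOf(k.charAt(end - 1)) >= 0) {
            end--;
        }
        return k.substring(start, end);
    }
}
```

Under these assumptions `Name`, `name;`, `name,` and `name-` all normalize to `name`, while `land-use` becomes `land_use`.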
When a key would trigger the "Presets do not contain property key" warning during validation, I now check whether this key (also normalized) matches one of the normalized preset keys. If I find a match, I produce a warning like `Key 'Building' looks like 'building'.` with an auto-fix that replaces the key with the non-normalized key found in the presets.
See patch validateKeys1.diff.
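A minimal sketch of that lookup, assuming the preset keys are available as a plain collection of strings. The class `KeyLookupSketch` and its methods are hypothetical and only mirror the description; the real validator works on JOSM's preset and test classes.

```java
import java.util.Collection;
import java.util.HashMap;
import java.util.HashSet;
import java.util.Locale;
import java.util.Map;
import java.util.Optional;
import java.util.Set;

/** Hypothetical lookup from a normalized key to the preset key it resembles. */
final class KeyLookupSketch {
    private final Set<String> presetKeys = new HashSet<>();
    private final Map<String, String> harmonizedKeys = new HashMap<>();

    KeyLookupSketch(Collection<String> keysFromPresets) {
        for (String key : keysFromPresets) {
            presetKeys.add(key);
            // Keep the first preset key seen for each normalized form
            harmonizedKeys.putIfAbsent(harmonize(key), key);
        }
    }

    /** Returns the preset key to suggest, or empty if the key is known or has no match. */
    Optional<String> suggest(String key) {
        if (presetKeys.contains(key)) {
            return Optional.empty(); // key is a known preset key, nothing to warn about
        }
        return Optional.ofNullable(harmonizedKeys.get(harmonize(key)));
    }

    /** Same three steps as described above, written with regular expressions. */
    static String harmonize(String key) {
        return key.toLowerCase(Locale.ENGLISH)
                .replaceAll("[-: ]", "_")
                .replaceAll("^[-_;:,]+|[-_;:,]+$", "");
    }
}
```

With preset keys `building`, `landuse` and `name`, `suggest("Building")` yields `building` and `suggest("name;")` yields `name`, while `suggest("land-use")` stays empty because `land_use` is not the normalized form of any of these keys.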
I also attach an alternative patch, in which I merge this check with the existing spell-checking feature that uses `data/validator/words.cfg` and the other optional dictionaries found by `Main.pref.getCollection(PREF_SOURCES, DEFAULT_SOURCES)`.
See patch validateKeys2.diff.
BTW, only the second patch warns about `Key 'land-use' looks like 'landuse'.`, because words.cfg contains `+landuse -land_use` and `land-use` is normalized to `land_use` :)
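To illustrate why the merged approach catches `land-use`, here is a rough sketch that feeds dictionary entries into the same normalized map. It assumes that a `+` token names a correct key and that the following `-` tokens are its misspellings, as in the `+landuse -land_use` example; the class `WordsCfgSketch` is hypothetical and does not read the real file.

```java
import java.util.HashMap;
import java.util.List;
import java.util.Locale;
import java.util.Map;

/** Hypothetical sketch: build a normalized-key map from "+correct -misspelled" entries. */
final class WordsCfgSketch {

    private WordsCfgSketch() {
        // static helper only
    }

    static Map<String, String> parse(List<String> lines) {
        Map<String, String> harmonizedKeys = new HashMap<>();
        String correct = null;
        for (String line : lines) {
            // Tokenize on whitespace so one-line and multi-line layouts both work
            for (String token : line.trim().split("\\s+")) {
                if (token.startsWith("+")) {
                    correct = token.substring(1);
                    harmonizedKeys.putIfAbsent(harmonize(correct), correct);
                } else if (token.startsWith("-") && correct != null) {
                    // A misspelling maps to the most recent correct key
                    harmonizedKeys.putIfAbsent(harmonize(token.substring(1)), correct);
                }
            }
        }
        return harmonizedKeys;
    }

    /** Same normalization steps as in the description. */
    static String harmonize(String key) {
        return key.toLowerCase(Locale.ENGLISH)
                .replaceAll("[-: ]", "_")
                .replaceAll("^[-_;:,]+|[-_;:,]+$", "");
    }
}
```

Looking up `harmonize("land-use")`, i.e. `land_use`, in the map built from `+landuse -land_use` returns `landuse`, which is the suggestion described above.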
With the second patch, we could also reduce the size of words.cfg by eliminating all misspelled keys that are covered by the generic approach. We could also cover the `tourims` case by adding it to words.cfg, but that is a different story...
Attachments (3)
Change History (11)
by , 10 years ago
Attachment: | validateKeys1.diff added |
---|
comment:1 by , 10 years ago
Cc: | added |
---|
follow-up: 3 comment:2 by , 10 years ago
Hi, thank you for your contributions. Some remarks:
- some tests are very welcome, see attachment:TagCheckerTest.java (to be placed in `test/unit/org/openstreetmap/josm/data/validation/tests/TagCheckerTest.java`) for a starting example
- `org.openstreetmap.josm.data.validation.tests.TagChecker#prettifyKey` is rather a `harmonizeKey`?
- `org.openstreetmap.josm.data.validation.tests.TagChecker#addKey` did not output a single warning on the default presets. Are all those tests needed?
by , 10 years ago
Attachment: | TagCheckerTest.java added |
---|
follow-up: 4 comment:3 by , 10 years ago
Replying to simon04:
Hi, thank you for your contributions. Some remarks:
- some tests are very welcome, see attachment:TagCheckerTest.java (to be placed in `test/unit/org/openstreetmap/josm/data/validation/tests/TagCheckerTest.java`) for a starting example
I wasn't able to execute the tests. I always get the error:
ERROR: java.io.IOException: Failed to open input stream for resource 'resource://data/preferences.xsd'
and the `preferences.xml` is replaced by an "empty" version. What is the correct configuration for Eclipse to run these tests?
`org.openstreetmap.josm.data.validation.tests.TagChecker#prettifyKey` is rather a `harmonizeKey`?
Yes. But then we should also rename prettifyValue.
`org.openstreetmap.josm.data.validation.tests.TagChecker#addKey` did not output a single warning on the default presets. Are all those tests needed?
This is a paranoid test to detect errors in presets and/or spelling files. Let's assume one preset uses the wrong key `Landuse` and another one the correct `landuse`: this test will detect such errors.
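The idea behind this check can be sketched as below: two different keys that normalize to the same form are reported as a likely typo in the source data. The class `PresetKeyConflictSketch` and the `findConflicts` method are hypothetical names for this illustration and are not taken from the patch.

```java
import java.util.ArrayList;
import java.util.Collection;
import java.util.HashMap;
import java.util.List;
import java.util.Locale;
import java.util.Map;

/** Hypothetical sketch: report distinct keys that share one normalized form. */
final class PresetKeyConflictSketch {

    private PresetKeyConflictSketch() {
        // static helper only
    }

    static List<String> findConflicts(Collection<String> keys) {
        Map<String, String> seen = new HashMap<>();
        List<String> conflicts = new ArrayList<>();
        for (String key : keys) {
            // putIfAbsent returns the key stored earlier for this normalized form, if any
            String previous = seen.putIfAbsent(harmonize(key), key);
            if (previous != null && !previous.equals(key)) {
                conflicts.add(previous + " / " + key);
            }
        }
        return conflicts;
    }

    /** Same normalization steps as in the description. */
    static String harmonize(String key) {
        return key.toLowerCase(Locale.ENGLISH)
                .replaceAll("[-: ]", "_")
                .replaceAll("^[-_;:,]+|[-_;:,]+$", "");
    }
}
```

For the keys `landuse`, `Landuse`, `name`, `landuse` this reports the single pair `landuse / Landuse`; repeated identical keys are not reported.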
follow-ups: 5 6 comment:4 by , 10 years ago
Replying to mdk:
I wasn't able to execute the tests. I always get the error:
ERROR: java.io.IOException: Failed to open input stream for resource 'resource://data/preferences.xsd'
and the `preferences.xml` is replaced by an "empty" version. What is the correct configuration for Eclipse to run these tests?
How did you run the tests? Running `ant clean dist && ant test` seems to work for me. See also InstallNotes#Compiling.
Yes. But then we should also rename prettifyValue.
Yes.
This is a paranoid test to detect errors in presets and/or spelling files. Let's assume one preset uses the wrong key `Landuse` and another one the correct `landuse`: this test will detect such errors.
I'm not aware of any typo of this kind in the presets. I personally would be reluctant when testing conditions which hardly ever will occur …
comment:5 by , 10 years ago
Replying to simon04:
Replying to mdk:
I wasn't able to execute the tests. I always get the error:
ERROR: java.io.IOException: Failed to open input stream for resource 'resource://data/preferences.xsd'
and the `preferences.xml` is replaced by an "empty" version. What is the correct configuration for Eclipse to run these tests?
How did you run the tests? Running `ant clean dist && ant test` seems to work for me. See also InstallNotes#Compiling.
From the context menu "Run As" -> "JUnit Test"
This is a paranoid test to detect errors in presets and/or spelling files. Let's assume one preset uses the wrong key `Landuse` and another one the correct `landuse`: this test will detect such errors.
I'm not aware of any typo of this kind in the presets. I personally would be reluctant when testing conditions which hardly ever will occur …
Yes. That's what my tests show too :-)
We could remove this test easily:
    private static void addKey(String harmonizedKey, String key) {
        String otherKey = harmonizedKeys.get(harmonizedKey);
        if (otherKey == null) {
            harmonizedKeys.put(harmonizedKey, key);
            // Main.debug(prettyKey + " -> " + key);
        }
    }
But keep in mind: I check the presets and the spelling files together. And the harmonizedKeys map is only created on the first validator run.
comment:6 by , 10 years ago
Replying to simon04:
This is a paranoid test to detect errors in presets and/or spelling files. Let's assume one preset uses the wrong key `Landuse` and another one the correct `landuse`: this test will detect such errors.
I'm not aware of any typo of this kind in the presets. I personally would be reluctant when testing conditions which hardly ever will occur …
We had typos in the internal preset for `color/colour` and `protected_class` (https://josm.openstreetmap.de/ticket/10691#comment:14)
comment:8 by , 10 years ago
Milestone: | → 15.09 |
---|
simple patch