Opened 6 months ago
Last modified 3 months ago
#8211 reopened enhancement
Automatic data corrector functionality
| Reported by: | stoecker | Owned by: | team |
|---|---|---|---|
| Priority: | normal | Component: | Core |
| Version: | | Keywords: | |
| Cc: | bastik, Don-vip | | |
Description
A suggestion caused by the color-->colour change.
We already have a "drop useless tags" option in JOSM. Another problem with OSM is the increasing number of diverging tagging methods and also spelling mistakes.
I would suggest an option to auto-correct these when uploading, just like the option for dropping keys. With each upload we would then unify the database a little more: fix spelling mistakes, remove leading or trailing spaces, fix lowercase/uppercase issues, and also remove our own errors like "color".
It is clear that adding new entries to such a list must be done carefully, but it should also be clear that we would go the normal JOSM way, which means we accept suggestions from outside and from the wiki, but the final decision is our own.
Changes should be grouped into "always" (beginners, default) and "advanced" (only in expert mode, check changes with user).
We won't stop crap from accumulating in the db, but at least we can reduce it a bit.
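To make the proposal concrete, here is a minimal sketch of what the "always" group could look like. It is not JOSM's actual code: the class name `TagCorrectorSketch`, the method `correct`, and the one-entry rename list are assumptions made for illustration only. It trims surrounding whitespace from keys and values and renames a known misspelled key, but only when the correctly spelled key is not already present.

```java
import java.util.LinkedHashMap;
import java.util.Map;

/** Hypothetical sketch of an upload-time tag corrector; not JOSM's implementation. */
final class TagCorrectorSketch {

    /** Illustrative rename list for the "always" group (assumed, not the real list). */
    private static final Map<String, String> ALWAYS_KEY_RENAMES = Map.of("color", "colour");

    /** Returns a corrected copy of the tags; the input map is left unmodified. */
    static Map<String, String> correct(Map<String, String> tags) {
        Map<String, String> result = new LinkedHashMap<>();

        // Pass 1: strip leading and trailing spaces from keys and values.
        // If two keys become identical after trimming, the later one wins.
        for (Map.Entry<String, String> e : tags.entrySet()) {
            result.put(e.getKey().trim(), e.getValue().trim());
        }

        // Pass 2: rename known misspelled keys, but never overwrite
        // a value already stored under the correct key.
        for (Map.Entry<String, String> rename : ALWAYS_KEY_RENAMES.entrySet()) {
            String wrong = rename.getKey();
            String right = rename.getValue();
            if (result.containsKey(wrong) && !result.containsKey(right)) {
                result.put(right, result.remove(wrong));
            }
        }
        return result;
    }
}
```

An "advanced" group could reuse the same shape with a second rename list that is applied only after the user has confirmed the changes in expert mode.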
Attachments (1)
Change History (20)
comment:1 Changed 6 months ago by simon04
comment:2 Changed 5 months ago by stoecker
- Resolution set to fixed
- Status changed from new to closed
In 5621/josm:
comment:3 follow-up: ↓ 6 Changed 5 months ago by bastiK
- Resolution fixed deleted
- Status changed from closed to reopened
We should only do these silent fixes for very undisputed cases. Maybe I've missed some discussions, but type=multipolygon && boundary=administrative used to be the common way to map boundaries in some countries.
If users choose to tag in some way (and are aware of the options) we shouldn't force them to use whatever JOSM developers happen to prefer. Of course we can make suggestions. As Simon mentioned, this is basically what the DeprecatedTags test does.
comment:4 Changed 5 months ago by akks
In 5623/josm:
comment:5 Changed 5 months ago by akks
I have fixed the NPE in r5621 (could not upload natural=wood, type=multipolygon).
Not sure about autofixing type=multipolygon && boundary=administrative. For my country it is good, but what about the others?
comment:6 in reply to: ↑ 3 Changed 5 months ago by stoecker
Replying to bastiK:
We should only do these silent fixes for very undisputed cases. Maybe I've missed some discussions, but type=multipolygon && boundary=administrative used to be the common way to map boundaries in some countries.
This type has mainly come from automatic imports. And now, after 3 years, the stats show it was never really accepted. There were no negative comments when I finally stated the deprecated state in the wiki, and Frederik does not really have a counter-argument either (but is still hoping for an area primitive).
While I agree with Frederik that mass-retagging is not a good idea, such silent changes are acceptable in my eyes. And I wanted to include a slightly controversial tag, so that we see whether
- people actually notice and
- they care about it.
It's so silent about JOSM lately...
comment:7 Changed 5 months ago by OverQuantum
Only if, in expert mode, JOSM asks the user again about the changes.
It would be better if, in expert mode, JOSM allowed the user to load unchanged data, after additional confirmation or so.
comment:8 follow-up: ↓ 9 Changed 5 months ago by Ivan Komarov
I think that silent automatic correction of manually entered tags is unacceptable. A user should be asked if he agrees with these changes and should have an option to suppress it.
comment:9 in reply to: ↑ 8 ; follow-up: ↓ 11 Changed 5 months ago by bastiK
Replying to Ivan Komarov:
I think that silent automatic correction of manually entered tags is unacceptable. A user should be asked if he agrees with these changes and should have an option to suppress it.
Well, the idea is that the user always wants these changes. Or at least should want them. :)
@stoecker: I had in mind the initial version with roles enclave and exclave which was not as powerful/consistent as multipolygon. Now the standard seems to be: Use multipolygon syntax (roles inner and outer) but simply with another name (type=boundary). So there are no strong reasons to prefer the old/alternative tagging type=multipolygon && boundary=administrative because it is basically the same. Still I think this autofix is quite bold, but you seem to be aware of that...
comment:10 Changed 5 months ago by skyper
As I am in favour of using type=boundary, I will be able to blame JOSM, now. Maybe this will bring some noise. Personally, I think this will be a big change and needs to be well documented.
Dirk did change the wiki page in English, but all other Western European language versions totally contradict it, as they all still recommend using type=multipolygon for boundaries.
Why is only boundary=administrative changed? E.g. postal_code and LEZ should be changed, too.
comment:11 in reply to: ↑ 9 Changed 5 months ago by stoecker
@stoecker: I had in mind the initial version with roles enclave and exclave
Some years ago I helped a bit to unify the two styles. enclave and exclave aren't really used anymore and I also thought about changing the few remaining ones to inner/outer, but the code was not yet ready for role changing (tags are easier). Feel free to add it.
Still I think this autofix is quite bold, but you seem to be aware of that...
;-)
Either nobody cares or we get a discussion. In both cases I know what people think about that. If I start a discussion by asking in a forum the result will be that lots of people who like discussing more than real work will give their opinion and the real users stay silent. This has no use.
Dirk did change the wiki page in English, but all other Western European language versions totally contradict it, as they all still recommend using type=multipolygon for boundaries.
Feel free to update. These recommendations haven't been in the English version previously (I had an eye on that, as I was against a multipolygon recommendation since the beginning).
Why is only boundary=administrative changed? E.g. postal_code and LEZ should be changed, too.
These aren't documented at all, so I wanted to be on the safe side and not change them (yet).
comment:12 Changed 5 months ago by Sergey Astakhov
The presence of boundary=administrative should not be the only condition for changing the relation type from multipolygon.
For example, in Russia we actively use two types of boundary: boundary=administrative and place=city/town/etc (description in Russian). In some cases they coincide, so there is boundary=administrative and place=* on the same relation. type=boundary may be ok for boundary=administrative, but not for place=*.
If these tags should not appear on the same object, then in the first place there should be a validation error, not a silent change from one relation type to another.
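The condition described above can be sketched as follows. This is only an illustration of the concern, not what r5621 does: the class `BoundaryRuleSketch`, the enum `Action`, and the method `decide` are hypothetical names. The relation is retyped only when boundary=administrative is present and no place=* tag is; when both are present, the sketch reports a warning instead of changing anything.

```java
import java.util.Map;

/** Hypothetical sketch of a guarded retyping rule; not JOSM's implementation. */
final class BoundaryRuleSketch {

    /** What the rule would do with a relation's tags. */
    enum Action { NONE, RETYPE_TO_BOUNDARY, WARN_ONLY }

    static Action decide(Map<String, String> tags) {
        boolean candidate = "multipolygon".equals(tags.get("type"))
                && "administrative".equals(tags.get("boundary"));
        if (!candidate) {
            return Action.NONE;
        }
        // A place=* tag on the same relation means type=boundary may not fit,
        // so leave the tags alone and only raise a validation warning.
        return tags.containsKey("place") ? Action.WARN_ONLY : Action.RETYPE_TO_BOUNDARY;
    }
}
```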
comment:13 Changed 5 months ago by simon04
I would like the software to do what I want it to do. This is why I use Linux. I also expect JOSM not to make any hidden changes (probably except for completely undisputed cases such as created_by and odbl). I consider the DeprecatedTags test perfectly suited for this use case. It worked for a long time without those automatic changes.
comment:14 Changed 5 months ago by skyper
I thought there would be an extra message about these changes in expert mode, but it does not show up and the changes are made.
comment:15 follow-up: ↓ 16 Changed 3 months ago by anonymous
I stumbled across this issue as I tried to change type=boundary to type=multipolygon for admin_level=10 boundaries in The Netherlands.
We just finished completing the boundaries for all places. 2173 of them are tagged type=multipolygon, which is the Dutch standard. The other 327 'still' have type=boundary. I was very surprised and annoyed when JOSM kept changing them back.
In my opinion, it's a bit bold to just add this kind of functionality without even a notification in the release notes. The boundary=administrative/type=multipolygon convention is mainly used in Germany and The Netherlands, so it couldn't be so hard to consult at least those two communities.
Gertjan Idema
By the way, an automatic silent change from type=boundary to type=multipolygon for administrative boundaries would be a nice feature ;-)
comment:16 in reply to: ↑ 15 Changed 3 months ago by g.idema@…
The previous comment was anonymous by mistake.
comment:17 Changed 3 months ago by stoecker
Well, I still think the current solution is correct, but one comment is right: We forgot to document it in the changelog.
comment:18 Changed 3 months ago by stoecker
P.S. The suggested patch is really wrong. It removes "boundary=administrative"!
comment:19 Changed 3 months ago by stoecker
For current stats: November last year: 120.000 type=boundary and 30.000 type=multipolygon, today 160.000 to 36.000. Give the code some time and it will be 200.000 to 0 and we finally have one unique style. :-)



The DeprecatedTags validation test is somewhat related.