Opened 5 years ago
Closed 5 years ago
#19136 closed defect (fixed)
Validator no longer raises issues for old-style multipolygons
Reported by: | sebastic | Owned by: | team |
---|---|---|---|
Priority: | normal | Milestone: | 20.05 |
Component: | Core validator | Version: | tested |
Keywords: | old-style multipolygon | Cc: | Klumbumbus, Don-vip |
Description
At least two scenarios don't trigger validator issues:
- Create outer way
- Create inner way
- Select both ways
- Use Create Multipolygon tool
- Upload (No validator issues)
Expected an issue about the untagged relation as it only has type=multipolygon.
- Create outer way
- Create inner way
- Select both ways
- Use Create Multipolygon tool
- Add tags to outer way (e.g. natural=water)
- Upload (No validator issues)
Expected an issue about old-style multipolygons as only the outer ways has descriptive tags.
While the area project was deemed completed, old-style multipolygons are still created on a daily basis. It used to be mostly iD users, but with the JOSM validator no longer catching these issues they are increasingly responsible for them as well.
Attachments (10)
Change History (58)
comment:1 by , 5 years ago
Cc: | added |
---|
follow-up: 4 comment:3 by , 5 years ago
Hmm, that will produce false positives. The area style depends on OUR map style.
Why is this a problem? Old style multipolygons aren't rendered anywhere, so the feedback is there simply by the fact that it works nowhere.
follow-up: 5 comment:4 by , 5 years ago
Replying to stoecker:
Hmm, that will produce false positives. The area style depends on OUR map style.
Yes, that can be a problem.
Why is this a problem? Old style multipolygons aren't rendered anywhere, so the feedback is there simply by the fact that it works nowhere.
We produce a warning for an untagged node or way, so we should do the same with multipolygons.
comment:5 by , 5 years ago
Replying to GerdP:
Replying to stoecker:
Hmm, that will produce false positives. The area style depends on OUR map style.
Yes, that can be a problem.
Why is this a problem? Old style multipolygons aren't rendered anywhere, so the feedback is there simply by the fact that it works nowhere.
We produce a warning for an untagged node or way, so we should do the same with multipolygons.
That's valid and will not produce false positives.
comment:6 by , 5 years ago
We could use this method in class OsmPrimitive
:
public final boolean hasAreaTags() { return hasKey("landuse", "amenity", "building", "building:part") || hasTag("area", OsmUtils.TRUE_VALUE) || hasTag("waterway", "riverbank") || hasTagDifferent("leisure", "picnic_table", "slipway", "firepit") || hasTag("natural", "water", "wood", "scrub", "wetland", "grassland", "heath", "rock", "bare_rock", "sand", "beach", "scree", "bay", "glacier", "shingle", "fell", "reef", "stone", "mud", "landslide", "sinkhole", "crevasse", "desert"); }
comment:7 by , 5 years ago
We could give it a try.
Totally untagged multipolygon could be error. The other one maximum warning.
follow-up: 9 comment:8 by , 5 years ago
Totally untagged multipolygon could be error.
There will always be the type=multipolygon tag ;) Should we check AbstractPrimitive.getInterestingTags()
?
Or really r.getNumKeys() == 1
?
comment:9 by , 5 years ago
Replying to GerdP:
Totally untagged multipolygon could be error.
There will always be the type=multipolygon tag ;)
I assumed everybody knows that ;-)
Should we check
AbstractPrimitive.getInterestingTags()
?
I'd think so.
Or really
r.getNumKeys() == 1
?
No. That would restrict the result too much.
by , 5 years ago
Attachment: | 19136.1.patch added |
---|
chnages for the test. The unit test file needs more work, there are 24 "No area style for multipolygon" message with this patch
comment:10 by , 5 years ago
I'd leave the info level style based warning (probably minus the "old style" handling for outer ways).
comment:11 by , 5 years ago
Not sure what you mean. The unpatched code still looked at the area style of the outer way to find the style for the MP. That's old style and considered wrong.
comment:12 by , 5 years ago
I'd leave the warning about "No area style for multipolygon" based on the style (minus the outer way part) and would add a new warning based only on tags. Not mixing these two.
comment:13 by , 5 years ago
Like this?
Think so. Let's test it :-)
Can probably be optimized a bit, as the last line in the patch is the opposite of the if before, so it should be an else and the boolean seems a bit useless?
comment:14 by , 5 years ago
Yes, I have to dig into that part. I think there are still tests which are for old style MP.
by , 5 years ago
Attachment: | 19136.3.patch added |
---|
comment:15 by , 5 years ago
My current thinking regarding old style MP:
- the message "With the currently used mappaint style the style for outer way mismatches the area style" only makes sense for closed ways. Same for "With the currently used mappaint style the style for inner way equals the multipolygon style" and "Area style on outer way". A ring that is assembled from multiple ways doesn't have a style, does it?
- We should also change the test
UnclosedWays
so that it doesn't ignore unclosed ways with area style tags like e.g. natural=wood just because the way is a member in atype=multipolygon
ortype=boundary
relation. The latter one was never a good reason for that, the former would be old style MP tagging. - I've extracted the new tag tests into method tag
tagTest()
, I am unsure if we should better use mapcss rules for them.
Unit tests depend on this.
by , 5 years ago
Attachment: | 19136.4.patch added |
---|
comment:16 by , 5 years ago
19136.4.patch differs to 19136.3.patch:
- tagTest() is also performed for relations with incomplete members
- modified unit tests
@Klumbumbus: Do you see an easy way to do the new tests in mapcss? Would it be better?
comment:17 by , 5 years ago
I can ony find a mapcss test for the "no further tag" :
relation[type=multipolygon][eval(number_of_tags()) = 1] { throwError: tr("{0} without further tags", "{0.tag}"); group: tr("missing tag"); assertMatch: "relation type=multipolygon"; }
So, unless we can replace all the style related tests there is not much to gain here, and we can do that in a different ticket.
follow-up: 20 comment:18 by , 5 years ago
There might be two outer rings with natural=wood
and different types of leaf_type=*
which are in one multipolygon natural=wood
, leaf_type=mixed
cause it is one forest with a single name but split by a highway.
follow-up: 21 comment:19 by , 5 years ago
I just got a warning No area style for multipolygon
for area:highway=*
. Looks like this tag needs to be additional allowed
comment:20 by , 5 years ago
Replying to skyper:
There might be two outer rings with
natural=wood
and different types ofleaf_type=*
which are in one multipolygonnatural=wood
,leaf_type=mixed
cause it is one forest with a single name but split by a highway.
Yes, why do you mention this?
follow-up: 22 comment:21 by , 5 years ago
Replying to skyper:
I just got a warning
No area style for multipolygon
forarea:highway=*
. Looks like this tag needs to be additional allowed
You should not get a warning with this text, only an information. The message depends on the style(s) you use.
Or did you use the patch and got a warning No area tag for multipolygon
? This would be plausible because the method mentioned in comment:6 doesn't contain area:highway=*
follow-up: 23 comment:22 by , 5 years ago
Replying to GerdP:
Replying to skyper:
There might be two outer rings with
natural=wood
and different types ofleaf_type=*
which are in one multipolygonnatural=wood
,leaf_type=mixed
cause it is one forest with a single name but split by a highway.
Yes, why do you mention this?
Cause, this was always a problem of the style test, as there are no different styles for natural=wood
or landuse=forest
depending on secondary tags.
Replying to GerdP:
Replying to skyper:
I just got a warning
No area style for multipolygon
forarea:highway=*
. Looks like this tag needs to be additional allowed
You should not get a warning with this text, only an information. The message depends on the style(s) you use.
Or did you use the patch and got a warningNo area tag for multipolygon
? This would be plausible because the method mentioned in comment:6 doesn't containarea:highway=*
Sorry, information warning, you are right. Think, the method might need some fine tuning but area:highway=*
seems to be completely unsupported by JOSM.
comment:23 by , 5 years ago
Replying to skyper:
Replying to GerdP:
Replying to skyper:
There might be two outer rings with
natural=wood
and different types ofleaf_type=*
which are in one multipolygonnatural=wood
,leaf_type=mixed
cause it is one forest with a single name but split by a highway.
Yes, why do you mention this?
Cause, this was always a problem of the style test, as there are no different styles for
natural=wood
orlanduse=forest
depending on secondary tags.
OK, the patch doesn't change the results for the default style: Warning Area style on outer way (2)
I think the tagging is at least suspicious. The two outer rings are different, one should not collect them in one multipolygon. I think it is good that JOSM shows a warning message, but it would be better to make it independent from the style.
Replying to GerdP:
Replying to skyper:
I just got a warning
No area style for multipolygon
forarea:highway=*
. Looks like this tag needs to be additional allowed
You should not get a warning with this text, only an information. The message depends on the style(s) you use.
Or did you use the patch and got a warningNo area tag for multipolygon
? This would be plausible because the method mentioned in comment:6 doesn't containarea:highway=*
Sorry, information warning, you are right. Think, the method might need some fine tuning but
area:highway=*
seems to be completely unsupported by JOSM.
I thought there was a special style for area:highway
, but I can't find it. The tag is checked in UnclosedWays
, but it doesn't appear in method hasAreaTags()
(comment:6). There are a few more discrepancies between these methods. This looks wrong to me, esp. a check for amenities=*
was probably meant to be amenity=*
.
follow-up: 28 comment:24 by , 5 years ago
The change introduces some false positives:
- unclosed way
boundary=*
is flagged byUnclosedWays
even when it is a member of atype=boundary
relation. - MP with shop=mall is flagged
No area tag for multipolygon (1)
. I think we should treatshop
likebuilding
inhasAreaTags()
. - MP with
tourism=hotel
is flaggedNo area tag for multipolygon (1)
.UnclosedWays
checks all ways withtourism=*
except fortourism=attraction
andtourism=artwork
.
I fear the No area tag for multipolygon
test will produce lots of false positives. I think we have to collect all tags which have a meaning on closed ways, that would be a very long list and it is difficult to produce.
I see two ways to produce it:
- Load all map paint styles and check which tags are rendered as area.
- Load all existing type=multipolygon relations not using one of the tags already listed in
hasAreaTags()
and decide which should be added.
In UnclosedWays
we only have to list those tags which must not be used with unclosed ways.
So, maybe change the message to Other: Tag describing the area might missing for multipolygon
?
Or remove it completely?
comment:26 by , 5 years ago
I made an josm_19136_multi_with_same_style_sample.osm example file.
It seems to be really problematic, as in my eyes, we need to look at more than just the primary tag. "Create multipolygon" even produces strange results as it only moves the primary tag but leaves secondary tags with different values untouched, leading to areas without a primary tag. New ticket ?
follow-up: 30 comment:27 by , 5 years ago
Yes, please create a new ticket and describe why type=multipolygon is used instead of type=site.
comment:28 by , 5 years ago
Replying to GerdP:
I fear the
No area tag for multipolygon
test will produce lots of false positives. I think we have to collect all tags which have a meaning on closed ways, that would be a very long list and it is difficult to produce.
I see two ways to produce it:
- Load all map paint styles and check which tags are rendered as area.
- Load all existing type=multipolygon relations not using one of the tags already listed in
hasAreaTags()
and decide which should be added.
Is there a way to only use the primary tags defined in map paint styles or/and presets and leave out all the rest. Better no warnings for unknown tags than false positives.
- We will always have unsupported/unknown tags
- We need to manually update the list which will not work without helper scripts
comment:29 by , 5 years ago
This ticket is about multipolygons without a primary tag describing the area. We cannot produce such a warning without risking false positives.
comment:30 by , 5 years ago
comment:31 by , 5 years ago
@sebastic: I tried to find some examples of multipolygons without a tag describing the area. The seem to be very rare, just two in a recent dump of the Netherlands. Can you point me to an area where this is really a problem?
comment:32 by , 5 years ago
You don't find any because I've been fixing old-style multipolygons on a daily basis since the area project started.
You can monitor the old-style.osm.pbf from the area project to gather examples, that's also the input for my QA work.
This issue was created during a message thread with wireguy triggered by my changes in Changeset: 83911051, he used JOSM 16239 to create the multipolygon but the validator did not notify him of the old-style issues. To quote his reply:
I am using josm, and did use create multipolygon.
But i wasn't warned by josm validator.
steps I did:
- created outer portion
- created inner portion
- selected both portions and picked create multipolygon
- selected the outer
- added tags area=yes; highway=footway
validated with no issues (should josm have caught this?)
uploaded (got no errors)
Because I remembered the validator issues JOSM used to raise for old-style multipolygons I tested whether those were still triggered, since they weren't this issue was created.
comment:33 by , 5 years ago
OK, thanks. So Jochen Topf should have code to find those old style multipolygons, right?
comment:34 by , 5 years ago
Yes, he has some osmium based code and custom tools for that.
Basically the criteria are relations with type=multipolygon no descriptive tags and no fixme=*.
A common issue in old-style.osm.pbf are relations with only type=multipolygon that have one or more new inners for an existing relation.
Relations from which the tags were removed instead of deleting the relation are also quite frequent.
The classical old-style multipolygon with the descriptive tags on outer way instead of on the relations are still quite frequent as well, but those are just a subset of the issues you'll find in old-style.osm.pbf.
comment:35 by , 5 years ago
Background history of the info-warning "With the currently used mappaint style the style for inner way equals the multipolygon style":
Main ticket #6363. As the validator depends on the used style in this case it was downgraded to info level in r8045.
So far when tickets with false positives of this test came up we used to sligthly change the colors, which for sure is not a perfect solution. Related color changes are in r7409, r8045, r8143, r11153, r13979
comment:36 by , 5 years ago
I've tried some methods to compile a list of tags which would suppress the "found no area tag" message.
I fear we cannot use
- presets because they don't tell us if the preset describes an area
- mappaint styles because they don't render all areas and boundaries with a fill colour
Besides both these sources contain only a small subset of possible tags.
A possible small solution would be this:
- if
getInterestingTags()
contains onlytype=multipolygon
-> errorNo further tags for multipolygon
- if
getInterestingTags()
contains onlytype=multipolygon
and one other tagname=*
orarea=yes
orsurface=*
-> warningNo area tag for multipolygon
I fear that this solution might force users to find a work around like adding a surface=* tag to the name=* tag.
The complex solution means a long list this (never complete), either as java code or in a new config file.
private static boolean hasAcceptedAreaTagForMultipolygon(OsmPrimitive p) { return p.hasKey("landuse", "amenity", "building", "building:part", "area:highway", "shop", "boundary", "place") || p.hasTag("highway", "pedestrian", "services", "rest_area") || p.hasTag("public_transport", "platform") || p.hasTag("railway", "platform") || p.hasTag("waterway", "riverbank") || p.hasTag("man_made", "bridge", "tower") || p.hasTagDifferent("tourism", "attraction", "artwork") || p.hasTagDifferent("leisure", "picnic_table", "slipway", "firepit") || p.hasTag("natural", "water", "wood", "scrub", "wetland", "grassland", "heath", "rock", "bare_rock", "sand", "beach", "scree", "bay", "glacier", "shingle", "fell", "reef", "stone", "mud", "landslide", "sinkhole", "crevasse", "desert", "dune") || p.hasTag("indoor", "corridor", "room") || p.hasTag("seamark:type", "anchorage", "separation_zone", "fairway", "harbour", "restricted_area"); }
comment:37 by , 5 years ago
Jupp, I don't think it can be better than your proposal. One question thought: Why a new function and not extend the already hadAreaTags()?
comment:38 by , 5 years ago
I'll probably change that again. I didn't like that hasAreaTags() returns true for area=yes
. To avoid side effects I used a new method for my experiments. There are lots of MP with area=yes and highway=* or surface=* . I think this is a bad variant of area:highway, but maybe we should not complain because it is used often.
I've loaded all MP from Netherlands to start with. First got > 2000 warnings, with the longer list I'm down to ~380.
comment:39 by , 5 years ago
Milestone: | 20.04 → 20.05 |
---|
Will take some more days to compile a useful list ...
comment:40 by , 5 years ago
Some special cases, comments are welcomed:
- just two tags
type=multipolygon
and one that is used for linear ways, e.g.barrier=fence
orhighway=service
orhighway=footway
ornatural=tree_row
. Candidate for "suspicious tag combination" incombinations.mapcss
? - What should I do with e.g. r10229231:
type=multipolygon area=yes highway=residential name=* ...
I would usetype=multipolygon place=square name=* ...
instead. Candidate fordeprecated.mapcss
?
by , 5 years ago
Attachment: | 19136.5.patch added |
---|
follow-up: 43 comment:41 by , 5 years ago
highway=service
on a multipolygon is valid, it turns it into an area (e.g. a large driveway with a fountain or building in the middle), same for highway=footway
. barrier=fence
& natural=tree_row
is only valid on the ways.
area=yes
on a multipolygon is not needed, that's implicit for (most) linear features tagged on the relation. Update Multipolygon drops this when moving tags from the outer way to the relation.
comment:42 by , 5 years ago
Changes implemented in 19136.5.patch:
MultipolygonTest
: ignore tags of outer ways when looking for area style of multipolygon relationTagChecker
add checks to find multipolygon relations which (might) have no area type tag
The method hasAcceptedPrimaryTagForMultipolygon()
is now accepting a lot more existing relations.
checkstyle doesn't like, gives a [BooleanExpressionComplexity]. I'll probably use multiple if (...) return true;
blocks.
Some people will hate us for this test. It probably complains about lots of objects which might be correct, but rarely used
in multipolygon relations. So, it woud be good to find a message which makes clear that JOSM cannot know all correct tags.
The patch uses multipolygon relation: found no area type tag
. I think this is too strong.
Maybe multipolygon relation: area type tag might be missing
is better?
follow-up: 44 comment:43 by , 5 years ago
Replying to sebastic:
highway=service
on a multipolygon is valid, it turns it into an area (e.g. a large driveway with a fountain or building in the middle), same forhighway=footway
.barrier=fence
&natural=tree_row
is only valid on the ways.
The highway=* tag is really problematic. Some mappers expect a multipolygon with highway=residential to be routable, others use it with the same meaning as area:highway=*. We have e.g. 144 multipolygons with highway=path
.
I should probably allow all highway=* which are not for nodes only in this test.
Still, I think there should be a test in combinations.mapcss
.
area=yes
on a multipolygon is not needed, that's implicit for (most) linear features tagged on the relation. Update Multipolygon drops this when moving tags from the outer way to the relation.
OK. We have a mapcss test that says area is unnecessary for shop
. We should have somethingh similar for area=yes and type=multipolygon.
comment:44 by , 5 years ago
Replying to GerdP:
I fear we cannot use
- presets because they don't tell us if the preset describes an area
Looking at type
with value "closedway"
(and "multipolygon"
) but not "way"
should give you the area tags, though, you are right, that we'll miss a lot.
Replying to GerdP:
Replying to sebastic:
highway=service
on a multipolygon is valid, it turns it into an area (e.g. a large driveway with a fountain or building in the middle), same forhighway=footway
.barrier=fence
&natural=tree_row
is only valid on the ways.
The highway=* tag is really problematic. Some mappers expect a multipolygon with highway=residential to be routable, others use it with the same meaning as area:highway=*. We have e.g. 144 multipolygons with
highway=path
.
I should probably allow all highway=* which are not for nodes only in this test.
Still, I think there should be a test incombinations.mapcss
.
Personally, I think, using highway
as area is a incorrect mapping and area:highway
should be used but, sadly, that train departed a while ago.
JOSM allows only service
and pedestrian
as multipolygon. footway
and path
might be ok, as otherwise, people will use pedestrian
which is wrong in many cases. Still, I do not like it at all and it reminds me about the mess, we have with landuse
even though landcover
was proposed and used years in advance but was never supported by many applications including JOSM.
Often this is mapping for the renderer as width
nor area:highway
are used or correctly rendered. Speaking about pedestrian
, this even hides a lot of other features as it is rendered on top of them without transparency.
Routing is another problem and including a multipolygon in a route relation does not work at all.
So a warning might at least lead the people to reflect that kind of mapping and eventually might help to improve the mapping of areas.
area=yes
on a multipolygon is not needed, that's implicit for (most) linear features tagged on the relation. Update Multipolygon drops this when moving tags from the outer way to the relation.
OK. We have a mapcss test that says
area is unnecessary for shop
. We should have something similar for area=yes and type=multipolygon.
+1
by , 5 years ago
Attachment: | 19136.7.patch added |
---|
comment:45 by , 5 years ago
I think the list implemented with 19136.7.patch is a good start. I've used this overpass query in JOSM https://overpass-turbo.eu/s/Tlw to download ~45.000 incomplete relations (worldwide) and ran validator without mapcss checks(1).
New messages with the attached patch:
+ Errors + Multipolygon tags - tag describing the area is missing (20927) + Warnings + Multipolygon tags - tag describing the area might be missing (18467) + Other + Multipolygon tags - only surface tag (1031)
Please suggest better texts.
The code uses a mix of
- ignoring relations with special tags like
fixme
ordescription
- accepting various keys without looking at the value, e.g.
landuse=*
orshop=*
- accepting some keys unless the value is not usable for areas
- accepting some tags like
railway=platform
orharbour=yes
- accepting relations with at least one tag key using a lifecycle prefix (e.g.
abandoned:building
ordisused:shop
)
If the relation has a surface
tag it an information is produced
For the remaining relations the tags are filtered to remove tag keys which are often used in multipolygon relations but don't describe the area: "type", "name", "area", "ref", "access", "operator"
I've also used the harmonizeKey() method to remove all keys with upper case characters or one of [ :-_]
. (2)
If the remaining list of tags is empty it is very likely that there is no tag describing the area -> error
For the rest a warning is produced.
(1) The geometry.mapcss test is very slow because relations without any complete member have no bounding box. This results in lots of useless sequential searches.
(2) Might be to restrictive and should be replaced by a regex.
by , 5 years ago
Attachment: | 19136.8.patch added |
---|
small improvements (more lifecycle keys, regex instead of harmonizeKey()
by , 5 years ago
Attachment: | 19136.9.patch added |
---|
forgot to include modified unit test data nodist\data\multipolygon.osm
comment:47 by , 5 years ago
Milestone: | 20.05old → 20.05 |
---|
This is (partly) a regression from #17886, at least for the 2nd sample. The first one didn't produce a warning or error even before r15244. I think we can treat both cases the same. The informational message
"No area style for multipolygon" should be an error, at least a warning.