#23621 closed enhancement (fixed)
[patch] Check for trees that are too big
Reported by: | anonymous | Owned by: | team |
---|---|---|---|
Priority: | normal | Milestone: | 24.07 |
Component: | Core validator | Version: | |
Keywords: | natural tree size | Cc: |
Description
According to the Guinness World Records, the greatest circumference for a tree is 43m. I suggest implementing a check for trees with circumferences exceeding this value, as they are likely to be errors.
For instance, we currently have 2001 nodes in the database with a circumference of 45: https://overpass-turbo.eu/s/1JUU. Osmose could suggest to check them and:
- delete the value if plainly wrong or
- fix 'circumference=45' to 'circumference=0.45' or 'circumference=45 cm'
Another check could be on height. Trees taller than Hyperion (tallest tree in the world with 116m) are likely to be errors.
I suggested this check in Osmose and they redirect me here:
https://overpass-turbo.eu/s/1JWd
At least 62638 trees worldwide affected by the circumference limit of 43m (excluding those with units or decimals).
I suspect this might be interesting for JOSM, for their numerical validation rules (which we synchronize with Osmose). That way it benefits more people and people who add those trees get a warning prior to uploading it in the first place.
Attachments (1)
Change History (30)
comment:1 by , 7 months ago
Keywords: | natural tree size added |
---|
comment:3 by , 7 months ago
I would opt for the first version, easier to understand and maintain.
Also the circumference
and diameter_crown
should be added to the numeric validator to warn cases such as circumference=10m
-> circumference=10 m
.
comment:4 by , 7 months ago
Is there really no method to ask for the - possibly converted - value in m using mapcss?
We have a similar test for towers also using a very complex regex:
*[man_made=communications_tower][height][height =~ /^((7[0-4]|[1-6]?[0-9])(\.[0-9]*)?( m)?|(2(4[0-5]|[0-3][0-9])|1?[0-9]?[0-9])((\.[0-9]*)?( ft|\')|\'(11|10|[0-9])(\.[0-9]*)?\"))$/] { /* 75m ~ 246ft ~ 246' */ throwWarning: tr("{0} together with {1}", "{0.tag}", "{1.tag}"); suggestAlternative: "man_made=tower + tower:type=communication + height"; group: tr("suspicious tag combination"); assertMatch: "node man_made=communications_tower height=58"; assertMatch: "node man_made=communications_tower height=\"74 m\""; assertMatch: "node man_made=communications_tower height=0.8"; assertMatch: "node man_made=communications_tower height=245'"; assertMatch: "node man_made=communications_tower height=\"224.22 ft\""; assertMatch: "node man_made=communications_tower height=231'10.22\""; assertNoMatch: "node man_made=communications_tower height=\"75 m\""; assertNoMatch: "node man_made=communications_tower height=75.72"; assertNoMatch: "node man_made=communications_tower height=\"328.22 ft\""; assertNoMatch: "node man_made=communications_tower height=4358'8\""; assertNoMatch: "node height=4358'"; }
comment:5 by , 7 months ago
In Rules/Runways I've used some MapCSS hack (to_int()
) to get the value, but it does not look better.
follow-up: 7 comment:6 by , 7 months ago
There is a ticket about supporting unit conversion with a library but I cannot find it right now. Maybe it was closed as wontfix
as the size of the library was to big.
comment:8 by , 7 months ago
I think it would be a 'nice to have' to support all units (either as a plugin or via a complicated mapcss regex or ...), but I don't think it should be a blocker for this issue, not?
I mean, if I quickly search for any circumference
tag value containing a ', ", "in" or "ft", there's just 317 worldwide.
I suspect the majority of the cases with 'too large' circumferences will be due to people who assume the wrong default unit, thus for instance entering a value in mm without adding mm
to the value. (For such cases the simple numerical rule in comment:2 would be enough)
comment:9 by , 7 months ago
FYI: Yesterday I created a unit conversion plugin. So far it's capable converting metric lenght
values, the imperial ones not yet.
The biggest work would be to collect and categorize all the tags which contains units.
Regarding the ticket, I'm fine with the proposed rule from comment:2.
comment:11 by , 7 months ago
Yeah, but is there a better option? It wouldn't make into core because it's using a lib.
comment:12 by , 7 months ago
Writing a function to convert about 10 different units shouldn't be so complicated that it needs a lib. Supporting strange units is anyway a bad idea. There is reason why SI units exist.
follow-up: 17 comment:15 by , 6 months ago
Priority: | normal → major |
---|
First impressions seems to be good, but I'm a bit puzzled on some things and have some questions on other things:
1) why does it return -
on invalid inputs? This makes it fairly prone to errors like:
node[height ^= "-"] { throwError: tr("Surprisingly low value of {0} m", to_float(siunit_length(tag(height)))); }
which crashes JOSM if validating an object with height=-1 (or gives a bad message if to_float is omitted, just included it here to showcase a worst case scenario). Why not returning null
or so? The example mapcss is hypothetically by the way, not a suggestion :)
2) why not supporting negative numbers? For validation purposes, this makes a lot of sense, for instance the key ele
has a lot of negative values.
3) as it's a conversion, wouldn't to_SIunit_length
or so be a better name than siunit_length
?
4) probably it should also support km
and nmi
as they're explicitly listed on the wiki. Even though they're the defaults and could hypothetically be omitted, some people do include them :)
5) as it's always a Double, why not return it as a Double instead of a String 0:) ?
Bumping the priority a bit since it's almost release day with this new function :)
comment:16 by , 6 months ago
Priority: | major → normal |
---|
I have split the unit conversion part off to #23702 as it is a different component and tree size is only one occasion where the new function is useful.
comment:17 by , 6 months ago
Priority: | normal → major |
---|
Replying to anonym:
First impressions seems to be good, but I'm a bit puzzled on some things and have some questions on other things:
1) why does it return
-
on invalid inputs? This makes it fairly prone to errors like:
node[height ^= "-"] { throwError: tr("Surprisingly low value of {0} m", to_float(siunit_length(tag(height)))); }which crashes JOSM if validating an object with height=-1 (or gives a bad message if to_float is omitted, just included it here to showcase a worst case scenario). Why not returning
null
or so? The example mapcss is hypothetically by the way, not a suggestion :)
Because I thought it a good idea.
2) why not supporting negative numbers? For validation purposes, this makes a lot of sense, for instance the key
ele
has a lot of negative values.
Simply forgot them (or didn't think that's used).
3) as it's a conversion, wouldn't
to_SIunit_length
or so be a better name thansiunit_length
?
No. That name better fits the URL_, XML_ CRC32_ and so on names.
4) probably it should also support
km
andnmi
as they're explicitly listed on the wiki. Even though they're the defaults and could hypothetically be omitted, some people do include them :)
km and nmi aren't defaults. That's still m.
5) as it's always a Double, why not return it as a Double instead of a String 0:) ?
As far as I can see there is no type of function <X> name(<Y> arg), only function <X> name(<X> arg), so I cannot return another type as the argument.
Bumping the priority a bit since it's almost release day with this new function :)
If there is an unfinished new functionality, who cares. Reduce priority again.
Followup patch comes when the unit test are ok.
comment:20 by , 6 months ago
Milestone: | → 24.06 |
---|---|
Summary: | Check for trees that are too big → [patch] Check for trees that are too big |
From comment:2 and comment:4 with the new function from r19089.
Warn threshold is 45 instead of 50.
-
resources/data/validator/combinations.mapcss
896 896 } 897 897 898 898 /* #16898, tower vs. communications_tower, wiki suggests 100m as rough guideline, so we warn for < 75m */ 899 *[man_made=communications_tower][height][ height =~ /^((7[0-4]|[1-6]?[0-9])(\.[0-9]*)?( m)?|(2(4[0-5]|[0-3][0-9])|1?[0-9]?[0-9])((\.[0-9]*)?( ft|\')|\'(11|10|[0-9])(\.[0-9]*)?\"))$/] { /* 75m ~ 246ft ~ 246' */899 *[man_made=communications_tower][height][siunit_length(tag(height)) < 75] { 900 900 throwWarning: tr("{0} together with {1}", "{0.tag}", "{1.tag}"); 901 901 suggestAlternative: "man_made=tower + tower:type=communication + height"; 902 902 group: tr("suspicious tag combination"); … … 1066 1066 group: tr("suspicious tag combination"); 1067 1067 } 1068 1068 1069 /* #23621 */ 1070 node[natural=tree][circumference][siunit_length(tag(circumference)) > 45] { 1071 throwWarning: tr("Unusually large value of {0} in meters, possibly centimeter units are meant?", "{0.key}"); 1072 assertMatch: "node natural=tree circumference=200"; 1073 assertMatch: "node natural=tree circumference=82.4"; 1074 assertNoMatch: "node natural=tree circumference=18.4"; 1075 assertNoMatch: "node natural=tree circumference=\"100 cm\""; 1076 assertNoMatch: "node natural=tree circumference=43"; /* Current world record */ 1077 } 1078 No newline at end of file
comment:22 by , 6 months ago
Originally I've put it there, but ended up in the combinations. Feel free to move around the rule.
comment:23 by , 6 months ago
Okay, in the numeric file stating exactly this: measurement values and units warnings
.
by , 6 months ago
Attachment: | 23621.patch added |
---|
follow-up: 27 comment:24 by , 6 months ago
@gaben: A little bit off-topic, but since your patch also converts existing rules, there's another one: https://josm.openstreetmap.de/browser/josm/trunk/resources/data/validator/numeric.mapcss#L70 could also be converted to
*[roof:height][siunit_length(tag("roof:height")) = 0][roof:shape=flat] {
comment:27 by , 5 months ago
comment:28 by , 4 months ago
I just realized that the string is a bit strange in some cases:
throwWarning: tr("Unusually large value of {0} in meters, possibly centimeter units are meant?", "{0.key}");
For meter units (circumference=52 + natural=tree
), it says:
Unusually large value of natural in meters, possibly centimeter units are meant?
For feet units (circumference=3000ft + natural=tree
) it also says:
Unusually large value of natural in meters, possibly centimeter units are meant?
Suggestion:
- remove 'in meters' (because it can be in any unit now)
- use
{1.key}
instead of{0.key}
to refer tocircumference
rather thannatural
Which would imply using:
throwWarning: tr("Unusually large value of {0}, possibly centimeter units are meant?", "{1.key}");
Perhaps this can still be included in 24.07 before the translations are updated?
If interesting for JOSM, I would propose the following rule:
This gives about 72k results worldwide.
Or, if you also want to also include the 2 trees worldwide with
m
-units explicitly specified: