#14833 closed enhancement (fixed)
[PATCH] Selectively added details to data/boundaries.xml
Reported by: | westnordost | Owned by: | Don-vip |
---|---|---|---|
Priority: | minor | Milestone: | 17.06 |
Component: | Core | Version: | |
Keywords: | boundaries, geocoding | Cc: | Klumbumbus |
Description
I selectively added detail to the boundaries.xml globally for populated areas. In detail, I
- added detail to the border so that villages near the border are on the correct side of the border
- added much detail to towns that run alongside the border
The above only for international borders.
I edited the file with JOSM, saved it, and then reduced the precision of the lat/lon values back to 5 decimal places (with a text editor), as in the current repository version.
Posting a patch does not really make sense here, as every line differs because JOSM generates new ids. I am posting it anyway, plus the new version of the file.
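The precision reduction described above was done by hand in a text editor. As a rough sketch, the same step could be scripted. The function name `round_coords` and the sample node are illustrative assumptions, not part of the patch, and the regex assumes coordinates are written as plain decimal `lat`/`lon` attributes.

```python
import re

# Matches lat="..." or lon='...' attributes as they appear in an .osm XML file.
# Assumption: values are plain decimals (no exponent notation).
COORD_ATTR = re.compile(r"""\b(lat|lon)=(['"])(-?\d+(?:\.\d+)?)\2""")

def round_coords(xml_text: str, places: int = 5) -> str:
    """Round every lat/lon attribute value to `places` decimal places."""
    def repl(match: re.Match) -> str:
        name, quote, value = match.groups()
        rounded = f"{float(value):.{places}f}"
        return f"{name}={quote}{rounded}{quote}"
    return COORD_ATTR.sub(repl, xml_text)

# Hypothetical sample node, not taken from boundaries.xml.
sample = "<node id='-1' lat='49.123456789' lon='-8.5' />"
print(round_coords(sample))
# prints: <node id='-1' lat='49.12346' lon='-8.50000' />
```

Note that this formatting pads short values with trailing zeros; whether the repository file does the same is not stated here.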
Attachments (7)
Change History (28)
by , 7 years ago
Attachment: | boundaries.patch added |
---|
by , 7 years ago
Attachment: | boundaries.osm added |
---|
comment:1 by , 7 years ago
Cc: | added |
---|
comment:2 by , 7 years ago
Replying to osm@…:
Posting a patch does not really make sense here, as every line differs because JOSM generates new ids.
I started a patch to change that, in order to be able to review changes on this file.
comment:3 by , 7 years ago
That sounds reasonable.
However, it does not affect this patch, as the work has been done and the "original" ids cannot be recovered retroactively, even if future JOSM versions no longer generate new ids.
comment:4 by , 7 years ago
The problem is that without the patch, it's almost impossible to review changes... I didn't think someone would provide a patch so soon :( I'll let you know how I will handle your contribution.
comment:5 by , 7 years ago
Two things come to my mind:
- create a visual diff by subtracting the repository's geometry from the patch geometry. Not sure if this is possible without too much effort in JOSM; otherwise perhaps in QGIS?
- JUnit test that checks whether the geometry is valid (closed ways etc.), using the same checks that are used before upload
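The second idea could be sketched roughly as follows. This is Python rather than JUnit, and it uses a plain first-node-equals-last-node check instead of JOSM's own pre-upload checks; `unclosed_ways` and the sample data are hypothetical. It also assumes every way in the file should be a closed ring on its own, which would not hold for ways that are only closed through a relation.

```python
import xml.etree.ElementTree as ET

def unclosed_ways(osm_xml: str) -> list:
    """Return the ids of ways whose first and last node refs differ,
    or that have too few nodes to form a ring."""
    root = ET.fromstring(osm_xml)
    bad = []
    for way in root.iter("way"):
        refs = [nd.get("ref") for nd in way.findall("nd")]
        # A closed ring needs at least 3 distinct nodes plus the repeated first one.
        if len(refs) < 4 or refs[0] != refs[-1]:
            bad.append(way.get("id"))
    return bad

# Hypothetical sample: way -10 is closed, way -11 is not.
sample_osm = """
<osm version='0.6'>
  <way id='-10'><nd ref='-1'/><nd ref='-2'/><nd ref='-3'/><nd ref='-1'/></way>
  <way id='-11'><nd ref='-1'/><nd ref='-2'/><nd ref='-3'/></way>
</osm>
"""
print(unclosed_ways(sample_osm))
# prints: ['-11']
```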
by , 7 years ago
Attachment: | old-file-reference.png added |
---|
by , 7 years ago
Attachment: | new-file-output.png added |
---|
by , 7 years ago
Attachment: | new-file-differences.png added |
---|
by , 7 years ago
Attachment: | changes.png added |
---|
comment:6 by , 7 years ago
comment:7 by , 7 years ago
Assuming this is the right place to ask:
Some time after this patch has been merged, I plan to further split some countries into their provinces, the same way it has been done for US, AU and CA.
The reason is to be able to more precisely capture intra-country differences.
For example in India, many of the provinces each have a different language (and script!) as official language, same with China (Cantonese etc.). Then, there is Belgium with the North speaking Dutch, the South speaking French.
And even more so in many countries in Africa.
So my question is whether I can expect that this kind of change would be merged?
comment:9 by , 7 years ago
I have not yet merged the patch. The reason why I split US/CA/AU into smaller units is that the subentities (states, provinces) have a high degree of autonomy and sometimes different laws or regulations (e.g. speed limits). This situation appears to be quite rare in the world, where generally the law does not differ between administrative units, so the split stayed light.
Languages, however, are another subject. I'm afraid the file would become simply too big if we go into this level of detail, see https://en.wikipedia.org/wiki/List_of_multilingual_countries_and_regions
Before you start working on this, you should estimate roughly how much the data would grow. If it's two or three times bigger, I'd say no.
comment:10 by , 7 years ago
Milestone: | → 17.06 |
---|
comment:11 by , 7 years ago
Owner: | changed from | to
---|---|
Status: | new → assigned |
comment:12 by , 7 years ago
The boundaries data currently accounts for 236 kB of the distributed .jar file (about 2%). This is okay, but I say we should try to keep it under 300 kB, including future refinements.
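For reference, the headroom implied by these figures works out as follows. This is plain arithmetic on the numbers quoted in this ticket (236 kB now, a 300 kB ceiling, and the "two or three times bigger" threshold from comment 9); the variable names are chosen here for illustration.

```python
current_kb = 236   # current size of the boundaries data, as quoted above
ceiling_kb = 300   # proposed upper bound, as quoted above

headroom_kb = ceiling_kb - current_kb
headroom_pct = 100 * headroom_kb / current_kb
doubled_kb = 2 * current_kb

print(f"headroom: {headroom_kb} kB ({headroom_pct:.1f}% growth)")
# prints: headroom: 64 kB (27.1% growth)
print(f"file at 2x size: {doubled_kb} kB, over ceiling: {doubled_kb > ceiling_kb}")
# prints: file at 2x size: 472 kB, over ceiling: True
```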
If you have a specific use case, e.g. a custom map style or validation rule: We can introduce a system where you can ship a boundaries file along with your .mapcss file (just like icons).
follow-up: 14 comment:13 by , 7 years ago
I don't think that the distribution size is our biggest problem there.
The main problem I see with a more detailed file is a performance problem: Many rules (like left/right hand traffic) require us to query it. And we need to load it on every JOSM startup.
comment:14 by , 7 years ago
Replying to michael2402:
The main problem I see with a more detailed file is a performance problem: Many rules (like left/right hand traffic) require us to query it.
This query is highly optimized. On average, it shouldn't be significantly slower than something like :in-downloaded-area
(please prove me wrong).
And we need to load it on every JOSM startup.
True, but it can be loaded in parallel to other tasks, which is a plus. (This seems to be broken at the moment.) Before making such a judgment, I'd prefer to do some tests, e.g. replace the file with one 3 times the size and compare startup time.
follow-up: 16 comment:15 by , 7 years ago
If you have a specific use case, e.g. a custom map style or validation rule
My specific use case is actually that I use this file in my own OSM project, StreetComplete, to add location-based intelligence. So, basically the same reason as why the file exists for JOSM. With this patch, I just wanted to contribute the enhancements upstream.
Currently, I use the following data (perhaps relevant for JOSM) per country/region and use the boundaries file to determine which region it applies to:
- which speed unit is used (for preselecting whether to use mph or km/h)
- which sports are popular (for order of selectable sports for pitches)
- first day of workweek and number of regular shopping days (to preselect the right days when adding opening hours to a place)
- regex for determining if a housenumber seems valid (e.g. "1 bis" is valid in France etc.)
- list of languages sorted by officiality, importance (for detecting and automatically un-abbreviating street name abbreviations, for offering the right keyboard layout when inputting names etc.)
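As a minimal sketch of the housenumber item in the list above: a per-region lookup of regular expressions, keyed by country code. The patterns and the names `HOUSENUMBER_PATTERNS` and `seems_valid` are hypothetical and are not StreetComplete's actual rules; in practice the country code would come from a boundaries lookup.

```python
import re

# Hypothetical patterns for illustration only, NOT StreetComplete's real rules.
HOUSENUMBER_PATTERNS = {
    # France: a number, optionally followed by "bis", "ter" or "quater".
    "FR": re.compile(r"\d+( ?(bis|ter|quater))?", re.IGNORECASE),
    # Fallback: a number with an optional single letter suffix.
    "default": re.compile(r"\d+[a-z]?", re.IGNORECASE),
}

def seems_valid(housenumber: str, country_code: str) -> bool:
    """Check a housenumber against the pattern for the given country."""
    pattern = HOUSENUMBER_PATTERNS.get(country_code, HOUSENUMBER_PATTERNS["default"])
    return pattern.fullmatch(housenumber.strip()) is not None

print(seems_valid("1 bis", "FR"))  # prints: True
print(seems_valid("1 bis", "DE"))  # prints: False (falls back to the default pattern)
print(seems_valid("12a", "DE"))    # prints: True
```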
Regarding size:
I think the size could be reduced quite a lot if each country were a relation of its boundaries, so that each boundary line is used by two countries. But to do this manually is not worth the effort. Perhaps export to TopoJSON and simplify, then import from TopoJSON(?)
comment:16 by , 7 years ago
Replying to osm@…:
If you have a specific use case, e.g. a custom map style or validation rule
My specific use case is actually that I use this file in my own OSM project, StreetComplete, to add location-based intelligence. So, basically the same reason as why the file exists for JOSM. With this patch, I just wanted to contribute the enhancements upstream.
Currently, I use the following data (perhaps relevant for JOSM) per country/region and use the boundaries file to determine which region it applies to:
- which speed unit is used (for preselecting whether to use mph or km/h)
- which sports are popular (for order of selectable sports for pitches)
- first day of workweek and number of regular shopping days (to preselect the right days when adding opening hours to a place)
- regex for determining if a housenumber seems valid (e.g. "1 bis" is valid in France etc.)
- list of languages sorted by officiality, importance (for detecting and automatically un-abbreviating street name abbreviations, for offering the right keyboard layout when inputting names etc.)
Sounds quite interesting, the additional data could be of some value. If not as part of the main distributed jar file, then possibly for a plugin or similar.
Regarding size:
I think the size could be reduced quite a lot if each country were a relation of its boundaries, so that each boundary line is used by two countries.
Removing this kind of verbatim duplication is what zip algorithms are good at. So in the end, this may not improve the (zipped) file size at all.
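A quick way to get a feel for this claim is to compress the same synthetic border once and twice and compare the results. This is only a sketch: the node lines are random fake data, `fake_border` is a hypothetical helper, and `zlib` is used here because it implements DEFLATE, the method normally used inside zip and jar archives.

```python
import random
import zlib

random.seed(0)  # make the synthetic data reproducible

def fake_border(n_points: int) -> bytes:
    """Generate n_points fake node lines with 5-decimal coordinates."""
    return "".join(
        f"<node lat='{random.uniform(-90, 90):.5f}' "
        f"lon='{random.uniform(-180, 180):.5f}'/>\n"
        for _ in range(n_points)
    ).encode()

border = fake_border(200)                # one shared border, a few kB of text
once = zlib.compress(border)             # border stored once
twice = zlib.compress(border + border)   # border duplicated verbatim

print(len(border), len(once), len(twice))
```

With the duplicate this close to the original, the second copy adds very little to the compressed size. One caveat: DEFLATE only looks back 32 KiB, so copies of a border that sit further apart in the file would not be matched this way.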
follow-up: 18 comment:17 by , 7 years ago
Resolution: | → fixed |
---|---|
Status: | assigned → closed |
In 12430/josm:
comment:18 by , 7 years ago
The patch is merged with some minor modifications described above. Thanks for this submission :) Can you please tell me how you want to be credited? I didn't include your e-mail address as it would have become public information.
In the next few days I'll see how to integrate my patch that allows JOSM to keep primitive IDs. This way, future patches to this file will be easier to review.
by , 7 years ago
Attachment: | josm_keep_ids.patch added |
---|
comment:19 by , 7 years ago
You can use my real name (Tobias Zwick) or alias (westnordost), but credit is not required.
full new file