Opened 8 years ago
Closed 4 years ago
#12303 closed enhancement (fixed)
[Patch] When downloading objects with Overpass API, use recurse up to fetch referrers
Reported by: | simon04 | Owned by: | simon04 |
---|---|---|---|
Priority: | normal | Milestone: | 20.06 |
Component: | Core | Version: | |
Keywords: | mirrored_download overpass | Cc: | Penegal, tyr_asd |
Description
Since #7670, "Download object" can use the Overpass API instead of the OSM API. Since the Overpass API supports the recurse up operator <;
, we could make use of the for fetching referrers of the object.
Launching "Download object" with n288207873
and "Download referrers" as well as "Download relation members" set, we have a bunch of HTTP requests:
INFO: GET https://api.openstreetmap.org/api/0.6/nodes?nodes=288207873 -> 200 INFO: GET https://api.openstreetmap.org/api/0.6/node/288207873/ways -> 200 (943 B) INFO: Way 26,309,072 with 85 nodes has incomplete nodes because at least one node was missing in the loaded data. INFO: Way 25,648,174 with 13 nodes has incomplete nodes because at least one node was missing in the loaded data. INFO: GET https://api.openstreetmap.org/api/0.6/node/288207873/relations -> 200 (184 B) INFO: GET https://api.openstreetmap.org/api/0.6/way/26309072/full -> 200 INFO: GET https://api.openstreetmap.org/api/0.6/way/25648174/full -> 200 INFO: GET https://api.openstreetmap.org/api/0.6/nodes?nodes=288207917,288207914,1088131027,288207911,2939204923,288207910,278027141,288207924,288207920,1088131320,1088131071,288207882,2217273857,2217273859,1088131312,288207872,1318297762,1088131176,391657967,1318297764,1088130924,1088131040,288207853,2217262568,1088131099,288207852,2217273833,279583979,288207855,288207854,2217273835,2217273837,288207851,2217273839,288207845,288207844,279583971,288207847,288207846,2217062382,288977653,288207841,288207840,2217262566,288207843,288977655,288207842,2217273831,2217062378,415506827,288207869,288207868,288207871,288207870,288207865,288207864,372888428,530245199,288207867,2217273854,372888431,288207866,288207861,2217273840,279583986,288207860,288207863,288207862,2217273843,288207857,288207856,2217273845,288207859,288207858,2217273847,2217262538,1088131007,279583950,279583951,2217262531,2217262533,1088131254,2217262534,288207939,1088130986,288207839,2217262556,279583964,2217262544,2217262547,2217262548,2217262549,279583959,2217262551,1088130980 -> 200
Ideally, we could strip this down to one Overpass call. The question is just how to combine node(288207873)
and >;
and <;
in order to accomplish this. Help is welcome … :)
Attachments (10)
Change History (40)
comment:1 by , 8 years ago
Milestone: | 16.02 → 16.03 |
---|
comment:2 by , 8 years ago
Milestone: | 16.03 → 16.04 |
---|
comment:3 by , 8 years ago
Milestone: | 16.04 → 16.05 |
---|
comment:4 by , 8 years ago
Milestone: | 16.05 → 16.06 |
---|
comment:5 by , 8 years ago
Keywords: | overpass added |
---|---|
Milestone: | 16.06 → 16.07 |
comment:6 by , 8 years ago
Milestone: | 16.07 → 16.08 |
---|
comment:7 by , 8 years ago
Milestone: | 16.08 |
---|
comment:8 by , 6 years ago
This probably can in fact now be reduced to one Overpass call using the 'complete' block statement, new to Overpass API 0.7.55, which does the recursion for referrers on the server side. Using the right input set and most probably a loop limit of 1 (simply enough to get parents of the newly discovered elements that aren't part of the original union query)...
https://wiki.openstreetmap.org/wiki/Overpass_API/Overpass_QL#The_block_statement_complete
comment:10 by , 5 years ago
Cc: | added |
---|
comment:11 by , 4 years ago
Owner: | changed from | to
---|---|
Status: | new → assigned |
comment:12 by , 4 years ago
The corresponding single overpass query for a list of objects could look like this
( node(id:70674932,...); way(id:29201179,...) -> .w; rel(id:69202,...) -> .r; ) -> .i; // .i : the input set ( .i;rel(br); // if any relations are in the input list .i;rel(bw); // if any ways ... .i;rel(bn);(.i;way(bn);>;); // if any nodes ... ) -> .p; // .p contains the parents and the input set (.p; - (.w;.r;);); // subtract relations and ways that were in the input set out meta;
The actual query depends on the existence of the elements. If the initial list doesn't contain relations some parts of the query would be omitted.
The result of this query should be the same as the sequnce of API requests done by the current code.
comment:13 by , 4 years ago
Ah, sorry, forget the above. The subtract clause doesn't work. This was a last minute change. I try to solve the problem that I sometimes get more data then wanted.
by , 4 years ago
Attachment: | 12303.patch added |
---|
comment:15 by , 4 years ago
Owner: | changed from | to
---|---|
Status: | assigned → needinfo |
Summary: | When downloading objects with Overpass API, use recurse up to fetch referrers → [Patch] When downloading objects with Overpass API, use recurse up to fetch referrers |
The patch implements the retrival of parents via overpass api if that overpass api is enabled. This is faster when many objects are requested but typically much slower for a single object.
Not sure if I should change the code so that it doesn't use overpass for less than - say - 5 objects?
comment:16 by , 4 years ago
At least, leave overpass as fallback if OSM-Server is not responding/down.
follow-up: 18 comment:17 by , 4 years ago
Do you mean to use overpass if OSM server produced a timeout while requesting parents for a small number of objects?
comment:18 by , 4 years ago
Replying to GerdP:
Do you mean to use overpass if OSM server produced a timeout while requesting parents for a small number of objects?
Yes, if you set it to only use OSM server for a small number.
follow-up: 22 comment:19 by , 4 years ago
By default, download.overpass.for-multi-fetch
is false and thus the new way of fetching is inactive.
For the example from the ticket description, still two requests are made:
2020-02-11 21:27:12.479 INFO: POST https://overpass-api.de/api/interpreter (25 B) ... 2020-02-11 21:27:12.479 TRACE: BODY: node(288207873);out meta; 2020-02-11 21:27:13.614 INFO: POST https://overpass-api.de/api/interpreter -> HTTP/1.1 200 (976 ms; 362 B) 2020-02-11 21:27:13.619 INFO: POST https://overpass-api.de/api/interpreter (85 B) ... 2020-02-11 21:27:13.619 TRACE: BODY: (node(288207873)->.n;.n;rel(bn);.n;way(bn);)->.pn;((.pn;);node(w);node(r););out meta; 2020-02-11 21:27:14.950 INFO: POST https://overpass-api.de/api/interpreter -> HTTP/1.1 200 (1.2 s; 2.93 kB)
Concerning the implementation, you could simplify genOverpassQuery
by using some Java 8 features:
TreeMap<OsmPrimitiveType, Set<PrimitiveId>> primitivesMap = children.stream() .collect(Collectors.groupingBy(PrimitiveId::getType, TreeMap::new, Collectors.toSet()));
The switch cases CLOSEDWAY
and MULTIPOLYGON
are irrelevant.
follow-up: 21 comment:20 by , 4 years ago
Yes, there is still a lot to improve. I think we have too many different classes for the download tasks.
reg. Java 8 features: I think many of them are almost unreadable, and this is one of them. I only understand what it does because I know what it should replace.
reg. switch: I think I added them only to calm down SonarLint. I know they are not needed.
comment:21 by , 4 years ago
Replying to GerdP:
reg. Java 8 features: I think many of them are almost unreadable, and this is one of them.
For me it's the opposite: once getting used to the effect of the various terminal operation of streams, expressions involving streams are much more concise (to read) and better convey the programmer's intent. I think the expression Collectors.groupingBy(PrimitiveId::getType)
is a good example for my point.
comment:22 by , 4 years ago
Replying to simon04:
By default,
download.overpass.for-multi-fetch
is false and thus the new way of fetching is inactive.
For the example from the ticket description, still two requests are made:
2020-02-11 21:27:12.479 INFO: POST https://overpass-api.de/api/interpreter (25 B) ... 2020-02-11 21:27:12.479 TRACE: BODY: node(288207873);out meta; 2020-02-11 21:27:13.614 INFO: POST https://overpass-api.de/api/interpreter -> HTTP/1.1 200 (976 ms; 362 B) 2020-02-11 21:27:13.619 INFO: POST https://overpass-api.de/api/interpreter (85 B) ... 2020-02-11 21:27:13.619 TRACE: BODY: (node(288207873)->.n;.n;rel(bn);.n;way(bn);)->.pn;((.pn;);node(w);node(r););out meta; 2020-02-11 21:27:14.950 INFO: POST https://overpass-api.de/api/interpreter -> HTTP/1.1 200 (1.2 s; 2.93 kB)
As a next step I'll move all code to generate overpass queries to static methods in MultiFetchOverpassObjectReader
.
comment:23 by , 4 years ago
The patch implements the static method, but I don't see yet where to use it.
For the example above the generated query looks like this:
node(288207873)->.n;.n;way(bn)->.wn;.n;rel(bn)->.rn;((.n;.wn;.rn;);node(w););out meta;
Performance isn't great. There are so many ways to get the wanted result and I am not yet sure which one performs best...
by , 4 years ago
Attachment: | 12303.3.patch added |
---|
comment:24 by , 4 years ago
12303.3.patch generates this which seems faster:
node(288207873)->.n;.n;way(bn)->.wn;.n;rel(bn)->.rn;(.wn;node(w);.rn;);out meta;
The possible query is only logged for now, see comment:23
by , 4 years ago
Attachment: | 12303.4.patch added |
---|
comment:25 by , 4 years ago
With the new patch a single download is performed when overpass api is enabled.
@Simon04: Please let me know what you think about the new class DownloadFromOverpassTask
. I'd rather prefer to reduce the number of DownloadXYZTask classes but I don't fully understand the threading concept.
There is more to do to handle the fact the overpass doesn't return deleted objects.
Try to download deleted node 1212 with the different APIs.
comment:26 by , 4 years ago
Milestone: | → 20.06 |
---|
comment:27 by , 4 years ago
The patch was written before I reverted the full recursive download of child relations for the normal API. See #18835
So, this needs more work.
by , 4 years ago
Attachment: | 12303.5.patch added |
---|
updated patch, doesn't do a full recursive download of children
by , 4 years ago
Attachment: | missing-1212.PNG added |
---|
by , 4 years ago
Attachment: | deleted-1212.PNG added |
---|
comment:28 by , 4 years ago
Up to now the overpass api doesn't allow to retrieve deleted objects. We simply get no data. Current code first showss this
(wrong) message when you try to download node 1212 from overpass:
sometimes followed by another message
I've not yet found out why the 2nd message appears only sometimes.
The big problem is that we cannot distinguish between a node that was deleted and one which never existed. If you try to download a node 1234567891234 you get a similar response from overpass.
by , 4 years ago
Attachment: | 12303.6.patch added |
---|
by , 4 years ago
Attachment: | missing-1212-overpas.PNG added |
---|
comment:29 by , 4 years ago
I've not yet found out why the 2nd message appears only sometimes.
My bad. Happened only when the layer already contained the data about the deleted node from a previous download via OSM API. The new patch handles this better.
New popup message looks like this when overpass is used:
New i18n strings for this:
trn("The server did not return data for the requested object, it was either deleted or does not exist.", "The server did not return data for the requested objects, they were either deleted or do not exist.", errs.size());
If both parents and members are downloaded the generated overpass query can be very complex even if only one object is downloaded.
relation(2732622)->.r;.r;rel(br)->.pr;.r;rel(r)->.rm;(.r;.pr;.r;>;.rm;);out meta;
Maybe one of the overpass devs could review them?
Milestone renamed