Modify

Opened 19 months ago

Closed 19 months ago

Last modified 19 months ago

#18368 closed defect (fixed)

Combine way takes ages (I stopped after 7 minutes)

Reported by: stoecker Owned by: GerdP
Priority: normal Milestone: 19.12
Component: Core Version:
Keywords: Cc:

Description

Current SVN: 15542

With attached file joining the two ways takes extremely long (if it ever finish). Shorting the ways allowed joining (also with the overlap).

Attachments (6)

Brandenburg.osm (1.9 MB) - added by stoecker 19 months ago.
brandenburg-simple.osm (2.0 KB) - added by GerdP 19 months ago.
corrected simplified sample, first version contained an artifact
18368.patch (2.3 KB) - added by GerdP 19 months ago.
sample.osm (1.3 KB) - added by GerdP 19 months ago.
18368-v2.patch (5.8 KB) - added by GerdP 19 months ago.
Improve performance of NodeGraph.buildSpanningPath(). Remaining sow case: When the graph is disconnected it might still search through all nodes, e.g. when you add an unconnected closed way with a few nodes to Brandenburg.osm search and try to connect all three ways. Will not take foreever but maybe a minute or so to find out no result. I think this case should better be detected in the calling method.
18368-v3b.patch (7.4 KB) - added by GerdP 19 months ago.
further improvements: detect unconnected graph, only iterate over those nodes which can be start nodes in a spanning way. This works in fractions of a second for all cases I can think of, even with > 40000 nodes.

Download all attachments as: .zip

Change History (22)

Changed 19 months ago by stoecker

Attachment: Brandenburg.osm added

comment:1 Changed 19 months ago by GerdP

Owner: changed from team to GerdP
Status: newassigned

Problem seems to be in org.openstreetmap.josm.data.osm.NodeGraph.buildSpanningPath()
Not sure why it is called in this situation.

comment:2 Changed 19 months ago by GerdP

There is a comment in org.openstreetmap.josm.data.osm.NodeGraph.buildSpanningPath()
"In the worst case this loops over all nodes which is very slow for large ways."
Very true ;)
My first assumption was that CombineWayAction would simply add the nodes of one way to the nodes of the other, similar to the method that is used to calculate the rings of multipolygons.
Instead, it seems to search for "good" first point in the combined way. I am still trying to understand the reason for this,
the code is very old.

@Dirk: Do you think that the result with my simplified example is correct?

Changed 19 months ago by GerdP

Attachment: brandenburg-simple.osm added

corrected simplified sample, first version contained an artifact

Changed 19 months ago by GerdP

Attachment: 18368.patch added

comment:3 Changed 19 months ago by GerdP

The problem with the existing algorithm is that it does a brute force search for the best node to start with when there is no "terminal node" like in a c-shaped or d-shaped way.
It starts with node 2684142672 and adds nodes in clockwise order until it detects a problem when it comes to node 176872704 for the 2nd time. This "problem detection" in buildSpanningPath() is a rather complex operation, I think it is at least O(n²) and it uses a synchronized Stack. Instead of starting a 2nd try with this problem node as start it uses the node 1740733337 and fails again at the same place after almost the same time. This is done for thousands of nodes and each call takes very long as we have a O(n³) algo now.
I think it is obvious that we should start with a node that appears most often in the given edges when there is no "terminal node". The patch implements this.
The search for the start node is much faster than the problem detection. On my PC I see < 30 ms for the additional search and ~1500 to 3500 ms for the (single) call of buildSpanningPath() to compute the result.
Timings heavily depend on the order in which you select the ways before combining them and the JRE just-in-time-compiler.

BTW: Validator complains that the combined way is self-intersecting. I am unsure if this warning is correct or not.

Last edited 19 months ago by GerdP (previous) (diff)

comment:4 Changed 19 months ago by stoecker

BTW: Validator complains that the combined way is self-intersecting. I am unsure if this warning is correct or not.

That's correct. The inner hole and the outer bounds are connected via an overlapping way (i.e. polygon with inner hole without multipolygon - very old style geometry, see #18367, that's how such things were drawn in the early days of OSM :-).

Leaving your fix aside: Why do we do such a complex algorithm at all? In the past we simply looked at the ends of the ways and joined them when two nodes are identical. That probably had issues when joining many ways, but in simple cases it was flawless.

My proposal (beside your fix) would be to reintroduce this for simply cases like only two ways. In maximum this are 4 node identity checks. This would also fix #18367.

comment:5 in reply to:  4 Changed 19 months ago by GerdP

Replying to stoecker:

BTW: Validator complains that the combined way is self-intersecting. I am unsure if this warning is correct or not.

That's correct. The inner hole and the outer bounds are connected via an overlapping way

No, it is not. The resulting way has no overlapping segments.

comment:6 Changed 19 months ago by stoecker

Well, then this is wrong. Combine Way action should not remove a section of the way. This is the task of other actions, but combine way only should combine ways.

comment:7 Changed 19 months ago by GerdP

That was also my thought. So why do we have this poorly performing method buildSpanningPath()?

comment:8 Changed 19 months ago by stoecker

No idea. Why/When was it introduced?

comment:9 Changed 19 months ago by GerdP

A big change in r2070. I did not find an explanation why it was introduced, unless #3208 was the reason.

comment:10 Changed 19 months ago by GerdP

In my eyes the methods in NodeGraph are simply a bad choice when we want to combine ways. Very confusing...

Changed 19 months ago by GerdP

Attachment: sample.osm added

comment:11 Changed 19 months ago by GerdP

I think the attached sample.osm explains why buildSpanningPath() was introduced. If we simply connect the two given ways we get two self-intersections. The implemented algo splits and reassembles the ways so that the result is not self-intersecting although validator says that it is.
This opens two questions:

  • Should we keep this behaviour in CombineWayAction ?
  • Why do we ignore p-shaped ways in SelfIntersectingWay but not B-shaped or 8-shaped ways?

comment:12 Changed 19 months ago by GerdP

Oops, sorry, seems I was all wrong. The implemented algo doesn't prevent the self-intersections.
In SelfIntersectingWay we check this: "all nodes of a way must appear only once, only the first and/or last node are allowed to appear exactly two times". This allows closed ways and b-shaped ways, but not B-shaped or 8-shaped ways.
This was discussed in #14202. I think it is just a compromise that suits the usage inOSM , not exactly right or wrong.

Method buildSpanningPath() calculates a path that is accepted by the OverlappingWays test, means, the path doesn' contain a segment twice. I see no reason to use this in CombineWayAction and I agree with Dirk that the result for the attached brandenburg-simple.osm is wrong as it silently removes a part of one of the ways.

Method buildSpanningPath() is also used in ParallelWayAction, and I also see no need to use it there.
I think org.openstreetmap.josm.plugins.opendata.core.datasets.WayCombiner is another such case.

IIGTR only the plugin merge-overlap uses and requires exactly what buildSpanningPath() does now.

So, I think I should

  • create a new simple method to combine ways which just connects end points of the ways and - maybe - reverses the order where needed, but without removing edges or recombining parts and use that new method to replace the wrong usage of buildSpanningPath()
  • improve performance of buildSpanningPath() as it might help the merge-overlap plugin
Last edited 19 months ago by GerdP (previous) (diff)

Changed 19 months ago by GerdP

Attachment: 18368-v2.patch added

Improve performance of NodeGraph.buildSpanningPath(). Remaining sow case: When the graph is disconnected it might still search through all nodes, e.g. when you add an unconnected closed way with a few nodes to Brandenburg.osm search and try to connect all three ways. Will not take foreever but maybe a minute or so to find out no result. I think this case should better be detected in the calling method.

Changed 19 months ago by GerdP

Attachment: 18368-v3b.patch added

further improvements: detect unconnected graph, only iterate over those nodes which can be start nodes in a spanning way. This works in fractions of a second for all cases I can think of, even with > 40000 nodes.

comment:13 Changed 19 months ago by GerdP

I plan to commit patch 18368-v3b.patch after the release of the new tested version.

comment:14 Changed 19 months ago by GerdP

In 15554/josm:

see #18368: Drastically improve performance of NodeGraph.buildSpanningPath() used in combine way 'C' and parallel ways map mode
In special cases the old code was extremely slow, for example when you try to connect the ways of a complex multipolygon containing inner rings
which always should fail.

  • detect unconnected graph
  • use extra HashSet for check if a NodePair is already in the path to avoid sequential search
  • avoid to test start points which cannot be the start point in a "spanning path"

comment:15 Changed 19 months ago by GerdP

Resolution: fixed
Status: assignedclosed

Performance problem is solved. I've opened #18385 for the wrong result.

comment:16 Changed 19 months ago by GerdP

Milestone: 19.12

Modify Ticket

Change Properties
Set your email in Preferences
Action
as closed The owner will remain GerdP.
as The resolution will be set.
The resolution will be deleted.

Add Comment


E-mail address and name can be saved in the Preferences.

 
Note: See TracTickets for help on using tickets.