Modify

Opened 3 years ago

Closed 3 years ago

Last modified 3 years ago

#13361 closed enhancement (fixed)

[Patch] What is the bbox of an incomplete OSM primitive?

Reported by: GerdP Owned by: team
Priority: normal Milestone: 16.12
Component: Core Version:
Keywords: Cc: bastiK, simon04, michael2402

Description (last modified by Don-vip)

I stumbled over this while analysing performance problems in the QuadBuckets.search(BBox searchBbox) method.

I noticed that incomplete primitives have quite confusing bboxes:
1) an incomplete node has
xmin = xmax = ymin = ymax = 0;
2) an incomplete way has

xmin = Double.POSITIVE_INFINITY
xmax = Double.NEGATIVE_INFINITY
ymin = Double.POSITIVE_INFINITY
ymax = Double.NEGATIVE_INFINITY

if all nodes are incomplete, else it will be the bbox of the complete node(s).
3) a relation with eg. just two node members where one is complete at (10,40) and the other is incomplete will span a huge area from

xmin = 0
xmax = 40
ymin = 0
ymax = 10

I would expect that the bbox of a relation is only calculated using the complete nodes.

I'd also prefer when the bbox of incomplete nodes would not be located at (0,0), but I assume that there are good reasons for this.

Attachments (6)

13361_v1.patch (1.9 KB) - added by GerdP 3 years ago.
13361_v2.patch (22.3 KB) - added by GerdP 3 years ago.
13361_v3.patch (26.6 KB) - added by GerdP 3 years ago.
13361_v4.patch (26.4 KB) - added by GerdP 3 years ago.
13361_v6.patch (26.2 KB) - added by GerdP 3 years ago.
based on r11227, no further changes
13361_v7.patch (25.3 KB) - added by GerdP 3 years ago.
based on r11237, no further changes

Download all attachments as: .zip

Change History (41)

Changed 3 years ago by GerdP

Attachment: 13361_v1.patch added

comment:1 Changed 3 years ago by GerdP

Summary: What is the bbox of an incomplete OSM primitive?[Patch] What is the bbox of an incomplete OSM primitive?

Attached patch 13361_v1.patch will fix the bbox calculation for relations.
It should speed up rendering when you have downloaded an area and zoom in
as fewer relations are selected.

comment:2 Changed 3 years ago by Don-vip

Description: modified (diff)

comment:3 Changed 3 years ago by Don-vip

Cc: bastiK simon04 michael2402 added

@team, what do you think?

comment:4 Changed 3 years ago by michael2402

I would prefer a more generic approach:

  • An empty BBox is represented by xmin=+∞, xmax=-∞, ymin=+∞, ymax=-∞
  • isValid checks (xmin < xmax && ymin < ymax)
  • isInWorld checks !(xmin < -180.0 || xmax > 180.0 || ymin < -90.0 || ymax > 90.0)
  • add(LatLon) extends if the parameter is not null.
  • add(BBox) extends if the parameter isValid.
  • add(double, double) extends if the parameters are not NaN.
  • The empty constructor creates an invalid bbox
  • Adding a point to a invalid box makes that box exactly cover that point (we don't need to care for it)

We can then add a new, protected method to the OsmPrimitive: addToBBox

The generic bbox code would be something like:

BBox box = new BBox();
this.addToBBox(box, new HashSet<>()); // <- Replace with null if we use it for nodes/ways, they ignore second parameter
return box;

Relation would implement:

protected void addToBBox(Box box, Set<PrimitiveId> visited) {
    for (RelationMember rm : members) {
        if (visited.add(rm.getId())
            rm.getMember().addToBBox(box, visited);
    }
}

Way:

protected void addToBBox(Box box, Set<PrimitiveId> visited) {
    for (Node n : nodes) {
         n.addToBBox(box, visited);
    }
}

Node:

protected void addToBBox(Box box, Set<PrimitiveId> visited) {
    box.add(lat, lon);
}

This would:

  • Return an invalid BBox if no known members/nodes exists
  • Otherwise, return a BBox around all valid child nodes

BTW, we should replace x/y with lat/lon and then simply add getMaxLat methods, since x/y and up/down depends on the projection

comment:5 Changed 3 years ago by GerdP

Sounds good to me. One more open question: Does it make sense to add objects with an invalid bbox to a QuadBuckets
object? I found no source which accesses QuadBuckets other than with search(BBox searchBBox), and objects with
an invalid BBox are not found with this method.

comment:6 Changed 3 years ago by michael2402

I don't even know where to add them since invalid bounding boxes cannot lie inside other boxes.

I think we should define a BBox by the points that are contained by it. If you call contains(p) on an invalid BBox, it should return false.

contains(bbox) is more difficult: It should return true if all points covered by the bbox are in this box. Invalid bboxes of 0 are thus in every other bbox. We might add a special case handling here.

The idea behind a QuadBucket is to store each object in the bucket that contains it. So we would not even know where to put them. This is why I would vote for leaving them out of the QuadBucket tree (or adding them to an extra invalid list if required)

comment:7 Changed 3 years ago by GerdP

The current behaviour is that objects with invalid bbox are stored in the root of the QuadBuckets .
This list can grow to thousands of elements and is searched sequentially for each search action.
The objects with bbox(0,0,0,0) are stored somewhere else, maybe also causing trouble.
I'll have a look at that for class UnconnectedWays later.

comment:8 Changed 3 years ago by michael2402

UnconnectedWays is a validator test. It is useless on ways without a known position, so there should not be any problems with it.

comment:9 in reply to:  8 Changed 3 years ago by GerdP

Replying to michael2402:

UnconnectedWays is a validator test. It is useless on ways without a known position, so there should not be any problems with it.

Yes, forget UnconnectedWays. The problem should only appear with the search cals in class DataSet.

comment:10 Changed 3 years ago by GerdP

Hmm, there is quite a lot of code that might not work when we change the BBox class that much. I don't dare to do this
because I am not very familar with the code.
Besides that I was wrong with the statement that data from QuadBuckets is only retrieved via the search() method,
at least the copy constructor DataSet(DataSet copyFrom) uses the iterator.
So, for now I think the patch regarding Relation is good, but any other change should be done with great care.
I'll check now if QuadBuckets might profit when we store elements with invalid bboxes in a simple collection
so that they can be ignored by the 'search()' method while still available for the iterator.

Changed 3 years ago by GerdP

Attachment: 13361_v2.patch added

comment:11 Changed 3 years ago by GerdP

13361_v2.patch is my attempt to solve the problem. I do not understand the need for the addtoBBox code,
I prefer to have BBox.add(something) methods as this is closer to Rectangle2D.

The QuadBuckets class now uses an additonal Set to store the primitives with invalid BBox. I searched through
the code of all plugins to make sure that they don't depend on the trick (incomplete node stored at (0,0).

I've also added some tests and JavaDoc.
The change in Dataset should be revieved. If I got that right it makes no sense to call
primitive.updatePosition() before primitive.setDataset(), so that part is rather independent from the rest.

comment:12 Changed 3 years ago by Don-vip

Milestone: 16.09

comment:13 in reply to:  11 Changed 3 years ago by michael2402

Replying to Gerd Petermann <gpetermann_muenchen@…>:

I do not understand the need for the addtoBBox code

Visitor pattern ;-). It would make the code for finding the bbox of a member not depend on the instance of checks. But it does not change any functionality.

The QuadBuckets class now uses an additonal Set to store the primitives with invalid BBox. I searched through
the code of all plugins to make sure that they don't depend on the trick (incomplete node stored at (0,0).

If plugins had depended on this undocumented and obviously wrong behavior, we should fix it any way ;-)

I've also added some tests and JavaDoc.
The change in Dataset should be revieved. If I got that right it makes no sense to call
primitive.updatePosition() before primitive.setDataset(), so that part is rather independent from the rest.

Some Notes:

  • The BBox(final double x, final double y) and BBox(double ax, double ay, double bx, double by) constructor uses (0,0,0,0) on default instead of invalid.
  • The javadoc for BBox(double ax, double ay, double bx, double by) you can add: smallest possible
  • setInvalid: We prefer to have one assignment per line
  • isValid: You can write this shorter: return !(xmin > xmax || ymin > ymax). or return xmin <= xmax && ymin <= ymax
  • isInWorld: same here
  • invalidBBoxPrimitives: You can make this field final and then simply call invalidBBoxPrimitives.clear() to reset it.
  • WayTest: Move the speed test to Performance. You can find a timer in the performance test utils ;-). It also contains methods that will run the GC before each test run and compute the median time to make the result more reliable.

But those are only minor code style remarks, in general your patch looks really nice and I like the functionality it adds to JOSM ;-)

To your change in DataSet: It should not be a problem. You can add a test that allPrimitives.add returns true and throw an expection if it does not. This should never happen though (see first check)

comment:14 Changed 3 years ago by GerdP

Thanks for the quick reponse.
I'll post a new version tomorrow to fix the problems.
Reg. clear(): I first used an ArrayList() for the invalidBBoxPrimitives and that would not free the memory
for the array with clear(). I guess with the LinkedHashSetthe clear() would be okay. BTW,
I used a LinkedHashSet here to make the results predictable.
Reg. the visitor pattern: Your code would do all the calculations each time and not use a saved bbox.
Is that intended?

comment:15 in reply to:  14 Changed 3 years ago by michael2402

Replying to Gerd Petermann <gpetermann_muenchen@…>:

Reg. the visitor pattern: Your code would do all the calculations each time and not use a saved bbox.
Is that intended?

Not neccessarely. You can still use the bbox cache and then return the cached bbox object. It is just a structural change to move the contents of the if-Blocks out to the Node/Way/Relation classes.

Changed 3 years ago by GerdP

Attachment: 13361_v3.patch added

comment:16 Changed 3 years ago by GerdP

OK, with v3 I've fixed the problems and implemented the addToBBox() methods and used them.

Two open questions remain:
1) What should happen if the caller of a BBox constructor passes invalid data, e.g. with BBox(LatLon a, LatLon b). You can create a LatLon instance with nonsense like new LatLon(1234,54321).
Of course this is extreme, but for objects near lon=180 it is likely that we will see a value > 180 in some cases
where callers try to calculate a search bbox around the object, same with -180 etc.
2) The sanity() method doesn't make sure that the bbox is valid, see b5 in BBoxTest.testLatLonConstructor().
It makes sure that e.g. xmax is <= 180, but a value < -180 is not changed.

comment:17 Changed 3 years ago by GerdP

Forgot to mention that I removed the speedTest. I used it to compare some different implementations, I think it is no longer useful.

comment:18 in reply to:  16 Changed 3 years ago by michael2402

Replying to Gerd Petermann <gpetermann_muenchen@…>:

OK, with v3 I've fixed the problems and implemented the addToBBox() methods and used them.

Two open questions remain:
1) What should happen if the caller of a BBox constructor passes invalid data, e.g. with BBox(LatLon a, LatLon b). You can create a LatLon instance with nonsense like new LatLon(1234,54321).
Of course this is extreme, but for objects near lon=180 it is likely that we will see a value > 180 in some cases
where callers try to calculate a search bbox around the object, same with -180 etc.

This may happen quite often. The map view code may do it because it creates a BBox for the whole view. If you zoom out enough it is quite big.

If you feel the need for it, you can throw an IllegalArgumentException but in this case it is perfectly fine to clamp the coordinates. LatLon is not clamped as well, so you may get LatLon objects that are out of range.

2) The sanity() method doesn't make sure that the bbox is valid, see b5 in BBoxTest.testLatLonConstructor().
It makes sure that e.g. xmax is <= 180, but a value < -180 is not changed.

Feel free to define it the way it should be and test that, too ;-). It was probably added because of some "bug" where xmax was > 180. xmax < -180 is not possible for map view bounds at the moment. I added a Utils.clamp method some months ago, it may ease your work there ;-).

BBox(Node n) is broken. For a node at (1,2), it returns a BBox of: 0..1, 0..2

What you can do is to simply set minx/maxx/... to the invalid defaults. Then simply set or add the coordinates in the constructor. The JIT will to enough inlining so that this is pretty fast ;-). It will even remove the default assignments if it notices that the variable will be assigned to an other value later.

Changed 3 years ago by GerdP

Attachment: 13361_v4.patch added

comment:19 Changed 3 years ago by GerdP

BBox(Node n): Argh, yes, I agree that the setInvalid() code is error prone, I thought it might safe some CPU cycles but I will revert to the original code.

Regarding sanity(): The question is in what case the sanity() code will improve anything.
I think most often the BBox class is used to calculate a bbox for QuadBuckets.search() and I see no problem when
that search bbox overlaps planet.
I would not want to see such a bbox as a bounds tag in an OSM file or in a log message, but in other cases I see no problem.
In org.openstreetmap.josm.data.Bounds I don't find such a check, which makes me wonder why we need it in BBox
Anyway, I've left the call where it was before and added the Utils.clamp in sanity().
See v4.

comment:20 in reply to:  19 Changed 3 years ago by michael2402

Replying to Gerd Petermann <gpetermann_muenchen@…>:

BBox(Node n): Argh, yes, I agree that the setInvalid() code is error prone, I thought it might safe some CPU cycles but I will revert to the original code.

Those are writes to fields that are in the L1-Cache at that moment. Don't worry about it ;-)

Regarding sanity(): The question is in what case the sanity() code will improve anything.
I think most often the BBox class is used to calculate a bbox for QuadBuckets.search() and I see no problem when
that search bbox overlaps planet.
I would not want to see such a bbox as a bounds tag in an OSM file or in a log message, but in other cases I see no problem.
In org.openstreetmap.josm.data.Bounds I don't find such a check, which makes me wonder why we need it in BBox
Anyway, I've left the call where it was before and added the Utils.clamp in sanity().
See v4.

In my point of view, we can remove the sanity check completely. It would make the behavior more predictable. It would be best if we were able to merge Bounds and BBox in the long run - they mostly differ in the meridian handling (Bounds can span across it). But other than that, they serve a similar purpose.

comment:21 Changed 3 years ago by GerdP

I agree reg. removal of sanity. Feel free to change it that way.
Reg. Bounds: I was surprised when I found that this also exists, but I did not yet try to find out why.

comment:22 Changed 3 years ago by michael2402

The reasons are historical. We have a lot of duplicate code in the both creation. BBox is used for quad tree lookups while Bounds is used in most other code, like imaging and map paint.

Bounds cannot be invalid and are rounded to the OSM server precision on default.

We have the same for EastNorth coordinates: ProjectionBounds (which is something completely different). We have a BoundingXYVisitor for to compute the East/North bounds of an object.
A lot of JOSM code does some bounding box computation on it's own...

comment:23 Changed 3 years ago by simon04

Milestone: 16.0916.10

comment:24 Changed 3 years ago by simon04

Milestone: 16.1016.11

Milestone renamed

Changed 3 years ago by GerdP

Attachment: 13361_v6.patch added

based on r11227, no further changes

comment:25 Changed 3 years ago by bastiK

michael2402, what is the state of this ticket? It looks like worthwhile stuff, can we get it committed this milestone?

comment:26 Changed 3 years ago by michael2402

I have no objections, core should be fine. If there are plugins with issues we will notice it eventually ;-).

comment:27 Changed 3 years ago by bastiK

You are the most suitable person to apply the patch ... :-)

comment:28 Changed 3 years ago by michael2402

I'll do it this evening / tomorrow ;-)

Changed 3 years ago by GerdP

Attachment: 13361_v7.patch added

based on r11237, no further changes

comment:29 Changed 3 years ago by michael2402

Resolution: fixed
Status: newclosed

In 11269/josm:

Fix #13361: Use a more consistent invalid bbox for primitives.

Patch by Gerd Petermann

comment:30 Changed 3 years ago by michael2402

In 11271/josm:

See #13361: Add WayTest unit tests. Patch by Gerd Petermann.

comment:31 Changed 3 years ago by michael2402

In 11272/josm:

See #13361: Cleanup code and formatting

  • Add javadoc to public methods
  • Add @since tag to newer methods
  • Do not use if (...) return true; else return false;
  • Remove parentheses if they are not required
  • Use blocks for if-statements which is more consistent with most of JOSM code

comment:32 Changed 3 years ago by michael2402

The patch seems to make things better. I cleaned it up a bit in a separate revision so that everyone can blame me if I broke something.

I'm sorry I did not find time earlier this week.

comment:33 Changed 3 years ago by Don-vip

In 11276/josm:

see #13361 - fix unit test

comment:34 Changed 3 years ago by Don-vip

Milestone: 16.1116.12

Milestone renamed

comment:35 Changed 3 years ago by Don-vip

See #14186 for possible regression

Modify Ticket

Change Properties
Set your email in Preferences
Action
as closed The owner will remain team.
as The resolution will be set.
The resolution will be deleted.

Add Comment


E-mail address and name can be saved in the Preferences.

 
Note: See TracTickets for help on using tickets.