Opened 7 years ago

Closed 7 years ago

# [Patch] What is the bbox of an incomplete OSM primitive?

Reported by: Owned by: GerdP team normal 16.12 Core bastiK, simon04, michael2402

I stumbled over this while analysing performance problems in the QuadBuckets.search(BBox searchBbox) method.

I noticed that incomplete primitives have quite confusing bboxes:
1) an incomplete node has
xmin = xmax = ymin = ymax = 0;
2) an incomplete way has

xmin = Double.POSITIVE_INFINITY
xmax = Double.NEGATIVE_INFINITY
ymin = Double.POSITIVE_INFINITY
ymax = Double.NEGATIVE_INFINITY

if all nodes are incomplete, else it will be the bbox of the complete node(s).
3) a relation with eg. just two node members where one is complete at (10,40) and the other is incomplete will span a huge area from

xmin = 0
xmax = 40
ymin = 0
ymax = 10

I would expect that the bbox of a relation is only calculated using the complete nodes.

I'd also prefer when the bbox of incomplete nodes would not be located at (0,0), but I assume that there are good reasons for this.

### comment:1 Changed 7 years ago by GerdP

Summary: What is the bbox of an incomplete OSM primitive? → [Patch] What is the bbox of an incomplete OSM primitive?

Attached patch 13361_v1.patch will fix the bbox calculation for relations.
It should speed up rendering when you have downloaded an area and zoom in
as fewer relations are selected.

### comment:2 Changed 7 years ago by Don-vip

Description: modified (diff)

### comment:3 Changed 7 years ago by Don-vip

@team, what do you think?

### comment:4 Changed 7 years ago by michael2402

I would prefer a more generic approach:

• An empty BBox is represented by xmin=+∞, xmax=-∞, ymin=+∞, ymax=-∞
• `isValid` checks `(xmin < xmax && ymin < ymax)`
• `isInWorld` checks `!(xmin < -180.0 || xmax > 180.0 || ymin < -90.0 || ymax > 90.0)`
• `add(LatLon)` extends if the parameter is not null.
• `add(BBox)` extends if the parameter `isValid`.
• `add(double, double)` extends if the parameters are not `NaN`.
• The empty constructor creates an invalid bbox
• Adding a point to a invalid box makes that box exactly cover that point (we don't need to care for it)

We can then add a new, protected method to the OsmPrimitive: `addToBBox`

The generic bbox code would be something like:

```BBox box = new BBox();
this.addToBBox(box, new HashSet<>()); // <- Replace with null if we use it for nodes/ways, they ignore second parameter
return box;
```

Relation would implement:

```protected void addToBBox(Box box, Set<PrimitiveId> visited) {
for (RelationMember rm : members) {
}
}
```

Way:

```protected void addToBBox(Box box, Set<PrimitiveId> visited) {
for (Node n : nodes) {
}
}
```

Node:

```protected void addToBBox(Box box, Set<PrimitiveId> visited) {
}
```

This would:

• Return an invalid BBox if no known members/nodes exists
• Otherwise, return a BBox around all valid child nodes

BTW, we should replace x/y with lat/lon and then simply add `getMaxLat` methods, since x/y and up/down depends on the projection

### comment:5 Changed 7 years ago by GerdP

Sounds good to me. One more open question: Does it make sense to add objects with an invalid bbox to a QuadBuckets
object? I found no source which accesses QuadBuckets other than with search(BBox searchBBox), and objects with

### comment:6 Changed 7 years ago by michael2402

I don't even know where to add them since invalid bounding boxes cannot lie inside other boxes.

I think we should define a BBox by the points that are contained by it. If you call `contains(p)` on an invalid BBox, it should return false.

`contains(bbox)` is more difficult: It should return true if all points covered by the bbox are in this box. Invalid bboxes of 0 are thus in every other bbox. We might add a special case handling here.

The idea behind a `QuadBucket` is to store each object in the bucket that contains it. So we would not even know where to put them. This is why I would vote for leaving them out of the QuadBucket tree (or adding them to an extra invalid list if required)

### comment:7 Changed 7 years ago by GerdP

The current behaviour is that objects with invalid bbox are stored in the root of the `QuadBuckets` .
This list can grow to thousands of elements and is searched sequentially for each search action.
The objects with bbox(0,0,0,0) are stored somewhere else, maybe also causing trouble.
I'll have a look at that for class UnconnectedWays later.

### comment:8 follow-up:  9 Changed 7 years ago by michael2402

`UnconnectedWays` is a validator test. It is useless on ways without a known position, so there should not be any problems with it.

### comment:9 in reply to:  8 Changed 7 years ago by GerdP

`UnconnectedWays` is a validator test. It is useless on ways without a known position, so there should not be any problems with it.

Yes, forget `UnconnectedWays`. The problem should only appear with the `search` cals in class `DataSet`.

### comment:10 Changed 7 years ago by GerdP

Hmm, there is quite a lot of code that might not work when we change the BBox class that much. I don't dare to do this
because I am not very familar with the code.
Besides that I was wrong with the statement that data from QuadBuckets is only retrieved via the search() method,
at least the copy constructor `DataSet(DataSet copyFrom)` uses the iterator.
So, for now I think the patch regarding Relation is good, but any other change should be done with great care.
I'll check now if `QuadBuckets` might profit when we store elements with invalid bboxes in a simple collection
so that they can be ignored by the 'search()' method while still available for the iterator.

### comment:11 follow-up:  13 Changed 7 years ago by GerdP

13361_v2.patch is my attempt to solve the problem. I do not understand the need for the `addtoBBox` code,
I prefer to have BBox.add(something) methods as this is closer to `Rectangle2D`.

The `QuadBuckets` class now uses an additonal `Set` to store the primitives with invalid BBox. I searched through
the code of all plugins to make sure that they don't depend on the trick (incomplete node stored at (0,0).

The change in Dataset should be revieved. If I got that right it makes no sense to call
`primitive.updatePosition()` before `primitive.setDataset()`, so that part is rather independent from the rest.

### comment:12 Changed 7 years ago by Don-vip

Milestone: → 16.09

### comment:13 in reply to:  11 Changed 7 years ago by michael2402

I do not understand the need for the `addtoBBox` code

Visitor pattern ;-). It would make the code for finding the bbox of a member not depend on the instance of checks. But it does not change any functionality.

The `QuadBuckets` class now uses an additonal `Set` to store the primitives with invalid BBox. I searched through
the code of all plugins to make sure that they don't depend on the trick (incomplete node stored at (0,0).

If plugins had depended on this undocumented and obviously wrong behavior, we should fix it any way ;-)

The change in Dataset should be revieved. If I got that right it makes no sense to call
`primitive.updatePosition()` before `primitive.setDataset()`, so that part is rather independent from the rest.

Some Notes:

• The `BBox(final double x, final double y)` and `BBox(double ax, double ay, double bx, double by)` constructor uses (0,0,0,0) on default instead of invalid.
• The javadoc for `BBox(double ax, double ay, double bx, double by)` you can add: `smallest possible`
• `setInvalid`: We prefer to have one assignment per line
• `isValid`: You can write this shorter: `return !(xmin > xmax || ymin > ymax)`. or `return xmin <= xmax && ymin <= ymax`
• `isInWorld`: same here
• `invalidBBoxPrimitives`: You can make this field final and then simply call `invalidBBoxPrimitives.clear()` to reset it.
• `WayTest`: Move the speed test to Performance. You can find a timer in the performance test utils ;-). It also contains methods that will run the GC before each test run and compute the median time to make the result more reliable.

But those are only minor code style remarks, in general your patch looks really nice and I like the functionality it adds to JOSM ;-)

To your change in DataSet: It should not be a problem. You can add a test that `allPrimitives.add` returns true and throw an expection if it does not. This should never happen though (see first check)

### comment:14 follow-up:  15 Changed 7 years ago by GerdP

Thanks for the quick reponse.
I'll post a new version tomorrow to fix the problems.
Reg. clear(): I first used an ArrayList() for the `invalidBBoxPrimitives` and that would not free the memory
for the array with `clear()`. I guess with the `LinkedHashSet`the `clear()` would be okay. BTW,
I used a `LinkedHashSet` here to make the results predictable.
Reg. the visitor pattern: Your code would do all the calculations each time and not use a saved bbox.
Is that intended?

### comment:15 in reply to:  14 Changed 7 years ago by michael2402

Reg. the visitor pattern: Your code would do all the calculations each time and not use a saved bbox.
Is that intended?

Not neccessarely. You can still use the bbox cache and then return the cached bbox object. It is just a structural change to move the contents of the if-Blocks out to the Node/Way/Relation classes.

### comment:16 follow-up:  18 Changed 7 years ago by GerdP

OK, with v3 I've fixed the problems and implemented the addToBBox() methods and used them.

Two open questions remain:
1) What should happen if the caller of a BBox constructor passes invalid data, e.g. with `BBox(LatLon a, LatLon b)`. You can create a LatLon instance with nonsense like `new LatLon(1234,54321)`.
Of course this is extreme, but for objects near lon=180 it is likely that we will see a value > 180 in some cases
where callers try to calculate a search bbox around the object, same with -180 etc.
2) The `sanity()` method doesn't make sure that the bbox is valid, see b5 in `BBoxTest.testLatLonConstructor()`.
It makes sure that e.g. xmax is <= 180, but a value < -180 is not changed.

### comment:17 Changed 7 years ago by GerdP

Forgot to mention that I removed the speedTest. I used it to compare some different implementations, I think it is no longer useful.

### comment:18 in reply to:  16 Changed 7 years ago by michael2402

OK, with v3 I've fixed the problems and implemented the addToBBox() methods and used them.

Two open questions remain:
1) What should happen if the caller of a BBox constructor passes invalid data, e.g. with `BBox(LatLon a, LatLon b)`. You can create a LatLon instance with nonsense like `new LatLon(1234,54321)`.
Of course this is extreme, but for objects near lon=180 it is likely that we will see a value > 180 in some cases
where callers try to calculate a search bbox around the object, same with -180 etc.

This may happen quite often. The map view code may do it because it creates a BBox for the whole view. If you zoom out enough it is quite big.

If you feel the need for it, you can throw an `IllegalArgumentException` but in this case it is perfectly fine to clamp the coordinates. LatLon is not clamped as well, so you may get LatLon objects that are out of range.

2) The `sanity()` method doesn't make sure that the bbox is valid, see b5 in `BBoxTest.testLatLonConstructor()`.
It makes sure that e.g. xmax is <= 180, but a value < -180 is not changed.

Feel free to define it the way it should be and test that, too ;-). It was probably added because of some "bug" where xmax was > 180. xmax < -180 is not possible for map view bounds at the moment. I added a `Utils.clamp` method some months ago, it may ease your work there ;-).

`BBox(Node n)` is broken. For a node at (1,2), it returns a BBox of: 0..1, 0..2

What you can do is to simply set minx/maxx/... to the invalid defaults. Then simply set or `add` the coordinates in the constructor. The JIT will to enough inlining so that this is pretty fast ;-). It will even remove the default assignments if it notices that the variable will be assigned to an other value later.

### comment:19 follow-up:  20 Changed 7 years ago by GerdP

`BBox(Node n)`: Argh, yes, I agree that the `setInvalid()` code is error prone, I thought it might safe some CPU cycles but I will revert to the original code.

Regarding `sanity()`: The question is in what case the sanity() code will improve anything.
I think most often the `BBox` class is used to calculate a bbox for `QuadBuckets.search()` and I see no problem when
that search bbox overlaps planet.
I would not want to see such a bbox as a bounds tag in an OSM file or in a log message, but in other cases I see no problem.
In `org.openstreetmap.josm.data.Bounds` I don't find such a check, which makes me wonder why we need it in `BBox`
Anyway, I've left the call where it was before and added the `Utils.clamp` in `sanity()`.
See v4.

### comment:20 in reply to:  19 Changed 7 years ago by michael2402

`BBox(Node n)`: Argh, yes, I agree that the `setInvalid()` code is error prone, I thought it might safe some CPU cycles but I will revert to the original code.

Those are writes to fields that are in the L1-Cache at that moment. Don't worry about it ;-)

Regarding `sanity()`: The question is in what case the sanity() code will improve anything.
I think most often the `BBox` class is used to calculate a bbox for `QuadBuckets.search()` and I see no problem when
that search bbox overlaps planet.
I would not want to see such a bbox as a bounds tag in an OSM file or in a log message, but in other cases I see no problem.
In `org.openstreetmap.josm.data.Bounds` I don't find such a check, which makes me wonder why we need it in `BBox`
Anyway, I've left the call where it was before and added the `Utils.clamp` in `sanity()`.
See v4.

In my point of view, we can remove the sanity check completely. It would make the behavior more predictable. It would be best if we were able to merge Bounds and BBox in the long run - they mostly differ in the meridian handling (Bounds can span across it). But other than that, they serve a similar purpose.

### comment:21 Changed 7 years ago by GerdP

I agree reg. removal of sanity. Feel free to change it that way.
Reg. Bounds: I was surprised when I found that this also exists, but I did not yet try to find out why.

### comment:22 Changed 7 years ago by michael2402

The reasons are historical. We have a lot of duplicate code in the both creation. `BBox` is used for quad tree lookups while `Bounds` is used in most other code, like imaging and map paint.

Bounds cannot be invalid and are rounded to the OSM server precision on default.

We have the same for EastNorth coordinates: `ProjectionBounds` (which is something completely different). We have a `BoundingXYVisitor` for to compute the East/North bounds of an object.
A lot of JOSM code does some bounding box computation on it's own...

### comment:23 Changed 7 years ago by simon04

Milestone: 16.09 → 16.10

### comment:24 Changed 7 years ago by simon04

Milestone: 16.10 → 16.11

Milestone renamed

### Changed 7 years ago by GerdP

based on r11227, no further changes

### comment:25 Changed 7 years ago by bastiK

michael2402, what is the state of this ticket? It looks like worthwhile stuff, can we get it committed this milestone?

### comment:26 Changed 7 years ago by michael2402

I have no objections, core should be fine. If there are plugins with issues we will notice it eventually ;-).

### comment:27 Changed 7 years ago by bastiK

You are the most suitable person to apply the patch ... :-)

### comment:28 Changed 7 years ago by michael2402

I'll do it this evening / tomorrow ;-)

### Changed 7 years ago by GerdP

based on r11237, no further changes

### comment:29 Changed 7 years ago by michael2402

Resolution: → fixed new → closed

In 11269/josm:

Fix #13361: Use a more consistent invalid bbox for primitives.

Patch by Gerd Petermann

### comment:30 Changed 7 years ago by michael2402

In 11271/josm:

See #13361: Add WayTest unit tests. Patch by Gerd Petermann.

### comment:31 Changed 7 years ago by michael2402

In 11272/josm:

See #13361: Cleanup code and formatting

• Do not use if (...) return true; else return false;
• Remove parentheses if they are not required
• Use blocks for if-statements which is more consistent with most of JOSM code

### comment:32 Changed 7 years ago by michael2402

The patch seems to make things better. I cleaned it up a bit in a separate revision so that everyone can blame me if I broke something.

I'm sorry I did not find time earlier this week.

### comment:33 Changed 7 years ago by Don-vip

In 11276/josm:

see #13361 - fix unit test

### comment:34 Changed 7 years ago by Don-vip

Milestone: 16.11 → 16.12

Milestone renamed

### comment:35 Changed 7 years ago by Don-vip

See #14186 for possible regression

### Modify Ticket

Change Properties