#13933 closed enhancement (fixed)
[Patch] Simplify QuadBuckets and improve memory footprint
Reported by: | GerdP | Owned by: | team |
---|---|---|---|
Priority: | normal | Milestone: | 16.12 |
Component: | Core | Version: | |
Keywords: | performance | Cc: |
Description
I've noticed that the spatial index requires quite a lot of memory when loading large files (say more than 1.000.000 nodes).
The patch includes these changes:
1) Let QBLevel extend BBox, reduce the number of bytes used for some fields, and don't store a reference to QuadBuckets in QBLevel
2) Allow up to 48 primitives in one QBLevel instance.
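The difference between the two layouts can be sketched roughly as follows. This is a simplified illustration, not the code from the patch: `SketchBBox`, `LevelWithBoxField`, `LevelAsBox`, `needsSplit` and the constant name are hypothetical, and the assumption that a level splits once it holds more than 48 entries is mine.

```java
import java.util.ArrayList;
import java.util.List;

/** Simplified stand-in for JOSM's BBox (hypothetical; only the name xmin comes from the ticket). */
class SketchBBox {
    protected double xmin, ymin, xmax, ymax;

    SketchBBox(double xmin, double ymin, double xmax, double ymax) {
        this.xmin = xmin;
        this.ymin = ymin;
        this.xmax = xmax;
        this.ymax = ymax;
    }

    boolean contains(double x, double y) {
        return x >= xmin && x <= xmax && y >= ymin && y <= ymax;
    }
}

/** Unpatched shape: every level allocates a second object for its bounds. */
class LevelWithBoxField {
    private final SketchBBox bbox; // extra object header plus a reference per level
    private final List<Object> content = new ArrayList<>();

    LevelWithBoxField(double xmin, double ymin, double xmax, double ymax) {
        this.bbox = new SketchBBox(xmin, ymin, xmax, ymax);
    }

    boolean contains(double x, double y) {
        return bbox.contains(x, y);
    }
}

/** Patched shape: the level is its own bounding box, so there is one object per level. */
class LevelAsBox extends SketchBBox {
    /** Limit proposed in the ticket; the name of the constant is an assumption. */
    static final int MAX_OBJECTS_PER_LEVEL = 48;

    private final List<Object> content = new ArrayList<>();

    LevelAsBox(double xmin, double ymin, double xmax, double ymax) {
        super(xmin, ymin, xmax, ymax);
    }

    void add(Object o) {
        content.add(o);
    }

    /** Assumed split rule: split only once the level holds more than the limit. */
    boolean needsSplit() {
        return content.size() > MAX_OBJECTS_PER_LEVEL;
    }
}
```

Both variants answer `contains` identically; the subclass variant simply avoids the second allocation, at the cost of coupling the level to the box class.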
Effects (all numbers for a 64-bit system):
1) The memory footprint of a single QBLevel instance is reduced:
- The unpatched version allocates a BBox instance (48 bytes) and requires 97 bytes for the fields.
- The patched version requires 107 bytes for the fields and no additional BBox instance.
In both cases the structures have a few "unused" bytes because the JRE uses multiples of 8 (or 16?) bytes for alignment.
2) Fewer QBLevel instances
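To make the per-instance figures concrete, here is a small back-of-the-envelope calculation. It assumes that 97 and 107 are the complete unpadded object sizes and that the JVM pads each object up to the next multiple of the alignment; the class and method names are hypothetical.

```java
/** Rough estimate of the per-level footprint, based on the figures quoted in the ticket. */
final class FootprintEstimate {
    /** Rounds a raw size up to the next multiple of {@code alignment}. */
    static long align(long bytes, long alignment) {
        return ((bytes + alignment - 1) / alignment) * alignment;
    }

    /** Unpatched: a separate BBox instance (48 bytes) plus the padded QBLevel. */
    static long unpatched(long alignment) {
        return align(48, alignment) + align(97, alignment);
    }

    /** Patched: only the padded QBLevel, which now carries the box fields itself. */
    static long patched(long alignment) {
        return align(107, alignment);
    }

    public static void main(String[] args) {
        for (long alignment : new long[] {8, 16}) {
            System.out.println("alignment " + alignment
                    + ": unpatched=" + unpatched(alignment)
                    + " patched=" + patched(alignment)
                    + " saved=" + (unpatched(alignment) - patched(alignment)));
        }
    }
}
```

Under these assumptions the estimate is 152 vs. 112 bytes per level with 8-byte alignment and 160 vs. 112 bytes with 16-byte alignment. A tool such as JOL would be needed to confirm the real layout.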
Numbers:
For an older extract of Bremen.osm.pbf (2016-03-26) I see ~521 MiB for r11123 and ~489 MiB for the patched version (~6 % more for the unpatched version);
for alaska (4.360.697 nodes) I see ~12 % more, so the larger the file, the greater the improvement.
I did not find any significant change in runtime.
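For reference, the Bremen percentage can be reproduced from the two heap figures above. The helper names below are hypothetical, and only the Bremen numbers are used because the ticket gives no absolute figures for alaska.

```java
import java.util.Locale;

/** Reproduces the relative heap difference from the two Bremen figures in the ticket. */
final class HeapComparison {
    /** Percentage by which {@code larger} exceeds {@code smaller}. */
    static double percentMore(double larger, double smaller) {
        return (larger - smaller) / smaller * 100.0;
    }

    /** Percentage of {@code before} that is saved by going down to {@code after}. */
    static double percentSaved(double before, double after) {
        return (before - after) / before * 100.0;
    }

    public static void main(String[] args) {
        double unpatchedMiB = 521; // r11123
        double patchedMiB = 489;   // with the patch applied
        System.out.println(String.format(Locale.ROOT,
                "unpatched needs %.1f %% more than patched",
                percentMore(unpatchedMiB, patchedMiB)));
        System.out.println(String.format(Locale.ROOT,
                "patch saves %.1f %% of the unpatched heap",
                percentSaved(unpatchedMiB, patchedMiB)));
    }
}
```

The two ways of expressing the difference give slightly different values (about 6.5 % more vs. about 6.1 % saved), which is worth keeping in mind when comparing with the ~12 % quoted for alaska.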
Attachments (1)
Change History (7)
by , 8 years ago
Attachment: | improve_quadbuckets.patch added |
---|
comment:1 by , 8 years ago
Keywords: | performance added |
---|---|
Milestone: | → 16.11 |
comment:2 by , 8 years ago
follow-up: 4 comment:3 by , 8 years ago
I think we save the header plus a reference in QBLevel. If I understand the svn history right, BBox was first a class in QuadBuckets which was extracted later; that's why I thought that QBLevel could be coded as a subclass.
I liked the idea regarding code simplification; the effect on the memory footprint is rather small compared to the 16/48 limit change.
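Why the 16/48 limit matters more than the per-instance saving can be illustrated with a toy point quadtree that counts how many nodes it allocates for a given bucket capacity. This is not JOSM's QuadBuckets: the class `ToyQuadTree`, its split rule, the depth guard and the random unit-square input are all assumptions made for the sketch.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Random;

/** Toy point quadtree over the unit square, used only to count allocated nodes. */
class ToyQuadTree {
    private static final int MAX_DEPTH = 24; // guard against endless splitting on duplicates

    private static final class Node {
        final double x, y, size; // lower-left corner and edge length
        final int depth;
        List<double[]> points = new ArrayList<>();
        Node[] children; // null while this node is a leaf

        Node(double x, double y, double size, int depth) {
            this.x = x;
            this.y = y;
            this.size = size;
            this.depth = depth;
        }
    }

    private final int capacity;
    private final Node root = new Node(0, 0, 1, 0);
    private int nodeCount = 1;

    ToyQuadTree(int capacity) {
        this.capacity = capacity;
    }

    void add(double px, double py) {
        insert(root, new double[] {px, py});
    }

    int nodeCount() {
        return nodeCount;
    }

    private void insert(Node n, double[] p) {
        while (n.children != null) {
            n = childFor(n, p); // descend to the leaf covering p
        }
        n.points.add(p);
        if (n.points.size() > capacity && n.depth < MAX_DEPTH) {
            split(n);
        }
    }

    private Node childFor(Node n, double[] p) {
        double half = n.size / 2;
        int ix = p[0] >= n.x + half ? 1 : 0;
        int iy = p[1] >= n.y + half ? 2 : 0;
        return n.children[ix + iy];
    }

    private void split(Node n) {
        double half = n.size / 2;
        n.children = new Node[] {
            new Node(n.x, n.y, half, n.depth + 1),
            new Node(n.x + half, n.y, half, n.depth + 1),
            new Node(n.x, n.y + half, half, n.depth + 1),
            new Node(n.x + half, n.y + half, half, n.depth + 1)
        };
        nodeCount += 4;
        List<double[]> old = n.points;
        n.points = null; // inner nodes hold no points in this toy version
        for (double[] p : old) {
            insert(childFor(n, p), p);
        }
    }

    /** Builds a tree from {@code n} pseudo-random points and returns the node count. */
    static int countNodes(int capacity, int n, long seed) {
        Random random = new Random(seed);
        ToyQuadTree tree = new ToyQuadTree(capacity);
        for (int i = 0; i < n; i++) {
            tree.add(random.nextDouble(), random.nextDouble());
        }
        return tree.nodeCount();
    }
}
```

Comparing `ToyQuadTree.countNodes(16, n, seed)` with `ToyQuadTree.countNodes(48, n, seed)` for the same input shows how a larger bucket capacity reduces the number of node objects. The exact ratio depends on the data distribution, so it should not be read as a prediction for real OSM extracts.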
comment:4 by , 8 years ago
Replying to Gerd Petermann <gpetermann_muenchen@…>:
I think we save the header plus a reference in QBLevel. If I understand the svn history right, BBox was first a class in QuadBuckets which was extracted later; that's why I thought that QBLevel could be coded as a subclass.
I think most people have forgotten this history and consider it a generic box class.
I liked the idea regarding code simplification; the effect on the memory footprint is rather small compared to the 16/48 limit change.
In #13361 it was suggested to rename the field xmin to lonMin, etc. Such an internal refactoring of a class should be possible without breaking unrelated code all over the place. But it's not a big deal. :-)
Great, interesting work to find the memory hot spots when opening larger extracts! I like your optimizations, except making QBLevel a subclass of BBox. It feels like bad style to me to hook into a low-level class like that. However, I have to admit that it does save memory and the practical problems are manageable.
What I can tell from Bremen.osm.pbf:
On average:
By subclassing you save a 12 byte header, that is
saving for each primitive.
If these numbers are correct, I would prefer to keep BBox as a field.