Modify

Opened 7 months ago

Closed 7 months ago

Last modified 7 months ago

#24075 closed defect (fixed)

Reduce memory allocations for TaggingPresetItem#matches

Reported by: taylor.smock Owned by: team
Priority: normal Milestone: 25.01
Component: Core Version:
Keywords: Cc: maripogoda

Description

There are two things going on here:

  1. MatchType#ofString creates a new enumset each time it is called just to iterate through the values. This can be fixed either by calling values() or using a cached array. I elected to use a cached array for gc purposes. It would probably be better to change KeyedItem to have a setMatch method and get the keyed item there. This accounts for ~25% of the current CPU cost for TaggingPresetItem#matches. I haven't experimented with setMatch since I don't know if it would break EasyPresets.
  2. Massive memory allocations due to ArrayList.Itr creation (~75% of all memory allocations during validation of Colorado). This can be fixed by using a traditional for loop.

In testing, the total improvements for these two changes as measured when validating a Colorado pbf extract were as follows:

  • 99% reduction in memory allocations (purely from avoiding ArrayList.Itr creation)
  • 80% reduction in GC
  • 59% reduction in CPU usage
    • 35% is from MatchType#ofString. This is still >50% of the CPU cycles for TaggingPresetItem#matches though.

Attachments (3)

24075.TaggingPresetItem#matches.patch (2.6 KB ) - added by taylor.smock 7 months ago.
TaggingPresetItem#matches changes only
24075.KeyedItem_MatchType#ofString.patch (1.2 KB ) - added by taylor.smock 7 months ago.
KeyedItem.MatchType#ofString only
24075.KeyedItem_match.patch (4.3 KB ) - added by taylor.smock 7 months ago.
KeyedItem.match visibility reduction

Download all attachments as: .zip

Change History (14)

by taylor.smock, 7 months ago

TaggingPresetItem#matches changes only

by taylor.smock, 7 months ago

KeyedItem.MatchType#ofString only

comment:1 by taylor.smock, 7 months ago

@maripogoda: I'm inclined to cache the actual matchtype instead of having to call KeyedItem.MatchType#ofString all the time.

Will it break your plugin (EasyPresets) if I reduce the visibility of the match field in KeyedItem?

by taylor.smock, 7 months ago

Attachment: 24075.KeyedItem_match.patch added

KeyedItem.match visibility reduction

comment:2 by taylor.smock, 7 months ago

Resolution: fixed
Status: newclosed

In 19285/josm:

Fix #24075: Reduce memory allocations for TaggingPresetItem#matches

This is done by doing the following:

  • Converting KeyedItem.match to a MatchType from a String
    • This avoids calling MatchType#ofString repeatedly
    • This does decrease the visibility of the match field and change the type
  • Avoiding ArrayList.Itr creation in TaggingPresetItem#matches
    • This does produce some duplicate code, unfortunately.

The KeyedItem.match change reduces memory allocations in KeyedItem#matches by
98% and CPU cycles by 77%.
The TaggingPresetItem#matches change to avoid ArrayList.Itr creation reduces
memory allocations by 100% and CPU cycles by 94% for ArrayList (only looking at
changes between the for loop types).

The net change for TaggingPresetItem#matches is a reduction of memory
allocations by 99% and CPU cycles by 74%. As noted in the ticket, there was a
reduction in GC by ~80%.

comment:3 by stoecker, 7 months ago

Caused a new PMD warning:

Best Practices/ForLoopCanBeForeach: src/src/org/openstreetmap/josm/gui/tagging/presets/TaggingPresetItem.java#L182
https://docs.pmd-code.org/pmd-doc-7.6.0/pmd_rules_java_bestpractices.html#forloopcanbeforeach

A have no idea what this wants.

comment:4 by taylor.smock, 7 months ago

I know exactly what it wants. That is the other branch of the for loop. It doesn't like the for (int i = 0; i < list.size(); i++) { var x = list.get(i); /* do something with x */ } syntax.

As noted in the comments, that is much less efficient, at least for the ArrayList.Itr case (which is pretty common). IIRC, there is a no pmd comment I can use there.

I missed it since (on my machine) there are some erroneous issues for Java 11 that are valid issues once we move to Java 17+.

      [pmd] src/org/openstreetmap/josm/data/imagery/vectortile/mapbox/MapboxVectorCachedTileLoader.java:66:	CloseResource:	Ensure that resources like this ThreadPoolExecutor object are closed after use
      [pmd] src/org/openstreetmap/josm/data/osm/visitor/paint/StyledTiledMapRenderer.java:78:	CloseResource:	Ensure that resources like this ExecutorService object are closed after use
      [pmd] src/org/openstreetmap/josm/gui/tagging/presets/TaggingPresetItem.java:182:	ForLoopCanBeForeach:	This for loop can be replaced by a foreach loop
      [pmd] src/org/openstreetmap/josm/gui/tagging/presets/TaggingPresetReader.java:128:	MissingOverride:	The method 'getLast()' is missing an @Override annotation.

comment:5 by taylor.smock, 7 months ago

In 19286/josm:

See #24075: Reduce memory allocations for TaggingPresetItem#matches

Make PMD happy again.

As noted in the comments for the for loop in question, this is a fairly
significant performance optimization. The for-each loop that PMD prefers has
significant penalties on hot code sections for ArrayList objects, which we use
extensively.

TBH, the JVM should probably do this optimization for (at minimum) ArrayList
objects.

comment:6 by stoecker, 7 months ago

Ah, I looked at the wrong loop ;-(

comment:7 by stoecker, 7 months ago

I wonder, would

List<? extends TaggingPresetItem> items = (List<? extends TaggingPresetItem>) data;
for (TaggingPresetItem item : items) {

reach the same result?

I.e. lies the reduction in the first line or really in the for loop?

comment:8 by taylor.smock, 7 months ago

I can profile to check and see, but I would assume not. Which means I should probably check. :)

comment:9 by SekeRob, 7 months ago

Pretty please don't break EasyPreset...

(O/T Your changelog item [@19271] already had awesome effect on a slew of mapping activities in large datasets, parallel line replication (Shift+P), gridify parking spaces (Alt+Shift+Y), tracing/follow long ways to create adjacent polygons to name a few, and memory use had already been improved.)

comment:10 by taylor.smock, 7 months ago

@SekeRob: This shouldn't affect EasyPreset -- at least according to ant clean dist check-plugins. With that said, I haven't looked at the code for EasyPreset, so they could be using reflection.

As far as other performance issues go (like what was fixed in r19271), please report any performance regressions, especially if it takes a 30s operation and makes it a 3m operation (or worse).

For that matter, please report any other misc. performance issues ("validation takes 10 hours at <location> due to <specific test>").

If we do not know about a problem, we cannot fix the problem.

comment:11 by SekeRob, 7 months ago

I've always associated this slowness with me working in big data sets and it being kind of across the board (waiting for the upload pane to open after hitting Ctrl+Shift+Up is another one, where the actual pre-validation window counting up to now 46 is relatively regular in completion too, lest there's ways with 1000s of members in the set. In small sets wait states were not delaying the work flow, purely the big ones.

(I'll be on your case if an action suddenly starts to take 10x ;o).

Modify Ticket

Change Properties
Set your email in Preferences
Action
as closed The owner will remain team.
as The resolution will be set.
The resolution will be deleted. Next status will be 'reopened'.

Add Comment


E-mail address and name can be saved in the Preferences .
 
Note: See TracTickets for help on using tickets.