Modify

Opened 9 years ago

Closed 9 years ago

Last modified 9 years ago

#7908 closed defect (fixed)

[PATCH] Wrong comparison of strings

Reported by: anonymous Owned by: team
Priority: normal Milestone:
Component: Core Version: latest
Keywords: Cc:

Description

Attached is a patch that fixes to wrong comparisons of strings found by findbugs.

Attachments (1)

JOSM_comparison_bugs_findbugs.patch (1.2 KB) - added by anonymous 9 years ago.

Download all attachments as: .zip

Change History (11)

Changed 9 years ago by anonymous

comment:1 Changed 9 years ago by anonymous

Summary: Wrong comparison of strings[PATCH] Wrong comparison of strings

comment:2 Changed 9 years ago by Don-vip

Thanks for the patch, but... what's your name ? We generally credit patch authors :)

comment:3 Changed 9 years ago by Skyman

Sorry was to late and i forgot to register. Thanks, you can use my username. :-)

Version 0, edited 9 years ago by Skyman (next)

comment:4 Changed 9 years ago by Don-vip

Actually there's no problem in PresetTextComparator. The == operator tests if it's the same object or if both values are null. If not the correct test is made line 24. The test in GPXLayer, however, can be changed.

comment:5 Changed 9 years ago by Don-vip

Resolution: fixed
Status: newclosed

In 5371/josm:

fix #7908 - Wrong comparison of strings (modified patch by Skyman)

comment:6 Changed 9 years ago by verdy_p

There's nothing wrong in first checking if too strings are the same object before actually comparing them. But this is already what the normal String compare function does to avoid scanning the string contents when it is known that it is the same object.
If you encode an ==, then you have first to make sure that strings that should compare equal are effectively the same object. But this would require you to intern all strings, and string interning is very slow (internally this requires hashing string objects, and maintaining them in a, internal Hashmap collection from which the interned version is returned and the original string discarded). Interning however may save time for small strings (such as those containing very common properties, notably the attribute names of OSM objects like "name", "lon", "lat", "highway"...) Additionally it would reduce the memory footprint in the VM by discarding many separate string objects.
Whever interning should be performed or not should be studied on practical cases (notably when working on large OSM files). Interning strings may be interesting also in the validator (which could replace string objects as they will be reused a lot in tight recursive search loops).

But something that I have noted is that JOSM still does not use multithreading as it should for long operations, and so most of the time during long operations when using a CPU with 8 cores, JOSM just uses 12.5% of the CPU (i.e. only one core). But going to more multithreading for better performance will require studying the locking of monitors : JOSM still uses very frequently too many unmutable string objects that are also built with the inefficent StringBuffer class, instead of the much faster non-synchronized StringWriter, when just creating intermediate temporary strings that don't survive the local function. And most of these allocated strings are not reused (so there are too frequent calls to the memory allocator, and too much work performed by the garbage collector : thanks on a multicore CPU, the incremental garbage collector runs in a parallel thread, but on single-core CPUs, this causes too mauch delays. So even if you don't create more threads, it will be necessary to reduce the use of synchronized methods !

Using StringWriter instead of StringBuffer will certainly help getting better performance and better scalability on ulticore CPUs.

Also I note that when updating data from the OSM server still does use a basic simple loop, that allocates intermediate objects really too often (downloading about 200 nodes every 10 seconds per request requires using the allocator on more than 200 megabytes (most objects being garbage collected). Really this can be improved (not everything is allocated dirctly within JOSM's code itself, there's a significant part in the SAX parser library, but even the SAX parsers callbacks are not efficicent enough : so loading data from an OSM file is still too slow, as well as loading from the OSM server, notably on a 32-bit system with small VM sizes. This also causes display refreshes being really slow (for example when selecting objects on screen by performing position lookups where the mouse is pointing).

Can you investigate these with a profiler ?

comment:7 in reply to:  6 Changed 9 years ago by bastiK

Replying to Verdy_p:

There's nothing wrong in first checking if too strings are the same object before actually comparing them. But this is already what the normal String compare function does to avoid scanning the string contents when it is known that it is the same object.
If you encode an ==, then you have first to make sure that strings that should compare equal are effectively the same object. But this would require you to intern all strings, and string interning is very slow (internally this requires hashing string objects, and maintaining them in a, internal Hashmap collection from which the interned version is returned and the original string discarded). Interning however may save time for small strings (such as those containing very common properties, notably the attribute names of OSM objects like "name", "lon", "lat", "highway"...) Additionally it would reduce the memory footprint in the VM by discarding many separate string objects.

Tags (key and value) are already interned.

Whever interning should be performed or not should be studied on practical cases (notably when working on large OSM files). Interning strings may be interesting also in the validator (which could replace string objects as they will be reused a lot in tight recursive search loops).

But something that I have noted is that JOSM still does not use multithreading as it should for long operations, and so most of the time during long operations when using a CPU with 8 cores, JOSM just uses 12.5% of the CPU (i.e. only one core). But going to more multithreading for better performance will require studying the locking of monitors : JOSM still uses very frequently too many unmutable string objects that are also built with the inefficent StringBuffer class, instead of the much faster non-synchronized StringWriter, when just creating intermediate temporary strings that don't survive the local function. And most of these allocated strings are not reused (so there are too frequent calls to the memory allocator, and too much work performed by the garbage collector : thanks on a multicore CPU, the incremental garbage collector runs in a parallel thread, but on single-core CPUs, this causes too mauch delays. So even if you don't create more threads, it will be necessary to reduce the use of synchronized methods !

Using StringWriter instead of StringBuffer will certainly help getting better performance and better scalability on ulticore CPUs.

I don't know about StringWriter, but StringBuilder should be Ok, right?

Also I note that when updating data from the OSM server still does use a basic simple loop, that allocates intermediate objects really too often (downloading about 200 nodes every 10 seconds per request requires using the allocator on more than 200 megabytes (most objects being garbage collected). Really this can be improved (not everything is allocated dirctly within JOSM's code itself, there's a significant part in the SAX parser library, but even the SAX parsers callbacks are not efficicent enough : so loading data from an OSM file is still too slow, as well as loading from the OSM server, notably on a 32-bit system with small VM sizes.

Actually .osm parsing uses StAX; under the hood, this may be the same, I don't know.

This also causes display refreshes being really slow (for example when selecting objects on screen by performing position lookups where the mouse is pointing).

I doubt the issues you mentioned are related to map rendering speed.

Can you investigate these with a profiler ?

Personally, I would look at things that are actually slow and try to optimize these, rather than just doing code review and guessing from there. The following things in JOSM core cause measurable delays:

  • startup
  • the map rendering process
  • loading of the preferences window
  • validator
  • (possibly more)

We are constantly working on optimization. Concrete ideas, on how to improve performance are always welcome. But you should do some benchmarking first, because optimization without significant effect is a waste of time. If you can make rendering 25% faster (for example) I'd be impressed.

comment:8 Changed 9 years ago by verdy_p

This also causes display refreshes being really slow (for example when selecting objects on screen by performing position lookups where the mouse is pointing).

I doubt the issues you mentioned are related to map rendering speed.

No. I did not speak about speed of rendering, but speed of selection. Just hovering the mouse cursor over the map forces it to scan the map data to look for a selectable object within a small bounding box around the cursor position. Internally it reallocates lots of new temporary objects just for this lookup instead of using a fast lookup index, sorted geometrically by 2D position. And when an object's bounding box matches somewhere with the cursor"s bounding box, there's a complex geometry scan that will reconvert the internal object geometry to the coordinates in the window, instead of perfoming the reverse : the mouse should search for a bounding box defined directly in the internal WGS84 coordinates used by objects, to avoid all type conversions. Without editing anything, but just moving the mouse over the map I can see lots of objects created (not just for updating the status bar, something that can be delayed by about 50-100 ms and that does not need to be refreshed constantly : the status bar can use instead a worker thread to compute what to display, it just needs a reference to a single object to work with, a timed trigger for waking up.

Startup time is definitely not a critical issue when using JOSM (we can wait for a dozen of seconds one time), but performance is still required in the Validator (should be multithreaded to use multicore CPUs), and when selecting objects on the map or in one of the object lists in the side windows (docked or not in the right pane), because this affects us every time we use JOSM.

Performance is also wanted when viewing the map over large regions (e.g. a full country when we have thousands of relations of various sizes loaded with tens of thousands ways and millions nodes, no filter active, because we need all of them to solve conflicts or check all surface/boundary relations while we are interested only in editing some ways shared in several large relations with dependencies).

comment:9 Changed 9 years ago by stoecker

Quoting Paul:
"We are constantly working on optimization. Concrete ideas, on how to improve performance are always welcome."

I.e. Patches speeding up JOSM are highly appreciated.

comment:10 in reply to:  9 Changed 9 years ago by bastiK

Replying to stoecker:

Quoting Paul:
"We are constantly working on optimization. Concrete ideas, on how to improve performance are always welcome."

I.e. Patches speeding up JOSM are highly appreciated.

+1

You analysis is slightly condescending because you suggest various optimizations that are already implemented (implying we are too stupid for this but your generous hints will lead us on the right track). Please get your hands dirty, then we are talking...

Modify Ticket

Change Properties
Set your email in Preferences
Action
as closed The owner will remain team.
as The resolution will be set.
The resolution will be deleted.

Add Comment


E-mail address and name can be saved in the Preferences.

 
Note: See TracTickets for help on using tickets.