Modify

Opened 10 years ago

Closed 10 years ago

Last modified 7 years ago

#6886 closed enhancement (fixed)

[Patch] Add PBF reading support to JOSM

Reported by: Don-vip Owned by: team
Priority: major Milestone:
Component: Plugin pbf Version: latest
Keywords: pbf Cc:

Description (last modified by bastiK)

I've written a patch to enhance JOSM in order to read PBF files.
The file format is described on OSM wiki:
http://wiki.openstreetmap.org/wiki/PBF_Format
and widely used, for example on Geofabrik download server:
http://download.geofabrik.de/osm/

The idea behind this patch is to create two new classes, PbfImporter and PbfReader, on the same model as OsmImporter and OsmReader. This leads to create a common abstract class for those two *Reader, AbstractReader.

The PBF "parsing" is done using osmpbf, the reference Java implementation (the one that powers Osmosis), available here:
https://github.com/scrosby/OSM-binary

This library has a dependence on Google Protobuf library, available here:
http://code.google.com/p/protobuf/

Two things are needed in Google library:

  • the library itself, protobuf-java, composed of 37 manually written Java files, and one generated file
  • the protobuf compiler (protoc), that generates one Java file in this library, and two files in osmpbf. The compiler is available as a Windows executable on Google code, or can be built from source.

Please find the patch adding this PBF support. I don't have updated the build.xml because I don't know how you would integrate these additional dependences.

One more thing: the PBF has been designed for future scalability, so all IDs are 64 bit-long. This causes a lot of impacted files in JOSM due to the int->long conversion for changeset IDs.

Attachments (2)

patch.diff (59.1 KB) - added by Don-vip 10 years ago.
patchV2.diff (37.6 KB) - added by Don-vip 10 years ago.

Download all attachments as: .zip

Change History (20)

Changed 10 years ago by Don-vip

Attachment: patch.diff added

comment:1 in reply to:  description ; Changed 10 years ago by bastiK

Great! The only drawback I can see is increased size of the josm binary. Nevertheless, I'd prefer to have it in core (rather than in a plugin). When JOSM supports it out of the box, third party services might be encouraged to switch to pbf (and save resources in the process).

@stoecker: what's your opinion?

Replying to Don-vip:

One more thing: the PBF has been designed for future scalability, so all IDs are 64 bit-long. This causes a lot of impacted files in JOSM due to the int->long conversion for changeset IDs.

Why not cast the changeset id to Integer and throw an exception, when it happens to be larger than the maximum int? For the main API, there is still plenty of room until it hits the 31 bit limit.

comment:2 in reply to:  1 Changed 10 years ago by Don-vip

Replying to bastiK:

Why not cast the changeset id to Integer and throw an exception, when it happens to be larger than the maximum int? For the main API, there is still plenty of room until it hits the 31 bit limit.

OK. I'm reverting those changes and adding some more validity checks.

comment:3 Changed 10 years ago by stoecker

Hmm. I can't believe that we really need 300KB compressed additionally data only to read a fileformat like PBF. I assume writing an own importer would mean probably a tenth of size increase only.

With this size increase I would vote to add it as external plugin.

Last edited 10 years ago by stoecker (previous) (diff)

Changed 10 years ago by Don-vip

Attachment: patchV2.diff added

comment:4 in reply to:  3 Changed 10 years ago by Don-vip

Replying to stoecker:

With this size increase I would vote to add it as external plugin.

Fine for me. If the few other changes are made (the code refactorisation of AbstractReader and the new method isValid() in LatLon), it's easy :)

The new patch (V2) reverts the int->long change and adds some validity checks.

comment:5 Changed 10 years ago by simon04

To promote the usage of the pbf format, it would be beneficial to have the importer included in core.

I played around with ProGuard hoping to shrink the size of the libraries. Unfortunately, I didn't succeed due to some errors (e.g., Warning: com.google.protobuf.Descriptors$MethodDescriptor: can't find referenced class com.google.protobuf.DescriptorProtos$MethodDescriptorProto). I'm novice to ProGuard, though. These are the parameters I used:

java \
  -jar tools/proguard.jar \
  -dontoptimize \
  -dontobfuscate \
  -dontskipnonpubliclibraryclasses \
  -injars osmpbf.jar \
  -injars protobuf.jar \
  -libraryjars $JAVA_HOME/lib/rt.jar \
  -libraryjars $JAVA_HOME/lib/jce.jar \
  -keep class crosby.binary.BinaryParser \
  -keep class crosby.binary.Osmformat \
  -keep class crosby.binary.Osmformat.DenseNodes \
  -keep class crosby.binary.Osmformat.HeaderBlock \
  -keep class crosby.binary.file.BlockInputStream \
  -keep class crosby.binary.file.FileBlockPosition \
  -outjars o.jar \
  -verbose

comment:6 Changed 10 years ago by Don-vip

Sorry, looks like the uploaded jar was incomplete. Here's the correct version.

comment:7 Changed 10 years ago by simon04

Thanks, the errors are gone, but the ProGuard result is still not what is expected …

comment:8 Changed 10 years ago by bastiK

When compiling everything from source and simply using javac on MainApplication.java and PbfImporter.java, the size of the compressed classes increases by 200 kB.

Most of this seems to come from the auto-generated class crosby.binary.Osmformat, which is 8000 lines in source code. We could try to use ProGuard or a similar tool, to remove unused methods.

comment:9 Changed 10 years ago by Don-vip

In [4490/josm]:

see #6886 - Allow JOSM plugin writers to create their own *Reader / *Importer, first for the upcoming PBF plugin.

comment:10 Changed 10 years ago by Scott Crosby (nutscrape)

There are heavy and lite versions of protobuf for at least some languages. The lite versions omit various introspection/metadata code. Let me investigate whether we can include/compile only the lite version. This would reduce the amount of extra code included. (I wrote osmpbf.)

comment:11 Changed 10 years ago by bastiK

Description: modified (diff)

@Scott Crosby: Great work! A bit off topic: For wide adoption of the PBF format, the license of OSM-binary (GPL 3) could turn out to be a little restrictive some day. The osmformat.proto file, which seems to define the PBF format, is also under this license, so only GPL 3 software is allowed to read and write PBF (or someone has to reverse engineer the format ;). Maybe LGPL is an option?

comment:12 Changed 10 years ago by Scott Crosby (nutscrape)

It should be under the LGPL. Can you point me to a file that claims it is GPLv3?

comment:13 in reply to:  12 Changed 10 years ago by bastiK

Replying to Scott Crosby:

It should be under the LGPL. Can you point me to a file that claims it is GPLv3?

You are right - guess it was too late at night.

comment:14 in reply to:  10 ; Changed 10 years ago by Don-vip

Replying to Scott Crosby (nutscrape):

Let me investigate whether we can include/compile only the lite version. This would reduce the amount of extra code included. (I wrote osmpbf.)

Any update on this ?

Another question: is there some kind of magic trick allowing to define a svn:externals to a git repository ?

comment:15 in reply to:  14 ; Changed 10 years ago by simon04

Replying to Don-vip:

Another question: is there some kind of magic trick allowing to define a svn:externals to a git repository ?

Are you looking for git submodule?

comment:16 in reply to:  15 Changed 10 years ago by Don-vip

Replying to simon04:

Are you looking for git submodule?

git submodule is the equivalent of svn-externals on git, no ? I am looking to something allowing me to access source code of osmpbf from OSM SVN (in read only) without having to copy manually the sources files. Maybe with an ant task or something ?

comment:17 Changed 10 years ago by Don-vip

Component: CorePlugin
Resolution: fixed
Status: newclosed

I've pushed the PBF code as a plugin in [o26962].

By the way, I found a rounding bug in LatLon that I needed to fix to allow proper JOSM PBF->xml saving: see r4541.

comment:18 Changed 7 years ago by Don-vip

Component: PluginPlugin pbf

Modify Ticket

Change Properties
Set your email in Preferences
Action
as closed The owner will remain team.
as The resolution will be set.
The resolution will be deleted.

Add Comment


E-mail address and name can be saved in the Preferences.

 
Note: See TracTickets for help on using tickets.