
Opened 12 years ago

Closed 12 years ago

#7915 closed enhancement (fixed)

[PATCH] Automatically discard some TIGER tags on upload

Reported by: ToeBee
Owned by: team
Priority: normal
Milestone:
Component: Core
Version: latest
Keywords:
Cc: ToeBee

Description

Some tags that were uploaded in the original TIGER import are now entirely useless. They just take up space and make the tag list harder to manage as we add more and more real-world tags to ways. I am proposing to extend JOSM's silent delete feature for the created_by tag to include these TIGER tags, so that the tags are dropped as people edit these objects.

In particular, these tags:

tiger:upload_uuid - This tag is a hash that was used to group uploads. It was useful during the import process (essentially it was a changeset identifier for API 0.5 which lacked changesets) but now accounts for about 1 GB in the planet file and has absolutely no value to anyone. Since it is a random string, it also degrades compressibility of the file so it's a double hit. It has already been removed from a lot of ways when the name expansion bot was run on them.

tiger:tlid - This is a foreign key to the original TIGER data. In theory it could be used to synchronize with future TIGER data sets or otherwise do some cross-dataset analysis. However, the TIGER data model has changed since the import and this field no longer exists. Also, as ways have been split and combined, the value in this tag has been mangled, sometimes to the point where the tag value exceeds API length limits and the user must either truncate or delete the tag anyway.

tiger:source - Again, a key that had some potentially useful information at import time. But especially after a user has edited the way, this tag is unimportant cruft.

tiger:separated - In theory, a tag that indicates if a road is dual carriageway. In practice, it is wrong a majority of the time and it was put on all ways from residential up. So on 95% of the ways it is completely uninteresting data anyway.

In addition, I need to wear more tin foil around my brain because the Canadians caught wind of my plans. They came up with two tags to add to the list from their imports:
geobase:datasetName and geobase:uuid

I am supplying a patch. I implemented this similarly to the "uninteresting" tags: there is a list of keys in OsmPrimitive, and OsmWriter checks whether a given key is in that list instead of only hard-coding "created_by" as was done before.
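For readers without the patch at hand, here is a minimal sketch of the idea described above. It is not the attached patch and not JOSM's actual code: the class name `DiscardableKeys` and the method `tagsToWrite` are hypothetical stand-ins for the key list in OsmPrimitive and the check in OsmWriter, and tags are modelled as a plain map.

```java
import java.util.Arrays;
import java.util.Collections;
import java.util.HashSet;
import java.util.LinkedHashMap;
import java.util.Map;
import java.util.Set;

// Hypothetical illustration; names do not match JOSM's real classes.
public class DiscardableKeys {

    // "created_by" was already dropped before; the rest are the keys proposed in this ticket.
    static final Set<String> DISCARDABLE = Collections.unmodifiableSet(new HashSet<>(Arrays.asList(
            "created_by",
            "tiger:upload_uuid",
            "tiger:tlid",
            "tiger:source",
            "tiger:separated",
            "geobase:datasetName",
            "geobase:uuid")));

    // Returns a copy of the tags without the discardable keys.
    // The input map is left untouched, which mirrors the writer-side
    // approach: the tags are skipped on output only.
    static Map<String, String> tagsToWrite(Map<String, String> tags) {
        Map<String, String> out = new LinkedHashMap<>();
        for (Map.Entry<String, String> e : tags.entrySet()) {
            if (!DISCARDABLE.contains(e.getKey())) {
                out.put(e.getKey(), e.getValue());
            }
        }
        return out;
    }
}
```

Note that in this sketch the in-memory tags are not changed, so the local data and the uploaded data can differ after an upload.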

Links to the relevant mailing list threads where this has been discussed among the affected communities:

US: http://lists.openstreetmap.org/pipermail/talk-us/2012-July/008830.html

Canada: http://lists.openstreetmap.org/pipermail/talk-ca/2012-July/004948.html

Now to see if I can make a usable patch file... I've been spoiled by pull requests :)

Attachments (1)

discard_tags.patch (4.8 KB ) - added by ToeBee 12 years ago.
Alright let's try this on for size. Implemented as an upload hook instead.


Change History (9)

comment:1 by stoecker, 12 years ago

"odbl" can also be dropped (see #7906).

If we update this for more tags, then it should be done properly. Problem now is that JOSM does not know that it dropped a tag, so there is potential for conflicts here. The tags should be dropped from JOSM's dataset as well (i.e. on OK from server-upload).

comment:2 by anonymous, 12 years ago

I've never had a conflict because of this. If I redownload the area after uploading, the tags just silently vanish.

comment:3 by rickmastfan67, 12 years ago (in reply to: 1)

Replying to stoecker:

"odbl" can also be dropped (see #7906).

Don't forget "odbl:note" too. ;)

comment:4 by bastiK, 12 years ago

As stoecker said, the change should be reflected in the local dataset. We could add a command to the undo stack that removes the unwanted tags on modified primitives right before upload (or something like that).
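A rough sketch of what such a pre-upload step could look like, under stated assumptions: `Primitive` and `RemoveTagsCommand` are hypothetical stand-ins written for this illustration, not JOSM's command or upload-hook API. The key list combines the tags from the ticket description with "odbl" and "odbl:note" from the comments above.

```java
import java.util.Arrays;
import java.util.Collection;
import java.util.HashSet;
import java.util.Iterator;
import java.util.LinkedHashMap;
import java.util.Map;
import java.util.Set;

// Hypothetical illustration; not JOSM's real classes.
public class DiscardTagsHook {

    static final Set<String> DISCARDABLE = new HashSet<>(Arrays.asList(
            "created_by", "odbl", "odbl:note",
            "tiger:upload_uuid", "tiger:tlid", "tiger:source", "tiger:separated",
            "geobase:datasetName", "geobase:uuid"));

    // Minimal stand-in for an OSM primitive in the local dataset.
    static class Primitive {
        final Map<String, String> tags = new LinkedHashMap<>();
        boolean modified;
    }

    // Removes discardable tags from modified primitives in the local data
    // and remembers what was removed so the step can be undone.
    static class RemoveTagsCommand {
        private final Map<Primitive, Map<String, String>> removed = new LinkedHashMap<>();

        void execute(Collection<Primitive> toUpload) {
            for (Primitive p : toUpload) {
                if (!p.modified) {
                    continue; // untouched objects keep their tags
                }
                Map<String, String> dropped = new LinkedHashMap<>();
                Iterator<Map.Entry<String, String>> it = p.tags.entrySet().iterator();
                while (it.hasNext()) {
                    Map.Entry<String, String> e = it.next();
                    if (DISCARDABLE.contains(e.getKey())) {
                        dropped.put(e.getKey(), e.getValue());
                        it.remove();
                    }
                }
                if (!dropped.isEmpty()) {
                    removed.put(p, dropped);
                }
            }
        }

        // Restores every tag removed by execute().
        void undo() {
            for (Map.Entry<Primitive, Map<String, String>> e : removed.entrySet()) {
                e.getKey().tags.putAll(e.getValue());
            }
            removed.clear();
        }
    }
}
```

Because the tags are removed from the local objects before they are sent, the local data and the uploaded data stay in step, which is the concern raised in comment:1.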

by ToeBee, 12 years ago

Attachment: discard_tags.patch added

Alright let's try this on for size. Implemented as an upload hook instead.

comment:5 by bastiK, 12 years ago

Summary: changed from "Automatically discard some TIGER tags on upload" to "[PATCH] Automatically discard some TIGER tags on upload"

comment:6 by stoecker, 12 years ago

Much better. With odbl and odbl:note added, it should be ready for applying. bastiK?

comment:7 by stoecker, 12 years ago

Ah - And "geobase:datasetName" is twice in the list.

comment:8 by bastiK, 12 years ago

Resolution: fixed
Status: changed from new to closed

In 5497/josm:

applied #7915 - Automatically discard some TIGER tags on upload (based on patch by ToeBee)
