
Opened 2 years ago

Closed 2 years ago

#7915 closed enhancement (fixed)

[PATCH] Automatically discard some TIGER tags on upload

Reported by: ToeBee Owned by: team
Priority: normal Milestone:
Component: Core Version: latest
Keywords: Cc: toby.murray@…

Description

There are some tags that were uploaded in the original TIGER import which are now entirely useless. They just take up space and make the tag list harder to manage as we add more and more real-world tags to ways. I am proposing to extend JOSM's silent delete feature for the created_by tag to include these TIGER tags so that, as people edit these objects, the tags are dropped.

In particular, these tags:

tiger:upload_uuid - This tag is a hash that was used to group uploads. It was useful during the import process (essentially it was a changeset identifier for API 0.5 which lacked changesets) but now accounts for about 1 GB in the planet file and has absolutely no value to anyone. Since it is a random string, it also degrades compressibility of the file so it's a double hit. It has already been removed from a lot of ways when the name expansion bot was run on them.

tiger:tlid - This is a foreign key to the original TIGER data. In theory it could be used to synchronize with future TIGER data sets or otherwise do some cross-dataset analysis. However, the TIGER data model has changed since the import and this field no longer exists. Also, as ways have been split and combined, the value in this tag has been mangled, sometimes to the point where the tag value exceeds API length limits, at which point the user must either truncate or delete the tag anyway.

tiger:source - again, a key that had some potentially useful information at import time. But especially after a user has edited the way, this tag becomes unimportant and crufty.

tiger:separated - In theory, a tag that indicates if a road is dual carriageway. In practice, it is wrong a majority of the time and it was put on all ways from residential up. So on 95% of the ways it is completely uninteresting data anyway.

In addition, I need to wear more tin foil around my brain because the Canadians caught wind of my plans. They came up with two tags to add to the list from their imports:
geobase:datasetName and geobase:uuid

I am supplying a patch. I implemented this similarly to the "uninteresting" tags: there is a list in OsmPrimitive, and OsmWriter checks whether a given key is in that list instead of only hard-coding a check for "created_by" as was done before.
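The approach described above can be sketched roughly as follows. This is a minimal, self-contained illustration and not the code from the attached patch: the class name DiscardableTags, the method tagsToWrite, and the use of a plain Map for tags are assumptions made for the example, and it does not reproduce JOSM's real OsmPrimitive or OsmWriter.

```java
import java.util.Arrays;
import java.util.Collections;
import java.util.LinkedHashMap;
import java.util.LinkedHashSet;
import java.util.Map;
import java.util.Set;

// Hypothetical stand-in for the key list proposed for OsmPrimitive.
final class DiscardableTags {

    // Keys named in this ticket, plus the already-discarded created_by.
    static final Set<String> DISCARDABLE_KEYS = Collections.unmodifiableSet(
            new LinkedHashSet<>(Arrays.asList(
                    "created_by",
                    "tiger:upload_uuid",
                    "tiger:tlid",
                    "tiger:source",
                    "tiger:separated",
                    "geobase:datasetName",
                    "geobase:uuid")));

    private DiscardableTags() {
    }

    // What a writer would do: emit every tag whose key is not in the list.
    // The input map is left untouched; a filtered copy is returned.
    static Map<String, String> tagsToWrite(Map<String, String> tags) {
        Map<String, String> kept = new LinkedHashMap<>();
        for (Map.Entry<String, String> entry : tags.entrySet()) {
            if (!DISCARDABLE_KEYS.contains(entry.getKey())) {
                kept.put(entry.getKey(), entry.getValue());
            }
        }
        return kept;
    }
}
```

Because the filtering happens only while writing, the in-memory tags are unchanged in this sketch.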

Links to the relevant mailing list threads where this has been discussed among the affected communities:

US: http://lists.openstreetmap.org/pipermail/talk-us/2012-July/008830.html

Canada: http://lists.openstreetmap.org/pipermail/talk-ca/2012-July/004948.html

Now to see if I can make a usable patch file... I've been spoiled by pull requests :)

Attachments (1)

discard_tags.patch (4.8 KB) - added by ToeBee 2 years ago.
Alright let's try this on for size. Implemented as an upload hook instead.


Change History (9)

comment:1 follow-up: Changed 2 years ago by stoecker

"odbl" can also be dropped (see #7906).

If we update this for more tags, then it should be done properly. Problem now is that JOSM does not know that it dropped a tag, so there is potential for conflicts here. The tags should be dropped from JOSM's dataset as well (i.e. on OK from server-upload).

comment:2 Changed 2 years ago by anonymous

I've never had a conflict because of this. If I redownload the area after uploading, the tags just silently vanish.

comment:3 in reply to: ↑ 1 Changed 2 years ago by rickmastfan67

Replying to stoecker:

"odbl" can also be dropped (see #7906).

Don't forget "odbl:note" too. ;)

comment:4 Changed 2 years ago by bastiK

As stoecker said, the change should be reflected in the local dataset. We could add a command to the undo stack that removes the unwanted tags on modified primitives right before upload (or something like that).
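One way to picture this suggestion is the sketch below: before upload, the discardable keys are removed from the local copy of each modified primitive, and the removed values are recorded so the change could be undone. All names here (DiscardTagsHook, Primitive, RemovedTag, apply, undo) are hypothetical and chosen for illustration; JOSM's actual upload hook and command classes are not reproduced.

```java
import java.util.ArrayList;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;
import java.util.Set;

// Hypothetical sketch of a pre-upload step that edits the local dataset.
final class DiscardTagsHook {

    // Minimal stand-in for an OSM primitive: tags plus a "modified" flag.
    static final class Primitive {
        final Map<String, String> tags = new LinkedHashMap<>();
        final boolean modified;

        Primitive(boolean modified) {
            this.modified = modified;
        }
    }

    // Record of one removed tag, kept so the removal can be reverted.
    static final class RemovedTag {
        final Primitive primitive;
        final String key;
        final String value;

        RemovedTag(Primitive primitive, String key, String value) {
            this.primitive = primitive;
            this.key = key;
            this.value = value;
        }
    }

    private DiscardTagsHook() {
    }

    // Removes discardable keys from modified primitives only and returns
    // what was removed. Unmodified primitives are left alone.
    static List<RemovedTag> apply(List<Primitive> toUpload, Set<String> discardableKeys) {
        List<RemovedTag> removed = new ArrayList<>();
        for (Primitive p : toUpload) {
            if (!p.modified) {
                continue;
            }
            for (String key : discardableKeys) {
                String old = p.tags.remove(key);
                if (old != null) {
                    removed.add(new RemovedTag(p, key, old));
                }
            }
        }
        return removed;
    }

    // Restores the removed tags, as an undo entry would.
    static void undo(List<RemovedTag> removed) {
        for (RemovedTag r : removed) {
            r.primitive.tags.put(r.key, r.value);
        }
    }
}
```

Since the local primitives lose the tags before the data is sent, the local dataset and the uploaded data agree in this sketch, which is the point stoecker raised in comment:1.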

Changed 2 years ago by ToeBee

Alright let's try this on for size. Implemented as an upload hook instead.

comment:5 Changed 2 years ago by bastiK

  • Summary changed from Automatically discard some TIGER tags on upload to [PATCH] Automatically discard some TIGER tags on upload

comment:6 Changed 2 years ago by stoecker

Much better. With odbl and odbl:note added, it should be ready to apply. bastiK?

comment:7 Changed 2 years ago by stoecker

Ah - And "geobase:datasetName" is twice in the list.

comment:8 Changed 2 years ago by bastiK

  • Resolution set to fixed
  • Status changed from new to closed

In 5497/josm:

applied #7915 - Automatically discard some TIGER tags on upload (based on patch by ToeBee)
