[PATCH] Automatically discard some TIGER tags on upload
Reported by: ToeBee | Owned by: team
Some tags that were uploaded in the original TIGER import are now entirely useless. They just take up space and make the tag list harder to manage as we add more and more real-world tags to ways. I am proposing to extend JOSM's silent delete feature for the created_by tag to include these TIGER tags so that, as people edit these objects, the tags are dropped.
In particular, these tags:
tiger:upload_uuid - This tag is a hash that was used to group uploads. It was useful during the import process (essentially it was a changeset identifier for API 0.5, which lacked changesets) but now accounts for about 1 GB in the planet file and has absolutely no value to anyone. Since it is a random string, it also degrades the compressibility of the file, so it's a double hit. It has already been removed from a lot of ways when the name expansion bot was run on them.
tiger:tlid - This is a foreign key to the original TIGER data. In theory it could be used to synchronize with future TIGER data sets or otherwise do some cross-dataset analysis. However, the TIGER data model has changed since the import and this field no longer exists. Also, as ways have been split and combined, the value in this tag has been mangled, sometimes to the point where the tag value exceeds API length limits and the user must either truncate or delete the tag anyway.
tiger:source - Again, a key that had some potentially useful information at import time. But especially after a user has edited the way, this tag becomes unimportant cruft.
tiger:separated - In theory, a tag that indicates if a road is dual carriageway. In practice, it is wrong a majority of the time and it was put on all ways from residential up. So on 95% of the ways it is completely uninteresting data anyway.
In addition, I need to wear more tin foil around my brain because the Canadians caught wind of my plans. They came up with two tags to add to the list from their imports:
geobase:datasetName and geobase:uuid
I am supplying a patch. I implemented this similarly to the "uninteresting" tags: there is a list in OsmPrimitive, and OsmWriter checks whether a given key is in that list instead of hard-coding only "created_by" as was done before.
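For readers without the patch at hand, the idea can be sketched roughly as follows. This is only an illustration, not the attached patch and not JOSM's actual code: the class name DiscardableTagsSketch, the field DISCARDABLE_KEYS and the methods isDiscardable and tagsToWrite are hypothetical names made up for this example. Only the tag keys come from the list above.

```java
import java.util.Arrays;
import java.util.Collections;
import java.util.HashSet;
import java.util.LinkedHashMap;
import java.util.Map;
import java.util.Set;

// Hypothetical sketch: all names here are illustrative, not JOSM's real API.
class DiscardableTagsSketch {

    // Keys that would be dropped silently when an edited object is written out.
    // "created_by" is the key that was already handled; the rest are proposed here.
    static final Set<String> DISCARDABLE_KEYS = Collections.unmodifiableSet(
        new HashSet<>(Arrays.asList(
            "created_by",
            "tiger:upload_uuid",
            "tiger:tlid",
            "tiger:source",
            "tiger:separated",
            "geobase:datasetName",
            "geobase:uuid")));

    // Replaces a hard-coded comparison against "created_by" with a set lookup.
    static boolean isDiscardable(String key) {
        return DISCARDABLE_KEYS.contains(key);
    }

    // Returns the tags a writer would emit, keeping the original order.
    static Map<String, String> tagsToWrite(Map<String, String> tags) {
        Map<String, String> kept = new LinkedHashMap<>();
        for (Map.Entry<String, String> entry : tags.entrySet()) {
            if (!isDiscardable(entry.getKey())) {
                kept.put(entry.getKey(), entry.getValue());
            }
        }
        return kept;
    }
}
```

With this sketch, a way tagged highway, name, tiger:tlid and created_by would be written with only highway and name. Keeping the keys in one set means adding another import tag later is a one-line change.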
Links to the relevant mailing list threads where this has been discussed among the affected communities:
Now to see if I can make a usable patch file... I've been spoiled by pull requests :)
Change History (9)
Changed 9 months ago by ToeBee
comment:5 Changed 9 months ago by bastiK
- Summary changed from Automatically discard some TIGER tags on upload to [PATCH] Automatically discard some TIGER tags on upload