I wanted to map the "Texas Steakhaouse", but the JOSM preset has no suitable value. So I looked at the wiki page, which lists about 90 values, and manually used cuisine=steak_house.

I wondered why these values are not in the preset (and why the validator complains about them), so I looked into taginfo. The cuisine tag is used 347272 times with 10628 (!) different values. I saw several cases that are clearly typos:

steak_house 2400
steakhouse 36
Steakhouse 23
Steak_House 12
steak-house 3
Steak_house 3
steak␣house 3
steack_house 2
Steak␣House 1
Steakhaus 1

And some similar values:

steak 359
Steak 27
steak_grill 12
steaks 12
Steaks 1
ステーキ␣(Steak) 1
Beefsteak_house 1
steack 1
Steakhouse_chain 1

Many of the different values are combinations (as an example, I just picked out the combinations with seafood):

steak;seafood 15
seafood;steak 5
steak_house;seafood 3
steak_house;fish 3
Steak_and_Seafood 3
seafood;steak_house 2
seafood␣and␣steaks 2
seafood,␣steak 2
seafood;steaks 2
fisch,steaks 1
seafood,steak_house 1
seafood,steak 1
Seafood_and_Steaks 1
steak,␣seafood 1
steak_&_seafood 1
Steak,_Fisch_und_mehr 1
Steak,_sea_food 1
Steak_&_Fish 1
Steaks_and_Seafood 1
steak;fish 1
steak;␣sea_food; 1
seafood,_steak 1
Steak,_Seafood 1
steak;seafood; 1
fish;steaks 1
steak;␣seafood 1
steaks_and_seafood 1
seafood,␣steack_house 1
Steaks,_Seafood 1

What can we do to reduce this mess?

  • Should we add more values to the preset? It would be nice if the preset and the wiki were in sync.
  • Should we mark deprecated values more clearly in the wiki (like fish or sub)?
  • Should we warn about deprecated values and offer a fix (like replacing cuisine=fish with cuisine=seafood)?
  • Should we have some kind of pretty-printing mechanism, like the one for opening_hours?

If a value is not in the preset, we could apply operations like:

  • convert to lower case
  • replace ' ' or '-' with '_'
  • replace ',' with ';'
  • remove leading or trailing special characters like ' ', ';', '_'

If the result is then found in the preset, we warn and offer a fix to replace the value.
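
The normalization steps above can be sketched roughly as follows. This is only an illustration in Python, not JOSM validator code; PRESET_VALUES, normalize and suggest_fix are made-up names, and the preset list is a tiny stand-in for the real one.

```python
import re

# Hypothetical stand-in for the cuisine values offered by the preset.
PRESET_VALUES = {"steak_house", "seafood", "pizza"}


def normalize(value):
    """Apply the four clean-up steps from the list above, in order."""
    v = value.lower()              # convert to lower case
    v = re.sub(r"[ -]", "_", v)    # replace ' ' or '-' with '_'
    v = v.replace(",", ";")        # replace ',' with ';'
    return v.strip(" ;_")          # drop leading/trailing ' ', ';', '_'


def suggest_fix(value):
    """Return a preset value to offer as a fix, or None if there is nothing to offer."""
    if value in PRESET_VALUES:
        return None                # already a known value, no warning needed
    candidate = normalize(value)
    return candidate if candidate in PRESET_VALUES else None
```

With this sketch, suggest_fix("Steak_House") and suggest_fix("steak-house") would both propose steak_house, while a value like "steakhouse" is not caught by these steps alone.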

Additionally, we could have a test that removes the '_' from each preset value. If that yields a match, e.g. "steakhouse" matches "steak_house" with the '_' removed, we can warn and offer a fix to replace cuisine=steakhouse with cuisine=steak_house.
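
A minimal sketch of this underscore-insensitive lookup, again in Python and with hypothetical names (PRESET_VALUES, COMPACT_INDEX, match_without_underscores). Lower-casing the input before the lookup is my assumption, so that "Steakhouse" is caught as well.

```python
# Hypothetical stand-in for the cuisine values offered by the preset.
PRESET_VALUES = {"steak_house", "seafood", "ice_cream"}

# Map each preset value with the '_' removed back to the original preset value,
# e.g. "steakhouse" -> "steak_house".
COMPACT_INDEX = {p.replace("_", ""): p for p in PRESET_VALUES}


def match_without_underscores(value):
    """Return the preset value that equals `value` once '_' is removed, else None."""
    if value in PRESET_VALUES:
        return None                          # nothing to fix
    # Assumption: compare case-insensitively.
    return COMPACT_INDEX.get(value.lower())
```

One thing to watch: if two preset values collapse to the same string once the '_' is removed, this dictionary keeps only one of them, so a real rule would need to detect such collisions and skip the automatic suggestion.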

We should split combined values (separated by ';') and check each part separately. And perhaps these parts should be sorted alphabetically (prefer cuisine=seafood;steak over cuisine=steak;seafood).
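
Splitting and sorting could look roughly like this. It is a sketch under the same assumptions as before: PRESET_VALUES is a made-up stand-in and check_combined is a hypothetical helper, not an existing JOSM function.

```python
# Hypothetical stand-in for the cuisine values offered by the preset.
PRESET_VALUES = {"steak_house", "seafood", "steak", "fish"}


def check_combined(value):
    """Split a ';'-separated value, check each part, and build a sorted form.

    Returns (canonical, unknown): the parts joined in alphabetical order,
    and the list of parts that are not in the preset.
    """
    parts = [p.strip() for p in value.split(";")]
    parts = [p for p in parts if p]          # drop empty parts, e.g. from a trailing ';'
    unknown = [p for p in parts if p not in PRESET_VALUES]
    canonical = ";".join(sorted(parts))
    return canonical, unknown
```

For example, check_combined("steak;seafood") gives the sorted form "seafood;steak" with no unknown parts, whereas "steak; sea_food;" would report "sea_food" as unknown.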

Perhaps this could be a generic validation rule for many tags with a given list of values plus user defined values.

In 8353/josm:

fix #11433 - add steak_house as cuisine

Added steak_house.

For the typos in the database: you can load them in JOSM, verify them, and fix the entries. Please don't do automatic fixes, as bad values usually go together with other errors.

Jochen started a cleanup challenge some time ago:

Why is the "house" in the name needed? Is that the common name of the cuisine (not the restaurant facility) in the US?

