Opened 5 months ago

Last modified 10 days ago

#21720 new defect

Delete Vietnamese localization

Reported by: 1ec5
Owned by: team
Priority: normal
Milestone: 22.05
Component: Core
Version:
Keywords: i18n vietnamese
Cc: Don-vip

Description (last modified by 1ec5)

The Vietnamese (vi) localization of JOSM is completely unusable, to the point of insulting any Vietnamese speaker (even a non-native speaker like myself). It hinders Vietnamese speakers from contributing to OSM and hurts their perception of the project. Among its many problems:

  • Most translated strings seem as if they had been translated by machine-translating every other word and concatenating the results, disregarding Vietnamese grammar.
  • Basic terms are mistranslated in a manner that suggests total unfamiliarity with OpenStreetMap or the Vietnamese language.
  • Spaces are frequently missing between words. Words are inconsistently capitalized with no discernible pattern.
  • Most strings containing HTML formatting have broken syntax.
  • JOSM incorrectly considers Vietnamese to lack a plural grammatical form.
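One way to triage the broken-HTML problem during a cleanup is to compare the tags in each source string with the tags in its translation. The following is only a minimal sketch: the regular expression, the function names, and the three sample strings are hypothetical and are not taken from vi.lang or from JOSM's tooling.

```python
import re

# Matches well-formed opening/closing tags such as <b>, </b>, <a href="...">.
TAG = re.compile(r"<(/?)([A-Za-z][A-Za-z0-9]*)[^<>]*>")

def tag_sequence(text):
    """Return the well-formed tags in order, e.g. ['html', 'b', '/b', '/html']."""
    return [slash + name.lower() for slash, name in TAG.findall(text)]

def html_matches(source, translation):
    """True when the translation keeps the same tags in the same order."""
    return tag_sequence(source) == tag_sequence(translation)

# Hypothetical strings for illustration only.
source = "<html>Upload <b>{0}</b> objects</html>"
good = "<html>Tải lên <b>{0}</b> đối tượng</html>"
broken = "<html>Tải lên <b>{0}</ b> đối tượng</html>"  # malformed closing tag

print(html_matches(source, good))
print(html_matches(source, broken))
```

A check like this would flag strings for human review rather than prove them wrong, since a correct translation may legitimately reorder tags.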

I’m only able to contribute to OSM using JOSM because I’m familiar with the English localization. Every time a warning or error appears, I have to play a game of Mad Libs to figure out if my edits would cause problems.

Just a few examples of mistranslated terminology, really the tip of the iceberg:

English | JOSM Vietnamese | Literal meaning | Correct Vietnamese
way | cách | manner | lối
reverse way | cách xếp | sorting method | đảo ngược lối
to join (a way) | tham gia | to participate in (a project) | gắn vào
(imagery) offset | bù đắp | to compensate | độ lệch
lat/lon | lạt lon | bamboo strip of beverage can | vĩ độ/kinh độ
upload to (server) | tải lên để | upload in order to | tải lên
rubber-band | cao su-band | rubber + band | dây chun
delete mode | xóa mode | delete + mode | chế độ xóa
jump there | jump có | jump + there is | nhảy tới đấy
right (side) | quyền | permission | phải
to snap (to a way) | chụp | to snap a photo | dính

JOSM’s Launchpad instance unfairly blames me for the problem. It says my account was responsible for contributing 7,867 strings (61%) out of 12,805 total on 12 May 2015, two days before the localization was committed to the repository in r8352. However, I have no recollection of contributing thousands of strings to a Vietnamese translation of JOSM at that time, when I was too busy to even contribute very much to OSM. Even if I did, I would not have translated a single one of these strings the way Launchpad says I did. That much is clear when comparing JOSM with iD and Vespucci, which I am responsible for translating into Vietnamese. (Launchpad lists a handful of strings of mine from 27 June 2010 that do look like something I would’ve written, but they were all overwritten by the 2015 mistranslations.)

So, as the supposed author of the majority of JOSM’s Vietnamese localization, I kindly ask that it be deleted from the repository until translators have had an opportunity to clean things up. If possible, the offending translations should also be deleted from Launchpad to facilitate the cleanup effort.

Attachments (1)

upload.png (380.7 KB) - added by 1ec5 5 months ago.
Upload dialog

Change History (15)

Changed 5 months ago by 1ec5

Attachment: upload.png added

Upload dialog

comment:1 Changed 5 months ago by 1ec5

Description: modified (diff)

comment:2 Changed 5 months ago by stoecker

Cc: Don-vip added

Reading your text, I have several issues.

  • If Launchpad says translations are attributed to you, then I would rather believe that you made a mistake or forgot something from the past than that Launchpad invents something.
  • I will not remove translations for an existing language only because a single somebody opens a ticket and asks to do so. If it's a serious issue for years I'd expect that topic should have come up more than once until now.
  • If the translations are as bad as you say, start an effort to improve them. The Vietnamese community should be willing to help improve the situation if it is really as described.
  • "JOSM incorrectly considers Vietnamese to lack a plural grammatical form." → JOSM does not define something itself. We take the plural forms from Launchpad and the definition for "vi" is this: https://translations.launchpad.net/+languages/vi which is also reflected in any po-file (Plural-Forms: nplurals=1; plural=0;).

Altogether, while I cannot decide myself whether the Vietnamese in JOSM is correct or not, simply because I don't speak Vietnamese, your ticket has a lot of points which indicate that what you describe isn't exact either.

comment:3 Changed 5 months ago by Don-vip

Keywords: i18n vietnamese added

comment:4 Changed 5 months ago by maxerickson@…

Does Launchpad have a log of translation files that are uploaded? I don't see one (but I'm not logged in and so on).

https://help.launchpad.net/Translations/YourProject/ImportingTranslations

It does seem unlikely that thousands of strings were translated in a day or two using the web interface.

comment:5 Changed 5 months ago by stoecker

Not that I know of. We only have a daily backup.

Last edited 5 months ago by stoecker

comment:6 in reply to:  2 Changed 5 months ago by 1ec5

Replying to stoecker:

  • If Launchpad says translations are attributed to you, then I would rather believe that you made a mistake or forgot something from the past than that Launchpad invents something.

As Max has pointed out, it’s extremely unlikely that thousands of translations were contributed manually. Launchpad may have had a bug misattributing the import to me. It’s plausible that I was a preexisting contributor to the localization at the time, having submitted a few (correct) translations manually through the Web interface in 2010. For example, string #153 shows a correct translation by me that was rejected in favor of a humorously incorrect one also under my name.

I suspect that someone did a crude find-and-replace job on the original English .po, manually added the X-Exported-From-Launchpad header so that Launchpad would accept it, and overwrote existing translations by me and a few other users. I insist that I did not do something so unreasonable in a fever dream, but if you don’t want to take my word for it, then there’s little I can do about that.

  • I will not remove translations for an existing language only because a single somebody opens a ticket and asks to do so.

In general, that would be a fine principle, but it leaves me in a catch-22 situation. On the one hand, you’re claiming that the localization is largely authored by me based on Launchpad’s attribution, yet on the other hand, you seem to be discounting me as “a single somebody” asking to delete the work of others. So how do I undo what I supposedly accidentally wrought?

If it's a serious issue for years I'd expect that topic should have come up more than once until now.

No one has said anything because pretty much no one uses the Vietnamese localization. I think you’re assuming that a typical Vietnamese speaker’s first reaction to a poor localization would’ve been to head over to this issue tracker and file a ticket in English. But most Vietnamese mappers would either avoid JOSM or switch JOSM’s interface language to English (for those who can find the option). After all, why would someone bother to engage with a project that uses an offensive caricature of their language?

I downloaded the latest changeset dump from 24 December 2021 and filtered it to changesets starting from 14 May 2015 (the date the Vietnamese localization was committed to the repo) that overlap with the Vietnam bounding box (8.27673°N, 101.86523°E to 23.56399°N, 109.64355°E, excluding the Paracels and Spratlys to avoid including Hong Kong). Here’s the absolute and relative editor localization usage by number of changesets:

Locale | JOSM | iD | Go Map!! 3.1+ | Vespucci 0.9.9+
English | 809213 | 124581 | 185 | 958
Vietnamese | 869 | 35691 | 55 | 360
Others | 14228 | 45534 | 4 | 783
Total | 824310 | 205806 | 244 | 2101
% Vietnamese | 0.105% | 17.34% | 22.54% | 17.13%

Even though JOSM has much higher absolute usage than iD, Vietnamese usage is proportionally much lower among JOSM changesets than it is among iD, Go Map!!, and Vespucci changesets.
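Recomputing the Vietnamese share directly from the changeset counts is a useful cross-check on the percentage row. The sketch below uses only the English, Vietnamese, and Others counts from the table; the dictionary layout and the function name are illustrative assumptions.

```python
# Changeset counts per editor and interface locale, copied from the table.
counts = {
    "JOSM": {"English": 809213, "Vietnamese": 869, "Others": 14228},
    "iD": {"English": 124581, "Vietnamese": 35691, "Others": 45534},
    "Go Map!! 3.1+": {"English": 185, "Vietnamese": 55, "Others": 4},
    "Vespucci 0.9.9+": {"English": 958, "Vietnamese": 360, "Others": 783},
}

def vietnamese_share(by_locale):
    """Percentage of an editor's changesets uploaded with the Vietnamese locale."""
    total = sum(by_locale.values())
    return 100.0 * by_locale["Vietnamese"] / total

for editor, by_locale in counts.items():
    total = sum(by_locale.values())
    print(f"{editor}: {vietnamese_share(by_locale):.3f}% of {total} changesets")
```

The per-editor totals computed this way should equal the Total row, which makes a transcription slip in any single cell easy to spot.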

Of the 869 changesets in Vietnam uploaded using JOSM’s Vietnamese localization since 2015, 333 (38%) were uploaded by one user, V U P H A N, who stopped mapping in 2018. Among the 19 mappers who have contributed 10 or more changesets, Hieu Van is the second most prolific mapper and the only one who is still active. But they also switched to the English localization back in 2018. I reached out to both mappers for their feedback about the Vietnamese localization.

Here are the commands I used for this analysis:

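# Filter the changeset dump to the Vietnam bounding box, from the commit date of the localization onward
osmium changeset-filter -c --bbox='101.865234375,8.276727101164047,109.64355468749999,23.563987128451217' -a '2015-05-14T14:41:03Z' --output=vnchangesets.opl.bz2 changesets-211220.osm.bz2
# Decompress so that grep can read the OPL text (assumed step; the commands below expect vnchangesets.opl)
bunzip2 -k vnchangesets.opl.bz2
# JOSM: English, Vietnamese, all locales
grep -E 'created_by=JOSM/[^,]*en' vnchangesets.opl | wc -l
grep -E 'created_by=JOSM/[^,]*vi' vnchangesets.opl | wc -l
grep -E 'created_by=JOSM/' vnchangesets.opl | wc -l
# iD: all locales, English, Vietnamese
grep -E 'created_by=iD' vnchangesets.opl | wc -l
grep -E 'created_by=iD' vnchangesets.opl | grep -E 'locale=en' | wc -l
grep -E 'created_by=iD' vnchangesets.opl | grep -E 'locale=vi' | wc -l
# Go Map!! (only versions that record a locale): all locales, English, Vietnamese
grep -E 'created_by=Go%20%Map' vnchangesets.opl | grep -E 'locale=' | wc -l
grep -E 'created_by=Go%20%Map' vnchangesets.opl | grep -E 'locale=en' | wc -l
grep -E 'created_by=Go%20%Map' vnchangesets.opl | grep -E 'locale=vi' | wc -l
# Vespucci: all locales, English, Vietnamese
grep -E 'created_by=Vespucci' vnchangesets.opl | wc -l
grep -E 'created_by=Vespucci' vnchangesets.opl | grep -E 'locale=en' | wc -l
grep -E 'created_by=Vespucci' vnchangesets.opl | grep -E 'locale=vi' | wc -l
# Changesets per user among JOSM uploads with the Vietnamese locale, most prolific first
grep -E 'created_by=JOSM/[^,]*vi' vnchangesets.opl | sed -E 's/.* u([^ ]+).*/\1/g' | sort | uniq -c | sort -rn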
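For readers who prefer a script to a pipeline, the per-user tally in the last command could be sketched in Python roughly as follows. It reuses the same regular expression as the grep command; the function name and the three sample lines are hypothetical, and the sample lines are a simplified imitation of OPL changeset records rather than real data.

```python
import re
from collections import Counter

# Same pattern as the grep command: a JOSM created_by tag mentioning "vi".
JOSM_VI = re.compile(r"created_by=JOSM/[^,]*vi")
# The user name is the space-separated field that starts with "u".
USER = re.compile(r" u([^ ]+)")

def count_by_user(lines):
    """Count matching changesets per user, most prolific first."""
    tally = Counter()
    for line in lines:
        if JOSM_VI.search(line):
            match = USER.search(line)
            if match:
                tally[match.group(1)] += 1
    return tally.most_common()

# Simplified, hypothetical OPL-style lines for illustration only.
sample = [
    "c1 k3 i10 ualice x105.8 y21.0 X105.9 Y21.1 Tcreated_by=JOSM/1.5%20%(9000%20%vi),comment=a",
    "c2 k1 i11 ubob x105.8 y21.0 X105.9 Y21.1 Tcreated_by=JOSM/1.5%20%(9000%20%en),comment=b",
    "c3 k2 i10 ualice x106.6 y10.7 X106.7 Y10.8 Tcreated_by=JOSM/1.5%20%(9100%20%vi),comment=c",
]

for user, n in count_by_user(sample):
    print(n, user)
```

Like the grep version, this is a heuristic: it matches any JOSM user-agent string containing "vi" before the next comma, so the counts are approximate.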

Outside of Vietnam, I assume I’m the main user of JOSM in Vietnamese via my import accounts.

  • If the translations are as bad as you say, start an effort to improve them. The Vietnamese community should be willing to help improve the situation if it is really as described.

Of course, I’m not asking that JOSM permanently delete the Vietnamese localization. But I hope to convince you that the majority of translated strings are incorrect enough to be removed now. And if these strings are removed, too few translated strings will remain for an official published localization.

I’ve reached out to the other translators whose translations were overwritten by the import to encourage them to take another look at the localization and comment on next steps. Regardless, these 7,000-plus strings will take a very long time to retranslate. The subset of Vietnamese users who contribute to OSM software translation is vanishingly small these days – I’m pretty much the only one who actively translates any of the projects. As things stand, I’m personally not very motivated to clean up this mess by hand compared to translating another editor from scratch.

To my knowledge, there’s no active, open communication channel for Vietnamese mappers. But I posted to the talk-vi mailing list, which has been mostly inactive since 2012, in case anyone on that list is still participating in OSM.

  • "JOSM incorrectly considers Vietnamese to lack a plural grammatical form." → JOSM does not define something itself. We take the plural forms from Launchpad and the definition for "vi" is this: https://translations.launchpad.net/+languages/vi which is also reflected in any po-file (Plural-Forms: nplurals=1; plural=0;).

I’m referring to this line in the JOSM codebase that gives Vietnamese only one grammatical number form. Vietnamese does indeed make a grammatical distinction between singular and plural using extra plural-marking words, even if it doesn’t inflect the noun itself for plural number. You can read about it in this introductory grammar textbook or this academic reference. As it is, we would have to write (các) lối này and (những) người này in parentheses, akin to writing “this/these way(s)” or “this person/these people” in English.

Launchpad probably got the incorrect plural form data from CLDR, which has a known issue in this regard. It’s understandable that JOSM would be unable to fix the plural setting independently of its CLDR-based translation platform. iD is also in the same situation with Transifex. So I’m not requesting that JOSM fix the plural setting just yet, even though it does contribute to the general brokenness of the localization.
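To make the effect of the plural rule concrete, here is a minimal sketch of how a gettext-style plural index selects a message form. The rule nplurals=1; plural=0 is the one quoted above for "vi", and nplurals=2; plural=n != 1 is the usual English rule. The function names are illustrative, and the two-form Vietnamese list is a hypothetical showing what a translator could write if the rule allowed a second form.

```python
def plural_index_one_form(n):
    """Plural-Forms: nplurals=1; plural=0; (the rule currently applied to "vi")."""
    return 0

def plural_index_two_forms(n):
    """Plural-Forms: nplurals=2; plural=n != 1; (the usual English rule)."""
    return int(n != 1)

def pick(forms, index_rule, n):
    """Select the message form that the plural rule assigns to the count n."""
    return forms[index_rule(n)]

en_forms = ["this way", "these ways"]
vi_one_form = ["(các) lối này"]           # one string must cover both numbers
vi_two_forms = ["lối này", "các lối này"]  # hypothetical, needs nplurals=2

for n in (1, 3):
    print(n,
          pick(en_forms, plural_index_two_forms, n), "|",
          pick(vi_one_form, plural_index_one_form, n), "|",
          pick(vi_two_forms, plural_index_two_forms, n))
```

With a single form, the parenthesized marker is the only way to stay correct for every count, which is the awkwardness described above.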

Altogether, while I cannot decide myself whether the Vietnamese in JOSM is correct or not, simply because I don't speak Vietnamese, your ticket has a lot of points which indicate that what you describe isn't exact either.

I will interpret this sentence as charitably as possible. Since you don’t speak Vietnamese, I’ve provided representative examples, quantitative analysis, and links to reputable supporting material that would hopefully give you some confidence that I’m not just making things up. I understand that deleting a localization wholesale is a somewhat extreme step, but it would motivate the community to work on a better translation more than the current state does. In the meantime, falling back to another language such as English would set a more accurate expectation with users.

Consider this an undiscussed, botched import that overwrote correct translations and drove away craft-translators and craft-mappers. I don’t fault you for committing vi.lang to the repo, because you didn’t know any better. But now that you’re aware of the problem, the criteria for reverting should be no more stringent than they were for importing it in the first place, right? After all, that’s how it normally works in OSM.

Last edited 5 months ago by 1ec5

comment:7 Changed 5 months ago by Don-vip

Thank you, Minh, for explaining the issue to us in so much detail.
I agree we should find a way to revert the 2015 change, or if we can't, delete the impacted strings. Then if we have fewer than 2000 translated core strings, remove the translation until enough strings are translated again. This wouldn't be the first time we have removed a translation.

comment:8 Changed 5 months ago by Don-vip

Milestone: 22.01

comment:9 Changed 4 months ago by nhoccondalonroi@…

I saw the report from Mr. Minh.
I'm Vietnamese and I agree that many translated strings in this issue are not good.
If possible, please modify them.

comment:10 in reply to:  2 Changed 4 months ago by Le Viet Thanh <lethanhx2k@…>

Replying to stoecker:

Reading your text, I have several issues.

  • I will not remove translations for an existing language only because a single somebody opens a ticket and asks to do so. If it's a serious issue for years I'd expect that topic should have come up more than once until now.
  • If the translations are as bad as you say, start an effort to improve them. The Vietnamese community should be willing to help improve the situation if it is really as described.
  • "JOSM incorrectly considers Vietnamese to lack a plural grammatical form." → JOSM does not define something itself. We take the plural forms from Launchpad and the definition for "vi" is this: https://translations.launchpad.net/+languages/vi which is also reflected in any po-file (Plural-Forms: nplurals=1; plural=0;).

Altogether, while I cannot decide myself whether the Vietnamese in JOSM is correct or not, simply because I don't speak Vietnamese, your ticket has a lot of points which indicate that what you describe isn't exact either.

Just to comment on the points that I know of, as one of the Vietnamese mappers and Vi translators for JOSM ca. 2009-2015:
There were not many active Vi mappers back in that period, and there still aren't. Most of the mappers were using the default En interface and then switched to the web editor (OSM iD), so no one has been aware of these changes, and I don't think Vietnamese mappers/translators will make the effort to fix the mistranslated words. Looking at some of the recently translated words that 1ec5 pointed out, in comparison with the 2015 version, they are complete nonsense in the context of the JOSM interface. I really appreciate 1ec5's enthusiastic contributions to OSM for over a decade and his raising of this translation issue. I think the simplest solution would be just to revert the changes back to the 2015 version.

Thanh
Retired OSM mapper (ninomax)

comment:11 Changed 3 months ago by stoecker

Milestone: 22.01 → 22.02

Milestone renamed

comment:12 Changed 3 months ago by Don-vip

Milestone: 22.02 → 22.03

comment:13 Changed 6 weeks ago by stoecker

Milestone: 22.03 → 22.04

comment:14 Changed 10 days ago by stoecker

Milestone: 22.04 → 22.05

Milestone renamed
