Opened 4 years ago

Closed 4 years ago

Last modified 5 months ago

#10989 closed defect (fixed)

I18n display support for major Asian scripts (Tamil, Bengali, ...)

Reported by: anonymous Owned by: team
Priority: normal Milestone: 15.02
Component: Core Version:
Keywords: i18n text tamil font Cc:

Description

I have no problem at all properly displaying major Asian scripts on the web, on any OS.
The exception is JOSM, which still does not properly bind the font support needed for these scripts, notably for Tamil or Marathi.
JOSM is certainly useful in India for millions of native users of these major scripts, which are used by speakers of some of the major languages of the world.
What can be done to set up the Java environment correctly to support these scripts? You recently added Khmer and a few minor languages, but Tamil is certainly a more urgent priority (it already works on the web for renderers and web editors). Hindi (and the Devanagari script) is not the only language to support for India.
JOSM does allow entering data in those languages (notably in "name:<lang>=*" tags), but you cannot read them without first copy-pasting them from JOSM into an external basic editor.
Things should be as simple as with Arabic, Urdu, Chinese, Korean, Thai and Hindi, and other Asian languages (including many other minor languages) using the same scripts.

It is much more important to support Tamil now than Asturian. Things are now OK with Khmer, although we lack Lao (and Burmese, whose support still lags behind on many platforms due to late developments).
Tamil is not in fact a complex script (much less so than Khmer), even if its encoding in Unicode, using the same model as Devanagari, has made things a bit more complicated. The Tamil script actually looks more like an alphabet and is definitely simpler than Arabic or even Thai (but Thai is supported easily because it was encoded not with the logical model but with the visual model, meaning that it is simpler to handle for things like normalization, but more complex to support for things such as collation; however, collation is a secondary goal compared to proper input and display support).

For now Tamil users are just told to use web editors, not JOSM. And there is already a lot of open data available in this language/script, ready to be integrated and updated, not just for India/Sri Lanka but for locations around the world! Integration is possible, but correcting this data is still a problem with JOSM.

Attachments (5)

josm-script-support-ubuntu1.png (55.6 KB) - added by bastiK 4 years ago.
josm-script-support-ubuntu2.png (18.1 KB) - added by bastiK 4 years ago.
josm-script.osm (2.6 KB) - added by bastiK 4 years ago.
test file for script support
10989_win8.png (23.8 KB) - added by Don-vip 4 years ago.
korean.png (20.5 KB) - added by Don-vip 5 months ago.

Change History (41)

comment:1 Changed 4 years ago by stoecker

Owner: changed from team to anonymous
Status: changed from new to needinfo

To support additional languages we need translations. We don't care whether Tamil is more important than Asturian or not. Our rule is simple: when we have translations (see Translations), we include them; otherwise we do not. Someone did translations for Asturian, but not for Tamil.

Regarding fonts: to display the relevant characters, Java needs to use fonts that contain them. Java does not automatically use all installed fonts, but works from lists of supported fonts. JOSM also allows setting fonts in the expert preferences. Please supply examples of text that is not displayed, so we can have a look. I did a short test, and it seems my machine can display Tamil letters as expected, though I'm not sure, because I actually know nothing about the script.
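As a rough illustration of the point about fonts, the following Java sketch lists the code points of a sample string for which a given font has no glyph. It relies on the standard `java.awt.Font.canDisplay(int)` method; the class name `FontCoverage`, the helper `missingCodePoints` and the Tamil sample string are made up for this example and are not part of JOSM. Note that having a glyph does not guarantee that a complex script is shaped correctly.

```java
import java.awt.Font;
import java.util.ArrayList;
import java.util.List;
import java.util.function.IntPredicate;

public class FontCoverage {

    /**
     * Returns the code points of text for which canDisplay reports false.
     * The predicate is a parameter so the logic can be tried without any font.
     */
    public static List<Integer> missingCodePoints(String text, IntPredicate canDisplay) {
        List<Integer> missing = new ArrayList<>();
        text.codePoints()
            .filter(cp -> !canDisplay.test(cp))
            .forEach(missing::add);
        return missing;
    }

    public static void main(String[] args) {
        // Hypothetical sample: the word "Tamil" written in Tamil script.
        String sample = "\u0BA4\u0BAE\u0BBF\u0BB4\u0BCD";
        // "Dialog" is one of Java's logical fonts; which physical fonts back it
        // depends on the platform's font configuration.
        Font font = new Font("Dialog", Font.PLAIN, 12);
        List<Integer> missing = missingCodePoints(sample, cp -> font.canDisplay(cp));
        System.out.println(missing.isEmpty()
                ? "Every character has a glyph in " + font.getFontName()
                : "No glyph for code points: " + missing);
    }
}
```

Running `main` on an affected machine should show whether the default logical font lacks glyphs for the sample, which is the kind of evidence asked for above.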

Anyway, in most cases we cannot do much; bugs and feature requests should be directed to the Java developers.

Last edited 4 years ago by stoecker

comment:2 in reply to:  description Changed 4 years ago by Don-vip

Replying to anonymous:

For now Tamil users are just told to use web editors, not JOSM.

Tell them to translate the software and we will include this new translation with great pleasure. Tamil has a long way to go: only 326 strings (3% of the whole software + plugins) have been translated. When we reach 2000 translated strings, we will see how to include the new language.

comment:3 Changed 4 years ago by bastiK

Translation is unrelated to support for the Tamil script; it's a chicken-and-egg problem. If you cannot enter the name of a POI in your native language, then you simply will not use JOSM. If you aren't a JOSM user, you don't care enough to spend hours on translations.

Changed 4 years ago by bastiK

Changed 4 years ago by bastiK

Changed 4 years ago by bastiK

Attachment: josm-script.osm added

test file for script support

comment:4 Changed 4 years ago by bastiK

On Ubuntu Linux, it used the DejaVu font which is designed to have wide Unicode coverage.

Still, Dhivehi (dv) doesn't work at all and there is a problem with Urdu (ur):



On Windows it uses Arial, as far as I know. So essentially this is neither a JOSM nor a Java problem, but depends on the language support of the OS.

Changed 4 years ago by Don-vip

Attachment: 10989_win8.png added

comment:5 Changed 4 years ago by Don-vip

On Windows 8.1 I have problems with more languages:



comment:6 Changed 4 years ago by stoecker

Under openSUSE 13.1 I get an exception with the test file:

Build-Date: 2015-01-23 08:40:36
Revision: 7983
Is-Local-Build: true

Identification: JOSM/1.5 (7983 SVN de) Linux openSUSE 13.1 (Bottle) (x86_64)
Memory Usage: 639 MB / 5351 MB (456 MB allocated, but free)
Java version: 1.7.0_51, Oracle Corporation, OpenJDK 64-Bit Server VM
Java package: java-1_7_0-openjdk:x86_64-1.7.0.6
Program arguments: [/home/stoecker/josm-script.osm]
Dataset consistency test: No problems found

java.lang.ArrayIndexOutOfBoundsException: -33030140
        at sun.font.FileFontStrike.getCachedGlyphPtr(FileFontStrike.java:472)
        at sun.font.FileFontStrike.getSlot0GlyphImagePtrs(FileFontStrike.java:438)
        at sun.font.CompositeStrike.getGlyphImagePtrs(CompositeStrike.java:115)
        at sun.font.GlyphList.mapChars(GlyphList.java:272)
        at sun.font.GlyphList.setFromString(GlyphList.java:244)
        at sun.java2d.pipe.GlyphListPipe.drawString(GlyphListPipe.java:71)
        at sun.java2d.pipe.ValidatePipe.drawString(ValidatePipe.java:165)
        at sun.java2d.SunGraphics2D.drawString(SunGraphics2D.java:2867)
        at sun.swing.SwingUtilities2.drawString(SwingUtilities2.java:552)
        at sun.swing.SwingUtilities2.drawStringUnderlineCharAt(SwingUtilities2.java:584)
        at javax.swing.plaf.basic.BasicLabelUI.paintEnabledText(BasicLabelUI.java:119)
        at javax.swing.plaf.basic.BasicLabelUI.paint(BasicLabelUI.java:179)
        at javax.swing.plaf.ComponentUI.update(ComponentUI.java:161)
        at javax.swing.JComponent.paintComponent(JComponent.java:769)
        at javax.swing.JComponent.paint(JComponent.java:1045)
        at javax.swing.CellRendererPane.paintComponent(CellRendererPane.java:151)
        at javax.swing.plaf.basic.BasicTableUI.paintCell(BasicTableUI.java:2109)
        at javax.swing.plaf.basic.BasicTableUI.paintCells(BasicTableUI.java:2010)
        at javax.swing.plaf.basic.BasicTableUI.paint(BasicTableUI.java:1806)
        at javax.swing.plaf.ComponentUI.update(ComponentUI.java:161)
        at javax.swing.JComponent.paintComponent(JComponent.java:769)
        at javax.swing.JComponent.paint(JComponent.java:1045)
        at javax.swing.JComponent.paintChildren(JComponent.java:878)
        at javax.swing.JComponent.paint(JComponent.java:1054)
        at javax.swing.JComponent.paintChildren(JComponent.java:878)
        at javax.swing.JComponent.paint(JComponent.java:1054)
        at javax.swing.JViewport.paint(JViewport.java:731)
        at javax.swing.JComponent.paintChildren(JComponent.java:878)
        at javax.swing.JComponent.paint(JComponent.java:1054)
        at javax.swing.JComponent.paintToOffscreen(JComponent.java:5210)
        at javax.swing.RepaintManager$PaintManager.paintDoubleBuffered(RepaintManager.java:1529)
        at javax.swing.RepaintManager$PaintManager.paint(RepaintManager.java:1452)
        at javax.swing.RepaintManager.paint(RepaintManager.java:1249)
        at javax.swing.JComponent._paintImmediately(JComponent.java:5158)
        at javax.swing.JComponent.paintImmediately(JComponent.java:4969)
        at javax.swing.RepaintManager$3.run(RepaintManager.java:808)
        at javax.swing.RepaintManager$3.run(RepaintManager.java:796)
        at java.security.AccessController.doPrivileged(Native Method)
        at java.security.ProtectionDomain$1.doIntersectionPrivilege(ProtectionDomain.java:76)
        at javax.swing.RepaintManager.paintDirtyRegions(RepaintManager.java:796)
        at javax.swing.RepaintManager.paintDirtyRegions(RepaintManager.java:769)
        at javax.swing.RepaintManager.prePaintDirtyRegions(RepaintManager.java:718)
        at javax.swing.RepaintManager.access$1100(RepaintManager.java:62)
        at javax.swing.RepaintManager$ProcessingRunnable.run(RepaintManager.java:1677)
        at java.awt.event.InvocationEvent.dispatch(InvocationEvent.java:251)
        at java.awt.EventQueue.dispatchEventImpl(EventQueue.java:733)
        at java.awt.EventQueue.access$200(EventQueue.java:103)
        at java.awt.EventQueue$3.run(EventQueue.java:694)
        at java.awt.EventQueue$3.run(EventQueue.java:692)
        at java.security.AccessController.doPrivileged(Native Method)
        at java.security.ProtectionDomain$1.doIntersectionPrivilege(ProtectionDomain.java:76)
        at java.awt.EventQueue.dispatchEvent(EventQueue.java:703)
        at java.awt.EventDispatchThread.pumpOneEventForFilters(EventDispatchThread.java:242)
        at java.awt.EventDispatchThread.pumpEventsForFilter(EventDispatchThread.java:161)
        at java.awt.EventDispatchThread.pumpEventsForHierarchy(EventDispatchThread.java:150)
        at java.awt.EventDispatchThread.pumpEvents(EventDispatchThread.java:146)
        at java.awt.EventDispatchThread.pumpEvents(EventDispatchThread.java:138)
        at java.awt.EventDispatchThread.run(EventDispatchThread.java:91)

This is followed by a lot of error windows or a broken layout of the side bar.

Last edited 4 years ago by stoecker

comment:7 Changed 4 years ago by skyper

No errors on Debian 7.8, though I do not have many fonts installed and am therefore missing lots of symbols.

Have the same problem with Urdu (ur) as bastiK on Ubuntu.

comment:8 Changed 4 years ago by stoecker

Ticket #11020 has been marked as a duplicate of this ticket.

comment:9 Changed 4 years ago by bastiK

Ticket #11066 has been marked as a duplicate of this ticket.

comment:10 Changed 4 years ago by bastiK

Summary: changed from "I18n display support for major Asian scripts notably in India (Tamil...)" to "I18n display support for major Asian scripts (Tamil, Bengali, ...)"

comment:11 Changed 4 years ago by bastiK

Owner: changed from anonymous to team
Status: changed from needinfo to new

comment:12 in reply to:  6 Changed 4 years ago by bastiK

Replying to stoecker:

Under openSUSE 13.1 I get an exception with the test file:

You can at least find out what language is causing the error, but please in a separate ticket.

comment:13 Changed 4 years ago by stoecker

See #11068 for crash report.

comment:14 Changed 4 years ago by Klumbumbus

If this helps: on my system (Win7) I see the same problems as Don-vip with Win8.1 (comment:5).

Revision: 8002
Repository Root: http://josm.openstreetmap.de/svn
Relative URL: ^/trunk
Last Changed Author: Don-vip
Last Changed Date: 2015-02-02 23:56:07 +0100 (Mon, 02 Feb 2015)
Build-Date: 2015-02-03 02:34:26
URL: http://josm.openstreetmap.de/svn/trunk
Repository UUID: 0c6e7542-c601-0410-84e7-c038aed88b3b
Last Changed Rev: 8002

Identification: JOSM/1.5 (8002 en) Windows 7 32-Bit
Memory Usage: 247 MB / 742 MB (109 MB allocated, but free)
Java version: 1.8.0_31, Oracle Corporation, Java HotSpot(TM) Client VM
VM arguments: [-Djava.security.manager, -Djava.security.policy=file:C:\Program Files\Java\jre1.8.0_31\lib\security\javaws.policy, -DtrustProxy=true, -Djnlpx.home=<java.home>\bin, -Djnlpx.origFilenameArg=C:\Program Files\josm-latest.jnlp, -Djnlpx.remove=false, -Djava.util.Arrays.useLegacyMergeSort=true, -Djnlpx.heapsize=256m,768m, -Djnlpx.splashport=55605, -Djnlpx.jvm=<java.home>\bin\javaw.exe, -Djnlpx.vmargs=LURqYXZhLnV0aWwuQXJyYXlzLnVzZUxlZ2FjeU1lcmdlU29ydD10cnVlAA==]
Program arguments: [--debug]
Dataset consistency test: No problems found


comment:15 Changed 4 years ago by stoecker

Now that the crash is gone, my display under Linux is slightly better (see installed fonts in #9729). I'm missing bn, bo, bpy, dv, kn, ml, pa, si, te (so I have am, my, ta). ur is fine as well.

Maybe we should put together a recommendation of which fonts should be installed on which system, to help our users? A starting point would be to find the differences between Paul's system and yours.

comment:16 Changed 4 years ago by Don-vip

Keywords: font added

comment:17 Changed 4 years ago by bastiK

Keywords: font removed

According to Wikipedia and the Microsoft documentation, the following Tamil and Bengali fonts ship with Windows:

  • Vrinda (Bengali) - Windows XP and later
  • Shonar Bangla (Bengali) - Windows 7 and later
  • Latha (Tamil) - Windows XP and later
  • Vijaya (Tamil) - Windows 7 and later
  • Nirmala UI (Bengali, Tamil and others) - Windows 8 and later

I tested on Windows XP and I can indeed display both scripts in Notepad, but I have to select the Vrinda / Latha font explicitly.

When Latha is selected, it shows only Tamil and no Roman script. As you cannot have fallback fonts for a JTextField, this makes it practically useless in our case.

Arial Unicode MS is a font which seems to have fairly good coverage, but it only comes with MS Word and Outlook 2000. Anyway, it may be an option to switch to this font in case it is available.
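A minimal sketch of the "switch to this font in case it is available" idea: ask Java for the installed font families and take the first preferred one that is present. `GraphicsEnvironment.getAvailableFontFamilyNames()` is standard API; the class `FontPicker`, the method `firstAvailable` and the order of preference shown are assumptions for illustration only, not what JOSM does.

```java
import java.awt.GraphicsEnvironment;
import java.util.Arrays;
import java.util.Collection;
import java.util.List;
import java.util.Optional;

public class FontPicker {

    /** Returns the first preferred family that is installed, if any (exact name match). */
    public static Optional<String> firstAvailable(List<String> preferred, Collection<String> installed) {
        return preferred.stream().filter(installed::contains).findFirst();
    }

    public static void main(String[] args) {
        List<String> installed = Arrays.asList(
                GraphicsEnvironment.getLocalGraphicsEnvironment().getAvailableFontFamilyNames());
        // Assumed order of preference: the broad-coverage font first, then Nirmala UI.
        List<String> preferred = Arrays.asList("Arial Unicode MS", "Nirmala UI");
        System.out.println(firstAvailable(preferred, installed)
                .map(name -> "Using " + name)
                .orElse("None of the preferred fonts is installed; keeping the default"));
    }
}
```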

Last edited 4 years ago by bastiK

comment:18 Changed 4 years ago by bastiK

Keywords: font added

comment:19 Changed 4 years ago by bastiK

On Ubuntu, it apparently uses the Lohit fonts for Indic scripts:

From my fontconfig.properties:

# Indic scripts
allfonts.bengali=Lohit Bengali
allfonts.gujarati=Lohit Gujarati
allfonts.hindi=Lohit Hindi
#allfonts.malayalam=Lohit Malayalam
allfonts.oriya=Lohit Oriya
allfonts.punjabi=Lohit Punjabi
allfonts.tamil=Lohit Tamil
allfonts.telugu=Lohit Telugu
allfonts.sinhala=LKLUG

[...]

# Search Sequences

[...]

sequence.fallback=wqy-microhei,uminghk,shanheisun,wqy-zenhei,japanese-ipafont,japanese-vlgothic,japanese-sazanami,bengali,gujarati,hindi,oriya,punjabi,tamil,telugu
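To illustrate how the two parts of the excerpt above relate, here is a small Java sketch that resolves the keys of `sequence.fallback` to the font names declared under `allfonts.*`. It uses only `java.util.Properties`; the class `FallbackSequence` and its methods are hypothetical, and the lookup is deliberately simplified (a real `fontconfig.properties` can also define fonts per logical font and style, which this sketch ignores).

```java
import java.io.IOException;
import java.io.StringReader;
import java.io.UncheckedIOException;
import java.util.ArrayList;
import java.util.List;
import java.util.Properties;

public class FallbackSequence {

    /** Parses properties-file text; wraps the (unlikely) I/O error of an in-memory reader. */
    public static Properties parse(String text) {
        Properties props = new Properties();
        try {
            props.load(new StringReader(text));
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
        return props;
    }

    /** Maps each key in sequence.fallback to its allfonts.<key> font name, skipping undefined keys. */
    public static List<String> fallbackFonts(Properties props) {
        List<String> fonts = new ArrayList<>();
        for (String key : props.getProperty("sequence.fallback", "").split(",")) {
            String name = props.getProperty("allfonts." + key.trim());
            if (name != null) {
                fonts.add(name);
            }
        }
        return fonts;
    }

    public static void main(String[] args) {
        String excerpt = String.join("\n",
                "# Indic scripts",
                "allfonts.bengali=Lohit Bengali",
                "allfonts.tamil=Lohit Tamil",
                "sequence.fallback=bengali,tamil,telugu");
        // telugu has no allfonts entry in this shortened excerpt, so it is skipped
        System.out.println(fallbackFonts(parse(excerpt)));
    }
}
```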

comment:20 Changed 4 years ago by bastiK

In 8006/josm:

see #10989 - I18n display support for major Asian scripts (Tamil, Bengali, ...)

add support for the following charsets on Windows: devanagari, gurmukhi,
gujarati, tamil, telugu, kannada, bengali

Adds a couple of new advanced preference values:

  • "fontconfig.properties" - allows installing a custom fontconfig.properties file (for advanced users). Default value: <unset>
  • "font.extended-unicode" - allows disabling the new experimental code introduced with this changeset. Default value: "true" (enabled)
  • "font.extended-unicode.added-items" - specify the charsets and fonts that get added to the default Java font configuration.

This can be changed by the user to add different or more fallback fonts.
Format: 3 entries per line; 1st: character subset, 2nd: Font name, 3rd: Font file name.
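The three-field format described above could be read along these lines. The changeset text does not state the separator, so the colon used here is an assumption, as are the class and method names; the sample entry pairs the Latha font with its file name LATHA.TTF purely as an example.

```java
public class ExtraFontItem {
    public final String charset;   // 1st entry: character subset
    public final String fontName;  // 2nd entry: font name
    public final String fileName;  // 3rd entry: font file name

    private ExtraFontItem(String charset, String fontName, String fileName) {
        this.charset = charset;
        this.fontName = fontName;
        this.fileName = fileName;
    }

    /** Parses one line; the ':' separator is an assumption, not taken from the changeset. */
    public static ExtraFontItem parse(String line) {
        String[] parts = line.split(":", -1);
        if (parts.length != 3) {
            throw new IllegalArgumentException(
                    "Expected 3 entries but got " + parts.length + ": " + line);
        }
        return new ExtraFontItem(parts[0].trim(), parts[1].trim(), parts[2].trim());
    }

    public static void main(String[] args) {
        ExtraFontItem item = parse("tamil:Latha:LATHA.TTF");
        System.out.println(item.charset + " -> " + item.fontName + " (" + item.fileName + ")");
    }
}
```

Rejecting lines that do not have exactly three fields keeps a mistyped preference value from silently registering a wrong font.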

comment:21 Changed 4 years ago by bastiK

Cc: JunaidAhmed added

comment:22 Changed 4 years ago by bastiK

I've added support for a few more scripts. Could you please test and report whether it works? I have no experience with Indic scripts, so I cannot tell if it looks acceptable or not. I've selected the following fonts: TUNGA.TTF, RAAVI.TTF, LATHA.TTF, GAUTAMI.TTF and VRINDA.TTF. Please let me know if you prefer different fonts or need more.

It is possible to configure the additional fonts directly with the advanced preference value font.extended-unicode.added-items (see changeset comment in [8006]).

comment:23 Changed 4 years ago by bastiK

comment:24 Changed 4 years ago by bastiK

In 8014/josm:

see #10989 - added 3 more scripts (Syriac, Thaana, Malayalam)

comment:25 Changed 4 years ago by stoecker

For openSUSE I have now created a josm-fonts package, which installs the following fonts:

# Standard to silence fonts-config
Requires:       arphic-gbsn00lp-fonts
Requires:       arphic-bsmi00lp-fonts
Requires:       ipa-gothic-fonts
Requires:       ipa-pgothic-fonts
Requires:       ipa-pmincho-fonts
# standard java, enables language "ko" display (korean)
Requires:       baekmuk-ttf-fonts
# to enable "bn", "bpy", "kn", "ko", "ml", "pa", "sa", "te" languages
Requires:       indic-fonts
# to enable "my" language (burmese)
Requires:       sil-padauk-fonts
# to enable "bo" language (tibetan) 
Requires:       jomolhari-fonts
# to enable "km" language (khmer)
Requires:       khmeros-fonts  
# to enable "si" language (sinhala)
Requires:       lklug-fonts

I haven't found a solution for "dv" yet, and the "Java-OpenStreetMap" part of the JOSM program texts looks broken in the Khmer language.

Maybe we can do the same for the Ubuntu repo?

comment:26 Changed 4 years ago by skyper

Still, we should document the fonts to make it easier for package maintainers to suggest the right font packages.

comment:27 Changed 4 years ago by bastiK

Cc: JunaidAhmed removed

comment:28 Changed 4 years ago by bastiK

Resolution: fixed
Status: changed from new to closed

In 8015/josm:

fixed #10989 - I18n display support for major Asian scripts (Tamil, Bengali, ...)

adds more scripts for Windows, including Tibetan, Khmer, Lao, Mongolian, Myanmar
Only works if corresponding font is installed, so basically starting
with a certain Windows version. See source code for details.

Advanced pref font.extended-unicode.added-items renamed to
font.extended-unicode.extra-items, type is now list-of-maps.

comment:29 Changed 4 years ago by bastiK

Font support on Windows is not bad and many scripts are covered. However, Oracle doesn't bother to include these fonts in their default configuration for Java and makes it ridiculously difficult to fix this on application level.

It's also strange that no one has written a decent fontconfig.properties file for Windows yet; at least I couldn't find one. Is JOSM the first Java application ever that needs support for all the major scripts?

comment:30 in reply to:  25 Changed 4 years ago by bastiK

Replying to stoecker:

Maybe we can do the same for the Ubuntu repo?

Yes, good idea!

comment:31 Changed 4 years ago by Don-vip

Milestone: 15.02

comment:32 Changed 4 years ago by Don-vip

In 8099/josm:

see #10989 - add support for more languages on recent versions of Windows:

  • Yi, 2m (ii, Win Vista)
  • Tai Lü, 700k (khb, Win 7)
  • Lisu, 940k (lis, Win 8)
  • Buginese, 5m (bug, Win 8.1)
  • Javanese, 82m (jav, Win 8.1)
  • Santali, 6m (sat, Win 8.1: Ol Chiki script)
  • Sora, 250k (srb, Win 8.1: Sora Sompeng script)

+ fix potential NPE, improve javadoc

comment:34 in reply to:  29 Changed 4 years ago by JunaidAhmed

Replying to bastiK:

Font support on Windows is not bad and many scripts are covered. However, Oracle doesn't bother to include these fonts in their default configuration for Java and makes it ridiculously difficult to fix this on application level.

It's also strange that no one has written a decent fontconfig.properties file for Windows yet; at least I couldn't find one. Is JOSM the first Java application ever that needs support for all the major scripts?

Thanks, bastiK, for adding support for the Bengali (Bangla) language in JOSM. It still can't render Bangla properly, though (except in the right editing panel; here's a screenshot: https://onedrive.live.com/redir?resid=B09DFB31F3364570%215325&authkey=%21AOV0bvcZPU_uWBs&v=3&ithint=photo%2cjpg), just like openstreetmap.org (discussed here: https://github.com/gravitystorm/openstreetmap-carto/issues/1346). But thank you anyway.

comment:35 Changed 4 years ago by bastiK

Hi JunaidAhmed, thanks for reporting the problem! This ticket is dedicated to adding fonts to the Java runtime. I would like to discuss the specific rendering problems in a separate ticket, see #11194.

Changed 5 months ago by Don-vip

Attachment: korean.png added

comment:36 Changed 5 months ago by Don-vip

Korean does not work anymore on Windows 10 / Java 8u172 / JOSM r13790:

See #16215, where Korean is being added as a new language.


comment:37 Changed 5 months ago by Don-vip

In 13791/josm:

see #10989, #16215, fix #16199 - load "Malgun Gothic" font on Windows in order to display Korean characters on Windows 10 without Korean language pack installed
