Opened 9 months ago

Closed 6 weeks ago

Last modified 7 days ago

#14602 closed defect (fixed)

Confusion between digit group separators and decimal marks

Reported by: sommerluk
Owned by: team
Priority: minor
Milestone: 17.10
Component: Core
Version:
Keywords: decimal separator
Cc: bastiK

Description

In JOSM (Settings, Audio player, Speed), I can see “1.3”, which seems to mean “one and three tenths”. In JOSM (Settings, WMS/TMS, Settings, Cache size), I can see “20.000”, which means “twenty thousand”. So the full stop “.” is sometimes used as a digit group separator and sometimes as a decimal mark. That’s extremely confusing and definitely wrong.

Furthermore, this is also a localization issue. In some countries “,” is the decimal mark; in others it is “.”. The world seems to be divided fifty-fifty. As I’m using JOSM with the localization “German (Germany)”, I would expect JOSM to use “,” as the decimal mark, but it doesn’t. Maybe JOSM intentionally ignores the localization of the decimal mark, because geo URLs often use the decimal point as separator and JOSM doesn’t want to confuse the user further?

Anyway, for digit group separators it would be a good idea to consistently use a space in all fields, independent of the localization, as ISO 31-0 prescribes. The character U+202F NARROW NO-BREAK SPACE might be a good space character to use.

Attachments (0)

Change History (19)

comment:1 Changed 8 months ago by stoecker

Priority: normal → minor

Hmm, the issue is:

  • All input fields use the English standard "." as decimal separator
  • All translated displayed texts (e.g. the cache size) use the localized variant, because of the MessageFormat function used to create the text.

JOSM should use the English standard always, as that is also used in the database.
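
The mismatch described above can be reproduced outside JOSM. The following is a minimal sketch, not JOSM code: the class and method names are made up for illustration, and it only assumes that displayed text goes through java.text.MessageFormat while input fields are parsed with Double.parseDouble.

```java
import java.text.MessageFormat;
import java.util.Locale;

// Hypothetical demo class, not part of JOSM.
final class SeparatorMismatchDemo {

    // Formats a number the way a translated message would, for the given locale.
    static String display(Locale locale, int value) {
        return new MessageFormat("{0}", locale).format(new Object[] {value});
    }

    // Returns true if the text is accepted by a plain "English standard" parser.
    static boolean parsesAsInput(String text) {
        try {
            Double.parseDouble(text);
            return true;
        } catch (NumberFormatException e) {
            return false;
        }
    }

    public static void main(String[] args) {
        System.out.println(display(Locale.GERMANY, 20000)); // grouped with "."
        System.out.println(display(Locale.US, 20000));      // grouped with ","
        System.out.println(parsesAsInput("1.3"));           // true
        System.out.println(parsesAsInput("1,3"));           // false
    }
}
```

Note that a German display string such as "20.000" would even be accepted by Double.parseDouble, but as the value 20, which is exactly the kind of confusion the report describes.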

comment:2 Changed 8 months ago by sommerluk

JOSM should use the English standard always,
as that is also used in the database.

It does indeed sound reasonable to use the same style everywhere in JOSM as in the OSM database. So:

Decimal mark: Always “.” (independent of language and locale)

Digit group separator: Never used (independent of language and locale)

Not using the digit group separator would make long numbers harder to read, but it’s the only way to get a consistent feeling between JOSM and the database.

comment:3 Changed 2 months ago by Don-vip

Agreed. Concerning ISO, it's now ISO 80000-1, which states in its chapter 7.3.1 (Numbers, General):

To facilitate the reading of numbers with many digits, these may be separated into groups of three, counting from the decimal sign towards the left and the right. No group shall contain more than three digits. Where such separation into groups of three is used, the groups shall be separated by a small space and not by a point or a comma or by any other means.

In the case where there is no decimal part (and thus no decimal sign), the counting shall be from the rightmost digit towards the left.

The separation into groups of three should not be used for ordinal numbers used as reference numbers, e.g. ISO 80000-1.
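
As an illustration of the quoted rule, here is a small sketch using java.text.DecimalFormat with U+202F NARROW NO-BREAK SPACE as the "small space". The class name and the pattern are assumptions chosen for this example. DecimalFormat only groups the integer part, so the grouping to the right of the decimal sign that the standard also permits is not covered.

```java
import java.text.DecimalFormat;
import java.text.DecimalFormatSymbols;
import java.util.Locale;

// Hypothetical helper, not part of JOSM.
final class Iso80000Grouping {

    static final char NARROW_NBSP = '\u202F';

    // Formats a value with "." as decimal sign and a narrow space between groups of three.
    static String format(double value) {
        DecimalFormatSymbols symbols = DecimalFormatSymbols.getInstance(Locale.ROOT);
        symbols.setDecimalSeparator('.');
        symbols.setGroupingSeparator(NARROW_NBSP);
        // "#,##0.###": groups of three, at most three fraction digits.
        return new DecimalFormat("#,##0.###", symbols).format(value);
    }

    public static void main(String[] args) {
        System.out.println(format(20000));
        System.out.println(format(1234567.5));
    }
}
```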

comment:4 Changed 2 months ago by Don-vip

Cc: bastiK added
Milestone: 17.10

comment:5 Changed 2 months ago by bastiK

In 12917/josm:

see #14602 - don't use thousands separator when displaying the JOSM version

comment:6 Changed 2 months ago by Don-vip

@bastiK: I have edited your svn log message to reference this bug report :)

comment:7 in reply to:  6 ; Changed 2 months ago by bastiK

Replying to Don-vip:

@bastiK: I have edited your svn log message to reference this bug report :)

Each time I'm amazed it doesn't crash everything. :)

comment:8 in reply to:  7 Changed 2 months ago by stoecker

Replying to bastiK:

Replying to Don-vip:

@bastiK: I have edited your svn log message to reference this bug report :)

Each time I'm amazed it doesn't crash everything. :)

Well, it is not CVS, RCS or SCCS. It's a bit more modern ;-)

comment:9 Changed 2 months ago by bastiK

This removes the thousands separator (plainIntegerFormat) or changes it to thin space (smallSpaceIntegerFormat) for integers:

  • src/org/openstreetmap/josm/tools/I18n.java

     
    @@ -10,7 +10,10 @@
     import java.lang.annotation.RetentionPolicy;
     import java.net.URL;
     import java.nio.charset.StandardCharsets;
    +import java.text.DecimalFormat;
    +import java.text.DecimalFormatSymbols;
     import java.text.MessageFormat;
    +import java.text.NumberFormat;
     import java.util.ArrayList;
     import java.util.Arrays;
     import java.util.Collection;
    @@ -114,9 +117,31 @@
          */
         public static String tr(String text, Object... objects) {
             if (text == null) return null;
    -        return MessageFormat.format(gettext(text, null), objects);
    +        return customMessageFormat(gettext(text, null), objects);
         }
     
    +    public static final NumberFormat plainIntegerFormat = NumberFormat.getIntegerInstance();
    +    public static final NumberFormat smallSpaceIntegerFormat = NumberFormat.getIntegerInstance();
    +    static {
    +        plainIntegerFormat.setGroupingUsed(false);
    +        if (smallSpaceIntegerFormat instanceof DecimalFormat) {
    +            DecimalFormat df = (DecimalFormat) smallSpaceIntegerFormat;
    +            DecimalFormatSymbols dfs = df.getDecimalFormatSymbols();
    +            dfs.setGroupingSeparator('\u2009');
    +            df.setDecimalFormatSymbols(dfs);
    +        }
    +    }
    +
    +    public static String customMessageFormat(String text, Object... objects) {
    +        MessageFormat mf = new MessageFormat(text);
    +        for (int i = 0; i < objects.length; i++) {
    +            if (objects[i] instanceof Integer || objects[i] instanceof Long) {
    +                mf.setFormatByArgumentIndex(i, plainIntegerFormat);
    +            }
    +        }
    +        return mf.format(objects);
    +    }
    +
         /**
          * Translates some text in a context for the current locale.
          * There can be different translations for the same text within different contexts.
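
For readers who want to try the idea of this patch in isolation, the sketch below applies the same MessageFormat.setFormatByArgumentIndex technique in a self-contained class. The class name, the explicit locale parameter and the use of Locale.ROOT for the plain integer format are assumptions made for the example and differ from the patch, which relies on the default locale.

```java
import java.text.MessageFormat;
import java.text.NumberFormat;
import java.util.Locale;

// Hypothetical stand-alone variant of the approach in the patch above.
final class PlainIntegerMessages {

    // Builds an integer format without digit grouping, e.g. 20000 instead of 20.000.
    // A fresh instance is created per call because NumberFormat is not thread-safe.
    private static NumberFormat plainIntegerFormat() {
        NumberFormat nf = NumberFormat.getIntegerInstance(Locale.ROOT);
        nf.setGroupingUsed(false);
        return nf;
    }

    // Formats the pattern for the locale, but prints Integer and Long arguments ungrouped.
    static String format(Locale locale, String pattern, Object... args) {
        MessageFormat mf = new MessageFormat(pattern, locale);
        for (int i = 0; i < args.length; i++) {
            if (args[i] instanceof Integer || args[i] instanceof Long) {
                mf.setFormatByArgumentIndex(i, plainIntegerFormat());
            }
        }
        return mf.format(args);
    }

    public static void main(String[] args) {
        System.out.println(format(Locale.GERMANY, "Cache size: {0}", 20000));
    }
}
```

Only integer arguments are touched; a Double argument is still formatted with the locale's own decimal mark.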

comment:10 Changed 2 months ago by Don-vip

Don't commit it please, I'm working on a totally different approach but it takes some time to debug Java internals...

comment:11 Changed 2 months ago by Don-vip

OK, here's my solution:

  • build.xml

     
    @@ -63,6 +63,8 @@
         <target name="create-revision-eclipse">
             <property name="revision.dir" value="bin"/>
             <antcall target="create-revision"/>
    +        <mkdir dir="bin/META-INF/services"/>
    +        <echo encoding="UTF-8" file="bin/META-INF/services/java.text.spi.DecimalFormatSymbolsProvider">org.openstreetmap.josm.tools.JosmDecimalFormatSymbolsProvider</echo>
         </target>
         <!--
           ** Initializes the REVISION.XML file from SVN information
    @@ -150,6 +152,7 @@
                     <attribute name="Add-Exports" value="java.base/sun.security.util java.base/sun.security.x509 java.desktop/com.apple.eawt java.desktop/com.sun.imageio.spi javafx.graphics/com.sun.javafx.application jdk.deploy/com.sun.deploy.config" />
                     <attribute name="Add-Opens" value="java.base/java.lang java.base/jdk.internal.loader java.desktop/javax.imageio.spi java.desktop/javax.swing.text.html java.prefs/java.util.prefs" />
                 </manifest>
    +            <service type="java.text.spi.DecimalFormatSymbolsProvider" provider="org.openstreetmap.josm.tools.JosmDecimalFormatSymbolsProvider" />
                 <zipfileset dir="images" prefix="images"/>
                 <zipfileset dir="data" prefix="data"/>
                 <zipfileset dir="styles" prefix="styles"/>
  • src/org/openstreetmap/josm/tools/I18n.java

     
    @@ -89,6 +89,50 @@
         private static volatile Map<String, String> strings;
         private static volatile Map<String, String[]> pstrings;
         private static Map<String, PluralMode> languages = new HashMap<>();
    +    static {
    +        //languages.put("ar", PluralMode.MODE_AR);
    +        languages.put("ast", PluralMode.MODE_NOTONE);
    +        languages.put("bg", PluralMode.MODE_NOTONE);
    +        languages.put("be", PluralMode.MODE_RU);
    +        languages.put("ca", PluralMode.MODE_NOTONE);
    +        languages.put("ca@valencia", PluralMode.MODE_NOTONE);
    +        languages.put("cs", PluralMode.MODE_CS);
    +        languages.put("da", PluralMode.MODE_NOTONE);
    +        languages.put("de", PluralMode.MODE_NOTONE);
    +        languages.put("el", PluralMode.MODE_NOTONE);
    +        languages.put("en_AU", PluralMode.MODE_NOTONE);
    +        languages.put("en_GB", PluralMode.MODE_NOTONE);
    +        languages.put("es", PluralMode.MODE_NOTONE);
    +        languages.put("et", PluralMode.MODE_NOTONE);
    +        //languages.put("eu", PluralMode.MODE_NOTONE);
    +        languages.put("fi", PluralMode.MODE_NOTONE);
    +        languages.put("fr", PluralMode.MODE_GREATERONE);
    +        languages.put("gl", PluralMode.MODE_NOTONE);
    +        //languages.put("he", PluralMode.MODE_NOTONE);
    +        languages.put("hu", PluralMode.MODE_NOTONE);
    +        languages.put("id", PluralMode.MODE_NONE);
    +        //languages.put("is", PluralMode.MODE_NOTONE);
    +        languages.put("it", PluralMode.MODE_NOTONE);
    +        languages.put("ja", PluralMode.MODE_NONE);
    +        // fully supported only with Java 8 and later (needs CLDR)
    +        languages.put("km", PluralMode.MODE_NONE);
    +        languages.put("lt", PluralMode.MODE_LT);
    +        languages.put("nb", PluralMode.MODE_NOTONE);
    +        languages.put("nl", PluralMode.MODE_NOTONE);
    +        languages.put("pl", PluralMode.MODE_PL);
    +        languages.put("pt", PluralMode.MODE_NOTONE);
    +        languages.put("pt_BR", PluralMode.MODE_GREATERONE);
    +        //languages.put("ro", PluralMode.MODE_RO);
    +        languages.put("ru", PluralMode.MODE_RU);
    +        languages.put("sk", PluralMode.MODE_SK);
    +        //languages.put("sl", PluralMode.MODE_SL);
    +        languages.put("sv", PluralMode.MODE_NOTONE);
    +        //languages.put("tr", PluralMode.MODE_NONE);
    +        languages.put("uk", PluralMode.MODE_RU);
    +        languages.put("vi", PluralMode.MODE_NONE);
    +        languages.put("zh_CN", PluralMode.MODE_NONE);
    +        languages.put("zh_TW", PluralMode.MODE_NONE);
    +    }
     
         /**
          * Translates some text for the current locale.
    @@ -300,58 +344,20 @@
             return languages.containsKey(code);
         }
     
    +    static void setupJavaLocaleProviders() {
    +        // Look up SPI providers first (for JosmDecimalFormatSymbolsProvider).
    +        // Enable CLDR locale provider on Java 8 to get additional languages, such as Khmer.
    +        // http://docs.oracle.com/javase/8/docs/technotes/guides/intl/enhancements.8.html#cldr
    +        // FIXME: This must be updated after we switch to Java 9.
    +        // See https://docs.oracle.com/javase/9/docs/api/java/util/spi/LocaleServiceProvider.html
    +        System.setProperty("java.locale.providers", "SPI,JRE,CLDR"); // Don't call Utils.updateSystemProperty to avoid spurious log at startup
    +    }
    +
         /**
          * I18n initialization.
          */
         public static void init() {
    -        // Enable CLDR locale provider on Java 8 to get additional languages, such as Khmer.
    -        // http://docs.oracle.com/javase/8/docs/technotes/guides/intl/enhancements.8.html#cldr
    -        // FIXME: This can be removed after we switch to a minimal version of Java that enables CLDR by default
    -        // or includes all languages we need in the JRE. See http://openjdk.java.net/jeps/252 for Java 9
    -        System.setProperty("java.locale.providers", "JRE,CLDR"); // Don't call Utils.updateSystemProperty to avoid spurious log at startup
    -
    -        //languages.put("ar", PluralMode.MODE_AR);
    -        languages.put("ast", PluralMode.MODE_NOTONE);
    -        languages.put("bg", PluralMode.MODE_NOTONE);
    -        languages.put("be", PluralMode.MODE_RU);
    -        languages.put("ca", PluralMode.MODE_NOTONE);
    -        languages.put("ca@valencia", PluralMode.MODE_NOTONE);
    -        languages.put("cs", PluralMode.MODE_CS);
    -        languages.put("da", PluralMode.MODE_NOTONE);
    -        languages.put("de", PluralMode.MODE_NOTONE);
    -        languages.put("el", PluralMode.MODE_NOTONE);
    -        languages.put("en_AU", PluralMode.MODE_NOTONE);
    -        languages.put("en_GB", PluralMode.MODE_NOTONE);
    -        languages.put("es", PluralMode.MODE_NOTONE);
    -        languages.put("et", PluralMode.MODE_NOTONE);
    -        //languages.put("eu", PluralMode.MODE_NOTONE);
    -        languages.put("fi", PluralMode.MODE_NOTONE);
    -        languages.put("fr", PluralMode.MODE_GREATERONE);
    -        languages.put("gl", PluralMode.MODE_NOTONE);
    -        //languages.put("he", PluralMode.MODE_NOTONE);
    -        languages.put("hu", PluralMode.MODE_NOTONE);
    -        languages.put("id", PluralMode.MODE_NONE);
    -        //languages.put("is", PluralMode.MODE_NOTONE);
    -        languages.put("it", PluralMode.MODE_NOTONE);
    -        languages.put("ja", PluralMode.MODE_NONE);
    -        // fully supported only with Java 8 and later (needs CLDR)
    -        languages.put("km", PluralMode.MODE_NONE);
    -        languages.put("lt", PluralMode.MODE_LT);
    -        languages.put("nb", PluralMode.MODE_NOTONE);
    -        languages.put("nl", PluralMode.MODE_NOTONE);
    -        languages.put("pl", PluralMode.MODE_PL);
    -        languages.put("pt", PluralMode.MODE_NOTONE);
    -        languages.put("pt_BR", PluralMode.MODE_GREATERONE);
    -        //languages.put("ro", PluralMode.MODE_RO);
    -        languages.put("ru", PluralMode.MODE_RU);
    -        languages.put("sk", PluralMode.MODE_SK);
    -        //languages.put("sl", PluralMode.MODE_SL);
    -        languages.put("sv", PluralMode.MODE_NOTONE);
    -        //languages.put("tr", PluralMode.MODE_NONE);
    -        languages.put("uk", PluralMode.MODE_RU);
    -        languages.put("vi", PluralMode.MODE_NONE);
    -        languages.put("zh_CN", PluralMode.MODE_NONE);
    -        languages.put("zh_TW", PluralMode.MODE_NONE);
    +        setupJavaLocaleProviders();
     
             /* try initial language settings, may be changed later again */
             if (!load(LanguageInfo.getJOSMLocaleCode())) {
  • src/org/openstreetmap/josm/tools/JosmDecimalFormatSymbolsProvider.java

     
    @@ -0,0 +1,32 @@
    +// License: GPL. For details, see LICENSE file.
    +package org.openstreetmap.josm.tools;
    +
    +import java.text.DecimalFormatSymbols;
    +import java.text.spi.DecimalFormatSymbolsProvider;
    +import java.util.Locale;
    +
    +/**
    + * JOSM implementation of the {@link java.text.DecimalFormatSymbols DecimalFormatSymbols} class,
    + * consistent with OSM API and ISO 80000-1.
    + * This class will only be used with Java 9 and later runtimes, as Java 8 implementation relies
    + * on Java Extension Mechanism only, while Java 9 supports application classpath.
    + * See {@link java.util.spi.LocaleServiceProvider LocaleServiceProvider} javadoc for more details.
    + * @since xxx
    + */
    +public class JosmDecimalFormatSymbolsProvider extends DecimalFormatSymbolsProvider {
    +
    +    @Override
    +    public DecimalFormatSymbols getInstance(Locale locale) {
    +        DecimalFormatSymbols symbols = new DecimalFormatSymbols(locale);
    +        // Override decimal mark to be consistent with OSM API
    +        symbols.setDecimalSeparator('.');
    +        // Override digit group separator to be consistent across languages with ISO 80000-1, chapter 7.3.1
    +        symbols.setGroupingSeparator('\u202F'); // U+202F: NARROW NO-BREAK SPACE
    +        return symbols;
    +    }
    +
    +    @Override
    +    public Locale[] getAvailableLocales() {
    +        return I18n.getAvailableTranslations();
    +    }
    +}

    Property changes on: src\org\openstreetmap\josm\tools\JosmDecimalFormatSymbolsProvider.java
    ___________________________________________________________________
    Added: svn:eol-style
    ## -0,0 +1 ##
    +native

  • src/org/openstreetmap/josm/tools/Logging.java

    @@ -51,5 +51,9 @@
         private static final RememberWarningHandler WARNINGS = new RememberWarningHandler();
     
         static {
    +        // We need to be sure java.locale.providers system property is initialized by JOSM, not by JRE
    +        // The call to ConsoleHandler constructor makes the JRE access this property by side effect
    +        I18n.setupJavaLocaleProviders();
    +
             LOGGER.setLevel(Level.ALL);
             LOGGER.setUseParentHandlers(false);

Looks simple but it took me quite a few hours to understand a serious design limitation of Java 6/7/8 removed in Java 9.

With this, we register a new LocaleServiceProvider (a DecimalFormatSymbolsProvider to be exact) which will override the decimal symbols for all locales supported by JOSM. Absolutely everywhere, so there is no need for a new public API or to track the formatting calls we make.

I also create the META-INF/services folder in the Eclipse bin directory so that the mechanism works when running JOSM from Eclipse. Ant takes care of creating it in the jar by itself.

The fun point: this API, introduced in Java 6, is clearly elegant, but it was absolutely worthless until Java 9. Why? Because up to Java 8 it works only with service providers registered through the Java Extension Mechanism (putting jars in the JRE directory...), which is impractical for us and a deprecated technology removed in Java 9. Java 9 removes this horrible restriction and now allows service providers registered in the application classpath to override JRE locale providers, which is exactly what we need.

So the good point: the code does not require Java 9 to compile, as the API has been there since Java 6.
The bad point: the code requires Java 9 at runtime. But as it is a very minor defect, and we will likely migrate to Java 9 next year anyway, I don't think that's a problem.
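
Since the provider only takes effect on some runtimes, it can help to check what the JRE actually hands out. The following is a rough diagnostic sketch, not part of the patch; the class and method names are made up. It lists the DecimalFormatSymbolsProvider implementations visible to java.util.ServiceLoader and prints the code points of the symbols in use for a locale.

```java
import java.text.DecimalFormatSymbols;
import java.text.spi.DecimalFormatSymbolsProvider;
import java.util.ArrayList;
import java.util.List;
import java.util.Locale;
import java.util.ServiceLoader;

// Hypothetical diagnostic helper, not part of JOSM.
final class LocaleProviderCheck {

    // Class names of providers registered via META-INF/services on the classpath.
    static List<String> registeredProviders() {
        List<String> names = new ArrayList<>();
        for (DecimalFormatSymbolsProvider p : ServiceLoader.load(DecimalFormatSymbolsProvider.class)) {
            names.add(p.getClass().getName());
        }
        return names;
    }

    // Code points of the decimal and grouping separators the runtime uses for a locale.
    static String describe(Locale locale) {
        DecimalFormatSymbols s = DecimalFormatSymbols.getInstance(locale);
        return String.format(Locale.ROOT, "decimal=U+%04X grouping=U+%04X",
                (int) s.getDecimalSeparator(), (int) s.getGroupingSeparator());
    }

    public static void main(String[] args) {
        System.out.println("java.locale.providers=" + System.getProperty("java.locale.providers"));
        System.out.println("providers=" + registeredProviders());
        System.out.println(describe(Locale.GERMANY));
    }
}
```

If the override is active for German, the grouping separator should be reported as U+202F; without it, the JRE defaults for that locale are printed instead.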

Thoughts?

comment:12 Changed 2 months ago by bastiK

Your solution looks much more like a proper configuration than my crude hack. It is a good idea to get rid of localized group separator and use thin space instead.

I'm not sure it is okay to ditch the decimal mark localization for the sake of consistency, though. It looks strange to have for example "0.5" in the middle of a German sentence. This would seem as if it is incorrectly or only partially translated.

comment:13 Changed 2 months ago by stoecker

+1 Please restore the proper decimal point/comma. Introducing a point everywhere would be a serious step backwards.

comment:14 Changed 2 months ago by stoecker

The question remains how we can keep the point as standard for OSM data without destroying the comma/point as language standard for the UI.

An auto-conversion would probably do more harm than good.

comment:15 Changed 2 months ago by bastiK

So where do we have floating point input?

  • Tools add node: Both point and comma are already allowed for input
  • Audio preferences: I would switch to the localized decimal mark, but still allow the decimal point as input (auto-conversion when saving to the preferences)
  • Imagery layer offset: same here (coordinates are separated by semicolon, so no problem)
  • Tag value: Users must be educated that the decimal point is the standard for OSM tagging, no way around that. (Validator warnings should give a hint)
  • Download dialog / Bounding box: Not sure, coordinates are likely to be copied and pasted to and from the OSM website
  • Advanced preferences: the clue is in the name, users must know what they are doing
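
Accepting both decimal marks in an input field can be sketched as follows. This is only an illustration of the idea, not the code that was later committed; the helper name is hypothetical, and it assumes the input contains no digit group separators.

```java
// Hypothetical helper, not part of JOSM.
final class LenientDecimalInput {

    // Parses user input that uses either "." or "," as decimal mark.
    // Input with both marks (e.g. "1.234,5") is rejected with NumberFormatException.
    static double parse(String input) {
        String normalized = input.trim().replace(',', '.');
        return Double.parseDouble(normalized);
    }

    public static void main(String[] args) {
        System.out.println(parse("0,5"));
        System.out.println(parse("0.5"));
    }
}
```

The value would then be stored with a decimal point, matching the auto-conversion when saving to the preferences that is suggested for the audio settings.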

comment:16 Changed 2 months ago by Don-vip

In 12931/josm:

see #14602 - Override digit group separator to be consistent across languages with ISO 80000-1 + checkstyle fixes

comment:17 Changed 6 weeks ago by Don-vip

Resolution: fixed
Status: new → closed

In 13050/josm:

fix #14602 - allow both dot and comma decimal separator everywhere possible for user-entered values

comment:18 Changed 4 weeks ago by Don-vip

See #15552

comment:19 Changed 7 days ago by Klumbumbus

Keywords: decimal separator added
