Modify

Opened 7 months ago

Closed 6 months ago

#18126 closed enhancement (wontfix)

[RFC][PATCH] Add MapCSS function contains

Reported by: francians Owned by: francians
Priority: normal Milestone:
Component: Core validator Version:
Keywords: Cc: taylor.smock, Klumbumbus

Description

Hello,

I'd like to have in mapcss the possibility to call a function to get an array from a value after splitting with a regex like the xslt tokenize().
I could try to implement it, but I'd like to know if the idea could be approved.
By example:

node[tokenize(value, ';')='test']

If I want to split a value that could be separated by semicolon and check if a 'test' value exists in array.

I would also like to be able to make a new array with a syntax xslt-like:

('A', 'B')

But with tokenize it could also be

tokenize('A,B',',')

Please let me know what you think...
Cheers,
Francesco

Attachments (2)

contains.patch (1.1 KB) - added by francians 6 months ago.
containsV2.patch (1.9 KB) - added by francians 6 months ago.

Download all attachments as: .zip

Change History (31)

comment:1 Changed 7 months ago by Don-vip

we already have split since r5699, see Help/Styles/MapCSSImplementation#Evalexpressions. Isn't it what you want?

comment:2 Changed 7 months ago by Don-vip

Owner: changed from team to francians
Status: newneedinfo

comment:3 Changed 7 months ago by francians

Hi,
Split is supposed to be working after char sequences or after regex?
Can you please give me an example of split() and contains?
Many thanks
Cheers

comment:4 Changed 7 months ago by Don-vip

An example of split is in default presets:

node[traffic_sign][maxspeed=~/^[0-9]+ mph/][!is_prop_set(icon-image)] {
    maxspeedprop: get(split(" mph",tag(maxspeed)),0);
    set maxspeedclass;
}
node[traffic_sign][maxspeed=~/[0-9]+ km\/h/][!is_prop_set(icon-image)] {
    maxspeedprop: get(split(" km/h",tag(maxspeed)),0);
    set maxspeedclass;
}
node[traffic_sign][maxspeed=~/[0-9]+ knots/][!is_prop_set(icon-image)] {
    maxspeedprop: get(split(" knots",tag(maxspeed)),0);
    set maxspeedclass;
}
node[prop(maxspeedclass, default)][!is_prop_set(icon-image, default)]::core_maxnodebg {
    /* background (white) */
    symbol-shape: circle;
    symbol-size: 17;
    symbol-fill-color: white;
    major-z-index: 4.2;
}

comment:5 Changed 7 months ago by francians

Thanks!
Can you also give an example of array contains and tell how spit works?
Cheers
Francesco

comment:6 Changed 6 months ago by Don-vip

Cc: taylor.smock Klumbumbus added

I don't have any example in mind, maybe someone else has one?

comment:7 in reply to:  5 Changed 6 months ago by taylor.smock

Replying to francians:

Thanks!
Can you also give an example of array contains and tell how spit works?
Cheers
Francesco

Lets say we have a way with the tag ref=A1;B2;C3;D4.
If we use split(";", tag("ref")) on it, we get an array with ["A1", "B2", "C3", "D4"]. This is mostly useful for other array based operations (e.g., uniq_list, count, join_list, sort_list, and so on).

However, if we just want to check if it has a value (e.g., if it has B2), we can use a regex_test or regex_match.
In this example, we would have a regex that looked something like this: regexp_test("(^|.*;)B2(;.*|$)", tag("ref")) (note that this is run on the original value from the tag).

Here is an example of how I use arrays (in a very non-optimized fashion):

@supports (min-josm-version: 15323) {
  /* Check for ways that don't have the parent relation ref */
  relation[type=route][route=road][ref] > way[ref][!regexp_test(concat("^(", join_list("|", uniq_list(split(";", join(";", tag_regex("ref"))))), "|;?)+$"), join_list(";", uniq_list(sort_list(parent_tags("ref")))), "i")] {
    group: tr("experimental");
    throwWarning: tr("Parent ref not in child ref");
    set .parent_ref_not_in_child_ref;
  }

  /* Check for ways that have a ref that is not in a parent */
  way[highway][ref][count(parent_tags("ref")) != 0 && !regexp_test(join_list(";", uniq_list(sort_list(split(";", join_list(";", tag_regex("^(ref|nat_ref|int_ref|reg_ref)$")))))), join_list(";", uniq_list(sort_list(parent_tags("ref")))), "i")]!.parent_ref_not_in_child_ref {
    group: tr("experimental");
    throwWarning: tr("A {0} on a way is not in a parent relation", "ref");
    set .ref_not_in_relation
  }
}

I hope this helps (and that I understood your question properly).

comment:8 Changed 6 months ago by francians

Yes, but I think this syntax could be simplified a lot with new modifiers, don't you agree? If yes, I could try to come back with a patch. I am sure I saw contains () in a proposal.. but probably never got merged or implemented

comment:9 in reply to:  8 Changed 6 months ago by taylor.smock

Replying to francians:

Yes, but I think this syntax could be simplified a lot with new modifiers, don't you agree? If yes, I could try to come back with a patch. I am sure I saw contains () in a proposal.. but probably never got merged or implemented

The syntax would probably be a bit easier with a contains tag.

Just keep in mind that if you are checking if a tag contains a specific value, you are still going to have to split it. It is, however, a bit more readable than having a regex that does the same thing.

So, if you are checking for a specific value in a tag, you would do something like this (with your proposal):

way[ref][contains("A1", split(";", tag("ref"))) {
  throwWarning: tr("We found a ref with A1");
}

There are other situations where it would make life a bit simpler. If you want to try and come back with a patch, go ahead.

If you don't want to write a patch, let me know so I can write it, but it might be a while before I get around to it.

Changed 6 months ago by francians

Attachment: contains.patch added

comment:10 Changed 6 months ago by francians

Hello
please find the patch to implement "contains" method with the syntax you suggested:

contains(needle, haystack)

I tested it on a real osm file, but haven't added any unit..
Please let me know what you think
Cheers,
Francesco

comment:11 Changed 6 months ago by francians

I'd like also to suggest this method:

        /**
         * Splits string {@code toSplit} at occurrences of a regex {@code sep} and returns a list of matches.
         * @param sep separator string
         * @param toSplit string to split
         * @return list of matches
         */
        public static List<String> tokenize(String sep, String toSplit) { // NO_UCD (unused code)
            return Arrays.asList(toSplit.split(sep, -1));
        }

to be able to evaluate this expression:

way[ref][contains("A1", tokenize("[;,]", tag("ref"))) {
  throwWarning: tr("We found a ref with A1");
}

oftentimes both comma and semicolon can be used as separator and could avoid some extra head ache.
This is my original proposal... BTW

Last edited 6 months ago by Don-vip (previous) (diff)

comment:12 in reply to:  10 Changed 6 months ago by taylor.smock

Replying to francians:

Hello
please find the patch to implement "contains" method with the syntax you suggested:

contains(needle, haystack)

I tested it on a real osm file, but haven't added any unit..
Please let me know what you think
Cheers,
Francesco

Notes:

  • The function should be in the Functions.java file (same directory as ExpressionFactory.java).
  • You should add an @since xxx javadoc declaration (this helps the maintainers like Don-Vip automagically add the version of JOSM required for a specific function into the javadoc). I've forgotten it a few times for some things.
  • ant pmd checkstyle didn't throw any warnings (which is a good thing -- on one of my machines, I cannot run it, and I have to rely on style enforcement in Eclipse, which can be hit or miss).
  • You should have tests that verify the correctness of the output. It may perform correctly now, but if someone refactors it for speed/readability/code deduplication, then it might not perform correctly. It might not error out, but still return erroneous results. Some people (like me) assume that the original test writer wrote the tests checking for accurate output.

[EDIT: I probably should have added that I knew you had mentioned that you hadn't added any unit tests]

Replying to francians:

I'd like also to suggest this method:

        /**
         * Splits string {@code toSplit} at occurrences of a regex {@code sep} and returns a list of matches.
         * @param sep separator string
         * @param toSplit string to split
         * @return list of matches
         */
        public static List<String> tokenize(String sep, String toSplit) { // NO_UCD (unused code)
            return Arrays.asList(toSplit.split(sep, -1));
        }

to be able to evaluate this expression:

way[ref][contains("A1", tokenize("[;,]", tag("ref"))) {

throwWarning: tr("We found a ref with A1");

}

oftentimes both comma and semicolon can be used as separator and could avoid some extra head ache.
This is my original proposal... BTW

I don't know if I (personally) would add another function "name" to remember.
Personally, I'd overload split in Functions.java to have another parameter, like so:

    public static List<String> split(String sep, String toSplit, boolean asRegex) { // NO_UCD (unused code)
        if (asRegex) {
            return Arrays.asList(toSplit.split(sep, -1));
        } else {
            return split(sep, toSplit);
        }
    }

Disclaimer:
I'm not a JOSM maintainer, so take what I say with a grain of salt. A maintainer may have completely different feedback.

Last edited 6 months ago by taylor.smock (previous) (diff)

comment:13 Changed 6 months ago by taylor.smock

Something else I should have mentioned:

I think that you are using an old version of JOSM's source code (probably via git). Can you update it?

If you must use git, run git svn clone https://josm.openstreetmap.de/svn/trunk josm, but you should use svn co https://josm.openstreetmap.de/svn/trunk josm (this gets all of JOSM's source code, including most of the public plugins).

See https://josm.openstreetmap.de/wiki/Source%20code#Getthesource for more information (like the URL for just JOSM without plugins).

Don't use the git repository (from GitHub) on that page -- it looks like it is several months out of date (last update was May 25).

comment:14 Changed 6 months ago by francians

Hello again,

thanks for the reply. TBH, I'm not sure how to satisfy all your requests:

  1. OK, I can move the method in Functions.java;
  2. I'm not involved in the development, so IDK what revision could contain my change too;
  3. OK
  4. Please find my unit (org.openstreetmap.josm.gui.mappaint.mapcss.MapCSSParserTest):
    @Test
    public void testListContains() throws Exception {
        Selector s1 = getParser("way[contains(\"x10\", split(\";\", tag(\"ref\")))]").selector();
        assertTrue(s1.matches(new Environment(OsmUtils.createPrimitive("way ref=x10;x20"))));
    }

Testcase: testListContains took 0.573 sec

Can you please ask a maintainer to merge both the patch and the unit?
Many thanks
Best
Francesco

P.S. I never worked with svn and don't have time to learn more about it... can you please update your git repo too?

comment:15 in reply to:  14 Changed 6 months ago by taylor.smock

Replying to francians:

Hello again,

thanks for the reply. TBH, I'm not sure how to satisfy all your requests:

  1. OK, I can move the method in Functions.java;
  2. I'm not involved in the development, so IDK what revision could contain my change too;

I guess I did a poor job of explaining the @since xxx. The maintainers have a script that they run before committing the change that essentially gets the current svn revision, adds one, and then does a search and replace for @since xxx where it replaces the xxx with that revision.

  1. OK
  2. Please find my unit (org.openstreetmap.josm.gui.mappaint.mapcss.MapCSSParserTest):
    @Test
    public void testListContains() throws Exception {
        Selector s1 = getParser("way[contains(\"x10\", split(\";\", tag(\"ref\")))]").selector();
        assertTrue(s1.matches(new Environment(OsmUtils.createPrimitive("way ref=x10;x20"))));
    }

Any chance you can add an assert for when you expect it to not match?

Testcase: testListContains took 0.573 sec

Can you please ask a maintainer to merge both the patch and the unit?
Many thanks
Best
Francesco

P.S. I never worked with svn and don't have time to learn more about it... can you please update your git repo too?

I assumed that might be the case (I very little experience with svn myself before I started working on JOSM), which is why I showed

$ git svn clone https://josm.openstreetmap.de/svn/trunk josm

in my previous comment.

As far as the git mirror goes, see #6887. The mirror is not, as far as I know, an officially supported repository.
There is some work going on to migrate to git (see #16857), but it is still a work in progress.

Changed 6 months ago by francians

Attachment: containsV2.patch added

comment:16 Changed 6 months ago by francians

Hello,

I followed your suggestions in V2, but I wasn't able to test that branch...
Please let me know if this can be applied on current master.

Cheers,
Francesco

comment:17 in reply to:  16 Changed 6 months ago by taylor.smock

Replying to francians:

Hello,

I followed your suggestions in V2, but I wasn't able to test that branch...
Please let me know if this can be applied on current master.

Cheers,
Francesco

It looks good to me. The tests passed and it did apply cleanly.

Thank you.

comment:18 Changed 6 months ago by francians

Any chance to get it in next JOSM release?
As soon as it will get merged I could focus back on the original request.

Anyway, I'm not a big Java overload fan, because a pile of true and false at the end of a method aren't that readable in the end.

comment:19 in reply to:  18 Changed 6 months ago by taylor.smock

Replying to francians:

Any chance to get it in next JOSM release?
As soon as it will get merged I could focus back on the original request.

Anyway, I'm not a big Java overload fan, because a pile of true and false at the end of a method aren't that readable in the end.

You might want to clear the needinfo flag soonish.

Beyond that, as far as it getting merged in before the next JOSM release, I'd say pretty good, assuming Klumbumbus is able to take a look at it. And he agrees with my assessment, which he might not.

comment:20 Changed 6 months ago by francians

Summary: New mapcss feature suggestion [RFC]New mapcss feature suggestion [RFC][PATCH]

I'm developing my own Mapcss validations file and while doing it I hope to send patches to improve the JOSM Mapcss parser. I started with something very simple, but my goal is to have a full ruleset for Italy: https://github.com/fansanelli/map-this-way

comment:21 Changed 6 months ago by simon04

Summary: New mapcss feature suggestion [RFC][PATCH][RFC][PATCH] Add MapCSS function contains

The intended use-case is to check whether a specific value is present in a ;-separated list of values? If so, ~= from Help/Styles/MapCSSImplementation#Conditionselector tests exactly that.

If we need contains nevertheless, it should be implemented using java.util.List#contains.

comment:22 Changed 6 months ago by francians

I need contains for my Mapcss rules and I provided a working implementation...
Can you please elaborate why List contains is a better alternative? Did you benchmarked it?

Thanks
Francesco

comment:23 Changed 6 months ago by Don-vip

We try to not reinvent the wheel, especially such a basic function as List::contains. But I'm not convinced this is needed, can you please try the existing operator Simon gave you?

comment:24 Changed 6 months ago by francians

Dear Don-vip, I could send the 3rd version
of my patch if you think that it will be finally merged... I don't have much time, so if you don't think this work it's useful fill free to close my ticket.
That said, I want to implement more functions, so please let me know what is the correct way to submit them in future.

comment:25 Changed 6 months ago by simon04

The following rule should work in JOSM and exactly fulfill your needs. Therefore, no contains needs to be added.

way[ref][ref~=A1] {
  throwWarning: tr("We found a ref with A1");
}

comment:26 Changed 6 months ago by francians

How about a way ref=xA1.. it will be match

comment:27 Changed 6 months ago by francians

I misread the example.. your example. I was sure you meant regex, but I was wrong

comment:28 Changed 6 months ago by francians

The only reason I suggested the contains is because certain tags like house numbers can be separated by commas. I may want to split by comma or semicolon. May be the only corner case

comment:29 Changed 6 months ago by francians

Resolution: wontfix
Status: needinfoclosed

Modify Ticket

Change Properties
Set your email in Preferences
Action
as closed The owner will remain francians.
as The resolution will be set.
The resolution will be deleted.

Add Comment


E-mail address and name can be saved in the Preferences.

 
Note: See TracTickets for help on using tickets.