Modify

Opened 14 years ago

Closed 14 years ago

Last modified 13 years ago

#6137 closed enhancement (fixed)

[PATCH] provide osm wiki image retrieval in ImageProvider

Reported by: cmuelle8 Owned by: cmuelle8
Priority: normal Milestone:
Component: Core Version:
Keywords: Cc:

Description (last modified by bastiK)

patch has some comments


GetAvailableImpl could need some more cleaning.. I started by swapping the wiki code into a separate function (which was also needed, since I need the Icon object generated in GetAvailable for screen updates..


greetings

Attachments (2)

wiki-image-provider.patch (7.0 KB ) - added by cmuelle8 14 years ago.
fetch wiki:// style links in ImageProvider to interoperate with osm wiki as an icon source
ImageProvider.rev4398.patch (15.0 KB ) - added by cmuelle8 14 years ago.
updated patch (using digestutils to build img urls)

Download all attachments as: .zip

Change History (36)

by cmuelle8, 14 years ago

Attachment: wiki-image-provider.patch added

fetch wiki:// style links in ImageProvider to interoperate with osm wiki as an icon source

comment:1 by stoecker, 14 years ago

Owner: changed from team to cmuelle8
Status: newneedinfo

Why do we need a special wiki fetcher? We can specify them as normal http links

comment:2 by bastiK, 14 years ago

cmuelle8: In case you didn't know, the path is generated from the first two chars of the md5 hash:

md5("Motorway-photo.jpg")=60fe690fae086e6da383a963c311c2d5 --> http://wiki.openstreetmap.org/w/images/6/60/Motorway-photo.jpg

comment:3 by cmuelle8, 14 years ago

@stoecker: we need the wiki fetcher, since some tags have relative urls only.. e.g. wiki:symbol

@bastiK: thanks, that makes fetching a whole lot easier. i actually looked for that in wiki help, google and what-not, but was unable to dig it up..

greetings

comment:4 by bastiK, 14 years ago

I'd suggest something like

node[wiki:symbol] {
    icon-image: eval(wiki(get_tag_value("wiki:symbol")));
    icon-size: 20;
}

Where wiki("Motorway-photo.jpg") evaluates to "http://wiki.openstreetmap.org/w/images/6/60/Motorway-photo.jpg". No reason to mess with ImageProvider.

Last edited 14 years ago by bastiK (previous) (diff)

in reply to:  4 comment:5 by cmuelle8, 14 years ago

Replying to bastiK:

I'd suggest something like

node[wiki:symbol] {
    icon-image: eval(wiki(get_tag_value("wiki:symbol")));
    icon-size: 20;
}

Where wiki("Motorway-photo.jpg") evaluates to "http://wiki.openstreetmap.org/w/images/6/60/Motorway-photo.jpg". No reason to mess with ImageProvider.

This looks nice, but it is not the same. Think of classic elemstyles.xml type of styles. An Expression function will not benefit these. Style writers who, for some reason, do not want to use mapcss, should get consistency, at least where possible. Moreover, I think that the asynchronous loading of these images from the web belongs into ImageProvider, from a good software practices view.

E.g. if I cripple class IconReference or even class Instruction with the asynchronous loading (and therefore intercept http:// requests before ImageProvider gets it that does synchronous loading) I split program logic at will. Good software practices should allow you to have it in ImageProvider.

I'm actually with you on the "icon-size" issue, but implementing it will probably mean inefficiency or impracticability (e.g. you will not be able to scale the icon in ImageProvider, and hence put it in the paint code for the node, where "icon-size" is known - then you will resize the original size cached image every time you paint the node which seems ridiculous from an energy consumption point of view :) ).


So let's just have it the way I wrote it with the MD5 optimization in finding the right URL. If someone later on does a more general overhaul of josms image handling he/she can move it out into a then more appropiate place. For now I think ImageProvider is actually the very right place to put it.

Greetings

in reply to:  2 comment:6 by cmuelle8, 14 years ago

Replying to bastiK:

cmuelle8: In case you didn't know, the path is generated from the first two chars of the md5 hash:

md5("Motorway-photo.jpg")=60fe690fae086e6da383a963c311c2d5 --> http://wiki.openstreetmap.org/w/images/6/60/Motorway-photo.jpg

This is not the same either, I just noticed. My approach recognizes interwiki linked images, these I cannot find with the MD5 method.

 http://wiki.openstreetmap.org/w/images/f/f1/Signet_Elster_Radweg.png

does not exist, while

 http://wiki.openstreetmap.org/wiki/File:Signet_Elster_Radweg.png

actually has a link.. maybe MediaWiki offers an API to make it easiert than SAX parsing the "File:" description page, but I was not successful finding one.

all the best,
Christian

comment:7 by bastiK, 14 years ago

added icon-size in [3999].

wikimedia commons is probably the only relevant foreign repository. If both URLs return 404, you can still do the html parsing.

in reply to:  7 ; comment:8 by cmuelle8, 14 years ago

Replying to bastiK:

added icon-size in [3999].

wikimedia commons is probably the only relevant foreign repository. If both URLs return 404, you can still do the html parsing.

Fair enough. Thx for the icon-size support. Is elemstyles.xml able to use that as well? Then I can strip the scaling part without further thought from the patch and add the 404 checks on MD5-built urls prior to parsing the description page.

Greetings

in reply to:  8 comment:9 by bastiK, 14 years ago

Replying to cmuelle8:

Thx for the icon-size support. Is elemstyles.xml able to use that as well?

If you want it for xml style, you have to create proper support, i.e. <icon ... size="20"/>. I do not plan to port any new features to xml style format myself, but you are welcome to do so. :)

comment:10 by cmuelle8, 14 years ago

well, having had the size encoded in the url would have worked for both. i do not plan to mess with it twice. Also, I do not see how to do asynchronous loading in ImageProvider anymore, since you generate the Icon in NodeElemStyles now and ripping the patch out of Image Provider, putting parts of it into NodeElemStyles does not seem to make any sense.

greetings

in reply to:  10 comment:11 by cmuelle8, 14 years ago

Replying to cmuelle8:

generate the Icon in NodeElemStyles now

this is not true, ok, you just scale and use setImage to modify it there. question is, it you want asynchronous loading at all, if and how to get width in the image loading thread to scale the fetched result. using icon.getImage().getWidth() seems messy..

in reply to:  10 ; comment:12 by bastiK, 14 years ago

Replying to cmuelle8:

well, having had the size encoded in the url would have worked for both.

wiki:30://File:Signet_Elster_Radweg.png

You put 2 independent pieces of information into one string. I'd consider this bad style unless there is a reason for it. What comes next, wiki:30:122://... where 122 is the alpha value? It is also inconsistent because the same does not work for http://... URLs. Actually the "30:" part is optional, so I don't really care...

Also, I do not see how to do asynchronous loading in ImageProvider anymore, since you generate the Icon in NodeElemStyles now and ripping the patch out of Image Provider, putting parts of it into NodeElemStyles does not seem to make any sense.

The problem of asynchronous loading is independent of the place where the scaling is done (NodeElemStyles or ImageProvider). If we want to scale the image such that it fits into a square of SIZE x SIZE, we need to know width and height of the image in advance. In principle it can be done using Image.getWidth(new ImageObserver() { ... });. The image observer will report the image dimensions long time before the image is fully loaded.

However I'm not sure it is worth the effort as images are usually displayed on screen very soon after the scaling is performed / requested. Also images on a map are usually quite small.

in reply to:  12 comment:13 by cmuelle8, 14 years ago

Replying to bastiK:

However I'm not sure it is worth the effort as images are usually displayed on screen very soon after the scaling is performed / requested. Also images on a map are usually quite small.

I just don't want to block the main thread, say if the image is not mirrored yet and the network is slow to respond.. We did the same with wiki help (doing network related stuff in an asynchronous thread)

greetings

comment:14 by bastiK, 14 years ago

You have a point here. The problem is also present for styles with a lot of "http://" images like http://dev.openseamap.org/josm/seamark_styles.xml. If the server has slow response, JOSM may freeze for a long time. (At most 5 seconds times the number of displayed icons.) And restart does not really help, the user has to remove the style before it works again. The situation has been improved a little since it loads only those icons that are actually used.

But how can we solve this? Display an hourglass symbol instead of the image and repaint when the icon is fully fetched? This would be somewhat similar to asynchronous thumbnail generation for photos.

As the issue is not new, we could apply your patch first and think about this later.

in reply to:  14 comment:15 by cmuelle8, 14 years ago

Replying to bastiK:

You have a point here. The problem is also present for styles with a lot of "http://" images like http://dev.openseamap.org/josm/seamark_styles.xml. If the server has slow response, JOSM may freeze for a long time. (At most 5 seconds times the number of displayed icons.) And restart does not really help, the user has to remove the style before it works again. The situation has been improved a little since it loads only those icons that are actually used.

But how can we solve this? Display an hourglass symbol instead of the image and repaint when the icon is fully fetched? This would be somewhat similar to asynchronous thumbnail generation for photos.

As the issue is not new, we could apply your patch first and think about this later.

Yes, currently I use "misc/no_icon.png" as long as loading is not finished. I think the correct way to do asynchronous loading is

  • 1) putting icon.setImage(scaleblah..) into a Thread that does icon.wait() (put Thread to sleep)
  • 2) in ImageProvider in getAvailableImpl do the http fetching in another Thread (i refer to it as HTTP-Thread)
  • 3) when HTTP-Thread is done fetching it ought to do icon.notify() and wakes the Thread we created in NodeElemStyles (that waited with the Image loading)
  • 4) the Thread that awakes now has the correct image and all the variables of the cascade at hand to do the scaling (all within NodeElemStyle, where it belongs)

I'll provide an updated patch this weekend..

comment:16 by stoecker, 14 years ago

What is the status?

comment:17 by bastiK, 14 years ago

Summary: [PATCH] provide osm wiki image retrieval in ImageProvider[unfinished PATCH] provide osm wiki image retrieval in ImageProvider

comment:18 by stoecker, 14 years ago

Status: needinfonew

comment:19 by stoecker, 14 years ago

Status: newneedinfo

Can't this be applied without fixing the remaining (parallel loading) issues?

I think parallel loading is a topic for all files going through MirroredInputStream class anyway.

comment:20 by bastiK, 14 years ago

It can be applied, when md5 shortcut is implemented.

by cmuelle8, 14 years ago

Attachment: ImageProvider.rev4398.patch added

updated patch (using digestutils to build img urls)

comment:21 by cmuelle8, 14 years ago

Summary: [unfinished PATCH] provide osm wiki image retrieval in ImageProvider[PATCH] provide osm wiki image retrieval in ImageProvider

comment:22 by bastiK, 14 years ago

Very interesting... When 2 or more request for the same wiki image come shortly after one another, it would be better, if the later ones would wait for the first to finish.

Also it might be particularly costly, when an invalid wiki name is requested a lot of times, because it will always enter getImgUrlFromInfoPage(). I'm not aware of anything in JOSM core that requests a certain (user defined) image more than a dozen times, but something may come from a plugin (e.g. script within scripting plugin or whatever).

We could put a marker in cache, so it will not try again when it failed once (or just after a certain period of time).

in reply to:  22 comment:23 by cmuelle8, 14 years ago

I think those issues should in some way all be dealt with in MirroredInputStream.

Basically, MirroredInputStream could be aware of all it's asynchronously running instances and, if a request comes in that is not mirrored (i.e. in cache) yet, look if the request is currently fetched in another thread. Then wait for it to finish and

  • on success return the mirrored image for all waiting instances.
  • on failure add the url to an ExpiringCache

This would benefit all requests to MirroredInputStream, not just those done in ImageProvider. But those are issues separate to this patch, since they already existed before. Note that this works for FileInfoPages, that are fetched through MirroredInputStream, as well:

wget -O -  http://wiki.openstreetmap.org/wiki/File:DS13.JPG

--2011-09-04 14:33:01--  http://wiki.openstreetmap.org/wiki/File:DS13.JPG
Auflösen des Hostnamen »wiki.openstreetmap.org«... 193.63.75.26
Verbindungsaufbau zu wiki.openstreetmap.org|193.63.75.26|:80... verbunden.
HTTP-Anforderung gesendet, warte auf Antwort... 404 Not Found
2011-09-04 14:33:01 FEHLER 404: Not Found.

I.e. the parser does not run if MirroredInputStream does not return a document or throws an Exception (possibly because the url was in the proposed ExpiringCache).

Greetings

in reply to:  22 ; comment:24 by cmuelle8, 14 years ago

Actually, even without a polished MirroredInputStream (MIS) the parser should not run, if the image does not exist. Since a request to a FileInfoPage to this non-existing image would return 404 as well. It is possible though that MediaWiki returns different http codes depending on the user agent. E.g. if I feed a browser with

http://wiki.openstreetmap.org/wiki/File:DS13.JPG

I get a full blown html page, explaining that the image does not exist. Using wget, a 404 is returned. I have just confirmed that MIS also gets an 404 which causes an Exception to be thrown, so the sax parser does not run at all, if the image does not exist.

Nevertheless, not fetching the same resource in parallel remains an issue that is left for MIS to provide. But please notice that the issue exists only until the resource is cached.

in reply to:  24 ; comment:25 by bastiK, 14 years ago

Replying to cmuelle8:

Actually, even without a polished MirroredInputStream (MIS) the parser should not run, if the image does not exist.

That's good.

The whole asynchronous loading business probably needs a little more attention. I'd prefer to test it on normal "http://..." images first because these are already in use and we will get bug reports if there are problems.

Your approach (to replace the image of an ImageIcon object) looks elegant, but there are some drawbacks:

  • First the preliminary icon is shown on the map, but when loading is finished, no repaint is done.
  • The ElemStyles are cached and the preliminary Image causes wrong hash value. Consequently the wrong icon may be shown.

The "wiki part" of the patch is ready to be applied (after next release). I'd say we skip the parallel loading for now (as stoecker suggested) and open another ticket for it. If you like to proceed, that's great. Otherwise we can postpone the issue.

in reply to:  24 ; comment:26 by stoecker, 14 years ago

Replying to cmuelle8:

I get a full blown html page, explaining that the image does not exist. Using wget, a 404 is returned. I have just confirmed that MIS also gets an 404 which causes an Exception to be thrown, so the sax parser does not run at all, if the image does not exist.

A 404 is only a status code. A full html web-page can nevertheless have a 404 code (and also should have). Proper webservice give correct status code and a human readable web-page explaining the error.

in reply to:  26 ; comment:27 by cmuelle8, 14 years ago

Replying to stoecker:

A 404 is only a status code. A full html web-page can nevertheless have a 404 code (and also should have). Proper webservice give correct status code and a human readable web-page explaining the error.

Thx for the insight, but see above - the html content is, in fact, not delivered when using wget. I only get the full blown version using a browser, so MediaWiki probably messes with the user-agent for bandwidth reasons..

Greetings

in reply to:  27 comment:28 by stoecker, 14 years ago

Replying to cmuelle8:

Thx for the insight, but see above - the html content is, in fact, not delivered when using wget. I only get the full blown version using a browser, so MediaWiki probably messes with the user-agent for bandwidth reasons..

Wget gets the same response (you can check e.g. with wireshark). It simply does not display the data. See here.

comment:29 by cmuelle8, 14 years ago

good to know, again thx.

in reply to:  25 comment:30 by cmuelle8, 14 years ago

  • First the preliminary icon is shown on the map, but when loading is finished, no repaint is done.

This should be solvable.

  • The ElemStyles are cached and the preliminary Image causes wrong hash value. Consequently the wrong icon may be shown.

This really sucks. There is a comment in StyleCache I suppose you are refering to: "StyleCache must not be changed after it has been added to the intern pool." This would mitigate all efforts to do asynchronous resource loading.

Since resources effect hashCode calculation and hashCodes must not be manipulated after intern pool is built, we would have to wait building the intern pool until all resources are fetched.

Would it not suffice to just hash the name of the image? After all, if images use the same icon-image value in a style sheet, they evaluate to the same hashCode, just as the name values should, if they are identical. There would only be deviations to current behavior if a style writer uses the same image under different names - a case highly improbable.

friendly greetings

in reply to:  25 comment:31 by cmuelle8, 14 years ago

Replying to bastiK:

  • First the preliminary icon is shown on the map, but when loading is finished, no repaint is done.

In MapPainter.java, function drawNodeIcon(), line

icon.paintIcon ( nc, g,   [...]

nc is used as a Component implementing ImageObserver. Maybe we can leverage this and use AWT functionality for asynchronous image loading, no need to write code twice.

comment:32 by bastiK, 14 years ago

Resolution: fixed
Status: needinfoclosed

In [4403/josm]:

applied #6137 - provide osm wiki image retrieval in ImageProvider (based on patch by cmuelle8)

comment:33 by bastiK, 14 years ago

See #6797 for asynchronous loading.

comment:34 by bastiK, 13 years ago

Description: modified (diff)

Note to self:

There is a mediawiki API to get image metadata, e.g.

http://wiki.openstreetmap.org/w/api.php?action=query&prop=imageinfo&titles=Image:Residential.jpg&iiprop=url%7Ccontent&format=xml

Modify Ticket

Change Properties
Set your email in Preferences
Action
as closed The owner will remain cmuelle8.
as The resolution will be set.
The resolution will be deleted. Next status will be 'reopened'.

Add Comment


E-mail address and name can be saved in the Preferences .
 
Note: See TracTickets for help on using tickets.