#6137 closed enhancement (fixed)
[PATCH] provide osm wiki image retrieval in ImageProvider
Reported by: | cmuelle8 | Owned by: | cmuelle8 |
---|---|---|---|
Priority: | normal | Milestone: | |
Component: | Core | Version: | |
Keywords: | Cc: |
Description (last modified by )
patch has some comments
GetAvailableImpl could need some more cleaning.. I started by swapping the wiki code into a separate function (which was also needed, since I need the Icon object generated in GetAvailable for screen updates..
greetings
Attachments (2)
Change History (36)
by , 14 years ago
Attachment: | wiki-image-provider.patch added |
---|
comment:1 by , 14 years ago
Owner: | changed from | to
---|---|
Status: | new → needinfo |
Why do we need a special wiki fetcher? We can specify them as normal http links
follow-up: 6 comment:2 by , 14 years ago
cmuelle8: In case you didn't know, the path is generated from the first two chars of the md5 hash:
md5("Motorway-photo.jpg")=60fe690fae086e6da383a963c311c2d5 --> http://wiki.openstreetmap.org/w/images/6/60/Motorway-photo.jpg
comment:3 by , 14 years ago
@stoecker: we need the wiki fetcher, since some tags have relative urls only.. e.g. wiki:symbol
@bastiK: thanks, that makes fetching a whole lot easier. i actually looked for that in wiki help, google and what-not, but was unable to dig it up..
greetings
follow-up: 5 comment:4 by , 14 years ago
I'd suggest something like
node[wiki:symbol] { icon-image: eval(wiki(get_tag_value("wiki:symbol"))); icon-size: 20; }
Where wiki("Motorway-photo.jpg")
evaluates to "http://wiki.openstreetmap.org/w/images/6/60/Motorway-photo.jpg"
. No reason to mess with ImageProvider.
comment:5 by , 14 years ago
Replying to bastiK:
I'd suggest something like
node[wiki:symbol] { icon-image: eval(wiki(get_tag_value("wiki:symbol"))); icon-size: 20; }Where
wiki("Motorway-photo.jpg")
evaluates to"http://wiki.openstreetmap.org/w/images/6/60/Motorway-photo.jpg"
. No reason to mess with ImageProvider.
This looks nice, but it is not the same. Think of classic elemstyles.xml type of styles. An Expression function will not benefit these. Style writers who, for some reason, do not want to use mapcss, should get consistency, at least where possible. Moreover, I think that the asynchronous loading of these images from the web belongs into ImageProvider, from a good software practices view.
E.g. if I cripple class IconReference or even class Instruction with the asynchronous loading (and therefore intercept http:// requests before ImageProvider gets it that does synchronous loading) I split program logic at will. Good software practices should allow you to have it in ImageProvider.
I'm actually with you on the "icon-size" issue, but implementing it will probably mean inefficiency or impracticability (e.g. you will not be able to scale the icon in ImageProvider, and hence put it in the paint code for the node, where "icon-size" is known - then you will resize the original size cached image every time you paint the node which seems ridiculous from an energy consumption point of view :) ).
So let's just have it the way I wrote it with the MD5 optimization in finding the right URL. If someone later on does a more general overhaul of josms image handling he/she can move it out into a then more appropiate place. For now I think ImageProvider is actually the very right place to put it.
Greetings
comment:6 by , 14 years ago
Replying to bastiK:
cmuelle8: In case you didn't know, the path is generated from the first two chars of the md5 hash:
md5("Motorway-photo.jpg")=60fe690fae086e6da383a963c311c2d5 --> http://wiki.openstreetmap.org/w/images/6/60/Motorway-photo.jpg
This is not the same either, I just noticed. My approach recognizes interwiki linked images, these I cannot find with the MD5 method.
http://wiki.openstreetmap.org/w/images/f/f1/Signet_Elster_Radweg.png
does not exist, while
http://wiki.openstreetmap.org/wiki/File:Signet_Elster_Radweg.pngactually has a link.. maybe MediaWiki offers an API to make it easiert than SAX parsing the "File:" description page, but I was not successful finding one.
all the best,
Christian
follow-up: 8 comment:7 by , 14 years ago
added icon-size in [3999].
wikimedia commons is probably the only relevant foreign repository. If both URLs return 404, you can still do the html parsing.
follow-up: 9 comment:8 by , 14 years ago
Replying to bastiK:
added icon-size in [3999].
wikimedia commons is probably the only relevant foreign repository. If both URLs return 404, you can still do the html parsing.
Fair enough. Thx for the icon-size support. Is elemstyles.xml able to use that as well? Then I can strip the scaling part without further thought from the patch and add the 404 checks on MD5-built urls prior to parsing the description page.
Greetings
comment:9 by , 14 years ago
Replying to cmuelle8:
Thx for the icon-size support. Is elemstyles.xml able to use that as well?
If you want it for xml style, you have to create proper support, i.e. <icon ... size="20"/>
. I do not plan to port any new features to xml style format myself, but you are welcome to do so. :)
follow-ups: 11 12 comment:10 by , 14 years ago
well, having had the size encoded in the url would have worked for both. i do not plan to mess with it twice. Also, I do not see how to do asynchronous loading in ImageProvider anymore, since you generate the Icon in NodeElemStyles now and ripping the patch out of Image Provider, putting parts of it into NodeElemStyles does not seem to make any sense.
greetings
comment:11 by , 14 years ago
Replying to cmuelle8:
generate the Icon in NodeElemStyles now
this is not true, ok, you just scale and use setImage to modify it there. question is, it you want asynchronous loading at all, if and how to get width in the image loading thread to scale the fetched result. using icon.getImage().getWidth() seems messy..
follow-up: 13 comment:12 by , 14 years ago
Replying to cmuelle8:
well, having had the size encoded in the url would have worked for both.
wiki:30://File:Signet_Elster_Radweg.png
You put 2 independent pieces of information into one string. I'd consider this bad style unless there is a reason for it. What comes next, wiki:30:122://...
where 122 is the alpha value? It is also inconsistent because the same does not work for http://...
URLs. Actually the "30:" part is optional, so I don't really care...
Also, I do not see how to do asynchronous loading in ImageProvider anymore, since you generate the Icon in NodeElemStyles now and ripping the patch out of Image Provider, putting parts of it into NodeElemStyles does not seem to make any sense.
The problem of asynchronous loading is independent of the place where the scaling is done (NodeElemStyles or ImageProvider). If we want to scale the image such that it fits into a square of SIZE x SIZE, we need to know width and height of the image in advance. In principle it can be done using Image.getWidth(new ImageObserver() { ... });
. The image observer will report the image dimensions long time before the image is fully loaded.
However I'm not sure it is worth the effort as images are usually displayed on screen very soon after the scaling is performed / requested. Also images on a map are usually quite small.
comment:13 by , 14 years ago
Replying to bastiK:
However I'm not sure it is worth the effort as images are usually displayed on screen very soon after the scaling is performed / requested. Also images on a map are usually quite small.
I just don't want to block the main thread, say if the image is not mirrored yet and the network is slow to respond.. We did the same with wiki help (doing network related stuff in an asynchronous thread)
greetings
follow-up: 15 comment:14 by , 14 years ago
You have a point here. The problem is also present for styles with a lot of "http://"
images like http://dev.openseamap.org/josm/seamark_styles.xml. If the server has slow response, JOSM may freeze for a long time. (At most 5 seconds times the number of displayed icons.) And restart does not really help, the user has to remove the style before it works again. The situation has been improved a little since it loads only those icons that are actually used.
But how can we solve this? Display an hourglass symbol instead of the image and repaint when the icon is fully fetched? This would be somewhat similar to asynchronous thumbnail generation for photos.
As the issue is not new, we could apply your patch first and think about this later.
comment:15 by , 14 years ago
Replying to bastiK:
You have a point here. The problem is also present for styles with a lot of
"http://"
images like http://dev.openseamap.org/josm/seamark_styles.xml. If the server has slow response, JOSM may freeze for a long time. (At most 5 seconds times the number of displayed icons.) And restart does not really help, the user has to remove the style before it works again. The situation has been improved a little since it loads only those icons that are actually used.
But how can we solve this? Display an hourglass symbol instead of the image and repaint when the icon is fully fetched? This would be somewhat similar to asynchronous thumbnail generation for photos.
As the issue is not new, we could apply your patch first and think about this later.
Yes, currently I use "misc/no_icon.png" as long as loading is not finished. I think the correct way to do asynchronous loading is
- 1) putting icon.setImage(scaleblah..) into a Thread that does icon.wait() (put Thread to sleep)
- 2) in ImageProvider in getAvailableImpl do the http fetching in another Thread (i refer to it as HTTP-Thread)
- 3) when HTTP-Thread is done fetching it ought to do icon.notify() and wakes the Thread we created in NodeElemStyles (that waited with the Image loading)
- 4) the Thread that awakes now has the correct image and all the variables of the cascade at hand to do the scaling (all within NodeElemStyle, where it belongs)
I'll provide an updated patch this weekend..
comment:17 by , 14 years ago
Summary: | [PATCH] provide osm wiki image retrieval in ImageProvider → [unfinished PATCH] provide osm wiki image retrieval in ImageProvider |
---|
comment:18 by , 14 years ago
Status: | needinfo → new |
---|
comment:19 by , 14 years ago
Status: | new → needinfo |
---|
Can't this be applied without fixing the remaining (parallel loading) issues?
I think parallel loading is a topic for all files going through MirroredInputStream class anyway.
by , 14 years ago
Attachment: | ImageProvider.rev4398.patch added |
---|
updated patch (using digestutils to build img urls)
comment:21 by , 14 years ago
Summary: | [unfinished PATCH] provide osm wiki image retrieval in ImageProvider → [PATCH] provide osm wiki image retrieval in ImageProvider |
---|
follow-ups: 23 24 comment:22 by , 14 years ago
Very interesting... When 2 or more request for the same wiki image come shortly after one another, it would be better, if the later ones would wait for the first to finish.
Also it might be particularly costly, when an invalid wiki name is requested a lot of times, because it will always enter getImgUrlFromInfoPage(). I'm not aware of anything in JOSM core that requests a certain (user defined) image more than a dozen times, but something may come from a plugin (e.g. script within scripting plugin or whatever).
We could put a marker in cache, so it will not try again when it failed once (or just after a certain period of time).
comment:23 by , 14 years ago
I think those issues should in some way all be dealt with in MirroredInputStream.
Basically, MirroredInputStream could be aware of all it's asynchronously running instances and, if a request comes in that is not mirrored (i.e. in cache) yet, look if the request is currently fetched in another thread. Then wait for it to finish and
- on success return the mirrored image for all waiting instances.
- on failure add the url to an ExpiringCache
This would benefit all requests to MirroredInputStream, not just those done in ImageProvider. But those are issues separate to this patch, since they already existed before. Note that this works for FileInfoPages, that are fetched through MirroredInputStream, as well:
wget -O - http://wiki.openstreetmap.org/wiki/File:DS13.JPG --2011-09-04 14:33:01-- http://wiki.openstreetmap.org/wiki/File:DS13.JPG Auflösen des Hostnamen »wiki.openstreetmap.org«... 193.63.75.26 Verbindungsaufbau zu wiki.openstreetmap.org|193.63.75.26|:80... verbunden. HTTP-Anforderung gesendet, warte auf Antwort... 404 Not Found 2011-09-04 14:33:01 FEHLER 404: Not Found.
I.e. the parser does not run if MirroredInputStream does not return a document or throws an Exception (possibly because the url was in the proposed ExpiringCache).
Greetings
follow-ups: 25 26 comment:24 by , 14 years ago
Actually, even without a polished MirroredInputStream (MIS) the parser should not run, if the image does not exist. Since a request to a FileInfoPage to this non-existing image would return 404 as well. It is possible though that MediaWiki returns different http codes depending on the user agent. E.g. if I feed a browser with
http://wiki.openstreetmap.org/wiki/File:DS13.JPG
I get a full blown html page, explaining that the image does not exist. Using wget, a 404 is returned. I have just confirmed that MIS also gets an 404 which causes an Exception to be thrown, so the sax parser does not run at all, if the image does not exist.
Nevertheless, not fetching the same resource in parallel remains an issue that is left for MIS to provide. But please notice that the issue exists only until the resource is cached.
follow-ups: 30 31 comment:25 by , 14 years ago
Replying to cmuelle8:
Actually, even without a polished MirroredInputStream (MIS) the parser should not run, if the image does not exist.
That's good.
The whole asynchronous loading business probably needs a little more attention. I'd prefer to test it on normal "http://..."
images first because these are already in use and we will get bug reports if there are problems.
Your approach (to replace the image of an ImageIcon object) looks elegant, but there are some drawbacks:
- First the preliminary icon is shown on the map, but when loading is finished, no repaint is done.
- The ElemStyles are cached and the preliminary Image causes wrong hash value. Consequently the wrong icon may be shown.
The "wiki part" of the patch is ready to be applied (after next release). I'd say we skip the parallel loading for now (as stoecker suggested) and open another ticket for it. If you like to proceed, that's great. Otherwise we can postpone the issue.
follow-up: 27 comment:26 by , 14 years ago
Replying to cmuelle8:
I get a full blown html page, explaining that the image does not exist. Using wget, a 404 is returned. I have just confirmed that MIS also gets an 404 which causes an Exception to be thrown, so the sax parser does not run at all, if the image does not exist.
A 404 is only a status code. A full html web-page can nevertheless have a 404 code (and also should have). Proper webservice give correct status code and a human readable web-page explaining the error.
follow-up: 28 comment:27 by , 14 years ago
Replying to stoecker:
A 404 is only a status code. A full html web-page can nevertheless have a 404 code (and also should have). Proper webservice give correct status code and a human readable web-page explaining the error.
Thx for the insight, but see above - the html content is, in fact, not delivered when using wget. I only get the full blown version using a browser, so MediaWiki probably messes with the user-agent for bandwidth reasons..
Greetings
comment:28 by , 14 years ago
Replying to cmuelle8:
Thx for the insight, but see above - the html content is, in fact, not delivered when using wget. I only get the full blown version using a browser, so MediaWiki probably messes with the user-agent for bandwidth reasons..
Wget gets the same response (you can check e.g. with wireshark). It simply does not display the data. See here.
comment:30 by , 14 years ago
- First the preliminary icon is shown on the map, but when loading is finished, no repaint is done.
This should be solvable.
- The ElemStyles are cached and the preliminary Image causes wrong hash value. Consequently the wrong icon may be shown.
This really sucks. There is a comment in StyleCache I suppose you are refering to: "StyleCache must not be changed after it has been added to the intern pool." This would mitigate all efforts to do asynchronous resource loading.
Since resources effect hashCode calculation and hashCodes must not be manipulated after intern pool is built, we would have to wait building the intern pool until all resources are fetched.
Would it not suffice to just hash the name of the image? After all, if images use the same icon-image value in a style sheet, they evaluate to the same hashCode, just as the name values should, if they are identical. There would only be deviations to current behavior if a style writer uses the same image under different names - a case highly improbable.
friendly greetings
comment:31 by , 14 years ago
Replying to bastiK:
- First the preliminary icon is shown on the map, but when loading is finished, no repaint is done.
In MapPainter.java, function drawNodeIcon(), line
icon.paintIcon ( nc, g, [...]
nc is used as a Component implementing ImageObserver. Maybe we can leverage this and use AWT functionality for asynchronous image loading, no need to write code twice.
comment:34 by , 13 years ago
Description: | modified (diff) |
---|
Note to self:
There is a mediawiki API to get image metadata, e.g.
http://wiki.openstreetmap.org/w/api.php?action=query&prop=imageinfo&titles=Image:Residential.jpg&iiprop=url%7Ccontent&format=xml
fetch wiki:// style links in ImageProvider to interoperate with osm wiki as an icon source