Modify

Opened 5 years ago

Closed 5 years ago

#11193 closed defect (fixed)

[patch] Honor Expires/Cache-Control in OsmFileCacheTileLoader

Reported by: wiktorn Owned by: team
Priority: normal Milestone: 15.03
Component: Core imagery Version: tested
Keywords: WMS, TMS, cache Cc:

Description

I just noticed, that JOSM serves me old tiles, though Expires and Cache-Control header gives quite short time to live on server.

I've made two version, how we can approach this problem:

  1. Create additional metadata (expires) and store it along the tile file, but I find this quite wasteful (if no other metadata is stored, then it creates very small file). And then - compare this additional metadata when deciding, whether to serve the file from disk, or download

Although - in my case, I get etags, so I have these files anyway.

  1. Align last modification date of file, so it will take into account Expires / Cache-Control date, and probably rewind it to the past, so we will ask for the file exactly, when Expires-Cache-Control says so.

Attachments (8)

expires-version1.patch (3.1 KB) - added by wiktorn 5 years ago.
patch version 1
expires-version2.patch (2.4 KB) - added by wiktorn 5 years ago.
Patch version 2
expires-version1-v2.patch (8.2 KB) - added by wiktorn 5 years ago.
expires-version1-v3.patch (8.5 KB) - added by wiktorn 5 years ago.
expires-version1-v4.patch (8.3 KB) - added by wiktorn 5 years ago.
final version (I hope ;-))
expires-version1-v5.patch (8.4 KB) - added by wiktorn 5 years ago.
:-)
expires-version1-v6.patch (8.7 KB) - added by wiktorn 5 years ago.
fixed redownloads of no-tile-at-this-level, so they are not requeried from server
OsmFileCacheTileLoader.java.patch (1.1 KB) - added by wiktorn 5 years ago.
Overflow patch

Download all attachments as: .zip

Change History (27)

Changed 5 years ago by wiktorn

Attachment: expires-version1.patch added

patch version 1

Changed 5 years ago by wiktorn

Attachment: expires-version2.patch added

Patch version 2

comment:1 Changed 5 years ago by bastiK

Thanks for the patches! It will affect caching time a lot and fix #10849.

From the two options, I definitely prefer the first one. Here is my review:

  • When the server is down or there is no internet connection, the tile should always be loaded from disk, disregarding all expiry times (except maxCacheFileAge, which is currently set to infinity). For this, I suggest to keep the method loadTileFromFile unchanged, but adapt the argument when calling it the first time.
  • max-age should be in seconds, so I'm missing a factor of 1000 there.
  • getExpiration() returns 0 when the header is not present (not null)
Last edited 5 years ago by bastiK (previous) (diff)

comment:2 Changed 5 years ago by wiktorn

Thank you for comments. All are included. I reviewed the version 1 patch, and went with a bit bigger refactor:

  • try to load file/metadata only once from disk
  • resign from 7 days limit at all, it was used only if service was down -> so according to #10849, I decided just to use always what we have on disk

So now it goes like this:

  1. Try to load (open) file from disk - loadTileFromFile()
  2. Verify if its valid (i.e. not expired using old and new ways) - isTileFileValid()
  3. Load binary data - publishTile()
  4. Notify listeners (and redraw as I suppose)

If verification fails, new job is submitted and:

  1. Fetches the tile from network.
  2. If 304 is returned, return the file on disk (no matter how old, we trust the ETags)
  3. In other cases - rewrite the file
  4. Load binary data - publishTile()
  5. Notify listneres

If loading from network fails then:

  1. Fetch whatever we have on disk - publishTile()
  2. Notify listeners

I've also resigned from removing the files, if the case is that "no tiles at this zoom level", so we will keep this information (using our expiration policy) and reuse when needed.

I assume that we have at most one job to update specific tiles and it's guaranteed by synchronized(tile) section

Changed 5 years ago by wiktorn

Attachment: expires-version1-v2.patch added

comment:3 in reply to:  2 ; Changed 5 years ago by bastiK

Replying to osm@…:

Thank you for comments. All are included. I reviewed the version 1 patch, and went with a bit bigger refactor:

Thanks, great work! :)

  • try to load file/metadata only once from disk
  • resign from 7 days limit at all, it was used only if service was down -> so according to #10849, I decided just to use always what we have on disk

This may be a misunderstanding. Currently it will fetch a new version from the server, whenever the image is older than one week. It doesn't really matter if the service is down or not. (Refresh will fail if it is down.) If I read your patch correctly, this time has been changed to one day:

if (now - fileMtime > FILE_AGE_ONE_DAY) {

With your patch applied, this becomes the fallback value, in case the server doesn't send a max-age value in the header. I would prefer to go back to 7 days, as the most common use besides OSM tiles is aerial imagery and there we can have longer caching times.

So now it goes like this:

  1. Try to load (open) file from disk - loadTileFromFile()
  2. Verify if its valid (i.e. not expired using old and new ways) - isTileFileValid()
  3. Load binary data - publishTile()
  4. Notify listeners (and redraw as I suppose)

If verification fails, new job is submitted and:

  1. Fetches the tile from network.
  2. If 304 is returned, return the file on disk (no matter how old, we trust the ETags)
  3. In other cases - rewrite the file
  4. Load binary data - publishTile()
  5. Notify listneres

If loading from network fails then:

  1. Fetch whatever we have on disk - publishTile()
  2. Notify listeners

I like the new structure, it is clearer now. One minor thing that could be improved is the naming of methods and variables:

  • method loadTileFromFile() - doesn't actually load the tile. I would delete this method and put the code directly into loadTagsFromFile()
  • variable fileLoaded - rename to cachePresent or tagsLoaded?
  • method publishTile() - this is the function I would call loadTileFromFile()


I've also resigned from removing the files, if the case is that "no tiles at this zoom level", so we will keep this information (using our expiration policy) and reuse when needed.

Nice, I also thought about this.

I assume that we have at most one job to update specific tiles and it's guaranteed by synchronized(tile) section

Hm, I don't really trust the synchronization code here, but I don't think we need to fix it unless there is a problem.

comment:4 in reply to:  3 Changed 5 years ago by wiktorn

Replying to bastiK:

This may be a misunderstanding. Currently it will fetch a new version from the server, whenever the image is older than one week.
It doesn't really matter if the service is down or not. (Refresh will fail if it is down.) If I read your patch correctly, this time has been changed to one day:

if (now - fileMtime > FILE_AGE_ONE_DAY) {

With your patch applied, this becomes the fallback value, in case the server doesn't send a max-age value in the header. I would prefer to go back to 7 days, as the most common use besides OSM tiles is aerial imagery and there we can have longer caching times.

Ok, I've misunderstood current code. I've renamed the value to "DEFAULT_EXPIRES_TIME" and bumped its value to 7 days, so I hope, it will be clearer.

Changed 5 years ago by wiktorn

Attachment: expires-version1-v3.patch added

comment:5 Changed 5 years ago by bastiK

There seem to be problems with the X-VE-Tile-Info: no-tile header.. You can test this with Bing by zooming in to level 22 where this level isn't available (almost everywhere). I get messages like

WARNING: TMS - Error while loading image from tile cache: $JOSM_HOME/cache/tms/Bing/z20/x0y0/x5y3/x5y3/x5y7/x7y3/x2y5/x5y8.png (No such file or directory); Tile 20/555725/337358@Bing Aerial Maps

on the command line and the tags file doesn't get saved. Not sure if it worked before, but we might as well fix it now. :)

I can apply the current state of the patch (expires-version1-v3.patch) if you like.

comment:6 Changed 5 years ago by bastiK

Just noticed that Bing sends a max-age value 31535972, which is one year. I think this is too long: Bing updates their imagery regularly. If there was higher-resolution imagery available for my region, I wouldn't want to use the outdated cache for an entire year.
I would suggest to cap the max-age value to avoid excessively large caching times. (something between one week and a month for the max-age limit)

I don't expect you to address this with the current patch as it is not exactly a bug, but a "higher order problem". But if you do, please save the original header value in the tags file and check the limit when reading it back in. This way we can make the limit configurable in the preferences, if needed.

comment:7 Changed 5 years ago by wiktorn

I'll check both X-VE-Tile-Info: no-tile and max-age from Bing and send updated patch later today

comment:8 Changed 5 years ago by wiktorn

I fixed no-tile Exceptions. Enclosed revised patch.

As for max-age - I decided not to touch it at all. To have this working, I'd need to save both - max-age, and current time to make it work. And on the other hand - if Bing gives us such high max-age, then I'd say, that their cache policy is misaligned with their imagery update policy :-)

Anyway - there is a bigger problem that needs solving, and it will also fix this problem. I guess that there is no background jobs, that do clear the downloaded cached. For people that tend to map usually different places, it means that cache folder will grow indefinitely. If we will create a background job, that clears the cache of files older ~1 month, then we will solve both problems in one shot.

Changed 5 years ago by wiktorn

Attachment: expires-version1-v4.patch added

final version (I hope ;-))

Changed 5 years ago by wiktorn

Attachment: expires-version1-v5.patch added

:-)

comment:9 Changed 5 years ago by wiktorn

I just found one more bug here - that tiles that are valid in cache (those, for which http status 304 was returned) was not properly drawn

comment:10 in reply to:  8 ; Changed 5 years ago by bastiK

Replying to osm@…:

I fixed no-tile Exceptions. Enclosed revised patch.

The warnings on the command line are gone, but it still doesn't work. I.e. the no-tile information in the cache is ignored and the tile is downloaded again every time. Instead we should already know, that the tile doesn't exist and skip the download.

As for max-age - I decided not to touch it at all. To have this working, I'd need to save both - max-age, and current time to make it work.

What about the modification time of the tile file compared to the current time? This should give the tile age which can be compared to the max-age limit.

And on the other hand - if Bing gives us such high max-age, then I'd say, that their cache policy is misaligned with their imagery update policy :-)

That's my point. The only question is if we say "not our problem" or provide a fix for JOSM users. As I said, we can ignore this for now.

Anyway - there is a bigger problem that needs solving, and it will also fix this problem. I guess that there is no background jobs, that do clear the downloaded cached. For people that tend to map usually different places, it means that cache folder will grow indefinitely. If we will create a background job, that clears the cache of files older ~1 month, then we will solve both problems in one shot.

In a browser cache, you would set a limit, like 1 GB, and start deleting the oldest files when the limit is exceeded.

No matter how it will be implemented, this is pretty high on the top list of missing features in JOSM. But it's quite a challenging task, nothing one could hack on one or two evenings. :)

comment:11 in reply to:  10 ; Changed 5 years ago by wiktorn

Replying to bastiK:

Replying to osm@…:

I fixed no-tile Exceptions. Enclosed revised patch.

The warnings on the command line are gone, but it still doesn't work. I.e. the no-tile information in the cache is ignored and the tile is downloaded again every time. Instead we should already know, that the tile doesn't exist and skip the download.

I will look into this.

In a browser cache, you would set a limit, like 1 GB, and start deleting the oldest files when the limit is exceeded.
No matter how it will be implemented, this is pretty high on the top list of missing features in JOSM. But it's quite a challenging > task, nothing one could hack on one or two evenings. :)

Have you considered using for example:

The latter has the advantage of limiting the disk store size, while Commons limits only the number. I can look into incorporating one of this solution into JOSM within some other ticket, and close this ticket, with static limitation of max-age to one month.

Changed 5 years ago by wiktorn

Attachment: expires-version1-v6.patch added

fixed redownloads of no-tile-at-this-level, so they are not requeried from server

comment:12 Changed 5 years ago by bastiK

In [o31052]: applied #11193 - Honor Expires/Cache-Control in OsmFileCacheTileLoader (patch by anonymous)

comment:13 Changed 5 years ago by bastiK

In [o31053]: see ​#11193 - handle the case X-VE-Tile-Info: no-tile and no max-age header (use mtime from tags file)

comment:14 Changed 5 years ago by bastiK

In [o31054]: see ​#11193 - save original expires time send by server and apply limit when reading it back in

comment:15 in reply to:  11 Changed 5 years ago by bastiK

Milestone: 15.03

Replying to osm@…:

Replying to bastiK:

Replying to osm@…:

I fixed no-tile Exceptions. Enclosed revised patch.

The warnings on the command line are gone, but it still doesn't work. I.e. the no-tile information in the cache is ignored and the tile is downloaded again every time. Instead we should already know, that the tile doesn't exist and skip the download.

I will look into this.

Thanks for fixing this in the latest patch. I've added handling for one unlikely corner case in [o31053]. It doesn't really matter that much, but I moved the static max-age limit to a later stage in [o31054] (so the value can be changed later, if needed).

In a browser cache, you would set a limit, like 1 GB, and start deleting the oldest files when the limit is exceeded.
No matter how it will be implemented, this is pretty high on the top list of missing features in JOSM. But it's quite a challenging > task, nothing one could hack on one or two evenings. :)

Have you considered using for example:

The latter has the advantage of limiting the disk store size, while Commons limits only the number.

Yes, this is worth considering, but there is one major limitation: If we bundle it with JOSM core, the download size of the binary jar file shouldn't increase too much. So about 100, maybe 200 kByte should be okay, if it's above, we have to discuss with the team.

Sometimes, the size of a library can be reduces by stripping it to the bare minimum. If that's is not possible (or too much work) in this case, we have to look for more light-weight solutions.

I can look into incorporating one of this solution into JOSM within some other ticket, and close this ticket, with static limitation of max-age to one month.

Yes, please do. This ticket can be closed from my point of view.

comment:16 Changed 5 years ago by wiktorn

Resolution: fixed
Status: newclosed

Changed 5 years ago by wiktorn

Overflow patch

comment:17 Changed 5 years ago by wiktorn

Resolution: fixed
Status: closedreopened

When working futher on changes, I've noticed, that there is an Integer overflow. Forcing longs fixes issue.

comment:18 Changed 5 years ago by bastiK

In [o31055]: see ​#11193 - fix integer overflow (patch by wiktorn)

comment:19 Changed 5 years ago by bastiK

Resolution: fixed
Status: reopenedclosed

Modify Ticket

Change Properties
Set your email in Preferences
Action
as closed The owner will remain team.
as The resolution will be set.
The resolution will be deleted.

Add Comment


E-mail address and name can be saved in the Preferences.

 
Note: See TracTickets for help on using tickets.