To cache or not to cache

Retired
9th May 2012

A colleague recently asked us about offering some advice on delivering a fast web service  to a client who was using a combination of ArcGIS for Desktop and the web. The datasets are from the Ordnance Survey’s OpenData offering.

 

 

The datasets included:

1. OS Street View.

2. OS Vector Map District.

3. OS 50k  Raster.

4. OS 250k Raster.

5. OS Miniscale.

 

Their challenge was working out whether to deliver a cached map service to a client, a dynamic map service or a mixture of a cached map service and the original data as a Geodatabase? The issues he needed to deal with included the time it took to create the map caches versus performance versus utility versus cost. A multi-layered issue but ultimately it boiled down to answering the question ‘to cache or not to cache?’

Actually, his question was answered initially by another question, why bother? Consider signing up with Esri UK’s free cloud-enabled map services and become productive immediately. Leave the caching work to us. This is a route that a lot of organisations are already doing and Esri UK has been happily serving up fast, visually pleasing map services since the service went live in late 2011. However, the client in question was ‘air-gapped’ and could not connect to the internet for security reasons.

So back to the original question. Our recommendation is that it’s often best to cache your data rather than serve it out dynamically.

Server Performance

The resultant CPU and RAM for serving a cached map is tiny compared to the very CPU and RAM intensive operations of rendering the same image dynamically. Most applications will be unable to scale up to heavier use if the base map is not cached.

A map service that is cached can therefore scale very easily as more and more users are added.  Esri UK’s cloud-enabled map services are happily serving a vast number of map views a month and a high level of concurrent users with little effort.

A dynamic service would be unable to get close to the performance of a cached map service as users increase. Parity between dynamic and cached map services exists only at very low levels of usage, say one user. Also, LocalView Fusion[1] templates are designed for cached base maps and some functionality is limited when a dynamic base map is used.

Architecture

There are also architectural differences when using a cached map service:

 o   A web browser can cache tiles it has requested previously, thus reducing the number of hits each client makes on the services over time. Similarly, most users may have some sort of cache or proxy server that also performs the same task but at the network level.

 o   Web servers are designed and optimised for serving lots of files off disk and/or memory; it is how web apps should work.

 o   By comparison dynamic images usually involve a number of resource intensive tasks including: data search, results retrieval, image generation, disk I/O to save the image to disk (even when streaming as MIME[2]) and other such steps.

Other Considerations

Of course, a cached map service isn’t a magic bullet to all of one’s requirements. There are a number of issues to think about: 

  1. Time: One needs to spend time (in some cases, a lot of time) to produce the map cache. The process is dependent on many factors: the number of scales, image type and resolution to pick out three. With the need to potential roll out updates quickly, cached map services may not be the most agile option.[3]
  2. Size on disk: Disk space usage will be a concern as a map cache can take up a significant amount of space. The Esri UK map services currently have a disk foot-print of 3 terabytes (and growing) – all this has a cost in storing and managing.  
  3. Caching Processes: Resources required for creating cache – multi-servers may be required to reduce time but it increases cost. Faster servers may need to be provisioned.
  4. Update Frequency: If the data changes a lot, then it potentially means there is a need to re-cache the data; either all of it the changed portions. Workflows need to be developed to enable this introducing an overhead.
  5. Less Dynamic: If a client needs to perform client side processing such as online digitising, dynamic rendering of layers or some other function; a map cache may not be suitable.
  6. Scale limits: The largest available scale of a map cache may not contain enough detail for certain workflows and organisations.

So in some cases a more subtle and nuanced workflow may be needed such as combining the advantage of both dynamic and cached map services. For example, a client may require data to be available down to 1:500, 1:250 or 1:100 which is just not practical to build map caches using current prices and resources.  So a viable solution would be to use vector data from a File Geodatabase as a base map for these scales, then utilising cached raster maps as the base map once you’ve zoomed out far enough.

For those interested you can see the cache levels and hardware specifications we typically use for building national caches here.

 


[1] Esri UK’s Platform-as-a-service offering

[2] http://en.wikipedia.org/wiki/MIME

[3] Though many options and alternative workflows exist to maintain a continuously changing map cache; we will explore some of these options in future blog posts.

[4] http://aws.amazon.com/ec2/instance-types/

Retired