Web Caching
Web Protocols and Practice
WEB CACHING
Topics
Cache Definition
Goals of Web Caching
Motivations for Caching
What is Cacheable?
Protocol-specific Considerations
Content-specific Considerations
Where is Caching Done?
How is Caching Done?
Returning a Cached Response
Maintaining a Cache
Cache Replacement
Cache Coherency
Cache Definition
 With the rapid increase of traffic on the Web,
caching was the first major technique that
attempted to
 reduce user-perceived latency
 reduce transmission of redundant traffic on the
network
 A cache is a local store of response messages.
 Caching is the movement of Web content closer to
the users.
Goals of Web Caching
 The goals of caching are to reduce:
 The user-experienced latency between the time of
the initial Web requests and the time the response
is displayed by the user agent
» Reducing user-perceived latency has important
implications not just for the user’s Web experience,
but also for content developers.
 The load on the network, which could be a local
area network or the Internet, by avoiding repeated
transmission of the same response
» Transferring only necessary information reduces the
overall congestion in the network
Goals of Web Caching
» Reduction of congestion leads to improved
performance for everyone using the network,
because fewer packets are lost and there is less
need for retransmission resulting from packet drops.
 The load on the origin server by having an
intermediary on the path between the client and
the origin server handle the requests
» The origin server can handle more requests from a
diverse set of clients
Motivations for Caching
 Web-hosting companies must pay for the
bandwidth they use and might want to increase
cacheability to reduce costs.
 The end users gain significantly from caching,
because their latency in obtaining a response is
lowered.
 Reducing traffic or moving it to the edge of the
network and away from the backbone would be
beneficial:
 Only necessary data traverses the network
 There is bandwidth available for other data
Motivations for Caching
 The following delay factors affect the time to fetch a
resource:
 The network connectivity of the user to their ISP
and the connection between the ISP and the
Internet
 Unless the DNS lookup is cached, the DNS
lookup time to locate the server to contact, even if
the server being contacted is a proxy
 The congestion in the network and the bandwidth
available on the path between user and origin
server
Motivations for Caching
 The load on the origin server
 The time to generate the response
 The time to render the response by the browser
What is Cacheable?
 A cache can decide whether a response is
cacheable based on two factors:
 Protocol-specific considerations
» Protocol-specific caching considerations require that
a cache obey the various directives regarding
cacheability of a message.
 Content-specific considerations
» The content-specific requirements are affected by the
business requirements of a cache and policies that
affect the frequency of cache revalidation.
» The policies in turn may be affected by attributes of
the message, such as size or content type.
Protocol-specific Considerations
 The request method, request header fields,
response status, and response headers all have
to indicate that the response is cacheable.
 Responses to the OPTIONS, PUT, and DELETE
methods are not cacheable.
 Responses to the POST method are not
cacheable unless the response has the
appropriate Cache-Control or Expires headers.
 If a cache does not support the Range header,
any response that has a response status code of
206 Partial Content cannot be cached.
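The method and status rules above can be sketched as a small predicate. This is a minimal illustration, not a complete HTTP implementation: the function name and the supports_range flag are hypothetical, header names are assumed to be passed as dictionary keys, and the sketch assumes that either Cache-Control or Expires is enough to make a POST response a candidate.

```python
# Methods whose responses the rules above treat as never cacheable.
UNCACHEABLE_METHODS = {"OPTIONS", "PUT", "DELETE"}


def method_allows_caching(method, status, response_headers, supports_range=False):
    """Return True if the request method and status code permit caching.

    response_headers is a dict of header name -> value (names in any case).
    supports_range says whether this (hypothetical) cache understands Range.
    """
    method = method.upper()
    names = {name.lower() for name in response_headers}
    if method in UNCACHEABLE_METHODS:
        return False
    if method == "POST":
        # Assumption: explicit freshness information in either header suffices.
        return "cache-control" in names or "expires" in names
    if status == 206 and not supports_range:
        # A partial response is only usable by a cache that handles ranges.
        return False
    return True
```

A response that passes this check is only a candidate: header directives and local policy can still rule it out.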
Protocol-specific Considerations
 Some responses include resource-specific
information from the origin server that may
preclude caching of the message. Such
information is of two kinds:
 Cacheability information
» If the response includes the cacheability information,
the decision to cache should be driven by that.
» For example, the server might provide explicit
freshness duration via headers such as Expires.
» If the time specified in Expires is a short time away
from the time the response was received, the resource
may not be cached.
Protocol-specific Considerations
 Cache directives
» The Cache-Control directive may preclude caching of
certain responses.
Cache-Control: private – A shared cache must not
cache the response.
Cache-Control: no-store – A cache must not store a
response message. This directive can appear in a
request or response.
Cache-Control: no-cache – A cache must not return
the cached response without first revalidating it with
the origin server, so a cache may choose not to store
the response at all.
The Authorization request header indicates that the
requested resource is not available to everyone and
cannot be cached by a shared cache.
Protocol-specific Considerations
The Vary header indicates that an acceptable cached
response would be constrained by the values specified
in the Vary header.
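The directive rules above can be illustrated with a short sketch. All function names here are hypothetical, header dictionaries are assumed to use lower-case names, and the parser ignores quoted arguments that contain commas. The no-cache directive is modelled as "revalidate before every reuse", and the Authorization rule is applied in its simplest form, without the exceptions HTTP allows.

```python
def parse_cache_control(value):
    """Split a Cache-Control value into a {directive: argument-or-None} dict."""
    directives = {}
    for part in value.split(","):
        part = part.strip()
        if not part:
            continue
        name, _, arg = part.partition("=")
        directives[name.strip().lower()] = arg.strip().strip('"') or None
    return directives


def may_store(request_headers, response_headers, shared=True):
    """Apply the no-store, private, and Authorization rules."""
    req_cc = parse_cache_control(request_headers.get("cache-control", ""))
    resp_cc = parse_cache_control(response_headers.get("cache-control", ""))
    if "no-store" in req_cc or "no-store" in resp_cc:
        return False
    if shared and "private" in resp_cc:
        return False
    if shared and "authorization" in request_headers:
        return False  # simplification: no exceptions are modelled
    return True


def must_revalidate_before_use(response_headers):
    """True when a stored copy may only be returned after revalidation."""
    cc = parse_cache_control(response_headers.get("cache-control", ""))
    return "no-cache" in cc


def vary_key(url, request_headers, response_headers):
    """Build a cache key that includes the request headers named by Vary."""
    vary = response_headers.get("vary", "")
    names = sorted(n.strip().lower() for n in vary.split(",") if n.strip())
    return (url,) + tuple((n, request_headers.get(n, "")) for n in names)
```

Keying on the Vary-selected request headers means two requests for the same URL with different Accept-Language values map to different cache entries.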
Content-specific Considerations
 Just because a resource is cacheable does not
mean that it will be cached.
 Messages could be large, dynamically
generated, or include cookies, all of which could
affect cacheability of a message.
 Cache policy may be driven by factors such as
attributes of a message and the frequency with
which caches revalidate resources with the origin
server.
Content-specific Considerations
 A shared cache may not want to cache
responses to queries that have personal
information.
 Active Server Pages (ASP) and requests for
documents triggering authentication are not
good candidates for caching.
 Large resources may not be cached even though
they may be cacheable.
Content-specific Considerations
 The basic assumption in caching is that the
same response is likely to be generated in the
future, and a request for such a response might
occur in the near future.
 The presence of cacheability information in a
dynamic response such as an Expires or ETag
header may indicate that the resource is actually
cacheable.
Content-specific Considerations
 Responses that include data tailored to a
specific user may be viewed as uncacheable.
 Responses with cookie information in them are
considered uncacheable.
 The decision to cache is affected by the rate of
change of resources.
 Examining the rate of change of a resource is a
valid metric for deciding cacheability.
Content-specific Considerations
 One early heuristic for deciding on the
cacheability of a resource was the last
modification time of a resource.
 The load on a cache may also have impact on
whether a response should be cached.
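The content-specific considerations can be pictured as a local policy object that a cache consults after the protocol checks pass. The sketch below is illustrative only: the class name, the size limit, and the use of "?" in the URL as a sign of a dynamic response are all assumptions, and header names are assumed to be lower-case.

```python
from dataclasses import dataclass


@dataclass
class ContentPolicy:
    max_bytes: int = 1_000_000  # hypothetical limit on object size

    def wants(self, url, response_headers, body_length):
        """Return True if local policy is willing to cache this response."""
        if body_length > self.max_bytes:
            return False  # cacheable, but too large for this cache
        if "set-cookie" in response_headers:
            return False  # cookie-bearing responses are treated as per-user
        if "?" in url:
            # Dynamic response: accept it only when the server supplied
            # cacheability information such as Expires or ETag.
            return "expires" in response_headers or "etag" in response_headers
        return True
```

A real policy would likely also weigh the rate of change of the resource and the current load on the cache, which this sketch leaves out.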
Where is Caching Done?
 Caches are found in browsers and in any of the
Web intermediaries between the user agent and
the origin server.
 In addition to the browser, a cache can be located
in a proxy.
 A browser cache can avoid having to refetch
pages the user examined during the same
session. However, a browser cache does not
take advantage of frequently requested
resources by other users in the same local
environment.
Where is Caching Done?
 A caching proxy can help dozens of users.
 A browser cache can store a reasonable set of
recently received responses for a longer time
than a caching proxy.
 A caching proxy, being a resource shared by
hundreds of users, may have to evict some
responses sooner than a browser cache.
 A regional cache can help several
geographically colocated caches in one or more
administrative entities.
Where is Caching Done?
 A national cache can group a set of regional
caches and help reduce costs in countries facing
high traffic for moving data across national
boundaries.
 In a reverse proxy, caching occurs on behalf of
origin servers and not on behalf of users.
 Interception proxies can be placed anywhere on
the network and can examine the network and
transport layer of the protocol stack.
How is Caching Done?
 First, a cache must decide whether a message
is cacheable, then decide if space is available
and, if not, how to replace some of the existing
cached objects.
 Upon receiving a request, the cache must
decide whether it can satisfy the request and, if
so, return the cached response while updating
its bookkeeping information.
 The cache must have a coherency policy for
maintaining freshness information of the cached
resource.
How is Caching Done?
 The common criteria used to decide on
cacheability of a message are as follows:
 Are there protocol requirements that prevent the
response from being cached?
 Is the content typically uncacheable?
 Is the cached response likely to be reused?
 Will the decision to cache a particular response
lead to replacement of one or more resources?
How is Caching Done?
 After deciding to store the message, the cache
checks to see whether the message can be
stored without evicting other objects from the
cache. If not, the cache replacement algorithm is
triggered.
 Often, resources known to be stale are evicted
from a cache even if the cache is not full.
How is Caching Done?
 This reduces the need for triggering the cache
replacement algorithm at the time a request is
being handled, thus lowering user-perceived
latency.
 Once space becomes available, the cache
extracts information about the message, such as
last modification time and expiry or staleness-related
information.
 Message headers such as Expires and Cache-Control:
max-age carry information about expiration.
How is Caching Done?
 The Expires and Cache-Control header fields help the
cache comply with restrictions on the length of
time a cached response can be returned as a
valid response.
 In the absence of specific expiration time
information in the message, the cache uses a
heuristic expiration time to decide when the
message becomes stale.
 The heuristic expiration time could be based on
the Last-Modified time associated with the
resource.
How is Caching Done?
 A cache could add a fixed amount of time, say
ten minutes, to the Last-Modified value and use
that as a freshness interval.
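A freshness interval could be computed roughly as follows. This is a sketch under stated assumptions: the function name is hypothetical, header names are assumed to be lower-case, and malformed dates are not handled. For the heuristic step it uses a fraction of the time since last modification (10 percent here, an assumed value) rather than the fixed offset mentioned above.

```python
from datetime import timedelta
from email.utils import parsedate_to_datetime


def freshness_lifetime(headers, heuristic_fraction=0.1):
    """Return how long a response may be served without revalidation."""
    # 1. An explicit max-age in Cache-Control takes precedence.
    for part in headers.get("cache-control", "").split(","):
        name, _, arg = part.strip().partition("=")
        if name.lower() == "max-age" and arg.isdigit():
            return timedelta(seconds=int(arg))
    date = parsedate_to_datetime(headers["date"])
    # 2. Otherwise use Expires, measured from the response's Date.
    if "expires" in headers:
        expires = parsedate_to_datetime(headers["expires"])
        return max(expires - date, timedelta(0))
    # 3. No explicit information: fall back to a Last-Modified heuristic.
    if "last-modified" in headers:
        unchanged_for = date - parsedate_to_datetime(headers["last-modified"])
        return max(unchanged_for * heuristic_fraction, timedelta(0))
    return timedelta(0)  # nothing to go on: treat as immediately stale
```

With this heuristic, a resource that had been unchanged for ten days when it was fetched is treated as fresh for one day.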
Returning a Cached Response
 When a response is found in the cache, a
“cache hit” has occurred.
 A revalidation may be performed to ensure that
the cached response is still fresh.
 If revalidation indicates that the response is still
fresh, the request is satisfied from the cache.
 Otherwise, the cache gets a new copy of the
resource and uses its caching policy while
forwarding it to the client.
Returning a Cached Response
 If the response is not found in the cache (i.e., a
“cache miss”), the request is forwarded toward the
origin server.
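The hit, revalidation, and miss paths described above can be put together in a toy lookup routine. The class and the fetch and revalidate callbacks are hypothetical stand-ins for real network operations, and freshness is reduced to a single lifetime in seconds.

```python
import time


class SimpleCache:
    def __init__(self, fetch, revalidate, clock=time.time):
        self._fetch = fetch            # fetch(url) -> (body, lifetime_seconds)
        self._revalidate = revalidate  # revalidate(url, body) -> True if unchanged
        self._clock = clock
        self._store = {}               # url -> [body, lifetime, stored_at]

    def get(self, url):
        """Return (body, outcome) where outcome names the path taken."""
        entry = self._store.get(url)
        if entry is None:
            return self._fill(url), "miss"
        body, lifetime, stored_at = entry
        if self._clock() - stored_at < lifetime:
            return body, "hit"
        if self._revalidate(url, body):
            entry[2] = self._clock()   # restart the freshness interval
            return body, "revalidated"
        return self._fill(url), "refetched"

    def _fill(self, url):
        # Forward the request, then store the new copy.
        body, lifetime = self._fetch(url)
        self._store[url] = [body, lifetime, self._clock()]
        return body
```

Passing the clock in as a parameter keeps the sketch easy to exercise without waiting for real time to pass.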
Maintaining a Cache
 Periodically, a cache may check to see if the
objects in the cache are still fresh and trigger
eviction of stale objects.
 A cache might want to prevalidate popular
objects to ensure that more frequently requested
objects are fresh.
 Prevalidation could be done via the HTTP HEAD
request.
Maintaining a Cache
 A cache could also contact the origin server to
see if the resource has changed and, if so,
prefetch it to update its cache.
 Such approaches trade off bandwidth against
latency.
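The maintenance tasks above might look like the following sketch. The entry layout (a dict with expires_at and hits fields) and both function names are assumptions made for illustration; the network step that would actually send the HEAD request is left out.

```python
def sweep_stale(entries, now):
    """Remove stale entries in place and return the URLs that were evicted."""
    stale = [url for url, e in entries.items() if e["expires_at"] <= now]
    for url in stale:
        del entries[url]
    return stale


def prevalidation_candidates(entries, now, horizon, top_n):
    """Pick the most-requested entries that expire within `horizon` seconds."""
    soon = [(e["hits"], url) for url, e in entries.items()
            if now < e["expires_at"] <= now + horizon]
    soon.sort(reverse=True)  # most popular first
    return [url for _, url in soon[:top_n]]
```

Limiting prevalidation to the top_n most popular entries is one way to bound the extra bandwidth spent in exchange for lower latency.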
Cache Replacement
 Once the cache is full, objects must be
removed to make room to cache new responses.
 Replacement approaches combine a set of
metrics that includes the size of cached objects,
their content type, and even a notion of network
distance to the origin server.
 The usefulness of retaining a response in the
cache can be gauged by the following factors:
 Cost of fetching the resource
» keep resources that were expensive to fetch
Cache Replacement
 Cost of storing the resource
» Large resources take more space, but if they were
replaced, fetching them again would also be more
expensive.
 The number of accesses to the resource in the
past
» keep objects that have been accessed many times in
the past
 The probability of the resource being accessed in
the future
» If a resource is likely to be retrieved in the near future,
it would not make sense to remove it from the cache.
Cache Replacement
 The time since the last modification of the
resource
» keep resources that have not been modified for a
long time.
 The heuristic expiration time
» remove resources that are close to their expiration
time.
Cache Replacement
 Several algorithms for replacement have been
proposed:
 Least Recently Used (LRU)
» Removes the oldest object (in terms of the time at
which it was last accessed) from the cache.
» Objects that have been accessed more recently are
likely to be accessed again, and so the least recently
accessed objects should be evicted.
 Least Frequently Used (LFU)
» Ranks the objects in terms of frequency of access
» Removes the object that is the least frequently used
Cache Replacement
 Size of object (SIZE)
» Delete the largest object in the cache
 Hyper-G (LFU/LRU/SIZE)
» Combines the LFU, LRU, and SIZE policies.
» First consideration for replacement is LFU, then
LRU, then SIZE
 GreedyDual-Size
» Associates a utility value with each resource
» Replaces the resource that has the lowest utility
» The utility is computed from the cost of fetching the
resource, its size, and an age factor (which is updated
as resources leave the cache).
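The replacement policies listed above differ mainly in how they rank eviction candidates. The sketch below expresses the simpler ones as sort keys and gives a compact version of GreedyDual-Size, using cost divided by size plus an inflation value as the utility. The Entry fields and all names are assumptions made for illustration, not a reference implementation.

```python
from dataclasses import dataclass


@dataclass
class Entry:
    url: str
    size: int           # bytes
    hits: int           # number of past accesses
    last_access: float  # timestamp of the most recent access
    cost: float = 1.0   # assumed cost of fetching the resource again


# Each policy is a sort key: the entry with the smallest key is evicted first.
POLICIES = {
    "LRU": lambda e: e.last_access,
    "LFU": lambda e: e.hits,
    "SIZE": lambda e: -e.size,                              # largest goes first
    "HYPER-G": lambda e: (e.hits, e.last_access, -e.size),  # LFU, LRU, SIZE
}


def choose_victim(entries, policy):
    return min(entries, key=POLICIES[policy])


class GreedyDualSize:
    def __init__(self):
        self.inflation = 0.0  # the "age" term; raised on every eviction
        self.utility = {}     # url -> current utility value

    def touch(self, entry):
        """Call when an entry is inserted and again on every hit."""
        self.utility[entry.url] = self.inflation + entry.cost / entry.size

    def evict(self):
        """Remove and return the URL with the lowest utility."""
        victim = min(self.utility, key=self.utility.get)
        self.inflation = self.utility.pop(victim)
        return victim
```

Raising the inflation value on each eviction means entries that are not touched again gradually fall behind newer ones, without having to update every stored utility.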
Cache Replacement
 Cache replacement has generally faded from the
practical arena for the following four main
reasons:
 Steadily falling cost of storage leads to caches of
sizes large enough to hold most of the resources
requested.
 An overall reduction in the fraction of traffic that is
cacheable.
 The availability of “good-enough” algorithms that
satisfy most situations in which cache replacement is
used. Algorithms such as GreedyDual-Size and
Hyper-G are in the “good-enough” category.
Cache Replacement
 Change in resources over time reduces the value
of having a large cache that can store them
longer.
Cache Coherency
 A cache may have to ensure that a cached
response is still fresh before returning it to the
client requesting the resource.
 Caches may simply return an older cached value
when:
 The connection to the origin server is down
 The cache is busy
 The most common approach in the Web to
check the coherency is to send a GET or a
HEAD request with an If-Modified-Since request
header.
Cache Coherency
 Entity tags, in conjunction with the If-None-Match
header, can be used to perform coherency
checks against specific variants of a resource.
 If a caching proxy sends a revalidation request
each time a cache hit occurs, the policy is called
strong consistency.
 If the cache uses a heuristic to decide whether
the cached response is still fresh, without
consulting the origin server each time a cache
hit occurs, such a policy is called weak
consistency.
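A coherency check can be sketched as building a conditional request from the validators stored with the cached response. The function names are hypothetical and header names are assumed to be lower-case. The sketch sends the entity tag in If-None-Match, the request header HTTP/1.1 provides for validating a cached copy; that choice belongs to this sketch.

```python
def conditional_headers(cached_headers):
    """Build revalidation request headers from a cached response's validators."""
    headers = {}
    if "last-modified" in cached_headers:
        headers["If-Modified-Since"] = cached_headers["last-modified"]
    if "etag" in cached_headers:
        headers["If-None-Match"] = cached_headers["etag"]
    return headers


def apply_revalidation(status, cached_body, new_body):
    """Return (body, still_valid); 304 Not Modified keeps the cached copy."""
    if status == 304:
        return cached_body, True
    return new_body, False


def should_revalidate(consistency, is_fresh):
    """Strong consistency revalidates on every hit; weak only when stale."""
    return consistency == "strong" or not is_fresh
```

Under the weak policy, is_fresh would come from whatever freshness heuristic the cache uses, so the origin server is contacted only when that heuristic reports a stale entry.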
Cache Coherency
 The following two heuristics are among weak
consistency approaches:
 A lease-based approach
» A cache agrees to store a response for a fixed
amount of time (the lease period) without
revalidating.
» The server promises to notify the cache if a cached
resource changes within the lease period.
 A time to live (TTL) approach
» Responses have a cache expiration time associated
with them.
» When the time interval passes, the responses are
considered stale.
Cache Coherency
» The TTL value can vary with the response and can
be based on the following factors:
The expiration time specified in the response header
field
The frequency of request for a cached resource
Mobile environment
The last modification time of the resource
 Maintaining consistency can have a serious
impact on cache response time because each
revalidation request has the overhead of
contacting the origin server.
Cache Coherency
 The dominance of the connection cost to the
origin server points to the need for reducing the
number of revalidation requests.
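One way to cut revalidation traffic is the lease-based approach described earlier in this section, where the server keeps track of which caches hold a copy. The sketch below shows only the server-side bookkeeping. The class, its method names, and the cache identifiers are hypothetical, and actual notification delivery is out of scope.

```python
class LeaseServer:
    def __init__(self, lease_seconds):
        self.lease_seconds = lease_seconds
        self.leases = {}  # url -> {cache_id: lease expiry time}

    def grant(self, url, cache_id, now):
        """Record a lease for cache_id and return the time it expires."""
        expiry = now + self.lease_seconds
        self.leases.setdefault(url, {})[cache_id] = expiry
        return expiry

    def on_change(self, url, now):
        """Return the caches to notify because their lease is still active."""
        holders = self.leases.pop(url, {})
        return sorted(c for c, expiry in holders.items() if expiry > now)
```

During the lease period a cache can answer hits without contacting the origin server, and only caches with an unexpired lease need to be told about a change.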