Consistency Management - Wayne State University


Caching and Prefetching for Web Content Distribution
ECE-7995, Fall 2007
Presented by:
Harpreet Singh
Sidong Zeng
Contents

Introduction
Overview: Proxy Caching Systems
Caching Challenges and Solutions
Cache Replacement and Prefetching
Consistency Management
Cache Cooperation
Further Enhancements
Conclusion
Introduction



The WWW is a widely used Internet tool for
information access.
Users now often experience long access
latency due to network congestion.
Caching and prefetching techniques play an
important role in solving this problem.
Proxy Caching System



Proxy caching: a proxy is generally deployed at
the network edge, as an enterprise network gateway or
firewall.
The proxy processes internal client requests either locally or
by forwarding them to the remote server.
The proxy is shared by internal clients with similar
interests, so it is natural for it to cache commonly requested
objects.
Proxy Caching System

If the proxy can’t satisfy a request, a cache
miss occurs.
Proxy Caching System

Proxy hit
Caching Challenges

Cache replacement and prefetching

Consistency management

Co-operative management
Issues regarding Web caching

Size: a proxy cache must be capable of handling
numerous concurrent user requests.
Heterogeneity: hardware and software
configurations, connection bandwidth, and access
behaviors all vary, which makes cache management tough.
Loose coupling: proxy cache consumers (Web
browsers) and suppliers (servers) are loosely coupled, which
makes managing consistency and cooperation among
proxy caches particularly difficult.
Cache Replacement
With insufficient disk space, a proxy must
decide which existing objects to purge
when a new object arrives.
Replacement algorithms such as LRU (least recently used) are used.
LRU offers limited room for improvement.

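The LRU replacement idea can be sketched as follows. This is a minimal illustration, not a production proxy: the class name `LRUCache`, the byte-based capacity, and the use of URLs as keys are all assumptions made for the example.

```python
from collections import OrderedDict


class LRUCache:
    """Byte-capacity cache that purges the least recently used objects first."""

    def __init__(self, capacity_bytes):
        self.capacity = capacity_bytes
        self.used = 0
        self.store = OrderedDict()  # url -> body, least recently used first

    def get(self, url):
        if url not in self.store:
            return None  # cache miss
        self.store.move_to_end(url)  # mark as most recently used
        return self.store[url]

    def put(self, url, body):
        if len(body) > self.capacity:
            return  # object cannot fit even in an empty cache
        if url in self.store:
            self.used -= len(self.store.pop(url))
        # Purge least recently used objects until the new one fits.
        while self.used + len(body) > self.capacity:
            _, evicted = self.store.popitem(last=False)
            self.used -= len(evicted)
        self.store[url] = body
        self.used += len(body)
```

A `get` counts as a use, so an object that is read often stays in the cache while untouched objects drift toward eviction.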
Prefetching policies

Three types of prefetching policies

Mixed access pattern.

Per-client access pattern.

Object structural information.
Mixed access pattern.



This policy uses aggregate access patterns
from different clients, but doesn’t consider
which client made each request.
E.g., the Top-10 proposal, which uses popularity-based
prediction.
This scheme determines how many objects to
prefetch from which servers using two
parameters.
Mixed access pattern (cont.)

M, the number of times the client has
contacted a server before it can prefetch.
N, the maximum number of objects the client
can prefetch from a server.
If the number of objects fetched in the
previous measurement period, L, reaches the
threshold M, the client will prefetch the K most
popular objects from the server, where K =
min{N, L}.
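The Top-10 decision rule can be sketched in a few lines. The function name and the `popular_objects` input (a server's objects ranked by popularity) are hypothetical. The sketch also assumes that the prefetch threshold is the contact count M, which is what makes K = min{N, L} a meaningful cap.

```python
def top10_prefetch(popular_objects, fetched_last_period, m, n):
    """Return the objects to prefetch from one server.

    popular_objects: the server's objects, most popular first (assumed input).
    fetched_last_period: L, objects fetched in the previous measurement period.
    m: M, the activity threshold that must be reached before prefetching.
    n: N, the maximum number of objects that may be prefetched.
    """
    if fetched_last_period < m:
        return []  # too little past activity with this server
    k = min(n, fetched_last_period)  # K = min{N, L}
    return popular_objects[:k]
```

For example, with M = 3 and N = 10, a client that fetched L = 4 objects in the last period would prefetch the 4 most popular objects.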
Per-client access pattern
This policy first analyzes access patterns on a
per-client basis, then uses the aggregated
access patterns for prediction.
Which objects a client accesses at a
particular time is analyzed, and prefetching
is predicted accordingly.

Per-client access pattern


Markov modeling is one analysis tool: the
policy establishes a Markov graph based on
access histories and uses the graph to make
prefetching predictions.
A set of Web objects is represented as a node;
if the same client accesses two nodes (A and
B) in order within a certain period of time, the
policy draws a direct link from A to B and
assigns it a weight equal to the transition
probability from A to B.
Per-client access pattern



The probability of accessing B after A is 0.3.
The probability of accessing C after A is 0.7.
To make a prefetching prediction, a search
algorithm traverses the graph starting from the
current object set and computes the access
likelihood for its successors; the prefetching
algorithm then decides how many successors to
preload, depending on factors such as access
likelihood and the bandwidth available for
prefetching.
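A minimal sketch of the Markov-graph approach follows. The function names, the history format, and the `min_probability` and `max_objects` knobs are assumptions for illustration. The sketch also simplifies the search to the direct successors of the current node rather than a deeper traversal.

```python
from collections import defaultdict


def build_markov_graph(histories, window):
    """Build {A: {B: probability}} from per-client access histories.

    histories: {client: [(timestamp, node), ...]}, each list sorted by time.
    A link A -> B is counted when one client accesses B directly after A
    within `window` time units.
    """
    counts = defaultdict(lambda: defaultdict(int))
    for accesses in histories.values():
        for (t1, a), (t2, b) in zip(accesses, accesses[1:]):
            if a != b and t2 - t1 <= window:
                counts[a][b] += 1
    # Normalize the counts of each source node into transition probabilities.
    graph = {}
    for a, successors in counts.items():
        total = sum(successors.values())
        graph[a] = {b: c / total for b, c in successors.items()}
    return graph


def predict(graph, current, min_probability, max_objects):
    """Return successors of `current` worth preloading, most likely first."""
    ranked = sorted(graph.get(current, {}).items(), key=lambda kv: -kv[1])
    return [b for b, p in ranked if p >= min_probability][:max_objects]
```

Here `max_objects` stands in for the bandwidth available for prefetching: a tighter budget means fewer successors are preloaded.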
Object structural information



This scheme exploits the local information
contained in the objects themselves.
Hyperlinks, for example, are good indicators of
future accesses because users tend to access
objects by clicking on links rather than typing
new URLs.
This approach can also combine object
information with access-pattern-based policies to
further improve prediction efficiency and
accuracy.
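As a rough illustration of using structural information, the sketch below collects the hyperlinks of a fetched page as prefetch candidates. It uses Python's standard `html.parser` and `urllib.parse`; the names `LinkExtractor` and `prefetch_candidates` are hypothetical, and a real policy would still have to rank and limit the candidates.

```python
from html.parser import HTMLParser
from urllib.parse import urljoin


class LinkExtractor(HTMLParser):
    """Collects the absolute URLs of <a href="..."> links in document order."""

    def __init__(self, base_url):
        super().__init__()
        self.base_url = base_url
        self.links = []

    def handle_starttag(self, tag, attrs):
        if tag != "a":
            return
        href = dict(attrs).get("href")
        if href:
            url = urljoin(self.base_url, href)  # resolve relative links
            if url not in self.links:
                self.links.append(url)


def prefetch_candidates(html, base_url):
    """Return the distinct link targets of a page as prefetch candidates."""
    parser = LinkExtractor(base_url)
    parser.feed(html)
    return parser.links
```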
Consistency Management
If the origin server updates an object
after a proxy caches it, the cached copy
becomes stale.
A consistency algorithm should ensure
consistency between the cached
copy and the original object.

Consistency Algorithm

Consistency algorithms can be
classified as:
– strong consistency
– weak consistency
If t is the delay between the proxy and the
server, a strong consistency algorithm
returns an object outdated by at most t.
Enforce Strong Consistency

Server-driven invalidation
– The server must invalidate a proxy’s copies before it
can update the objects.
– Requires extra space to maintain all objects’
states
Client-driven validation
– The proxy validates the cached copy’s freshness
with the server on every cache-hit access
– Generates numerous unnecessary messages
A hybrid approach has been developed to balance
the space required to maintain state against
the message volume that validation requires
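Server-driven invalidation can be sketched with two small in-memory classes. `Proxy`, `OriginServer`, and their methods are hypothetical names, and network messages are replaced by direct method calls. The `holders` table is the extra per-object state the server has to keep.

```python
class Proxy:
    def __init__(self):
        self.cache = {}  # url -> body

    def invalidate(self, url):
        self.cache.pop(url, None)


class OriginServer:
    def __init__(self):
        self.objects = {}  # url -> current body
        self.holders = {}  # url -> set of proxies holding a copy

    def fetch(self, url, proxy):
        # Remember which proxy now caches this object.
        self.holders.setdefault(url, set()).add(proxy)
        proxy.cache[url] = self.objects[url]
        return proxy.cache[url]

    def update(self, url, body):
        # Invalidate every cached copy before the new version is stored.
        for proxy in self.holders.pop(url, set()):
            proxy.invalidate(url)
        self.objects[url] = body
```

Client-driven validation would drop the `holders` table and instead have the proxy ask the server on every hit, trading server state for message volume.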
Weak Consistency

Generally supported by validation, in
which proxies verify the validity of their
cached objects with the origin server
– TTL-based validation
– Proactive polling
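A TTL-based scheme can be sketched as below. `TTLCache` and the `fetch_from_origin` callable are hypothetical, and the `clock` parameter exists only so the behavior can be tested without waiting. For simplicity the sketch refetches an expired object instead of sending a conditional validation request.

```python
import time


class TTLCache:
    def __init__(self, fetch_from_origin, ttl_seconds, clock=time.monotonic):
        self.fetch_from_origin = fetch_from_origin  # assumed callable: url -> body
        self.ttl = ttl_seconds
        self.clock = clock
        self.entries = {}  # url -> (body, expiry time)

    def get(self, url):
        now = self.clock()
        entry = self.entries.get(url)
        if entry is not None and now < entry[1]:
            return entry[0]  # within TTL: served without contacting the origin
        body = self.fetch_from_origin(url)  # missing or expired: go to the origin
        self.entries[url] = (body, now + self.ttl)
        return body
```

Within the TTL the proxy may return a stale copy, which is exactly the weak-consistency trade-off: fewer messages in exchange for bounded staleness.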
Cache Cooperation

A stand-alone proxy has disadvantages:
– A single point of failure
– A performance bottleneck
Caching proxies collaborate with one another in
serving requests
Three kinds of architectures for cooperative
caching proxies:
– Hierarchical cache architecture
– Distributed cache architecture
– Hybrid architecture
Cache Cooperation

Limitation of hierarchy
depth: most operational
hierarchies have only
three levels:
– Institutional
– Regional
– National
Hierarchical caches
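A three-level hierarchy can be sketched as a chain of caches, each forwarding misses to its parent. The class `CacheLevel` and the dictionary standing in for the origin server are assumptions for illustration only.

```python
class CacheLevel:
    def __init__(self, name, parent=None, origin=None):
        self.name = name
        self.parent = parent  # next level up, or None at the top
        self.origin = origin  # dict standing in for the origin server (top level only)
        self.store = {}

    def get(self, url):
        """Return (body, name of the level that supplied it)."""
        if url in self.store:
            return self.store[url], self.name
        if self.parent is not None:
            body, source = self.parent.get(url)  # miss: ask the parent cache
        else:
            body, source = self.origin[url], "origin"
        self.store[url] = body  # keep a copy on the way back down
        return body, source
```

With two institutional caches under one regional cache, a miss at the second institution can be served by the regional copy left behind by the first.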
Distributed cache architecture

All the participating proxy caches are peers.
Hybrid architecture

Combines the advantages of hierarchical
and distributed caching.
Recent Research



Caching dynamic content
Caching streaming objects
Security and integrity issues
Caching Dynamic Content

Dynamic content contributes up to 40 percent of total Web
traffic.

To improve performance, developers have
deployed reverse caches near the origin
server to support dynamic content caching.
Caching streaming objects


Streaming objects, such as music or video clips,
represent a significant portion of Web traffic
Streaming objects have three distinctive
features:
– huge size
– intensive bandwidth use
– high interactivity
One solution is partial caching
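One form of partial caching is to keep only the beginning of each stream at the proxy and fetch the rest on demand. The slides do not say which variant is meant, so the prefix approach below is an assumption, as are the name `PrefixCache` and the `fetch_range` callable.

```python
class PrefixCache:
    def __init__(self, fetch_range, prefix_bytes):
        self.fetch_range = fetch_range  # assumed callable: (url, start, end) -> bytes
        self.prefix_bytes = prefix_bytes
        self.prefixes = {}  # url -> first prefix_bytes of the object

    def read(self, url, start, end):
        """Return bytes [start, end) of the object."""
        if url not in self.prefixes:
            self.prefixes[url] = self.fetch_range(url, 0, self.prefix_bytes)
        cached = self.prefixes[url][start:end]  # part covered by the prefix
        remainder_start = max(start, self.prefix_bytes)
        if remainder_start >= end:
            return cached  # request served entirely from the cached prefix
        # The rest lies beyond the prefix and comes from the origin.
        return cached + self.fetch_range(url, remainder_start, end)
```

Playback can start from the cached prefix right away while the remainder is still being fetched, and the proxy stores only a small part of each huge object.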
Security and integrity
A stand-alone proxy is difficult to
protect from various attacks
Establishing a trust model among
participants is a challenge for
cooperative proxies
An intermediate proxy violates SSL’s
functionality.

Conclusion
Proxy caching effectively reduces the
network resources that Web services
consume, while minimizing user access
latencies.
Deploying Web caching proxies over
the Internet is complicated and difficult.
