Content Distribution Network (CDN) Performance

Punit Shah ([email protected])
CSE581 Internet Technologies
OGI, OHSU
2002, Jan 16th
Papers
• CDN, CDN Performance
– The measured performance of CDNs.
– On the use and performance of CDNs.
• Analytical model for CDN performance in multi-level caching.
– Web caching and content distribution: A view from the interior.
01/16/2002
CSE581, Winter 2002 | Punit Shah
([email protected])
2
What is a CDN?
• A CDN is a means of offloading some or all of the content delivery burden (mainly static content) from the origin server. A replica server that delivers content on behalf of the origin server is called a CDN server.
• Aimed to address …
– Client-perceived latency (e.g. in web browsers).
– Capacity management of the server.
– Caching as a side effect.
Request Redirection
• There are primarily two ways to redirect requests to the CDN servers.
– DNS redirection
The authoritative DNS server is controlled by the CDN infrastructure. It distributes the load across the CDN servers according to some policy (e.g. round-robin, least-loaded CDN server, geographical distance) through the DNS answers it returns.
– URL rewriting
The main page still comes from the origin server, but the URLs of embedded objects (e.g. images, clips) are rewritten to point to one of the CDN servers. Some vendors rewrite using a hostname and some use an IP address directly.
• Some vendors employ a combination of these two methods.
• Finding the nearest CDN server (in terms of latency) is not simple.
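The DNS redirection policies mentioned on this slide (round-robin, least-loaded) can be sketched roughly as below. This is only an illustration, not any vendor's implementation: the class name `CdnDnsResolver`, the load figures, and the reuse of the 10.20.30.x addresses from the example slide are all assumptions.

```python
import itertools


class CdnDnsResolver:
    """Toy stand-in for a CDN-controlled authoritative DNS server (hypothetical)."""

    def __init__(self, server_loads):
        # server_loads maps a CDN server IP to its current load (assumed metric).
        self.server_loads = dict(server_loads)
        # Cycle through the server IPs in sorted order for round-robin answers.
        self._ring = itertools.cycle(sorted(self.server_loads))

    def resolve_round_robin(self):
        """Return the next CDN server IP in rotation."""
        return next(self._ring)

    def resolve_least_loaded(self):
        """Return the CDN server IP with the smallest reported load."""
        return min(self.server_loads, key=self.server_loads.get)


resolver = CdnDnsResolver({"10.20.30.1": 5, "10.20.30.2": 1, "10.20.30.3": 9})
print(resolver.resolve_round_robin())   # first server in the rotation
print(resolver.resolve_least_loaded())  # server with the lowest load
```

Neither policy knows anything about latency to the client, which is one reason the nearest server is hard to find.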
Full Site DNS redirection example
[Figure: the client asks the CDN-controlled DNS server for the IP of yahoo.com and receives 10.20.30.1 (not 111.222.100.1, the origin server). It then sends GET index.html for www.yahoo.com to 10.20.30.1 and receives the HTML page. CDN servers shown: 10.20.30.1, 10.20.30.2, 10.20.30.3, 10.20.30.4.]
Vendors: Adero (Full), Akamai and Digital Island (Partial)
Partial DNS redirect/URL rewriting example
index.html
<HTML>
<BODY>
<A HREF="/about_us.html"> About Us </A>
<IMG SRC="http://www.clearway1.net/www.yahoo.com/img1.gif">
<IMG SRC="http://www.clearway2.net/www.yahoo.com/img2.gif">
<IMG SRC="http://10.20.30.2/www.yahoo.com/img3.gif">
</BODY>
</HTML>
Vendors: Clearway (URL rewriting)
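A rough sketch of how such a rewrite could be done, assuming the origin page refers to its images with site-relative paths. The function name `rewrite_embedded_urls`, the regular expression, the `http://` scheme, and the round-robin choice of CDN host are assumptions made for illustration; the slide only shows the rewritten result.

```python
import re
from itertools import cycle

# Matches <IMG SRC="/path"> with a site-relative path (illustrative, not a full HTML parser).
IMG_SRC = re.compile(r'(<IMG\s+SRC=")(/[^"]*)(")', re.IGNORECASE)


def rewrite_embedded_urls(html, origin_host, cdn_hosts):
    """Point embedded images at CDN hosts; leave ordinary links on the origin."""
    hosts = cycle(cdn_hosts)

    def repl(match):
        prefix, path, suffix = match.groups()
        # Same pattern as the slide: <cdn host>/<origin host>/<object path>.
        return f"{prefix}http://{next(hosts)}/{origin_host}{path}{suffix}"

    return IMG_SRC.sub(repl, html)


page = '<A HREF="/about_us.html"> About Us </A>\n<IMG SRC="/img1.gif">\n<IMG SRC="/img2.gif">'
print(rewrite_embedded_urls(page, "www.yahoo.com", ["www.clearway1.net", "10.20.30.2"]))
```

The anchor link is untouched, so the main pages keep coming from the origin server while only the embedded objects move to the CDN.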
CDN performance elements
• Client-perceived latency.
– This is what most of the papers focus on, measuring as an outsider.
• Load balancing among the CDN servers.
• Number of requests offloaded from the origin server.
Analytical Model
• Gadde et al. derive the CDN cacheable ratio as
(Cni - Cnl) / (1 - Cnl)
– where
• Cni = CDN hit ratio for the client population of size ‘ni’ that forwards requests to this CDN server, for some fixed object ‘x’
• Cnl = cache hit ratio at a leaf node (e.g. a proxy) serving a client population of size ‘nl’
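As a quick worked example of the ratio above: the input values 0.6 and 0.4 are made-up numbers, not figures from the paper, and the function name is hypothetical. One way to read the formula is that the denominator (1 - Cnl) is the share of requests that miss at the leaf, and the numerator is the extra share the larger population can still hit.

```python
def cdn_cacheable_ratio(c_ni, c_nl):
    """Evaluate (Cni - Cnl) / (1 - Cnl) for hit ratios given as fractions."""
    if not (0.0 <= c_ni <= 1.0 and 0.0 <= c_nl < 1.0):
        raise ValueError("expected 0 <= Cni <= 1 and 0 <= Cnl < 1")
    return (c_ni - c_nl) / (1.0 - c_nl)


# Hypothetical inputs: Cni = 0.6, Cnl = 0.4 gives 0.2 / 0.6, i.e. about one third.
print(round(cdn_cacheable_ratio(0.6, 0.4), 3))
```

When the leaf hit ratio approaches the CDN-level hit ratio, the result approaches zero: the leaf caches have already absorbed what the CDN could have served.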
Model Performance
• More clients means a lower CDN cache hit ratio.
• If the number of clients is increased further, the curve takes a bell shape, indicating cache ‘thrashing’.
• The model was validated against the NLANR cache hierarchy at the ‘root’ level (treating all root-level caches as one unified cache). 32% cache hit ratio in Oct 1999.
CDN Server Selection
• The primary paper [Johnson et al.] focuses on how ‘good’ a CDN server (good == minimal client latency) is selected by Akamai and Digital Island. Both of these use partial-site DNS lookup.
• Three distinct client locations in the US were used: two on the east coast and one in a western state. The clients ran different operating systems and had different Internet bandwidth.
• Test Procedure
– Determine the set of CDN servers (hostnames) used by the particular CDN.
– Obtain the IP addresses of the CDN servers.
– Identify a GIF file (3-4KB) and retrieve it from each of the CDN servers 25 times, recording the time taken. Note that DNS lookup time is not a factor, as IP addresses are used. This test was conducted at all three client sites.
– Fetch the same GIF via the CDN server identified by contacting an origin server, recording the time taken. Modified gethostbyname()? Or changed /etc/resolv.conf order? The TTL was quite small (tens of seconds). These tests were also conducted at all three client sites for both of these vendors.
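The first part of the test procedure (fetch the same object from every CDN server by IP and time it) might look roughly like the sketch below. The helper names, the use of the median to rank servers, and the plain HTTP request with a `Host` header are assumptions; the slides do not say how the original measurements were scripted.

```python
import http.client
import statistics
import time


def fetch_by_ip(ip, host, path, timeout=5.0):
    """GET `path` from a server addressed by IP, so no DNS lookup is involved."""
    conn = http.client.HTTPConnection(ip, 80, timeout=timeout)
    try:
        conn.request("GET", path, headers={"Host": host})
        return conn.getresponse().read()
    finally:
        conn.close()


def time_downloads(server_ips, fetch, repeats=25, clock=time.perf_counter):
    """Call fetch(ip) `repeats` times per server and record each elapsed time."""
    results = {}
    for ip in server_ips:
        samples = []
        for _ in range(repeats):
            start = clock()
            fetch(ip)
            samples.append(clock() - start)
        results[ip] = samples
    return results


def rank_servers(results):
    """Order server IPs from fastest to slowest by median download time."""
    return sorted(results, key=lambda ip: statistics.median(results[ip]))
```

A run against real servers would pass something like `lambda ip: fetch_by_ip(ip, host, path)` as `fetch`, with `host` and `path` naming the chosen GIF. The server picked by the CDN's DNS can then be compared against its position in the ranking.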
Results
• Both vendors demonstrated identical results.
• The very best CDN server is not chosen at some locations.
• Performance is highly location dependent. Some locations performed much better than others, indicating the effect of CDN server placement.
• However, more than 90% of the time a reasonably good server, with respect to the particular location, is chosen.
• Around 10% of the time, a random choice would have done better.
• Conclusion: The CDN doesn’t choose an optimal CDN server, but avoids notably bad CDN servers.
Another location
Other Results
• The focus is on comparing the Sep 2000 and Jan 2001 results.
• CDN server selection test results are identical to what we saw earlier.
• HTTP/1.1 results are better than HTTP/1.0 with parallel connections. HTTP/1.1 pipelining is faster than serial requests.
Load balancing and DNS Lookup Overhead
• Until now we ignored DNS lookup time, to focus on measuring the quality of the CDN server chosen.
• However, it is not an insignificant overhead, especially considering the very small download times and TTLs, e.g. Adero 10 sec, Akamai and Digital Island 20 sec. TTLs for non-CDN origin sites: cnn.com 15 min, espn.com 6 hours.
• Bala et al. conducted a test to measure the DNS lookup overhead (and latency) introduced by the CDN load balancing mechanism.
– Test procedure
• Store a (fixed) IP address for each CDN server every 8 hours.
• During this 8-hour period, every 30 minutes, compare the newly returned IP with the previously retrieved (fixed) IP address.
• Assess the DNS lookup time and download time for the newly returned IP address.
• Compare the download time with that of the fixed IP address.
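The comparison in this test procedure can be sketched as below, assuming each 30-minute probe has already been measured. The `Probe` record, its field names, and the category labels are hypothetical, and the sample timings are invented for illustration.

```python
from collections import Counter
from dataclasses import dataclass


@dataclass
class Probe:
    new_ip: str              # IP returned by the fresh DNS lookup
    lookup_s: float          # DNS lookup time, in seconds
    new_download_s: float    # download time from new_ip
    fixed_download_s: float  # download time from the stored (fixed) IP


def classify(probe, fixed_ip):
    """Label a probe by how the newly returned IP compares with the fixed one."""
    if probe.new_ip == fixed_ip:
        return "same_ip"
    if probe.new_download_s < probe.fixed_download_s:
        return "new_ip_faster"
    return "new_ip_not_faster"


def redirection_pays_off(probe):
    """True only if lookup plus new download beats staying on the fixed IP."""
    return probe.lookup_s + probe.new_download_s < probe.fixed_download_s


def summarize(probes, fixed_ip):
    """Count how many probes fall into each category."""
    return Counter(classify(p, fixed_ip) for p in probes)


# Invented sample: one probe per category, fixed IP is 10.20.30.1.
probes = [
    Probe("10.20.30.1", 0.05, 0.10, 0.10),
    Probe("10.20.30.2", 0.05, 0.08, 0.10),
    Probe("10.20.30.3", 0.01, 0.20, 0.10),
]
print(summarize(probes, "10.20.30.1"))
```

In the second probe the new server downloads faster, yet the lookup time more than cancels the gain, which is the kind of case where the DNS overhead dominates.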
Results
• In Jan 2001, 15% (Fasttide) to 70% (Digital Island) of the time the new IP is the same as the fixed one.
• In these cases the download time for the new IP is identical to that of the fixed IP, but the DNS lookup overhead undermines overall performance.
• 10% of the time, the download from the new IP address is faster, but again DNS lookup …
• 30-40% (Akamai) of the time, the new download time is more than that of the fixed IP address, again DNS lookup ...
• New download times are more than the fixed IP address download times.
• Overall, redirection is not efficient.
Some Facts ...
• CDNs are mainly used for image files (static content).
• Content served by the CDNs is static in nature. Only 0.3% of content changed for existing URLs, and at most 13% new URLs were introduced.
• Black-box performance testing, so there is no data about load balancing, only latency.
• Large increase in CDN deployment between Nov 99 (only 1-2% of the top 670 sites) and Dec 2000 (25% of the popular sites).
• Akamai seems to be the most popular CDN vendor.
• Images are 96-98% of the CDN-served content, but only 40-46% of the CDN-served bytes. Is the rest dynamic content?
• The cache hit rate for CDN-served images is 30-80%, versus only 25-60% for non-CDN-served images.
• IP addresses need to be mapped to geography for better CDN server selection.
• CDNs cannot be used for something that involves authentication etc.