Technologies for Building Content Delivery Networks

Pei Cao
Cisco Systems, Inc.
What are Content Delivery Networks?
• A centrally managed network of devices
that collectively facilitate the delivery of
content to end users
• Solve network bandwidth bottleneck
• Solve server throughput bottleneck
CDN Categories
• Network Infrastructure:
– Single ISP
– Overlay networks
– Enterprise premise
• Content types:
– Static images and text
– Multimedia content: audio and video streams
– Dynamic HTML and XML pages
• Customers:
– Content providers
– Enterprise
Technology Components
• Content distribution
– Placing the content on the delivery devices
• Request routing
– Steering users to a nearby delivery node
• Content delivery
– Protocol processing, access control, QoS mechanisms
• Resource accounting
– Logging and billing
Content Distribution
• Goal: position content objects on delivery devices
• Different content types use different techniques
– Static images and text: pulled & cached, or pushed
– Multimedia content: usually pre-positioned
– Dynamic pages: require prior setup
Distribution Mechanisms
• HTTP request for pulling
– Example: standard HTTP reverse proxy
• FTP of tar files
– Some equipment vendors use this technique
• Rate-limited tree-form replication
– Example: Cisco’s “Soda” algorithm
Distribution Mechanisms using Multicast
• Application-level reliable multicast
– Example: Inktomi’s Fast-Forward
• Unreliable IP multicast with file-level error
correction
– Example: Digital Fountain, multicast-ftp
• Unreliable IP multicast
– Example: RealNetworks
Content Consistency Mechanisms
• Expiration times or TTL
• Renaming in the HTML file
• Web Cache Invalidation Protocol (WCIP)
– Nodes receive invalidations when objects change
– Objects are organized into channels
– Nodes subscribe to a channel to receive its invalidations
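The two consistency styles above, expiration by TTL and channel-based invalidation, can be illustrated with a minimal sketch. This is not the WCIP wire protocol; the class and method names (`CacheNode`, `InvalidationChannel`, `publish`) are hypothetical and only model the idea that nodes subscribe to a channel and drop an object when it changes.

```python
import time
from collections import defaultdict


class CacheNode:
    """Hypothetical delivery node that caches objects with a TTL."""

    def __init__(self, name):
        self.name = name
        self.store = {}  # url -> (body, expires_at)

    def put(self, url, body, ttl, now=None):
        now = time.time() if now is None else now
        self.store[url] = (body, now + ttl)

    def get(self, url, now=None):
        """Return the cached body, or None if missing or expired."""
        now = time.time() if now is None else now
        entry = self.store.get(url)
        if entry is None or entry[1] <= now:
            self.store.pop(url, None)  # drop the stale entry, if any
            return None
        return entry[0]

    def invalidate(self, url):
        self.store.pop(url, None)


class InvalidationChannel:
    """Hypothetical publisher: objects are grouped into named channels."""

    def __init__(self):
        self.subscribers = defaultdict(list)  # channel -> [CacheNode]

    def subscribe(self, channel, node):
        self.subscribers[channel].append(node)

    def publish(self, channel, url):
        # Tell every subscribed node that the object has changed.
        for node in self.subscribers[channel]:
            node.invalidate(url)
```

With TTLs alone a node may serve a changed object until the timer runs out; the invalidation path removes it as soon as the origin announces the change.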
Request Routing
• Goal: steer the client so that it fetches the content from a nearby node
• Methods
– DNS selection
– HTTP redirection
– Transparent interception
Overview of Request Arrival Process
How a request for www.xyz.com/index.html arrives at 1.2.3.4:
1. Client to DNS server: what is IP addr of www.xyz.com?
2. DNS server to Root NS: where is name server of xyz.com?
3. Root NS to DNS server: NS record: 1.2.3.1
4. DNS server to xyz.com NS (IP: 1.2.3.1): what is IP of www.xyz.com?
5. xyz.com NS to DNS server: A record: 1.2.3.4
6. DNS server to Client: 1.2.3.4
7. Client to Server (IP: 1.2.3.4), via router and switch: GET /index.html
DNS selection
• Basic idea: xyz.com’s NS returns a node close to the client
• How to become xyz.com’s NS?
– Rewrite URLs (aka Akamaizer)
– Take a subdomain cdn.xyz.com and put all content there
• Accuracy is limited to the location of the client’s name server
– Only suitable for ISP or overlay networks
– Not suitable for some enterprise or cable networks
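The URL-rewriting step above can be sketched as follows. This is only an illustration: the function name `rewrite_to_cdn` is hypothetical, the host names are the ones used on the slide, and a real rewriting tool would also handle ports, relative links and embedded HTML.

```python
from urllib.parse import urlsplit, urlunsplit


def rewrite_to_cdn(url, origin_host="www.xyz.com", cdn_host="cdn.xyz.com"):
    """Point a URL at the CDN subdomain so the CDN's name server resolves it.

    URLs on any other host are returned unchanged.
    """
    parts = urlsplit(url)
    if parts.hostname != origin_host:
        return url
    # Swap only the host; scheme, path, query and fragment are kept.
    return urlunsplit(parts._replace(netloc=cdn_host))
```

Once embedded URLs name cdn.xyz.com, lookups for that subdomain reach the CDN's name server, which can then answer with the address of a nearby node.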
HTTP Redirection
• Basic idea: web server tells client to go
somewhere else
– Returns “302 redirect … 1.2.4.5/index.html…”
• Mostly used for multimedia objects
– These objects are usually listed together in an index file (.smil or .asx), and clients fetch the index file via HTTP before streaming
• Accuracy is at individual client level
– More suitable for enterprise and cable networks
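A redirect of this kind is just an HTTP response with a 302 status and a Location header. The sketch below builds such a response as text; the helper name `redirect_response` is hypothetical, and the node address is assumed to have been chosen already by whatever selection logic the CDN uses.

```python
def redirect_response(node_ip, path):
    """Build a minimal HTTP 302 response sending the client to node_ip."""
    location = f"http://{node_ip}{path}"
    return (
        "HTTP/1.1 302 Found\r\n"
        f"Location: {location}\r\n"
        "Content-Length: 0\r\n"
        "\r\n"
    )
```

For example, `redirect_response("1.2.4.5", "/index.html")` yields a response whose Location header is `http://1.2.4.5/index.html`. Because the web server sees the client's own address on the connection, the choice can be made per client rather than per name server.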
Transparent Interception
• Routers and switches along the request path can send the request elsewhere
• Mostly used for distributed data centers front-ended with L7 switches
– Example: Cisco’s CSS11k WebNS
Algorithms for Request Routing
• Map-based
– Create a map of the Internet based on AS
domains, pick the node with the shortest hop
count to client
– Or, set up coverage zones mapping a node to a
collection of subnets
• Racing-based
– Let the delivery nodes all race to the client with
A-records
– Winner is selected by client automatically
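The coverage-zone variant of map-based routing can be sketched with Python's standard `ipaddress` module. The node names and subnets below are made up for illustration, and `build_zone_table` and `pick_node` are hypothetical helpers, not part of any product.

```python
import ipaddress

# Hypothetical coverage zones: each node serves a collection of subnets.
COVERAGE_ZONES = {
    "node-east": ["10.1.0.0/16", "10.2.0.0/16"],
    "node-west": ["10.8.0.0/14"],
}


def build_zone_table(zones):
    """Flatten the zone map into (network, node) pairs, longest prefix first."""
    table = []
    for node, subnets in zones.items():
        for subnet in subnets:
            table.append((ipaddress.ip_network(subnet), node))
    table.sort(key=lambda entry: entry[0].prefixlen, reverse=True)
    return table


def pick_node(client_ip, table, default=None):
    """Return the node whose coverage zone contains client_ip."""
    addr = ipaddress.ip_address(client_ip)
    for network, node in table:
        if addr in network:
            return node
    return default
```

Sorting by prefix length means that if zones overlap, the most specific subnet decides. A linear scan is fine for a sketch; a large deployment would more likely use a prefix tree.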
The Boomerang Algorithm
• Cisco’s research published in WCW’01
– xyz.com’s NS server forwards the lookup of www.xyz.com to all delivery nodes
– Each delivery node sends an “A record” response with its own IP address to the client
– The response that reaches the client first wins
– The NS server times the forwarding so that the lookup message arrives at all nodes at around the same time
– Use “simulated annealing” for scalability
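The timing step can be illustrated with a toy delay model. This is a sketch under stated assumptions, not the published algorithm: it assumes the NS server already knows its one-way delay to each node, it ignores jitter and loss, and it leaves out the simulated-annealing part entirely. The function names are hypothetical.

```python
def forwarding_schedule(ns_to_node_ms):
    """How long the NS server waits before forwarding to each node.

    The farthest node is sent to immediately; nearer nodes are held back
    so that the lookup reaches every node at the same moment.
    """
    slowest = max(ns_to_node_ms.values())
    return {node: slowest - delay for node, delay in ns_to_node_ms.items()}


def race_winner(ns_to_node_ms, node_to_client_ms):
    """Simulate the race: the node whose answer reaches the client first."""
    schedule = forwarding_schedule(ns_to_node_ms)
    arrival = {
        node: schedule[node] + ns_to_node_ms[node] + node_to_client_ms[node]
        for node in schedule
    }
    return min(arrival, key=arrival.get)
```

Because every node receives the lookup at the same time, the only term that differs between nodes is the node-to-client delay, so the race picks the node closest to the client in network terms.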
Interaction between Content Distribution and Request Routing
• Don’t route a request to a node that doesn’t have the content!
• Particularly important for large streaming content
– Such content is usually pre-positioned to ensure high-bandwidth playback
• Nodes need to report their content acquisition status to the “request router”
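One way to picture this coupling is a request router that tracks which nodes have finished acquiring each object and filters its candidates accordingly. The class below is a hypothetical sketch; the method names and the idea of a preference-ordered candidate list are assumptions made for illustration.

```python
from collections import defaultdict


class RequestRouter:
    """Hypothetical router that only considers nodes holding the content."""

    def __init__(self):
        self.ready = defaultdict(set)  # node -> ids of fully acquired content

    def report(self, node, content_id, complete):
        """Called by a node to report its acquisition status for an object."""
        if complete:
            self.ready[node].add(content_id)
        else:
            self.ready[node].discard(content_id)

    def eligible(self, content_id, nodes_by_preference):
        """Keep the preference order, dropping nodes without the content."""
        return [n for n in nodes_by_preference if content_id in self.ready[n]]
```

The routing method (DNS, redirect or interception) then picks from the filtered list, so proximity is only weighed among nodes that can actually serve the stream.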
Content Delivery
• Goal: serve content to each client at desired
quality of service
• Supported protocols
– HTTP
– Microsoft MMS
– Open standard RTP/RTSP
– RealNetworks RTP/RTSP
• Usually part of the larger CDN system
Content Access Control
• Content object attributes
– “Publication date” and “Expiration date”
– ACL based on user/group/IP
• User authentication
– HTTP basic
– Microsoft NTLM for enterprise environment
– other schemes
• Media Rights Management
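The object attributes above translate into a simple check before serving. The sketch below assumes a hypothetical object record with `publish`, `expire` and `allowed_networks` fields, and it covers only the date window and an IP-based ACL; user and group checks, authentication and rights management are left out.

```python
import ipaddress
from datetime import datetime


def may_serve(obj, client_ip, now):
    """Return True if the object is within its date window and the client
    address falls inside one of the object's allowed networks."""
    if not (obj["publish"] <= now < obj["expire"]):
        return False
    addr = ipaddress.ip_address(client_ip)
    return any(
        addr in ipaddress.ip_network(net) for net in obj["allowed_networks"]
    )
```

A caller would pass the current time explicitly, for example `may_serve(obj, "10.1.2.3", datetime.now())`, which also keeps the check easy to test.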
QoS of Content Delivery
• Server QoS
– Server needs to make sure it has enough CPU and disk to service the stream at the specified bit rate
• Network QoS
– Interoperate with routers via DiffServ bits
• Coordination with request router
– Delivery devices should communicate load information to the “Request Router”
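Server-side QoS can be pictured as admission control: accept a stream only if the remaining budget covers its bit rate. The sketch below is a simplification that folds CPU and disk into a single bandwidth budget; the class name and the kbps units are assumptions made for illustration.

```python
class StreamAdmission:
    """Hypothetical admission controller with one bit-rate budget."""

    def __init__(self, capacity_kbps):
        self.capacity_kbps = capacity_kbps
        self.committed_kbps = 0

    def admit(self, bitrate_kbps):
        """Reserve capacity for a new stream, or refuse it."""
        if self.committed_kbps + bitrate_kbps > self.capacity_kbps:
            return False
        self.committed_kbps += bitrate_kbps
        return True

    def release(self, bitrate_kbps):
        """Give capacity back when a stream ends."""
        self.committed_kbps = max(0, self.committed_kbps - bitrate_kbps)

    def load(self):
        """Fraction of capacity in use; the figure a node could report
        to the request router."""
        return self.committed_kbps / self.capacity_kbps
```

Refusing a stream up front protects the streams already playing, and the reported load lets the request router steer new clients to less busy nodes.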
Resource Accounting
• Mining the log files
– Log file aggregation: all devices send log files to a central location
– Local mining: analyzing the log file at each delivery device
• Real-time statistics
– Real-time statistics on throughput/latency based
on domain, content type or any HTTP header
– Example: Cisco CSS switch billing MIB
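Central log aggregation for billing can be sketched as summing bytes served per customer domain across all devices. The log format here (one `domain path bytes` record per line) is invented for the example; real access logs carry more fields and need a proper parser.

```python
from collections import Counter


def aggregate_bytes_by_domain(device_logs):
    """Sum bytes served per domain across every device's log lines.

    device_logs maps a device name to a list of 'domain path bytes' lines.
    """
    totals = Counter()
    for lines in device_logs.values():
        for line in lines:
            domain, _path, nbytes = line.split()
            totals[domain] += int(nbytes)
    return totals
```

The same function could run on each device for local mining, with only the per-domain totals sent to the central location.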
Cisco’s CDN Products
• Content Distribution Manager (CDM)
• Content Router (CR)
• Content Engine (CE)
• CSS switch
Summary
• Main components of building a CDN:
– Content distribution
– Request routing
– Content delivery
– Resource accounting
• A CDN system requires the four components
to work in concert with each other!
• Cisco is the only vendor that provides the full solution!