EuroSYS'08 - Fudan University
Having Fun with P2P at HUST
(gridcast+pktown)
Bin Cheng, Xiaofei Liao
HUST & MSRA
Huazhong University of Science & Technology
Microsoft Research Asia
AsiaSys, Shanghai, Nov. 12, 2008
1
GridCast: Video-on-Demand with Peer Sharing
Motivation
― Desirable but costly
Key issues
― Peer organization, scheduling, data distribution
Previous studies
― Measurement, optimization, caching, replication
Ongoing work
― Examine scheduling policies via simulation
Future work
― Analysis, model
2
Motivation of P2P VoD
VoD is increasingly popular, but costly to serve
―Hulu, YouTube, Youku
P2P has helped many applications
―File sharing
―Live streaming
How does P2P help VoD?
―Real-time playback
―Random seek
―The long tail of content
―With acceptable UX, how to minimize server load?
3
GridCast: Hybrid Architecture
― Tracker: indexes all joined peers
― Source Server: stores a complete copy of every video
― Peer: fetches chunks from source servers or other peers
― Web Portal: provides the video catalog
[Figure: architecture diagram showing peers connected to the tracker, web portal, and source server]
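To make the division of labor concrete, here is a minimal Python sketch of the hybrid fetch path. The class and method names (Tracker, SourceServer, Peer.fetch) are illustrative, not GridCast's real interfaces: a peer tries neighbors returned by the tracker first and falls back to the source server.

```python
class Tracker:
    """Indexes all joined peers, per video."""
    def __init__(self):
        self.index = {}                       # video_id -> set of peers

    def join(self, video_id, peer):
        self.index.setdefault(video_id, set()).add(peer)

    def neighbors(self, video_id, peer):
        return [p for p in self.index.get(video_id, set()) if p is not peer]

class SourceServer:
    """Stores a complete copy of every video, so it can always serve."""
    def fetch(self, video_id, chunk_id):
        return f"{video_id}:{chunk_id}"       # stand-in for real chunk bytes

class Peer:
    def __init__(self, tracker, server):
        self.tracker, self.server = tracker, server
        self.cache = {}                       # (video_id, chunk_id) -> data

    def has(self, video_id, chunk_id):
        return (video_id, chunk_id) in self.cache

    def fetch(self, video_id, chunk_id):
        key = (video_id, chunk_id)
        if key not in self.cache:
            # peers first: any neighbor in the same swarm that holds the chunk
            source = next((n for n in self.tracker.neighbors(video_id, self)
                           if n.has(video_id, chunk_id)), None)
            if source is not None:
                self.cache[key] = source.cache[key]
            else:
                # fallback: the source server always has a complete copy
                self.cache[key] = self.server.fetch(video_id, chunk_id)
        return self.cache[key]
```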
4
What does GridCast look like?
http://www.gridcast.cn
5
Deployment of GridCast
GridCast has been deployed on CERNET since May 2006
― Network (CERNET)
  • 1,500 universities, 20 million hosts
  • Good bandwidth, 2 to 100 Mbps to the desktop (the core is more complicated)
― Hardware
  • 1 Windows Server 2003 machine, shared by the tracker and the web portal
  • 2 source servers (sharing a 100 Mbps uplink)
― Content
  • 2,000 videos
  • 48 minutes long on average
  • 400 to 800 Kbps, 600 Kbps on average
― Users
  • 100,000 users (23% behind NATs)
  • 400 concurrent users at peak time (limited by our current infrastructure)
― Log (two logs, one for SVC, the other for MVC)
  • 40 GB of logs (from Sep. 2006 to Oct. 2007)
6
Key research issues in GridCast
How to organize online peers for better sharing?
How to schedule requests for smooth playback?
How to optimize chunk distribution over peers?
7
Previous work: ring-assisted overlay
Assumptions
―A huge number of peers watching the same video
―Each peer caches only the most recently watched 5 minutes of data
RINDY: ring-assisted overlay network for P2P VoD
―Each peer keeps a set of neighbors
―Near neighbors for sharing, far neighbors for routing
―Gossip + neighbor-list exchange
Advantages
―Fast relocation to a new neighborhood after a seek
―Load balance
―Efficient content sharing
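A rough sketch of the ring idea in Python. The ring spacing (RING_BASE) and the greedy relocation logic are assumptions for illustration, not RINDY's exact protocol: near neighbors exchange chunks, while far neighbors at exponentially growing playback offsets let a seek relocate in a few hops.

```python
RING_BASE = 300        # seconds; ring k spans offsets up to RING_BASE * 2**k

class RindyPeer:
    def __init__(self, offset):
        self.offset = offset          # current playback position (seconds)
        self.near = []                # same-ring peers: exchange chunks
        self.far = {}                 # ring index -> one contact for routing

    def ring_of(self, other):
        gap, k = abs(other.offset - self.offset), 0
        while gap >= RING_BASE * (2 ** k):
            k += 1
        return k

    def add_neighbor(self, other):
        k = self.ring_of(other)
        if k == 0:
            self.near.append(other)   # near neighbor: content sharing
        else:
            self.far[k] = other       # far neighbor: seek routing

    def route_seek(self, target, hops=0):
        """Greedily hop toward the target offset via far neighbors."""
        if abs(target - self.offset) < RING_BASE or not self.far:
            return self, hops         # landed in the right neighborhood
        best = min(self.far.values(), key=lambda p: abs(p.offset - target))
        if abs(best.offset - target) >= abs(self.offset - target):
            return self, hops         # no far neighbor gets us closer
        return best.route_seek(target, hops + 1)
```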
8
Previous work: measurement study
User behavior
― Random seeks are not uncommon (4 seeks per view session on average)
― Forward seeks dominate (forward/backward = 7/3)
― Short seeks dominate (80% < 5 minutes)
― The long tail of content popularity
Performance
― Simple prefetching helps to reduce seek latency (sketched below)
― Even moderate concurrency improves system utilization and UX
― UX correlates with source server stress and concurrency
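Because short forward seeks dominate, the "simple prefetching" above can be as little as keeping a window of upcoming chunks warm. A hedged sketch, where the window size is an assumed value:

```python
PREFETCH_WINDOW = 8          # chunks to fetch ahead; illustrative value

def schedule_requests(play_pos, cached, fetch):
    """fetch(chunk_id) requests one chunk; cached is a set of chunk ids.
    Fetches the playing chunk plus a small look-ahead window, so a short
    forward seek is likely to hit the local cache."""
    for chunk_id in range(play_pos, play_pos + 1 + PREFETCH_WINDOW):
        if chunk_id not in cached:
            fetch(chunk_id)

# Usage: call on every playback tick and again right after a seek, so the
# window re-centers on the new position.
```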
9
Previous work: from SVC to MVC
Single Video Caching (SVC)
― Only cache the current video for sharing
Multiple Video Caching (MVC)
― Cache recently watched videos within at most 1 GB of disk space (see the cache sketch below)
― Join multiple swarms
From SVC to MVC
― 90% of users have over 90% unused upload and 60% unused download capacity
― Upper bound obtained from simulation
[Figures: unused upload/download bandwidth capacity (%) vs. users (normalized), and source server load (Mbps) over the days of the week for single video caching vs. multiple video caching without resource constraints]
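A minimal sketch of an MVC cache under the 1 GB budget. The LRU eviction policy and the 1 MB chunk size are assumptions; the slide only fixes the budget.

```python
from collections import OrderedDict

CACHE_BUDGET = 1 << 30               # 1 GB, per the slide
CHUNK_SIZE = 1 << 20                 # 1 MB chunks; illustrative value

class MultiVideoCache:
    def __init__(self):
        self.chunks = OrderedDict()  # (video_id, chunk_id) -> data

    def put(self, video_id, chunk_id, data):
        key = (video_id, chunk_id)
        self.chunks[key] = data
        self.chunks.move_to_end(key)
        # evict least recently used chunks, across videos, to stay in budget
        while len(self.chunks) * CHUNK_SIZE > CACHE_BUDGET:
            self.chunks.popitem(last=False)

    def get(self, video_id, chunk_id):
        key = (video_id, chunk_id)
        if key in self.chunks:
            self.chunks.move_to_end(key)      # refresh recency
            return self.chunks[key]
        return None

    def videos(self):
        """Swarms this peer can join: every video it still holds chunks of."""
        return {vid for vid, _ in self.chunks}
```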
10
Previous work: examining caching policies
Major results
―Improve both UX and scalability!
―Higher concurrency, better sharing
―Larger scale, higher sharing utilization
11
Previous work: examining caching policies
Limitation
―A larger cache is not always better
• Hot spots: load imbalance becomes more serious
―Departure misses are a major issue
• 43% of chunk misses are caused by peer departure
[Figure: chunk misses by cause, as a percentage of all played chunks; peer departure accounts for 43%, with the remainder split among new content, peer eviction, connection issues, and insufficient bandwidth]
12
Previous work: proactive replication
Basic idea
―Push chunks to other peers before leaving
Fundamental tradeoff
―Cost: use more bandwidth and disk space
―Benefit: cover more future requests, reduce misses
Three questions
―Which, where, when?
Two predictors as its building blocks
―Peer departure predictor
―Chunk request predictor
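The which/where/when decision can be phrased directly in terms of the two predictors. A sketch with both predictors stubbed by simple heuristics (the real system would learn them from logs; everything here is an assumed placeholder):

```python
class Peer:
    def __init__(self, play_pos, video_len, cached_chunks):
        self.play_pos = play_pos              # current chunk index
        self.video_len = video_len            # total chunks in the video
        self.cached_chunks = set(cached_chunks)

def departure_likely(peer):
    """Peer departure predictor (stub): near the end of the video."""
    return peer.play_pos >= 0.9 * peer.video_len

def request_probability(chunk_id, peer):
    """Chunk request predictor (stub): chunks just ahead of a peer's
    playback position are the most likely next requests."""
    gap = chunk_id - peer.play_pos
    return 1.0 / (1 + gap) if gap >= 0 else 0.0

def plan_replication(me, neighbors, budget_chunks):
    """WHEN: once departure looks likely. WHICH/WHERE: rank (chunk, target)
    pairs by predicted demand and push the top ones within the budget,
    trading bandwidth now for fewer departure misses later."""
    if not departure_likely(me):
        return []
    candidates = [(request_probability(c, nbr), c, nbr)
                  for c in me.cached_chunks
                  for nbr in neighbors
                  if c not in nbr.cached_chunks]
    candidates.sort(key=lambda t: t[0], reverse=True)
    return [(c, nbr) for score, c, nbr in candidates[:budget_chunks]
            if score > 0]
```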
13
Previous work: proactive replication
Major results
― 50% fewer chunk misses, 15% lower server load
― Lazy-simple comes close to lazy-oracle
― Aggressive replication hurts performance due to its higher cost
[Figure: number of chunks by category (replication, SS load, new content, departure, eviction, connection, bandwidth) for before replication, eager replication (efficiency = 0.21), lazy-oracle (a=0.0, efficiency = 0.78), and lazy-simple (a=0.0, efficiency = 0.33)]
14
Ongoing work: understanding scheduling
Scheduling
―Which chunk to request from which neighbor (a baseline sketch follows below)
―When
• Periodically
• When the last requested chunk comes back
Adaptive scheduling algorithm
―Better suited to random seeks
―Metrics: continuity, chunk cost, number of redundant transmissions
―Preliminary results have been generated
―More analysis required
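As referenced above, a baseline scheduler sketch: deadline-first for continuity, rarest-first as the tie-breaker, least-loaded holder as the target. The window size and the Neighbor stand-in class are assumptions; this is not the adaptive algorithm under evaluation.

```python
class Neighbor:
    """Stand-in for a connected peer: which chunks it holds, and how many
    requests we have already queued on it."""
    def __init__(self, chunks):
        self.chunks = set(chunks)
        self.pending = 0

    def has(self, chunk_id):
        return chunk_id in self.chunks

def schedule(missing_chunks, play_pos, neighbors, window=32):
    """Pick (chunk, neighbor) pairs for the chunks due soonest."""
    urgent = [c for c in missing_chunks if play_pos <= c < play_pos + window]

    def rarity(c):
        return sum(1 for n in neighbors if n.has(c))  # fewer holders = rarer

    plan = []
    # deadline first for continuity, then rarest-first among equal deadlines
    for c in sorted(urgent, key=lambda c: (c - play_pos, rarity(c))):
        holders = [n for n in neighbors if n.has(c)]
        if holders:
            target = min(holders, key=lambda n: n.pending)  # balance load
            target.pending += 1
            plan.append((c, target))
    return plan
```

Run it on either of the two triggers named above: periodically, or whenever an outstanding request completes.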
15
Ongoing work: reducing hot-spots
Why does a peer become over-utilized?
―Too many requests
―Essentially, a larger cache but limited bandwidth
Solutions (a sketch of the first one follows)
―Announce a fake buffer map (BM) to other peers when overloaded
―Transfer hot content to other peers with more upload capacity
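A sketch of the first remedy: thinning the advertised buffer map (BM) under overload so fewer requests arrive. The overload threshold and the random thinning rule are assumptions.

```python
import random

OVERLOAD_PENDING = 16    # queued uploads above which we shed load (assumed)

def announced_bm(real_bm, pending_uploads, hide_fraction=0.5):
    """real_bm: set of chunk ids this peer actually holds. Returns the BM
    to gossip to neighbors; under overload a random subset of chunks is
    hidden, so neighbors direct those requests elsewhere."""
    if pending_uploads <= OVERLOAD_PENDING or not real_bm:
        return set(real_bm)
    keep = max(1, int(len(real_bm) * (1 - hide_fraction)))
    return set(random.sample(sorted(real_bm), keep))
```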
16
Future work
Use a log-mining approach to help scheduling
Optimize content distribution using social networks
Develop models to understand caching
17
PKTown: Gaming Platform with Peer Assistance
Basic idea
―Objective, features
Research issues
―Routing, relaying, and so on
Current status
18
Basic idea of PKTown
Launch a platform for gaming
Construct a large-scale virtual LAN using P2P
Use application-layer multicast to deliver broadcast messages (sketched below)
Self-organized
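A sketch of the core mechanism, with all structure assumed: the local agent intercepts a game's LAN broadcast and forwards it down an application-layer multicast tree, so every virtual-LAN member sees it as if it came off a physical subnet.

```python
class VirtualLANNode:
    def __init__(self, node_id):
        self.node_id = node_id
        self.children = []            # downstream nodes in the ALM tree

    def attach(self, child):
        self.children.append(child)

    def on_lan_broadcast(self, payload):
        """Entry point: the agent caught a broadcast from the local game."""
        self.deliver(payload, sender=self.node_id)

    def deliver(self, payload, sender):
        if self.node_id != sender:    # don't echo back to the originator
            print(f"node {self.node_id}: game receives {payload!r}")
        for child in self.children:   # forward down the multicast tree
            child.deliver(payload, sender)

# Usage: build a tree of nodes, then root.on_lan_broadcast(b"discover")
# makes every other member's game see the packet.
```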
19
Major research issues
How to organize all peers together with small latency?
How to find a better relay node to minimize the communication latency between two peers? (see the sketch below)
How to determine the best path to efficiently deliver one message to all others, possibly with a latency bound?
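For the relay question, a brute-force baseline (not PKTown's actual algorithm): a relay helps the pair only when the two-hop path through it beats the direct latency.

```python
def best_relay(a, b, candidates, latency):
    """latency(x, y) -> measured latency in ms between two peers.
    Returns (relay, cost); relay is None when the direct path wins."""
    direct = latency(a, b)
    best, best_cost = None, direct
    for r in candidates:
        if r in (a, b):
            continue
        cost = latency(a, r) + latency(r, b)   # two-hop path through r
        if cost < best_cost:
            best, best_cost = r, cost
    return best, best_cost
```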
20
Current status
Deployed over CERNET
About 1,000 concurrent users at peak time
21
Summary
Discuss major issues in P2P VoD and P2P gaming (bandwidth-intensive & latency-intensive)
Present some observations from a live P2P VoD system
Raise some open questions for further study
―Load balance
―Best relay
22
http://www.gridcast.cn
http://www.pktown.net
Thanks!
Any questions?
23