SEEDING CLOUD-BASED SERVICES:
DISTRIBUTED RATE LIMITING (DRL)
Kevin Webb, Barath Raghavan, Kashi Vishwanath, Sriram
Ramabhadran, Kenneth Yocum, and Alex C. Snoeren
Seeding the Cloud
Technologies to deliver on the promise of cloud computing
Previously: Process data in the cloud (Mortar)
Produced/stored across providers
Find Ken Yocum or Dennis Logothetis for more info
Today: Control resource usage: “cloud control” with DRL
Use resources at multiple sites (e.g., CDN)
Complicates resource accounting and control
Provide cost control
DRL Overview
Example: Cost control in a Content Distribution Network
Abstraction: Enforce global rate limit across multiple sites
Simple example: 10 flows, each limited as if there were a single, central limiter
Diagram: with a single central limiter, 10 flows from source to destination share 100 KB/s; with DRL, one limiter carries 2 flows at 20 KB/s and another carries 8 flows at 80 KB/s, preserving the 100 KB/s aggregate
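To make the abstraction concrete, here is a minimal sketch (illustrative Python, not DRL's implementation; the names and units are assumptions) of splitting a global limit across limiters in proportion to their local flow counts, matching the 2-flow/8-flow example above:

GLOBAL_LIMIT_KBPS = 100  # aggregate limit, as in the example above

def proportional_shares(flow_counts, global_limit=GLOBAL_LIMIT_KBPS):
    """Return each limiter's share of the global limit, weighted by its flow count."""
    total = sum(flow_counts.values())
    if total == 0:
        # No demand anywhere: split evenly so idle limiters keep some headroom.
        return {site: global_limit / len(flow_counts) for site in flow_counts}
    return {site: global_limit * n / total for site, n in flow_counts.items()}

print(proportional_shares({"limiter-1": 2, "limiter-2": 8}))
# prints {'limiter-1': 20.0, 'limiter-2': 80.0}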
Goals & Challenges
Up to now:
Developed architecture and protocols for distributed rate limiting (SIGCOMM 07)
A particular approach (FPS) is practical in the wide area
Current goals:
Move DRL out of the lab and impact real services
Validate SIGCOMM results in real-world conditions
Provide Internet testbed with ability to manage bandwidth in a distributed fashion
Improve usability of PlanetLab
Challenges
Run-time overheads: CPU, memory, communication
Environment: link/node failures, software quirks
PlanetLab
World-wide test bed for networking and systems research
Resources donated by Universities, Labs, etc.
Experiments divided into VMs called “slices” (Vservers)
Diagram: a controller (web server, PLC API, PostgreSQL on Linux 2.6) manages nodes over the Internet; each node runs Linux 2.6 with Vservers hosting slices 1 through N
PlanetLab Use Cases
PlanetLab needs DRL!
Donated bandwidth
Ease of administration
Machine room: limit local-area nodes to a single rate
Per slice: limit experiments in the wide area
Per organization: limit all slices belonging to an organization
PlanetLab Use Cases
Machine room: limit local-area nodes with a single rate
Diagram: nodes running DRL sharing a machine-room limit (rates shown: 5 MBps and 1 MBps)
DRL Design
Each limiter runs a main event loop:
Estimate: observe and record outgoing demand
Allocate: determine the rate share of each node
Enforce: drop packets
Two allocation approaches:
GRD: global random drop (packet granularity)
FPS: flow proportional share (flow count as proxy for demand)
Diagram: input traffic -> Estimate -> Allocate (FPS, exchanging updates with other limiters at a regular interval) -> Enforce -> output traffic
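A minimal sketch of that loop (illustrative Python, not the actual C ulogd plug-in; the hook functions, interval, and units are assumptions):

import time

GLOBAL_LIMIT = 100_000   # bytes/sec shared across all limiters (assumed units)
INTERVAL = 0.5           # seconds between loop iterations (assumed value)

def run_limiter(observe_local_flows, exchange_demands, set_local_rate):
    """Hypothetical hooks: observe_local_flows() -> local flow count,
    exchange_demands(n) -> {limiter_id: flow_count} for all limiters,
    set_local_rate(rate) -> program the local enforcer (e.g., HTB)."""
    while True:
        # Estimate: observe and record outgoing demand.
        local_flows = observe_local_flows()
        # Allocate: FPS-style share, using flow count as a proxy for demand.
        demands = exchange_demands(local_flows)
        total = sum(demands.values()) or 1
        local_rate = GLOBAL_LIMIT * local_flows / total
        # Enforce: hand the computed rate to the local enforcement mechanism.
        set_local_rate(local_rate)
        time.sleep(INTERVAL)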
Implementation Architecture
Abstractions:
Limiter: handles communication, manages identities
Identity: parameters (limit, interval, etc.), machines and subsets
Built upon standard Linux tools…
Userspace packet logging (Ulogd)
Hierarchical Token Bucket
Mesh & gossip update protocols
Integrated with PlanetLab software
Diagram: input data -> Ulogd (Estimate) -> FPS (Allocate) -> HTB (Enforce) -> output data, repeated at a regular interval
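The mesh & gossip update protocols suggest limiters periodically exchanging their demand estimates; a rough sketch of such an update (illustrative Python; the message format, port, and peer list are assumptions, not DRL's wire protocol):

import json
import random
import socket

GOSSIP_PORT = 9000                  # assumed port
PEERS = ["10.0.0.2", "10.0.0.3"]    # assumed peer limiters

def send_update(local_id, flow_count, fanout=1):
    """Send this limiter's current demand estimate to a random subset of peers."""
    msg = json.dumps({"id": local_id, "flows": flow_count}).encode()
    sock = socket.socket(socket.AF_INET, socket.SOCK_DGRAM)
    for peer in random.sample(PEERS, min(fanout, len(PEERS))):
        sock.sendto(msg, (peer, GOSSIP_PORT))
    sock.close()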
Estimation using ulogd
Userspace logging daemon
Already used by PlanetLab for efficient abuse tracking
Packets tagged with slice ID by IPTables
Receives outgoing packet headers via netlink socket
DRL implemented as ulogd plug-in
Gives us efficient flow accounting for estimation
Executes the Estimate, Allocate, Enforce loop
Communicates with other limiters
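A sketch of the flow accounting that estimation needs (illustrative Python; in DRL this lives in the ulogd plug-in fed by netlink, and the flow tuple and timeout here are assumptions):

import time

FLOW_TIMEOUT = 5.0  # seconds of inactivity before a flow is forgotten (assumed)

class FlowTable:
    def __init__(self):
        self.last_seen = {}  # (src, dst, sport, dport, proto) -> last packet time

    def record_packet(self, src, dst, sport, dport, proto):
        """Called once per outgoing packet header."""
        self.last_seen[(src, dst, sport, dport, proto)] = time.time()

    def active_flow_count(self):
        """Expire idle flows, then report the demand proxy used by FPS."""
        now = time.time()
        self.last_seen = {k: t for k, t in self.last_seen.items()
                          if now - t < FLOW_TIMEOUT}
        return len(self.last_seen)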
Enforcement with Hierarchical Token Bucket
Linux Advanced Routing & Traffic Control
Hierarchy of rate limits
Enforces DRL’s rate limit
Packets attributed to leaves (slices)
Packets move up, borrowing from parents
Diagram: a token-bucket tree (Root with children A/B/C/D and leaves X/Y/Z, each with a token count) showing a 1500-byte packet borrowing tokens from its parents
Enforcement with Hierarchical Token Bucket
Uses same tree structure as PlanetLab
Efficient control of sub-trees
Updated every loop
Root limits whole node
Replenish each level
Diagram: the same token-bucket tree (Root, A-D, X-Z) as above
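A simplified model of the borrowing behavior described above (illustrative Python, not Linux HTB itself; rates and sizes are example values):

class Bucket:
    def __init__(self, rate, parent=None):
        self.rate = rate      # tokens (bytes) added per replenish round
        self.tokens = 0
        self.parent = parent

    def replenish(self):
        """Called every loop for each level of the tree."""
        self.tokens = self.rate

    def try_send(self, size):
        """Spend local tokens first, then borrow the remainder from parents."""
        available, node = 0, self
        while node is not None:
            available += node.tokens
            node = node.parent
        if available < size:
            return False      # not enough tokens anywhere up the tree: drop
        node, remaining = self, size
        while remaining > 0:
            take = min(node.tokens, remaining)
            node.tokens -= take
            remaining -= take
            node = node.parent
        return True

root = Bucket(rate=1000)                 # root limits the whole node
slice_x = Bucket(rate=100, parent=root)  # leaf for one slice
root.replenish(); slice_x.replenish()
print(slice_x.try_send(600))             # True: 100 local tokens + 500 borrowed from root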
Citadel Site
The Citadel (2 nodes)
Wanted 1 Mbps traffic limit
Added (horrible) traffic shaper
Poor responsiveness (2 – 15 seconds)
Running right now!
Cycles on and off every four minutes
Observe DRL’s impact without ground truth
Diagram: the Citadel nodes with DRL and the site’s traffic shaper
Citadel Results – Outgoing Traffic
Plot: outgoing traffic over time against the 1 Mbit/s limit, with DRL cycling off and on
Data logged from running nodes
Takeaways:
Without DRL, way over limit
One node sending more than the other
Citadel Results – Flow Counts
Plot: number of flows over time, with DRL cycling off and on
FPS uses flow count as proxy for demand
Citadel Results – Limits and Weights
Plot: FPS weight and rate limit over time
Lessons Learned
Flow counting is not always the best proxy for demand
FPS state transitions were irregular
Added checks and dampening/hysteresis in problem cases
Can estimate after enforce: ulogd only shows packets after HTB
FPS is forgiving of software limitations
HTB is difficult: HYSTERESIS variable, TCP segmentation offloading
Ongoing work
Other use cases
Larger-scale tests
Complete PlanetLab administrative interface
Standalone version
Continue DRL rollout on PlanetLab
UCSD’s PlanetLab nodes soon
Questions?
Code is available from PlanetLab svn
http://svn.planet-lab.org/svn/DistributedRateLimiting/