Transcript

EyeQ: (An engineer's approach to) Taming network performance unpredictability in the Cloud
Vimal, Mohammad Alizadeh, Balaji Prabhakar, David Mazières, Changhoon Kim, Albert Greenberg
What are we depending on?
Many customers don't realise network issues:

"… in the Netflix data centers, we have a high capacity, super fast, highly reliable network. This has afforded us the luxury of designing around chatty APIs to remote systems. AWS networking has more variable latency."
(from "5 Lessons We've Learned Using AWS", http://techblog.netflix.com/2010/12/5lessons-weve-learned-using-aws.html)

Just "spin up more VMs!" makes the app more network-dependent. Overhaul apps to deal with variability.
Cloud: Warehouse Scale Computer
Multi-tenancy: to increase cluster utilisation
Provisioning the Warehouse:
• CPU, memory, disk
• Network
http://research.google.com/people/jeff/latency.html
Sharing the Network
• Policy
  – Sharing model: can we achieve "2GHz VCPU, 15GB memory, 1Gbps network"?
• Mechanism
  – Computing rates
  – Enforcing rates on entities…
    • Per-VM (multi-tenant)
    • Per-service (search, map-reduce, etc.)
Customer X specifies the thickness of each pipe; there is no traffic matrix (the Hose Model). See the sketch below.
[Figure: Tenant X's and Tenant Y's Virtual Switches, each connecting VM1, VM2, VM3, … VMn through a per-VM pipe]
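The hose model makes the "computing rates" part tractable: each VM buys only the thickness of its own pipe, so admission control is a per-link sum with no traffic matrix. A minimal sketch of that check (tenant names and numbers are illustrative, not from the talk):

    #include <stdio.h>

    /* A per-VM hose-model guarantee: the "thickness" of that VM's pipe. */
    struct hose {
        const char *vm;
        double gbps;
    };

    /* The hose model needs no traffic matrix: a set of guarantees is
       admissible on a link iff their sum fits within its capacity. */
    static int admissible(const struct hose *h, int n, double capacity_gbps)
    {
        double sum = 0.0;
        for (int i = 0; i < n; i++)
            sum += h[i].gbps;
        return sum <= capacity_gbps;
    }

    int main(void)
    {
        struct hose vms[] = { {"X.vm1", 4.0}, {"X.vm2", 3.0}, {"Y.vm1", 2.0} };
        printf("fits on a 10G link: %s\n",
               admissible(vms, 3, 10.0) ? "yes" : "no");
        return 0;
    }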
Why is it hard? (1)
• Default policy insufficient: 1 vs. many TCP flows, UDP, etc.
• Poor scalability of traditional QoS mechanisms
• Bandwidth demands can be…
  – Random, bursty (10–100MB)
  – Short: few-millisecond requests (10–100KB)
• Timescales matter!
  – Need guarantees on the order of a few RTTs (ms); see the arithmetic below.
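To put numbers on those timescales (my arithmetic, not a figure from the talk): a 100KB request on a 1Gbps VM pipe lasts 100,000 × 8 / 10^9 ≈ 0.8ms, and roughly 80µs at 10Gbps. An allocator that converges over seconds never even observes such a flow, so per-VM rates must settle within a few data-centre RTTs, i.e. around a millisecond.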
Seconds: Eternity
[Figure: one long-lived TCP flow and a bursty UDP session (ON: 5ms, OFF: 15ms) share a 10G pipe through a switch]
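The bursty half of this experiment is easy to reproduce. A minimal sketch of the ON/OFF UDP sender (destination address, port, and datagram size are placeholders, error handling is omitted; this is my illustration, not the tool used in the talk):

    #include <arpa/inet.h>
    #include <netinet/in.h>
    #include <sys/socket.h>
    #include <time.h>
    #include <unistd.h>

    static long long now_us(void)
    {
        struct timespec ts;
        clock_gettime(CLOCK_MONOTONIC, &ts);
        return ts.tv_sec * 1000000LL + ts.tv_nsec / 1000;
    }

    int main(void)
    {
        int fd = socket(AF_INET, SOCK_DGRAM, 0);
        struct sockaddr_in dst = {0};
        dst.sin_family = AF_INET;
        dst.sin_port = htons(9000);                     /* placeholder port */
        inet_pton(AF_INET, "10.0.0.2", &dst.sin_addr);  /* placeholder receiver */

        char payload[1400] = {0};                       /* one ~MTU datagram */
        for (;;) {
            long long start = now_us();
            while (now_us() - start < 5000)             /* ON: 5ms burst */
                sendto(fd, payload, sizeof(payload), 0,
                       (struct sockaddr *)&dst, sizeof(dst));
            usleep(15000);                              /* OFF: 15ms idle */
        }
    }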
Under the hood
[Figure: inside the shared switch]
Why is it hard? (2)
• Switch sees contention, but lacks VM state
• Receiver-host has VM state, but does not see contention
(1) Drops in the network: servers don't see true demand.
(2) Elusive TCP (back-off) makes true demand detection harder.
Key Idea: Bandwidth Headroom
• Bandwidth guarantees: managing congestion
• Congestion: link utilisation reaches 100%
  – At millisecond timescales
• Don't allow 100% utilisation
  – 10% headroom: early detection at the receiver
[Figure: single switch with headroom. N × 10G senders (TCP and UDP) share a pipe of capacity C; limit the receiver to γC, γ < 1, i.e. limit a 10G pipe to 9G]
What about a network?
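Before answering that, the single-switch rule fits in a few lines: sample the receive rate over millisecond windows and flag congestion once it crosses γC, while the physical link still has headroom. The 10% threshold and millisecond window come from the slide; the code is my sketch:

    #include <stdio.h>

    #define GAMMA      0.9    /* keep 10% headroom */
    #define LINK_GBPS 10.0    /* pipe capacity C */

    /* Called once per millisecond window with the bytes received in it;
       returns 1 when the aggregate rate crosses gamma*C (9G on a 10G
       link), which is the receiver's early-detection signal. */
    static int over_headroom(long long bytes, double window_ms)
    {
        double gbps = bytes * 8.0 / (window_ms * 1e6);
        return gbps > GAMMA * LINK_GBPS;
    }

    int main(void)
    {
        printf("%d\n", over_headroom(1200000, 1.0)); /* 9.6Gbps -> congested */
        printf("%d\n", over_headroom(1000000, 1.0)); /* 8.0Gbps -> fine */
        return 0;
    }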
Network design: the old
Over-subscription
http://bradhedlund.com/2012/04/30/network-that-doesnt-suckfor-cloud-and-big-data-interop-2012-session-teaser/
Network design: the new
(1) Uniform capacity across racks
(2) Over-subscription only at Top-of-Rack
http://bradhedlund.com/2012/04/30/network-that-doesnt-suckfor-cloud-and-big-data-interop-2012-session-teaser/
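To make the over-subscription ratio concrete (my worked example, not figures from the linked post): a rack of 40 servers with 10Gbps NICs offers 400Gbps toward the hosts; if the Top-of-Rack switch has 4 × 40Gbps uplinks, only 160Gbps can leave the rack, a 2.5:1 over-subscription. The "new" design confines that ratio to the ToR and keeps capacity uniform across racks.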
Mitigating Congestion in a Network
[Figure: VMs on servers, each attached to the fabric by a 10Gbps pipe. Load balancing (ECMP, etc.) spreads traffic across the fabric; admissibility (end-to-end congestion control, EyeQ) keeps each pipe's aggregate rate below 10Gbps. Aggregate rate < 10Gbps: congestion-free fabric. Aggregate rate > 10Gbps: fabric gets congested.]
Load balancing + Admissibility = Hotspot-free network core
[VL2, FatTree, Hedera, MicroTE]
EyeQ Platform
[Figure: on the TX side, untrusted VMs send through a Software VSwitch with adaptive rate limiters; on the RX side, untrusted VMs receive through a Software VSwitch with congestion detectors. TX packets (e.g. 3Gbps and 6Gbps allocations) cross the DataCentre fabric; congestion feedback travels back from RX to TX.]
The RX component detects; the TX component reacts: end-to-end flow control (VSwitch to VSwitch).
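Compressed into one control loop, the platform works roughly like this: the RX VSwitch's congestion detectors measure each destination's aggregate rate and send rate feedback; the TX VSwitch's adaptive rate limiters install whatever rate comes back. The back-off/probe rule below is a simple stand-in with equal weights for the two senders, not necessarily EyeQ's exact algorithm (EyeQ would split in proportion to the configured guarantees, e.g. the 6Gbps/3Gbps split on the next slide), so treat it as a sketch:

    #include <stdio.h>

    #define GAMMA_C 9.0   /* usable Gbps on a 10G pipe after 10% headroom */

    /* RX side (congestion detector): per millisecond, turn the measured
       aggregate rate into per-sender feedback. Back off proportionally
       when over gamma*C, otherwise probe upward for spare bandwidth. */
    static double rx_feedback(double aggregate_gbps, double sender_gbps)
    {
        if (aggregate_gbps > GAMMA_C)
            return sender_gbps * (GAMMA_C / aggregate_gbps);
        return sender_gbps + 0.1;
    }

    /* TX side (adaptive rate limiter): install the advertised rate. */
    static void tx_react(double *limiter_gbps, double feedback_gbps)
    {
        *limiter_gbps = feedback_gbps;
    }

    int main(void)
    {
        double tcp = 6.0, udp = 6.0;       /* both senders want 6Gbps */
        for (int ms = 0; ms < 5; ms++) {
            double seen = tcp + udp;       /* what the RX detector measures */
            tx_react(&tcp, rx_feedback(seen, tcp));
            tx_react(&udp, rx_feedback(seen, udp));
            printf("t=%dms  tcp=%.2f  udp=%.2f  total=%.2f\n",
                   ms, tcp, udp, tcp + udp);
        }
        return 0;
    }

After a few simulated milliseconds the two senders settle near 4.5Gbps each, keeping the aggregate just under the 9Gbps headroom cap; that edge-enforced admissibility is what keeps the fabric itself congestion free.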
Does it work?
[Figure: throughput without EyeQ vs. with EyeQ; with EyeQ the TCP tenant holds 6Gbps and the UDP tenant 3Gbps]
Improves utilisation. Provides protection.
State: only at the edge
[Figure: with EyeQ, the network behaves like One Big Switch]
EyeQ
Load balancing
+ Bandwidth headroom
+ Admissibility at millisecond timescales
= Network as one big switch
= Bandwidth sharing at the edge

Linux and Windows implementations for 10Gbps, in ~1700 lines of C code.
http://github.com/jvimal/perfiso_10g (Linux kmod)
No documentation, yet. 

Thanks!
[email protected]