Transcript Slide 1

Sizing Router Buffers
Guido Appenzeller
Thesis Defense
May 24th, 2004
Routers need Packet Buffers
 It’s well known that routers need packet buffers
 It’s less clear why and how much
 Goal of this work is to answer the question:
How much buffering do routers need?
2
How much Buffer does a Router need?
[Figure: Source → Router → Destination; bottleneck link of capacity C, two-way propagation delay 2T]
 Universally applied rule-of-thumb:
 A router needs a buffer size: $B = 2T \times C$
 2T is the two-way propagation delay (or just 250ms)
 C is capacity of bottleneck link
 Context
 Mandated in backbone and edge routers.
 Appears in RFPs and IETF architectural guidelines.
 Usually referenced to Villamizar and Song: “High Performance
TCP in ANSNET”, CCR, 1994.
 Already known by inventors of TCP [Van Jacobson, 1988]
 Has major consequences for router design
3
Example
 10Gb/s linecard
 Requires 300Mbytes of buffering.
 Read and write 40 byte packet every 32ns.
 Memory technologies
 DRAM: requires 4 devices, but is too slow.
 SRAM: requires 80 devices, 1kW, $2000.
 Problem gets harder at 40Gb/s
 Hence RLDRAM, FCRAM, etc.
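The figures on this slide follow from the rule of thumb with 2T = 250 ms. The short Python sketch below redoes that arithmetic; the function names are illustrative, not from the talk, and it assumes decimal units (1 Gb/s = 1e9 bit/s).

```python
def rule_of_thumb_buffer_bits(two_t_s: float, capacity_bps: float) -> float:
    """Rule-of-thumb buffer B = 2T x C, in bits."""
    return two_t_s * capacity_bps


def packet_time_ns(packet_bytes: int, capacity_bps: float) -> float:
    """Time to receive one packet at line rate, in nanoseconds."""
    return packet_bytes * 8 / capacity_bps * 1e9


# Assumed inputs: 2T = 250 ms, C = 10 Gb/s, minimum-size 40 byte packets.
buffer_bits = rule_of_thumb_buffer_bits(0.25, 10e9)
buffer_mbytes = buffer_bits / 8 / 1e6   # about 312 Mbytes, i.e. roughly 300 Mbytes
slot_ns = packet_time_ns(40, 10e9)      # about 32 ns per 40 byte packet

print(f"buffer: {buffer_mbytes:.1f} Mbytes, packet time: {slot_ns:.1f} ns")
```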
4
Outline of this Work
 Main Results
 The rule of thumb is wrong for core routers today
 Required buffer is $\frac{2T \times C}{\sqrt{n}}$ instead of $2T \times C$
 Outline of this talk
 Where the rule of thumb comes from
 Why it is incorrect for a core router in the internet today
 Correct buffer requirements for a congested router
 Buffer requirements for short flows (slow-start)
 Experimental Verification
 Conclusion
5
Outline
 The Rule of Thumb
 Where does the rule of thumb come from? (Answer: TCP)
 Interaction of TCP flows and router buffers
 The buffer requirements for a congested router
 Buffer requirements for short flows (slow-start)
 Experimental Verification
 Conclusion
6
TCP
[Figure: Source → Router → Dest; access link $C' > C$ feeds the bottleneck link of capacity $C$; only W = 2 packets may be outstanding]
 TCP Congestion Window controls the sending rate
 Sender sends packets, receiver sends ACKs
 Sending rate is controlled by Window W,
 At any time, only W unacknowledged packets may be outstanding
 The sending rate of TCP is $R = \frac{W}{RTT}$
7
Single TCP Flow
Router without buffers
[Figure: Source → Router → Dest with $C' > C$; only W packets may be outstanding. Plot of congestion window W over time t: the link is not fully utilized]
Rule for adjusting W
 If an ACK is received: W ← W+1/W
 If a packet is lost: W ← W/2
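The window adjustment rule on this slide (add 1/W per ACK, halve W on a loss) can be sketched in a few lines of Python. This is a minimal illustration under simplifying assumptions, not the simulator used in the talk: the helper names are hypothetical, and one "round" simply applies one ACK per outstanding packet.

```python
def on_ack(w: float) -> float:
    """Additive increase: each ACK grows the window by 1/W."""
    return w + 1.0 / w


def on_loss(w: float) -> float:
    """Multiplicative decrease: a lost packet halves the window."""
    return w / 2.0


def window_after_round(w: float) -> float:
    """Apply one ACK per outstanding packet, i.e. roughly one RTT."""
    for _ in range(int(w)):
        w = on_ack(w)
    return w


w = 8.0
w = window_after_round(w)   # grows by a little less than one packet per RTT
grown = w
w = on_loss(w)              # then a drop cuts the window in half
print(f"after one RTT: {grown:.2f}, after a loss: {w:.2f}")
```

Repeating the round until a loss occurs and then halving reproduces the sawtooth shape of W over time.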
8
Single TCP Flow
Router with large enough buffers for full link utilization
For every W ACKs received,
send W+1 packets
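[Figure: Source → Router with buffer B → Dest, $C' > C$. Plot of window size over time t: sawtooth between $W_{max}/2$ and $W_{max}$, growing once per RTT; sending rate $R = W/RTT$]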
9
Required buffer is height of
sawtooth
[Figure: buffer occupancy between 0 and B over time t]
10
Buffer = rule of thumb
11
Microscopic TCP Behavior
When sender pauses, buffer drains
Drop
one RTT
12
Over-buffered Link
13
Under-buffered Link
14
Origin of rule-of-thumb
 Before and after reducing the window size, the sending rate of the TCP sender is the same:
$R_{old} = R_{new}$
 Inserting the rate equation we get
$\frac{W_{old}}{RTT_{old}} = \frac{W_{new}}{RTT_{new}}$
 The RTT is part two-way propagation delay $2T$ and part queuing delay $B/C$. We know that after reducing the window, the queuing delay is zero.
 With $W_{new} = W_{old}(1 - \frac{1}{n})$, where $n = 2$ for TCP Reno:
$\frac{W_{old}}{2T + B/C} = \frac{W_{old}/2}{2T}$
 In general $B = \frac{1}{n-1} \, 2T \times C$, which for $n = 2$ gives
$B = 2T \times C$
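As a numerical check of this derivation, the sketch below computes the buffer for a sender that cuts its window to W(1 − 1/n) on a loss and confirms that the sending rate W/RTT is unchanged across the cut. Here n is the decrease parameter (n = 2 for TCP Reno), not the number of flows; the function names and the 2T = 250 ms, C = 10 Gb/s inputs are assumptions for illustration.

```python
def buffer_for_decrease(two_t_s: float, capacity_bps: float, n: float) -> float:
    """Buffer B = 2T x C / (n - 1) for a window cut to W * (1 - 1/n)."""
    if n <= 1:
        raise ValueError("n must be greater than 1")
    return two_t_s * capacity_bps / (n - 1)


def rate_ratio_across_loss(two_t_s: float, capacity_bps: float,
                           buffer_bits: float, n: float) -> float:
    """R_new / R_old when the buffer is full before the cut and empty after."""
    rtt_old = two_t_s + buffer_bits / capacity_bps   # full buffer adds B/C
    rtt_new = two_t_s                                # queue has drained
    return ((1 - 1 / n) / rtt_new) / (1 / rtt_old)


reno_buffer = buffer_for_decrease(0.25, 10e9, 2)     # equals 2T x C
gentle_buffer = buffer_for_decrease(0.25, 10e9, 4)   # one third of 2T x C
print(reno_buffer, gentle_buffer)
print(rate_ratio_across_loss(0.25, 10e9, gentle_buffer, 4))
```

A ratio of 1 means the link stays exactly full: the buffer drains just as the reduced window catches up.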
15
Rule-of-thumb
 Rule-of-thumb makes sense for one flow
 Typical backbone link has > 20,000 flows
 Does the rule-of-thumb still hold?
 Answer:
 If flows are perfectly synchronized, then Yes.
 If flows are desynchronized then No.
16
Outline
 The Rule of Thumb
 The buffer requirements for a congested router
 Synchronized flows
 Desynchronized flows
 The 2T×C/sqrt(n) rule
 Buffer requirements for short flows (slow-start)
 Experimental Verification
 Conclusion
17
If flows are synchronized
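[Figure: window size over time t; each flow's sawtooth between $W_{max}/2$ and $W_{max}$, and the aggregate sawtooth between $\sum W_{max}/2$ and $\sum W_{max}$]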
 Aggregate window has same dynamics
 Therefore buffer occupancy has same dynamics
 Rule-of-thumb still holds.
18
When are Flows Synchronized?
 Small numbers of flows tend to synchronize
 In ns2 simulation they are synchronized
 In at least some cases holds for real networks as well
 Large aggregates of flows are not synchronized
 For > 500 flows, synchronization disappears in ns2
 On a Cisco GSR, 100 flows were not synchronized
 Measurements in the core give no indication of
synchronization
 C. Fraleigh, “Provisioning Internet Backbone Networks to support Latency Sensitive Applications”, Ph.D. Thesis, Stanford
 Hohn, Veitch, Papagiannaki and Diot – “Bridging Router
Performance and Queuing Theory”
19
If flows are not synchronized
[Figure: aggregate window W and buffer occupancy between 0 and B over time; probability distribution of the buffer size]
20
Quantitative Model
 Model the congestion window of a flow as a random variable: $W_i(t)$ is modeled as $W_i$, where $P[W_i = x] = f(x)$
 For many de-synchronized flows
 We assume congestion windows are independent
 All congestion windows have the same probability distribution, with $E[W_i] = \mu_W$ and $var[W_i] = \sigma_W^2$
 Now the central limit theorem gives us the distribution of the sum of the window sizes:
$\sum_{i=1}^{n} W_i(t) \rightarrow n \mu_W + \sqrt{n} \, \sigma_W \, N(0,1)$
21
Buffer vs. Number of Flows
for a given Bandwidth
 If for a single flow we have $E[W] = \mu_{n=1}$ and $var[W] = \sigma_{n=1}^2$
 For a given C, the window W scales with 1/n and thus $E[W] = \frac{\mu_{n=1}}{n}$ and $var[W] = \left(\frac{\sigma_{n=1}}{n}\right)^2$
 Standard deviation of the sum of windows decreases with n:
$\sum_{i=1}^{n} W_i(t) \rightarrow \mu_{n=1} + \frac{1}{\sqrt{n}} \, \sigma_{n=1} \, N(0,1)$
 Thus as n increases, buffer size should decrease:
$B = \frac{B_{n=1}}{\sqrt{n}}$
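The scaling argument can be checked with a small Monte Carlo experiment. The sketch below is only an illustration under stated assumptions: windows are independent, each is drawn uniformly between 2/3 and 4/3 of its mean (a rough stand-in for a sawtooth between Wmax/2 and Wmax), and the total mean window is held fixed while n varies. The function name and parameters are hypothetical.

```python
import random
import statistics


def aggregate_window_std(n_flows: int, total_mean: float = 1000.0,
                         samples: int = 2000, seed: int = 1) -> float:
    """Standard deviation of the sum of n independent congestion windows."""
    rng = random.Random(seed)
    mean = total_mean / n_flows          # per-flow window scales with 1/n
    low, high = 2 * mean / 3, 4 * mean / 3
    sums = [
        sum(rng.uniform(low, high) for _ in range(n_flows))
        for _ in range(samples)
    ]
    return statistics.pstdev(sums)


std_1 = aggregate_window_std(1)
std_100 = aggregate_window_std(100)
ratio = std_1 / std_100                  # expected to be close to sqrt(100)
print(f"std with 1 flow: {std_1:.1f}, with 100 flows: {std_100:.1f}, ratio: {ratio:.1f}")
```

If the buffer only needs to absorb the fluctuation of the aggregate window, a ratio near 10 for n = 100 is what the 1/sqrt(n) scaling predicts.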
22
Required buffer size
[Figure: required buffer size, $\frac{2T \times C}{\sqrt{n}}$ compared with simulation]
23
Summary
 Flows in the core are desynchronized
 Substantial experimental evidence
 Supported by ns2 simulations
 For desynchronized flows, routers need only buffers of $B = \frac{2T \times C}{\sqrt{n}}$
24
Outline
 The Rule of Thumb
 The buffer requirements for a congested router
 Buffer requirements for short flows (slow-start)
 M/G/1 Model
 Experimental Verification
 Conclusion
25
Short Flows
 So far we were assuming a congested router
with long flows in congestion avoidance mode.
 What about flows in slow start?
 Do buffer requirements differ?
 Answer: Yes, however:
 Required buffer in such cases is independent of line
speed and RTT (same for 1Mbit/s or 40 Gbit/s)
 In mixes of flows, long flows drive buffer requirements
 Short flow result relevant for uncongested routers
26
A single, short-lived TCP flow
Flow length 62 packets, RTT ~140 ms
[Figure: packets sent per RTT (2, 4, 8, 16, 32) from syn until fin ack received; the elapsed time is the Flow Completion Time (FCT)]
27
Modelling Short Flows
Idea: Find buffer size by modelling queue behaviour
 Problem: Arrival process is hard to model
 Simplify by modelling bursts as independent
 Buffer empties several times during one RTT
Poisson arrivals of flows: service time is $L_{flow}$, the flow length in packets
Poisson arrivals of bursts: service time is the length of the burst: 2, 4, 8, 16, …
28
M/G/1 Model for short flows
 TCP flows generate independent bursts
 Service time is the burst length: $S_i \in \{2, 4, 8, 16, \ldots\}$
 Poisson arrivals of rate $\lambda_{burst} = \frac{\rho}{E[S_i]}$
 To verify if this approach works, let’s compare the average queue length in the model and in simulation
$E[Q] = E[N_Q] \cdot E[S_i] = \frac{\rho^2 E[S_i^2]}{2(1-\rho) E[S_i]}$
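The average queue length of this model is easy to evaluate for a given burst-size distribution. The sketch below uses the expression as it appears on the slide, ρ²E[S²] / (2(1−ρ)E[S]); the equal-probability bursts of 2, 4, 8 and 16 packets are an assumption made here for illustration, and the function name is hypothetical.

```python
def mean_queue_packets(load: float, burst_sizes, burst_probs) -> float:
    """Average queue length in packets for the M/G/1 burst model."""
    if not 0 <= load < 1:
        raise ValueError("load must be in [0, 1)")
    mean_s = sum(p * s for s, p in zip(burst_sizes, burst_probs))
    mean_s2 = sum(p * s * s for s, p in zip(burst_sizes, burst_probs))
    return load ** 2 * mean_s2 / (2 * (1 - load) * mean_s)


sizes = [2, 4, 8, 16]                 # slow-start bursts, in packets
probs = [0.25, 0.25, 0.25, 0.25]      # assumed: all burst sizes equally likely
queue_at_08 = mean_queue_packets(0.8, sizes, probs)
queue_at_05 = mean_queue_packets(0.5, sizes, probs)
print(f"load 0.8: {queue_at_08:.1f} packets, load 0.5: {queue_at_05:.1f} packets")
```

Note that neither the line rate nor the RTT appears as an input: only the load and the burst-size moments matter.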
29
Average Queue length
[Figure: average queue length, model $\frac{\rho^2 E[S^2]}{2(1-\rho)E[S]}$ vs. simulation; capacity C = 40 Mbit/s, load 0.8]
30
Buffer Requirements
for Short Flows
 Buffers absorb fluctuations in queue, reduce packet loss
 Reduce retransmits, Timeouts and thereby flow completion time
 Utilization not a good measure of QoS as load << 1
 We can find a good upper bound for loss, if we have the
queue length distribution for an infinite buffer
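 If a packet arrives and the queue length is shorter than the buffer, the packet will not be dropped:
$P_{Drop} \le \sum_{x \ge B} P(Q = x)$
[Figure: queue length distribution P(Q = x) over Q; the tail beyond buffer size B corresponds to packet loss]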
 Problem: For M/G/1 there is no closed form expression for the
queue distribution
31
Queue Distribution
We derived closed-form estimates of the queue
distribution using Effective Bandwidth
 Gives very good closed form approximation
$P(Q \ge b) = e^{-b\kappa}$, where $\kappa = \frac{2(1-\rho)}{\rho} \cdot \frac{E[S]}{E[S^2]}$
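Inverting the exponential approximation gives a buffer size for a target tail probability: b = ln(1/p) / κ. The sketch below does this for an assumed burst distribution (equal-probability bursts of 2, 4, 8 and 16 packets), so the absolute numbers depend on that assumption; the function names are hypothetical.

```python
import math


def decay_rate(load: float, mean_burst: float, mean_sq_burst: float) -> float:
    """kappa = 2(1 - rho) E[S] / (rho E[S^2])."""
    return 2 * (1 - load) * mean_burst / (load * mean_sq_burst)


def buffer_for_tail(load: float, mean_burst: float, mean_sq_burst: float,
                    target_prob: float) -> float:
    """Smallest b (in packets) with exp(-b * kappa) <= target_prob."""
    return -math.log(target_prob) / decay_rate(load, mean_burst, mean_sq_burst)


# Assumed burst distribution: sizes 2, 4, 8, 16 with equal probability.
mean_s = (2 + 4 + 8 + 16) / 4
mean_s2 = (4 + 16 + 64 + 256) / 4

buf_1pct = buffer_for_tail(0.8, mean_s, mean_s2, 1e-2)
buf_001pct = buffer_for_tail(0.8, mean_s, mean_s2, 1e-4)
print(f"1% tail: {buf_1pct:.0f} packets, 0.01% tail: {buf_001pct:.0f} packets")
```

Because the tail is exponential, squaring the target probability (1% to 0.01%) exactly doubles the required buffer, whatever the burst distribution.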
32
Short Flow Summary
 Buffer requirements for short flows
 Can be modeled by M/G/1 model
 Only depends on load and burst size distribution
 Example - for bursts of up to size 16 at load 0.8
 For 1% loss probability B = 115 Packets
 For 0.01% loss probability B = 230 packets etc.
 A burst size of 12 is the maximum for Windows XP
 Independent of line speed and RTT
 In mixes of flows, long flows dominate buffer
requirements
33
Outline
 The Rule of Thumb
 The buffer requirements for a congested router
 Buffer requirements for short flows (slow-start)
 Experimental Verification
 Conclusion
34
Experimental Evaluation Overview
 Simulation with ns2
 Over 10,000 simulations that cover range of settings
 Simulation time 30s to 5 minutes
 Bandwidth 10 Mb/s - 1 Gb/s
 Latency 20ms -250 ms,
 Physical router
 Cisco GSR with OC3 line card
 In collaboration with University of Wisconsin
 Experimental results presented here
 Long Flows - Utilization
 Mixes of flows - Flow Completion Time (FCT)
 Mixes of flows - Heavy Tailed Flow Distribution
 Short Flows – Queue Distribution
35
Long Flows - Utilization (I)
Small Buffers are sufficient - OC3 Line, ~100ms RTT
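[Figure: link utilization (98.0%, 99.5%, 99.9%) vs. buffer size, with marks at $\frac{2T \times C}{\sqrt{n}}$ and $2 \times \frac{2T \times C}{\sqrt{n}}$]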
36
Long Flows – Utilization (II)
Model vs. ns2 vs. Physical Router
GSR 12000, OC3 Line Card
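Router buffer is given as a multiple of $\frac{2T \times C}{\sqrt{n}}$, in packets, and as RAM size. Link utilization is given for the model, the ns2 simulation (Sim) and the physical router experiment (Exp).

TCP Flows   Buffer   Pkts   RAM     Model   Sim     Exp
100         0.5x     64     1Mb     96.9%   94.7%   94.9%
100         1x       129    2Mb     99.9%   99.3%   98.1%
100         2x       258    4Mb     100%    99.9%   99.8%
100         3x       387    8Mb     100%    99.8%   99.7%
400         0.5x     32     512kb   99.7%   99.2%   99.5%
400         1x       64     1Mb     100%    99.8%   100%
400         2x       128    2Mb     100%    100%    100%
400         3x       192    4Mb     100%    100%    99.9%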
37
Mixes of Flows
Flow Completion Time
FCT of 14 packet flows that share a link with long-lived flows.
[Figure: FCT with buffers of $2T \times C$ vs. $\frac{2T \times C}{\sqrt{n}}$]
38
Heavy-tailed flow length distribution
 Experiment
 Flow arrivals are a Poisson process
 Flow lengths are Pareto distributed
 Results
 Buffers in the order of $\frac{2T \times C}{\sqrt{n}}$ are still sufficient
 Number of “long-lived” flows n is now defined as
number of flows in congestion avoidance mode
39
Pareto Flow Distribution
Finding the number of flows
[Figure: number of flows in CA mode over time [seconds]]
For buffer sizing, pick n = 100
40
Pareto Flow Distribution
[Figure: three panels: buffer occupancy, bottleneck utilization, flow arrivals on link 1]
41
Short Flows – Queue Distribution
M/G/1 Model vs. GSR 12000, OC3 Line Card
42
Outline
 The Rule of Thumb
 The buffer requirements for a congested router
 Buffer requirements for short flows (slow-start)
 Experimental Verification
 Conclusion
43
Related Work
 Related Publications
 Buffer sizing
 “High Performance TCP in ANSNET” – Villamizar and Song, ACM CCR 1994
 “TCP behaviour with many flows” – R. Morris,
IEEE ICNP 1997
 “Scalable TCP congestion control” – R. Morris,
INFOCOM 2000
 Queue Modelling
 “Modelling, Simulation and Measurement of Queuing
Delay” – Garetto and Towsley, SIGMETRICS 2003
44
Original Contributions
 Main original contributions of this work
 Routers only require buffers of $\frac{2T \times C}{\sqrt{n}}$ instead of $2T \times C$
 Models on TCP buffer interaction for
 Congestion avoidance mode
 Slow start
 Experimental Verification
 Publication
 “Sizing Router Buffers” – Guido Appenzeller, Isaac Keslassy and Nick McKeown, to appear at SIGCOMM 2004
45
The Commercial Internet Today
 Today’s internet differs from our assumptions
 Core is overprovisioned, almost never congested
 Access links are usually the bottleneck (DSL, Modem)
 Flows are usually limited by Maximum Window size
 Maximum Window is from 6-12 (Windows) to 42 (Unix)
 Are the results still relevant?
 Answer: Yes, we were intentionally pessimistic
 Routers still need to work in case of congestion
 Even if this “worst case” scenario is rare
 Slow access links and small TCP windows reduce buffer
requirements further
 Bursts are smoothed out
 Traffic converges towards a constant rate source with Poisson packet arrivals
 We verified reduced buffer requirements experimentally
46
How much buffer does a router need?
The old “Rule-of-Thumb”
Scenario: Single flow saturates router, or few synchronized flows with congestion
Buffer: $2T \times C$
Comments: Still applicable in a few select cases (e.g. Internet2 speed record)
Our Contribution
Scenario: Many flows, congestion
Buffer: $\frac{2T \times C}{\sqrt{n}}$
Comments: Applicable for the core and edge of the internet today.
Scenario: One or many flows, not congested, ρ < 1
Buffer: M/G/1 model
Comments: Only if there is never any congestion
47
Impact on Router Design
 10Gb/s linecard with 200,000 x 56kb/s flows
 Rule-of-thumb: Buffer = 2.5Gbits
 Requires external, slow DRAM
 Becomes: Buffer = 6Mbits
 Can use on-chip, fast SRAM
 Completion time halved for short-flows
 40Gb/s linecard with 40,000 x 1Mb/s flows
 Rule-of-thumb: Buffer = 10Gbits
 Becomes: Buffer = 50Mbits
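The buffer sizes on this slide can be reproduced from the two formulas. The sketch below assumes 2T = 250 ms, the default quoted earlier in the talk, and decimal units; the function names are illustrative.

```python
import math


def rule_of_thumb_bits(two_t_s: float, capacity_bps: float) -> float:
    """Old rule: B = 2T x C."""
    return two_t_s * capacity_bps


def small_buffer_bits(two_t_s: float, capacity_bps: float, n_flows: int) -> float:
    """New rule: B = 2T x C / sqrt(n) for n desynchronized long flows."""
    return two_t_s * capacity_bps / math.sqrt(n_flows)


TWO_T = 0.25  # assumed two-way propagation delay, in seconds

old_10g = rule_of_thumb_bits(TWO_T, 10e9)          # 2.5 Gbits
new_10g = small_buffer_bits(TWO_T, 10e9, 200_000)  # about 5.6 Mbits, roughly 6 Mbits
old_40g = rule_of_thumb_bits(TWO_T, 40e9)          # 10 Gbits
new_40g = small_buffer_bits(TWO_T, 40e9, 40_000)   # 50 Mbits

for label, bits in [("10G old", old_10g), ("10G new", new_10g),
                    ("40G old", old_40g), ("40G new", new_40g)]:
    print(f"{label}: {bits / 1e6:.1f} Mbits")
```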
48
Questions?
Backup
Two TCP Flows
Two flows sharing a bottleneck synchronize
51
Impact on Protocol Design
 The ‘rule-of-thumb’ is due to the nature of TCP
 We can easily build a TCP that requires smaller buffers
TCP Reno: $W_{new} = W_{old}(1 - \frac{1}{2})$ requires $B = 2T \times C$
General: $W_{new} = W_{old}(1 - \frac{1}{n})$ requires $B = \frac{1}{n-1} \, 2T \times C$
 Latency based TCP (Vegas, FAST) suffer from smaller buffers
 Queuing delay too small to be used as a channel
 Buffer occupancy similar for different levels of congestion
 Unclear if these protocols work with aggregated traffic at all
 For future protocol design, buffer requirements should
be considered
52
Smaller buffers increase loss
But this in itself is not a bad thing
 Routers drop packets to throttle senders
 Loss rate is a function of number of flows and sum of
TCP windows
$l = 0.76 \, \frac{n^2}{\left(\sum_i W_i\right)^2}$
 Rule-of-thumb vs. minimal buffers changes number of
outstanding packets by a factor of 2
 Loss will quadruple with minimal buffers
 This is just how TCP works
 Quality of service better even with higher loss…
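The claim that loss quadruples follows directly from the loss-rate relation, read here as l = 0.76 n² / (Σ W_i)². The sketch below evaluates it for an assumed example of 100 flows; the function name and the window totals are illustrative, not measurements from the talk.

```python
def loss_rate(n_flows: int, sum_of_windows: float) -> float:
    """Loss rate l = 0.76 * n^2 / (sum of TCP windows)^2."""
    return 0.76 * n_flows ** 2 / sum_of_windows ** 2


# Assumed example: 100 flows with 1000 packets outstanding under the
# rule of thumb, and half as many outstanding with minimal buffers.
loss_big_buffer = loss_rate(100, 1000.0)
loss_small_buffer = loss_rate(100, 500.0)
increase = loss_small_buffer / loss_big_buffer

print(f"{loss_big_buffer:.4f} -> {loss_small_buffer:.4f} ({increase:.0f}x)")
```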
53