Transcript Slide 1
Sizing Router Buffers
Guido Appenzeller
Thesis Defense
May 24th, 2004
Routers need Packet Buffers
It’s well known that routers need packet buffers
It’s less clear why and how much
Goal of this work is to answer the question:
How much buffering do routers need?
2
How much Buffer does a Router need?
[Figure: Source → Router → Destination; bottleneck link of capacity C, round-trip propagation delay 2T]
Universally applied rule-of-thumb:
A router needs a buffer size: $B = 2T \times C$
2T is the two-way propagation delay (or just 250ms)
C is capacity of bottleneck link
Context
Mandated in backbone and edge routers.
Appears in RFPs and IETF architectural guidelines.
Usually referenced to Villamizar and Song: “High Performance
TCP in ANSNET”, CCR, 1994.
Already known by inventors of TCP [Van Jacobson, 1988]
Has major consequences for router design
3
Example
10Gb/s linecard
Requires 300Mbytes of buffering.
Read and write a 40-byte packet every 32 ns.
Memory technologies
DRAM: requires 4 devices, but too slow.
SRAM: requires 80 devices, 1kW, $2000.
Problem gets harder at 40Gb/s
Hence RLDRAM, FCRAM, etc.
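The figures on this slide follow from simple arithmetic. A minimal sketch, assuming 2T = 250 ms and 40-byte minimum-size packets as stated on the slides (the function names are illustrative, not from the talk):

```python
def rule_of_thumb_buffer_bits(two_t_seconds, capacity_bps):
    """Rule-of-thumb buffer B = 2T * C, in bits."""
    return two_t_seconds * capacity_bps


def packet_time_seconds(packet_bytes, capacity_bps):
    """Time to transmit one packet at line rate."""
    return packet_bytes * 8 / capacity_bps


# 10 Gb/s linecard with 2T = 250 ms (assumed, as on the rule-of-thumb slide).
buffer_bits = rule_of_thumb_buffer_bits(0.25, 10e9)
print(buffer_bits / 8 / 1e6, "Mbytes")  # roughly 300 Mbytes
print(packet_time_seconds(40, 10e9) * 1e9, "ns per 40-byte packet")
```

The buffer memory has to sustain one read and one write per packet time, which is what makes slow DRAM a problem at this line rate.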
4
Outline of this Work
Main Results
The rule of thumb is wrong for core routers today
Required buffer is $\frac{2T \times C}{\sqrt{n}}$ instead of $2T \times C$
Outline of this talk
Where the rule of thumb comes from
Why it is incorrect for a core router in the internet today
Correct buffer requirements for a congested router
Buffer requirements for short flows (slow-start)
Experimental Verification
Conclusion
5
Outline
The Rule of Thumb
Where does the rule of thumb come from? (Answer: TCP)
Interaction of TCP flows and router buffers
The buffer requirements for a congested router
Buffer requirements for short flows (slow-start)
Experimental Verification
Conclusion
6
TCP
[Figure: Source → Router → Dest, with access link C' > C and bottleneck link C; only W = 2 packets may be outstanding]
TCP Congestion Window controls the sending rate
Sender sends packets, receiver sends ACKs
Sending rate is controlled by window W:
At any time, only W unacknowledged packets may be outstanding
The sending rate of TCP is $R = \frac{W}{RTT}$
7
Single TCP Flow
Router without buffers
[Figure: Source → Router → Dest, with C' > C; only W packets may be outstanding. Plot of congestion window W over time t, oscillating between 4 and 8; the link is not fully utilized]
Rule for adjusting W
If an ACK is received: W ← W+1/W
If a packet is lost: W ← W/2
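The window adjustment rule can be written as two small update functions. A minimal sketch with hypothetical function names, windows measured in packets:

```python
def on_ack(w):
    """Additive increase: W <- W + 1/W for every ACK received."""
    return w + 1.0 / w


def on_loss(w):
    """Multiplicative decrease: W <- W/2 when a packet is lost."""
    return w / 2.0


w = 8.0
for _ in range(8):      # roughly one window's worth of ACKs
    w = on_ack(w)
print(w)                # slightly below 9: about one packet per RTT
print(on_loss(w))       # the window is halved after a loss
```

Applying `on_ack` once per ACK grows the window by about one packet per round trip, which produces the sawtooth seen when losses halve it.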
8
Single TCP Flow
Router with large enough buffers for full link utilization
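For every W ACKs received, send W+1 packets
[Figure: Source → Router with buffer B → Dest, with C' > C. Plots over time t of the window size (sawtooth between $W_{max}/2$ and $W_{max}$), the RTT, and the sending rate $R = \frac{W}{RTT}$]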
9
Required buffer is the height of the sawtooth
[Figure: buffer occupancy between 0 and B over time t]
10
Buffer = rule of thumb
11
Microscopic TCP Behavior
When sender pauses, buffer drains
Drop
one RTT
12
Over-buffered Link
13
Under-buffered Link
14
Origin of rule-of-thumb
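Before and after reducing window size, the sending rate of the TCP sender is the same:
$R_{old} = R_{new}$
Inserting the rate equation we get
$\frac{W_{old}}{RTT_{old}} = \frac{W_{new}}{RTT_{new}}$
The RTT is part two-way propagation delay $2T$ and part queuing delay $B/C$.
We know that after reducing the window, the queuing delay is zero.
For TCP, which halves its window ($W_{new} = W_{old}/2$):
$\frac{W_{old}}{2T + B/C} = \frac{W_{old}/2}{2T}$, which gives $B = 2T \times C$
In general, for $W_{new} = W_{old}\left(1 - \frac{1}{n}\right)$ (here $n$ is the window reduction parameter, $n = 2$ for TCP):
$B = \left(\frac{1}{n-1}\right) 2T \times C$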
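The effect of the buffer size on a single flow can be checked with a coarse per-round-trip model. This is a sketch under simplifying assumptions (one loss per overflow, window grows by one packet per RTT), not the simulation used in this work; `pipe_pkts` stands for 2T × C expressed in packets.

```python
def sawtooth_utilization(pipe_pkts, buffer_pkts, rounds=100_000):
    """Link utilization of one AIMD flow in a per-RTT model.

    pipe_pkts   -- 2T*C expressed in packets
    buffer_pkts -- router buffer B in packets
    """
    w_max = pipe_pkts + buffer_pkts   # window at which the buffer overflows
    w = w_max / 2.0
    sent = 0.0       # packets delivered
    capacity = 0.0   # packets the link could have carried in the same time
    for _ in range(rounds):
        sent += w
        # A round lasts 2T while the queue is empty and W/C once a queue
        # builds, so the link could carry max(w, pipe_pkts) packets in it.
        capacity += max(w, pipe_pkts)
        w += 1.0                      # additive increase, once per RTT
        if w > w_max:                 # buffer overflow: one loss
            w /= 2.0                  # multiplicative decrease
    return sent / capacity


print(sawtooth_utilization(100, 100))  # B = 2T*C: link stays full
print(sawtooth_utilization(100, 0))    # no buffer: about 75% utilization
```

With B equal to 2T × C the halved window still fills the pipe, so the link never idles; with less buffer the window drops below the pipe size after each loss.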
15
Rule-of-thumb
Rule-of-thumb makes sense for one flow
Typical backbone link has > 20,000 flows
Does the rule-of-thumb still hold?
Answer:
If flows are perfectly synchronized, then Yes.
If flows are desynchronized then No.
16
Outline
The Rule of Thumb
The buffer requirements for a congested router
Synchronized flows
Desynchronized flows
The 2T×C/sqrt(n) rule
Buffer requirements for short flows (slow-start)
Experimental Verification
Conclusion
17
If flows are synchronized
[Figure: window sawtooths oscillating between $W_{max}/2$ and $W_{max}$ over time t]
Aggregate window has same dynamics
Therefore buffer occupancy has same dynamics
Rule-of-thumb still holds.
18
When are Flows Synchronized?
Small numbers of flows tend to synchronize
In ns2 simulation they are synchronized
In at least some cases holds for real networks as well
Large aggregates of flows are not synchronized
For > 500 flows, synchronization disappears in ns2
On a Cisco GSR, 100 flows were not synchronized
Measurements in the core give no indication of
synchronization
C. Fraleigh, “Provisioning Internet Backbone Networks to
Support Latency Sensitive Applications”, Ph.D. Thesis,
Stanford
Hohn, Veitch, Papagiannaki and Diot – “Bridging Router
Performance and Queuing Theory”
19
If flows are not synchronized
[Figure: aggregate window W over time, with the probability distribution of buffer occupancy between 0 and buffer size B]
20
Quantitative Model
Model congestion window of a flow as random variable
Model $W_i(t)$ as $W_i$, where $P[W_i = x] = f(x)$
For many de-synchronized flows
We assume congestion windows are independent
All congestion windows have the same probability distribution
$E[W_i] = \mu_W$, $\mathrm{var}[W_i] = \sigma_W^2$
Now the central limit theorem gives us the distribution of the sum of the window sizes
$\sum_{i=1}^{n} W_i(t) \to n\mu_W + \sqrt{n}\,\sigma_W\,N(0,1)$
21
Buffer vs. Number of Flows
for a given Bandwidth
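If for a single flow we have
$E[W] = \mu_{n=1}$, $\mathrm{var}[W] = \sigma_{n=1}^2$
For a given C, the window W scales with 1/n and thus
$E[W] = \frac{\mu_{n=1}}{n}$, $\mathrm{var}[W] = \left(\frac{\sigma_{n=1}}{n}\right)^2$
Standard deviation of sum of windows decreases with n
$\sum_{i=1}^{n} W_i(t) \to \mu_{n=1} + \frac{1}{\sqrt{n}}\,\sigma_{n=1}\,N(0,1)$
Thus as n increases, buffer size should decrease
$B = \frac{B_{n=1}}{\sqrt{n}}$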
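The 1/sqrt(n) scaling can be illustrated with a small Monte Carlo experiment. This is a sketch only: it assumes each window is independent and uniformly distributed between 2/3 and 4/3 of its mean as a stand-in for a desynchronized sawtooth, and the function name and parameters are illustrative.

```python
import random
import statistics


def aggregate_window_std(n_flows, total_mean=1000.0, samples=2000, seed=0):
    """Standard deviation of the summed windows of n independent flows.

    Assumption: each window is uniform on [2/3, 4/3] of its mean. The mean
    per flow is total_mean / n_flows, so the aggregate mean stays fixed
    (same link capacity) as n grows.
    """
    rng = random.Random(seed)
    mean_per_flow = total_mean / n_flows
    low, high = 2.0 * mean_per_flow / 3.0, 4.0 * mean_per_flow / 3.0
    sums = [
        sum(rng.uniform(low, high) for _ in range(n_flows))
        for _ in range(samples)
    ]
    return statistics.pstdev(sums)


for n in (1, 4, 16, 64):
    # The product std * sqrt(n) should stay roughly constant.
    print(n, round(aggregate_window_std(n) * n ** 0.5, 1))
```

The fluctuation of the aggregate window, and with it the buffer needed to absorb that fluctuation, shrinks roughly as 1/sqrt(n) while the mean stays the same.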
22
Required buffer size
[Figure: simulation results compared with $\frac{2T \times C}{\sqrt{n}}$]
23
Summary
Flows in the core are desynchronized
Substantial experimental evidence
Supported by ns2 simulations
For desynchronized flows, routers need only buffers of $B = \frac{2T \times C}{\sqrt{n}}$
24
Outline
The Rule of Thumb
The buffer requirements for a congested router
Buffer requirements for short flows (slow-start)
M/G/1 Model
Experimental Verification
Conclusion
25
Short Flows
So far we were assuming a congested router
with long flows in congestion avoidance mode.
What about flows in slow start?
Do buffer requirements differ?
Answer: Yes, however:
Required buffer in such cases is independent of line
speed and RTT (same for 1Mbit/s or 40 Gbit/s)
In mixes of flows, long flows drive buffer requirements
Short flow result relevant for uncongested routers
26
A single, short-lived TCP flow
Flow length 62 packets, RTT ~140 ms
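[Figure: slow-start timeline from SYN to the final ACK being received; bursts of 2, 4, 8, 16 and 32 packets, one burst per RTT; the total duration is the Flow Completion Time (FCT)]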
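The burst pattern of a short flow in slow start can be generated directly. A minimal sketch, assuming an initial window of 2 packets that doubles every round trip with no losses (the function name is illustrative):

```python
def slow_start_bursts(flow_len_pkts, initial_window=2):
    """Burst sizes sent per RTT by a loss-free flow in slow start."""
    bursts = []
    remaining = flow_len_pkts
    window = initial_window
    while remaining > 0:
        burst = min(window, remaining)   # last burst may be cut short
        bursts.append(burst)
        remaining -= burst
        window *= 2                      # window doubles every RTT
    return bursts


print(slow_start_bursts(62))  # [2, 4, 8, 16, 32]
```

The number of bursts is the number of round trips spent sending data, which dominates the flow completion time for short flows.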
27
Modelling Short Flows
Idea: Find buffer size by modelling queue behaviour
Problem: Arrival process is hard to model
Simplify by modelling bursts as independent
Buffer empties several times during one RTT
Poisson arrivals of flows
Poisson arrivals of bursts
Service time is Lflow, the
flow length in packets
Service time is the length
of the burst: 2,4,8,16…
28
M/G/1 Model for short flows
TCP flows generate independent bursts
Service time is the burst length: $S_i \in \{2, 4, 8, 16, \ldots\}$
Poisson arrivals of rate $\lambda_{burst} = \frac{\rho}{E[S_i]}$
To verify if this approach works, let’s compare
the average queue length in the model and in
simulation
$E[Q] = E[N_Q]\,E[S_i] = \frac{\rho^2\,E[S_i^2]}{2(1-\rho)\,E[S_i]}$
29
Average Queue length
[Figure: average queue length, model vs. simulation; model curve $\frac{\rho^2\,E[S^2]}{2(1-\rho)\,E[S]}$]
Capacity: C = 40 Mbit/s
Load: ρ = 0.8
30
Buffer Requirements
for Short Flows
Buffers absorb fluctuations in queue, reduce packet loss
Reduce retransmits, Timeouts and thereby flow completion time
Utilization not a good measure of QoS as load << 1
We can find a good upper bound for loss, if we have the
queue length distribution for an infinite buffer
If a packet arrives and queue length is shorter than buffer, packet
will not be dropped
$P_{Drop} \le P(Q \ge B)$
[Figure: queue length distribution $P(Q = x)$ for a buffer of size B; the tail $x \ge B$ corresponds to packet loss]
Problem: For M/G/1 there is no closed form expression for the
queue distribution
31
Queue Distribution
We derived closed-form estimates of the queue
distribution using Effective Bandwidth
Gives very good closed form approximation
$P(Q \ge b) = e^{-b\kappa}$, where $\kappa = \frac{2(1-\rho)}{\rho} \cdot \frac{E[S]}{E[S^2]}$
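The M/G/1 expressions can be evaluated numerically: the mean queue $\rho^2 E[S^2] / (2(1-\rho)E[S])$ and the tail estimate $P(Q \ge b) = e^{-b\kappa}$ with $\kappa = \frac{2(1-\rho)}{\rho} \cdot \frac{E[S]}{E[S^2]}$. A minimal sketch: the burst mix used here (2, 4, 8 and 16 packets, equally likely) is an assumption for illustration, so the resulting numbers are illustrative only, and the function names are hypothetical.

```python
import math


def burst_moments(burst_sizes, probabilities):
    """First and second moment of the burst size distribution."""
    mean = sum(p * s for s, p in zip(burst_sizes, probabilities))
    second = sum(p * s * s for s, p in zip(burst_sizes, probabilities))
    return mean, second


def mean_queue(rho, mean_s, second_s):
    """Average queue length rho^2 E[S^2] / (2 (1 - rho) E[S]), in packets."""
    return rho ** 2 * second_s / (2.0 * (1.0 - rho) * mean_s)


def kappa(rho, mean_s, second_s):
    """Decay rate of the queue tail: 2 (1 - rho) / rho * E[S] / E[S^2]."""
    return 2.0 * (1.0 - rho) / rho * mean_s / second_s


def buffer_for_loss(p_loss, rho, mean_s, second_s):
    """Smallest b with exp(-b * kappa) <= p_loss, in packets."""
    return math.log(1.0 / p_loss) / kappa(rho, mean_s, second_s)


# Assumed burst mix: sizes 2, 4, 8, 16 with equal probability, load 0.8.
m, s2 = burst_moments([2, 4, 8, 16], [0.25] * 4)
print(mean_queue(0.8, m, s2))
print(buffer_for_loss(1e-2, 0.8, m, s2))
print(buffer_for_loss(1e-4, 0.8, m, s2))
```

Because the tail is exponential, squaring the target loss probability doubles the buffer, and neither line speed nor RTT appears in the result.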
32
Short Flow Summary
Buffer requirements for short flows
Can be modeled by M/G/1 model
Only depends on load and burst size distribution
Example - for bursts of up to size 16 at load 0.8
For 1% loss probability B = 115 packets
For 0.01% loss probability B = 230 packets etc.
A burst size of 12 is the maximum for Windows XP
Independent of line speed and RTT
In mixes of flows, long flows dominate buffer
requirements
33
Outline
The Rule of Thumb
The buffer requirements for a congested router
Buffer requirements for short flows (slow-start)
Experimental Verification
Conclusion
34
Experimental Evaluation Overview
Simulation with ns2
Over 10,000 simulations that cover range of settings
Simulation time 30s to 5 minutes
Bandwidth 10 Mb/s - 1 Gb/s
Latency 20 ms - 250 ms
Physical router
Cisco GSR with OC3 line card
In collaboration with University of Wisconsin
Experimental results presented here
Long Flows - Utilization
Mixes of flows - Flow Completion Time (FCT)
Mixes of flows - Heavy Tailed Flow Distribution
Short Flows – Queue Distribution
35
Long Flows - Utilization (I)
Small Buffers are sufficient - OC3 Line, ~100ms RTT
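[Figure: utilization curves at 98.0%, 99.5% and 99.9%, with reference lines at $\frac{2T \times C}{\sqrt{n}}$ and $2 \times \frac{2T \times C}{\sqrt{n}}$]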
36
Long Flows – Utilization (II)
Model vs. ns2 vs. Physical Router
GSR 12000, OC3 Line Card
TCP     Router Buffer                       Link Utilization
Flows   (x 2T×C/sqrt(n))   Pkts   RAM       Model    Sim      Exp
100     0.5x               64     1Mb       96.9%    94.7%    94.9%
100     1x                 129    2Mb       99.9%    99.3%    98.1%
100     2x                 258    4Mb       100%     99.9%    99.8%
100     3x                 387    8Mb       100%     99.8%    99.7%
400     0.5x               32     512kb     99.7%    99.2%    99.5%
400     1x                 64     1Mb       100%     99.8%    100%
400     2x                 128    2Mb       100%     100%     100%
400     3x                 192    4Mb       100%     100%     99.9%
37
Mixes of Flows
Flow Completion Time
FCT of 14 packet flows that share a link with long-lived flows.
[Figure: FCT for buffers of $2T \times C$ and $\frac{2T \times C}{\sqrt{n}}$]
38
Heavy-tailed flow length distribution
Experiment
Flow arrivals are a Poisson process
Flow lengths are Pareto distributed
Results
Buffers in the order of $\frac{2T \times C}{\sqrt{n}}$ are still sufficient
Number of “long-lived” flows n is now defined as
number of flows in congestion avoidance mode
39
Pareto Flow Distribution
Finding the number of flows
[Figure: number of flows in CA mode vs. time in seconds]
For buffer sizing, pick n = 100
40
Pareto Flow Distribution
[Figure: panels showing flow arrivals on link 1, bottleneck utilization, and buffer occupancy]
41
Short Flows – Queue Distribution
M/G/1 Model vs. GSR 12000, OC3 Line Card
42
Outline
The Rule of Thumb
The buffer requirements for a congested router
Buffer requirements for short flows (slow-start)
Experimental Verification
Conclusion
43
Related Work
Related Publications
Buffer sizing
“High Performance TCP in ANSNET” - Villamizar and Song,
CCR 1994
“TCP behaviour with many flows” – R. Morris,
IEEE ICNP 1997
“Scalable TCP congestion control” – R. Morris,
INFOCOM 2000
Queue Modelling
“Modelling, Simulation and Measurement of Queuing
Delay” – Garetto and Towsley, SIGMETRICS 2003
44
Original Contributions
Main original contributions of this work
Routers only require buffers of $\frac{2T \times C}{\sqrt{n}}$ instead of $2T \times C$
Models on TCP buffer interaction for
Congestion avoidance mode
Slow start
Experimental Verification
Publication
“Sizing Router Buffers” – Guido Appenzeller, Isaac
Keslassy and Nick McKeown, to appear at SIGCOMM
2004
45
The Commercial Internet Today
Today’s internet differs from our assumptions
Core is overprovisioned, almost never congested
Access links are usually the bottleneck (DSL, Modem)
Flows are usually limited by Maximum Window size
Maximum Window is from 6-12 (Windows) to 42 (Unix)
Are the results still relevant?
Answer: Yes, we were intentionally pessimistic
Routers still need to work in case of congestion
Even if this “worst case” scenario is rare
Slow access links and small TCP windows reduce buffer
requirements further
Bursts are smoothed out
Converges towards a constant rate source with Poisson packet arrivals
We verified reduced buffer requirements experimentally
46
How much buffer does a router need?
The old “Rule-of-Thumb”
Scenario: Single flow saturates router; few synchronized flows with congestion
Buffer: $2T \times C$
Comments: Still applicable in a few select cases (e.g. Internet2 speed record)
Our Contribution
Scenario: Many flows, congestion
Buffer: $\frac{2T \times C}{\sqrt{n}}$
Comments: Applicable for the core and edge of the internet today.
Scenario: One or many flows, not congested, ρ < 1
Buffer: M/G/1 model
Comments: Only if there is never any congestion
47
Impact on Router Design
10Gb/s linecard with 200,000 x 56kb/s flows
Rule-of-thumb: Buffer = 2.5Gbits
Requires external, slow DRAM
Becomes: Buffer = 6Mbits
Can use on-chip, fast SRAM
Completion time halved for short-flows
40Gb/s linecard with 40,000 x 1Mb/s flows
Rule-of-thumb: Buffer = 10Gbits
Becomes: Buffer = 50Mbits
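Both examples can be reproduced with the square-root rule. A minimal sketch, assuming 2T = 250 ms (which matches the rule-of-thumb values of 2.5 Gbits and 10 Gbits above); the function name is illustrative:

```python
import math


def small_buffer_bits(two_t_seconds, capacity_bps, n_flows):
    """Buffer B = 2T * C / sqrt(n), in bits."""
    return two_t_seconds * capacity_bps / math.sqrt(n_flows)


# 10 Gb/s linecard, 200,000 flows: about 6 Mbits instead of 2.5 Gbits.
print(small_buffer_bits(0.25, 10e9, 200_000) / 1e6, "Mbits")
# 40 Gb/s linecard, 40,000 flows: 50 Mbits instead of 10 Gbits.
print(small_buffer_bits(0.25, 40e9, 40_000) / 1e6, "Mbits")
```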
48
Questions?
Backup
Two TCP Flows
Two flows sharing a bottleneck synchronize
51
Impact on Protocol Design
The ‘rule-of-thumb’ is due to the nature of TCP
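We can easily build a TCP that requires less buffering
TCP Reno: $W_{new} = W_{old}\left(1 - \frac{1}{2}\right)$ requires $B = 2T \times C$
General: $W_{new} = W_{old}\left(1 - \frac{1}{n}\right)$ requires $B = \left(\frac{1}{n-1}\right) 2T \times C$
Latency-based TCP variants (Vegas, FAST) suffer from smaller buffers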
Queuing delay too small to be used as a channel
Buffer occupancy similar for different levels of congestion
Unclear if these protocols work with aggregated traffic at all
For future protocol design, buffer requirements should
be considered
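The general buffer requirement for a window decrease of $W_{new} = W_{old}(1 - 1/n)$ can be checked against the condition that the sending rate is the same before and after the decrease. A minimal sketch with hypothetical function names, taking 2T in seconds and C in bits per second:

```python
import math


def buffer_for_decrease(n, two_t_seconds, capacity_bps):
    """B = 2T * C / (n - 1) for W_new = W_old * (1 - 1/n); needs n > 1."""
    if n <= 1:
        raise ValueError("n must be greater than 1")
    return two_t_seconds * capacity_bps / (n - 1)


def rates_match(n, two_t_seconds, capacity_bps, w_old=1000.0):
    """Check W_old / (2T + B/C) == W_new / 2T for the buffer above."""
    b = buffer_for_decrease(n, two_t_seconds, capacity_bps)
    rate_before = w_old / (two_t_seconds + b / capacity_bps)
    rate_after = w_old * (1.0 - 1.0 / n) / two_t_seconds
    return math.isclose(rate_before, rate_after, rel_tol=1e-9)


for n in (2, 4, 8, 16):
    print(n, buffer_for_decrease(n, 0.25, 10e9) / 1e9, "Gbits",
          rates_match(n, 0.25, 10e9))
```

A gentler decrease (larger n) leaves a smaller gap for the buffer to cover, so the required buffer shrinks as 1/(n-1).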
52
Smaller buffers increase loss
But this in itself is not a bad thing
Routers drop packets to throttle senders
Loss rate is a function of number of flows and sum of
TCP windows
$l = 0.76\,\frac{n^2}{\left(\sum_i W_i\right)^2}$
Rule-of-thumb vs. minimal buffers changes number of
outstanding packets by a factor of 2
Loss will quadruple with minimal buffers
This is simply how TCP works
Quality of service better even with higher loss…
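The quadrupling of loss follows from the dependence on the squared sum of windows. A minimal sketch, assuming the loss model $l = 0.76\,n^2 / (\sum_i W_i)^2$ with the sum of windows given in packets (the function name is illustrative):

```python
def loss_rate(n_flows, sum_of_windows):
    """Loss rate l = 0.76 * n^2 / (sum of windows)^2."""
    avg_window = sum_of_windows / n_flows
    return 0.76 / avg_window ** 2


# Halving the number of outstanding packets quadruples the loss rate.
with_rule_of_thumb = loss_rate(100, 1000.0)
with_minimal_buffers = loss_rate(100, 500.0)
print(with_rule_of_thumb, with_minimal_buffers)
```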
53