Microscopic Behavior of TCP Congestion Control
Microscopic Behavior of Internet Control
Xiaoliang (David) Wei
NetLab, CS&EE
California Institute of Technology
Internet Control
Problem -> solution -> understanding ->
1986: First Internet Congestion Collapse
[Timeline: 1986, 1989, 1995, 1999, 2003, …]
Internet Control
Problem -> solution -> understanding ->
1986: First Internet Congestion Collapse
1988~1990: TCP-Tahoe; DEC-bit
[Timeline: 1986, 1989, 1995, 1999, 2003, …]
Internet Control
Problem -> solution -> understanding ->
1986: First Internet Congestion Collapse
1988~1990: TCP Tahoe; DEC-bit
1993~1995: Tri-S, DUAL, TCP-Vegas
[Timeline: 1986, 1989, 1995, 1999, 2003, …]
Outline
Motivation
Overview of microscopic behavior
Stability of delay-based congestion control algorithms
Fairness of loss-based congestion control algorithms
Future work
Summary
Outline
Motivation
Overview of microscopic behavior
Stability of delay-based congestion control algorithms
Fairness of loss-based congestion control algorithms
Future work
Macroscopic View of TCP Control
TCP/AQM: a feedback control system
[Figure: TCP senders 1, 2, … send at rates xi(t) into a bottleneck of capacity c with queue q(t); forward delay τF, backward delay τB; acks return to the senders]
TCP (Reno, Vegas, FAST):
x'_i(t) = F( x_i(t), q(t − τB) )
AQM (DropTail / RED / delay / ECN):
q'(t) = G( q(t), Σ_i x_i(t − τF) − c )
Fluid Models
x'_i(t) = F( x_i(t), q(t − τB) )
q'(t) = G( q(t), Σ_i x_i(t − τF) − c )
Assumptions:
TCP algorithms directly control the transmission rates;
The transmission rates are differentiable (smooth);
Each TCP packet observes the same congestion price (loss, delay or ECN).
Methodology based on Fluid Models
x'_i(t) = F( x_i(t), q(t − τB) )
q'(t) = G( q(t), Σ_i x_i(t − τF) − c )
Equilibrium:
Efficiency? Fairness?
Dynamics:
Stability? Responsiveness?
Gap 1: Stability of TCP Vegas
Analysis: “TCP Vegas is stable if (and only if)
the number of flows is large, and capacity is
small, and delay is small.”
Experiment: a single TCP Vegas flow is stable
with arbitrary delay and capacity.
Gap 2: Fairness of Scalable TCP
Analysis: “Scalable TCP is fair in homogeneous network” [Kelly’03]
Analysis: MIMD is unfair [Chiu&Jain’90] → Scalable TCP is unfair.
Experiment: in most cases, Scalable TCP is unfair in homogeneous network.
Gap 3: TCP vs TFRC
Analysis: “We designed TCP Friendly Rate
Control (TFRC) algorithm to have the same
equilibrium as TCP when they co-exist.”
Experiment: TCP flows do not fairly coexist
with TFRC flows.
Gaps
Stability: TCP-Vegas
Fairness: Scalable TCP
Friendliness: TCP vs
TFRC
Current analytical
models ignore
microscopic
behavior in TCP
congestion control
Outline
Motivation
Overview of Microscopic behavior
Stability of Delay-based Congestion Control
Algorithms
Fairness of Loss-based Congestion control
algorithms
Future works
Microscopic View (Packet level)
Two timescales:
On each RTT -- TCP congestion control algorithm;
On each ack arrival -- ack-clocking:
p--;
while (p < w(t) ) do
Send a packet
p++;
(p: number of packets in flight)
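The ack-clocking rule above can be sketched in a few lines of Python (illustrative; the function name and setup are mine, not from the talk):

```python
# Minimal sketch of the ack-clocking loop: p counts packets in flight;
# packets are sent only when an ack arrives, but a window jump
# releases several packets back-to-back at the same instant.

def on_ack(p, w, sent_log, now):
    """One ack arrives: p--, then send while p < w(t)."""
    p -= 1
    while p < w:
        sent_log.append(now)  # back-to-back sends at one instant: micro burst
        p += 1
    return p

sent = []
p = 1                 # one packet in flight whose ack is about to arrive
p = on_ack(p, w=5, sent_log=sent, now=0.0)  # window raised 0 -> 5
print(p, sent)        # 5 in flight; all 5 sent at t = 0.0 (a micro burst)
```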
W: 0 -> 5
[Figure sequence: a sender releases packets 1–5 through a bottleneck of capacity C to a receiver; each frame plots the sending rate x(t) over time]
Packets queue in the bottleneck.
Packets leave the bottleneck at rate c.
Acknowledgments return at rate c.
New packets are sent at rate c.
No queue in the 2nd round trip: no need to control the rate x(t)!
Two Flows
[Figure sequence: two flows (TCP1 → Rcv1, TCP2 → Rcv2) share a bottleneck of capacity C; their packets interleave in the first round trip, then the returning acks cluster by flow]
On-off pattern for each flow: within every RTT, each flow's packets are sent in one sub-RTT burst followed by a silent period.
Sub-RTT Burstiness: NS-2 Measurement
Two levels of burstiness:
[Figure: rate x(t) over time; a micro-burst spike at the start of an RTT, then on-off sub-RTT bursts]
Micro burst: pulse function; input rate >> c; extra queue & loss; transient.
Sub-RTT burstiness: on-off function; input rate <= c; no extra queue & loss; persistent.
Microscopic Effects: known
Micro burst:
Loss-based TCP: low throughput with small buffer; pacing improves throughput (clearly understood)
Delay-based TCP: noise to the delay signal, should be eliminated (partially understood)
Sub-RTT burstiness: observed in Internet traffic (“Why do we care?”)
Microscopic Effects: new
Micro burst:
Loss-based TCP: low throughput with small buffer; pacing improves throughput (clearly understood)
Delay-based TCP: fast convergence in queuing delay and better stability
Sub-RTT burstiness:
Loss-based TCP: low loss synchronization rate with DropTail routers
Delay-based TCP: no effect
New Understandings
Micro burst with delay-based TCP: fast queue convergence
1. A single TCP-Vegas flow is always stable, regardless of delay and capacity.
Sub-RTT burstiness and loss-based TCP: low loss sync rate
1. Scalable TCP is (usually) unfair;
2. TCP is unfriendly to TFRC.
Outline
Motivation
Overview of microscopic behavior
Stability of delay-based congestion control algorithms
Fairness of loss-based congestion control algorithms
Future work
New Understandings
Micro burst with delay-based TCP: fast queue convergence
1. A single TCP-Vegas flow is always stable, regardless of delay and capacity.
Sub-RTT burstiness and loss-based TCP: low loss sync rate
1. Scalable TCP is (usually) unfair;
2. TCP is unfriendly to TFRC.
A packet level model: basis
Ack-clocking: on each ack arrival
p--;
while (p < w(t) ) do
Send a packet
p++; (p: number of packets in flight)
Packets can only be sent upon arrival of an acknowledgment;
A micro burst of packets can be sent in a single instant;
Window size w(t) can be an arbitrary given process.
A packet level model: variables
Ack-clocking: on each ack arrival
p--;
while (p < w(t) ) do
Send a packet
p++; (p: number of packets in flight)
pj : Number of packets in flight when j is sent;
sj : sending time of packet j
bj : backlog experienced by packet j
aj : ack arrival time of packet j
A packet level model: variables
[Figure sequence: sender, bottleneck C, receiver; data packets and returning acks illustrate each variable in turn]
pj : number of packets in flight when packet j is sent
sj : sending time of packet j
bj : backlog experienced by packet j
aj : ack arrival time of packet j
A packet level model: variables
Ack-clocking: on each ack arrival
p--;
while (p < w(t) ) do
Send a packet
p++; (p: number of packets in flight)
p_j = max{ p_{j−1} − k + 1 : 0 ≤ k ≤ p_{j−1}, p_{j−1} − k + 1 ≤ w( a_{j−1−p_{j−1}+k} ) }
s_j = a_{j−p_j}
k : number of acks arriving between s_{j−1} and s_j
p_j : number of packets in flight when j is sent
s_j : sending time of packet j
a_{j−p_j} : ack arrival time of the packet sent one RTT ago
A packet level model: variables
Ack-clocking: on each ack arrival
p--;
while (p < w(t) ) do
Send a packet
p++; (p: number of packets in flight)
k : number of acks arriving between s_{j−1} and s_j. For example, with k = 0:
p_j = max{ p_{j−1} − k + 1 : … } = p_{j−1} + 1
s_j = a_{j−p_j} = a_{j−p_{j−1}−1} = a_{(j−1)−p_{j−1}} = s_{j−1}
A packet level model: variables
[Figure: packets j−1 and j arrive at the bottleneck; c(s_j − s_{j−1}) packets drain from the queue between the two arrivals]
b_j = max{ b_{j−1} + 1 − c( s_j − s_{j−1} ), 0 }
a_j = s_j + d + b_j / c
b_j : experienced backlog
c : bottleneck capacity
a_j : ack arrival time
d : propagation delay
A packet level model
p_j = max{ p_{j−1} − k + 1 : 0 ≤ k ≤ p_{j−1}, p_{j−1} − k + 1 ≤ w( a_{j−1−p_{j−1}+k} ) }
s_j = a_{j−p_j}
b_j = max{ b_{j−1} + 1 − c( s_j − s_{j−1} ), 0 }
a_j = s_j + d + b_j / c
p_j : number of packets in flight when j is sent
s_j : sending time of packet j
b_j : backlog experienced by packet j
a_j : ack arrival time of packet j
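For a constant window w (so p_j = w once the first w packets are out, and s_j = a_{j−w}), the recursions can be iterated directly; a sketch under those stated assumptions, not the talk's code:

```python
# Iterate the packet-level model for a constant window:
#   s_j = a_{j-w}                              (packet j released by an ack)
#   b_j = max(b_{j-1} + 1 - c*(s_j - s_{j-1}), 0)
#   a_j = s_j + d + b_j / c
c, d, w = 10.0, 1.0, 15        # capacity (pkt/time), prop. delay, window (pkts)
n = 200
s, b, a = [0.0] * n, [0.0] * n, [0.0] * n
prev_s, prev_b = 0.0, 0.0
for j in range(n):
    s[j] = 0.0 if j < w else a[j - w]   # first w packets leave as one micro burst
    b[j] = max(prev_b + 1 - c * (s[j] - prev_s), 0.0)
    a[j] = s[j] + d + b[j] / c
    prev_s, prev_b = s[j], b[j]
print(b[-1])    # backlog settles at w - c*d = 5 packets: fast queue convergence
```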
Ack-clocking: quick sending process
Theorem: For any time s_j at which a packet j is sent, there is always a packet j* := j*(j) s.t.
s_j = s_{j*}
p_{j*} = w( s_j )
The number of packets in flight at any packet sending time is synchronized with the congestion window.
[Figure: w(t) and p(t) coincide at sending times]
Ack-clocking: fast queue convergence
Theorem: If p_k ≥ cd for all k : j − p_j ≤ k ≤ j, then
p_j = cd + b_j
The queue converges instantly if the window size is larger than the BDP over the entire previous RTT.
[Figure: w(t) and q(t) over time]
Window Control and Ack-clocking
Per-RTT window control:
makes a decision once every RTT,
with the measurement from the latest acknowledgement (a subsequence of sequence numbers k1, k2, k3, …)
[Figure: w(t) and p(t); control decisions at sending times s_k1, s_k2, s_k3 using acks a_k1, a_k2]
Stability of TCP Vegas
Theorem: Given the packet level model, if αd > 1, a single TCP Vegas flow converges to equilibrium with arbitrary capacity c and propagation delay d. That is, there exists a sequence number J such that for all j ≥ J:
cd + αd − 1 ≤ w( s_j ) ≤ cd + αd + 1
αd − 1 ≤ b_j ≤ αd + 1
Stability of Vegas : 100-flow simulation
Stability of Vegas : Avg Window Size
Window Oscillation:
1 packet
Stability of Vegas : Queue Size
Queue oscillation: 100 packets
(because the 100 flows are synchronized)
Gap 1: Stability of TCP Vegas
Analysis: “TCP Vegas is stable if (and only if)
the number of flows is large, and capacity is
small, and delay is small.”
Reason: micro burst
leads to fast queue
convergence
Experiment: a single TCP Vegas flow is stable
with arbitrary delay and capacity.
FAST : stable and responsive
Designed based on the intuition that the queue is directly a function of the congestion window size.
A FAST flow does the following every other RTT:
w(t+1) = (1/2) [ w(t) + d·w(t) / ( d + b_j / c ) + α ]
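A rough sanity check of the reconstructed update, using the single-flow micro-burst result b = max(w − cd, 0) from the previous section (variable names and parameter values are mine):

```python
# Iterate the reconstructed FAST update
#   w <- (1/2) [ w + d*w / (d + b/c) + alpha ]
# with the backlog taken from the fast-queue-convergence result.
c, d, alpha = 100.0, 0.1, 20.0   # capacity, propagation delay, target queue (pkts)
w = 1.0
for _ in range(100):
    b = max(w - c * d, 0.0)      # backlog converges within one RTT
    w = 0.5 * (w + d * w / (d + b / c) + alpha)
print(round(w, 3))               # fixed point: w = c*d + alpha = 30.0
```

Once w exceeds the BDP, d·w/(d + b/c) equals cd exactly, so the error halves every update: the responsiveness the slide claims.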
FAST : stability
Theorem: Given the packet level model,
homogeneous FAST flows converge to
equilibrium regardless of capacity c and
propagation delay d and number of flows N.
[Tang, Jacobsson, Andrew, Low’07]: FAST is
stable with single bottleneck link regardless of
capacity c and propagation delay d and
number of flows N. (With an extended fluid
model capturing microburst effects)
Micro-burst: Summary
[Figure: rate x(t) over time; a micro burst at the start of an RTT]
Effects:
Fast queue convergence
Stability of homogeneous Vegas for arbitrary delay
Possibility of very responsive & stable TCP control
Stability of FAST for arbitrary delay
Outline
Motivation
Overview of microscopic behavior
Stability of delay-based congestion control algorithms
Fairness of loss-based congestion control algorithms
Future work
New Understandings
Micro burst with delay-based TCP: fast queue convergence
1. A single (homogeneous) TCP-Vegas flow is always stable, regardless of delay and capacity.
Sub-RTT burstiness and loss-based TCP: low loss sync rate
1. Scalable TCP is (usually) unfair;
2. TCP is unfriendly to TFRC.
Loss Synchronization Rate: Definition
Loss Synchronization Rate [Baccelli,Hong’02]:
The probability that a flow observes a packet
loss during a congestion event.
Congestion event (loss event):
A round-trip time interval in which at least
one packet is dropped by the bottleneck
router due to congestion (buffer overflow at
router)
Loss Synchronization Rate: Effects
Intuitions:
Individual flow: the smaller the better (selfishness)
System design: the higher the better (for fairness and
convergence)
Theoretic Results:
Aggregate throughput [Baccelli,Hong’02]
Instantaneous fairness [Baccelli,Hong’02]
Fairness convergence [Shorten, Wirth, Leith’06]
Loss Sync. Rate: Existing Models
[Shorten, Wirth, Leith’06]: no model; sync rates are measured from NS-2 and fed into a model for computational results.
[Baccelli,Hong’02]: assumes each packet has the same probability of being dropped in the loss event.
Packet loss is bursty: Internet
~50% of losses happen in bursts
[Figure: incoming packets during the RTT of a loss event, from all flows; the L dropped packets cluster in the burst period of the loss signal]
In each loss event (one RTT), the packet loss process is an on-off process.
Data packet process is bursty: on-off
[Figure: incoming packets during the RTT of a loss event; flow i's w packets form one contiguous burst; the rate plot x(t) is on-off within each RTT]
In each loss event (one RTT), the TCP data packet process is an on-off process.
Loss Sync. Rate: A Sampling Perspective
[Figure: flow i's burst of w packets and the loss signal's burst of L dropped packets, both within the incoming packets of the loss-event RTT]
Loss sync. rate: the efficiency of a (bursty) TCP data process in sampling the loss signal in a (bursty) loss process.
Assumption 1: Within the RTT of the loss event, the position of an individual flow’s burst is uniformly distributed.
Assumption 2: The loss process does not depend on the data packet process of individual flows.
Loss Sync. Rate Case 1: TCP+DropTail
[Figure: during the RTT of the loss event, flow i's wi packets form one contiguous burst; the L dropped packets form another contiguous burst]
λi = ( L + wi − 1 ) / ( cd + B + L )
wi : window of a TCP flow
L : number of dropped packets
cd+B+L : number of packets going through the bottleneck in the loss event ( c : capacity, d : propagation delay; B : buffer size)
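The formula can be sanity-checked with a small Monte Carlo experiment under Assumption 1 (the circular packet layout is a simplification I introduce, not the talk's):

```python
# Monte Carlo check of lambda_i = (L + w_i - 1) / (cd + B + L):
# a burst of w contiguous packets, uniformly placed, overlaps the burst
# of L contiguous drops with probability (L + w - 1) / n.
import random

random.seed(1)
cdB, w, L = 1080, 60, 16
n = cdB + L                      # packets through the bottleneck + drops
trials, hits = 100_000, 0
for _ in range(trials):
    start = random.randrange(n)  # uniform position of the flow's burst
    if any((start + k) % n < L for k in range(w)):  # overlaps drops 0..L-1?
        hits += 1
print(hits / trials, (L + w - 1) / n)   # both come out near 0.068
```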
Loss Sync. Rate: TCP+DropTail
Loss Sync. Rate Case 2: Pacing+DropTail
[Figure: flow i's wi packets are distributed over the entire RTT of the loss event; the L dropped packets form one contiguous burst]
λi = 1 − ( 1 − L / ( cd + B + L ) )^wi
wi : window of a TCP flow
L : number of dropped packets
cd+B+L : number of packets going through the bottleneck in the loss event
Loss Sync. Rate: Pacing + DropTail
Loss Sync. Rate Case 3: TCP+RED
[Figure: flow i's wi packets form one contiguous burst; packet losses are distributed over the entire RTT of the loss event]
λi = 1 − ( 1 − wi / ( cd + B + L ) )^L
wi : window of a TCP flow
L : number of dropped packets
cd+B+L : number of packets going through the bottleneck in the loss event
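Plugging the numbers from the MatLab computation slide (cd+B = 1080, wi = 60, L = 16) into the three reconstructed formulas shows the contrast numerically:

```python
# The three sync-rate formulas side by side with the slide's parameters.
cdB, w, L = 1080, 60, 16
n = cdB + L
tcp_droptail = (L + w - 1) / n            # bursty data, bursty losses
pacing_droptail = 1 - (1 - L / n) ** w    # spread data, bursty losses
tcp_red = 1 - (1 - w / n) ** L            # bursty data, spread losses
print(f"{tcp_droptail:.3f} {pacing_droptail:.3f} {tcp_red:.3f}")
# bursty TCP + DropTail: ~0.07; pacing or RED: ~0.6 (far more synchronized)
```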
Model for Loss Sync. Rate: General form
[Figure: cd+B incoming packets during the RTT of the loss event; flow i's burst period spans Ki incoming packets; L packets are randomly dropped from the M incoming packets of the loss signal's burst period]
cd+B : number of packets going through the bottleneck in the loss event ( c : capacity, d : propagation delay; B : buffer size)
wi : window of a TCP flow in the loss event
L : number of dropped packets in the loss event
Ki : length of burst period of flow i (in pkt)
M : length of burst period of loss process (in pkt)
λi = ?
Loss Sync. Rate: MatLab Computation
cd+B = 1080; wi = 60; L = 16; K , M vary
Measurement: TCP + DropTail
Averaged sync. Rate
cd+B = 3340
M =L = N/2
K = w = (cd+B)/N
Measurement: Pacing + DropTail
Averaged sync. Rate
cd+B = 3340
M =L = N/2
K = w = (cd+B)/N
Measurement: TCP + RED
Averaged sync. Rate
cd+B = 3340
M =L = N/2
K = w = (cd+B)/N
Loss Sync. Rate: Qualitative Results
With DropTail and bursty TCP (most widely
deployed combination), loss synchronization
rate is very low;
TCP Pacing increases loss synchronization rate;
RED increases loss synchronization rate.
Loss Sync. Rate: Asymptotic Result
If the number of flows N is large: L >> wi
TCP:
λi = ( L + wi − 1 ) / ( cd + B + L ) ≈ L / ( cd + B + L )
Very weak dependency of loss sync rate on window size: all flows see the same loss.
TCP Pacing:
λi = 1 − ( 1 − L / ( cd + B + L ) )^wi ≈ wi·L / ( cd + B + L )
Loss sync rate is proportional to window size: rich guys see more loss.
Asymptotic Result: MatLab Computation
cd+B = 1080; L = N/2; N varies
Fair share window size: cd+B/N
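A quick computation in the same spirit (cd+B = 1080, L = N/2; N = 200 is my choice for illustration) shows how a flow holding twice the fair-share window fares under each formula:

```python
# Sync-rate scaling with window size when L >> w_i: compare a flow at
# the fair-share window with one at twice that window.
cdB, N = 1080, 200
L = N // 2                          # 100 drops per loss event
n = cdB + L
w = cdB // N                        # fair-share window: 5 packets
tcp = lambda wi: (L + wi - 1) / n
pacing = lambda wi: 1 - (1 - L / n) ** wi
print(f"TCP:    {tcp(2 * w) / tcp(w):.2f}x")      # ~1.05x: same loss for all
print(f"pacing: {pacing(2 * w) / pacing(w):.2f}x")  # ~1.6x: rich see more loss
```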
Implications
1. Scalable TCP is (usually) unfair with bursty TCP
2. TCP is unfriendly to TFRC
3. …
Fairness of Scalable TCP
For each RTT without a loss:
wi(t+1) = αwi(t); α = 1.01
For each RTT with a loss (loss event):
wi(t+1) = βwi(t); β = 0.875
[Chiu,Jain’90]: MIMD algorithms cannot converge to fairness under the synchronization model
[Kelly’03]: Scalable TCP (MIMD) converges to fairness in theory under the fluid model
[Wei, Jin, Low’06][Li,Leith,Shorten’07]: Scalable TCP is unfair in experiments
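The Chiu&Jain argument can be seen in miniature: if every flow sees every loss event (full synchronization), the MIMD updates above multiply both windows by identical factors, so their ratio never changes (the loss timing below is illustrative, not from the talk):

```python
# Two Scalable-TCP-style MIMD flows under fully synchronized losses:
# x1.01 per loss-free RTT, x0.875 on every loss event, for both flows.
a, b = 1.01, 0.875
w1, w2 = 10.0, 100.0                # start 10x apart
for rtt in range(1000):
    if (rtt + 1) % 20 == 0:         # a loss event that both flows observe
        w1, w2 = b * w1, b * w2
    else:
        w1, w2 = a * w1, a * w2
print(round(w2 / w1, 6))            # still 10.0: no convergence to fairness
```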
Fairness of Scalable TCP: Chiu vs Kelly
[Chiu,Jain’90]: MIMD is not fair
Assumption:
loss event rate is independent of
window size (simplified synchronization model)
[Kelly’03]: Scalable TCP (MIMD) is fair
Assumption:
loss event rate is proportional to
window size (fluid model)
Fairness of Scalable TCP: Chiu vs Kelly
[Chiu,Jain’90]: MIMD is not fair
Assumption: loss event rate is independent of window size (simplified synchronization model)
Sync. rate model: holds with many bursty TCP flows
[Kelly’03]: Scalable TCP is fair
Assumption: loss event rate is proportional to window size (fluid model)
Sync. rate model: holds only with very few bursty TCP flows or with paced TCP flows
Scalable TCP: simulations
Capacity = 100Mbps; delay = 200ms; buffer size: BDP; MTU = 1500; N varies; rates averaged over a 600-second run
Gap 2: Fairness of Scalable TCP
Analysis: “Scalable TCP is fair in homogeneous network” [Kelly’03]
Analysis: “MIMD in general is unfair.” [Chiu&Jain’90] → Scalable TCP is unfair.
Reason: sub-RTT burstiness leads to similar loss sync. rates for different flows
Experiment: in most cases, Scalable TCP is unfair in homogeneous network.
TFRC vs TCP
[Figure: during the RTT of the loss event, TCP's w packets form one contiguous burst while TFRC's packets are spread over the entire RTT; the L dropped packets form one contiguous burst]
TCP:
λi = ( L + wi − 1 ) / ( cd + B + L )
TFRC (same as Pacing):
λi = 1 − ( 1 − L / ( cd + B + L ) )^wi
TFRC vs TCP: simulation
Gap 3: TCP vs TFRC
Analysis: “We designed TCP Friendly Rate
Control (TFRC) algorithm to have the same
equilibrium as TCP when they co-exist.”
Reason: sub-RTT burstiness
leads to different loss sync.
rate for TFRC and TCP
Experiment: TCP flows do not fairly coexist
with TFRC flows.
Sub-RTT Burstiness: Summary
[Figure: on-off rate pattern x(t) within each RTT]
Effects:
Low loss sync. rate with DropTail routers
Poor convergence
MIMD unfairness
TFRC unfriendliness
Possible solutions
Eliminate sub-RTT burstiness:
Pacing
Randomize loss signal: RED
Persistent loss signal: ECN
Outline
Motivation
Overview of microscopic behavior
Stability of delay-based congestion control algorithms
Fairness of loss-based congestion control algorithms
Future work
Future: a research framework on microscopic Internet behavior
Experiment tools: help to observe, analyze and validate microscopic behavior in the Internet: WAN-in-Lab, NS-2 TCP-Linux, …
Theoretic models: more accurate models to capture the dynamics of the Internet at microscopic timescales.
New algorithms: new algorithms that utilize and control microscopic Internet behavior.
NS-2 TCP-Linux
The first tool that can run a congestion control algorithm directly from Linux source code, with the same simulation speed (sometimes even faster)
[Figure: NS-2 Simulator ↔ Linux implementation]
700+ local downloads (2400+ tutorial visits worldwide)
5+ Linux kernel fixes
2+ papers
Outreach: BIC/Cubic-TCP (NCSU), H-TCP (Hamilton), TCP Westwood (UCLA/Politecnico di Bari), A-Reno (NEC), …
Thank you!
Q&A