Microscopic Behavior of TCP Congestion Control
Microscopic Behavior of Internet Control
Xiaoliang (David) Wei
NetLab, CS&EE
California Institute of Technology
Internet Control
Problem -> solution -> understanding ->
1986: First Internet Congestion Collapse
[Timeline: 1986, 1989, 1995, 1999, 2003, …]
Internet Control
Problem -> solution -> understanding ->
1986: First Internet Congestion Collapse
1988~1990: TCP-Tahoe; DEC-bit
[Timeline: 1986, 1989, 1995, 1999, 2003, …]
Internet Control
Problem -> solution -> understanding ->
1986: First Internet Congestion Collapse
1988~1990: TCP Tahoe; DEC-bit
1993~1995: Tri-S, DUAL, TCP-Vegas
[Timeline: 1986, 1989, 1995, 1999, 2003, …]
Outline
Motivation
Overview of microscopic behavior
Stability of delay-based congestion control algorithms
Fairness of loss-based congestion control algorithms
Future work
Summary
Outline
Motivation
Overview of microscopic behavior
Stability of delay-based congestion control algorithms
Fairness of loss-based congestion control algorithms
Future work
Macroscopic View of TCP Control
TCP/AQM: a feedback control system
[Figure: TCP senders 1, 2, … send at rates xi(t) into a bottleneck of capacity c with queue q(t); forward delay τF, backward delay τB; acks return to the senders]
TCP (Reno, Vegas, FAST):
x'_i(t) = F( x_i(t), q(t − τB) )
AQM (DropTail / RED / delay / ECN):
q'(t) = G( q(t), Σ_i x_i(t − τF) − c )
Fluid Models
x'_i(t) = F( x_i(t), q(t − τB) )
q'(t) = G( q(t), Σ_i x_i(t − τF) − c )
Assumptions:
TCP algorithms directly control the transmission rates;
The transmission rates are differentiable (smooth);
Each TCP packet observes the same congestion price (loss, delay or ECN).
Methodology based on Fluid Models
x'_i(t) = F( x_i(t), q(t − τB) )
q'(t) = G( q(t), Σ_i x_i(t − τF) − c )
Equilibrium:
Efficiency? Fairness?
Dynamics:
Stability? Responsiveness?
Gap 1: Stability of TCP Vegas
Analysis: “TCP Vegas is stable if (and only if)
the number of flows is large, and capacity is
small, and delay is small.”
Experiment: a single TCP Vegas flow is stable
with arbitrary delay and capacity.
Gap 2: Fairness of Scalable TCP
Analysis: “Scalable TCP is fair in homogeneous network” [Kelly’03]
Analysis: MIMD is unfair [Chiu&Jain’90] → Scalable TCP is unfair.
Experiment: in most cases, Scalable TCP is unfair in homogeneous network.
Gap 3: TCP vs TFRC
Analysis: “We designed TCP Friendly Rate
Control (TFRC) algorithm to have the same
equilibrium as TCP when they co-exist.”
Experiment: TCP flows do not fairly coexist
with TFRC flows.
Gaps
Stability: TCP-Vegas
Fairness: Scalable TCP
Friendliness: TCP vs
TFRC
Current analytical
models ignore
microscopic
behavior in TCP
congestion control
Outline
Motivation
Overview of Microscopic behavior
Stability of Delay-based Congestion Control
Algorithms
Fairness of Loss-based Congestion control
algorithms
Future works
Microscopic View (Packet level)
Two timescales:
On each RTT -- TCP congestion control algorithm;
On each ack arrival -- ack-clocking:
p--;
while (p < w(t) ) do
Send a packet
p++;
(p: number of packets in flight)
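The ack-clocking rule above can be sketched in a few lines of Python (illustrative; the function name and setup are mine, not from the talk):

```python
# Minimal sketch of the ack-clocking loop: p counts packets in flight;
# packets are sent only when an ack arrives, but a window jump
# releases several packets back-to-back at the same instant.

def on_ack(p, w, sent_log, now):
    """One ack arrives: p--, then send while p < w(t)."""
    p -= 1
    while p < w:
        sent_log.append(now)  # back-to-back sends at one instant: micro burst
        p += 1
    return p

sent = []
p = 1                 # one packet in flight whose ack is about to arrive
p = on_ack(p, w=5, sent_log=sent, now=0.0)  # window raised 0 -> 5
print(p, sent)        # 5 in flight; all 5 sent at t = 0.0 (a micro burst)
```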
W: 0 -> 5
[Figure sequence: a sender releases packets 1–5 through a bottleneck of capacity C to a receiver; each frame plots the sending rate x(t) over time]
Packets queue in the bottleneck.
Packets leave the bottleneck at rate c.
Acknowledgments return at rate c.
New packets are sent at rate c.
No queue in the 2nd round trip: no need to control the rate x(t)!
Two Flows
[Figure sequence: two flows (TCP1 → Rcv1, TCP2 → Rcv2) share a bottleneck of capacity C; their packets interleave in the first round trip, then the returning acks cluster by flow]
On-off pattern for each flow: within every RTT, each flow's packets are sent in one sub-RTT burst followed by a silent period.
Sub-RTT Burstiness: NS-2 Measurement
Two levels of burstiness:
[Figure: rate x(t) over time; a micro-burst spike at the start of an RTT, then on-off sub-RTT bursts]
Micro burst: pulse function; input rate >> c; extra queue & loss; transient.
Sub-RTT burstiness: on-off function; input rate <= c; no extra queue & loss; persistent.
Microscopic Effects: known
Micro burst:
Loss-based TCP: low throughput with small buffer; pacing improves throughput (clearly understood)
Delay-based TCP: noise to the delay signal, should be eliminated (partially understood)
Sub-RTT burstiness: observed in Internet traffic (“Why do we care?”)
Microscopic Effects: new
Micro burst:
Loss-based TCP: low throughput with small buffer; pacing improves throughput (clearly understood)
Delay-based TCP: fast convergence in queuing delay and better stability
Sub-RTT burstiness:
Loss-based TCP: low loss synchronization rate with DropTail routers
Delay-based TCP: no effect
New Understandings
Micro burst with delay-based TCP: fast queue convergence
1. A single TCP-Vegas flow is always stable, regardless of delay and capacity.
Sub-RTT burstiness and loss-based TCP: low loss sync rate
1. Scalable TCP is (usually) unfair;
2. TCP is unfriendly to TFRC.
Outline
Motivation
Overview of microscopic behavior
Stability of delay-based congestion control algorithms
Fairness of loss-based congestion control algorithms
Future work
New Understandings
Micro burst with delay-based TCP: fast queue convergence
1. A single TCP-Vegas flow is always stable, regardless of delay and capacity.
Sub-RTT burstiness and loss-based TCP: low loss sync rate
1. Scalable TCP is (usually) unfair;
2. TCP is unfriendly to TFRC.
A packet level model: basis
Ack-clocking: on each ack arrival
p--;
while (p < w(t) ) do
Send a packet
p++; (p: number of packets in flight)
Packets can only be sent upon arrival of an acknowledgment;
A micro burst of packets can be sent in a single instant;
Window size w(t) can be an arbitrary given process.
A packet level model: variables
Ack-clocking: on each ack arrival
p--;
while (p < w(t) ) do
Send a packet
p++; (p: number of packets in flight)
pj : Number of packets in flight when j is sent;
sj : sending time of packet j
bj : backlog experienced by packet j
aj : ack arrival time of packet j
A packet level model: variables
[Figure sequence: sender, bottleneck C, receiver; data packets and returning acks illustrate each variable in turn]
pj : number of packets in flight when packet j is sent
sj : sending time of packet j
bj : backlog experienced by packet j
aj : ack arrival time of packet j
A packet level model: variables
Ack-clocking: on each ack arrival
p--;
while (p < w(t) ) do
Send a packet
p++; (p: number of packets in flight)
p_j = max{ p_{j−1} − k + 1 : 0 ≤ k ≤ p_{j−1}, p_{j−1} − k + 1 ≤ w( a_{j−1−p_{j−1}+k} ) }
s_j = a_{j−p_j}
k : number of acks arriving between s_{j−1} and s_j
p_j : number of packets in flight when j is sent
s_j : sending time of packet j
a_{j−p_j} : ack arrival time of the packet sent one RTT ago
A packet level model: variables
Ack-clocking: on each ack arrival
p--;
while (p < w(t) ) do
Send a packet
p++; (p: number of packets in flight)
k : number of acks arriving between s_{j−1} and s_j. For example, with k = 0:
p_j = max{ p_{j−1} − k + 1 : … } = p_{j−1} + 1
s_j = a_{j−p_j} = a_{j−p_{j−1}−1} = a_{(j−1)−p_{j−1}} = s_{j−1}
A packet level model: variables
[Figure: packets j−1 and j arrive at the bottleneck; c(s_j − s_{j−1}) packets drain from the queue between the two arrivals]
b_j = max{ b_{j−1} + 1 − c( s_j − s_{j−1} ), 0 }
a_j = s_j + d + b_j / c
b_j : experienced backlog
c : bottleneck capacity
a_j : ack arrival time
d : propagation delay
A packet level model
p_j = max{ p_{j−1} − k + 1 : 0 ≤ k ≤ p_{j−1}, p_{j−1} − k + 1 ≤ w( a_{j−1−p_{j−1}+k} ) }
s_j = a_{j−p_j}
b_j = max{ b_{j−1} + 1 − c( s_j − s_{j−1} ), 0 }
a_j = s_j + d + b_j / c
p_j : number of packets in flight when j is sent
s_j : sending time of packet j
b_j : backlog experienced by packet j
a_j : ack arrival time of packet j
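For a constant window w (so p_j = w once the first w packets are out, and s_j = a_{j−w}), the recursions can be iterated directly; a sketch under those stated assumptions, not the talk's code:

```python
# Iterate the packet-level model for a constant window:
#   s_j = a_{j-w}                              (packet j released by an ack)
#   b_j = max(b_{j-1} + 1 - c*(s_j - s_{j-1}), 0)
#   a_j = s_j + d + b_j / c
c, d, w = 10.0, 1.0, 15        # capacity (pkt/time), prop. delay, window (pkts)
n = 200
s, b, a = [0.0] * n, [0.0] * n, [0.0] * n
prev_s, prev_b = 0.0, 0.0
for j in range(n):
    s[j] = 0.0 if j < w else a[j - w]   # first w packets leave as one micro burst
    b[j] = max(prev_b + 1 - c * (s[j] - prev_s), 0.0)
    a[j] = s[j] + d + b[j] / c
    prev_s, prev_b = s[j], b[j]
print(b[-1])    # backlog settles at w - c*d = 5 packets: fast queue convergence
```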
Ack-clocking: quick sending process
Theorem: For any time s_j at which a packet j is sent, there is always a packet j* := j*(j) s.t.
s_j = s_{j*}
p_{j*} = w( s_j )
The number of packets in flight at any packet sending time is synchronized with the congestion window.
[Figure: w(t) and p(t) coincide at sending times]
Ack-clocking: fast queue convergence
Theorem: If p_k ≥ cd for all k : j − p_j ≤ k ≤ j, then
p_j = cd + b_j
The queue converges instantly if the window size is larger than the BDP over the entire previous RTT.
[Figure: w(t) and q(t) over time]
Window Control and Ack-clocking
Per-RTT window control:
makes a decision once every RTT,
with the measurement from the latest acknowledgement (a subsequence of sequence numbers k1, k2, k3, …)
[Figure: w(t) and p(t); control decisions at sending times s_k1, s_k2, s_k3 using acks a_k1, a_k2]
Stability of TCP Vegas
Theorem: Given the packet level model, if αd > 1, a single TCP Vegas flow converges to equilibrium with arbitrary capacity c and propagation delay d. That is, there exists a sequence number J such that for all j ≥ J:
cd + αd − 1 ≤ w( s_j ) ≤ cd + αd + 1
αd − 1 ≤ b_j ≤ αd + 1
Stability of Vegas : 100-flow simulation
Stability of Vegas : Avg Window Size
Window Oscillation:
1 packet
Stability of Vegas : Queue Size
Queue oscillation: 100 packets
(because the 100 flows are synchronized)
Gap 1: Stability of TCP Vegas
Analysis: “TCP Vegas is stable if (and only if)
the number of flows is large, and capacity is
small, and delay is small.”
Reason: micro burst
leads to fast queue
convergence
Experiment: a single TCP Vegas flow is stable
with arbitrary delay and capacity.
FAST : stable and responsive
Designed based on the intuition that the queue is directly a function of the congestion window size.
A FAST flow does the following every other RTT:
w(t+1) = (1/2) [ w(t) + d·w(t) / ( d + b_j / c ) + α ]
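A rough sanity check of the reconstructed update, using the single-flow micro-burst result b = max(w − cd, 0) from the previous section (variable names and parameter values are mine):

```python
# Iterate the reconstructed FAST update
#   w <- (1/2) [ w + d*w / (d + b/c) + alpha ]
# with the backlog taken from the fast-queue-convergence result.
c, d, alpha = 100.0, 0.1, 20.0   # capacity, propagation delay, target queue (pkts)
w = 1.0
for _ in range(100):
    b = max(w - c * d, 0.0)      # backlog converges within one RTT
    w = 0.5 * (w + d * w / (d + b / c) + alpha)
print(round(w, 3))               # fixed point: w = c*d + alpha = 30.0
```

Once w exceeds the BDP, d·w/(d + b/c) equals cd exactly, so the error halves every update: the responsiveness the slide claims.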
FAST : stability
Theorem: Given the packet level model,
homogeneous FAST flows converge to
equilibrium regardless of capacity c and
propagation delay d and number of flows N.
[Tang, Jacobsson, Andrew, Low’07]: FAST is
stable with single bottleneck link regardless of
capacity c and propagation delay d and
number of flows N. (With an extended fluid
model capturing microburst effects)
Micro-burst: Summary
[Figure: rate x(t) over time; a micro burst at the start of an RTT]
Effects:
Fast queue convergence
Stability of homogeneous Vegas for arbitrary delay
Possibility of very responsive & stable TCP control
Stability of FAST for arbitrary delay
Outline
Motivation
Overview of microscopic behavior
Stability of delay-based congestion control algorithms
Fairness of loss-based congestion control algorithms
Future work
New Understandings
Micro burst with delay-based TCP: fast queue convergence
1. A single (homogeneous) TCP-Vegas flow is always stable, regardless of delay and capacity.
Sub-RTT burstiness and loss-based TCP: low loss sync rate
1. Scalable TCP is (usually) unfair;
2. TCP is unfriendly to TFRC.
Loss Synchronization Rate: Definition
Loss Synchronization Rate [Baccelli,Hong’02]:
The probability that a flow observes a packet
loss during a congestion event.
Congestion event (loss event):
A round-trip time interval in which at least
one packet is dropped by the bottleneck
router due to congestion (buffer overflow at
router)
Loss Synchronization Rate: Effects
Intuitions:
Individual flow: the smaller the better (selfishness)
System design: the higher the better (for fairness and
convergence)
Theoretic Results:
Aggregate throughput [Baccelli,Hong’02]
Instantaneous fairness [Baccelli,Hong’02]
Fairness convergence [Shorten, Wirth, Leith’06]
Loss Sync. Rate: Existing Models
[Shorten, Wirth, Leith’06]: no model; sync rates are measured from NS-2 and fed into a model for computational results.
[Baccelli,Hong’02]: assumes each packet has the same probability of being dropped in the loss event.
Packet loss is bursty: Internet
~50% of losses happen in bursts
[Figure: incoming packets during the RTT of a loss event, from all flows; the L dropped packets cluster in the burst period of the loss signal]
In each loss event (one RTT), the packet loss process is an on-off process.
Data packet process is bursty: on-off
[Figure: incoming packets during the RTT of a loss event; flow i's w packets form one contiguous burst; the rate plot x(t) is on-off within each RTT]
In each loss event (one RTT), the TCP data packet process is an on-off process.
Loss Sync. Rate: A Sampling Perspective
[Figure: flow i's burst of w packets and the loss signal's burst of L dropped packets, both within the incoming packets of the loss-event RTT]
Loss sync. rate: the efficiency of a (bursty) TCP data process in sampling the loss signal in a (bursty) loss process.
Assumption 1: Within the RTT of the loss event, the position of an individual flow’s burst is uniformly distributed.
Assumption 2: The loss process does not depend on the data packet process of individual flows.
Loss Sync. Rate Case 1: TCP+DropTail
[Figure: during the RTT of the loss event, flow i's wi packets form one contiguous burst; the L dropped packets form another contiguous burst]
λi = ( L + wi − 1 ) / ( cd + B + L )
wi : window of a TCP flow
L : number of dropped packets
cd+B+L : number of packets going through the bottleneck in the loss event ( c : capacity, d : propagation delay; B : buffer size)
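The formula can be sanity-checked with a small Monte Carlo experiment under Assumption 1 (the circular packet layout is a simplification I introduce, not the talk's):

```python
# Monte Carlo check of lambda_i = (L + w_i - 1) / (cd + B + L):
# a burst of w contiguous packets, uniformly placed, overlaps the burst
# of L contiguous drops with probability (L + w - 1) / n.
import random

random.seed(1)
cdB, w, L = 1080, 60, 16
n = cdB + L                      # packets through the bottleneck + drops
trials, hits = 100_000, 0
for _ in range(trials):
    start = random.randrange(n)  # uniform position of the flow's burst
    if any((start + k) % n < L for k in range(w)):  # overlaps drops 0..L-1?
        hits += 1
print(hits / trials, (L + w - 1) / n)   # both come out near 0.068
```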
Loss Sync. Rate: TCP+DropTail
Loss Sync. Rate Case 2: Pacing+DropTail
[Figure: flow i's wi packets are distributed over the entire RTT of the loss event; the L dropped packets form one contiguous burst]
λi = 1 − ( 1 − L / ( cd + B + L ) )^wi
wi : window of a TCP flow
L : number of dropped packets
cd+B+L : number of packets going through the bottleneck in the loss event
Loss Sync. Rate: Pacing + DropTail
Loss Sync. Rate Case 3: TCP+RED
[Figure: flow i's wi packets form one contiguous burst; packet losses are distributed over the entire RTT of the loss event]
λi = 1 − ( 1 − wi / ( cd + B + L ) )^L
wi : window of a TCP flow
L : number of dropped packets
cd+B+L : number of packets going through the bottleneck in the loss event
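Plugging the numbers from the MatLab computation slide (cd+B = 1080, wi = 60, L = 16) into the three reconstructed formulas shows the contrast numerically:

```python
# The three sync-rate formulas side by side with the slide's parameters.
cdB, w, L = 1080, 60, 16
n = cdB + L
tcp_droptail = (L + w - 1) / n            # bursty data, bursty losses
pacing_droptail = 1 - (1 - L / n) ** w    # spread data, bursty losses
tcp_red = 1 - (1 - w / n) ** L            # bursty data, spread losses
print(f"{tcp_droptail:.3f} {pacing_droptail:.3f} {tcp_red:.3f}")
# bursty TCP + DropTail: ~0.07; pacing or RED: ~0.6 (far more synchronized)
```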
Model for Loss Sync. Rate: General form
[Figure: cd+B incoming packets during the RTT of the loss event; flow i's burst period spans Ki incoming packets; L packets are randomly dropped from the M incoming packets of the loss signal's burst period]
cd+B : number of packets going through the bottleneck in the loss event ( c : capacity, d : propagation delay; B : buffer size)
wi : window of a TCP flow in the loss event
L : number of dropped packets in the loss event
Ki : length of burst period of flow i (in pkt)
M : length of burst period of loss process (in pkt)
λi = ?
Loss Sync. Rate: MatLab Computation
cd+B = 1080; wi = 60; L = 16; K , M vary
Measurement: TCP + DropTail
Averaged sync. Rate
cd+B = 3340
M =L = N/2
K = w = (cd+B)/N
Measurement: Pacing + DropTail
Averaged sync. Rate
cd+B = 3340
M =L = N/2
K = w = (cd+B)/N
Measurement: TCP + RED
Averaged sync. Rate
cd+B = 3340
M =L = N/2
K = w = (cd+B)/N
Loss Sync. Rate: Qualitative Results
With DropTail and bursty TCP (most widely
deployed combination), loss synchronization
rate is very low;
TCP Pacing increases loss synchronization rate;
RED increases loss synchronization rate.
Loss Sync. Rate: Asymptotic Result
If the number of flows N is large: L >> wi
TCP:
λi = ( L + wi − 1 ) / ( cd + B + L ) ≈ L / ( cd + B + L )
Very weak dependency of loss sync rate on window size: all flows see the same loss.
TCP Pacing:
λi = 1 − ( 1 − L / ( cd + B + L ) )^wi ≈ wi·L / ( cd + B + L )
Loss sync rate is proportional to window size: rich guys see more loss.
Asymptotic Result: MatLab Computation
cd+B = 1080; L = N/2; N varies
Fair share window size: cd+B/N
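A quick computation in the same spirit (cd+B = 1080, L = N/2; N = 200 is my choice for illustration) shows how a flow holding twice the fair-share window fares under each formula:

```python
# Sync-rate scaling with window size when L >> w_i: compare a flow at
# the fair-share window with one at twice that window.
cdB, N = 1080, 200
L = N // 2                          # 100 drops per loss event
n = cdB + L
w = cdB // N                        # fair-share window: 5 packets
tcp = lambda wi: (L + wi - 1) / n
pacing = lambda wi: 1 - (1 - L / n) ** wi
print(f"TCP:    {tcp(2 * w) / tcp(w):.2f}x")      # ~1.05x: same loss for all
print(f"pacing: {pacing(2 * w) / pacing(w):.2f}x")  # ~1.6x: rich see more loss
```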
Implications
1. Scalable TCP is (usually) unfair with bursty TCP
2. TCP is unfriendly to TFRC
3. …
Fairness of Scalable TCP
For each RTT without a loss:
wi(t+1) = αwi(t); α = 1.01
For each RTT with a loss (loss event):
wi(t+1) = βwi(t); β = 0.875
[Chiu,Jain’90]: MIMD algorithms cannot converge to fairness under the synchronization model
[Kelly’03]: Scalable TCP (MIMD) converges to fairness in theory under the fluid model
[Wei, Jin, Low’06][Li,Leith,Shorten’07]: Scalable TCP is unfair in experiments
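The Chiu&Jain argument can be seen in miniature: if every flow sees every loss event (full synchronization), the MIMD updates above multiply both windows by identical factors, so their ratio never changes (the loss timing below is illustrative, not from the talk):

```python
# Two Scalable-TCP-style MIMD flows under fully synchronized losses:
# x1.01 per loss-free RTT, x0.875 on every loss event, for both flows.
a, b = 1.01, 0.875
w1, w2 = 10.0, 100.0                # start 10x apart
for rtt in range(1000):
    if (rtt + 1) % 20 == 0:         # a loss event that both flows observe
        w1, w2 = b * w1, b * w2
    else:
        w1, w2 = a * w1, a * w2
print(round(w2 / w1, 6))            # still 10.0: no convergence to fairness
```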
Fairness of Scalable TCP: Chiu vs Kelly
[Chiu,Jain’90]: MIMD is not fair
Assumption:
loss event rate is independent of
window size (simplified synchronization model)
[Kelly’03]: Scalable TCP (MIMD) is fair
Assumption:
loss event rate is proportional to
window size (fluid model)
Fairness of Scalable TCP: Chiu vs Kelly
[Chiu,Jain’90]: MIMD is not fair
Assumption: loss event rate is independent of window size (simplified synchronization model)
Sync. rate model: holds with many bursty TCP flows
[Kelly’03]: Scalable TCP is fair
Assumption: loss event rate is proportional to window size (fluid model)
Sync. rate model: holds only with very few bursty TCP flows or with paced TCP flows
Scalable TCP: simulations
Capacity = 100Mbps; delay = 200ms; buffer size: BDP; MTU = 1500; N varies; rates averaged over a 600-second run
Gap 2: Fairness of Scalable TCP
Analysis: “Scalable TCP is fair in homogeneous network” [Kelly’03]
Analysis: “MIMD in general is unfair.” [Chiu&Jain’90] → Scalable TCP is unfair.
Reason: sub-RTT burstiness leads to similar loss sync. rates for different flows
Experiment: in most cases, Scalable TCP is unfair in homogeneous network.
TFRC vs TCP
[Figure: during the RTT of the loss event, TCP's w packets form one contiguous burst while TFRC's packets are spread over the entire RTT; the L dropped packets form one contiguous burst]
TCP:
λi = ( L + wi − 1 ) / ( cd + B + L )
TFRC (same as Pacing):
λi = 1 − ( 1 − L / ( cd + B + L ) )^wi
TFRC vs TCP: simulation
Gap 3: TCP vs TFRC
Analysis: “We designed TCP Friendly Rate
Control (TFRC) algorithm to have the same
equilibrium as TCP when they co-exist.”
Reason: sub-RTT burstiness
leads to different loss sync.
rate for TFRC and TCP
Experiment: TCP flows do not fairly coexist
with TFRC flows.
Sub-RTT Burstiness: Summary
[Figure: on-off rate pattern x(t) within each RTT]
Effects:
Low loss sync. rate with DropTail routers
Poor convergence
MIMD unfairness
TFRC unfriendliness
Possible solutions
Eliminate sub-RTT burstiness:
Pacing
Randomize loss signal: RED
Persistent loss signal: ECN
Outline
Motivation
Overview of microscopic behavior
Stability of delay-based congestion control algorithms
Fairness of loss-based congestion control algorithms
Future work
Future: a research framework on microscopic Internet behavior
Experiment tools: help to observe, analyze and validate microscopic behavior in the Internet: WAN-in-Lab, NS-2 TCP-Linux, …
Theoretic models: more accurate models to capture the dynamics of the Internet at microscopic timescales.
New algorithms: new algorithms that utilize and control microscopic Internet behavior.
NS-2 TCP-Linux
The first tool that can run a congestion control algorithm directly from Linux source code, with the same simulation speed (sometimes even faster)
[Figure: NS-2 Simulator ↔ Linux implementation]
700+ local downloads (2400+ tutorial visits worldwide)
5+ Linux kernel fixes
2+ papers
Outreach: BIC/Cubic-TCP (NCSU), H-TCP (Hamilton), TCP Westwood (UCLA/Politecnico di Bari), A-Reno (NEC), …
Thank you!
Q&A