TCP Flow Control Tutorial - Welcome | Computer Science


Equilibrium & Dynamics
of TCP/AQM
Steven Low
CS & EE, Caltech
netlab.caltech.edu
Sigcomm
August 2001
Acknowledgments
S. Athuraliya, D. Lapsley, V. Li, Q. Yin
(UMelb)
S. Adlakha (UCLA), J. Doyle (Caltech), F.
Paganini (UCLA), J. Wang (Caltech)
L. Peterson, L. Wang (Princeton)
Matthew Roughan (AT&T Labs)
Outline
Introduction
TCP/AQM algorithms
Duality model
Equilibrium of TCP/AQM
Linear dynamic model
Dynamics of TCP/AQM
A scalable control
Schedule
1:30 – 2:30   TCP/AQM algorithms
2:30 – 3:00   Duality model (equilibrium)
3:00 – 3:30   Break
3:30 – 4:00   Duality model (equilibrium)
4:00 – 4:30   Linear model (dynamic)
4:30 – 5:00   Scalable control
Part 0
Introduction
TCP/IP Protocol Stack
Applications (e.g. Telnet, HTTP)
TCP
UDP
IP
ICMP
ARP
Link Layer (e.g. Ethernet, ATM)
Physical Layer (e.g. Ethernet, SONET)
Packet Terminology
Application Message: split into MSS-sized chunks
TCP Segment: TCP hdr (20 bytes) + TCP data (up to MSS)
IP Packet: IP hdr (20 bytes) + IP data
Ethernet Frame: Ethernet hdr (14 bytes) + Ethernet data + trailer (4 bytes)
Ethernet MTU: 1500 bytes
Success of IP
Simple/Robust
[Hourglass: applications (WWW, Email, Napster, FTP, …) above IP; transmission technologies (Ethernet, ATM, POS, WDM, …) below]
- Robustness against failure
- Robustness against technological evolutions
- Provides a service to applications
- Doesn't tell applications what to do
Quality of Service
 Can we provide QoS with simplicity?
 Not with current TCP…
 … but we can fix it!
IETF
 Internet Engineering Task Force
 Standards organisation for Internet
 Publishes RFCs - Requests For Comment
 standards track: proposed, draft, Internet standard
 non-standards track: experimental, informational
 best current practice
 poetry/humour (RFC 1149: Standard for the transmission of
IP datagrams on avian carriers)
 TCP should obey RFC
 no means of enforcement
 some versions have not followed RFC
 http://www.ietf.org/index.html
Simulation
 ns-2:
http://www.isi.edu/nsnam/ns/index.html
 Wide variety of protocols
 Widely used/tested
 SSFNET:
http://www.ssfnet.org/homePage.html
 Scalable to very large networks
 Care should be taken in simulations!
 Multiple independent simulations
 confidence intervals
 transient analysis – make sure simulations are long enough
 Wide parameter ranges
 All simulations involve approximation
Other Tools
tcpdump
Get packet headers from real network traffic
tcpanaly (V.Paxson, 1997)
Analysis of TCP sessions from tcpdump
traceroute
Find routing of packets
RFC 2398
http://www.caida.org/tools/
Part I
Algorithms
Outline
 Introduction
 TCP/AQM algorithms
 Window flow control
 Source algorithm: Tahoe, Reno, Vegas
 Link algorithm: RED, REM
 Duality model
 Equilibrium of TCP/AQM
 Linear dynamic model
 Dynamics of TCP/AQM
 A scalable control
Early TCP
Pre-1988
Go-back-N ARQ
Detects loss from timeout
Retransmits from lost packet onward
Receiver window flow control
Prevent overflows at receive buffer
Flow control: self-clocking
Why Flow Control?
October 1986, Internet had its first
congestion collapse
Link LBL to UC Berkeley
400 yards, 3 hops, 32 Kbps
throughput dropped to 40 bps
factor of ~1000 drop!
1988, Van Jacobson proposed TCP flow
control
Window Flow Control
[Figure: source sends a window of packets 1, 2, …, W each RTT; returning ACKs from the destination clock out the next window]
~ W packets per RTT
Lost packet detected by missing ACK
Source Rate
Limit the number of packets in the network to window W
Source rate = (W × MSS / RTT) bps
If W too small then rate « capacity
If W too big then rate > capacity
=> congestion
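The window–rate relation above can be checked with a short calculation; the MSS and RTT values below are illustrative choices, not from the tutorial:

```python
def source_rate_bps(W, MSS, RTT):
    """Window-limited TCP rate: at most W segments of MSS bytes per RTT."""
    return W * MSS * 8 / RTT

# Example: 20-segment window, 1460-byte MSS, 100 ms RTT -> ~2.3 Mbps
rate = source_rate_bps(20, MSS=1460, RTT=0.1)
```

Doubling W doubles the rate, which is why cwnd adaptation (below) is the sender's rate control knob.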
Effect of Congestion
- Packet loss
- Retransmission
- Reduced throughput
- Congestion collapse due to
  - Unnecessarily retransmitted packets
  - Undelivered or unusable packets
- Congestion may continue after the overload!
[Figure: throughput vs load — throughput collapses past the knee]
Congestion Control
TCP seeks to
- Achieve high utilization
- Avoid congestion
- Share bandwidth
Window flow control
- Source rate = W/RTT packets/sec
- Adapt W to network (and conditions)
- Ideally W = BW × RTT
TCP Window Flow Controls
 Receiver flow control
 Avoid overloading receiver
 Set by receiver
 awnd: receiver (advertised) window
 Network flow control
 Avoid overloading network
 Set by sender
 Infer available network capacity
 cwnd: congestion window
 Set W = min (cwnd, awnd)
Receiver Flow Control
Receiver advertises awnd with each ACK
Window awnd
closed when data is received and ack’d
opened when data is read
Size of awnd can be the performance limit
(e.g. on a LAN)
sensible default ~16kB
Network Flow Control
Source calculates cwnd from indication of
network congestion
Congestion indications
Losses
Delay
Marks
Algorithms to calculate cwnd
Tahoe, Reno, Vegas, RED, REM …
Outline
 Introduction
 TCP/AQM algorithms
 Window flow control
 Source algorithm: Tahoe, Reno, Vegas
 Link algorithm: RED, REM
 Duality model
 Equilibrium of TCP/AQM
 Linear dynamic model
 Dynamics of TCP/AQM
 A scalable control
TCP Congestion Control
 Tahoe (Jacobson 1988)
 Slow Start
 Congestion Avoidance
 Fast Retransmit
 Reno (Jacobson 1990)
 Fast Recovery
 Vegas (Brakmo & Peterson 1994)
 New Congestion Avoidance
 RED (Floyd & Jacobson 1993)
 Probabilistic marking
 REM (Athuraliya & Low 2000)
 Clear buffer, match rate
 Others…
Variants
Tahoe & Reno
NewReno
SACK
Rate-halving
Mod.s for high performance
AQM
RED, ARED, FRED, SRED
BLUE, SFB
REM, PI
TCP Tahoe
(Jacobson 1988)
window
SS
CA
SS: Slow Start
CA: Congestion Avoidance
time
Slow Start
Start with cwnd = 1 (slow start)
On each successful ACK increment cwnd
cwnd ← cwnd + 1
Exponential growth of cwnd
each RTT: cwnd ← 2 × cwnd
Enter CA when cwnd ≥ ssthresh
Slow Start
[Figure: sender/receiver timeline — cwnd grows 1, 2, 4, 8 over successive RTTs]
cwnd ← cwnd + 1 (for each ACK)
Congestion Avoidance
Starts when cwnd ≥ ssthresh
On each successful ACK:
cwnd ← cwnd + 1/cwnd
Linear growth of cwnd
each RTT: cwnd ← cwnd + 1
Congestion Avoidance
[Figure: sender/receiver timeline — cwnd grows 1, 2, 3, 4 over successive RTTs]
cwnd ← cwnd + 1 (for each cwnd ACKs)
Packet Loss
Assumption: loss indicates congestion
Packet loss detected by
- Retransmission TimeOuts (RTO timer)
- Duplicate ACKs (at least 3)
[Figure: packets 1–7 sent, packet 4 lost — the receiver keeps ACKing 3, producing duplicate ACKs]
Fast Retransmit
Waiting for a timeout is quite long
Retransmit immediately after 3 dupACKs,
without waiting for timeout
Adjust ssthresh
flightsize = min(awnd, cwnd)
ssthresh ← max(flightsize/2, 2)
Enter Slow Start (cwnd = 1)
Successive Timeouts
When there is a timeout, double the RTO
Keep doing so for each lost retransmission
- Exponential back-off
- Max 64 seconds¹
- Max 12 retransmits¹
¹ Net/3 BSD
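The back-off schedule above can be sketched directly; the caps follow the Net/3 BSD figures on the slide, while the 1 s initial RTO is an illustrative assumption:

```python
def rto_schedule(initial_rto=1.0, max_rto=64.0, max_retransmits=12):
    """Timeouts under exponential back-off: the RTO doubles after each
    lost retransmission, capped at max_rto, for at most max_retransmits
    attempts (64 s / 12 retransmits per Net/3 BSD)."""
    rtos = []
    rto = initial_rto
    for _ in range(max_retransmits):
        rtos.append(rto)
        rto = min(2 * rto, max_rto)
    return rtos

# rto_schedule() -> [1, 2, 4, 8, 16, 32, 64, 64, 64, 64, 64, 64]
```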
Summary: Tahoe
- Basic ideas
  - Gently probe network for spare capacity
  - Drastically reduce rate on congestion
  - Windowing: self-clocking
  - Other functions: round trip time estimation, error recovery

for every ACK {
    if (W < ssthresh) then W++   (SS)
    else W += 1/W                (CA)
}
for every loss {
    ssthresh = W/2
    W = 1
}
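The pseudocode above translates into two tiny update rules; tracing them from W = 1 with an assumed ssthresh of 8 shows the exponential-then-linear shape:

```python
def tahoe_ack(W, ssthresh):
    """Per-ACK update from the pseudocode above."""
    if W < ssthresh:
        return W + 1          # slow start: one more packet per ACK
    return W + 1.0 / W        # congestion avoidance: ~one packet per RTT

def tahoe_loss(W):
    """Per-loss update: halve the threshold, restart from W = 1."""
    return 1.0, W / 2

# From W = 1 with ssthresh = 8: seven ACKs of slow start reach W = 8,
# after which growth turns linear (W += 1/W per ACK)
W, ss = 1.0, 8.0
for _ in range(7):
    W = tahoe_ack(W, ss)
```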
TCP Tahoe
(Jacobson 1988)
window
SS
CA
SS: Slow Start
CA: Congestion Avoidance
time
TCP Reno
(Jacobson 1990)
window
SS
time
CA
SS: Slow Start
CA: Congestion Avoidance
Fast retransmission/fast recovery
Fast recovery
- Motivation: prevent 'pipe' from emptying after fast retransmit
- Idea: each dupACK represents a packet having left the pipe (successfully received)
- Enter FR/FR after 3 dupACKs
  - Set ssthresh ← max(flightsize/2, 2)
  - Retransmit lost packet
  - Set cwnd ← ssthresh + ndup (window inflation)
  - Wait till W = min(awnd, cwnd) is large enough; transmit new packet(s)
  - On non-dup ACK (1 RTT later), set cwnd ← ssthresh (window deflation)
  - Enter CA
Example: FR/FR
[Figure: sender transmits packets 1–8; packet 1 is lost, dupACKs (ACK 0) arrive; the sender retransmits 1 and sends 9–11 during FR/FR; cwnd inflates from 8 with ssthresh = 4, then deflates to 4 on the non-dup ACK, exiting FR/FR]
Fast retransmit
Retransmit on 3 dupACKs
Fast recovery
Inflate window while repairing loss to fill pipe
Summary: Reno
Basic ideas
- Fast recovery avoids slow start
- dupACKs: fast retransmit + fast recovery
- Timeout: fast retransmit + slow start
[State diagram: congestion avoidance → (dupACKs) → retransmit + FR/FR; congestion avoidance → (timeout) → retransmit + slow start]
NewReno: Motivation
[Figure: sender/receiver timeline — multiple losses in one window leave 8 unack'd packets after FR/FR, and the sender stalls until a timeout]
- On 3 dupACKs, receiver has packets 2, 4, 6, 8, cwnd = 8; sender retransmits pkt 1, enters FR/FR
- Next dupACK increments cwnd to 9
- After a RTT, ACK arrives for pkts 1 & 2; sender exits FR/FR with cwnd = 5, 8 unack'd pkts
- No more ACKs, so sender must wait for timeout
NewReno
Fall & Floyd '96 (RFC 2582)
- Motivation: multiple losses within a window
  - Partial ACK acknowledges some but not all packets outstanding at start of FR
  - Partial ACK takes Reno out of FR, deflates window
  - Sender may have to wait for timeout before proceeding
- Idea: partial ACK indicates lost packets
  - Stays in FR/FR and retransmits immediately
  - Retransmits 1 lost packet per RTT until all lost packets from that window are retransmitted
  - Eliminates timeout
SACK
Mathis, Mahdavi, Floyd, Romanow ’96 (RFC 2018, RFC 2883)
 Motivation: Reno & NewReno retransmit at most
1 lost packet per RTT
 Pipe can be emptied during FR/FR with multiple losses
 Idea: SACK provides better estimate of packets
in pipe
 SACK TCP option describes received packets
 On 3 dupACKs: retransmits, halves window, enters FR
 Updates pipe = packets in pipe
 Increment when lost or new packets sent
 Decrement when dupACK received
 Transmits a (lost or new) packet when pipe < cwnd
 Exit FR when all packets outstanding when FR was
entered are acknowledged
Outline
 Introduction
 TCP/AQM algorithms
 Window flow control
 Source algorithm: Tahoe, Reno, Vegas
 Link algorithm: RED, REM
 Duality model
 Equilibrium of TCP/AQM
 Linear dynamic model
 Dynamics of TCP/AQM
 A scalable control
TCP Reno
(Jacobson 1990)
window
SS
time
CA
SS: Slow Start
CA: Congestion Avoidance
Fast retransmission/fast recovery
TCP Vegas
(Brakmo & Peterson 1994)
window
SS
CA
 Converges, no retransmission
 … provided buffer is large enough
time
Vegas CA algorithm
for every RTT {
    if W/RTTmin – W/RTT < a then W++
    if W/RTTmin – W/RTT > b then W--
}
for every loss
    W := W/2
(W/RTTmin – W/RTT estimates this flow's backlog — its queue size — in the network)
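The per-RTT rule above can be sketched as follows; the RTT values and the thresholds a, b are illustrative assumptions (in the same rate units as W/RTT):

```python
def vegas_update(W, base_rtt, rtt, a=1.0, b=3.0):
    """One per-RTT Vegas update from the pseudocode above.
    diff = W/RTTmin - W/RTT is the expected minus the actual rate."""
    diff = W / base_rtt - W / rtt
    if diff < a:
        return W + 1          # little queued: claim more bandwidth
    if diff > b:
        return W - 1          # too much queued: back off
    return W                  # inside the [a, b] band: hold W steady

# base RTT 10 ms, measured RTT 12.5 ms:
# W = 20 -> diff 0.4 < a, so W grows; W = 200 -> diff 4 > b, so W shrinks
```

Unlike Tahoe/Reno, the window settles once the backlog sits between a and b, which is why Vegas can converge without inducing losses.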
Implications
Congestion measure = end-to-end queueing
delay
At equilibrium
Zero loss
Stable window at full utilization
Approximately weighted proportional fairness
Nonzero queue, larger for more sources
Convergence to equilibrium
Converges if sufficient network buffer
Oscillates like Reno otherwise
Outline
 Introduction
 TCP/AQM algorithms
 Window flow control
 Source algorithm: Tahoe, Reno, Vegas
 Link algorithm: RED, REM
 Duality model
 Equilibrium of TCP/AQM
 Linear dynamic model
 Dynamics of TCP/AQM
 A scalable control
RED
(Floyd & Jacobson 1993)
- Idea: warn sources of incipient congestion by probabilistically marking/dropping packets
- Link algorithm to work with source algorithm (Reno)
- Bonus: desynchronization
  - Prevent bursty loss with buffer overflows
[Figure: router with buffer B; marking probability rises from 0 to 1 with average queue length]
RED
Implementation
- Probabilistically drop packets
- Probabilistically mark packets
- Marking requires ECN bit (RFC 2481)
Performance
- Desynchronization works well
- Extremely sensitive to parameter setting
- Fails to prevent buffer overflow as #sources increases
Variant: ARED
(Feng, Kandlur, Saha, Shin 1999)
Motivation: RED extremely sensitive to
#sources
Idea: adapt maxp to load
If avg. queue < minth, decrease maxp
If avg. queue > maxth, increase maxp
No per-flow information needed
Variant: FRED
(Ling & Morris 1997)
 Motivation: marking packets in proportion to flow
rate is unfair (e.g., adaptive vs unadaptive flows)
 Idea
 A flow can buffer up to minq packets without being
marked
 A flow that frequently buffers more than maxq packets
gets penalized
 All flows with backlogs in between are marked according
to RED
 No flow can buffer more than avgcq packets persistently
 Need per-active-flow accounting
Variant: SRED
(Ott, Lakshman & Wong 1999)
Motivation: wild oscillation of queue in RED when load changes
Idea:
- Estimate number N of active flows
- An arriving packet is compared with a randomly chosen active flow; N ~ 1/Prob(hit)
- cwnd ~ p^(-1/2) and N p^(-1/2) = Q0 implies p = (N/Q0)²
- Marking prob = m(q) min(1, p)
No per-flow information needed
Variant: BLUE
(Feng, Kandlur, Saha, Shin 1999)
Motivation: wild oscillation of RED leads to
cyclic overflow & underutilization
Algorithm
On buffer overflow, increment marking prob
On link idle, decrement marking prob
Variant: SFB
- Motivation: protection against nonadaptive flows
- Algorithm
  - L hash functions map a packet to L bins (out of N×L)
  - Marking probability associated with each bin is
    - Incremented if bin occupancy exceeds threshold
    - Decremented if bin occupancy is 0
  - Packets marked with min {p1, …, pL}
[Figure: hash functions h1 … hL map adaptive and nonadaptive flows into bins; a nonadaptive flow drives its bins' marking probabilities to 1]
Variant: SFB
Idea
A nonadaptive flow drives marking prob to 1
at all L bins it is mapped to
An adaptive flow may share some of its L bins
with nonadaptive flows
Nonadaptive flows can be identified and
penalized
REM
Athuraliya & Low 2000
Main ideas
- Decouple congestion & performance measure
- Price adjusted to match rate and clear buffer
- Marking probability exponential in 'price'
[Figure: link marking probability vs link congestion measure — REM's exponential curve vs RED's piecewise-linear curve in the average queue]
Part II
Models
Congestion Control
Heavy tail → mice-elephants
- Elephants need efficient & fair sharing → congestion control (TCP)
- Mice need small delay (queueing + propagation) → CDN
TCP & AQM
pl(t)
xi(t)
Example congestion measure pl(t)
 Loss (DropTail)
 Queue length (RED)
 Queueing delay (Vegas)
 Price (REM)
TCP & AQM
[Feedback loop: sources set rates xi(t), links set congestion measures pl(t)]
Duality theory → equilibrium
- Source rates xi(t) are primal variables
- Congestion measures pl(t) are dual variables
- Congestion control is an optimization process over the Internet

TCP & AQM
Control theory → stability & robustness
- Internet is a gigantic feedback system
- Distributed & delayed
Outline
Introduction
TCP/AQM algorithms
Duality model
Equilibrium of TCP/AQM
Linear dynamic model
Dynamics of TCP/AQM
A scalable control
Overview: equilibrium
Interaction of source rates xs(t) and
congestion measures pl(t)
Duality theory
 They are primal and dual variables
 Flow control is optimization process
Example congestion measure
 Loss (Reno)
 Queueing delay (Vegas)
Overview: equilibrium
- Congestion control problem
      max_{x≥0} Σs Us(xs)   subject to   Σ_{s∈S(l)} xs ≤ cl,  l ∈ L
- Primal-dual algorithm
      x(t+1) = F(p(t), x(t))      (Reno, Vegas)
      p(t+1) = G(p(t), x(t))      (DropTail, RED, REM)
- TCP/AQM protocols (F, G)
  - Maximize aggregate source utility
  - With different utility functions Us(xs)
Overview: Reno
- Equilibrium characterization
      (1 − qi)/ti² = xi² qi / 2,   i.e.   xi = (1/ti) √(2(1 − qi)/qi)
- Duality →
      Ui_reno(xi) = (√2/ti) arctan(xi ti / √2)
- Congestion measure p = loss
- Implications
  - Reno equalizes window wi = ti xi
  - Rate inversely proportional to delay ti
  - 1/√p dependence for small p
  - DropTail fills queue, regardless of queue capacity
Overview: AQM
Decouple congestion & performance measure
[Figure: queue traces]
- DropTail: queue = 94% full
- RED: min_th = 10 pkts, max_th = 40 pkts, max_p = 0.1
- REM: queue = 1.5 pkts, utilization = 92% (g = 0.05, a = 0.4, f = 1.15)
Overview: equilibrium
 DropTail
 High utilization
 Fills buffer, maximize queueing delay
 RED
 Couples queue length with congestion measure
 High utilization or low delay
 REM
 Decouple performance & congestion measure
 Match rate, clear buffer
 Sum prices
Overview: Vegas
- Delay
  - Congestion measure: end-to-end queueing delay qs
  - Sets rate xs(t) = as ds / qs(t)
  - Equilibrium condition: Little's Law
- Loss
  - No loss if converge (with sufficient buffer)
  - Otherwise: revert to Reno (loss unavoidable)
- Fairness
  - Utility function Us_vegas(xs) = as ds log xs
  - Proportional fairness
Model
Network
- Links l of capacities cl
Sources s
- L(s): links used by source s
- Us(xs): utility if source rate = xs
[Example: two links of capacities c1, c2; x1 uses both, x2 uses link 1, x3 uses link 2, so x1 + x2 ≤ c1 and x1 + x3 ≤ c2]

Primal problem
      max_{xs≥0} Σs Us(xs)   subject to   Σ_{s∈S(l)} xs ≤ cl,  l ∈ L
Assumptions
- Strictly concave increasing Us
→ Unique optimal rates xs exist
→ Direct solution impractical
Prior Work
 Formulation
 Kelly 1997
 Penalty function approach
 Kelly, Maulloo and Tan 1998
 Kunniyur and Srikant 2000
 Duality approach
 Low and Lapsley 1999
 Athuraliya and Low 2000, Low 2000
 Extensions
 Mo & Walrand 1998
 La & Anantharam 2000
Example
      max_{xs≥0} Σs log xs   subject to   x1 + x2 ≤ 1,  x1 + x3 ≤ 1
- Lagrange multipliers: p1 = p2 = 3/2
- Optimal: x1 = 1/3, x2 = x3 = 2/3
[Network: two unit-capacity links; x1 traverses both, x2 the first, x3 the second]
Example
- xs: proportionally fair
- pl: Lagrange multiplier, (shadow) price, congestion measure
- How to compute (x, p)?
- Relevance to TCP/AQM ??
Example
- xs: proportionally fair (Vegas)
- pl: Lagrange multiplier, (shadow) price, congestion measure
- How to compute (x, p)?
  - Gradient algorithms, Newton algorithm, primal-dual algorithms…
- Relevance to TCP/AQM ??
  - TCP/AQM protocols implement primal-dual algorithms over Internet

Links update prices from aggregate rate:
      p1(t+1) = [p1(t) + g(x1(t) + x2(t) − 1)]+
      p2(t+1) = [p2(t) + g(x1(t) + x3(t) − 1)]+
Sources update rates from path prices:
      x1(t+1) = 1/(p1(t) + p2(t));   x2(t+1) = 1/p1(t);   x3(t+1) = 1/p2(t)
Example
- xs: proportionally fair (Vegas)
- pl: Lagrange multiplier, (shadow) price, congestion measure
- How to compute (x, p)?
  - Gradient algorithms, Newton algorithm, primal-dual algorithms…
- Relevance to TCP/AQM ??
  - TCP/AQM protocols implement primal-dual algorithms over Internet
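The two-link example can be solved numerically by the dual gradient iteration: for log utility U'(x) = 1/x, each source sets its rate to the reciprocal of its path price, and each link moves its price with its excess demand. The step size g and iteration count here are illustrative choices:

```python
# Dual gradient iteration for the example above:
#   max log x1 + log x2 + log x3  s.t.  x1 + x2 <= 1,  x1 + x3 <= 1
g = 0.01
p1 = p2 = 1.0
for _ in range(50000):
    x1 = 1.0 / (p1 + p2)          # source 1 uses both links
    x2 = 1.0 / p1                 # source 2 uses link 1 only
    x3 = 1.0 / p2                 # source 3 uses link 2 only
    p1 = max(0.0, p1 + g * (x1 + x2 - 1.0))   # price tracks excess rate
    p2 = max(0.0, p2 + g * (x1 + x3 - 1.0))

# Converges to the optimum on the slide: x1 = 1/3, x2 = x3 = 2/3, p = 3/2
```

No link needs to know the utility functions, and no source needs to know the topology — the decentralization that makes the TCP/AQM interpretation possible.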
Duality Approach
Primal:   max_{xs≥0} Σs Us(xs)   subject to   Σ_{s∈S(l)} xs ≤ cl,  l ∈ L
Dual:     min_{p≥0} D(p) = Σs max_{xs≥0} [ Us(xs) − xs Σ_{l∈L(s)} pl ] + Σl pl cl

Primal-dual algorithm:
      x(t+1) = F(p(t), x(t))
      p(t+1) = G(p(t), x(t))
Duality Model of TCP
- TCP iterates on rates (windows)
- AQM iterates on congestion measures
- With different utility functions
Primal-dual algorithm:
      x(t+1) = F(p(t), x(t))      (Reno, Vegas)
      p(t+1) = G(p(t), x(t))      (DropTail, RED, REM)
Summary
- Congestion control problem
      max_{x≥0} Σs Us(xs)   subject to   Σ_{s∈S(l)} xs ≤ cl,  l ∈ L
- Primal-dual algorithm
      x(t+1) = F(p(t), x(t))      (Reno, Vegas)
      p(t+1) = G(p(t), x(t))      (DropTail, RED, REM)
- TCP/AQM protocols (F, G)
  - Maximize aggregate source utility
  - With different utility functions Us(xs)
(F, G, U) model
Derivation
Derive (F, G) from protocol description
Fixed point (x, p) of (F, G) gives equilibrium
Derive U
regard fixed point as Kuhn-Tucker condition
Application: equilibrium properties
 Performance
Throughput, loss, delay, queue length
 Fairness, friendliness
Outline
Introduction
TCP/AQM algorithms
Duality model (F, G, U)
Queue management G : RED, REM
TCP F and U : Reno, Vegas
Linear dynamic model
Dynamics of TCP/AQM
A scalable control
Active queue management
Idea: provide congestion information by
probabilistically marking packets
Issues
How to measure congestion (p and G)?
How to embed congestion measure?
How to feed back congestion info?
x(t+1) = F( p(t), x(t) )
p(t+1) = G( p(t), x(t) )
Reno, Vegas
DropTail, RED, REM
RED
(Floyd & Jacobson 1993)
- Congestion measure: average queue length
      pl(t+1) = [pl(t) + xl(t) - cl]+
- Embedding: p-linear probability function
[Figure: marking probability rises linearly from 0 to 1 with average queue]
- Feedback: dropping or ECN marking
REM
(Athuraliya & Low 2000)
- Congestion measure: price
      pl(t+1) = [pl(t) + g(al bl(t) + xl(t) - cl)]+
- Embedding: exponential probability function
[Figure: link marking probability vs link congestion measure — exponential curve]
- Feedback: dropping or ECN marking
Key features
- Clear buffer and match rate
      pl(t+1) = [pl(t) + g( al bl(t) + x̂l(t) - cl )]+
                            (clear buffer)  (match rate)
- Sum prices
      marking prob = 1 − f^(−pl(t)) per link, so end-to-end
      marking prob = 1 − f^(−Σl pl(t)) = 1 − f^(−ps(t))
Theorem (Paganini 2000)
Global asymptotic stability for general utility function (in the absence of delay)
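The two key features can be sketched in a few lines; the parameter values are the ones quoted on the slides (g = 0.05, a = 0.4, f = 1.15):

```python
def rem_price(p, b, x, c, gamma=0.05, alpha=0.4):
    """REM price update pl <- [pl + g(al*bl + xl - cl)]+ from the slide.
    The price keeps rising while there is backlog (b > 0) or excess
    input rate (x > c), so equilibrium requires x = c AND b = 0."""
    return max(0.0, p + gamma * (alpha * b + x - c))

def mark_prob(price, phi=1.15):
    """Exponential embedding: m = 1 - phi**(-price)."""
    return 1.0 - phi ** (-price)

# Because marking is exponential in price, surviving two links multiplies
# pass-through probabilities phi**(-p1) * phi**(-p2) = phi**(-(p1+p2)),
# so the end-to-end mark probability depends only on the SUM of prices:
m_path = 1 - (1 - mark_prob(2.0)) * (1 - mark_prob(3.0))   # == mark_prob(5.0)
```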
Active Queue Management
            pl(t)    G(p(t), x(t))
DropTail    loss     [1 - cl/xl(t)]+  (?)
RED         queue    [pl(t) + xl(t) - cl]+
Vegas       delay    [pl(t) + xl(t)/cl - 1]+
REM         price    [pl(t) + g(al bl(t) + xl(t) - cl)]+

x(t+1) = F( p(t), x(t) )      (Reno, Vegas)
p(t+1) = G( p(t), x(t) )      (DropTail, RED, REM)
Congestion & performance
            pl(t)    G(p(t), x(t))
DropTail    loss     [1 - cl/xl(t)]+  (?)
RED         queue    [pl(t) + xl(t) - cl]+
Vegas       delay    [pl(t) + xl(t)/cl - 1]+
REM         price    [pl(t) + g(al bl(t) + xl(t) - cl)]+

- Decouple congestion & performance measure
  - RED: 'congestion' = 'bad performance'
  - REM: 'congestion' = 'demand exceeds supply'
    But performance remains good!
Outline
Introduction
TCP/AQM algorithms
Duality model (F, G, U)
Queue management G : RED, REM
TCP F and U : Reno, Vegas
Linear dynamic model
Dynamics of TCP/AQM
A scalable control
Utility functions
- Reno:   Ui_reno(xi) = (√2/ti) arctan( xi ti / √2 )
- Vegas:  Ui_vegas(xi) = ai di log xi
Reno: F
for every ack (ca) {
    W += 1/W
}
for every loss {
    W := W/2
}

ACKs arrive at rate xs(t)(1 − p(t)), each adding 1/ws; losses arrive at rate xs(t) p(t), each halving ws:
      ws(t+1) = ws(t) + xs(t)(1 − p(t))/ws(t) − xs(t) p(t) ws(t)/2
In rate terms:
      Fs(p(t), x(t)) = xs(t) + (1 − p(t))/Ds² − xs²(t) p(t)/2

Primal-dual algorithm:
x(t+1) = F( p(t), x(t) )      (Reno, Vegas)
p(t+1) = G( p(t), x(t) )      (DropTail, RED, REM)
Implications
- Equilibrium characterization
      (1 − qi)/ti² = xi² qi / 2,   i.e.   xi = (1/ti) √(2(1 − qi)/qi)
- Duality →
      Ui_reno(xi) = (√2/ti) arctan( xi ti / √2 )
- Congestion measure p = loss
- Implications
  - Reno equalizes window wi = ti xi
  - Rate inversely proportional to delay ti
  - 1/√p dependence for small p
  - DropTail fills queue, regardless of queue capacity
Validation - Reno
 30 sources, 3 groups with RTT = 3, 5, 7ms + 6ms (queueing delay)
Link capacity = 64 Mbps, buffer = 50 kB
 Measured windows equalized, match well with theory (black line)
Validation – Reno/RED
 30 sources, 3 groups with RTT = 3, 5, 7 ms + 6 ms (queueing delay)
 Link capacity = 64 Mbps, buffer = 50 kB
Validation – Reno/REM
 30 sources, 3 groups with RTT = 3, 5, 7 ms
 Link capacity = 64 Mbps, buffer = 50 kB
 Smaller window due to small RTT (~0 queueing delay)
Queue
[Figure: queue traces]
- DropTail: queue = 94% full; p increasing in queue!
- RED (min_th = 10 pkts, max_th = 40 pkts, max_p = 0.1): p increasing in queue!
- REM (g = 0.05, a = 0.4, f = 1.15): queue = 1.5 pkts, utilization = 92%; p = Lagrange multiplier, decoupled from queue
Summary
 DropTail
 High utilization
 Fills buffer, maximize queueing delay
 RED
 Couples queue length with congestion measure
 High utilization or low delay
 REM
 Decouple performance & congestion measure
 Match rate, clear buffer
 Sum prices
Gradient algorithm
- Gradient algorithm
      source:  xi(t+1) = Ui'⁻¹( qi(t) )
      link:    pl(t+1) = [ pl(t) + g( yl(t) − cl ) ]+
Theorem (ToN '99)
Converges to optimal rates in asynchronous environment
Reno & gradient algorithm
- Gradient algorithm
      source:  xi(t+1) = Ui'⁻¹( qi(t) )
      link:    pl(t+1) = [ pl(t) + g( yl(t) − cl ) ]+
- TCP approximate version of gradient algorithm
      Fi(qi(t), xi(t)) = xi(t) + (1 − qi(t))/ti² − xi²(t) qi(t)/2
  which can be rewritten as
      xi(t+1) = xi(t) + (qi(t)/2) ( x̂i²(t) − xi²(t) ),   x̂i(t) = Ui'⁻¹( qi(t) )
Outline
Introduction
TCP/AQM algorithms
Duality model (F, G, U)
Queue management G : RED, REM
TCP F and U : Reno, Vegas
Linear dynamic model
Dynamics of TCP/AQM
A scalable control
Vegas
for every RTT {
    if W/RTTmin – W/RTT < a then W++
    if W/RTTmin – W/RTT > a then W--
}
for every loss
    W := W/2

F:
      xs(t+1) = xs(t) + 1/Ds²   if ws(t) − ds xs(t) < as ds
      xs(t+1) = xs(t) − 1/Ds²   if ws(t) − ds xs(t) > as ds
      xs(t+1) = xs(t)           else
G (queueing delay at the link):
      pl(t+1) = [ pl(t) + xl(t)/cl − 1 ]+
Implications
Performance
Rate, delay, queue, loss
Interaction
Fairness
TCP-friendliness
Utility function
Performance
Delay
- Congestion measure: end-to-end queueing delay qs
- Sets rate xs(t) = as ds / qs(t)
- Equilibrium condition: Little's Law
Loss
- No loss if converge (with sufficient buffer)
- Otherwise: revert to Reno (loss unavoidable)
Vegas Utility
Equilibrium (x, p) = fixed point of (F, G) →
      Us_vegas(xs) = as ds log xs
Proportional fairness
Vegas & Gradient Algorithm
- Basic algorithm
      source:  xs(t+1) = Us'⁻¹( p(t) )
- TCP smoothed version of Basic Algorithm …
- Vegas
      xs(t+1) = xs(t) + 1/Ds²   if xs(t) < x̂s(t)
      xs(t+1) = xs(t) − 1/Ds²   if xs(t) > x̂s(t)
      xs(t+1) = xs(t)           else
  where x̂s(t) = Us'⁻¹( p(t) )
Validation
                      Source 1       Source 3       Source 5
RTT (ms)              17.1 (17)      21.9 (22)      41.9 (42)
Rate (pkts/s)         1205 (1200)    1228 (1200)    1161 (1200)
Window (pkts)         20.5 (20.4)    27 (26.4)      49.8 (50.4)
Avg backlog (pkts)    9.8 (10)

measured (theory)
Single link, capacity = 6 pkts/ms
5 sources with different propagation delays, as = 2 pkts/RTT
Persistent congestion
 Vegas exploits buffer process to compute prices
(queueing delays)
 Persistent congestion due to
 Coupling of buffer & price
 Error in propagation delay estimation
 Consequences
 Excessive backlog
 Unfairness to older sources
Theorem
A relative error of es in propagation delay estimation distorts the utility function to
      Ûs(xs) = (1 + es) as ds log xs + es ds xs
Evidence
Without estimation error
With estimation error
 Single link, capacity = 6 pkt/ms, as = 2 pkts/ms, ds = 10 ms
 With finite buffer: Vegas reverts to Reno
Evidence
Source rates (pkts/ms), measured (theory):
#    src1          src2          src3          src4          src5
1    5.98 (6)
2    2.05 (2)      3.92 (4)
3    0.96 (0.94)   1.46 (1.49)   3.54 (3.57)
4    0.51 (0.50)   0.72 (0.73)   1.34 (1.35)   3.38 (3.39)
5    0.29 (0.29)   0.40 (0.40)   0.68 (0.67)   1.30 (1.30)   3.28 (3.34)

#    queue (pkts)    baseRTT (ms)
1    19.8 (20)       10.18 (10.18)
2    59.0 (60)       13.36 (13.51)
3    127.3 (127)     20.17 (20.28)
4    237.5 (238)     31.50 (31.50)
5    416.3 (416)     49.86 (49.80)
Vegas/REM
- To preserve Vegas utility function & rates
- Vegas:      xs = as ds / qs,   qs = end-to-end queueing delay
- Vegas/REM:  xs = as ds / ps,   ps = end-to-end price
- REM
  - Clear buffer: estimate of ds
  - Sum prices: estimate of ps
- Vegas/REM algorithm
      xs(t+1) = xs(t) + 1/Ds²   if xs(t) < x̂s(t)
      xs(t+1) = xs(t) − 1/Ds²   if xs(t) > x̂s(t)
      xs(t+1) = xs(t)           else
Performance
[Figure: queue traces for Vegas and Vegas/REM — peak = 43 pkts, utilization 90%–96%]
Conclusion
Duality model of TCP: (F, G, U)
      x(t+1) = F( p(t), x(t) )      (Reno, Vegas)
      p(t+1) = G( p(t), x(t) )      (DropTail, RED, REM)
- Maximize aggregate utility
- With different utility functions
- Decouple congestion & performance
  - Match rate, clear buffer
  - Sum prices
Food for thought
How to tailor utility to application?
Choosing congestion control automatically
fixes utility function
Can use utility function to determine
congestion control
Outline
Introduction
TCP/AQM algorithms
Duality model (F, G, U)
Equilibrium of TCP/AQM
Linear dynamic model
Stability of TCP/AQM
A scalable control
Model assumptions
- Small marking probabilities
- End-to-end marking probability
      1 − Πl (1 − pl) ≈ Σl pl
- Congestion avoidance dominates
- Receiver not limiting
- Decentralized
  - TCP algorithm depends only on end-to-end measure of congestion
  - AQM algorithm depends only on local & aggregate rate or queue
- Constant (equilibrium) RTT
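The small-probability approximation in the assumptions above is easy to check numerically; the per-link probabilities here are illustrative:

```python
from math import prod

# End-to-end marking probability vs its small-p approximation:
#   1 - prod(1 - pl)  ~  sum(pl)   when every pl is small
p = [0.01, 0.02, 0.005]
exact = 1 - prod(1 - pl for pl in p)   # ~0.03470
approx = sum(p)                        # 0.035
```

The exact value is always slightly below the sum (union bound), and the gap is second order in the pl, which is what justifies treating the path price as a sum of link prices.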
Dynamics
Small effect on queue: AIMD, mice traffic, heterogeneity
Big effect on queue: stability!
Stable: 20ms delay
[Figure: individual windows (pkts) vs time (ms) — windows settle to steady values]
Ns-2 simulations, 50 identical FTP sources, single link 9 pkts/ms, RED marking
Stable: 20ms delay
[Figure: individual and average windows vs time; instantaneous queue (pkts) vs time — both settle to steady values]
Ns-2 simulations, 50 identical FTP sources, single link 9 pkts/ms, RED marking
Unstable: 200ms delay
[Figure: individual windows vs time (10ms units) — windows oscillate wildly]
Ns-2 simulations, 50 identical FTP sources, single link 9 pkts/ms, RED marking
Unstable: 200ms delay
[Figure: individual and average windows vs time; instantaneous queue vs time — sustained oscillation between empty and full buffer]
Ns-2 simulations, 50 identical FTP sources, single link 9 pkts/ms, RED marking
Other effects on queue
[Figure: instantaneous queue traces — 20ms delay (avg delay 16ms) with and without 50% noise traffic, and 200ms delay (avg delay 208ms) with and without 50% noise; the 200ms oscillation persists in all cases]
Effect of instability
- Larger jitters in throughput
- Lower utilization
[Figure: mean and variance of individual window vs delay (ms), no noise — variance grows sharply with delay while the mean stays roughly flat]
Linearized system
[Block diagram: TCP sources F1 … FN map path prices q to rates x; forward routing Rf(s) gives link rates y; AQM links G1 … GL map y to prices p; delayed backward routing Rb'(s) returns q]
      Fi(qi(t), xi(t)) = xi(t) + (1 − qi(t))/ti² − xi²(t) qi(t)/2
      Gl(pl(t), yl(t)):  G̃l(zl(t), pl(t), yl(t))
Linearized system
[Block diagram as above]
Loop gain L(s) is a product of one factor per block:
- TCP window dynamics (gain set by equilibrium rate xn* and loss pn*)
- queue integrator 1/(ti s)
- RED queue-averaging low-pass (pole at ac)
- round-trip delay e^(−tn s)
Stability condition
Theorem
TCP/RED stable if c³t³|H| is small enough — the bound is set by the RED parameters a and b
[Nyquist plot of h(v, theta): instability sets in at the critical frequency w0 where the locus approaches −1]
Stability condition
Theorem (as above)
Small a:
- Slow response
- Large delay
[Nyquist plot of h(v, theta)]
Stability condition
Theorem
1 b
3 3
TCP/RED stable if c t | H | 
a
10
Nyquist plot of h(v, theta)
4.5
2
Stability condition
x 10
N=200
c=50 pkts/ms
4
0
LHS of stability condition
3.5
Im
-2
-4
-6
3
2.5
c=40
2
c=10
1.5
c=30
1
-8
0.5
-10
-10
-5
0
5
Re
10
15
0
20
c=20
30
40
50
60
delay (ms)
70
80
90
100
Validation
[Figures: critical frequency (Hz), model vs. NS, and round trip propagation delay at critical frequency (ms), model vs. NS; static-link and dynamic-link models, 30 data points]
Stability region
Unstable for
 Large delay
 Large capacity
 Small load
[Figure: round trip propagation delay at critical frequency (ms) vs. capacity (pkts/ms), for N = 20, 30, 40, 60]
Role of AQM
(Sufficient) stability condition:  K ≤ ( co{ H(ω; τ) } )⁻¹
TCP:  K = c²τ²/2N,  H = e^(−jωτ)/(jωτ + p*w*)
AQM: scale down K & shape H
RED:  (αcτ/(1−β)) · e^(−jωτ)/(jωτ·(jωτ + αcτ))   Problem!!
REM/PI:  α₂τ · e^(−jωτ)·(jωτ + α₁τ/α₂)/(jωτ)²
Queue dynamics (windowing)
Scalable control??
[Diagram: link prices pl(t) fed back to source rates xi(t)]
Scalable control: stability
REM: utilization, delay
Outline
Introduction
TCP/AQM algorithms
Duality model (F, G, U)
Equilibrium of TCP/AQM
Linear dynamic model
Stability of TCP/AQM
A scalable control
Delay compensation
 Equilibrium
 High utilization: integrator at links
 Low loss & delay: VQ, REM/PI (HMTG01a)
 Stability
 Integrator + network delay always unstable for high τ!
 Delay invariance
 Scale down gain by τ (known at sources)
 This is self-clocking!
[Diagram: source gain a/(τs) → network delay e^(−τs) → link integrator 1/s]
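The delay-invariance idea above can be seen in a toy simulation (not from the slides): an integrator controller with feedback delay, p'(t) = −k·p(t−τ), is unstable once k·τ exceeds π/2, but scaling the gain down by τ keeps k·τ fixed for any delay. All constants here are illustrative.

```python
# Toy sketch: Euler integration of p'(t) = -k * p(t - tau).
def peak_after_transient(k, tau, T=100.0, dt=0.01):
    n = int(round(tau / dt))          # delay expressed in steps
    hist = [1.0] * (n + 1)            # history: p(t) = 1 for t <= 0
    p, peak = 1.0, 0.0
    steps = int(T / dt)
    for i in range(steps):
        p += dt * (-k * hist[-(n + 1)])   # Euler step using delayed p(t - tau)
        hist.append(p)
        if i >= steps - int(10.0 / dt):   # track |p| over the last 10 s
            peak = max(peak, abs(p))
    return peak

tau = 2.0
unscaled = peak_after_transient(k=1.0, tau=tau)      # k*tau = 2.0 > pi/2: blows up
scaled = peak_after_transient(k=1.0 / tau, tau=tau)  # k*tau = 1.0 < pi/2: settles
print(unscaled > 10, scaled < 1e-3)  # -> True True
```

Because the source knows its own round-trip time, dividing its gain by τ needs no extra signalling, which is why the slide calls it self-clocking.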
Compensation for capacity
Speed of adaptation increases with
Link capacity
#bottleneck links in path
Capacity invariance
Scale down by capacity cl at links
Scale up by rate xi(t) at sources
[Diagram: gain 1/cl at links, rate xi at sources]
Scalable control
(Paganini, Doyle, Low ’01)
TCP:  xi(t) = x̄i · exp(−αi qi(t) / (τi mi))
AQM:  ṗl(t) = (1/cl)·(yl(t) − cl)
Theorem (Paganini, Doyle, Low 2001)
Provided R is full rank, the feedback loop is locally
stable for arbitrary delay, capacity, load and
topology
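A minimal sketch of these two laws on a single link with a single source and no feedback delay (the theorem covers arbitrary delay; x̄, α, τ, m and the step size here are illustrative choices): the source damps its rate exponentially in the price, the link integrates excess rate scaled by 1/c, and the rate converges to capacity.

```python
from math import exp, log

def run(xbar=2.0, c=1.0, alpha=1.0, tau=1.0, m=1.0, dt=0.01, T=100.0):
    """Single source over a single link: price p doubles as the source's q."""
    p = 0.0
    for _ in range(int(T / dt)):
        x = xbar * exp(-alpha * p / (tau * m))   # TCP: xi = x̄i e^(-αi q/(τi mi))
        p = max(0.0, p + dt * (x - c) / c)       # AQM: ṗ = (y - c)/c, price >= 0
    return x, p

x, p = run()
# At equilibrium x = c, so p* = (τ m/α)·ln(x̄/c) = ln 2 here.
print(round(x, 4), round(p, 4))  # -> 1.0 0.6931
```

The link update matches rate to capacity (not queue to a target), which is what drives both high utilization and an empty buffer at equilibrium.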
Utility function
[Figure: link marking probability (0 to 1) vs. link congestion measure (0 to 20)]
Stability
[Figures: individual window (pkts) and instantaneous queue (pkts) vs. time (sec); 50 sources; 40 ms delay with 54% noise, and 200 ms delay with 56% noise]
Papers
netlab.caltech.edu
 Scalable laws for stable network congestion control (CDC 2001)
 A duality model of TCP flow controls (ITC, Sept 2000)
 Optimization flow control, I: basic algorithm & convergence (ToN, 7(6), Dec 1999)
 Understanding Vegas: a duality model (ACM Sigmetrics, June 2000)
 REM: active queue management (Network, May/June 2001)
The End
Backup slides
1/√p Law
 Equilibrium window size:  ws = a/√ps
 Equilibrium rate:  xs = a/(Ds √ps)
 Empirically constant a ~ 1
 Verified extensively through simulations and on
Internet
 References
 T. J. Ott, J. H. B. Kemperman and M. Mathis (1996)
 M. Mathis, J. Semke, J. Mahdavi, T. Ott (1997)
 T. V. Lakshman and U. Madhow (1997)
 J. Padhye, V. Firoiu, D. Towsley, J. Kurose (1998)
 J. Padhye, V. Firoiu, D. Towsley (1999)
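As a quick numerical illustration of the law (the p, Ds and a values are arbitrary): rate scales as 1/√p, so halving the loss probability multiplies the rate by √2, not 2.

```python
from math import sqrt

def reno_rate(p, rtt, a=1.0):
    """Equilibrium rate from the 1/sqrt(p) law: x = a / (Ds * sqrt(p)).
    a is the empirical constant (~1; sqrt(3/2) from the sawtooth derivation)."""
    return a / (rtt * sqrt(p))

x1 = reno_rate(0.01, 0.1)    # 100 pkts/s
x2 = reno_rate(0.005, 0.1)   # loss halved
print(round(x2 / x1, 3))  # -> 1.414
```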
Implications
Applicability
Additive increase multiplicative decrease (Reno)
Congestion avoidance dominates
No timeouts, e.g., SACK+RH
Small losses
Persistent, greedy sources
Receiver not bottleneck
Implications
Reno equalizes window
Reno discriminates against long connections
Derivation (I)
[Figure: sawtooth window vs. time t, oscillating between 2w/3 and 4w/3 with average w = (4w/3 + 2w/3)/2; area under one cycle = 2w²/3]
 Each cycle delivers 2w²/3 packets
 Assume: each cycle delivers 1/p packets
 Delivers 1/p packets followed by a drop
 Loss probability = p/(1+p) ≈ p if p is small
 Hence w = √(3/(2p))
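The arithmetic can be checked directly: equating the 2w²/3 packets per cycle with the assumed 1/p packets per loss gives the square-root law.

```python
from math import sqrt

# Sanity check of the sawtooth arithmetic: if each AIMD cycle carries
# 2*w^2/3 packets and one cycle delivers 1/p packets, then w = sqrt(3/(2p)).
for p in (0.1, 0.01, 0.001):
    w = sqrt(3 / (2 * p))
    packets_per_cycle = 2 * w * w / 3
    assert abs(packets_per_cycle - 1 / p) < 1e-9 / p
print("w(p=0.01) =", round(sqrt(3 / 0.02), 2))  # -> w(p=0.01) = 12.25
```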
Derivation (II)
 Assume: loss occurs as Bernoulli process rate p
 Assume: spend most time in CA
 Assume: p is small
 wn is the window size after nth RTT
wn+1 = wn/2,  if a packet is lost (prob. p·wn)
wn+1 = wn + 1,  if no packet is lost (prob. 1 − p·wn)
In equilibrium:  w = (w/2)·pw + (w + 1)(1 − pw)  ⇒  pw²/2 + pw = 1  ⇒  w ≈ √(2/p)
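A quick Monte Carlo check of this fixed point (the p value, seed and step count are illustrative):

```python
import random

def mean_window(p, steps=200_000, seed=1):
    """Simulate w_{n+1} = w_n/2 w.p. p*w_n, else w_n + 1 (Bernoulli losses)."""
    rng = random.Random(seed)
    w, total = 10.0, 0.0
    for _ in range(steps):
        w = w / 2 if rng.random() < p * w else w + 1
        total += w
    return total / steps

p = 0.001
predicted = (2 / p) ** 0.5      # ~44.7
simulated = mean_window(p)
# The time-average hovers near the sqrt(2/p) fixed point (within tens of
# percent; the sawtooth shape biases the mean somewhat below the peak).
print(round(simulated / predicted, 2))
```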
Simulations
Refinement
(Padhye, Firoiu, Towsley & Kurose 1998)
 Renewal model including
 FR/FR with Delayed ACKs (b packets per ACK)
 Timeouts
 Receiver awnd limitation
 Source rate
xs = min{ Wr/Ds ,  1 / [ Ds √(2bp/3) + To · min(1, 3√(3bp/8)) · p(1 + 32p²) ] }
 When p is small and Wr is large, reduces to  xs = a/(Ds √p)
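The formula above can be sketched in code (parameter names are mine); it also shows the small-p collapse to the square-root law with a = √(3/(2b)).

```python
from math import sqrt

def pftk_rate(p, rtt, rto, b=2, wr=float("inf")):
    """PFTK send rate as on the slide. p: loss event probability,
    rtt: Ds, rto: To, b: packets per ACK, wr: receiver window limit."""
    denom = rtt * sqrt(2 * b * p / 3) \
        + rto * min(1.0, 3 * sqrt(3 * b * p / 8)) * p * (1 + 32 * p * p)
    return min(wr / rtt, 1.0 / denom)

# For small p (and large Wr) the timeout term vanishes and the formula
# collapses to x = a/(Ds*sqrt(p)) with a = sqrt(3/(2b)).
p, rtt = 1e-5, 0.1
full = pftk_rate(p, rtt, rto=1.0, b=1)
simple = sqrt(3 / (2 * p)) / rtt
print(round(full / simple, 3))  # -> 1.0
```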
Further Refinements
Further refinements of previous formula
Padhye, Firoiu, Towsley and Kurose (1999)
Other loss models
E. Altman, K. Avrachenkov and C. Barakat
(Sigcomm 2000)
Square root p still appears!
Dynamic models of TCP
e.g. RTT evolves as window increases
Link layer protocols
Interference suppression
Reduces link error rate
Power control, spreading gain control
Forward error correction (FEC)
Improves link reliability
Link layer retransmission
Hides loss from transport layer
Source may timeout while BS retransmits
Split TCP
TCP
TCP
 Each TCP connection is split into two
 Between source and BS
 Between BS and mobile
 Disadvantages
 TCP not suitable for lossy link
 Overhead: packets TCP-processed twice at BS (vs. 0)
 Violates end-to-end semantics
 Per-flow information at BS complicates handover
Snoop protocol
TCP
snooper
 Snoop agent
 Monitors packets in both directions
 Detects loss by dupACKs or local timeout
 Retransmits lost packet
 Suppresses dupACKs
 Disadvantages
 Cannot shield all wireless losses
 One agent per TCP connection
 Source may timeout while BS retransmits
Explicit Loss Notification
Noncongestion losses are marked in ACKs
Source retransmits but does not reduce
window
Effective in improving throughput
Disadvantages
Overhead (TCP option)
May not be able to distinguish types of losses,
e.g., corrupted headers
REM
(Athuraliya & Low 2000)
 Congestion measure: price
pl(t+1) = [pl(t) + γ(αl bl(t) + xl(t) − cl)]+
 Embedding: exponential probability function
[Figure: link marking probability (0 to 1) vs. link congestion measure (0 to 20)]
 Feedback: dropping or ECN marking
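A sketch of the price update and the exponential embedding in Python (γ, αl and the marking base φ are illustrative constants; the marking form 1 − φ^(−p) is REM's exponential embedding):

```python
def rem_price(p, backlog, inflow, capacity, gamma=0.001, alpha=0.1):
    """One REM price update: p <- [p + gamma*(alpha*b + x - c)]+.
    Price rises with backlog and rate mismatch, so at equilibrium
    the buffer clears (b -> 0) and rate matches capacity (x -> c)."""
    return max(0.0, p + gamma * (alpha * backlog + inflow - capacity))

def rem_mark(p, phi=1.1):
    """Exponential embedding: marking probability 1 - phi**(-p)."""
    return 1.0 - phi ** (-p)

# With this embedding, end-to-end marking reveals the SUM of link prices:
# surviving a path of two links marks with prob 1 - phi**-(p1+p2).
p1, p2 = 3.0, 5.0
e2e = 1 - (1 - rem_mark(p1)) * (1 - rem_mark(p2))
assert abs(e2e - rem_mark(p1 + p2)) < 1e-12
```

The additivity of prices under exponential marking is what lets sources estimate total path congestion from the observed marking rate.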
Performance Comparison
 RED: high utilization OR low delay/loss (buffer B trades one for the other)
 REM: match rate (high utilization) AND clear buffer (low delay/loss)
Comparison with RED
 Goodput
 Queue
 Loss
[Figures: REM vs. RED]
Application: Wireless TCP
Reno uses loss as congestion measure
In wireless, significant losses due to
Fading
Interference
Handover
Not buffer overflow (congestion)
Halving window too drastic
Small throughput, low utilization
Proposed solutions
Ideas
Hide from source noncongestion losses
Inform source of noncongestion losses
Approaches
Link layer error control
Split TCP
Snoop agent
SACK+ELN (Explicit Loss Notification)
Third approach
Problem
Reno uses loss as congestion measure
Two types of losses
Congestion loss: retransmit + reduce window
Noncongestion loss: retransmit
Previous approaches
Hide noncongestion losses
Indicate noncongestion losses
Our approach
Eliminates congestion losses (buffer overflows)
Third approach
 Router
REM capable
 Host
Do not use loss as congestion measure
Vegas
REM
 Idea
 REM clears buffer
 Only noncongestion losses
 Retransmits lost packets without reducing window
Performance
 Goodput
[Figures: goodput comparison]
Food for thought
How to tailor utility to application?
 Choosing congestion control automatically fixes utility function
 Can use utility function to determine congestion control
Incremental deployment strategy?
 What if some, but not all, routers are ECN-capable?
Acronyms
ACK	Acknowledgement
AQM	Active Queue Management
ARP	Address Resolution Protocol
ARQ	Automatic Repeat reQuest
ATM	Asynchronous Transfer Mode
BSD	Berkeley Software Distribution
B	Byte (or octet) = 8 bits
bps	bits per second
CA	Congestion Avoidance
ECN	Explicit Congestion Notification
FIFO	First In First Out
FTP	File Transfer Protocol
HTTP	Hyper Text Transfer Protocol
IAB	Internet Architecture Board
ICMP	Internet Control Message Protocol
IETF	Internet Engineering Task Force
IP	Internet Protocol
ISOC	Internet Society
MSS	Maximum Segment Size
MTU	Maximum Transmission Unit
POS	Packet Over SONET
QoS	Quality of Service
RED	Random Early Detection/Discard
RFC	Request for Comment
RTT	Round Trip Time
RTO	Retransmission TimeOut
SACK	Selective ACKnowledgement
SONET	Synchronous Optical NETwork
SS	Slow Start
SYN	Synchronization Packet
TCP	Transmission Control Protocol
UDP	User Datagram Protocol
VQ	Virtual Queue
WWW	World Wide Web