FAST v3 - California Institute of Technology
FAST Protocols for Ultrascale Networks
Internet: distributed feedback control system
TCP: adapts sending rate to congestion
AQM: feeds back congestion information
[Diagram: feedback loop x → Rf(s) → y → AQM → p → Rb'(s) → q → TCP → x]
[Equation residue, TCP: law for $w_i$ involving $\frac{1}{T_i^2(t)}$, $\tan^{-1}$ and $\frac{x_i(t) q_i(t)}{\alpha_i d_i}$; not fully recoverable]
AQM: $\dot{p}_l = \frac{1}{c_l}(y_l(t) - c_l)$
Theory
Experiment
[Network map: Caltech, Calren2/Abilene, StarLight (Chicago), CERN (Geneva), SURFNet (Amsterdam), WAN in Lab; research & production networks]
[Labels: 155Mb/s, 10Gb/s; Multi-Gbps, 50-200ms delay]
Implementation
[State diagram: slow start, equilibrium, FAST retransmit, time out, FAST recovery]
People
Faculty: Doyle (CDS,EE,BE), Low (CS,EE), Newman (Physics), Paganini (UCLA)
Staff/Postdoc: Bunn (CACR), Jin (CS), Ravot (Physics), Singh (CACR)
Students: Choe (Postech/CIT), Hu (Williams), J. Wang (CDS), Z. Wang (UCLA), Wei (CS)
Industry: Doraiswami (Cisco), Yip (Cisco)
Partners: CERN, Internet2, CENIC, StarLight/UI, SLAC, AMPATH, Cisco
netlab.caltech.edu/FAST
FAST project
Protocols for ultrascale networks
>100 Gbps throughput, 50-200ms delay
Theory, algorithms, design, implementation, demo, deployment
Faculty
Doyle (CDS, EE, BE): complex systems theory
Low (CS, EE): PI, networking
Newman (Physics): application, deployment
Paganini (EE, UCLA): control theory
Research staff
3 postdocs, 3 engineers, 8 students
Collaboration
Cisco, Internet2/Abilene, CERN, DataTAG (EU), …
Funding
NSF, DoE, Lee Center (AFOSR, ARO, Cisco)
netlab.caltech.edu
Outline
Motivation
Theory
Web layout
Content distribution
TCP/AQM (Jin, poster)
TCP/IP (poster)
Enforcing & inducing fairness
Optical switching (future)
(poster)
High Energy Physics
Large global collaborations
2000 physicists from 150 institutions in >30 countries
300-400 physicists in US from >30 universities & labs
SLAC has 500TB data by 4/2002, world’s largest database
Typical file transfer ~1 TB
At 622Mbps: ~ 4 hrs
At 2.5Gbps: ~ 1 hr
At 10Gbps: ~15min
Gigantic elephants!
LHC (Large Hadron Collider) at CERN, to open 2007
Generate data at PB ($10^{15}$ B)/sec
Filtered in realtime by a factor of $10^6$ to $10^7$
Data stored at CERN at 100MB/sec
Many PB of data per year
To rise to Exabytes ($10^{18}$ B) in a decade
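The transfer times quoted for a ~1 TB file can be checked with a short calculation. This is a minimal sketch that assumes a decimal terabyte ($10^{12}$ bytes), a fully utilized link, and no protocol overhead; the function name is illustrative.

```python
def transfer_time_seconds(size_bytes: float, rate_bps: float) -> float:
    """Ideal transfer time: bits to send divided by the link rate."""
    return size_bytes * 8 / rate_bps

TB = 1e12  # assumption: decimal terabyte

for label, rate in [("622 Mbps", 622e6), ("2.5 Gbps", 2.5e9), ("10 Gbps", 10e9)]:
    secs = transfer_time_seconds(TB, rate)
    print(f"{label}: {secs / 3600:.2f} h ({secs / 60:.0f} min)")
```

Under these assumptions the results are roughly 3.6 hours, 53 minutes and 13 minutes, in line with the rounded figures on the slide.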
HEP high speed network
… that must change
HEP Network (DataTAG)
[Network map. Labels: New York, ABILENE, ESNET, CALREN, STARLIGHT, STAR-TAP, GENEVA, GEANT, UK: SuperJANET4, It: GARR-B, NL: SURFnet, Fr: Renater]
2.5 Gbps Wavelength Triangle 2002
10 Gbps Triangle in 2003
Newman (Caltech)
Network upgrade 2001-06
Year      ’01       ’02       ’03       ’04     ’05
Capacity  155 Mbps  622 Mbps  2.5 Gbps  5 Gbps  10 Gbps
Projected performance
[Figure: one panel per upgrade step: ’01 155, ’02 622, ’03 2.5, ’04 5, ’05 10]
Ns-2: capacity = 155Mbps, 622Mbps, 2.5Gbps, 5Gbps, 10Gbps
100 sources, 100 ms round trip propagation delay
J. Wang (Caltech)
Projected performance
[Figure panels: FAST, TCP/RED]
Ns-2: capacity = 10Gbps
100 sources, 100 ms round trip propagation delay
J. Wang (Caltech)
Outline
Motivation
Theory
Web layout
Content distribution
TCP/AQM (Jin, poster)
TCP/IP (poster)
Enforcing & inducing fairness
Optical switching (future)
(poster)
Protocol Decomposition
Applications (WWW, Email, Napster, FTP, …)
HOT (Doyle et al): Minimize user response time; Heavy-tailed file sizes
TCP/AQM
Duality model: Maximize aggregate utility
IP
Shortest-path routing: Minimize path costs
Transmission (Ethernet, ATM, POS, WDM, …)
Power control: Maximize channel capacity
Congestion control
[Diagram: sources with rates $x_i(t)$, links with congestion measures $p_l(t)$]
Example congestion measure $p_l(t)$
Loss (Reno)
Queueing delay (Vegas)
TCP/AQM
TCP (source rates $x_i(t)$): Reno, Vegas
AQM (congestion measures $p_l(t)$): DropTail, RED, REM/PI, AVQ
Congestion control is a distributed asynchronous algorithm to share bandwidth
It has two components
TCP: adapts sending rate (window) to congestion
AQM: adjusts & feeds back congestion information
They form a distributed feedback control system
Equilibrium & stability depend on both TCP and AQM
And on delay, capacity, routing, #connections
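Network model
[Diagram: TCP sources $F_1, \ldots, F_N$ output rates $x$; $R_f(s)$ maps $x$ to link rates $y$; AQM links $G_1, \ldots, G_L$ output prices $p$; $R_b'(s)$ maps $p$ to end-to-end prices $q$]
$[R_f(s)]_{li} = e^{-\tau^f_{li} s}$ if source i uses link l
$[R_b(s)]_{li} = e^{-\tau^b_{li} s}$ if source i uses link l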
Vegas model
for every RTT
{
    if W/RTTmin - W/RTT < α then W++
    if W/RTTmin - W/RTT > α then W--
}
[Annotation on W/RTTmin - W/RTT: queue size]
$F_i$: $\dot{x}_i = \frac{1}{T_i^2(t)}$ if $x_i(t) q_i(t) < \alpha_i d_i$
$\dot{x}_i = -\frac{1}{T_i^2(t)}$ if $x_i(t) q_i(t) > \alpha_i d_i$
$\dot{x}_i = 0$ else
$G_l$: $\dot{p}_l = \frac{1}{c_l}(y_l(t) - c_l)$
$q_i$: E2E queueing delay
$p_l$: Link queueing delay
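The per-RTT Vegas rule can be sketched in Python. This is an illustration only, not the Vegas implementation: the function names, the single-bottleneck toy RTT model, the floor of one packet on the window, and all parameter values are assumptions.

```python
def vegas_update(window: float, rtt: float, base_rtt: float, alpha: float) -> float:
    """One per-RTT step: compare expected rate (W/RTTmin) with actual rate (W/RTT)."""
    diff = window / base_rtt - window / rtt
    if diff < alpha:
        return window + 1
    if diff > alpha:
        return max(window - 1, 1.0)  # assumption: never shrink below one packet
    return window

def toy_rtt(window: float, base_rtt: float, capacity: float) -> float:
    """Assumed toy model: one source, one link; packets beyond the pipe sit in the queue."""
    queue = max(0.0, window - capacity * base_rtt)
    return base_rtt + queue / capacity

# Hypothetical numbers: 9 pkts/ms link, 20 ms propagation delay, alpha = 0.1 pkts/ms.
window = 10.0
for _ in range(500):
    window = vegas_update(window, toy_rtt(window, 20.0, 9.0), 20.0, 0.1)
print(window)
```

In this toy model the window settles within one packet of (capacity + alpha) × RTTmin = 182 packets, that is, the source keeps about alpha × RTTmin packets queued at the link.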
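Vegas model
[Diagram: TCP/AQM feedback loop with $F_1, \ldots, F_N$, $R_f(s)$, $G_1, \ldots, G_L$, $R_b'(s)$ and signals $x$, $y$, $p$, $q$]
$F_i$: $\dot{x}_i = \frac{1}{T_i^2(t)} \, \mathrm{sgn}\left(1 - \frac{x_i(t) q_i(t)}{\alpha_i d_i}\right)$
$G_l$: $\dot{p}_l = \frac{1}{c_l}(y_l(t) - c_l)$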
Methodology
Protocol (Reno, Vegas, RED, REM/PI…)
$x(t+1) = F(p(t), x(t))$
$p(t+1) = G(p(t), x(t))$
Equilibrium
Performance
Throughput, loss, delay
Fairness
Utility
Dynamics
Local stability
Cost of stabilization
Summary: duality model
Flow control problem
$\max_{x_s \ge 0} \sum_s U_s(x_s)$ subject to $Rx \le c$
Primal-dual algorithm
$x(t+1) = F(p(t), x(t))$   Reno, Vegas
$p(t+1) = G(p(t), x(t))$   DropTail, RED, REM
TCP/AQM
Maximize utility with different utility functions
Theorem (Low 00):
$(x^*, p^*)$ primal-dual optimal iff $y_l^* \le c_l$ with equality if $p_l^* > 0$
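The primal-dual iteration can be illustrated with a small dual-gradient sketch. It is not any specific TCP/AQM pair: the log utilities $U_i(x_i) = w_i \log x_i$, the step size, the topology, and the function name are all assumptions chosen so that the optimum is easy to verify by hand.

```python
def solve_dual(routes, capacity, weights, step=0.1, iters=2000):
    """routes[i] lists the link indices used by source i."""
    prices = [1.0] * len(capacity)
    rates = [0.0] * len(routes)
    for _ in range(iters):
        # Source update (role of F): with U_i = w_i log x_i, U_i'(x_i) = q_i gives x_i = w_i / q_i.
        for i, links in enumerate(routes):
            q = sum(prices[l] for l in links)
            rates[i] = weights[i] / max(q, 1e-9)
        # Link update (role of G): raise the price when load exceeds capacity, keep it nonnegative.
        for l in range(len(capacity)):
            load = sum(rates[i] for i, links in enumerate(routes) if l in links)
            prices[l] = max(0.0, prices[l] + step * (load - capacity[l]))
    return rates, prices

# Hypothetical network: source 0 crosses both links, sources 1 and 2 use one link each.
routes = [[0, 1], [0], [1]]
rates, prices = solve_dual(routes, capacity=[1.0, 1.0], weights=[1.0, 1.0, 1.0])
print([round(x, 3) for x in rates], [round(p, 3) for p in prices])
```

For this example the iteration approaches rates of about 1/3, 2/3, 2/3 with both link prices near 1.5. Both links are then fully used with positive prices, which matches the optimality condition in the theorem.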
Equilibrium of Vegas
Network
Link queueing delays: $p_l$
Queue length: $c_l p_l$
Sources
Throughput: $x_i$
E2E queueing delay: $q_i$
Packets buffered: $x_i q_i = \alpha_i d_i$
Utility function: $U_i(x) = \alpha_i d_i \log x$
Proportional fairness
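For a single bottleneck the equilibrium relations can be solved directly. The sketch below assumes one fully utilized link shared by all sources, so every source sees the same queueing delay and the rates sum to the capacity; the function name and numbers are hypothetical.

```python
def vegas_single_link_equilibrium(alpha_d, capacity):
    """alpha_d[i] is the target backlog alpha_i * d_i of source i, in packets."""
    total_backlog = sum(alpha_d)             # queue length c * p
    delay = total_backlog / capacity         # link queueing delay p
    rates = [ad / delay for ad in alpha_d]   # x_i = alpha_i * d_i / q_i, with q_i = p
    return rates, delay, capacity * delay

# Hypothetical values: capacity 10 pkts/ms, target backlogs of 10, 20 and 20 packets.
rates, delay, queue = vegas_single_link_equilibrium([10.0, 20.0, 20.0], capacity=10.0)
print(rates, delay, queue)
```

Rates come out proportional to each source's $\alpha_i d_i$, which is the weighted proportional fairness implied by the log utility.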
Persistent congestion
Vegas exploits buffer process to compute prices
(queueing delays)
Persistent congestion due to
Coupling of buffer & price
Error in propagation delay estimation
Consequences
Excessive backlog
Unfairness to older sources
Theorem (Low, Peterson, Wang ’02)
A relative error of $\epsilon_i$ in propagation delay estimation distorts the utility function to
$\hat{U}_i(x_i) = (1 + \epsilon_i)\,\alpha_i d_i \log x_i + \epsilon_i d_i x_i$
Validation
(L. Wang, Princeton)
Source rates (pkts/ms)
#   src1          src2          src3          src4          src5
1   5.98 (6)
2   2.05 (2)      3.92 (4)
3   0.96 (0.94)   1.46 (1.49)   3.54 (3.57)
4   0.51 (0.50)   0.72 (0.73)   1.34 (1.35)   3.38 (3.39)
5   0.29 (0.29)   0.40 (0.40)   0.68 (0.67)   1.30 (1.30)   3.28 (3.34)

#   queue (pkts)   baseRTT (ms)
1   19.8 (20)      10.18 (10.18)
2   59.0 (60)      13.36 (13.51)
3   127.3 (127)    20.17 (20.28)
4   237.5 (238)    31.50 (31.50)
5   416.3 (416)    49.86 (49.80)
Methodology
Protocol (Reno, Vegas, RED, REM/PI…)
$x(t+1) = F(p(t), x(t))$
$p(t+1) = G(p(t), x(t))$
Equilibrium
Performance
Throughput, loss, delay
Fairness
Utility
Dynamics
Local stability
Cost of stabilization
TCP/RED stability
Small effect on queue
AIMD
Mice traffic
Heterogeneity
Big effect on queue
Stability!
Stable: 20ms delay
[Figure: Window. Individual window (pkts) vs. time (ms)]
Ns-2 simulations, 50 identical FTP sources, single link 9 pkts/ms, RED marking
Stable: 20ms delay
[Figure: Window. Individual and average window (pkts) vs. time (ms)]
[Figure: Queue. Instantaneous queue (pkts) vs. time (ms)]
Ns-2 simulations, 50 identical FTP sources, single link 9 pkts/ms, RED marking
Unstable: 200ms delay
[Figure: Window. Individual window (pkts) vs. time (10ms)]
Ns-2 simulations, 50 identical FTP sources, single link 9 pkts/ms, RED marking
Unstable: 200ms delay
[Figure: Window. Individual and average window (pkts) vs. time (10ms)]
[Figure: Queue. Instantaneous queue (pkts) vs. time (10ms)]
Ns-2 simulations, 50 identical FTP sources, single link 9 pkts/ms, RED marking
Other effects on queue
[Figure: panels of instantaneous queue (pkts) vs. time. Panel labels: 20ms, 200ms; 30% noise; 50% noise; avg delay 16ms; avg delay 208ms]
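Stability: Reno/RED
[Diagram: TCP/AQM feedback loop with $F_1, \ldots, F_N$, $R_f(s)$, $G_1, \ldots, G_L$, $R_b'(s)$ and signals $x$, $y$, $p$, $q$]
Theorem (Low et al, Infocom’02)
Reno/RED is stable if
$\frac{\rho c^3 \tau^3}{2 N^3} (c\tau + N) < \frac{\pi (1-\beta)^2}{\sqrt{4\beta^2 + \pi^2 (1-\beta)^2}}$
TCP:
Small $\tau$
Small c
Large N
RED:
Small $\rho$
Large delay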
Stability: scalable control
[Diagram: TCP/AQM feedback loop with $F_1, \ldots, F_N$, $R_f(s)$, $G_1, \ldots, G_L$, $R_b'(s)$ and signals $x$, $y$, $p$, $q$]
TCP: $x_i(t) = \bar{x}_i \, e^{-\frac{\alpha_i}{m_i \tau_i} q_i(t)}$
AQM: $\dot{p}_l(t) = \frac{1}{c_l}(y_l(t) - c_l)$
Theorem (Paganini, Doyle, Low, CDC’01)
Provided R is full rank, feedback loop is locally stable for arbitrary delay, capacity, load and topology
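Stability: Vegas
[Diagram: TCP/AQM feedback loop with $F_1, \ldots, F_N$, $R_f(s)$, $G_1, \ldots, G_L$, $R_b'(s)$ and signals $x$, $y$, $p$, $q$]
TCP: $\dot{x}_i = \frac{1}{T_i^2(t)} \, \mathrm{sgn}\left(1 - \frac{x_i(t) q_i(t)}{\alpha_i d_i}\right)$
AQM: $\dot{p}_l(t) = \frac{1}{c_l}(y_l(t) - c_l)$
Theorem (Choe & Low, Infocom’03)
Provided R is full rank, feedback loop is locally stable if
[Condition on $\max x_i T_i$; bound not recoverable from extraction, residue: "( ; M , k02 )"]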
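Stability: Stabilized Vegas
[Diagram: TCP/AQM feedback loop with $F_1, \ldots, F_N$, $R_f(s)$, $G_1, \ldots, G_L$, $R_b'(s)$ and signals $x$, $y$, $p$, $q$]
TCP: [Source law not fully recoverable from extraction; residue shows $\dot{x}_i$ with $\frac{1}{T_i^2(t)}$, $\tan^{-1}$, $1 - \frac{x_i(t) q_i(t)}{\alpha_i d_i}$ and a further term in $q_i(t)$]
AQM: $\dot{p}_l(t) = \frac{1}{c_l}(y_l(t) - c_l)$
Theorem (Choe & Low, Infocom’03)
Provided R is full rank, feedback loop is locally stable if
[Condition on $\max x_i T_i$; bound not recoverable from extraction, residue: "(a, )"]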
Stability: Stabilized Vegas
[Diagram and TCP/AQM laws repeated from the preceding Stabilized Vegas slide]
Application
Stabilized TCP with current routers
Queueing delay as congestion measure has right scaling
Incremental deployment with ECN
Outline
Motivation
Theory
Web layout
Content distribution
TCP/AQM (Jin, poster)
TCP/IP (poster)
Enforcing & inducing fairness
Optical switching (future)
(poster)
Protocol Decomposition
Applications (WWW, Email, Napster, FTP, …)
HOT (Doyle et al): Minimize user response time; Heavy-tailed file sizes
TCP/AQM
Duality model: Maximize aggregate utility
IP
Shortest-path routing: Minimize path costs
Transmission (Ethernet, ATM, POS, WDM, …)
Power control: Maximize channel capacity
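Network model
[Diagram: TCP sources $F_1, \ldots, F_N$ output rates $x$; $R$ maps $x$ to link rates $y$; AQM links $G_1, \ldots, G_L$ output prices $p$; $R^T$ maps $p$ to end-to-end prices $q$]
$R_{li} = 1$ if source i uses link l (IP routing)
$x(t+1) = F(R^T p(t), x(t))$   Reno, Vegas
$p(t+1) = G(p(t), R x(t))$   DT, RED, …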
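The two uses of the routing matrix in this model, aggregating source rates into link loads ($y = Rx$) and link prices into end-to-end prices ($q = R^T p$), can be written out explicitly. The 2-link, 3-source matrix and the helper names below are hypothetical.

```python
# R[l][i] = 1 if source i uses link l, else 0 (hypothetical example).
R = [
    [1, 1, 0],  # link 0 carries sources 0 and 1
    [1, 0, 1],  # link 1 carries sources 0 and 2
]

def link_loads(R, x):
    """y = R x: each link sees the sum of the rates of the sources routed through it."""
    return [sum(R[l][i] * x[i] for i in range(len(x))) for l in range(len(R))]

def path_prices(R, p):
    """q = R^T p: each source sees the sum of the prices of the links on its path."""
    return [sum(R[l][i] * p[l] for l in range(len(R))) for i in range(len(R[0]))]

print(link_loads(R, [1, 2, 3]))
print(path_prices(R, [0.5, 0.25]))
```

Each source only needs the sum $q_i$ along its own path, and each link only needs its own aggregate load $y_l$, which is what allows the updates F and G to run in a distributed way.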
Duality model of TCP/AQM
Primal-dual algorithm
$x(t+1) = F(R^T p(t), x(t))$   Reno, Vegas
$p(t+1) = G(p(t), R x(t))$   DT, RED, REM/PI, AVQ
Flow control problem
$\max_{x \ge 0} \sum_i U_i(x_i)$ subject to $Rx \le c$
TCP/AQM
Maximize utility with different utility functions
Theorem (Low 00):
$(x^*, p^*)$ primal-dual optimal iff $y_l^* \le c_l$ with equality if $p_l^* > 0$
Motivation
Primal: $\max_R \max_{x \ge 0} \sum_i U_i(x_i)$ subject to $Rx \le c$
Dual: $\min_{p \ge 0} \left( \sum_i \max_{x_i \ge 0} \max_{R_i} \left( U_i(x_i) - x_i \sum_l R_{li} p_l \right) + \sum_l p_l c_l \right)$
Shortest path routing!
Can TCP/IP maximize utility?
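The route choice inside the dual amounts to each source picking the path with the smallest total link price, which is shortest-path routing with prices as link costs. A minimal sketch, with hypothetical link names, prices, and candidate routes:

```python
def cheapest_route(candidate_routes, prices):
    """Pick the route whose links have the smallest total price."""
    return min(candidate_routes, key=lambda links: sum(prices[l] for l in links))

# Hypothetical link prices and two candidate routes for one source.
prices = {"a": 1.0, "b": 0.2, "c": 0.3}
candidates = [["a"], ["b", "c"]]
print(cheapest_route(candidates, prices))
```

Here the two-hop route wins because its total price (0.5) is below that of the direct link (1.0). As prices change with load, the chosen route can change too, which is where the routing stability question on the following slides arises.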
TCP-AQM/IP
Theorem (Wang et al, Infocom’03)
Primal problem is NP-hard
Proof
Reduce integer partition to primal problem
Given: integers {c1, …, cn}
Find: set A s.t. $\sum_{i \in A} c_i = \sum_{i \notin A} c_i$
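The integer partition problem used in the reduction can be made concrete with a small solver. This sketch assumes positive integers and uses a standard subset-sum dynamic program; the function name is illustrative. Its running time grows with the magnitude of the numbers (pseudo-polynomial), so it does not contradict NP-hardness.

```python
def find_partition(values):
    """Return indices of a subset summing to half the total, or None if no such subset exists."""
    total = sum(values)
    if total % 2:
        return None
    target = total // 2
    reachable = {0: []}  # subset sum -> indices of one subset achieving it
    for idx, v in enumerate(values):
        # Iterate over a snapshot so each value is used at most once.
        for s, members in list(reachable.items()):
            if s + v <= target and s + v not in reachable:
                reachable[s + v] = members + [idx]
    return reachable.get(target)

values = [3, 1, 1, 2, 2, 1]
subset = find_partition(values)
print(subset, sum(values[i] for i in subset))
```

For the sample list the two sides each sum to 5; for an input such as [1, 1, 4] no balanced split exists and the function returns None.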
TCP-AQM/IP
Theorem (Wang et al, Infocom’03)
Primal problem is NP-hard
Achievable utility of TCP/IP?
Stability?
Duality gap?
Conclusion: Inevitable tradeoff between
achievable utility
routing stability
General network
Conclusion: Inevitable tradeoff between
achievable utility
routing stability
[Figure: achievable utility; random graph, 20 nodes, 200 links]
Coming together …
Clear & present
Need
Resources
FAST
Protocols