FAST v3 - California Institute of Technology


WAN in Lab
NSF Site Visit
John Doyle,
CDS/EE/BE
Steven Low (PI),
CS/EE
Harvey Newman, Physics
Demetri Psaltis, EE/CNS
Steven Yip, Cisco
March 5, 2003
Reviewer concerns
 Narrow focus on TCP/AQM
 A range of IST research at Caltech
 Spanning theory, implementation, experiment, deployment
 WAN in Lab a critical component
 Alternatives not discussed
 Use spectrum of tools at different stages
 How to manage and share WAN in Lab
 Part of Federated Emulab
 Both demand and excellent support for global sharing
 Experience in global collaboration, e.g. Newman’s VRVS
netlab.caltech.edu
Agenda
 EAS, IST Initiative, Theory program, FAST
 Intellectual environment in which WAN in Lab fits (Murray, Doyle, Low)
 WAN in Lab
 Design, capabilities, alternatives, management (Low)
 Cisco example & collaboration (Yip)
 Education, outreach, poster session
 Research talks
 Projects that will use WAN in Lab
 International collaboration, leverage & impact on HENP & Grids
WAN in Lab
Steven Low
netlab.CALTECH.edu
NSF Site Visit
March 5, 2003
Why Testbed in IST
“A lack of wide-area testbeds would contribute to a
growing tendency towards paper solutions to thesis-factory problems, leaving the real networking world
short of new ideas and technologies”
“Prototypes & testbeds are required to gain
acceptance of new concepts with potential user
communities”
A value of testbeds is “… building and maintaining
research collaborations and communities”
- NSF Workshop on Network Research Testbeds (Nov 2002)
Outline
 Proposal summary
   Basic design, equipment, costs
 Unique features
 Alternatives
   Spectrum of tools
   Emulated delay
 Community resource
   Demand
   Management software
 Why Caltech
   Leverage on Abilene, HENP, CalREN, TeraGrid
 Summary
   Reviewer concerns
   Review criteria
Goal
State-of-the-art WAN
 High speed
 2.5G → 10G
 Large distance
 50 – 200ms
 Controlled & repeatable experiments
 Reconfigurable & evolvable
[Figure: WAN in Lab physical layout. Nodes labeled S and R are linked over wavelengths l1 to l20 through an OPM, EDFAs, a 500 km fiber spool and an electronic crossconnect.]
H : server
R : router
Max path length = 10,000 km
Max one-way delay = 50ms
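The 50 ms one-way figure is consistent with the propagation speed of light in fiber. A minimal sketch, assuming a fiber group index of about 1.5 (a typical value that is not stated on the slide):

```python
# Propagation delay over optical fiber (sketch; the index value is an assumption).
C_VACUUM_KM_PER_S = 299_792.458   # speed of light in vacuum
FIBER_INDEX = 1.5                 # assumed group index of standard fiber

def one_way_delay_ms(length_km: float, index: float = FIBER_INDEX) -> float:
    """Return the one-way propagation delay in milliseconds."""
    speed_km_per_s = C_VACUUM_KM_PER_S / index
    return length_km / speed_km_per_s * 1000.0

spool_ms = one_way_delay_ms(500)       # the 500 km of fiber on the slide
max_ms = one_way_delay_ms(10_000)      # the maximum path length on the slide
print(f"500 km: {spool_ms:.2f} ms, 10,000 km: {max_ms:.1f} ms")
```

Under that assumption the maximum path gives roughly 50 ms one way, or about 100 ms round trip.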
Equipment
 26 Servers
   GbE cards (→ 10GbE cards)
 12 routers
   10 Cisco 15454 with router blades
     2-port GbE, 8-channel OC48
   2 Force10 E600
     24-port GbE, 2-port OC48
 DWDM gear
   500km fiber
   6 EDFA
   2 Dispersion compensation modules
   2 optical mux/demux
 Tektronix TDS7404 Oscilloscope
 Integration with global network
Costs
 26 Servers: $104K
 12 routers: $1.03M
   2 Force10 E600: $280K ($340K if OC192)
   10 Cisco 15454 with router blades: $750K ($810K if OC192)
 DWDM gear: $148K
   500km fiber: $8K
   6 EDFA: $60K
   2 Dispersion compensation modules: $40K
   2 optical mux/demux: $40K
 Tektronix TDS7404 Oscilloscope: $50K
 Integration with global network: $110K
 Personnel, software, service & maintenance
Total: $2M (NSF) + $0.67M (cost sharing)
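The line items on the Costs slide can be cross-checked against the stated subtotals. The sketch below uses only the slide's figures; the remainder it prints for the unpriced line (personnel, software, service & maintenance) is an inference, assuming the $2.67M total covers everything listed.

```python
# Line items from the Costs slide, in thousands of dollars (OC48 pricing).
routers_k = {"2 Force10 E600": 280, "10 Cisco 15454 with router blades": 750}
dwdm_k = {"500km fiber": 8, "6 EDFA": 60,
          "2 dispersion compensation modules": 40, "2 optical mux/demux": 40}
other_k = {"26 servers": 104, "oscilloscope": 50,
           "integration with global network": 110}

routers_total_k = sum(routers_k.values())   # slide states $1.03M
dwdm_total_k = sum(dwdm_k.values())         # slide states $148K
equipment_total_k = routers_total_k + dwdm_total_k + sum(other_k.values())

# Implied remainder for the unpriced line; not a figure given on the slide.
remainder_k = 2000 + 670 - equipment_total_k

print(routers_total_k, dwdm_total_k, equipment_total_k, remainder_k)
```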
Yearly costs
 Year 1: $1,128K
 10 servers, 5 routers, 2.5Gbps
 Year 2: $564K
 20 servers, 8 routers, 2.5Gbps
 Year 3: $124K
 Software development
 Year 4: $733K
 26 servers, 10 routers, 2.5Gbps
 Year 5: $120K
 Software development
 Total: $2M (NSF) + $0.67M (cost sharing)
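A quick check that the yearly figures add up to the stated total. This sketch reads Year 1 as $1,128K (that is, $1.128M), which is the only reading consistent with the $2.67M total.

```python
# Yearly budget figures from the slide, in thousands of dollars.
yearly_costs_k = {1: 1128, 2: 564, 3: 124, 4: 733, 5: 120}

total_k = sum(yearly_costs_k.values())
stated_total_k = 2000 + 670   # $2M (NSF) + $0.67M (cost sharing)

print(f"sum of years: ${total_k}K, stated total: ${stated_total_k}K")
```

The two agree to within $1K, which is rounding in the $0.67M figure.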
Networking Lab
NetLab
 WAN in Lab
 3 racks, 2 consoles
 Networking Lab
 424 sq ft
 Next to CACR
 Easy connection to global network
 Renovation (cost sharing)
 New IST Building
Jorgensen Lab
Unique capabilities
 WAN in Lab
 Capacity: 2.5 – 10 Gbps
 Delay: 0 – 100 ms round trip
 Configurable & evolvable
 Topology, rate, delays, routing
 Always at cutting edge
 Risky research
 MPLS, AQM, routing, …
 Integral part of R&A networks
 Transition from theory, implementation, demonstration, deployment
 Transition from lab to marketplace
 Global resource
[Figure (a): Physical network, with routers R1 to R10 and wavelengths l1 to l20]
Unique capabilities
[Figure (b): Logical network, with routers R1 to R10 and wavelengths l1 to l20]
Unique capabilities
 WAN in Lab
 Capacity: 2.5 – 10 Gbps
 Delay: 0 – 100 ms round trip
 Configurable & evolvable
 Topology, rate, delays, routing
 Always at cutting edge
 Risky research
 Dynamic recovery, AQM, MPLS, routing, …
 Integral part of R&A networks
 Transition from theory, implementation, demonstration, deployment
 Transition from lab to marketplace
 Global resource
 Federated Netlab (Emulab)
[Figure: WAN in Lab at Caltech alongside research & production networks. Labels: Calren2/Abilene, StarLight (Chicago), CERN (Geneva), SURFNet (Amsterdam); multi-Gbps, 50-200ms delay experiment.]
Spectrum of tools
[Figure: tools placed by log(cost) against log(abstraction), in five groups: live network, WANiLab, emulation, simulation, math. Labels: HENP, Abilene, CalREN, WAIL, PlanetLab, CAIRN, NLR, ?, DummyNet, EmuLab, ModelNet, NS, SSFNet, QualNet, JavaSim, Mathis formula, Optimization, Linear model, Nonlinear model, Stochastic model.]
…we use them all
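As an illustration of the "math" end of the spectrum, the Mathis formula approximates steady-state TCP throughput as (MSS/RTT) · C/√p. The sketch below assumes the common constant C = √(3/2) and a 1500-byte segment; neither value appears on the slide, and the function names are illustrative.

```python
import math

def mathis_throughput_bps(mss_bytes: float, rtt_s: float, loss_prob: float) -> float:
    """Approximate TCP throughput: (MSS/RTT) * sqrt(3/2) / sqrt(p)."""
    return (mss_bytes * 8 / rtt_s) * math.sqrt(1.5) / math.sqrt(loss_prob)

def required_loss_prob(target_bps: float, mss_bytes: float, rtt_s: float) -> float:
    """Invert the formula: loss probability that sustains target_bps."""
    return 1.5 * (mss_bytes * 8 / (rtt_s * target_bps)) ** 2

# Loss probability a standard TCP flow could tolerate at 10 Gbps and 100 ms RTT.
p = required_loss_prob(10e9, 1500, 0.1)
print(f"loss probability for 10 Gbps at 100 ms RTT: {p:.2e}")
```

Under these assumptions the tolerable loss probability is on the order of 10^-10, which is one way to see why high speed over large distance calls for new protocols and a testbed to try them on.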
Spectrum of tools
               live network   WANiLab   emulation   simulation   math
Distance       High           High      High
Speed          High           High      Low
Realism        High           High      Low
Traffic        High           Low       Low
Configurable   Low            Medium    High
Monitoring     Low            Medium    High
Cost           High           Medium    Low
Critical in development, e.g. Web100
Emulated delay
[Figure: delay emulated with high speed electronic memory; nodes labeled S and R, wavelengths l1 to l20]
 Available technology inadequate
   Spirent SX/14 Link Simulator: 1ms (155Mbps) – 10s (100bps)
 Adequate technology too expensive
   2.5Gbps, 100ms delay: IC expert at least 2 man-years & $200K
 Less realistic
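To give a sense of scale for memory-based delay emulation, a delay line must hold every bit in flight, that is, rate × delay. A minimal sketch; the function name is illustrative, and the 10 Gbps case is added for comparison only.

```python
def delay_buffer_bytes(rate_bps: float, delay_s: float) -> float:
    """Bytes a delay line must hold at full line rate: rate * delay / 8."""
    return rate_bps * delay_s / 8

oc48_bytes = delay_buffer_bytes(2.5e9, 0.100)    # the slide's 2.5Gbps, 100ms case
ten_g_bytes = delay_buffer_bytes(10e9, 0.100)    # comparison at 10 Gbps

print(f"2.5 Gbps x 100 ms: {oc48_bytes / 1e6:.2f} MB")
print(f"10 Gbps x 100 ms: {ten_g_bytes / 1e6:.0f} MB")
```

The capacity itself is modest (about 31 MB at 2.5 Gbps); the hard part is that the memory must be written and read at line rate at the same time.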
HENP testbed
[Figure: HENP testbed; plot labels CWND: 5801-5815, Ins. RTT]
Sylvain Ravot (Caltech/CERN)
Example1: end-to-end delay
[Figure: delay between Geneva & Chicago. X: real time (us); Y: RTT (us). Curves: instantaneous RTT, average RTT.]
 CWND: 5801-5815
 RTT=270ms
 12450 pkts!?
 1500 pkt time without buildup!?
 CWND: 8700->4000
 RTT=980ms!?
Passive monitoring in WANiLab can help debug
Example2: 10G Expt
[Figure: delay between Geneva & Sunnyvale. X: real time (ms); Y: inst RTT and avg RTT (ms).]
 Losses & retransmissions
 Retransmission without loss!?
Passive monitoring in WANiLab can help debug
Network debugging
 Performance problems in real network
 Simulation will miss
 Emulation might miss
 Live network hard to debug
 Enable or speed up FAST development
 10GExpt: 20 people in 8 organizations for 3 months
 Complete facility available only for a week
 Many mysteries unresolved
 WAN in Lab
 Passive monitoring inside network
 Active debugging possible
Passive monitoring
[Figure: monitor built from a fiber splitter, DAG card, GPS timestamp, header capture and RAID storage]
 No overhead on system
 Can capture full info at OC48
 UofWaikato’s DAG card captures at OC48 speed
 Can filter if necessary
 Disk speed = 2.5Gbps*40/1500 = 66Mbps
 Monitors synchronized by GPS or cheaper alternatives
 Data stored for offline analysis
David Wei (Caltech)
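The disk-speed estimate on the slide assumes 40 bytes of header are kept from each 1500-byte packet on a 2.5 Gbps link. A minimal sketch of that arithmetic; the function name is illustrative, and the header-only case and the hourly storage figure are added here for comparison, not taken from the slide.

```python
def header_capture_rate_bps(link_bps: float, header_bytes: int, packet_bytes: int) -> float:
    """Rate of captured header data when every packet is packet_bytes long."""
    return link_bps * header_bytes / packet_bytes

typical_bps = header_capture_rate_bps(2.5e9, 40, 1500)   # the slide's case
worst_bps = header_capture_rate_bps(2.5e9, 40, 40)       # header-only packets
gb_per_hour = typical_bps / 8 * 3600 / 1e9               # storage at the typical rate

print(f"typical: {typical_bps / 1e6:.1f} Mbps, worst case: {worst_bps / 1e9:.1f} Gbps")
print(f"storage at typical rate: {gb_per_hour:.0f} GB per hour")
```

The typical rate comes to about 66.7 Mbps, which the slide gives as 66Mbps. A stream of header-only packets would need the full line rate, which is one reason filtering may still be necessary.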
Passive monitoring
[Figure: monitors placed at servers and routers across the network, each with fiber splitter, DAG card, GPS timestamp, header capture and RAID; Web100 and FAST monitor at the servers]
David Wei (Caltech)
DataTAG link
[Figure: DataTAG link; plot labels CWND: 5801-5815, Ins. RTT]
Sylvain Ravot (Caltech/CERN)
DataTAG link
 Funded by EU (CERN), USA (DoE, NSF, Caltech)
 OC48 circuit StarLight-CERN
 Upgrade to OC192 by August 2003
 Linux farms
 StarLight: 20 CPU (P4), 20 Syskonnect
 CERN: 12 CPU (P4), 12 Syskonnect
 50 users, 13 institutes, 7 countries (Feb 2003)
 Heavy utilization
 European hours: 100% reservation
 US hours: 25% reservation, but busy
Netbed (Emulab)
 Funded by NSF with Cisco donations
 Integrates simulation, emulation, live Internet
 Emulab Classic
   University of Utah: 168 PC, 5 100M Ethernet cards
   Connected by 4 Cisco 6409
   Testbed backplane limited to 2Gbps
   University of Kentucky: 48 PC, similar setup
 Netbed: Federated Emulab
   University of Utah
   Dummynet & VLAN
   32 nodes in 25 sites
 Heavy utilization
   July 2002
     65 user accounts (40 external)
     54 projects
   Feb 2003
     400 user accounts
     94 projects (10 Utah, 78 US, 6 Int’l)
Management software
 Part of Federated Emulab
   Tailor Emulab management software
   Jay Lepreau’s team consult on setup
 Complementary to existing federated Emulab
   WANiLab
     High speed large distance (Gbps WAN)
     Small network (30 nodes)
   Emulab
     Low speed (100Mbps LAN, 10M WAN)
     Large network (200+ nodes)
 Instantly available to Emulab community
   Web accessible anywhere any time
   Virtual machine for network experimentation
Experiment life cycle
(White et al)
 Experiment creation
   Web based sign-up form by project lead
   Approved by Emulab team
 Experiment specification
   ns script or Java GUI
   Can download own OS, host algorithms, etc
   Links emulated by Dummynet nodes with specified rate, delay, loss
 Experiment realization
   Map target configuration to physical resources
   Reserve resources for each experiment
   Oversubscription
     dynamic reallocation, swap in, swap out
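To illustrate what "links emulated with specified rate, delay, loss" means, here is a toy model of such a link. It is a sketch for intuition only and is not how Dummynet is implemented; the function name, the unbounded FIFO queue, and the choice to drop packets before they queue are all assumptions.

```python
import random

def emulate_link(arrivals_s, size_bytes, rate_bps, delay_s, loss_prob=0.0, seed=0):
    """Return the delivery time of each packet, or None if it was dropped.

    Surviving packets are serialized one after another at rate_bps (FIFO),
    then held for a fixed propagation delay of delay_s.
    """
    rng = random.Random(seed)
    link_free_at = 0.0            # time at which the link finishes its current packet
    deliveries = []
    for t in arrivals_s:
        if rng.random() < loss_prob:
            deliveries.append(None)                 # dropped before queueing
            continue
        start = max(t, link_free_at)                # wait behind earlier packets
        link_free_at = start + size_bytes * 8 / rate_bps
        deliveries.append(link_free_at + delay_s)
    return deliveries

# Three 1500-byte packets arriving together on a 100 Mbps link with 50 ms delay.
out = emulate_link([0.0, 0.0, 0.0], size_bytes=1500, rate_bps=100e6, delay_s=0.05)
print(out)
```

Each packet leaves one serialization time (0.12 ms here) after the previous one and then sees the fixed 50 ms delay, so rate and delay can be set independently for each emulated link.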
Why Caltech: synergies
 Caltech networking research
   FAST project: the missing experimental facility (Doyle, Low)
   IST Initiatives: testbed tied to rich theory program (Murray, Psaltis)
   Combination of theory, implementation, experiment & deployment
 Synergy in research
   Caltech’s leadership role in IT for global HENP (Newman)
   Vibrant research in HENP, astronomy, geological sci, biology, visualization, CACR
   Early testing ground & adopter of FAST (Newman)
   Availability of real data for ultrascale networks
 Synergy in facility
   Integration with HENP networks, Abilene, CalREN XD, TeraGrid (see Newman’s talk)
 Synergy with Cisco
   See Yip’s talk
Why Caltech: experience
 Hardware
 Cisco’s testbed
 Psaltis, Yip, Hajimiri, DeHon
 Software
 Netbed management software
 Operation
 Newman’s group
 Testbed driven by networking research
 IST, Theory Program, FAST, optics, scientific computing, network coding, …
Team
 Hardware
 Yip’s team: Doraiswami (Cisco)
 Psaltis (EE/CNS), DeHon (CS), Hajimiri (EE)
 Caltech Information Tech Services, CACR
 Software
 Lepreau’s team
 Low’s team: Almsberger (CS), Jin (CS), Wei (CS), Hu (CS)
 Operation
 Newman’s team: Bunn (Physics), Ravot (Physics/CERN), Suresh (CACR)
 Testbed driven by networking research
 Caltech IST Institute
Global research network
[Figure: map of the global research network. Labels: WAN in Lab, Caltech, CALREN, ABILENE, New York, STARLIGHT, STAR-TAP, ESNET, GENEVA, GEANT, UK SuperJANET4, It GARR-B, NL SURFnet, Fr Renater.]
Newman (Caltech)
Reviewer concerns
 Narrow focus on TCP/AQM
   A range of IST research at Caltech (Murray, Doyle)
     Spanning theory, implementation, experiment, deployment
     WAN in Lab a critical component
   External projects in HENP, Grid & Emulab communities
 Alternatives not discussed
   Use spectrum of tools at different stages
   Each complementary but not replaceable
   DWDM gear more realistic and cheaper
 How to manage and share WAN in Lab
   Part of Federated Emulab
   Both demand and excellent support for global sharing
   Experience in global collaboration, e.g. VRVS
 How much hardware development needed
   Mostly off-the-shelf (Yip)
   Sample system & experience from Cisco
   Local expertise: Psaltis (Optics), Yip (Cisco), Hajimiri (high speed IC), DeHon (VLSI)
Review Criteria
 Intellectual merit
 Theory, implementation, experiment, deployment
 Must inform and influence each other intimately
 Approach validated by pilot project
 Experimental facility tied to rich theory program
 Broader impacts
 HENP’s global collaborations a model for future corporations & society
 FAST protocols enabling technology
 Shared by & stimulates external research that needs high speed, large distance
 HSTCP, Scalable TCP, TCP Westwood, AVQ, REM/PI, …
 Internet as simplest complex system
Review Criteria
 Integration of research & education
   Excellent projects for undergraduates and graduates
     During & after development
   Unique teaching platform for advanced networking, distributed systems, complex systems, optics course
     Bruck, Chandy, Doyle, Hickey, Low, Psaltis
 Diversity
   33% women grad students in Netlab
   50% women postdocs and grad students in Doyle’s group
 Synergy among projects
   Bring together 4 CISE projects (1 ITR, 1 STI, 2 pending)
   Leverage for additional funding and industry collaborations
NSF Workshop Criteria
 Testbed driven by research agenda
 Rich and strong networking effort
 “A network that can break”
 Multi-user experimental facility
 With a clear research focus and foreseeable impact
 Federated testbed
 Leverage on Netbed’s management software
 Integrated monitoring & measurement facility
 Fiber splitter passive monitors
 Technology transfer
 Strong leadership in FAST user community (Newman)
Some potential projects
 TCP: FAST, HSTCP (Floyd, ICIR), TCP Westwood (Gerla, UCLA), Scalable TCP (Kelly, Cambridge/CERN), XCP (Katabi, MIT)
 AQM: REM (Low), PI (Misra/Towsley), AVQ (Srikant, UIUC)
 Protocol decomposition (Doyle, Low, Caltech)
 Network self-management (Yemini, Columbia)
 Content distribution (Bruck, Low, Caltech; Xu, Washington U)
 Optical switching (Low, Psaltis, Caltech)
 Network separation theory (Doyle, Low, Caltech; Paganini, UCLA)
 Real-time control over high performance networks (Doyle, Low, Murray, Caltech)
 Simple Optics Smart Router (SOSR) (Yates, AT&T Research)
 Optical protection, recovery (Yates, AT&T Research; Nirmalathas, Melbourne U)
 Dynamic lightpath configuration & provisioning (Tucker, Melbourne U)
 Active probing (Veitch, CUBIN)
 Passive monitoring (Veitch, CUBIN)
 Building & testing firewalls (Hoffman, U of Victoria)
 High performance active network node (Turner, Washington)