Chapter 4 Lecture Presentation

Request for feedback on
proposed direction for CHEETAH
Malathi Veeraraghavan
University of Virginia
Sept. 22, 2006

Outline

A quick status report on CHEETAH

Proposed direction discussion:

What's our goal for the CHEETAH network:


eScience network or a scalable GP network?
Bandwidth sharing modes:

Book-Ahead (BA) or Immediate-Request (IR)?

If IR, what applications?

Core or edge network?
1
CHEETAH project status

Network deployment
  data plane
  control plane
Networking software
Network service
Applications
  eScience
  general-purpose
Basic research on circuit/VC networking
2
CHEETAH network - data plane links
GbEthernet and SONET
[Figure: data-plane topology. SN16000 switches at the TN, NC and GA PoPs, each with OC192, control and GbE/10GbE cards, are interconnected by OC-192 links (via NLR/SLR/NCREN). End hosts attach to the switches over GbE. Sites shown: UVa, CUNY, NCSU (GbEs via NCREN), ORNL, GaTech.]
3
CHEETAH network - control plane links
Design goal: scalable GMPLS network
[Figure: control-plane topology. Call setup messages travel between end hosts and the control cards of the SN16000 switches at the TN, NC and GA PoPs, protected by IPsec: ns5 IPsec devices at the PoPs and Openswan IPsec software on Linux end hosts. Labels shown: Internet2, UVa, CUNY, NCSU, ORNL, GaTech.]
4
Networking software

Sycamore switch comes with built-in GMPLS
control-plane protocols:



RSVP-TE and OSPF-TE
We developed CHEETAH software for Linux end
hosts:
 circuit-requestor
 allows users and applications to issue RSVP-TE
call setup and release messages asking for
dedicated circuits to remote end hosts
 CircuitTCP (CTCP) code
Described in two ICC06 papers and a JSAC
submission
5
Network service


On-demand circuit-switched service for 1Gb/s
dedicated host-to-host circuits
Call setup delay: 1.5sec




Sycamore gave us a proprietary build for hybrid GbE-SONET-GbE circuits
No standard yet for such hybrid circuits
Sets up 7 OC3s and VCATs them to carry a GbE signal
In contrast, their GMPLS standards
implementation for pure-SONET circuits incurs a
call setup delay of 166ms (2-hop)
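The choice of 7 OC3s for one GbE signal can be checked with a little arithmetic. The sketch below assumes each virtually concatenated member is an STS-3c with the standard SONET payload capacity of 149.76 Mb/s; that figure and the function name are assumptions for illustration, not taken from the slides.

```python
import math

# Payload capacity of one STS-3c (OC-3c) member in Mb/s.
# Standard SONET figure, assumed here; the slides do not state it.
STS3C_PAYLOAD_MBPS = 149.76


def vcat_members_needed(client_rate_mbps: float,
                        member_payload_mbps: float = STS3C_PAYLOAD_MBPS) -> int:
    """Smallest number of VCAT members whose combined payload covers the client rate."""
    if client_rate_mbps <= 0 or member_payload_mbps <= 0:
        raise ValueError("rates must be positive")
    return math.ceil(client_rate_mbps / member_payload_mbps)


if __name__ == "__main__":
    n = vcat_members_needed(1000.0)  # Gigabit Ethernet client signal
    print(n, "members carry", round(n * STS3C_PAYLOAD_MBPS, 2), "Mb/s of payload")
```

Six members would fall short of 1 Gb/s under this assumption, which is consistent with the 7 OC3s mentioned above.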
6
Applications

eScience: Terascale Supernova Initiative

File transfers: demo'ed but held up by Cray X1 network I/O problems
Ensight remote visualization: demo'ed but connectivity between NCSU CHEETAH-drop site and physicist's office is low speed
Conclusion: other factors, not the network!
general-purpose:


web caching
video apps
7
Basic research on bandwidth-sharing
mechanisms in circuit/VC networks

Immediate-Request (IR) mode




Purpose: to determine what type of applications are well
served by this mode
Key finding: if m, the link capacity expressed in
channels, is 10, the IR mode of bandwidth sharing
leads to either poor utilization or high call blocking
Published in an IEEE Globecom 2006 paper
Book-ahead bandwidth-sharing mechanisms


Developed a novel discrete-time Markov chain model for
book-ahead bandwidth-sharing mechanisms
(Results submitted to IEEE/ACM Trans. on Networking)
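The utilization-versus-blocking tradeoff for small m can be illustrated with the classical Erlang B formula. This is a minimal sketch, assuming Poisson call arrivals and blocked calls cleared; the slides do not say which model the Globecom paper uses, and the 1% blocking target is only an illustrative choice.

```python
def erlang_b(m: int, offered_load: float) -> float:
    """Blocking probability for m channels at the given offered load (in Erlangs)."""
    b = 1.0
    for k in range(1, m + 1):
        # Numerically stable recursion on the number of channels.
        b = offered_load * b / (k + offered_load * b)
    return b


def max_utilization(m: int, target_blocking: float) -> float:
    """Link utilization at the largest offered load that still meets the blocking target."""
    lo, hi = 0.0, 2.0 * m
    for _ in range(60):  # bisection: Erlang B increases with offered load
        mid = (lo + hi) / 2.0
        if erlang_b(m, mid) > target_blocking:
            hi = mid
        else:
            lo = mid
    return lo * (1.0 - erlang_b(m, lo)) / m


if __name__ == "__main__":
    for m in (10, 100, 1000):
        print("m =", m, "utilization at 1% blocking:", round(max_utilization(m, 0.01), 3))
```

Under these assumptions, a link with m = 10 has to run well below half utilization to hold blocking at 1%, while links with larger m can be loaded much more heavily.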
8
Outline check


A quick status report on current CHEETAH
Proposed direction discussion:

What's our goal for the CHEETAH network:

eScience network or a scalable GP network?

Bandwidth sharing mode:

Book-Ahead (BA) or Immediate-Request (IR)?
If IR, what applications?

Core or edge network?

9
Observation

"Many e-science experiments ... are
optimized to provide maximum throughput
to a few facilities, as opposed to moderate
throughput to millions of users, which is
the raison d'etre for commercial
networks."
10
eScience networks

eScience network requirements





Number of users small
Hard to achieve high utilization; also not important
Overprovision network to keep call blocking
rate low
Focus on creating software to allow scientists
to automatically provision high-speed
application-specific topologies: AST, UCLP,
OSCARS, USN scheduler, BRUW
Bandwidth-sharing algorithms of less concern
11
General-purpose commercial
networks

Have to be scalable: large number of users




Metcalfe's statement: Value of a network
grows in proportion to the square of the
number of users
High utilization is an important goal
Low call blocking probability or low waiting
time for resources
Focus on efficient bandwidth-sharing
algorithms
12
Circuit/VC service on GP
commercial networks

Just for ISPs/enterprise admins:


needs similar to eScience
 router-to-router circuits
 limited number of users
 high-bandwidth, long-held circuits
 low price not a high priority
 need BA mode of bandwidth sharing
For end users



large number of users
can only offer moderate BW and limited call holding
times
IR mode of sharing becomes feasible
13
Outline check


A quick status report on current CHEETAH
Proposed direction discussion:

What's our goal for the CHEETAH network:

eScience network or a scalable GP network?

Bandwidth sharing modes:

Book-Ahead (BA) or Immediate-Request (IR)?
If IR, what applications?

Core or edge network?

14
BW sharing modes in circuit/VC networks
m is the link capacity expressed in channels
e.g., if 1Gbps circuits are assigned on a 10Gbps link, m = 10

Large m (moderate throughput): immediate-request with call blocking + retries (video, gaming)
Small m (high throughput), short calls: immediate-request with delayed-start times ("call queueing"), as at a bank teller (file transfers)
Small m (high throughput), long calls: book-ahead, as at a doctor's office

Mean waiting time is proportional to mean call holding time
Can afford to have a queueing based solution when m is small
if calls are short
15
Impact of increasing m at different
values of link utilization Ud
[Figure: two families of curves plotted against m, the link capacity expressed in channels (log scale, 1 to 1000), for Ud = 40%, 60%, 80% and 90%. One family shows PQ, the probability of an arriving job finding all m circuits busy (0 to 1); the other shows the offered load, call arrival rate/call departure rate (0 to 1000). Annotated point: m = 10, PQ = 41%. Small m corresponds to high-rate per-call circuits, large m to low-rate per-call circuits.]
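A quantity like PQ can be computed with the Erlang C formula. The sketch below assumes an M/M/m queue (Poisson arrivals, exponential holding times, delayed calls wait in queue), which the slide does not name explicitly.

```python
def erlang_b(m: int, offered_load: float) -> float:
    """Erlang B blocking probability, used as a stepping stone to Erlang C."""
    b = 1.0
    for k in range(1, m + 1):
        b = offered_load * b / (k + offered_load * b)
    return b


def erlang_c(m: int, utilization: float) -> float:
    """Probability that an arriving call finds all m channels busy in an M/M/m queue."""
    if not 0.0 <= utilization < 1.0:
        raise ValueError("utilization must be in [0, 1)")
    b = erlang_b(m, m * utilization)
    return b / (1.0 - utilization * (1.0 - b))


if __name__ == "__main__":
    for m in (1, 10, 100, 1000):
        print("m =", m, "PQ at Ud = 80%:", round(erlang_c(m, 0.8), 3))
```

With m = 10 and Ud = 80% this model gives a value of about 0.41, which agrees with the annotated PQ = 41% point, though the slide itself does not say which utilization curve that annotation belongs to.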
16
Impact of mean call holding time 1/μ
[Figure: plotted against the mean call holding time 1/μ (0 to 30 minutes) at Ud: 90%. Left axis: mean waiting time for delayed calls, E[Wd] (minutes, 0 to 30), with curves for m=10, m=100 and m=1000. Right axis: N, the number of ports aggregating traffic on to the link (log scale), with curves for m=10, λ'=1 call/hour; m=10, λ'=10 calls/hour; m=100, λ'=1 call/hour; and m=1000, λ'=1 call/hour, where λ' is the per-host call-generation rate.]
$E[W_d] = \frac{1}{m \mu (1 - U_d)}$
17
Main findings of analysis

Two key parameters: m and mean call holding time

If m is small (per-circuit BW is high)
 and mean call holding time is large, then need BA to avoid long waiting times
 and mean call holding time is small (file transfers), then use "call queueing"
If m is large, switch hardware costs increase
 N, number of aggregation ports, high
 level of demultiplexing high
18
Relate BW sharing modes to
network types
Bandwidth-sharing mechanisms: Book-Ahead (BA) and Immediate-Request (IR)

eScience networks
 Book-Ahead (BA): very large file transfers need high-BW and long holding time + remote viz. need to reserve other resources such as displays
 Immediate-Request (IR): None?
general-purpose networks
 Book-Ahead (BA): circuit service to only ISPs/enterprise admins
 - router-to-router circuits
 Immediate-Request (IR): circuit service for end users
 - host-to-host + router-to-router (end-to-end)
 - partial-path router-to-router circuits on congested links (called in by end user)
19
Support for the BA mechanism of
bandwidth sharing





Since RSVP-TE does not have parameters for BA
calls (call duration, start time), this mode is not
implemented in switch controllers
Cannot utilize the BW management software
implemented in switch controllers as part of
GMPLS control-plane software
Need an external scheduler to manage bandwidth
into the future
Easiest to make it centralized - one per domain
The BA mode is necessary for high-BW, long-held
calls
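What an external scheduler has to track can be sketched in a few lines. This is a minimal illustration assuming a single link, discrete time slots and one centralized scheduler; the class and method names are hypothetical and do not describe OSCARS, BRUW or any other existing scheduler.

```python
from dataclasses import dataclass, field
from typing import Dict


@dataclass
class BookAheadScheduler:
    """Tracks channel reservations on one link over future time slots (hypothetical sketch)."""
    capacity: int                                          # m, link capacity in channels
    booked: Dict[int, int] = field(default_factory=dict)  # slot index -> channels reserved

    def can_admit(self, start: int, duration: int, channels: int) -> bool:
        """True if every slot in [start, start + duration) has enough free channels."""
        return all(self.booked.get(t, 0) + channels <= self.capacity
                   for t in range(start, start + duration))

    def reserve(self, start: int, duration: int, channels: int = 1) -> bool:
        """Book the channels for the whole interval, or book nothing and return False."""
        if duration <= 0 or channels <= 0:
            raise ValueError("duration and channels must be positive")
        if not self.can_admit(start, duration, channels):
            return False
        for t in range(start, start + duration):
            self.booked[t] = self.booked.get(t, 0) + channels
        return True


if __name__ == "__main__":
    link = BookAheadScheduler(capacity=10)
    print(link.reserve(start=100, duration=60, channels=8))  # admitted
    print(link.reserve(start=130, duration=10, channels=4))  # rejected: overlaps, too few free
```

The call duration and start time are exactly the two parameters noted above as missing from RSVP-TE, which is why this state lives outside the switch controllers.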
20
BA implementation approach

Develop and "standardize" protocols for scheduler-to-scheduler signaling for interdomain circuits (one
centralized scheduler per domain)





e.g., Chin Guok (ESNET)'s WSDL spec. for ESNET-Abilene
testing (OSCARS-BRUW)
Implement scheduler and test with other networks
Create software tools to enable scientists and
ISP/enterprise admins to visualize network topologies and
request appropriate circuits/VCs
High-BW, long-held: Therefore AAA is a must
Path being pursued by DRAGON, USN, OSCARS, UCLP
21
Argument: IR is just a "now" in BA

Difficult to have BA and IR coexist without some
form of bandwidth partitioning




BA allows for high-BW, long-duration calls
IR calls will suffer a high call blocking rate if supported
through BA scheduler (the "add-now-as-an-option-in-scheduler" solution)
Should you admit an IR call if it arrives a few seconds
before start time of a BA call and hope it completes
before the BA call start time, or reject the call and
waste bandwidth?
If BW is partitioned, then implement scalable
solution for IR - distributed bandwidth management
22
Support for the IR mechanism of
bandwidth sharing


Switches have built-in (G)MPLS control-plane software (RSVP-TE/OSPF-TE)
Bandwidth management is part of RSVP-TE
switch controller software



Hence it is distributed bandwidth management
Need to limit call holding time - reminders
for renewals and automatic release
Moderate-to-high per-call bandwidth
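The holding-time limit with renewal reminders and automatic release could be handled by a small lease table. The sketch below is hypothetical: the class, its methods and the explicit `now` argument are illustrative assumptions, not part of the CHEETAH software or of RSVP-TE.

```python
from typing import Dict, List


class CircuitLeases:
    """Lease table that bounds call holding time (hypothetical sketch)."""

    def __init__(self, max_hold_s: float, reminder_s: float) -> None:
        self.max_hold_s = max_hold_s    # holding time granted per setup or renewal
        self.reminder_s = reminder_s    # how long before expiry to remind the user
        self._expiry: Dict[str, float] = {}

    def open(self, circuit_id: str, now: float) -> None:
        """Record a newly set up circuit."""
        self._expiry[circuit_id] = now + self.max_hold_s

    def renew(self, circuit_id: str, now: float) -> bool:
        """Extend a live lease; fails if the circuit is unknown or already expired."""
        expiry = self._expiry.get(circuit_id)
        if expiry is None or now >= expiry:
            return False
        self._expiry[circuit_id] = now + self.max_hold_s
        return True

    def due_for_reminder(self, now: float) -> List[str]:
        """Circuits inside the reminder window that have not yet expired."""
        return sorted(c for c, e in self._expiry.items()
                      if e - self.reminder_s <= now < e)

    def release_expired(self, now: float) -> List[str]:
        """Remove and return circuits whose lease has run out (automatic release)."""
        expired = sorted(c for c, e in self._expiry.items() if e <= now)
        for c in expired:
            del self._expiry[c]
        return expired
```

A controller would call `due_for_reminder` and `release_expired` periodically and issue the actual release messages for whatever the second call returns.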
23
Is an opportunity being missed if distributed IR
bandwidth sharing mode is not explored?
What opportunity? Four reasons:

1. Enable the creation of large-scale circuit/VC networks
with moderate-rate circuits that can support a brand
new class of applications
 economic value for the networking industry
2. A "reservations-oriented" mode of networking to
complement today's connectionless Internet
 ala airlines that complement roadways
3. Could prove useful to FIND, GENI, net-neutrality
4. Alternative pricing models for bandwidth
24
Outline check


A quick status report on current CHEETAH
Proposed direction discussion:

What's our goal for the CHEETAH network:

eScience network or a scalable GP network?

Bandwidth sharing mode:

Book-Ahead (BA) or Immediate-Request (IR)?
If IR, what applications?

Core or edge network?

25
What "brand new class of applications?"

Large-m (moderate BW)




Video, video, video
Gaming
Remote software access
Small-m (high BW) short-held calls


Async storage
Web and P2P file transfers
26
Video applications





Improve quality of conferencing, telephony,
surveillance, entertainment and distance learning by a significant degree
Expend bandwidth for a higher-quality, lower
latency, multi-camera, auto-movement, auto-mixing experience
Make the "flat world" flatter
Energy savings/environmental benefits
Moderate bandwidth - IR with call
blocking/retries
27
Gaming applications



Current gamers buy personal graphics cards
Players talk of "lag" caused by differences in graphics
processing speeds
Moderate-speed circuits can enable a new class of
games in which rapidly-changing scenes are possible
compare movies in which multiple story lines keep
scenes changing vs. gaming scenes
Players connect to graphics servers
Data transferred is not GL commands, but rather
rendered data (doable?)
Moderate bandwidth - IR with call blocking/retries




28
Remote software access

Remote software access





Reduce computer administration cost
Personal computers vs. machine rooms
I loaded 22 new applications on my new laptop
 Instead: connect and run!
Virtual Computing Laboratory: Mladen Vouk, NCSU
Moderate bandwidth - IR with call
blocking/retries
29
Asynchronous storage

Asynchronous storage depots will lower
costs for





backups
disaster recovery
Need for increased storage grows with
multimedia files
High bandwidth, short calls
IR with delayed start
30
Larger files in web sites and P2P apps

Multimedia files in web sites

Increase the use of video/audio files in all sorts of web sites
instead of ASCII


My own course PPT files: I use audio sparingly because of
bandwidth
Think assembly instructions for electric fans, furniture


Think hotel web pages




Kinesthetic learning - show me a video
Show me exactly where the beach is relative to my room; do I
have a balcony - saying it in text format is one thing; seeing it in a
video format quite another!
Content distribution network, mirroring & web caching
High bandwidth, short calls
IR with delayed start
31
Do these apps really require circuit/VC networks with
IR mode or can they simply use higher-BW IP based
networks?

See analogous transportation networks (reason 2)



reservations-oriented airlines network
reservationless roadways network
Hypothesis: The socialistic mode of bandwidth sharing on
the Internet discourages individual investment in network
bandwidth (reason 4)
 should we pay for bandwidth with tax dollars - "free"
for the whole community?
 "Tragedy of the commons" (Tanenbaum)
 should we create a network where individuals can pay for
bandwidth on congested links more directly? - think
higher-toll HOV lanes
32
Invest in a complementary
reservations-oriented network

Build a scalable circuit/VC network in which
bandwidth is shared in IR mode




Scalability will create "Metcalfe's value"
Provides an opportunity to finally recoup our investment
in (G)MPLS technologies
 standards creation effort
 implementation: Cisco, Juniper, Sycamore, Movaz
Assign at least a few of the optical testbeds that we
are investing in now to study whether this IR mode of
bandwidth sharing can help with our understanding of
net-neutrality, economic growth, FIND questions
IR more natural in data world unlike in airlines (BA)
33
Outline check


A quick status report on current CHEETAH
Proposed direction discussion:

What's our goal for the CHEETAH network:

eScience network or a scalable GP network?

Bandwidth sharing mode:

Book-Ahead (BA) or Immediate-Request (IR)?
If IR, what applications?

Core or edge network?

34
Core vs. edge networks

Should the GMPLS IR scalable network be developed first
as a core network or as a set of edge networks that feed
into an IP-based core network?

Edge-network oriented apps: HOV bypass




Video
Gaming
Remote software access
Core-network oriented apps: "3TCP connections" with wide-area TCP connection through GMPLS network


Asynchronous storage
Large web sites: CDN/web caching
35
HOV bypass

The "end-to-end" in the CHEETAH name


We chose Ethernet in LANs and SONET in
WANs for the data-plane, given the dominance
of these two technologies, with the thinking
that this would make "end-to-end" circuits
affordable
BUT, even adding control-plane software
modules to each Ethernet switch in a
departmental closet, enterprise closet, etc. is
quite expensive
36
HOV bypass

Further observation

if a link is lightly loaded, we don't really need to
reserve bandwidth for a particular flow

Combining these two:

Concept of "partial-path circuits"
Determine if there is a bottleneck link on the
end-to-end IP path
If so, send signaling request to a router on the
bottleneck link asking for a BW reservation on
bottleneck link (just as with HOV lanes)
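The two partial-path steps can be sketched as follows. This is a minimal illustration assuming per-link load figures are somehow known to the requester; how a bottleneck would actually be detected, the 80% threshold, and every name below are hypothetical.

```python
from dataclasses import dataclass
from typing import List, Optional


@dataclass
class Link:
    router: str            # router that owns the link (hypothetical identifier)
    capacity_mbps: float
    load_mbps: float

    @property
    def utilization(self) -> float:
        return self.load_mbps / self.capacity_mbps


def find_bottleneck(path: List[Link], threshold: float = 0.8) -> Optional[Link]:
    """Most heavily utilized link on the path, or None if every link is lightly loaded."""
    if not path:
        return None
    worst = max(path, key=lambda link: link.utilization)
    return worst if worst.utilization >= threshold else None


def bypass_request(path: List[Link], flow_mbps: float,
                   threshold: float = 0.8) -> Optional[dict]:
    """Build a reservation request for the bottleneck link only; None means no circuit needed."""
    link = find_bottleneck(path, threshold)
    if link is None:
        return None
    return {"router": link.router, "reserve_mbps": flow_mbps}


if __name__ == "__main__":
    path = [Link("campus-edge", 1000.0, 200.0),
            Link("wan-access", 100.0, 95.0),
            Link("backbone", 10000.0, 1000.0)]
    print(bypass_request(path, flow_mbps=20.0))
```

Only the congested link receives a signaling request; the lightly loaded links carry the flow as ordinary IP traffic.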
37
Key idea

Use control-plane signaling request from end user
or application to ask IP router for a "bypass"
reservation and to get ready to isolate packets of
its flow (Policy Based Routing)


Unlike ATM cut-through VC solutions, which required IP
routers to somehow determine which flows to handle via
ATM VCs
 Ipsilon: long-lived flow detection
Comparable to the user who calls ahead and
reserves a seat for a flight, before showing up at
the airport
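The router-side effect of such a signaling request can be pictured as a flow table consulted before normal IP forwarding. The sketch below is a toy model of policy-based routing keyed on the 5-tuple; the class, the tunnel identifiers and the "default-ip-path" label are all hypothetical and do not correspond to any router's configuration interface.

```python
from typing import Dict, NamedTuple


class FlowKey(NamedTuple):
    """5-tuple that identifies the flow named in the signaling request."""
    src_ip: str
    dst_ip: str
    protocol: str
    src_port: int
    dst_port: int


class BypassRouter:
    """Toy model: flows with a reservation use a bypass tunnel, all others the default path."""

    DEFAULT_PATH = "default-ip-path"

    def __init__(self) -> None:
        self._tunnels: Dict[FlowKey, str] = {}

    def handle_signaling_request(self, flow: FlowKey, tunnel_id: str) -> None:
        # The end user or application announces its flow before sending packets,
        # so the router never has to guess which flows deserve a bypass.
        self._tunnels[flow] = tunnel_id

    def release(self, flow: FlowKey) -> None:
        self._tunnels.pop(flow, None)

    def forward(self, flow: FlowKey) -> str:
        """Return the path a packet of this flow would take."""
        return self._tunnels.get(flow, self.DEFAULT_PATH)
```

The contrast with flow detection is that the table is filled by explicit requests, so the first packet of a reserved flow already takes the bypass.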
38
Use IR mode for HOV bypass
[Figure: enterprise networks connect through WAN-access routers over access links (likely congested) to edge networks (e.g., NYSERnet, NCREN), which connect at the NYC PoP and NC PoP to a backbone network (e.g., Abilene) over likely lightly loaded links. A bypass controller at the WAN-access router dynamically sets up an MPLS tunnel for the period of the flow.]
39
Using CHEETAH ideas in enterprise
access for the HOV bypass

MSPPs were being deployed in enterprises to aggregate
voice and IP traffic on to SONET WAN access links
[Figure: an enterprise network's WAN-access router, with a bypass controller, connects to an edge network (e.g., NYSERnet). The bypass adds an additional OCxx circuit (the router needs an extra or higher-BW interface already plugged in).]
But increasingly Metro Ethernet is replacing SONET
MPLS seems more suitable
40
Core network solution:
3TCP connections
[Figure: travel analogy for the three TCP connections: roadways to an airport, an airline flight between airports, then roadways again.]
It appears that a comparable reason exists here for taking a flight: long-distance travel
CAG: CHEETAH Application Gateway
A web cache (using Squid) with integrated CHEETAH software
41
Splitting high-BDP TCP
connection

Improves end-to-end delay even if all three TCP
connections used IP paths




Micah Beck, Logistical networking
Martin Swany, Phoebus
Sigcomm 05 Purdue poster
Many others
42
Cheetah improvement

Have the wide-area path go through a circuit/VC
network

if RTT is 50ms, even if circuit rate is 10Mbps when
Internet path bottleneck link rate is 100Mbps, there is
a crossover file size beyond which the circuit path is
faster - even with BIC

Trigger for call setup through CHEETAH arrives
from web proxy connections initiated by clients

Thus, users not physically connected to CHEETAH still
use CHEETAH service

CDN and mirror apps will be triggered by web
server-to-local mirror writes
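One simple way to read the crossover claim is a fixed-overhead model: the circuit path pays a call setup delay and then runs at the full circuit rate, while the IP path starts immediately but only achieves some lower effective TCP throughput. This is a rough sketch, not the analysis behind the slide; the effective TCP throughput is an assumed input, and the 5 Mb/s used below is purely illustrative.

```python
from typing import Optional


def crossover_file_size_bytes(setup_delay_s: float,
                              circuit_rate_bps: float,
                              tcp_throughput_bps: float) -> Optional[float]:
    """File size above which setup delay + size/circuit rate beats size/TCP throughput.

    Returns None when the assumed TCP throughput is at least the circuit rate,
    because the circuit path never wins under this simplified model.
    """
    if min(setup_delay_s, circuit_rate_bps, tcp_throughput_bps) <= 0:
        raise ValueError("all inputs must be positive")
    if tcp_throughput_bps >= circuit_rate_bps:
        return None
    size_bits = setup_delay_s / (1.0 / tcp_throughput_bps - 1.0 / circuit_rate_bps)
    return size_bits / 8.0


if __name__ == "__main__":
    # 1.5 s call setup delay, 10 Mb/s circuit, assumed 5 Mb/s effective TCP throughput.
    size = crossover_file_size_bytes(1.5, 10e6, 5e6)
    print("crossover at about", round(size / 1e6, 3), "MB")
```

The model ignores slow start and loss recovery, so it only shows why a crossover exists whenever TCP cannot fill the IP path, not where the crossover actually lies.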
43
CHEETAH-HOPI interconnection
[Figure: web caches attached at multiple sites across the HOPI network and the CHEETAH SONET network (NC, TN and GA PoPs). Labels shown: VLAN, MPLS, 10GbE, McLean, VA.]
HOPI network: courtesy of Rick Summerhill
44
CHEETAH as a core network



Job of a core network is to connect edge
networks
Application just described only connects
web servers, web caches, storage devices
to CHEETAH SONET switches at PoPs
Could also provide router-to-router IR-triggered circuits between PoPs


Allow admin-triggered setups
Network management software for load
monitoring and automatic triggers
45
Add routers and servers/storage to
CHEETAH PoP
[Figure: the NC, TN and GA PoPs, interconnected by OC192 links, each with routers, web servers/mirrors, web caches and storage devices attached.]
46
Conclusions

Proposed direction:

What's our goal for the CHEETAH network:

eScience network or a scalable GP network?

Bandwidth sharing modes:

Book-Ahead (BA) or Immediate-Request (IR)?
If IR, what applications?

Long-distance file transfers - web caching/CDN/mirrors
and storage + router-to-router triggers

Core or edge network?

Interesting research: delayed-start BW-sharing
47