QoS Support in 802.11 Wireless LANs


Building a connection-oriented internet

Malathi Veeraraghavan, Univ. of Virginia

• Outline
  – What are we doing? - cheetah
  – Research problems
  – Engineering problems
  – Why are we doing this? - vision/motivation
  – Hasn't this been attempted before?

Talk at Georgia Tech, March 30, 2005

What are we doing?

• Building a wide-area network called CHEETAH: Circuit-switched High-speed End-to-End Transport ArcHitecture
• Writing software to run on Linux end hosts to use this network
• Applications
  – File transfers
  – Remote visualization
  – Web downloads

NSF-funded project

• Participants:
  – Malathi Veeraraghavan, UVA
  – Nagi Rao, Bill Wing, Tony Mezzacappa, ORNL
  – Ibrahim Habib, CUNY
  – John Blondin, NCSU
• $3.5M project for three years, 2004-2007

Acknowledgment: NSF EIN grant ANI-0335190

What’s the cheetah network?

• End hosts with two Ethernet NICs each
  – Primary NIC connected to the enterprise LAN/Internet
  – Secondary NIC connected to an MSPP
• Network nodes are MSPPs
  – An MSPP is an Ethernet-SONET gateway
• Solution leverages “king of LANs” (Ethernet) and “king of WANs” (SONET)
• Key aspect: Dynamic bandwidth sharing

Multi-Service Provisioning Platform (MSPP)

[Figure: Sycamore’s SN16000, which implements GMPLS protocols. Labels: PC, WAN access, 10/100M Ethernet, Control, 1Gbps Ethernet, Crossconnect (VT1.5 or STS1), OC12/OC48/OC192 SONET card]

• An OC1 rate SONET crossconnect with optional Ethernet interface cards
• Ethernet cards implement GFP to map Ethernet frames into SONET frames (EoS)

GMPLS protocols

• Triumvirate to build large-scale networks in “plug-and-play” mode
  – LMP to discover neighbors
  – OSPF-TE for routing
  – RSVP-TE for signaling
• Should be able to create distributed networks with “minimal” admin support

Signaling to set up/release Ethernet-EoS-Ethernet circuits

[Figure: circuit-based gateways, each with Gigabit Ethernet interfaces to hosts, a Gigabit Ethernet interface card, a control card with a signaling engine (dynamic call setup/release), and a time-division or wavelength-division multiplexing optical interface card. Sequence shown: set up connection (make reservation), transfer file, release connection.]

• Gateways available that can crossconnect a Gigabit Ethernet division multiplexed signal dynamically

Holding times of circuits

• 100ms to a few seconds/minutes
  – upper limit should be imposed for increased sharing; value depends on bandwidth requested
  – if 1Gbps, upper limit can be a few minutes
  – if 100Mbps, upper limit can be a couple of hours
• Apps
  – File transfer; e.g. 100MB on 1Gbps: 800ms
  – Remote viz. session: one or two hours at 100Mbps
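The holding-time figures above are plain emission-time arithmetic. A minimal sketch, assuming decimal units (1 MB = 10^6 bytes) and ignoring protocol overhead; the function name is hypothetical:

```python
def transfer_time_s(file_bytes: float, rate_bps: float) -> float:
    """Time in seconds to emit file_bytes onto a circuit of rate_bps (bits/s)."""
    return file_bytes * 8 / rate_bps


MB, TB = 1e6, 1e12  # decimal units (an assumption, not stated on the slide)

print(transfer_time_s(100 * MB, 1e9))       # 100MB on a 1Gbps circuit, seconds
print(transfer_time_s(1 * TB, 1e9) / 3600)  # 1TB on a 1Gbps circuit, hours
```

The same helper covers the other transfer-time figures quoted in the talk, since each is file size times 8 divided by the circuit rate.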

Cheetah software on end hosts

[Figure: end-host CHEETAH software]
• Applications: Remote viz., GridFTP, Web service
• Transport: FRTP, TCP
  – Fixed Rate Transport Protocol (FRTP) designed for circuits
• DNS lookup
  – DNS query (to check if far end host is also on cheetah)
• Routing decision
  – to check whether to use the TCP/IP path or to request a circuit
• Signaling client
• NIC I: primary TCP/IP path
• NIC II: end-to-end CHEETAH circuit

Demo #1 (at SC2004): Web Application

[Figure: Web client (MVSUT3) and Web server (MVSTU2). The web browser (e.g. Mozilla) sends a URL to the web server (e.g. Apache) running download.cgi and gets a response. RSVP-TE clients on both hosts exchange RSVP-TE messages. Data transfer runs from the CHEETAH FT sender to the CHEETAH FT receiver over FRTP and TCP.]

• At the web server side
  – Hyperlink to file is a CGI script (download.cgi); filename embedded in hyperlink
  – download.cgi is started automatically at the server when the user clicks the hyperlink, which triggers the CHEETAH FT sender
  – CHEETAH FT sender initiates CHEETAH circuit setup by calling the RSVP-TE client
  – CHEETAH FT sender starts data transfer using dual paths: FRTP/circuit and TCP/IP
• At the web client side
  – An RSVP-TE client runs as a daemon to accept the circuit setup request
  – A CHEETAH FT receiver runs as a daemon to receive the user data

File transfers on circuits

• Seems like a good app. for high bandwidth
  – Can absorb “any” bandwidth you can allocate to the transfer (subject to PC limitations)
  – No intrinsic burstiness
    • move bits from one disk to another

Better or worse than file transfers on TCP/IP?

• General thinking:
  – Circuits good for “large” files
  – eScience apps create large files
  – Emission delay of 2.2 hours for a 1TB file on a 1Gbps circuit
• Not a scalable network solution if used exclusively for such very large files

Cheetah solution: Leverage presence of Internet path

• Use second NICs at hosts for circuit connectivity, leaving primary NIC for Internet access

[Figure: two paths available between End host I and End host II: the connectionless Internet and the circuit-switched network]

• Attempt circuit setup
• If rejected, fall back to using TCP/IP

Should we attempt a circuit setup for ALL file transfers?

Expected delay on TCP/IP path

Throughput B(p): approximately reciprocal of expected delay

$$B(p) \approx \min\left( \frac{W_{max}}{RTT},\ \frac{1}{RTT\sqrt{\frac{2bp}{3}} + T_0 \min\left(1,\ 3\sqrt{\frac{3bp}{8}}\right) p\,(1 + 32p^2)} \right)$$

• Main factors:
  – Round-Trip Time (RTT) – mainly T_prop
  – Bottleneck link rate
  – Prob. of packet loss on IP path, p
• Other terms:
  – W_max: receiver window size
  – T_0: initial time-out
  – b = 2 (ACK-every-other segment)

J. Padhye, V. Firoiu, D. Towsley, and J. Kurose, “Modeling TCP Throughput: A Simple Model and its Empirical Validation,” Proc. of ACM SIGCOMM 98, Aug. 31 - Sep. 4, Vancouver Canada, pp. 303-314.
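The throughput model cited above (Padhye et al.) can be evaluated numerically. A minimal sketch: the function names, the 1500-byte packet size, and any T_0 value passed in are assumptions for illustration, since the slide does not state them, so the output is not expected to reproduce the talk's delay table exactly.

```python
import math


def tcp_throughput_pkts_per_s(p, rtt_s, w_max_pkts, t0_s, b=2):
    """Padhye et al. steady-state TCP throughput, in packets per second.

    p: packet loss probability, rtt_s: round-trip time in seconds,
    w_max_pkts: receiver window in packets, t0_s: initial time-out in seconds,
    b: packets acknowledged per ACK (2 = ACK every other segment).
    """
    window_limit = w_max_pkts / rtt_s
    if p <= 0:
        return window_limit  # no loss: only the window limits throughput
    loss_limit = 1.0 / (
        rtt_s * math.sqrt(2 * b * p / 3)
        + t0_s * min(1.0, 3 * math.sqrt(3 * b * p / 8)) * p * (1 + 32 * p ** 2)
    )
    return min(window_limit, loss_limit)


def mean_tcp_delay_s(file_bytes, p, rtt_s, w_max_pkts, t0_s, pkt_bytes=1500):
    """Approximate delay as file size (in packets) divided by throughput."""
    packets = file_bytes / pkt_bytes  # pkt_bytes is an assumed packet size
    return packets / tcp_throughput_pkts_per_s(p, rtt_s, w_max_pkts, t0_s)


# Example: 1GB file, p = 0.001, RTT = 50.26 ms, W_max = 418.8 pkts, assumed T_0 = 1 s
print(mean_tcp_delay_s(1e9, 0.001, 50.26e-3, 418.8, 1.0))
```

Treating delay as the reciprocal of throughput follows the slide's own approximation; it ignores slow start and connection setup.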

Mean TCP delays

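Input parameters: loss p, rate r, round-trip prop. delay T_prop, queuing delay plus service time. Intermediate results: RTT, W_max. Final result: mean delay for a 1GB file.

| Case | Loss p | Rate r | T_prop | Queuing delay plus service time | RTT (ms) | W_max (pkts) | Mean delay for a 1GB file (s) |
|------|--------|--------|--------|---------------------------------|----------|--------------|-------------------------------|
| 1 | 0.0001 | 100Mbps | 0.1ms | 0.2ms | 0.3 | 2.5 | 82.25 |
| 2 | 0.0001 | 100Mbps | 5ms | 0.2ms | 5.2 | 41 | 89.45 |
| 3 | 0.0001 | 100Mbps | 50ms | 0.2ms | 50.2 | 418 | 396.5 |
| 4 | 0.0001 | 1Gbps | 0.1ms | 0.02ms | 0.12 | 10 | 8.25 |
| 5 | 0.0001 | 1Gbps | 5ms | 0.02ms | 5.02 | 418 | 39.6 |
| 6 | 0.0001 | 1Gbps | 50ms | 0.02ms | 50.02 | 4168 | 395.7 |
| 7 | 0.001 | 100Mbps | 0.1ms | 0.26ms | 0.36 | 3 | 82.93 |
| 8 | 0.001 | 100Mbps | 5ms | 0.26ms | 5.26 | 43.8 | 135.4 |
| 9 | 0.001 | 100Mbps | 50ms | 0.26ms | 50.26 | 418.8 | 1293 |
| 10 | 0.001 | 1Gbps | 0.1ms | 0.026ms | 0.13 | 10.8 | 8.64 |
| 11 | 0.001 | 1Gbps | 5ms | 0.026ms | 5.03 | 419 | 129.4 |
| 12 | 0.001 | 1Gbps | 50ms | 0.026ms | 50.03 | 4169 | 1287 |
| 13 | 0.01 | 100Mbps | 0.1ms | 0.38ms | 0.48 | 4 | 92.41 |
| 14 | 0.01 | 100Mbps | 5ms | 0.38ms | 5.38 | 44.8 | 471.7 |
| 15 | 0.01 | 100Mbps | 50ms | 0.38ms | 50.38 | 419.8 | 4417 |
| 16 | 0.01 | 1Gbps | 0.1ms | 0.038ms | 0.138 | 11.5 | 12.43 |
| 17 | 0.01 | 1Gbps | 5ms | 0.038ms | 5.038 | 419.8 | 441.7 |
| 18 | 0.01 | 1Gbps | 50ms | 0.038ms | 50.04 | 4169.8 | 4387 |
| 19 | 0.1 | 100Mbps | 0.1ms | 0.68ms | 0.78 | 6.5 | 283.56 |
| 20 | 0.1 | 100Mbps | 5ms | 0.68ms | 5.68 | 47.33 | 2064.9 |
| 21 | 0.1 | 100Mbps | 50ms | 0.68ms | 50.68 | 422.33 | 18424 |
| 22 | 0.1 | 1Gbps | 0.1ms | 0.068ms | 0.168 | 14 | 61.07 |
| 23 | 0.1 | 1Gbps | 5ms | 0.068ms | 5.068 | 422.33 | 1842.4 |

Observations highlighted on the slide:
• Impact of propagation delay
• Low impact of bottleneck link rate in wide-area networks
• Impact of packet loss rate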

Delays incurred in using an end-to-end circuit

• Circuit setup delay + File transfer delay

$$E[T_{setup}] = \frac{m_{sig}}{r_s}\left(1 + \frac{\rho_{sig}}{2(1-\rho_{sig})}\right)(k+1) + T_{sp}\left(1 + \frac{\rho_{sp}}{2(1-\rho_{sp})}\right)k + T_{prop}$$

$$T_{transfer} = \frac{f}{r_c}$$

• m_sig: message length; r_s: signaling link rate
• Loads: ρ_sig and ρ_sp: sig. link and processor
• T_sp: signaling protocol processing delay
• k: number of switches; T_prop: r.t. prop. delay
• f: file size; r_c: circuit rate

Acknowledgment: NSF ANI grant 0087487 for hardware signaling
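The setup-delay terms defined above can be combined in a short calculation. This is a sketch under one reading of the slide's formula: a signaling message crosses k + 1 links and is processed at k switches, and each stage is inflated by a queueing factor 1 + ρ/(2(1 − ρ)), which has the form of an M/D/1 mean sojourn time. Function names and the example numbers are hypothetical.

```python
def queueing_factor(rho: float) -> float:
    """Multiplier 1 + rho / (2 (1 - rho)) applied to a service time at load rho."""
    if not 0 <= rho < 1:
        raise ValueError("load must satisfy 0 <= rho < 1")
    return 1 + rho / (2 * (1 - rho))


def expected_setup_delay_s(m_sig_bits, r_s_bps, rho_sig, t_sp_s, rho_sp, k, t_prop_s):
    """E[T_setup] under the assumed reading: k+1 signaling links, k switches."""
    link_delay = (m_sig_bits / r_s_bps) * queueing_factor(rho_sig) * (k + 1)
    processing_delay = t_sp_s * queueing_factor(rho_sp) * k
    return link_delay + processing_delay + t_prop_s


def transfer_delay_s(f_bits, r_c_bps):
    """T_transfer = f / r_c."""
    return f_bits / r_c_bps


# Illustrative numbers only (not from the talk):
setup = expected_setup_delay_s(
    m_sig_bits=800, r_s_bps=10e6, rho_sig=0.5, t_sp_s=1e-3, rho_sp=0.5, k=3, t_prop_s=50e-3
)
print(setup + transfer_delay_s(8e9, 1e9))
```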

Should the application attempt a circuit setup or not?

• Mean delay if a circuit setup is attempted

$$E[T_{cheetah}] = (1 - P_b)\,(E[T_{setup}] + T_{transfer}) + P_b\,(E[T_{fail}] + E[T_{tcp}])$$

• P_b: call blocking probability in the circuit-switched network
• If circuit setup fails, fall back to Internet path
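The mean-delay expression above leads directly to a per-transfer choice: attempt a circuit only when its expected delay beats the expected TCP delay. A minimal sketch, with hypothetical function names and illustrative inputs; treating a tie as "use TCP/IP" is an assumption.

```python
def expected_cheetah_delay_s(p_b, e_setup, t_transfer, e_fail, e_tcp):
    """E[T_cheetah] = (1 - P_b)(E[T_setup] + T_transfer) + P_b (E[T_fail] + E[T_tcp])."""
    return (1 - p_b) * (e_setup + t_transfer) + p_b * (e_fail + e_tcp)


def should_attempt_circuit(p_b, e_setup, t_transfer, e_fail, e_tcp):
    """True if attempting a circuit has lower expected delay than going straight to TCP/IP."""
    return expected_cheetah_delay_s(p_b, e_setup, t_transfer, e_fail, e_tcp) < e_tcp


def should_attempt_circuit_simplified(p_b, e_setup, t_transfer, e_tcp):
    """Same test when E[T_fail] equals E[T_setup]: E[T_setup]/(1-P_b) < E[T_tcp] - T_transfer."""
    return e_setup / (1 - p_b) < e_tcp - t_transfer


# Large file: the circuit transfer is much faster than TCP, so attempt the circuit.
print(should_attempt_circuit(0.01, 0.05, 8.0, 0.05, 80.0))
# Tiny file: setup overhead dominates, so stay on the TCP/IP path.
print(should_attempt_circuit(0.3, 0.05, 0.008, 0.05, 0.01))
```

The simplified form is plain algebra on the first one: substitute E[T_fail] = E[T_setup], collect the E[T_tcp] terms, and divide by 1 − P_b.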

Routing decision

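$$\begin{cases} E[T_{cheetah}] \ge E[T_{tcp}] & \Rightarrow \text{use the TCP/IP path} \\ E[T_{cheetah}] < E[T_{tcp}] & \Rightarrow \text{attempt circuit setup} \end{cases}$$

Taking E[T_fail] equal to E[T_setup], the same rule can be written as:

$$\begin{cases} \dfrac{E[T_{setup}]}{1 - P_b} \ge E[T_{tcp}] - T_{transfer} & \Rightarrow \text{use the TCP/IP path} \\ \dfrac{E[T_{setup}]}{1 - P_b} < E[T_{tcp}] - T_{transfer} & \Rightarrow \text{attempt circuit setup} \end{cases}$$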

Numerical results, link rate = 1Gbps

[Figure: two plots, one for T_prop = 0.1ms and one for T_prop = 50ms]

Crossover file sizes

When r_c = 100Mbps and T_prop = 0.1ms

r_c = 1Gbps, T_prop = 0.1ms

Rows: measure of loading on ckt. sw. network (P_b). Columns: TCP/IP path (P_loss).

|            | P_loss = 0.0001 | P_loss = 0.001 | P_loss = 0.01 |
|------------|-----------------|----------------|---------------|
| P_b = 0.01 | 22MB | 9MB | 1.2MB |
| P_b = 0.1  | 24MB | 10MB | 1.4MB |
| P_b = 0.3  | 30MB | 12MB | 1.8MB |

Utilization considerations

• Example: in 50ms scenario, if we transfer a 100KB file over a 100Mbps path, transfer time is only 8ms. Circuit utilization is 8/(50+8) = 13.8%
• Two opposing factors
  – If the crossover file size (beyond which circuit setup is attempted) is increased
    • per-circuit utilization increases
    • traffic load decreases (Pareto distribution of file sizes), which means aggregate utilization decreases
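The per-circuit utilization example is the ratio of useful transfer time to total holding time. A small sketch, assuming the 50ms figure is the full setup overhead and using decimal units; the function name is hypothetical:

```python
def per_circuit_utilization(file_bytes: float, rate_bps: float, setup_s: float) -> float:
    """Fraction of the circuit holding time spent actually moving data."""
    transfer_s = file_bytes * 8 / rate_bps
    return transfer_s / (setup_s + transfer_s)


# 100KB over 100Mbps with 50ms of setup overhead
print(per_circuit_utilization(100e3, 100e6, 50e-3))
# A larger file on the same circuit is held mostly for useful transfer
print(per_circuit_utilization(100e6, 100e6, 50e-3))
```

Raising the file size while holding setup time fixed pushes the ratio toward 1, which is why a larger crossover file size improves per-circuit utilization.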

Aggregate utilization u_a

$$u_a = \frac{(1 - P_b)\,\rho}{m}, \quad \text{where } P_b = \frac{\rho^m / m!}{\sum_{k=0}^{m} \rho^k / k!}$$

• ρ: traffic load
• m: number of circuits
• P_b: call blocking probability

For a 1% call blocking probability (P_b = 0.01):

| ρ   | 1     | 10    | 100   |
|-----|-------|-------|-------|
| m   | 4     | 17    | 117   |
| u_a | 24.8% | 58.2% | 84.6% |

Assuming file size follows Pareto distribution
– Define fractional offered load ρ'

| Crossover file size | 40KB | 330KB | 80MB |
|---------------------|------|-------|------|
| ρ' (fraction of ρ)  | 81%  | 71%   | 51%  |

Plot of utilization u with r_c = 100Mbps, k=20

[Figure: utilization curves for P_b = 0.3 and P_b = 0.01]
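The blocking probability behind the aggregate-utilization figures is the Erlang B formula, which is easier to evaluate with the standard recurrence than with factorials. A minimal sketch; the function names are hypothetical, and the (ρ, m) pairs are the ones quoted in the talk.

```python
def erlang_b(rho: float, m: int) -> float:
    """Erlang B blocking probability for offered load rho on m circuits.

    Uses the recurrence B(0) = 1, B(k) = rho*B(k-1) / (k + rho*B(k-1)).
    """
    b = 1.0
    for k in range(1, m + 1):
        b = rho * b / (k + rho * b)
    return b


def aggregate_utilization(rho: float, m: int, p_b: float) -> float:
    """u_a = (1 - P_b) * rho / m."""
    return (1 - p_b) * rho / m


for rho, m in [(1, 4), (10, 17), (100, 117)]:
    exact_pb = erlang_b(rho, m)
    print(rho, m, exact_pb,
          aggregate_utilization(rho, m, 0.01),      # P_b fixed at the 1% target
          aggregate_utilization(rho, m, exact_pb))  # P_b from Erlang B for this m
```

Plugging the 1% target into u_a gives 24.75%, 58.2% and 84.6% for the three pairs; the Erlang B value for a given integer m only approximates the target, so the last column differs slightly.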

Cheetah network deployment

[Figure: CHEETAH deployment map. Labels: ORNL; To DC – Dragon; circuit-based gateways, each with an OC192 card, a control card and a GbE/10GbE card; NC; To cluster computer; To Cray; GbE/10GbE Ethernet switches; MCNC/NLR; OC192 (10 Gbps); NCSU; NLR WDM; NLR GaTech WDM; Atlanta; SLR; G. Tech; ORNL GaTech WDM; SOX/SLR WDM]

Connecting Cheetah to Dragon and Ultrascience networks

[Figure: DOE Ultrascience network (ORNL), Dragon, and Cheetah]

Acknowledgment: DOE grant

All this is fun, but

• What are the research problems?
  – Bandwidth sharing modes
    • Low load performance
    • Scheduled vs. immediate-request
    • Fairness
  – Mismatch between multitasking end hosts and TDM circuits

Fixing the bandwidth for the transfer could be a bad thing: low load problem

[Figure: N transfers share capacity C, once through a packet switch and once through a circuit switch]

• Packet switch: each transfer gets C/N capacity; the lone remaining transfer enjoys the full capacity C
• Circuit switch: each transfer is allocated C/N capacity; the lone remaining transfer is left with capacity allocation C/N
• Varying bandwidth list scheduling algorithm
  – uses knowledge of file size to make varying bandwidth allocations for transfer
  – catch: requires circuit switches to be reprogrammed multiple times within lifetime of a transfer (circuit)
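The low-load problem can be made concrete by comparing finish times under the two sharing modes. A sketch under idealized assumptions: all N transfers start at time zero, the packet switch shares capacity equally among the transfers still active, and the circuit switch keeps every transfer at C/N for its whole lifetime. Function names and example sizes are hypothetical.

```python
def packet_switched_finish_times(sizes_bits, capacity_bps):
    """Finish times when active transfers split the capacity equally at every instant."""
    n = len(sizes_bits)
    finish_times = []
    now = 0.0
    served = 0.0  # bits already delivered to each still-active transfer
    for i, size in enumerate(sorted(sizes_bits)):
        active = n - i  # transfers still running while this one completes
        now += (size - served) * active / capacity_bps
        served = size
        finish_times.append(now)
    return finish_times


def fixed_circuit_finish_times(sizes_bits, capacity_bps):
    """Finish times when every transfer is pinned to C/N from start to end."""
    n = len(sizes_bits)
    return [size * n / capacity_bps for size in sorted(sizes_bits)]


sizes = [1e9, 1e9, 8e9]  # bits
print(packet_switched_finish_times(sizes, 1e9))
print(fixed_circuit_finish_times(sizes, 1e9))
```

With these sizes the two short transfers finish together in both modes, but the lone remaining transfer finishes much later on the fixed C/N circuit because it cannot pick up the freed capacity.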

Scheduled vs. immediate-request calls

Session type requests:
• long holding times (2 hours)
• specific rate
• remote visualizations
• scientists participate in sessions
• best served with an advance reservation

File transfer requests:
• file sizes provided, not holding times
• max rate specified but any rate can be allocated
• scientists not involved; just computers

Small files (e.g. 1 GB on 1 Gbps takes 8 sec)
• should be handled in immediate-request mode

Large files (e.g. 1 TB on 1 Gbps takes 2.2 hours)
• should be handled in scheduled mode
• should we allocate 10Gbps and finish in 800 sec?
• immediate-request? or scheduled?
• depends on m, the number of 10Gbps circuits

Fairness

• Call admission algorithms
  – Use Markov Decision Process (MDP) tools to balance fairness and overall throughput
  – Long-path and short-path calls
  – Large files (high-BW) and short files (low-BW) calls
  – Multi-level answer rather than binary accept/reject
    • Both with Fixed bandwidth and Varying bandwidth

Multi-level problem

• Perhaps a new problem?
  – Real-time (interactive) audio-video applications generate data at a certain rate (constant or variable)
    • implication: application requests the required bandwidth from the network, and answer is binary (accept or reject); multiple classes
  – File transfers: “any” bandwidth that the network can provide could be acceptable
    • implication: application requests a MAX bandwidth, but the answer can be multi-level

Mismatch between multitasking end hosts and TDM circuits

[Figure: sender and receiver hosts connected by a circuit-switched network. On each host, user-space processes (file transfer, Matlab) sit above the kernel filesystem and network protocols and the network card.]

• Variability in sender:
  – other processes (e.g. matlab) + disk access (disk head location)
• Variability in receiver: if buffer not emptied out, data loss occurs

Effects of mismatch in nature of circuits and nature of hosts

• Choose a high circuit rate and receive buffer can overflow causing losses
  – impacts delay + utilization (retransmissions)
• Choose a low circuit rate and delay can be high
• If sending rate is not matched exactly with circuit rate
  – circuit lies idle; utilization impacted

Fixed Rate Transport Protocol (FRTP)

• Set up a circuit at a carefully chosen rate
• Send data at that rate
  – hard to meter out data at a fixed rate from a multitasking sender when that rate is high (Linux system time granularity: 10ms)
• No changes of sending rate
  – i.e., no flow control or congestion control
• Packet losses recovered through retransmissions
  – no timers needed, just negative ACKs
    • because of in-sequence delivery
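The point about negative ACKs follows from in-sequence delivery: on a circuit, a jump in sequence numbers can only mean loss, so the receiver can report the gap immediately without a timer. The sketch below is a hypothetical illustration of that idea, not FRTP's actual packet format or implementation; the class and method names are invented.

```python
class NakReceiver:
    """Detects losses from sequence-number gaps on an in-sequence channel."""

    def __init__(self):
        self.next_expected = 0  # lowest sequence number not yet seen in order
        self.missing = set()    # sequence numbers reported lost and still outstanding

    def on_packet(self, seq):
        """Process one arrival; return the sequence numbers to NAK right now."""
        if seq < self.next_expected:
            # A retransmission filling an earlier hole.
            self.missing.discard(seq)
            return []
        # Everything between next_expected and seq was skipped, hence lost.
        gap = list(range(self.next_expected, seq))
        self.missing.update(gap)
        self.next_expected = seq + 1
        return gap


receiver = NakReceiver()
for seq in [0, 1, 4, 2]:
    print(seq, receiver.on_packet(seq))
print(sorted(receiver.missing))
```

A lost retransmission would still need some recovery mechanism; this sketch covers only first-time loss detection.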

Experimental results

| Circuit rate (Mbps)     | 200 | 590 |
|-------------------------|-----|-----|
| Circuit utilization (%) | 90  | 62  |
| Relative transfer delay | 1.7 | 1.0 |

Current work

• Experimenting with RT schedulers to schedule file transfer task in a set rhythm
• Experimenting with file systems to characterize file write time to collect data to then determine circuit rate and receive buffer size

Engineering problems

• Need to use VLAN based switches between end hosts and MSPPs
  – Costly otherwise
• VLSR: Virtual Label Switch Router
  – External GMPLS controller for Ethernet (VLAN) switches
• Understood need for making it a connection-oriented internetwork

Acknowledgment: DOE grant

Connection-oriented networks

• Circuit switched
  – Time Division Multiplexed (SONET)
    • Equipment vendors: Sycamore, Ciena
    • Network: Cheetah, UltraScience Net, CA*net 4
  – Wavelength Division Multiplexed (WDM)
    • Equipment vendors: Movaz, Calient, LambdaOptical
    • Network: Dragon, OMNInet, Internet2 HOPI

Connection-oriented networks

• Packet switched
  – Multiprotocol Label Switching (MPLS)
    • Equipment vendors: Cisco, Juniper
    • Network: Internet2, ESnet
  – Virtual Local Area Network (VLAN)
    • Equipment vendors: Dell, Intel, Foundry, Extreme
    • Network: Enterprise local area networks
• Just need to “enable” connection-oriented network through already deployed boxes

Bandwidth sharing problem in heterogeneous network

[Figure: a request for a 30Mbps connection across a network of nodes a to f. Links carry labels such as "2, 30Mbps", "1, 150Mbps", "1, 10Mbps", "5, 100Mbps", "1, 500Mbps", "2, 50Mbps" and "1, 50Mbps". Switch granularities shown: 1Mbps, 10Gbps, 51Mbps.]

• Problem:
  – Tradeoff of fairness and utilization becomes more difficult when these crossconnect granularities are considered

Interconnecting these networks

• Tricky business!
• Involves many levels of interworking protocols
  – User (data) plane
  – Signaling protocols (for connection setup/release)
  – Routing protocols (for reachability, topology, loading data dissemination)

But

• We need to solve this internetworking problem for a true connection-oriented service to flourish!

Acknowledgment: DOE grant

Why do this?

• Two simple views
  – Purpose of a communication link, and by extension, a communication network
  – Analogy with transportation modes

Why do this?

• View 1: Purpose of a communication link and by extension a communication network
  – To provide connectivity between a data sending entity and a data receiving entity
  – Quantify connectivity
    • bandwidth is a primary measure
  – Shouldn’t we have a network that provides users specific bandwidth levels as requested, and when requested, on a dime?

Why do this?

• View 2: Analogy with people/goods transportation modes
  – unreserved travel: roadways
  – reserved travel: airline seat
• So why not at least two such networks for moving data?

Would anyone use it?

• Don’t know
• Depends on the business case
  – What’s the cost of building this network?
  – What’s the market?
  – Can the service price be set to turn a profit, i.e., to let companies survive?

Hasn't this been attempted before?

• ATM-to-the-desktop
  – Goal: to enable an end-to-end connection-oriented service
  – It was a homogeneous network – all ATM switches
  – Recognition of need to interwork with IP
    • LANE, MPOA
    • Soon morphed into ATM networks offering connection oriented service to interconnect routers NOT end hosts
  – Application focus: mostly multimedia
    • delay-sensitive but “low” bandwidth
    • could be supported with simple priority queueing added to connectionless packet switches

First difference: aiming for a heterogeneous internet using already deployed switches and gateways

Call to make a reservation (if only for part of the distance: airport-to-airport)

Second difference

• Ipsilon’s IP switch
  – Flow classification at “airport” to trigger connection setup
  – Questions of scalability – notion of having to hold “state” information for millions of flows
    • No, just the ones who requested bandwidth

[Figure: a CL network and a CO network joined at two “airports”]

Summary

• Rich new set of research problems
• Experimental challenges aplenty!
• Real opportunity to deploy a CO internetwork
• Web site: http://cheetah.cs.virginia.edu