Transcript Slide 1
Packet Switches with Output and Shared Buffer

Packet Switches with Output Buffers and Shared Buffer
• Packet switches with output buffers, or shared buffer
• Delay Guarantees
• Fairness
• Fair Queueing
• Deficit Round Robin
• Random Early Detection
• Weighted Fair Early Packet Discard

Quality of Service: Requirements
How stringent the quality-of-service requirements are. (5-30)

Buffering
Smoothing the output stream by buffering packets.

Quality of Service
• Integrated Services
  • Bandwidth is negotiated and the traffic is policed or shaped accordingly
• Differentiated Services
  • Traffic is served according to its priority: expedited forwarding (EF), assured forwarding (AF), best-effort forwarding (BE)

The Leaky Bucket Algorithm
(a) A leaky bucket with water. (b) A leaky bucket with packets.

The Leaky Bucket Algorithm
(a) Input to a leaky bucket. (b) Output from a leaky bucket. Output from a token bucket with capacities of (c) 250 KB, (d) 500 KB, (e) 750 KB. (f) Output from a 500 KB token bucket feeding a 10 MB/sec leaky bucket.

The Token Bucket Algorithm (5-34)
(a) Before. (b) After.

Admission Control
An example of flow specification. (5-34)

Packet Switches with Output Buffers

Packet Switches with Shared Buffer

Delay Guarantees
• All flows must police their traffic: each sends at most a certain amount of data within one policing interval
• E.g. a 10Mbps flow should send at most 10Kb within 1ms
• If the output is not overloaded, it is guaranteed that the data passes the switch within one policing interval

Fairness
• When some output is overloaded, its bandwidth should be fairly shared among different flows.
• What is fair?
• A widely adopted definition is max-min fairness.
• The simplest definition (for me) of fair service is bit-by-bit round-robin (BR).

Fairness Definitions
1.
Max-min fairness:
   1) No user receives more than it requests
   2) No other allocation scheme has a higher minimum allocation (received service divided by weight w)
   3) Condition (2) recursively holds when the minimal user is removed
2. Generalized Processor Sharing: if Si(t1,t2) is the amount of traffic of flow i served in (t1,t2) and flow i is backlogged during (t1,t2), then for every flow j it holds
   $$\frac{S_i(t_1,t_2)}{S_j(t_1,t_2)} \ge \frac{w_i}{w_j}$$

Examples
• Link bandwidth is 10Mbps; Flow rates: 10Mbps, 30Mbps; Flow weights: 1,1; Fair shares: 5Mbps, 5Mbps
• Link bandwidth is 10Mbps; Flow rates: 10Mbps, 30Mbps; Flow weights: 4,1; Fair shares: 8Mbps, 2Mbps
• Link bandwidth is 10Mbps; Flow rates: 4Mbps, 30Mbps; Flow weights: 3,1; Fair shares: 4Mbps, 6Mbps
• Exercise: Link bandwidth 100Mbps; Flow rates: 5,10,20,50,50,100; Flow weights: 1,4,4,2,7,2; Fair shares?

Fairness Measure
• It is obviously impossible to implement bit-by-bit round-robin
• Any practical algorithm will not be perfectly fair; there is a trade-off between the protocol complexity and its level of fairness
• Fairness measure is defined as:
   $$FM = \max \left| \frac{S_i(t_1,t_2)}{w_i} - \frac{S_j(t_1,t_2)}{w_j} \right|$$
  where flows i and j are backlogged during (t1,t2), and FM should be as low as possible

Fair Queueing (FQ)
• It is an emulation of bit-by-bit round robin, proposed by Demers, Keshav and Shenker
• Introduce virtual time V(t), which is the number of service rounds until time t, and is calculated as:
   $$\frac{dV}{dt} = \frac{1}{N_{ac}(t)}$$
  where N_ac(t) is the number of active flows at time t and the link rate is normalized to 1
• Define S_i^k as the virtual time when service of packet k of flow i starts, and F_i^k as the virtual time when this packet departs the switch. Its length is L_i^k, and its arrival time at the switch is t_i^k. It holds
   $$S_i^k = \max\left(F_i^{k-1}, V(t_i^k)\right), \qquad F_i^k = S_i^k + L_i^k$$
• Packets are transmitted in increasing order of their virtual departure times F_i^k

Examples of FQ Performance
• The performance of different end-to-end flow control mechanisms passing through switches employing FQ has been examined by Demers et al.
• The generic flow control algorithm uses a sliding window like TCP, and a timeout mechanism where congestion recovery starts after 2RTT (RTT is an exponentially averaged round trip time)
• The flow control algorithm proposed by Jacobson and Karels (JK), TCP Tahoe version. It comprises: slow start, adaptive window threshold, tedious estimation of RTT
• In the selective DECbit algorithm, switches send congestion messages to sources using more than their fair shares

Examples of FQ Performance
• Telnet source 40B per 5s, FTP 1KB, maximum window size 5
[Figure: network with sources F1-F6, T7 and T8; labels: 800kbps, B 15 packets, 56kbps]

            FTP                                    Telnet
Policy      F1     F2     F3     F4     F5     F6     T7     T8
G/FIFO      18   1154   1159      3   1149     15     31      3
G/FQ       178    838    591    600    615    621     96     98
JK/FIFO    582    583    585    585    583    582      3      0
JK/FQ      574    579    546    594    599    601     87     96
DEC        582    582    582    582    582    582     99     90
Sl DEC     582    582    582    582    582    582    105     97

Examples of FQ Performance
• Telnet source 40B per 5s, FTP 1KB, maximum window size 5, ill-behaved source sending at twice the line bit-rate
[Figure: network with sources F1, T2 and I3; labels: 800kbps, B 20 packets, 56kbps]

            FTP    Telnet   Ill behaved
Policy      F1     T2       I3
G/FIFO       3     11     3497
G/FQ      3491     95        5
JK/FIFO      0      0     3500
JK/FQ     3489    110        6
DEC        166      0     3334
Sl DEC    3493     95        3

Examples of FQ Performance
• FTP 1KB, maximum window size 5
[Figure: multi-node network with FTP sources F1-F4 and nodes labeled S and S1-S4; labels: 56kbps, 20 packets]

            FTP
Policy      F1     F2     F3     F4
G/FIFO    2500   2500   2500   1000
G/FQ      1750   1750   1750   1750
JK/FIFO   2500   2500   2500   1000
JK/FQ     1750   1750   1750   1750
DEC       2395   2406   2377    783
Sl DEC    1750   1750   1750   1750

Packet Generalized Processor Sharing (PGPS)
• Parekh and Gallager generalized FQ by introducing weights and simplified it a little
• Virtual time is updated whenever there is an event in the system (arrival or departure) as follows
   $$V(t) = V(t_{j-1}) + \frac{t - t_{j-1}}{\sum_{i \in A_j} w_i}, \qquad t_{j-1} \le t \le t_j$$
  where t_j is the time of event j and A_j is the set of flows active between events j-1 and j
• Virtual start and departure times of packet k of flow i are calculated as
   $$S_i^k = \max\left\{F_i^{k-1}, V(t_i^k)\right\}, \qquad F_i^k = S_i^k + \frac{L_i^k}{w_i}$$

Properties of PGPS
• Theorem: For PGPS it holds that
   $$FM = \max \left| \frac{S_i(t_1,t_2)}{w_i} - \frac{S_j(t_1,t_2)}{w_j} \right| \le L_{max}$$
  where Lmax is the maximum packet length.
• The complexity of the algorithm is O(N), because as many as N packets may arrive within one packet transmission time.

Deficit Round Robin
• Proposed by Shreedhar and Varghese at Washington University in St. Louis
• In DRR, flow i is assigned a quantum Qi proportional to its weight wi, and a counter ci. Initially the counter is set to 0. The number of bits ti that flow i transmits in a round-robin round must satisfy ti≤ci+Qi, and the counter is then set to the new value ci=ci+Qi-ti. If the queue gets emptied, ci=0.
• The complexity of this algorithm is O(1), because only a couple of operations have to be performed within a packet duration time, provided the algorithm serves a non-empty queue whenever it visits it.

Properties of DRR
• Theorem: For DRR it holds that
   $$FM = \max \left| \frac{S_i(t_1,t_2)}{w_i} - \frac{S_j(t_1,t_2)}{w_j} \right| \le 3 L_{max}$$
  where Lmax is the maximum packet length.
• Proof: Counter ci<Lmax, because ci>0 remains only if the head packet is longer than ci. It holds that Si(t1,t2)=mQi+ci(0)-ci(m), where m is the number of round-robin rounds and (t1,t2) is the busy interval, and therefore |Si(t1,t2)-mQi|<Lmax.

Properties of DRR
• Proof (cont.): Let Q be the quantum per unit weight, Qi=wi·Q. Then Si(t1,t2)/wi≤(m-1)·Q+Q+Lmax/wi, and Sj(t1,t2)/wj≥m’·Q-Lmax/wj, where m’ is the number of round-robin rounds for flow j. Because m’≥m-1, FM≤Q+Lmax/wi+Lmax/wj≤Q+2Lmax because wi,wj≥1, which is 3Lmax for the smallest admissible quantum Q=Lmax. Q≥Lmax is required for the protocol to have complexity O(1): if Q<Lmax, it may happen that a queue is not served when the round-robin pointer points to it, and the complexity of the algorithm is larger than O(1). Namely, each queue visit incurs a comparison, and many queues may be visited, up to N per packet transmission.

Properties of DRR
• Maximum delay in BR is NLmax/B. In DRR, an incoming packet might have to wait for ∑iQi/B, and its maximum delay is NQmax/B. So, the ratio of the DRR delay and the ideal delay is Qmax/Lmax=Qmax/Qmin=wmax/wmin, and may be significant if the fairness granularity should be very fine.
• Shreedhar and Varghese propose to serve delay-sensitive traffic using reservations, with that traffic being policed.

Packet Discard
• The first schemes discard packets arriving at a full buffer, or at a buffer whose number of queued packets exceeds some specified threshold.
• They are biased against bursty traffic, because the probability that a packet is discarded increases with its burst length.
• TCP sources whose packets are discarded slow down their rates and underutilize the network. All sources become synchronized, so the network throughput oscillates and the efficiency becomes low.

Random Early Detection (RED)
• Floyd and Jacobson introduced two thresholds for the queue length in the random early detection (RED) algorithm.
• When the queue length exceeds the low threshold but is below the high threshold, packets are dropped with a probability which increases with the queue length. The probability is calculated so that the packets that are dropped are equally spaced.
• When the queue length exceeds the higher threshold, all incoming packets are dropped.
• The queue length is calculated as an exponential weighted moving average, so it depends on the instantaneous queue length and on past values of the queue length.

Motivation for RED
• Global synchronization is avoided by making a softer decision on packet dropping, i.e. by using two thresholds, and by evenly dropping packets between the thresholds.
• Calculating the queue length as an exponential weighted moving average allows short-term bursts, because they do not trigger packet drops.
• Also, the authors argue that fair queueing is not required because the flows sending more traffic will lose more packets. But it was shown in subsequent papers that the fairness is not satisfactory because the flows are not isolated.

Severe Criticism of RED
• Bonald, May and Bolot severely criticize RED.
They analyzed RED and TailDrop
• Removing the bias against bursty traffic means higher drop probabilities for UDP traffic because TCP dominates
• The average number of consecutive dropped packets is higher for RED, and so (they claim) is the possibility for synchronization
• They show that the jitter introduced by RED is higher

Weighted Fair Early Packet Discard (WFEPD)
• Racz, Fodor, and Turanyi proposed the WFEPD protocol to ensure fair throughput to different flows
• Calculate the average flow rate as a moving average
   $$r_i^{av} = q\, r_i^{av} + (1-q)\, \frac{c_i}{\Delta}$$
  where ci is the number of bytes arrived in the last interval of length Δ

Weighted Fair Early Packet Discard (WFEPD)
• Violating, non-violating and pending sources are determined based on their rates
• Flows are ordered so that
   $$\frac{r_1^{av}}{w_1} \ge \frac{r_2^{av}}{w_2} \ge \dots \ge \frac{r_N^{av}}{w_N}$$
• If the first k-1 flows are violating, and E is the rate in excess of the output bandwidth R, then the bandwidth of violating flows is
   $$R_v = R - \sum_{i=k}^{N} r_i^{av} = \sum_{i=1}^{k-1} r_i^{av} - \left( \sum_{i=1}^{N} r_i^{av} - R \right) = \sum_{i=1}^{k-1} r_i^{av} - E$$

Weighted Fair Early Packet Discard (WFEPD)
• If kmin is the minimal k for which the inequality
   $$r_k^{av} \le \frac{w_k}{\sum_{i=1}^{k} w_i} \left( \sum_{i=1}^{k} r_i^{av} - E \right)$$
  is satisfied, then all flows below kmin are violating, and they get:
   $$r_i^{sch} = \frac{w_i}{\sum_{l=1}^{k_{min}-1} w_l} \left( \sum_{l=1}^{k_{min}-1} r_l^{av} - E \right)$$

Weighted Fair Early Packet Discard (WFEPD)
• pmin is the largest p for which the following inequality holds:
   $$r_p^{av} \ge th_{min}\, \frac{w_p}{\sum_{i=1}^{p} w_i} \left( \sum_{i=1}^{p} r_i^{av} - E \right)$$
• pmax is the minimal integer that satisfies:
   $$r_p^{av} \le th_{max}\, r_p^{sch}$$
• Here 0<thmin<1 and thmax>1. Flows from pmin to pmax are pending, and their packets are dropped with a probability which linearly increases with the flow rate

Examples of WFEPD Performance
• Fair for TCP flows, gives bandwidth according to the weights. FIFO and early packet discard (EPD) protocol
• Isolates misbehaving UDP flows that overload the output port and gives them almost the same shares as TCP flows with equal weights. FIFO queueing gives TCP flows only the remaining bandwidth.
• Gives equal shares to TCP flows with different round-trip times (RTT) and equal weights, while FIFO queueing gives three times more bandwidth to the flows with three times shorter RTT

References
• A. Demers, S. Keshav, and S. Shenker, “Analysis and simulation of a fair queueing algorithm,” Internetworking: Research and Experience, vol. 1, 1990.
• A. Parekh and R. Gallager, “A generalized processor sharing approach to flow control in integrated services networks: The single-node case,” IEEE/ACM Transactions on Networking, vol. 1, no. 3, June 1993.
• M. Shreedhar and G. Varghese, “Efficient fair queueing using deficit round robin,” IEEE/ACM Transactions on Networking, vol. 4, no. 3, 1996.
• J. Bennett and H. Zhang, “Hierarchical packet fair queueing algorithms,” IEEE/ACM Transactions on Networking, vol. 5, no. 5, October 1997.
• S. Floyd and V. Jacobson, “Random early detection gateways for congestion avoidance,” IEEE/ACM Transactions on Networking, vol. 1, no. 4, August 1993, pp. 397-413.
• T. Bonald, M. May, and J.C. Bolot, “Analytic evaluation of RED performance,” INFOCOM 2000, March 2000, pp. 1415-1424.
• A. Racz, G. Fodor, and Z. Turanyi, “Weighted fair early packet discard at an ATM switch output port,” INFOCOM 1999, pp. 1160-1168.
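
Worked example: weighted max-min fair shares

The fair shares quoted on the "Examples" slide (5 and 5, 8 and 2, 4 and 6 Mbps) can be reproduced with a short progressive-filling routine. This is only a sketch under stated assumptions: the function name `weighted_max_min` and its argument names are hypothetical, rates are plain numbers in Mbps, and the routine is one common way to compute weighted max-min shares rather than a method taken from the slides.

```python
def weighted_max_min(capacity, demands, weights):
    """Weighted max-min fair shares by progressive filling (illustrative)."""
    shares = [0.0] * len(demands)
    active = set(range(len(demands)))    # flows whose demand is not yet met
    remaining = float(capacity)
    while active and remaining > 1e-12:
        total_w = sum(weights[i] for i in active)
        level = remaining / total_w      # bandwidth offered per unit weight
        # Flows that request no more than their weighted offer are capped
        # at their demand; what they leave over is shared again.
        satisfied = [i for i in active if demands[i] <= level * weights[i]]
        if not satisfied:
            # Everyone wants more than the offer: split by weight and stop.
            for i in active:
                shares[i] = level * weights[i]
            break
        for i in satisfied:
            shares[i] = float(demands[i])
            remaining -= demands[i]
            active.remove(i)
    return shares


if __name__ == "__main__":
    # The three examples from the slides (link bandwidth 10 Mbps).
    print(weighted_max_min(10, [10, 30], [1, 1]))
    print(weighted_max_min(10, [10, 30], [4, 1]))
    print(weighted_max_min(10, [4, 30], [3, 1]))
```

In the third case the first flow asks for less than its weighted share (4 of 7.5 Mbps), so the unused part goes to the second flow, which is how the slide arrives at 4 and 6 Mbps.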
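
Worked example: Deficit Round Robin

The counter update described on the "Deficit Round Robin" slide can be traced with a small simulation. This is a minimal sketch, not the authors' implementation: the function name `drr_schedule`, the `max_rounds` guard and the list-of-packet-lengths input format are assumptions made for illustration, and all packets are taken to be queued before service starts.

```python
from collections import deque


def drr_schedule(queues, quanta, max_rounds=1000):
    """Return the DRR transmission order as (flow, packet_length) pairs.

    queues: one list of packet lengths per flow, head of queue first
    quanta: per-flow quantum Q_i, in the same units as the packet lengths
    """
    queues = [deque(q) for q in queues]
    counters = [0] * len(queues)         # counters c_i, initially 0
    order = []
    for _ in range(max_rounds):          # guard against endless loops
        if not any(queues):
            break                        # every queue is empty
        for i, q in enumerate(queues):
            if not q:
                continue                 # nothing to serve for this flow
            budget = counters[i] + quanta[i]
            # Send head packets while they still fit in c_i + Q_i.
            while q and q[0] <= budget:
                length = q.popleft()
                budget -= length
                order.append((i, length))
            # The unused budget carries over only while the flow stays
            # backlogged; an emptied queue resets its counter to 0.
            counters[i] = budget if q else 0
    return order


if __name__ == "__main__":
    print(drr_schedule([[300, 300, 300], [600, 600]], [600, 600]))
    print(drr_schedule([[300, 300, 300], [600, 600]], [500, 500]))
```

With quanta of 600 (equal to the largest packet), each flow sends 600 units per round and every visit serves at least one packet. With quanta of 500 the second flow is visited in the first round without being served, which is the situation the slides point to when they require Q≥Lmax for O(1) complexity.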