Transcript Slide 1

Packet Switches with
Output and Shared Buffer
1
Packet Switches with Output
Buffers and Shared Buffer
• Packet switches with output buffers or a
shared buffer
• Delay Guarantees
• Fairness
• Fair Queueing
• Deficit Round Robin
• Random Early Detection
• Weighted Fair Early Packet Discard
2
Quality of Service: Requirements
How stringent the quality-of-service
requirements are (figure 5-30).
3
Buffering
Smoothing the output stream by
buffering packets.
4
Quality of Service
• Integrated Services
• Bandwidth is negotiated and the
traffic is policed or shaped
accordingly
• Differentiated Services
• Traffic is served according to its
priority: Expedite forwarding (EF),
assured forwarding (AF), best effort
forwarding (BE)
5
The Leaky Bucket Algorithm
(a) A leaky bucket with water. (b) A leaky
bucket with packets.
6
The Leaky Bucket Algorithm
(a) Input to a leaky bucket. (b) Output from a leaky bucket. Output from a token bucket with capacities of (c) 250 KB, (d) 500 KB, (e) 750 KB. (f) Output from a 500 KB token bucket feeding a 10 MB/sec leaky bucket.
7
The Token Bucket Algorithm
(a) Before. (b) After (figure 5-34).
8
Admission Control
An example of flow specification (figure 5-34).
9
Packet Switches with Output Buffers
10
Packet Switches with Shared Buffer
11
Delay Guarantees
• All flows must police their traffic:
send a certain amount of data within
one policing interval
• E.g., a 10 Mbps flow may send 10 Kb
within a 1 ms policing interval
• If the output is not overloaded, the
data is guaranteed to pass the switch
within one policing interval
12
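The policing arithmetic in the example above can be checked with a one-line calculation; the function name is ours, not from the slides.

```python
# Policing arithmetic: a flow of rate R bps may send R * T bits
# within one policing interval of length T seconds.

def bits_per_interval(rate_bps, interval_s):
    return rate_bps * interval_s

# 10 Mbps flow and a 1 ms policing interval -> 10 Kb, as in the example.
print(bits_per_interval(10e6, 1e-3))    # 10000.0
```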
Fairness
• When some output is overloaded, its
bandwidth should be fairly shared
among different flows.
• What is fair?
• Widely adopted definition is max-min
fairness.
• The simplest definition (for me) for fair
service is bit-by-bit round-robin (BR).
13
Fairness Definitions
1.
Max-min fairness:
1) No user receives more than it requests
2) No other allocation scheme has a higher minimum
allocation (received service divided by weight w)
3) Condition (2) recursively holds when the minimal
user is removed
2. Generalized Processor Sharing: if Si(t1,t2) is the
amount of traffic of flow i served in (t1,t2)
and flow i is backlogged during (t1,t2), then it
holds that
Si(t1,t2) / Sj(t1,t2) ≥ wi / wj
14
Examples
• Link bandwidth is 10Mbps; Flow rates: 10Mbps,
30Mbps; Flow weights: 1,1; Fair shares:
5Mbps, 5Mbps
• Link bandwidth is 10Mbps; Flow rates: 10Mbps,
30Mbps; Flow weights: 4,1; Fair shares:
8Mbps, 2Mbps
• Link bandwidth is 10Mbps; Flow capacities:
4Mbps, 30Mbps; Flow weights: 3,1; Fair
shares: 4Mbps, 6Mbps
• Exercise: Link bandwidth 100Mbps; Flow
rates: 5,10,20,50,50,100; Flow weights:
1,4,4,2,7,2; Fair shares ?
15
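The fair shares in the examples above follow from weighted max-min fairness, which can be computed by water-filling: raise a common level and cap each flow at its own rate once the level reaches it. The sketch below is illustrative (the function name and tolerance are our own, not from the slides).

```python
# Weighted max-min fair allocation via water-filling: raise a common
# "water level" L and give each flow min(rate_i, w_i * L) until the
# link bandwidth is exhausted.

def max_min_shares(bandwidth, rates, weights):
    n = len(rates)
    shares = [0.0] * n
    active = set(range(n))          # flows not yet capped at their rate
    remaining = float(bandwidth)
    while active and remaining > 1e-9:
        level = remaining / sum(weights[i] for i in active)
        # flows whose demand fits under the current level are capped
        capped = [i for i in active if rates[i] <= weights[i] * level]
        if not capped:
            for i in active:
                shares[i] = weights[i] * level
            return shares
        for i in capped:
            shares[i] = rates[i]
            remaining -= rates[i]
            active.remove(i)
    return shares

# Second example from the slide: 10 Mbps link, rates 10 and 30 Mbps,
# weights 4 and 1 -> fair shares 8 and 2 Mbps.
print(max_min_shares(10, [10, 30], [4, 1]))      # [8.0, 2.0]
```

Calling the same function with the exercise's bandwidth, rates and weights solves it in the same way.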
Fairness Measure
• It is obviously impossible to implement bit-by-bit round-robin
• Any practical algorithm will not be
perfectly fair; there is a trade-off between
the protocol complexity and its level of
fairness
• The fairness measure is defined as
FM = max | Si(t1,t2)/wi − Sj(t1,t2)/wj |
where flows i and j are backlogged during
(t1,t2); FM should be as low as possible
16
Fair Queueing (FQ)
• It is an emulation of bit-by-bit round robin, proposed by
Demers, Keshav and Shenker
• Introduce the virtual time V(t), the number of service
rounds completed until time t, calculated as
dV/dt = 1/Nac(t)
where Nac(t) is the number of active (backlogged) flows at time t
• Denote by Si^k the virtual time when packet k of flow i
starts its service, and by Fi^k the virtual time when this packet
departs the switch; its length is Li^k, and it arrives
to the switch at time ti^k. It holds that
Si^k = max(Fi^(k-1), V(ti^k)),   Fi^k = Si^k + Li^k
• Packets are transmitted in increasing order of
their departure times
17
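The ordering rule above can be sketched in a few lines. To avoid integrating the virtual-time equation, this simplified sketch assumes all packets arrive at time 0, so V(arrival) = 0; the flow names and packet lengths are hypothetical.

```python
import heapq

# Sketch of fair queueing's ordering rule: packet k of flow i gets
# S_i^k = max(F_i^(k-1), V(arrival)) and F_i^k = S_i^k + L_i^k, and
# packets depart in increasing order of F.

def fq_order(flows):
    """flows: dict of flow name -> list of packet lengths (bits)."""
    heap = []
    for name, lengths in flows.items():
        f = 0
        for k, length in enumerate(lengths):
            s = max(f, 0)            # V(arrival) = 0 in this sketch
            f = s + length           # virtual finish time
            heapq.heappush(heap, (f, name, k))
    order = []
    while heap:
        _, name, k = heapq.heappop(heap)
        order.append((name, k))
    return order

# One 1500-bit packet vs three 500-bit packets: the short packets
# finish at virtual times 500, 1000 and 1500, so they interleave
# ahead of (and tie with) the long packet.
print(fq_order({"A": [1500], "B": [500, 500, 500]}))
# [('B', 0), ('B', 1), ('A', 0), ('B', 2)]
```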
Examples of FQ Performance
• The performance of different end-to-end flow-control
mechanisms passing through switches employing FQ
has been examined by Demers et al.
• The generic flow-control algorithm uses a sliding window
like TCP, and a timeout mechanism where congestion
recovery starts after 2·RTT (RTT is an exponentially
averaged round-trip time)
• The flow-control algorithm proposed by Jacobson and
Karels is the TCP Tahoe version; it comprises slow start,
an adaptive window threshold, and careful estimation of RTT
• In the selective DECbit algorithm, switches send
congestion messages to the sources using more than their
fair shares
18
Examples of FQ Performance
• Telnet sources send 40 B every 5 s; FTP packets are 1 KB; the maximum window size is 5
[Figure: FTP sources F1–F6 and Telnet sources T7, T8 feed switch B (15-packet buffer) over 800 kbps links; the output link is 56 kbps]

F1–F6 are FTP flows, T7 and T8 are Telnet flows:

Policy    F1    F2    F3    F4    F5    F6    T7   T8
G/FIFO    18    1154  1159  3     1149  15    31   3
G/FQ      178   838   591   600   615   621   96   98
JK/FIFO   582   583   585   585   583   582   3    0
JK/FQ     574   579   546   594   599   601   87   96
DEC       582   582   582   582   582   582   99   90
Sl DEC    582   582   582   582   582   582   105  97
19
Examples of FQ Performance
• Telnet source sends 40 B every 5 s; FTP packets are 1 KB; the maximum window size is 5; the ill-behaved source sends at twice the line bit-rate
[Figure: FTP source F1, Telnet source T2 and ill-behaved source I3 feed switch B (20-packet buffer) over 800 kbps links; the output link is 56 kbps]

Policy    F1 (FTP)   T2 (Telnet)   I3 (ill-behaved)
G/FIFO    3          11            3497
G/FQ      3491       95            5
JK/FIFO   0          0             3500
JK/FQ     3489       110           6
DEC       166        0             3334
Sl DEC    3493       95            3
20
Examples of FQ Performance
• FTP packets are 1 KB; the maximum window size is 5
[Figure: FTP sources F1–F4 and switches S1–S4 connected by 56 kbps links; each switch has a 20-packet buffer]

Policy    F1    F2    F3    F4
G/FIFO    2500  2500  2500  1000
G/FQ      1750  1750  1750  1750
JK/FIFO   2500  2500  2500  1000
JK/FQ     1750  1750  1750  1750
DEC       2395  2406  2377  783
Sl DEC    1750  1750  1750  1750
21
Packet Generalized Processor Sharing
(PGPS)
• Parekh and Gallager generalized FQ by
introducing weights and simplified it a little
• Virtual time is updated whenever there is an
event in the system, an arrival or a departure,
as follows:
V(t) = V(t_(j-1)) + (t − t_(j-1)) / Σ_(i∈Aj) wi,   t_(j-1) ≤ t ≤ tj
where Aj is the set of flows backlogged during (t_(j-1), tj)
• Virtual arrival and departure times of packet k of
flow i are calculated as
Si^k = max(Fi^(k-1), V(ti^k)),   Fi^k = Si^k + Li^k/wi
22
Properties of PGPS
• Theorem: For PGPS it holds that
FM = max | Si(t1,t2)/wi − Sj(t1,t2)/wj | ≤ Lmax
where Lmax is the maximum packet length.
• The complexity of the algorithm is O(N),
because up to N packets may arrive within one
packet transmission time.
24
Deficit Round Robin
• Proposed by Shreedhar and Varghese at
Washington University in St. Louis
• In DRR, flow i is assigned a quantum Qi
proportional to its weight wi, and a counter ci,
initially set to 0. The number of bits ti of the
packets transmitted in a round-robin round
must satisfy ti ≤ ci + Qi, and the counter is
then set to ci = ci + Qi − ti. If the queue
empties, ci = 0.
• The complexity of this algorithm is O(1),
because only a couple of operations are
performed per packet transmission time,
provided the algorithm serves a non-empty
queue whenever it visits one.
25
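The quantum-and-counter mechanism above can be sketched directly; the queues, quanta and packet sizes below are made up for illustration.

```python
from collections import deque

# Sketch of deficit round robin: each non-empty queue i gets its
# counter incremented by its quantum Q_i once per round and may send
# head packets while they fit within the counter; the counter resets
# to 0 whenever the queue empties.

def drr(queues, quanta, rounds):
    """queues: list of deques of packet lengths; returns (queue, pkt) order."""
    counters = [0] * len(queues)
    served = []
    for _ in range(rounds):
        for i, q in enumerate(queues):
            if not q:
                counters[i] = 0          # an empty queue keeps no deficit
                continue
            counters[i] += quanta[i]
            while q and q[0] <= counters[i]:
                pkt = q.popleft()
                counters[i] -= pkt
                served.append((i, pkt))
            if not q:
                counters[i] = 0
    return served

q0 = deque([500, 500, 500])
q1 = deque([1500])
print(drr([q0, q1], quanta=[500, 500], rounds=3))
# [(0, 500), (0, 500), (0, 500), (1, 1500)]
```

With equal quanta, each flow transmits 1500 bits over the three rounds: the long packet waits until its counter has accumulated enough deficit, which is exactly the fairness property DRR approximates.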
Properties of DRR
• Theorem: For DRR it holds that
FM = max | Si(t1,t2)/wi − Sj(t1,t2)/wj | ≤ 3·Lmax
where Lmax is the maximum packet length.
• Proof: Counter ci < Lmax, because ci remains
positive only if the head packet is longer than
ci. It holds that Si(t1,t2) = m·Qi + ci(0) − ci(m),
where m is the number of round-robin rounds and
(t1,t2) is the busy interval, and therefore
|Si(t1,t2) − m·Qi| < Lmax.
26
Properties of DRR
• Proof (cont.): Si(t1,t2)/wi ≤ (m−1)·Q + Q + Lmax/wi,
and Sj(t1,t2)/wj ≥ m'·Q − Lmax/wj, where m' is the
number of round-robin rounds for flow j.
Because m' ≥ m−1, FM ≤ Q + Lmax/wi + Lmax/wj,
which equals 3·Lmax for wi = wj = 1 and Q = Lmax;
Q ≥ Lmax is required for the protocol to have
O(1) complexity. Namely, if Q < Lmax it may
happen that a queue is not served when the
round-robin pointer points to it, and the
complexity of the algorithm becomes larger than
O(1): each queue visit incurs a comparison, and
up to N queues may be visited per packet
transmission.
27
Properties of DRR
• The maximum delay in BR is N·Lmax/B. In DRR, an
incoming packet might have to wait ΣiQi/B,
and its maximum delay is N·Qmax/B.
So, the ratio of the DRR delay to the ideal
delay is Qmax/Lmax = Qmax/Qmin = wmax/wmin, and
may be significant if the fairness granularity
must be very fine.
• Shreedhar and Varghese propose to serve
delay-sensitive traffic with reservations,
and to police it.
28
Packet Discard
• Early schemes discard packets arriving to a
full buffer, or to a buffer whose number of
queued packets exceeds some specified
threshold.
• They are biased against bursty traffic,
because the probability that a packet of a
burst is discarded increases with the burst
length.
• TCP sources whose packets are discarded slow
down their rates and underutilize the network.
Because all such sources are synchronized, the
network throughput oscillates and the
efficiency becomes low.
30
Random Early Detection (RED)
• Floyd and Jacobson introduced two thresholds for the
queue length in the random early detection (RED)
algorithm.
• When the queue length exceeds the low threshold but
is below the high threshold, packets are dropped with
a probability that increases with the queue length.
The probability is calculated so that the dropped
packets are equally spaced.
• When the queue length exceeds the high threshold,
all incoming packets are dropped.
• The queue length is calculated as an exponentially
weighted moving average, so it depends on the
instantaneous queue length and on past values of the
queue length.
31
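The two-threshold drop decision can be sketched as below. The parameter values are illustrative, not from the paper, and the sketch omits RED's counter of packets since the last drop, which is what makes the real algorithm's drops equally spaced.

```python
import random

# Sketch of RED's drop decision: an EWMA of the queue length and two
# thresholds. Between min_th and max_th the drop probability grows
# linearly with the average queue length; above max_th every arrival
# is dropped.

class RedQueue:
    def __init__(self, min_th=5, max_th=15, max_p=0.1, weight=0.002):
        self.min_th, self.max_th = min_th, max_th
        self.max_p = max_p        # drop probability reached at max_th
        self.weight = weight      # EWMA weight for the queue average
        self.avg = 0.0

    def on_arrival(self, queue_len):
        """Update the average queue length; return True to drop."""
        self.avg = (1 - self.weight) * self.avg + self.weight * queue_len
        if self.avg < self.min_th:
            return False
        if self.avg >= self.max_th:
            return True
        p = self.max_p * (self.avg - self.min_th) / (self.max_th - self.min_th)
        return random.random() < p
```

The small EWMA weight is what lets short bursts through: a brief spike in the instantaneous queue length barely moves the average, so it does not trigger drops.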
Motivation for RED
• Global synchronization is avoided by making a
softer decision on packet dropping, i.e. by using two
thresholds, and by evenly dropping packets between
them.
• Calculating the queue length as an exponentially
weighted moving average allows short-term bursts,
because they do not trigger packet drops.
• The authors also argue that fair queueing is not
required, because flows sending more traffic will lose
more packets. But subsequent papers showed that the
fairness is not satisfactory, because the flows are not
isolated.
32
Severe Criticism of RED
• Bonald, May and Bolot severely criticize RED.
They analyzed RED and TailDrop
• Removing bias against bursty traffic means
higher drop probabilities for UDP traffic
because TCP dominates
• The average number of consecutive dropped
packets is higher for RED, and so, they claim,
is the likelihood of synchronization
• They show that jitter introduced by RED is
higher
33
Weighted Fair
Early Packet Discard (WFEPD)
• Racz, Fodor and Turanyi proposed the WFEPD
protocol to ensure fair throughput to
different flows
• The average flow rate is calculated as a
moving average
ri^av = q·ri^av + (1 − q)·ci/τ
where ci is the number of bytes that arrived in
the last interval of length τ
34
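The moving-average rate update can be sketched in a few lines; the values of τ and q below are illustrative, not taken from the paper.

```python
# Sketch of WFEPD's rate estimate: every interval of length tau,
# r_av is updated as q * r_av + (1 - q) * c / tau, where c is the
# number of bytes that arrived during the last interval.

def update_rate(r_av, bytes_in_interval, tau=0.1, q=0.9):
    return q * r_av + (1 - q) * bytes_in_interval / tau

# A flow sending a constant 1000 B per 0.1 s interval converges
# toward a measured rate of 10 kB/s.
r = 0.0
for _ in range(50):
    r = update_rate(r, 1000)
print(round(r))    # close to 10000
```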
Weighted Fair
Early Packet Discard (WFEPD)
• Violating, non-violating and pending sources are
determined based on their rates
• Flows are ordered so that
r1^av/w1 ≥ r2^av/w2 ≥ … ≥ rN^av/wN
• If the first k−1 flows are violating, and
E = Σ_(i=1..N) ri^av − R is the rate in excess,
then the bandwidth left to the violating flows is
Rv = R − Σ_(i=k..N) ri^av = Σ_(i=1..k−1) ri^av − (Σ_(i=1..N) ri^av − R) = Σ_(i=1..k−1) ri^av − E
35
Weighted Fair
Early Packet Discard (WFEPD)
• If kmin is the minimal k for which the inequality
rk^av ≤ (Σ_(i=1..k) ri^av − E) · wk / Σ_(i=1..k) wi
is satisfied, then all flows below kmin are violating, and
they get:
ri^sch = (Σ_(i=1..kmin−1) ri^av − E) · wi / Σ_(i=1..kmin−1) wi
36
Weighted Fair
Early Packet Discard (WFEPD)
• Let pmin be the largest p for which the inequality
rp^av ≥ thmin · (Σ_(i=1..p) ri^av − E) · wp / Σ_(i=1..p) wi
holds, and pmax be the minimal integer that satisfies
rp^av ≤ thmax · rp^sch
• Here 0 < thmin < 1 and thmax > 1. Flows from pmin to pmax are
pending, and are dropped with a probability that
increases linearly with the flow rate
37
Examples of WFEPD Performance
• WFEPD is fair for TCP flows, and gives bandwidth according
to the weights, unlike FIFO with the early packet
discard (EPD) protocol
• It isolates misbehaving UDP flows that overload the
output port and gives them almost equal shares as
TCP flows with equal weights, while FIFO queueing gives
the remaining bandwidth to TCP flows.
• It gives equal shares to TCP flows with different round-trip
times (RTT) and equal weights, while FIFO
queueing gives three times more bandwidth to the
flows with three times shorter RTT
38
References
• A. Demers, S. Keshav, and S. Shenker, “Analysis and
simulation of a fair queueing algorithm,”
Internetworking: Research and Experience, vol. 1, 1990.
• A. Parekh and R. Gallager, “A generalized processor
sharing approach to flow control in integrated
services networks: The single-node case,” IEEE/ACM
Transactions on Networking, vol. 1, no. 3, June 1993.
• M. Shreedhar and G. Varghese, “Efficient fair
queueing using deficit round robin,” IEEE/ACM
Transactions on Networking, vol. 4, no. 3, 1996.
• J. Bennett and H. Zhang, “Hierarchical packet fair
queueing algorithms,” IEEE/ACM Transactions on
Networking, vol. 5, no. 5, October 1997.
39
References
• S. Floyd and V. Jacobson, “Random early
detection gateways for congestion avoidance,”
IEEE/ACM Transactions on Networking, vol. 1,
no. 4, August 1993, pp. 397-413.
• T. Bonald, M. May, and J.C. Bolot, “Analytic
evaluation of RED performance,” INFOCOM 2000,
March 2000, pp. 1415-1424.
• A. Racz, G. Fodor, Z. Turanyi, “Weighted fair
early packet discard at an ATM switch output
port,” INFOCOM 1999, pp. 1160-1168.
40