Flow Control and Reliability Control in WebTP – Part 2
Ye Xia
10/31/00
Outline
• Motivating problems
• Recall known ideas and go through simple facts about flow control
• Flow control examples: TCP (BSD) and credit-based flow control for ATM
• WebTP challenges and tentative solutions
Motivating Problem
• Suppose a packet is lost at flow F1's receive buffer. Should the pipe's congestion window be reduced?
Two Styles of Flow Control
• TCP congestion control
– Lossy
– Traffic intensity varies slowly and oscillates.
• Credit-based flow control
– No loss
– Handles bursty traffic well; also handles a bursty receiver link well.
• TCP's receive-buffer flow control is equivalent to Kung and Morris's credit-based flow control.
What is TCP?
• (Network) congestion control (sketched below):
– Linear/sublinear increase and multiplicative decrease of the window.
– Uses binary loss information.
• (End-to-end) flow control resembles a credit scheme, with a credit update protocol.
• Can the end-to-end flow control be treated the same as congestion control? Maybe, but …
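A rough sketch of the congestion-control rule above, in Python (the constants are illustrative; real TCP adds slow start, timeouts, and more):

def update_cwnd(cwnd, loss_seen, mss=1.0):
    # Binary feedback: multiplicative decrease on loss; otherwise an
    # additive increase of one segment per round trip (illustrative).
    if loss_seen:
        return max(mss, cwnd / 2.0)
    return cwnd + mss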
Credit-Based Control (Kung and Morris 95)
• Overview of steps (see the sketch below)
– Before forwarding packets, the sender needs to receive credits from the receiver.
– At various times, the receiver sends credits to the sender, indicating the available receive buffer space.
– The sender decrements its credit balance after forwarding a packet.
• Typically ensures no buffer overflow.
• Works well over a wide range of network conditions, e.g. bursty traffic.
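A minimal sketch of the three steps (the class names and the Python list standing in for the link are illustrative assumptions, not the ATM cell-level protocol):

from collections import deque

class CreditSender:
    def __init__(self):
        self.credits = 0              # may forward only while credits > 0
        self.queue = deque()          # packets waiting to be forwarded
    def on_credit(self, credit):      # step 2: receiver advertises credits
        self.credits = credit
    def try_send(self, link):
        while self.credits > 0 and self.queue:   # step 1: need credits
            link.append(self.queue.popleft())
            self.credits -= 1         # step 3: decrement credit balance

class CreditReceiver:
    def __init__(self, buf_size):
        self.buf_size = buf_size
        self.buffered = 0
    def on_packet(self):              # never overflows: sender held a credit
        self.buffered += 1
    def on_consume(self, n):          # application drains n packets
        self.buffered -= n
    def credit(self):                 # credits = free buffer space
        return self.buf_size - self.buffered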
Credit Update Protocol (Kung and Morris 95)
Adaptive Credit-Based Control (Kung and Chang 95)
• Without adaptation: M = N * C_r * (RTT + N2 * N)
• Idea: make the buffer size proportional to actual bandwidth, for each connection.
• For each connection and on each allocation interval (transcribed in code below),
Buf_Alloc = (M/2 - TQ - N) * (VU/TU)
TQ: current buffer occupancy
VU: amount of data forwarded for the connection
TU: amount forwarded for all N connections
• M = 4 * RTT + 2 * N
• It is easy to show there are no losses. But can the allocation be controlled precisely?
• Once adaptation to bandwidth is introduced, the scheme can no longer handle bursts well.
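The allocation formula, transcribed directly into Python (variable names follow the slide; clamping at zero and the TU = 0 guard are added assumptions):

def buf_alloc(M, TQ, N, VU, TU):
    # Adaptive per-connection credit allocation [Kung and Chang 95]:
    # M: total buffer, TQ: current buffer occupancy, N: number of
    # connections, VU/TU: this connection's share of the data forwarded
    # during the last allocation interval.
    if TU == 0:
        return 0
    return max(0, (M / 2 - TQ - N) * (VU / TU))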
BSD - TCP Flow Control
• Receiver advertises free buffer space:
win = Buf_Alloc - Que_siz
• Sender can send [snd_una, snd_una + snd_win - 1], where
snd_win = win; snd_una: oldest unACKed number.
[Figure: sliding window with snd_win = 6, as advertised by the receiver.
1 2 3 | 4 5 6 | 7 8 9 | 10 11 …
sent and ACKed | sent, not ACKed | can send ASAP | can't send until window moves
snd_una = 4 (oldest unACKed number), snd_nxt = 7 (next send number)]
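In code, the sender-side bookkeeping in the figure reduces to a single expression (a simplified sketch):

def usable_window(snd_una, snd_nxt, snd_win):
    # The sender may transmit [snd_una, snd_una + snd_win - 1]; packets
    # [snd_una, snd_nxt - 1] are already in flight, so what remains is:
    return snd_una + snd_win - snd_nxt

# Figure's state: snd_una = 4, snd_nxt = 7, snd_win = 6
# -> 3 packets (7, 8, 9) can be sent right away.
assert usable_window(4, 7, 6) == 3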
TCP Example
• Receiver: ACKs 4, win = 3. (Total buffer size = 6)
[Figure: the receiver holds packets 5-7 out of order; 4 is missing, so it ACKs 4 and advertises win = 6 - 3 = 3.]
• Sender: sends 4 again. snd_win = 3, snd_una = 4.
• Sender: after 4 is received at the receiver, the cumulative ACK advances: snd_win = 6, snd_una = 8, and the window covers packets 8-13.
TCP Receiver Buffer Management
• Time-varying physical buffer size B_r(t), shared by n TCP connections.
• BSD implementation: each connection has an upper bound, B_i, on its queue size.
• Buffers are not reserved, so it is possible for the B_i to sum to more than B_r(t) at some time t.
Possible Deadlock
[Figure: connection 1 (packets 2-6) is missing packet 3 and holds 4 and 5 out of order; connection 2 (packets 4-8) is in a similar state, holding two out-of-order packets.]
• Example: two connections, each with B_i = 4.
• Suppose B_r = 4. At this point the physical buffer runs out, and reassembly cannot continue.
• Deadlock can be avoided if we allow dropping received packets. Implications for reliability control (e.g. connection 1):
– OK with TCP, because packets 4 and 5 have not been ACKed.
– WebTP may have already ACKed 4 and 5.
TCP Receiver Flow Control Uses Credit Scheme
• For performance reasons: can it give better throughput than TCPC (TCP congestion control) for the same (small) buffer size?
– Losses are observed by the receiver. Why not inform the sender?
– Queue size is also observed. Tell the sender.
– For data re-assembly, the receiver has to tell which packets are lost/received anyway; very little complexity is added to support window flow control.
Data Re-Assembly Forces a Credit Scheme
• There is reason to believe that TCP's flow control brings some order to TCP.
• The receiver essentially advertises the window as a range [min, max], rather than just a size.
[Figure: in actual TCP the buffered packets stay in a small contiguous range (…, 2, 3, 4); otherwise the buffer could hold widely scattered packet numbers (2, 5, 6, 7, 8, 9, 12, 17, 19, 20, 24, 31).]
Why do we need a receiver buffer?
• It is part of flow/congestion control when C_s(t) > C_r(t).
– In TCPC, a certain amount of buffer is needed to get reasonable throughput. (For optimality issues, see [Mitra92] and [Fendick92].)
– In CRDT (credit-based control), buffering is likewise needed for good throughput.
• Buffering is beneficial for data re-assembly.
Buffering for Flow Control: Example
• Suppose link capacities are constant and C_s >= C_r. To reach throughput C_r, B_r should be
– C_r * RTT, in a naïve but robust CRDT scheme;
– (C_s - C_r) * C_r * RTT / C_s, if C_r is known to the sender;
– 0, if C_r is known to the sender and the sender never sends bursts at a rate greater than C_r.
– Note: the upstream node can estimate C_r. (Worked numbers below.)
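A worked instance of the three cases, with made-up numbers (C_s = 100 Mb/s, C_r = 10 Mb/s, RTT = 50 ms):

def b_r_naive(C_r, RTT):
    return C_r * RTT                      # robust CRDT scheme

def b_r_known_rate(C_s, C_r, RTT):
    return (C_s - C_r) * C_r * RTT / C_s  # sender knows C_r

# Naive:     10e6 b/s * 0.05 s = 500,000 bits ~= 62.5 KB of buffer.
# C_r known: 0.9 * 500,000     = 450,000 bits ~= 56.3 KB.
# Zero, if the sender also never bursts above C_r.
print(b_r_naive(10e6, 0.05), b_r_known_rate(100e6, 10e6, 0.05))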
Re-assembly Buffer Sizing
• Without it, throughput can suffer. (By how much?)
• Buffer size depends on network delay, loss, and packet-reordering behavior. Can we quantify this?
• Question: how do we put the two together? The re-assembly buffer size can simply be a threshold number, as in TCP.
• Example: the (actual) buffer size is B = 6, but we allow packets 3 and 12 to coexist in the buffer.
One-Shot Model for Throughput
• Send n packets in a block, with i.i.d. delays.
• Example: B = 1 and n = 3.
• E[Throughput] = 1/6 * (3+2+2+1+1+1) / 3 = 5/9

Input | Accepted
123   | 1 2 3
132   | 1 2
213   | 1
231   | 1
312   | 1 2
321   | 1
Some Results
• If B = 1, roughly e - 1 ≈ 1.72 packets will be received on average, for large n. (Packet k is accepted iff packets 1..k arrive in increasing order, which has probability 1/k!; this matches the n = 3 example: 1 + 1/2 + 1/6 = 5/3.)
• If B = n, all n packets will be received.
• Conjecture:
E[Throughput] ≈ ( Σ_{k=B}^{n} B! * B^(k-B) / k! + B - 1 ) / n
(Consistency check: B = 1 gives (Σ_{k=1}^{n} 1/k!) / n ≈ (e - 1)/n, and B = n gives exactly 1.)
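The conjecture can be checked numerically. Below is a minimal Monte Carlo sketch; the eviction policy (hold up to B - 1 out-of-order packets and, when full, drop the largest-numbered one) is one reading of the model, chosen because it reproduces the B = 1 table above:

import math, random

def one_shot(n, B, trials=100000):
    # n packets arrive in uniformly random order; in-sequence packets are
    # delivered immediately; up to B - 1 out-of-order packets are buffered
    # (assumed policy: on overflow, drop the largest-numbered packet).
    total = 0
    for _ in range(trials):
        nxt, buf, accepted = 1, set(), 0
        for p in random.sample(range(1, n + 1), n):
            if p == nxt:
                accepted, nxt = accepted + 1, nxt + 1
                while nxt in buf:            # drain the in-sequence run
                    buf.remove(nxt)
                    accepted, nxt = accepted + 1, nxt + 1
            else:
                buf.add(p)
                if len(buf) > B - 1:         # overflow
                    buf.remove(max(buf))
        total += accepted
    return total / trials / n                # throughput = accepted / n

def conjecture(n, B):
    s = sum(math.factorial(B) * B ** (k - B) / math.factorial(k)
            for k in range(B, n + 1))
    return (s + B - 1) / n

print(one_shot(3, 1), conjecture(3, 1))      # both close to 5/9
print(one_shot(12, 3), conjecture(12, 3))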
Reliability and Flow Control Intertwined
• They share the same feedback stream.
• The receiver needs to tell the sender HOW MANY and WHICH packets have been forwarded.
• Regarding "WHICH", TCP takes the simplistic approach of ACKing the first unreceived data.
Summary of Issues
• Find a control scheme suitable for both the pipe level and the flow level.
– Reconcile network control and last-hop control: we need flow control at the flow level.
– Note that feedback for congestion control and for reliability control is entangled.
• Buffer management at the receiver
– Buffer sizing
• for re-assembly
• for congestion control
– Deadlock prevention
WebTP Packet Header Format
 0                   1                   2                   3
 0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1
+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
|                         Packet Number                         |
+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
|                     Acknowledgment Number                     |
+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
|                      Acknowledged Vector                      |
+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
|                           ADU Name                            |
+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
|                        Segment Number                         |
+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
|          Source Port          |       Destination Port        |
+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
| Data  |U|A|R|S|F|R|E|F|P| C | P |                             |
| Offset|R|C|S|Y|I|E|N|A|T| C | C |             RES             |
|       |G|K|T|N|N|L|D|S|Y| A | L |                             |
+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
|            Window             |           Checksum            |
+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
|            Options            |            Padding            |
+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
|                             data                              |
+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
WebTP: Receiver Flow Control
• A flow can be reliable (in the TCP sense) or unreliable (in the UDP sense).
• Feedback is shared between reliability control and congestion control.
• A reliable flow uses TCP-style flow control and data re-assembly. A loss at the receiver due to flow-control buffer overflow is not distinguished from a loss in the pipe; but this should be rare.
• An unreliable flow: losses at the receiver due to overflowing B_i are not reported back to the sender. For simplicity, no window flow control is used. (Is the window information useful?)
WebTP: Buffer Management
• Each flow gets a fixed upper bound on queue size, say B_i. B_i >= B_r is possible.
• Later on, B_i will adapt to the speed of the application.
• The receiver of a flow maintains rcv_nxt and rcv_adv, with
B_i = rcv_adv - rcv_nxt + 1
• Packets outside [rcv_nxt, rcv_adv] are rejected. (A sketch follows.)
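A sketch of the acceptance rule implied by these definitions (the set of buffered packet numbers is an illustrative detail):

class FlowReceiver:
    def __init__(self, b_i):
        self.rcv_nxt = 1          # next in-sequence packet expected
        self.rcv_adv = b_i        # invariant: B_i = rcv_adv - rcv_nxt + 1
        self.buffered = set()
    def on_packet(self, num):
        if not (self.rcv_nxt <= num <= self.rcv_adv):
            return False          # outside [rcv_nxt, rcv_adv]: rejected
        self.buffered.add(num)
        while self.rcv_nxt in self.buffered:  # deliver the in-sequence run
            self.buffered.remove(self.rcv_nxt)
            self.rcv_nxt += 1
            self.rcv_adv += 1     # window slides; B_i stays fixed
        return True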
WebTP Example
• Receiver: (positively) ACKs 5, 6, and 7; win = 3. (B_i = 6)
[Figure: packets 1-3 delivered; 4 missing; 5-7 buffered out of order; rcv_nxt = 4, rcv_adv = 9.]
• Sender: can send 4, 8, and 9 (subject to congestion control). snd_win = 3, snd_una = 4, snd_nxt = 10.
• Sender: after 4, 8, and 9 are received at the receiver. snd_win = 6, snd_una = snd_nxt = 10; the window now covers packets 10-15.
WebTP: Deadlock Prevention (Reliable Flows)
• Deadlock prevention: pre-allocate b*N buffer spaces, b >= 1, where N = the maximum number of flows allowed.
• When the dynamic buffer runs out, enter deadlock-prevention mode (sketched below). In this mode,
– each flow accepts only up to b in-sequence packets;
– when a flow uses up its b buffers, it is not allowed to use any buffers until the b buffers are freed.
• This guards against the case where all but one flow stops responding. In practice, we only need N to be some reasonably large number.
• b = 1 is sufficient, but b can be greater than 1 for performance reasons.
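A sketch of deadlock-prevention mode, encoding just the two rules above (the bookkeeping structure is an assumption):

class DeadlockGuard:
    def __init__(self, b, n_flows):
        self.b = b
        self.used = {f: 0 for f in range(n_flows)}
        self.blocked = set()      # flows that used up their b slots
    def may_accept(self, flow, in_sequence):
        if not in_sequence:       # rule 1: in-sequence packets only
            return False
        if flow in self.blocked:  # rule 2: wait until all b are freed
            return False
        return self.used[flow] < self.b
    def on_accept(self, flow):
        self.used[flow] += 1
        if self.used[flow] == self.b:
            self.blocked.add(flow)
    def on_free(self, flow):
        self.used[flow] -= 1
        if self.used[flow] == 0:  # all b freed: flow may proceed again
            self.blocked.discard(flow)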
WebTP: Feedback Scheme
• The Window field in the packet header is per-flow. As in TCP, it is the current free buffer space for the flow.
• When a flow starts, use the FORCE bit (FCE) to get an immediate ACK from the flow.
• Rules for acknowledgement (sketched below):
– To inform the sender about the window size, a flow generates an ACK for every 2 received packets (MTUs).
– The pipe generates an ACK for every k packets.
• ACKs can be piggybacked on reverse data packets.
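A minimal sketch of the two ACK-generation rules (the counter structure is an illustrative assumption):

class AckPacer:
    def __init__(self, k):
        self.k = k
        self.flow_count = {}      # received packets, per flow
        self.pipe_count = 0       # received packets, whole pipe
    def on_packet(self, flow):
        # Returns True when this arrival should trigger an ACK.
        self.flow_count[flow] = self.flow_count.get(flow, 0) + 1
        self.pipe_count += 1
        flow_ack = self.flow_count[flow] % 2 == 0   # every 2 per flow
        pipe_ack = self.pipe_count % self.k == 0    # every k on the pipe
        return flow_ack or pipe_ack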
Acknowledgement Example: Four Flows
[Figure: the receiver sees four flows, each delivering packets 1-4 through the shared pipe; per-flow ACKs and pipe ACKs are marked.]
• Result: with some randomness in the traffic, 50-62 ACKs are generated for every 100 data packets.
Computation
[Figure: an interleaved pipe arrival sequence with packet numbers …, k-3, k-2, k-1, k, k+1, each arrival labelled with the flow (1-4) it belongs to.]
k+1
Correctness of Protocol and Algorithm
• Performance typically deals with average cases, and can be
studied by model-based analysis or simulation.
• What about correctness?
– Very often in networking, failures are more of the concerns than poor
performance.
• Correctness of many distributed algorithms in networking area
has not been proven.
• What can be done?
– Need formal description
– Need methods of proof
• Some references for protocol verification: I/O Automata
([Lynch88]), Verification of TCP ([Smith97])
References
[Mitra92] Debasis Mitra, "Asymptotically Optimal Design of Congestion Control for High Speed Data Networks", IEEE Transactions on Communications, vol. 40, no. 2, Feb. 1992.
[Fendick92] Kerry W. Fendick, Manoel A. Rodrigues, and Alan Weiss, "Analysis of a rate-based feedback control strategy for long haul data transport", Performance Evaluation 16 (1992), pp. 67-84.
[Kung and Morris 95] H.T. Kung and Robert Morris, "Credit-Based Flow Control for ATM Networks", IEEE Network Magazine, March 1995.
[Kung and Chang 95] H.T. Kung and Koling Chang, "Receiver-Oriented Adaptive Buffer Allocation in Credit-Based Flow Control for ATM Networks", Proc. IEEE INFOCOM '95.
[Smith97] Mark Smith, "Formal Verification of TCP and T/TCP", PhD thesis, Department of EECS, MIT, 1997.
[Lynch88] Nancy Lynch and Mark Tuttle, "An Introduction to Input/Output Automata", Technical Memo MIT/LCS/TM-373, Laboratory for Computer Science, MIT, 1988.