Chapter 5 End-to-End Protocols
Outline
5.1 UDP
5.2 TCP
5.3 Remote Procedure Call
End-to-End Protocols
• Underlying best-effort network
– drops messages
– re-orders messages
– delivers duplicate copies of a given message
– limits messages to some finite size
– delivers messages after an arbitrarily long delay
• Common end-to-end services
– guarantee message delivery
– deliver messages in the same order they are sent
– deliver at most one copy of each message
– support arbitrarily large messages
– support synchronization
– allow the receiver to flow control the sender
– support multiple application processes on each host
5.1 Simple Demultiplexor (UDP)
• Unreliable and unordered datagram service
• Adds multiplexing
• No flow control
• Endpoints identified by ports
– servers have well-known ports
– see /etc/services on Unix
• Header format (32 bits wide, two 16-bit fields per row)
– SrcPort (bits 0-15), DstPort (bits 16-31)
– Length (bits 0-15), Checksum (bits 16-31)
– Data
• Optional checksum
– pseudo header + UDP header + data
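A minimal sketch of the header layout and checksum coverage described on this slide. The function names and the example addresses are illustrative assumptions; the pseudo header is taken to be the IPv4 one (source address, destination address, a zero byte, protocol number 17, UDP length).

```python
import struct

def ones_complement_sum16(data: bytes) -> int:
    """16-bit one's-complement sum of data (zero-padded to an even length)."""
    if len(data) % 2:
        data += b"\x00"
    total = 0
    for (word,) in struct.iter_unpack("!H", data):
        total += word
        total = (total & 0xFFFF) + (total >> 16)  # fold the carry back in
    return total

def build_udp_datagram(src_ip: bytes, dst_ip: bytes,
                       src_port: int, dst_port: int, payload: bytes) -> bytes:
    """Pack SrcPort, DstPort, Length, Checksum, then the data."""
    length = 8 + len(payload)  # the UDP header is 8 bytes
    # Assumed IPv4 pseudo header: src IP, dst IP, zero, protocol 17, UDP length
    pseudo = src_ip + dst_ip + struct.pack("!BBH", 0, 17, length)
    header = struct.pack("!HHHH", src_port, dst_port, length, 0)
    checksum = ~ones_complement_sum16(pseudo + header + payload) & 0xFFFF
    if checksum == 0:
        checksum = 0xFFFF  # 0 on the wire means "no checksum was computed"
    return struct.pack("!HHHH", src_port, dst_port, length, checksum) + payload

datagram = build_udp_datagram(bytes([10, 0, 0, 1]), bytes([10, 0, 0, 2]),
                              5000, 53, b"hello")
print(datagram.hex())
```

A receiver can verify the datagram by summing the pseudo header and the whole datagram, checksum field included; the result should be 0xFFFF.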
5.2 Reliable Byte-Stream (TCP)
Outline
5.2.1 End-to-end Issues
5.2.2 Segment Format
5.2.3 Connection Establishment/Termination
5.2.4 Sliding Window Revisited
5.2.5 Triggering Transmission
Silly Window Syndrome
Nagle’s Algorithm
5.2.6 Adaptive Retransmission
Original Algorithm
Karn/Partridge Algorithm
Jacobson/Karels Algorithm
5.2.7 Record Boundaries
5.2.8 TCP Extensions
TCP Overview
• Connection-oriented
• Byte-stream
– app writes bytes
– TCP sends segments
– app reads bytes
• Full duplex
• Flow control: keep sender from overrunning receiver
• Congestion control: keep sender from overrunning network
[Figure: an application process writes bytes into TCP's send buffer; TCP transmits segments; the receiving TCP holds them in its receive buffer, from which the application process reads bytes]
Data Link Versus Transport
• Potentially connects many different hosts
– need explicit connection establishment and termination
• Potentially different RTT
– need adaptive timeout mechanism
• Potentially long delay in network
– need to be prepared for arrival of very old packets
• Potentially different capacity at destination
– need to accommodate different node capacity
• Potentially different network capacity
– need to be prepared for network congestion
5.2.2 Segment Format
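• Header layout (32 bits wide; bit offsets 0, 4, 10, 16, 31)
– SrcPort (bits 0-15), DstPort (bits 16-31)
– SequenceNum (32 bits)
– Acknowledgment (32 bits)
– HdrLen (bits 0-3), 0 (bits 4-9), Flags (bits 10-15), AdvertisedWindow (bits 16-31)
– Checksum (bits 0-15), UrgPtr (bits 16-31)
– Options (variable)
– Data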
Segment Format (cont)
• Each connection identified with 4-tuple:
– (SrcPort, SrcIPAddr, DstPort, DstIPAddr)
• Sliding window + flow control
– Acknowledgment, SequenceNum, AdvertisedWindow
– [Figure: the sender sends Data (SequenceNum) to the receiver; the receiver returns Acknowledgment + AdvertisedWindow]
• Flags
– SYN, FIN, RESET, PUSH, URG, ACK
• Checksum
– pseudo header + TCP header + data
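As a rough illustration of the segment format, the sketch below packs and unpacks the fixed 20-byte part of the header (no options). The function names are hypothetical, and the flag bit positions follow the standard TCP layout (FIN in the lowest bit, URG in the sixth), which the slides do not spell out.

```python
import struct

# Flag bit positions in the standard TCP header (low 6 bits of the 16-bit
# word that also carries HdrLen and the 6 zero bits).
FLAG_BITS = {"FIN": 0x01, "SYN": 0x02, "RESET": 0x04,
             "PUSH": 0x08, "ACK": 0x10, "URG": 0x20}

def pack_tcp_header(src_port, dst_port, seq, ack, flags, window,
                    checksum=0, urg_ptr=0) -> bytes:
    """Pack a 20-byte header; HdrLen is expressed in 32-bit words."""
    hdr_len_words = 5  # 20 bytes, no options
    flag_bits = 0
    for name in flags:
        flag_bits |= FLAG_BITS[name]
    offset_and_flags = (hdr_len_words << 12) | flag_bits
    return struct.pack("!HHIIHHHH", src_port, dst_port, seq, ack,
                       offset_and_flags, window, checksum, urg_ptr)

def unpack_tcp_header(header: bytes) -> dict:
    """Split the first 20 bytes of a segment back into named fields."""
    (src_port, dst_port, seq, ack,
     offset_and_flags, window, checksum, urg_ptr) = struct.unpack(
        "!HHIIHHHH", header[:20])
    return {
        "src_port": src_port,
        "dst_port": dst_port,
        "seq": seq,
        "ack": ack,
        "hdr_len_bytes": (offset_and_flags >> 12) * 4,
        "flags": [n for n, bit in FLAG_BITS.items() if offset_and_flags & bit],
        "window": window,
        "checksum": checksum,
        "urg_ptr": urg_ptr,
    }

syn = pack_tcp_header(1234, 80, seq=1000, ack=0, flags=["SYN"], window=65535)
print(unpack_tcp_header(syn))
```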
5.2.3 Connection Establishment and Termination
[Figure: timeline of segments exchanged between the active participant (client) and the passive participant (server)]
State Transition Diagram
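• States: CLOSED, LISTEN, SYN_RCVD, SYN_SENT, ESTABLISHED, FIN_WAIT_1, FIN_WAIT_2, CLOSING, TIME_WAIT, CLOSE_WAIT, LAST_ACK
• Transitions, labelled event/action
– CLOSED -> LISTEN: Passive open
– CLOSED -> SYN_SENT: Active open/SYN
– LISTEN -> CLOSED: Close
– LISTEN -> SYN_RCVD: SYN/SYN + ACK
– LISTEN -> SYN_SENT: Send/SYN
– SYN_SENT -> CLOSED: Close
– SYN_SENT -> SYN_RCVD: SYN/SYN + ACK
– SYN_SENT -> ESTABLISHED: SYN + ACK/ACK
– SYN_RCVD -> ESTABLISHED: ACK
– SYN_RCVD -> FIN_WAIT_1: Close/FIN
– ESTABLISHED -> FIN_WAIT_1: Close/FIN
– ESTABLISHED -> CLOSE_WAIT: FIN/ACK
– FIN_WAIT_1 -> FIN_WAIT_2: ACK
– FIN_WAIT_1 -> CLOSING: FIN/ACK
– FIN_WAIT_1 -> TIME_WAIT: ACK + FIN/ACK
– FIN_WAIT_2 -> TIME_WAIT: FIN/ACK
– CLOSING -> TIME_WAIT: ACK
– CLOSE_WAIT -> LAST_ACK: Close/FIN
– LAST_ACK -> CLOSED: ACK
– TIME_WAIT -> CLOSED: Timeout after two segment lifetimes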
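One way to read the diagram is as a lookup table from (state, event) to (next state, segment sent). The sketch below is illustrative only: the event names such as "recv_SYN" and the helper `run` are made up for this example, and only the transitions on the main open and close paths are included.

```python
# (state, event) -> (next state, segment sent in response, or None)
TRANSITIONS = {
    ("CLOSED", "passive_open"): ("LISTEN", None),
    ("CLOSED", "active_open"): ("SYN_SENT", "SYN"),
    ("LISTEN", "recv_SYN"): ("SYN_RCVD", "SYN+ACK"),
    ("SYN_SENT", "recv_SYN+ACK"): ("ESTABLISHED", "ACK"),
    ("SYN_RCVD", "recv_ACK"): ("ESTABLISHED", None),
    ("ESTABLISHED", "close"): ("FIN_WAIT_1", "FIN"),
    ("ESTABLISHED", "recv_FIN"): ("CLOSE_WAIT", "ACK"),
    ("FIN_WAIT_1", "recv_ACK"): ("FIN_WAIT_2", None),
    ("FIN_WAIT_1", "recv_FIN"): ("CLOSING", "ACK"),
    ("FIN_WAIT_2", "recv_FIN"): ("TIME_WAIT", "ACK"),
    ("CLOSING", "recv_ACK"): ("TIME_WAIT", None),
    ("CLOSE_WAIT", "close"): ("LAST_ACK", "FIN"),
    ("LAST_ACK", "recv_ACK"): ("CLOSED", None),
    ("TIME_WAIT", "timeout_2msl"): ("CLOSED", None),
}

def run(state: str, events: list) -> str:
    """Apply events in order and return the final state."""
    for event in events:
        state, sent = TRANSITIONS[(state, event)]
        print(f"{event:>14} -> {state:<12} sends {sent}")
    return state

# Active participant (client): open, then close first.
client_final = run("CLOSED", ["active_open", "recv_SYN+ACK", "close",
                              "recv_ACK", "recv_FIN", "timeout_2msl"])
# Passive participant (server): accept, then close after the peer's FIN.
server_final = run("CLOSED", ["passive_open", "recv_SYN", "recv_ACK",
                              "recv_FIN", "close", "recv_ACK"])
```

Both walks end back in CLOSED; the side that closes first passes through TIME_WAIT, the other through CLOSE_WAIT and LAST_ACK.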
5.2.4 Sliding Window Revisited
[Figure: (a) the sending application writes into TCP's send buffer, marked by LastByteAcked, LastByteSent and LastByteWritten; (b) the receiving application reads from TCP's receive buffer, marked by LastByteRead, NextByteExpected and LastByteRcvd]
• Sending side
– LastByteAcked <= LastByteSent
– LastByteSent <= LastByteWritten
– buffer bytes between LastByteAcked and LastByteWritten
• Receiving side
– LastByteRead < NextByteExpected
– NextByteExpected <= LastByteRcvd + 1
– buffer bytes between LastByteRead and LastByteRcvd
Flow Control
• Send buffer size: MaxSendBuffer
• Receive buffer size: MaxRcvBuffer
• Receiving side
– LastByteRcvd - LastByteRead <= MaxRcvBuffer
– AdvertisedWindow = MaxRcvBuffer - ((NextByteExpected - 1) - LastByteRead)
• Sending side
– LastByteSent - LastByteAcked <= AdvertisedWindow
– EffectiveWindow = AdvertisedWindow - (LastByteSent - LastByteAcked)
– LastByteWritten - LastByteAcked <= MaxSendBuffer
– block sender if (LastByteWritten - LastByteAcked) + y > MaxSendBuffer, where y is the number of bytes the application is trying to write
• Always send ACK in response to arriving data segment
• Persist when AdvertisedWindow = 0
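The window arithmetic can be sketched as three small functions. This assumes the advertised window is computed from the in-order bytes that have arrived but not yet been read, (NextByteExpected - 1) - LastByteRead; the function and parameter names are illustrative, and the numbers in the usage lines are made up.

```python
def advertised_window(max_rcv_buffer: int, next_byte_expected: int,
                      last_byte_read: int) -> int:
    """Receiver: buffer space left after in-order bytes not yet read (assumed form)."""
    return max_rcv_buffer - ((next_byte_expected - 1) - last_byte_read)

def effective_window(advertised: int, last_byte_sent: int,
                     last_byte_acked: int) -> int:
    """Sender: how many more bytes may be sent right now."""
    return advertised - (last_byte_sent - last_byte_acked)

def must_block_writer(last_byte_written: int, last_byte_acked: int,
                      y: int, max_send_buffer: int) -> bool:
    """Sender: block the application if writing y bytes would overflow the send buffer."""
    return (last_byte_written - last_byte_acked) + y > max_send_buffer

# Receiver has a 4096-byte buffer, has received bytes 1..1000 in order, read none.
adv = advertised_window(4096, next_byte_expected=1001, last_byte_read=0)
# Sender has 1000 unacknowledged bytes in flight.
eff = effective_window(adv, last_byte_sent=2000, last_byte_acked=1000)
print(adv, eff)
```

When the effective window reaches 0 the sender stops transmitting data and, as the slide notes, persists in probing until the receiver advertises space again.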
Protection Against Wrap Around
• 32-bit SequenceNum
Bandwidth            Time Until Wrap Around
T1 (1.5 Mbps)        6.4 hours
Ethernet (10 Mbps)   57 minutes
T3 (45 Mbps)         13 minutes
FDDI (100 Mbps)      6 minutes
STS-3 (155 Mbps)     4 minutes
STS-12 (622 Mbps)    55 seconds
STS-24 (1.2 Gbps)    28 seconds
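The table entries follow from dividing the 32-bit sequence space (2^32 bytes) by the link bandwidth. A small sketch, with a hypothetical function name, that reproduces the calculation:

```python
SEQUENCE_SPACE_BYTES = 2 ** 32  # SequenceNum is 32 bits and counts bytes

def wrap_time_seconds(bandwidth_bps: float) -> float:
    """Seconds needed to send 2^32 bytes at the given bit rate."""
    return SEQUENCE_SPACE_BYTES * 8 / bandwidth_bps

for name, bps in [("T1", 1.5e6), ("Ethernet", 10e6), ("STS-12", 622e6)]:
    print(f"{name}: {wrap_time_seconds(bps):.0f} s")
```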
Keeping the Pipe Full
• 16-bit AdvertisedWindow
Bandwidth            Delay x Bandwidth Product (assuming 100ms RTT)
T1 (1.5 Mbps)        18KB
Ethernet (10 Mbps)   122KB
T3 (45 Mbps)         549KB
FDDI (100 Mbps)      1.2MB
STS-3 (155 Mbps)     1.8MB
STS-12 (622 Mbps)    7.4MB
STS-24 (1.2 Gbps)    14.8MB
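A 16-bit AdvertisedWindow can describe at most 65,535 bytes, so it only keeps the pipe full when the delay x bandwidth product is below that. A sketch of the comparison (function names are illustrative; integer arithmetic is used to avoid rounding surprises):

```python
MAX_ADVERTISED_WINDOW = 2 ** 16 - 1  # largest value a 16-bit field can hold

def delay_bandwidth_bytes(bandwidth_bps: int, rtt_ms: int) -> int:
    """Bytes in flight needed to fill the pipe: bandwidth x RTT, in bytes."""
    return bandwidth_bps * rtt_ms // 8000  # /8 bits per byte, /1000 ms per s

def window_fills_pipe(bandwidth_bps: int, rtt_ms: int) -> bool:
    """True if a 16-bit window is large enough for this link."""
    return MAX_ADVERTISED_WINDOW >= delay_bandwidth_bytes(bandwidth_bps, rtt_ms)

for name, bps in [("T1", 1_500_000), ("Ethernet", 10_000_000)]:
    product = delay_bandwidth_bytes(bps, 100)
    print(name, product // 1024, "KB", window_fills_pipe(bps, 100))
```

With a 100 ms RTT, only the T1 row of the table fits within the 16-bit window.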
5.2.5 Triggering Transmission
Silly Window Syndrome
• How aggressively does sender exploit open window?
[Figure: sender and receiver]
• Picture data segments as full containers and ACKs as empty containers circulating between sender and receiver. If the sender aggressively fills an empty container as soon as it arrives, then any small container introduced into the system remains in the system indefinitely.
• Receiver-side solutions
– after advertising zero window, wait for space equal to a maximum segment size (MSS)
– delayed acknowledgements
Nagle’s Algorithm
• How long does sender delay sending data?
– too long: hurts interactive applications
– too short: poor network utilization
– strategies: timer-based vs self-clocking (ack)
• When application generates additional data
– if fills a max segment (and window open): send it
– else
• if there is unack’ed data in transit: buffer it until ACK arrives
• else: send it
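The decision rule above can be written as a single predicate. This is a sketch of the rule as stated on the slide, not of any particular TCP implementation; the function and parameter names are assumptions.

```python
def nagle_should_send(buffered_bytes: int, mss: int,
                      window_bytes: int, unacked_in_flight: bool) -> bool:
    """Decide whether to transmit now when the application produces data."""
    if buffered_bytes >= mss and window_bytes >= mss:
        return True   # a full segment fits in the open window: send it
    if unacked_in_flight:
        return False  # buffer the small write until an ACK arrives
    return True       # nothing in flight: send the small segment now

# A single keystroke on an idle connection goes out immediately...
print(nagle_should_send(1, mss=1460, window_bytes=8192, unacked_in_flight=False))
# ...but the next keystroke waits while the first is still unacknowledged.
print(nagle_should_send(1, mss=1460, window_bytes=8192, unacked_in_flight=True))
```

The arriving ACK acts as the clock: the faster ACKs come back, the sooner buffered data is released, which is the self-clocking strategy named on the slide.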
5.2.6 Adaptive Retransmission (Original Algorithm)
• Measure SampleRTT for each segment / ACK pair
• Compute weighted average of RTT
– EstRTT = a x EstRTT + b x SampleRTT
– where a + b = 1, a between 0.8 and 0.9, b between 0.1 and 0.2
• Set timeout based on EstRTT
– TimeOut = 2 x EstRTT
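A minimal sketch of the update, assuming a = 0.875 (any value between 0.8 and 0.9 would do) and RTTs measured in milliseconds; the function names are illustrative.

```python
def update_est_rtt(est_rtt: float, sample_rtt: float, a: float = 0.875) -> float:
    """Weighted average: EstRTT = a x EstRTT + b x SampleRTT, with b = 1 - a."""
    b = 1 - a
    return a * est_rtt + b * sample_rtt

def timeout(est_rtt: float) -> float:
    """TimeOut = 2 x EstRTT."""
    return 2 * est_rtt

est = update_est_rtt(100.0, 200.0)  # one slow sample nudges the estimate up
print(est, timeout(est))
```

Because a is close to 1, a single outlying sample moves the estimate only slightly.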
Karn/Partridge Algorithm
[Figure: two sender/receiver timelines, panels (a) and (b), each showing an original transmission, a retransmission, and an ACK]
• Do not sample RTT when retransmitting
• Double timeout after each retransmission
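The two rules can be layered on top of the original estimator. The class below is a hypothetical sketch: it reuses the weighted average and the TimeOut = 2 x EstRTT rule from the original algorithm, with a = 0.875 as an assumed weight.

```python
class KarnPartridgeTimer:
    """RTT estimator that ignores ambiguous samples and backs off on loss."""

    def __init__(self, est_rtt: float, a: float = 0.875):
        self.est_rtt = est_rtt
        self.a = a
        self.timeout = 2 * est_rtt

    def on_ack(self, sample_rtt: float, was_retransmitted: bool) -> None:
        if was_retransmitted:
            # The ACK could belong to either transmission: do not sample.
            return
        self.est_rtt = self.a * self.est_rtt + (1 - self.a) * sample_rtt
        self.timeout = 2 * self.est_rtt

    def on_retransmit(self) -> None:
        # Double the timeout after each retransmission.
        self.timeout *= 2

timer = KarnPartridgeTimer(est_rtt=100.0)
timer.on_retransmit()                          # timeout backs off
timer.on_ack(500.0, was_retransmitted=True)    # ambiguous sample is skipped
timer.on_ack(200.0, was_retransmitted=False)   # clean sample updates the estimate
print(timer.est_rtt, timer.timeout)
```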
Jacobson/Karels Algorithm
• New calculations for average RTT
• Diff = SampleRTT - EstRTT
• EstRTT = EstRTT + (d x Diff)
• Dev = Dev + d x (|Diff| - Dev)
– where d is a factor between 0 and 1
• Consider variance when setting timeout value
• TimeOut = m x EstRTT + f x Dev
– where m = 1 and f = 4
• Notes
– algorithm only as good as granularity of clock (500ms on Unix)
– accurate timeout mechanism important to congestion control (later)
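The update rules translate almost line for line into code. In this sketch d = 0.125 is an assumed value (the slide only says it lies between 0 and 1), and the function name is illustrative.

```python
def jacobson_karels(est_rtt: float, dev: float, sample_rtt: float,
                    d: float = 0.125, m: float = 1, f: float = 4):
    """Return the updated (EstRTT, Dev, TimeOut) after one RTT sample."""
    diff = sample_rtt - est_rtt
    est_rtt = est_rtt + d * diff
    dev = dev + d * (abs(diff) - dev)
    time_out = m * est_rtt + f * dev
    return est_rtt, dev, time_out

est, dev, time_out = jacobson_karels(est_rtt=100.0, dev=10.0, sample_rtt=180.0)
print(est, dev, time_out)
```

When samples vary widely, Dev grows and the timeout moves well away from EstRTT; when they are steady, Dev shrinks and the timeout stays close to the estimate.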
5.2.7 Record Boundaries
• TCP is a byte-stream protocol: the number of bytes written by the sender is not necessarily the same as the number of bytes read by the receiver.
• How to insert “record boundaries” into this byte stream?
• URG flag and UrgPtr field (out-of-band data)
• PUSH operation
5.2.8 TCP Extensions
• Mitigate problems that TCP faces as the underlying network gets faster.
• Implemented as header options
– Store timestamp in outgoing segments to improve TCP’s timeout mechanism.
– Extend sequence space with 32-bit timestamp (protects against sequence number wrap around)
– Shift (scale) advertised window (larger window)
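To illustrate the window-scaling idea, the sketch below finds the smallest left shift that lets a 16-bit AdvertisedWindow cover a target window size. The function name is hypothetical; the cap of 14 on the shift comes from the TCP window scale option specification rather than from these slides.

```python
MAX_SHIFT = 14  # upper bound on the shift count in the window scale option

def window_scale_shift(target_window_bytes: int) -> int:
    """Smallest shift such that (0xFFFF << shift) >= target, capped at MAX_SHIFT."""
    shift = 0
    while (0xFFFF << shift) < target_window_bytes and shift < MAX_SHIFT:
        shift += 1
    return shift

# FDDI at 100 Mbps with a 100 ms RTT needs about 1,250,000 bytes in flight.
shift = window_scale_shift(1_250_000)
print(shift, 0xFFFF << shift)
```

A link whose delay x bandwidth product already fits in 65,535 bytes needs no scaling, so the function returns 0 for it.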