
LT-TCP: End-to-End Framework to
Improve TCP Performance over Networks with Lossy Channels
Omesh Tickoo, Vijay Subramanian, Shiv Kalyanaraman
(Rensselaer Polytechnic Institute)
K. K. Ramakrishnan (AT&T)
Networks Lab, Rensselaer Polytechnic Institute
Overall Motivation




TCP responds to errors and congestion in the same way:
 It drops the window, and thus reduces load on the network.
 In the worst case, it times out when a particular sequence of packets is lost
(retransmissions, or an entire window).
TCP was designed for congestion, with loss rates of at most 1-2%.
 TCP suffers significant timeout penalties at erasure rates above 5%.
Wireless channels becoming more pervasive
 With mesh networks (infrastructure or community) it is likely that more
than the last hop will be wireless.
Wireless links:
 Individual links can experience high loss (even 10-15%) in
transient situations, until power and link-rate adjustments kick in.
 Interference can also result in high loss rates.
 E.g., ad-hoc networks, mesh networks.
Approach

Tools available to us:
 Method of getting congestion indication that is separate from packet loss
due to errors: Explicit Congestion Notification (ECN)
 Use error recovery methods beyond retransmission and timeouts to
overcome packet loss, so that TCP’s performance is retained.
 Use FEC on an end-to-end basis:
 Dynamic knowledge of the loss information can be exploited by the
end-system.
 Track short term loss rates.
 Protect data by using FEC proactively and reactively.
 FEC can work in a coordinated fashion with TCP’s window
mechanisms to optimize the usage of FEC within a window (which is
not available at the link level).
Goals
We pose the following questions..




Dynamic Range:
 Can we extend the dynamic range of TCP into high loss regimes?
 Can TCP perform close to the theoretical capacity achievable under high
loss rates?
Congestion Response:
 How should TCP respond to notifications due to congestion..
 … but not respond to packet erasures that do not signal congestion?
Mix of Reliability Mechanisms:
 What mechanisms should be used to extend the operating point of TCP
to packet loss rates from 0% to 50%?
 How can Forward Error Correction (FEC) help?
 How should the FEC be split between sending it proactively (insuring
the data in anticipation of loss) and reactively (sending FEC in response
to a loss)?
Timeout Avoidance:
 Timeouts: Useful as a fall-back mechanism but wasteful otherwise
especially under high loss rates.
 How can we add mechanisms to minimize timeouts?
LT-TCP: Adaptive Mechanisms to Reinstate Performance
TCP uses Loss Feedback to Estimate Available Capacity
[Figure: sender and receiver joined by a lossy path, with loss feedback returned through acknowledgements (X = packet erasure). Labels: available capacity, capacity used. LT-TCP mechanisms shown: adaptive MSS, proactive and reactive FEC, erasure recovery, loss estimation.]
Building Blocks…



ECN-Only: We infer congestion solely from ECN markings. Window is cut
in response to
 ECN signals: which means that hosts/routers have to be ECN-capable.
 Timeouts: The response to a timeout is the same as before.
Window Granulation and Adaptive MSS: We ensure that the window
always contains at least G segments.
 Window size in bytes initially is the same as normal SACK TCP.
 Initial segment size is small to accommodate G segments.
 Packet size is continually adjusted so that we have at least G segments. Once we
have G segments, packet size increases with window size.
Loss Estimation: The receiver continually tracks the loss rate and provides
a running estimate of perceived loss back to the TCP sender through ACKs.
An adaptive EWMA approach to estimating loss is used.
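The ECN-only rule above can be sketched as a small window-update function. This is only an illustration: the event names, the halving on an ECN echo, and the collapse to a minimum window on timeout are assumptions taken from standard TCP behavior, not details given on the slide.

```python
from enum import Enum


class Event(Enum):
    ECN_ECHO = "ecn_echo"    # congestion signalled by an ECN mark
    TIMEOUT = "timeout"      # retransmission timer expired
    ERASURE = "erasure"      # packet loss not accompanied by an ECN mark


def next_cwnd(cwnd: float, event: Event, min_cwnd: float = 1.0) -> float:
    """Return the congestion window (in segments) after one event."""
    if event is Event.ECN_ECHO:
        # Assumption: the usual multiplicative decrease (halving).
        return max(min_cwnd, cwnd / 2.0)
    if event is Event.TIMEOUT:
        # Assumption: "same as before" means the usual collapse to the minimum.
        return min_cwnd
    # Erasures alone do not signal congestion, so the window is not cut.
    return cwnd
```

With these assumptions, a 20-segment window becomes 10 after an ECN echo, 1 after a timeout, and stays at 20 after a plain erasure.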
Building Blocks …


Proactive FEC: TCP sender sends data in blocks where the block contains
K data segments and R FEC packets. The amount of FEC protection (R) is
determined by the current loss estimate.
 Proactive FEC based upon estimate of per-window loss rate (Adaptive)
Reactive FEC: Upon receipt of 1 or 2 dupacks, Reactive FEC packets are
sent based on the following criteria.
 Number of Proactive FEC packets already sent.
 Number of holes still left in the decoding block.
 Loss rate currently estimated.
 Reactive FEC to complement retransmissions
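The slides do not give the exact sizing rules, so the following is a hypothetical sketch of how the listed inputs could be combined. It assumes independent erasures at the estimated rate p, picks the smallest proactive R for which the expected number of arrivals (K + R)(1 - p) is at least K, and sizes reactive FEC to cover the holes that proactive FEC cannot, inflated by 1/(1 - p). Both function names and both rules are illustrative assumptions.

```python
import math

_EPS = 1e-9  # absorbs floating-point noise just above an integer


def proactive_fec_count(k: int, loss_estimate: float) -> int:
    """Smallest R with (k + R) * (1 - p) >= k (assumed rule, not from the slides)."""
    if not 0.0 <= loss_estimate < 1.0:
        raise ValueError("loss_estimate must be in [0, 1)")
    return math.ceil(k * loss_estimate / (1.0 - loss_estimate) - _EPS)


def reactive_fec_count(holes: int, proactive_sent: int, loss_estimate: float) -> int:
    """Reactive packets to send after dupacks (assumed rule, not from the slides).

    holes: missing packets in the decoding block
    proactive_sent: proactive FEC packets already sent for this block
    """
    if not 0.0 <= loss_estimate < 1.0:
        raise ValueError("loss_estimate must be in [0, 1)")
    shortfall = max(0, holes - proactive_sent)
    # Send extra so that, on average, `shortfall` packets survive the channel.
    return math.ceil(shortfall / (1.0 - loss_estimate) - _EPS)
```

Under this rule a block of K = 10 segments at a 25% loss estimate would carry 4 proactive FEC packets, and nothing is added when the estimate is zero, which matches the "no overhead when there is no loss" goal.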
Proactive and Reactive FEC in Action..
Block Behavior: Per-Block Loss Estimator for P-FEC
Packet Erasure Rate EWMA Estimator:
$E = \alpha E_{latest} + (1 - \alpha) E$, with adaptive weight $\alpha = E_{latest} / (E_{latest} + E)$
Estimate is fairly accurate within small erasure rate variations.
Overestimate after spikes (overestimate inefficiency period).
Trade-off: Overestimation leads to overhead.
Estimation is done at the receiver and fed back to the sender.
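A minimal sketch of the adaptive-weight EWMA, assuming the weight (written alpha here) is the ratio of the latest per-block erasure rate to the sum of the latest rate and the running estimate. The function names and the handling of the 0/0 case are illustrative choices.

```python
def update_erasure_estimate(estimate: float, latest: float) -> float:
    """One EWMA step with alpha = latest / (latest + estimate)."""
    if latest + estimate == 0.0:
        return 0.0  # no loss seen so far; avoid dividing 0 by 0
    alpha = latest / (latest + estimate)
    return alpha * latest + (1.0 - alpha) * estimate


def track(samples, initial: float = 0.0):
    """Feed per-block erasure rates through the estimator; return every estimate."""
    estimate = initial
    history = []
    for sample in samples:
        estimate = update_erasure_estimate(estimate, sample)
        history.append(estimate)
    return history
```

Because alpha grows with the latest sample, the estimate jumps quickly when loss spikes and decays slowly afterwards (it does not decay at all on a sample of exactly zero), which is consistent with the overestimation after spikes noted on the slide.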
Loss Tracking at Sender
Sender can quickly and accurately track the loss rate based
on feedback from the receiver.
[Figure: packet error rate versus time (axis 0 to 400); interval labels 0%, 0%, 20%, 20%, 30%, 0%, 20%, 10%.]
Reed-Solomon FEC: RS(N,K)
[Figure: K data packets plus N-K FEC packets form a block of size N that crosses a lossy network; the RS(N,K) decoder recovers the K data packets when at least K of the N packets are received.]
Recovery is possible if we receive at least K packets out of N.
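The "at least K of N" condition can be turned into a recovery probability. The sketch below assumes each packet is erased independently with probability p, which is a simplification (the slides also consider burst errors); the function name is illustrative.

```python
from math import comb


def block_recovery_probability(n: int, k: int, p: float) -> float:
    """P(at least k of n packets arrive) under independent erasures with rate p."""
    if not (0 < k <= n) or not (0.0 <= p <= 1.0):
        raise ValueError("need 0 < k <= n and 0 <= p <= 1")
    return sum(
        comb(n, i) * (1.0 - p) ** i * p ** (n - i)  # exactly i packets arrive
        for i in range(k, n + 1)
    )
```

For example, sending one data packet with one FEC packet (n = 2, k = 1) at a 50% erasure rate gives a recovery probability of 0.75, compared with 0.5 for the unprotected packet.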
Timeout Cause #1: Burst Errors + Large MSS
[Figure: a window made up of a few large segments in transmission; a burst of losses (X) erases every segment.]
Complete Window Lost!
Window Granulation Reduces the Risk of Losing the
Complete Window
[Figure: a granulated window of smaller segments in transmission; losses (X) erase only some of them, and the ACK stream from the surviving segments drives retransmissions (rexmits).]
Timeout Cause #2: Insufficient Dupacks
=> SACK not triggered
[Figure: losses (X) in transmission leave only a single dupack (DUPACK-1) in the ACK stream.]
Timeout because of insufficient dupacks
Proactive FEC
[Figure: data segments are sent together with P-FEC packets; some data segments are lost (X) in transmission. The receiver FEC decoder combines the surviving data segments with the P-FEC packets to recover the data packets.]
Timeout Cause #3: Loss of Retransmissions
[Figure: segments are lost (X) in transmission; dupacks (DUPACK1 to DUPACK3) in the ACK stream trigger a retransmission, which is itself lost (X).]
Rexmits are ESPECIALLY vulnerable!
Reactive FEC: Complements Rexmits
[Figure: after transmission losses (X), dupacks (DUPACK1 to DUPACK3) and selective acknowledgements in the ACK stream trigger R-FEC packets; the receiver FEC decoder combines the R-FEC packets with the received segments to recover the missing data.]
Putting it Together….
[Figure: block diagram with the components: application data, MSS adaptation, granulated window, window size, loss estimation, loss estimate, (n,k), FEC computation, P-FEC, data.]
Simulation Configuration
Performance Results
SACK (Multiple Sources)
LT-TCP (Multiple Sources)
Contribution of Components
(20% PER case, Single Source)
LT-TCP is able to:
 reduce timeouts drastically;
 keep the queue non-empty, maximizing throughput and capacity utilization;
 limit the use of FEC to the level needed.
Comparison w/ Link Layer FEC, HARQ
LL FEC: FEC based upon average PER
HARQ: 10% FEC; ARQ persistence = 3
LT-TCP: end-to-end
Summary


TCP performance over wireless with residual erasure rates of 0-50% (short- or long-term).
E2E FEC:






Granulation ensures better flow of ACKs especially in small window
regime.
Adaptive FEC (proactive and reactive) can protect critical packets
appropriately
Adaptive => No overhead when there is no loss.
ECN used to distinguish congestion from loss.
Near-optimal performance for wide range: from low to high
loss rates.
Future Work:


Optimal division of reliability functions between PHY, MAC, and E2E.
Study of interaction between LT-TCP and link-layer schemes.
Thanks!
Researchers:
Omesh Tickoo
Vijay Subramanian
Shiv Kalyanaraman
K. K. Ramakrishnan
Building Block Behavior: Adaptive MSS (Window
Granulation)
Adaptive MSS behavior:
Congestion window (in segments) is kept above G = 10.
MSS increases when CWND grows.
MSS shrinks when CWND shrinks, to maintain G.
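A minimal sketch of this behavior, following the rule of sending as big a packet as possible while keeping G segments in the window. The function name and the bounds `max_mss` and `min_mss` are hypothetical values chosen for illustration, not parameters from the slides.

```python
def adaptive_mss(cwnd_bytes: int, g: int = 10,
                 max_mss: int = 1460, min_mss: int = 64) -> int:
    """Largest segment size (bytes) that still leaves about g segments in the window.

    max_mss and min_mss are assumed bounds. If the window is smaller than
    g * min_mss bytes, the granularity target cannot be met and min_mss is used.
    """
    if cwnd_bytes <= 0 or g <= 0:
        raise ValueError("cwnd_bytes and g must be positive")
    mss = cwnd_bytes // g  # grows and shrinks with the window
    return max(min_mss, min(max_mss, mss))
```

With these bounds, a 5000-byte window gives a 500-byte MSS (ten segments), while a 100000-byte window is capped at 1460 bytes, so the segment count rises well above G.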
Shortened Reed Solomon FEC (per-Window)
[Figure: shortened RS(N,K) code applied per window (W). The K = d + z input symbols consist of the d data packets (D) padded with z zeros (Z); the block of size N also contains proactive FEC (F) and reactive FEC (R) packets.]
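The bookkeeping implied by K = d + z can be sketched as follows. This assumes the usual property of shortened codes that the zero padding is known to both ends and is never transmitted, and it assumes that any parity not sent proactively is held back as reactive FEC; the function name and the returned field names are illustrative.

```python
def shortened_rs_layout(n: int, k: int, d: int, proactive: int) -> dict:
    """Split an RS(n, k) block for a window holding d data packets.

    Assumptions (not stated on the slide): zeros are virtual and not sent,
    and parity not sent proactively is reserved for reactive FEC.
    """
    if not (0 < d <= k < n):
        raise ValueError("need 0 < d <= k < n")
    parity = n - k  # total FEC packets the code can produce
    if not (0 <= proactive <= parity):
        raise ValueError("proactive must be between 0 and n - k")
    return {
        "zeros": k - d,                          # z = K - d
        "data": d,
        "proactive_fec": proactive,
        "reactive_fec_available": parity - proactive,
        "sent_up_front": d + proactive,          # zeros are not transmitted
    }
```

For instance, with a hypothetical RS(255, 223) code, a window of 10 data packets and 4 proactive FEC packets, 213 zeros pad the block, 14 packets are sent up front, and 28 parity packets remain available for reactive use.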
Performance Results..
Drop in Performance from 40 to 50 %
LT-TCP (Single Source)
At 50 % error rate, timeouts increase drastically because..
Few Proactive FEC packets received.
Proactive FEC cannot counter variation in error patterns.
Reactive FEC is insufficient in this case to avoid timeouts.
This effect can be mitigated by increasing FEC protection.
Changes w.r.t. submitted paper








FEC is now done on a block-by-block basis.
Proactive protection is determined solely by the loss estimate. (no arbitrary
constants)
Reactive FEC packets may be wasted if they belong to the wrong block.
 Conditions under which Reactive FEC packets are sent are restricted
(discussed earlier).
Window granulation is done using the following rule
 “Send as big a packet as possible while maintaining granularity”
Throughput and goodput are measured at the receiver for better accuracy.
On partial dupacks, we make sure that retransmissions are not duplicated. We
send new TCP data instead.
Loss tracking is now done whenever we receive an ACK.
Loss estimation at receiver has changed to accommodate block-by-block
decoding.