Robust low-latency streaming

Robust Low-Latency Voice and Video Communication over Best-Effort Networks
Yi Liang
Department of Electrical Engineering
Stanford University
March 12, 2003
http://www.stanford.edu/~yiliang/
Media Delivery over IP Networks
Internet
Liang: Robust Low-Latency Voice and Video Communication
2
QoS Concerns and Challenges
Communication over best-effort networks …

Delay
Impairs interactivity of conversational services
Voice over IP: recommended one way delay < 150 ms
[ITU-T G.114]

Packet loss
Impairs perceptual quality

Delay jitter
Obstructs sequential and continuous media output
Outline of Contributions
[Figure: server, packet network, and client, labeled III. Network-adaptive coding, II. Transport, and I. Client side]
I. Client side
Adaptive playout scheduling for VoIP that reduces latency and packet loss
II. Transport
Packet path diversity and applications in low-latency communications
III. Network-adaptive coding
Low-latency video communication that does not require packet retransmission
Outline
I. Client side
Adaptive playout scheduling for VoIP that reduces latency and packet loss
II. Transport
Packet path diversity and applications in low-latency communications
III. Network-adaptive coding
Low-latency video communication that does not require packet retransmission
Delay Jitter and Buffering
Fixed Playout Schedule
[Figure: packet delays against a fixed playout schedule, with late loss and buffering delay marked; axes: late loss rate (%), avg. buffering delay (ms)]
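The tradeoff on this slide can be illustrated with a minimal Python sketch. It assumes a list of per-packet network delays in milliseconds (the values below are made up for illustration, not measured data) and a hypothetical helper `fixed_playout_stats`. A packet counts as late loss when its delay exceeds the fixed deadline; otherwise it waits in the buffer for the remaining time.

```python
def fixed_playout_stats(delays_ms, deadline_ms):
    """Return (late_loss_rate, avg_buffering_delay_ms) for a fixed playout deadline.

    delays_ms: per-packet network delays (illustrative input).
    deadline_ms: fixed playout deadline relative to each packet's send time.
    """
    on_time = [d for d in delays_ms if d <= deadline_ms]
    late_loss_rate = 1.0 - len(on_time) / len(delays_ms)
    # Packets that arrive in time wait in the buffer until the deadline.
    if on_time:
        avg_buffering = sum(deadline_ms - d for d in on_time) / len(on_time)
    else:
        avg_buffering = 0.0
    return late_loss_rate, avg_buffering


delays = [40, 42, 45, 41, 90, 43, 44, 120, 42, 41]  # made-up trace with two delay spikes
for deadline in (50, 100, 130):
    loss, buf = fixed_playout_stats(delays, deadline)
    print(f"deadline={deadline} ms  late loss={loss:.0%}  avg buffering={buf:.1f} ms")
```

Raising the deadline lowers the late loss rate but increases the average buffering delay, which is the tradeoff a fixed schedule cannot escape.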
Adaptive Playout Scheduling (1)
[Figure: packet delays with the adaptive playout schedule compared against the fixed schedule; buffering delay marked]
Adaptive Playout Scheduling (2)
[Figure: timing diagram showing the packetization time, sender packets 1-8, receiver arrivals, and playout of packets 1-8 over time, with playout slowed down and sped up]
Requires media scaling
1. How to set the playout schedule?
2. How to scale the media?
3. Quality of scaled voice?
Determine the Playout Schedule
[Figure: delay histogram (probability versus delay in ms), with the playout deadline and the loss probability marked]
History-based estimation using past w delays
Next packet: given the acceptable loss rate, find the playout deadline
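The history-based rule above can be sketched in Python as follows. This is an illustrative reading of the slide, not the exact estimator used in the talk: the class name `PlayoutEstimator`, the empirical-quantile rule, and the sample delays are all assumptions.

```python
from collections import deque


class PlayoutEstimator:
    """Sliding window over the last w packet delays (hypothetical helper)."""

    def __init__(self, w):
        self.window = deque(maxlen=w)  # keeps only the w most recent delays

    def observe(self, delay_ms):
        self.window.append(delay_ms)

    def deadline(self, acceptable_loss):
        """Smallest observed delay d such that the fraction of window
        delays exceeding d is at most acceptable_loss."""
        if not self.window:
            raise ValueError("no delays observed yet")
        ordered = sorted(self.window)
        n = len(ordered)
        # Allow at most floor(acceptable_loss * n) packets to be late.
        allowed_late = min(int(acceptable_loss * n), n - 1)
        return ordered[n - 1 - allowed_late]


est = PlayoutEstimator(w=10)
for d in [40, 42, 45, 41, 90, 43, 44, 120, 42, 41]:  # made-up delays in ms
    est.observe(d)
print(est.deadline(0.1))  # 90
print(est.deadline(0.2))  # 45
```

A larger acceptable loss rate yields an earlier deadline, and because the window only holds the most recent w delays, the deadline tracks changes in the delay distribution.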
Voice Scaling Using Time-Scale Modification
[Figure: packet expansion. The input is divided into pitch periods 0-4; a template segment is matched to a similar segment, and the output consists of overlapped segments 0/1, 1/2, 2/3 followed by 3 and 4]
Based on WSOLA [Verhelst '93]
Preserves pitch
Improved to scale short individual voice packets; no delay
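To give a feel for packet expansion by matching a similar segment, here is a deliberately simplified Python sketch. It is not the WSOLA variant from the talk: it estimates one pitch period by normalized correlation and splices in a single extra period with a linear cross-fade. The function names, the lag search range, and the fade length are all assumptions made for illustration.

```python
import math


def best_lag(x, min_lag, max_lag):
    """Lag in [min_lag, max_lag] with the highest normalized correlation
    between x and a shifted copy of itself (a rough pitch-period estimate)."""
    best, best_score = min_lag, -2.0
    for lag in range(min_lag, max_lag + 1):
        a, b = x[:-lag], x[lag:]
        num = sum(u * v for u, v in zip(a, b))
        den = math.sqrt(sum(u * u for u in a) * sum(v * v for v in b))
        score = num / den if den > 0 else 0.0
        if score > best_score:
            best, best_score = lag, score
    return best


def expand_packet(x, min_lag, max_lag, fade):
    """Lengthen packet x by one estimated pitch period using a cross-faded splice."""
    lag = best_lag(x, min_lag, max_lag)
    if len(x) < lag + fade:
        raise ValueError("packet too short for this lag and fade length")
    start = (len(x) - lag - fade) // 2  # splice near the middle of the packet
    out = list(x[:start + lag])
    for j in range(fade):
        w = j / fade  # weight ramps from 0 towards 1
        # Blend the continuing signal with the similar segment one period earlier.
        out.append((1 - w) * x[start + lag + j] + w * x[start + j])
    out.extend(x[start + fade:])
    return out


packet = [math.sin(2 * math.pi * i / 20) for i in range(160)]  # synthetic "voiced" packet
longer = expand_packet(packet, min_lag=16, max_lag=30, fade=10)
print(len(packet), len(longer))
```

Because the splice jumps back by a whole pitch period, the waveform stays continuous and the pitch is unchanged; only the duration grows.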
Examples of Time-Scale Modification
Speech scaling: original, 130%, 70%
Audio scaling: original, 130%, 70%
Quality of Time-Scale Modified Voice
Adaptive Playout Schedule
Packets scaled: 18.4%
Scaling ratio: 50-200%
DMOS: 4.5 out of 5 [ITU-T P.800]
[Audio samples: modified, original]
Results and Comparison
Algorithms:
1. Fixed playout schedule
2. Only adjust playout schedule during silence periods [Ramjee '94; Moon '98]
3. Adaptive playout scheduling
Overall Performance
Stanford -> Chicago
[Figure annotation: -50%]
Alg.     Loss rate   MOS
Alg. 2   10%         2.6
Alg. 3   4%          3.7
MOS scale: 1-5 [ITU-T P.800]
Subjective Listening Test Results
[Figure: MOS (axis 0 to 5) per trace for the compared algorithms (legend: Alg. 2, Alg. 3); traces from Stanford to 1. Chicago, 2. Germany, 3. MIT, 4. China]
Summary
Adaptive Playout Scheduling
Improves the tradeoff between buffering delay and packet loss
Time-scale modification-based speech processing does not impair speech quality
Overall speech quality improves by 1 point on a 5-point MOS scale
The passive algorithm can be easily implemented on the client
Audacity T2, 8X8, Inc.
Outline
I. Client side
Adaptive playout scheduling for VoIP that reduces latency and packet loss
II. Transport
Packet path diversity and applications in low-latency communications
III. Network-adaptive coding
Low-latency video communication that does not require packet retransmission
Packet Path Diversity
Motivation
Typically a better alternative path exists [Savage, SigComm '99]
Uncorrelated packet loss on independent paths [Apostolopoulos '01]
Low-latency requirement
[Figure: sender and receiver connected over multiple paths via relay server 1 and relay server 2]
Internet Experiments
[Figure: network topology with the labels Sender, Receiver, and Relay Server. Hosts: Santa Clara, CA (192.84.16.176); MIT (18.184.0.50); Harvard (140.247.62.110). ISP networks: Exodus Comm., BBN Planet, Qwest. Values in parentheses (5 ms, 45 ms, 40 ms, 5 ms) are the delay incurred on a link or ISP network]
Measured Packet Delay Trace
Adaptive Playout Scheduling for Two-Stream
Multiple Description Speech Coding
[Figure: two streams, s1 and s2, over time; each packet carries even (E) and odd (O) samples, with the packet length marked]
Complementary and redundant descriptions of media
Stream 1: Even samples: finer quantization; odd samples: coarser quantization
Stream 2: Vice versa [Jiang, Ortega '00]
Determine the Playout Schedule
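[Figure: delay distributions (probability versus delay) of Stream 1 and Stream 2, with the playout deadline $d$ and the resulting loss probabilities $p_1$ and $p_2$ marked]
To minimize the Lagrangian cost function
$C = d + \lambda_1 \Pr\{\text{both descriptions lost} \mid d\} + \lambda_2 \Pr\{\text{only one description lost} \mid d\}$
$\phantom{C} = d + \lambda_1 p_1 p_2 + \lambda_2 \left( p_1 (1 - p_2) + p_2 (1 - p_1) \right)$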
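The cost function on this slide can be evaluated directly once the per-stream loss probabilities are estimated for a candidate deadline. The Python sketch below assumes empirical delay samples per stream (made-up values), independent losses on the two paths as the product p1 p2 implies, and hypothetical names `late_prob`, `lagrangian_cost`, and `best_deadline`; the multiplier values are arbitrary.

```python
def late_prob(delays_ms, d):
    """Empirical probability that a packet's delay exceeds deadline d."""
    return sum(1 for x in delays_ms if x > d) / len(delays_ms)


def lagrangian_cost(d, delays1, delays2, lam1, lam2):
    """C = d + lam1 * Pr{both lost | d} + lam2 * Pr{only one lost | d}."""
    p1 = late_prob(delays1, d)
    p2 = late_prob(delays2, d)
    both_lost = p1 * p2
    one_lost = p1 * (1 - p2) + p2 * (1 - p1)
    return d + lam1 * both_lost + lam2 * one_lost


def best_deadline(delays1, delays2, lam1, lam2, candidates):
    """Candidate deadline with the lowest Lagrangian cost."""
    return min(candidates,
               key=lambda d: lagrangian_cost(d, delays1, delays2, lam1, lam2))


stream1 = [40, 40, 40, 100]  # made-up delays in ms, one late spike
stream2 = [50, 50, 50, 50]
d_opt = best_deadline(stream1, stream2, lam1=1000, lam2=100,
                      candidates=range(40, 121, 10))
print(d_opt, lagrangian_cost(d_opt, stream1, stream2, 1000, 100))
```

Choosing lam1 much larger than lam2 penalizes losing both descriptions far more than losing one, so the deadline can sit below the delay spikes of either single stream.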
Overall Performance
[Figure: loss rate (%) versus average end-to-end delay (ms); annotation: -35 ms]
Summary
Packet Path Diversity
Exploitation of statistically uncorrelated delay jitter and packet loss behavior
Adaptive playout scheduling for multiple streams provides lower latency and reduced distortion
Outline
I. Client side
Adaptive playout scheduling for VoIP that reduces latency and packet loss
II. Transport
Packet path diversity and applications in low-latency communications
III. Network-adaptive coding
Low-latency video communication that does not require packet retransmission
Low-Latency Video Communication
Motivation for low-latency video
Real-time conversational services
Interactive video streaming
Voice vs. Video
Voice over IP                          Typical video streaming
< 150 ms                               5 ~ 15 seconds pre-roll time
Weak or no dependency across packets   Strong dependency across packets due to motion-compensated coding
Low-Latency Video — Challenges
What the problems are
Packet dependency due to hybrid motion-compensated coding
[Figure: the "P-I" scheme. Frame sequence I P P P P P P P over time with interframe prediction; a transmission error is marked]
Large receiver buffer and packet retransmission employed
Approaches
Goal
Achieve VoIP-like latency
Approach
Eliminate the need for retransmission
Robust network-adaptive coding by optimal packet dependency management
Coding Mode
[Figure: coding modes P1, P2, …, P5, …, INTRA, ordered by increasing error-resilience]
Error-Resilience vs. Compression Efficiency
Foreman sequence coded at PSNR = 35.9 dB (H.26L TML8.5, 30 fps, 270 frames)
[Figure: rate (Kbps) versus coding mode, up to INTRA]
Increased error-resilience
Decreased compression efficiency
Determine R-D Optimized Coding Modes
Select the prediction mode that minimizes the R-D cost
[Figure: frame n predicted from a long-term memory of V previous frames]
Candidate modes and their rate-distortion pairs: P1: $(R_1, D_1)$, P2: $(R_2, D_2)$, …, PV: $(R_V, D_V)$, I: $(R_I, D_I)$
$J_v = D_v + \lambda R_v$
$v_{\mathrm{opt}}(n) = \arg\min_{v \in \{1, 2, \ldots, V, I\}} J_v(n)$
v: coding mode
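The selection rule above amounts to an arg-min over a small set of candidates. A minimal Python sketch follows; the rate and distortion numbers are invented purely for illustration, and `select_mode` is a hypothetical helper name.

```python
def select_mode(candidates, lam):
    """Return the coding mode v that minimizes J_v = D_v + lam * R_v.

    candidates: dict mapping mode -> (rate, expected_distortion).
    lam: Lagrange multiplier trading rate against distortion.
    """
    return min(candidates,
               key=lambda v: candidates[v][1] + lam * candidates[v][0])


# Made-up (rate, distortion) pairs: INTRA ("I") costs the most bits but is
# the most robust, so its expected distortion under loss is the lowest.
modes = {1: (10, 50), 2: (12, 40), 5: (15, 30), "I": (40, 10)}
print(select_mode(modes, lam=1.0))   # 5
print(select_mode(modes, lam=0.1))   # I
```

With a large multiplier, rate dominates and cheaper prediction modes win; with a small multiplier, distortion dominates and the selection moves towards INTRA.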
Estimation of Distortion
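[Figure: binary tree of loss events over frames n-3, n-2, n-1 leading to frame n; each branch has probability $p$ (lost) or $1-p$ (received)]
Mode P1, $(R_1, D_1)$: $D_{11}$, $p_{11} = (1-p)^3$; $D_{12}$, $p_{12} = (1-p)^2 p$; …; $D_{18}$, $p_{18} = p^3$
Mode P2, $(R_2, D_2)$: $D_{21}$, $p_{21} = (1-p)^2$; $D_{22}$, $p_{22} = (1-p)p$; $D_{23}$, $p_{23} = p(1-p)$; $D_{24}$, $p_{24} = p^2$
…
Mode PV, $(R_V, D_V)$
$D_1 = \sum_{i=1}^{8} D_{1i}\, p_{1i}$, $\quad D_2 = \sum_{i=1}^{4} D_{2i}\, p_{2i}$
Channel feedback utilized at the source coder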
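The expected distortion of a mode is a probability-weighted sum over loss patterns. The Python sketch below enumerates the patterns for independent losses with rate p. How the per-pattern distortions are obtained is not specified on the slide, so the sketch takes them from a caller-supplied function; `pattern_probs`, `expected_distortion`, and the toy distortion values are assumptions.

```python
from itertools import product


def pattern_probs(num_frames, p):
    """Map each loss pattern (tuple of bools, True = lost) over num_frames
    frames to its probability under independent losses with rate p."""
    probs = {}
    for pattern in product((False, True), repeat=num_frames):
        lost = sum(pattern)
        probs[pattern] = (p ** lost) * ((1 - p) ** (num_frames - lost))
    return probs


def expected_distortion(distortion_of, num_frames, p):
    """Sum over all patterns of probability times distortion."""
    return sum(prob * distortion_of(pattern)
               for pattern, prob in pattern_probs(num_frames, p).items())


def toy_distortion(pattern):
    # Placeholder: any loss among the frames the reference depends on hurts.
    return 100.0 if any(pattern) else 10.0


# Three frames give 8 patterns, two frames give 4, matching the two sums above.
print(len(pattern_probs(3, 0.1)), len(pattern_probs(2, 0.1)))
print(expected_distortion(toy_distortion, 3, 0.1))
```

In a system with channel feedback, patterns already known to be impossible could be dropped and the remaining probabilities renormalized; that step is omitted here.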
Experimental Results
Comparing
1. Rate-distortion optimized dependency management
2. Simple P-I
[Figure: simple P-I frame sequence I P P P P P …]
R-D Performance (1)
[Figure: R-D curves; annotations: 36%, 1.2 dB]
No retransmission; no algorithm delay
Channel loss rate = 10%
R-D Performance (2)
No retransmission; no algorithm delay
channel loss rate=10%
R-D Performance (3)
Bitrate 200 Kbps, various channel loss rates
Video Demo (1)
R-D optimized
Simple P-I
Foreman, 109Kbps, 10% channel loss
No retransmission; no algorithm delay
Video Demo (2)
R-D optimized
Simple P-I
Mother-Daughter, 318Kbps, 10% channel loss
No retransmission; no algorithm delay
Summary
Network-Adaptive Packet Dependency Management
R-D optimization improves the tradeoff between error-resilience and compression efficiency
Eliminated the need for packet retransmission; achieved VoIP-like low latency
Summary of Contributions
[Figure: server, packet network, and client, labeled III. Network-adaptive coding, II. Transport, and I. Client side]
I. Client side
Adaptive playout scheduling that reduces latency and packet loss
II. Transport
Packet path diversity that further reduces communication delay and distortion
III. Network-adaptive coding
A video communication system that requires no packet retransmission, which allows VoIP-like low latency
Other Contributions
Other contributions not covered in this presentation
A low-latency loss concealment scheme
Packet path diversity for robust low-latency video communication
A layered coding structure to avoid mismatch error for streaming of pre-coded video
An accurate model to quantify video distortion as a result of packet losses
A prescient scheme that optimizes the dependency for a group of packets for video streaming
Publications

Journal publications: 3
IEEE Transactions on Multimedia
Journal of Wireless Communication and Mobile Computing
IEEE Transactions on Circuits and Systems for Video Technology

Invited papers: 4

Papers in conference proceedings: 8
Proceedings ACM Multimedia (SigMM)
… …
Media Delivery over IP Networks
Internet
Low-Latency Media Communication
Acknowledgements
Committee members, EE faculty
My family members
IVMS group members and alumni, and assistants
Our sponsors
Many friends, in ISL, EE, and Stanford
Backup Slides
The following backup slides may or may not be used …
Determine the Playout Schedule
[Figure: delay histogram (percentage versus delay in ms); markers: $\hat{d}$, $l$]
Likelihood Ratio Factor
$\mathrm{lrf} = \frac{1}{w} \sum_{i=1}^{w} \frac{(D_i - \mu_s)^2}{\sigma_s^2}$
[Gibbon, Little, '96]
More Samples for Time-Scale Modification
Audio scaling: original, expanded by 20%, compressed by 20%
Low-Latency Loss Concealment
[Figure: packets i-2, i-1, i (lost), i+1, i+2, each of length L, and the concealed output built from i-2, i-1, i+1, i+2 over time, with alignment found by correlation; annotations: 2L, 1.3L]
Earlier work [Stenger '96]
Algorithm delay reduced to one packet time
Nicely integrates into adaptive playout system
20% random packet loss: [audio samples: original, loss, concealed]
Speech Samples
Alg.       Loss rate   MOS
Alg. 2     10%         2.6
Alg. 3     4%          3.7
Original               4.4
Overall Performance
[Figure: four panels, one per trace from Stanford: 1. Chicago, 2. Germany, 3. MIT, 4. China]
Multi-Stream Playout Scheduling
[Figure: timing diagram of packets 1-6 sent on path 1 and path 2, received on each path, and played out over time]
Packet path diversity reduces effective delay jitter and therefore late loss rate
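The jitter-reduction claim can be checked on a toy example. The Python sketch below assumes that each packet time is covered on both paths, so playout can proceed as soon as the earlier of the two packets arrives; the delay values and helper names are made up for illustration.

```python
from statistics import pstdev


def effective_delays(path1_ms, path2_ms):
    """Per-packet delay when the earlier of the two arrivals is used."""
    return [min(a, b) for a, b in zip(path1_ms, path2_ms)]


def late_loss_rate(delays_ms, deadline_ms):
    """Fraction of packets arriving after the playout deadline."""
    return sum(1 for d in delays_ms if d > deadline_ms) / len(delays_ms)


path1 = [40, 90, 42, 41, 120, 43]   # made-up delays in ms, spikes at packets 2 and 5
path2 = [45, 46, 95, 44, 47, 110]   # spikes at packets 3 and 6
both = effective_delays(path1, path2)

for name, trace in (("path 1", path1), ("path 2", path2), ("min of both", both)):
    print(f"{name}: std={pstdev(trace):.1f} ms, "
          f"late loss at 60 ms={late_loss_rate(trace, 60):.0%}")
```

Because the delay spikes on the two paths do not coincide in this example, the minimum removes all of them, which is the effect the slide attributes to uncorrelated jitter.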
Path Diversity – Voice Demo
[Audio samples: original; path diversity; single-stream with FEC at same data rate]
Average total end-to-end delay: 84 ms
Error concealment: speech segment repetition
More Experiment Results
Results obtained by varying $\lambda_2$ while keeping $\lambda_1$ fixed
With higher delay: better chances to play both descriptions
Observed lower playout rate variation by using multiple streams
Jitter averaged; lower STD of $\min(d_i, d_j)$
PESQ Results
Perceptual Evaluation of Speech Quality (ITU-T Rec. P.862, Feb. 2001)
PESQ can be used for end-to-end quality assessment
Ranges from -0.5 to 4.5 but usually produces MOS-like scores between 1.0 and 4.5
Internet Experiment (2)
[Figure: network topology. Hosts: New Jersey (165.230.227.81); Harvard (140.247.62.110); Erlangen (131.188.130.136). Networks: VBNS IP Backbone Service, AT&T, DANTE Operations, UUNET Tech. Per-link or per-network delays shown: 7 ms, 40 ms, 5 ms, 10 ms, 5 ms]
Path 1 (direct): N. J. – Erlangen
Path 2 (alternative): N. J. – Harvard – Erlangen
Results (2)
Mean delay: 61.3 / 65.0 ms
Link loss: 0.6% / 1.1%
Significant reduction of late loss and end-to-end delay by packet path diversity
Path 1 (direct): N. J. – Germany
Path 2 (alternative): N. J. – Harvard – Germany
Video Streaming Using Path Diversity
[Figure: frames n-5, n-4, n-3, n-2, n-1 sent over Path 1 and Path 2; next frame to encode and send: n]
Goal: Minimize distortion under rate constraint
(1) Path selection to minimize the loss probability of frame n and maximize the benefit of path diversity
    Alternate when both channels are good
    Send small probe packets over the channel in bad state
    [Setton, Liang, Girod, ICME'03, submitted]
(2) Source coding
Determine Prediction Mode
[Figure: frames n-5, n-4, n-3, n-2, n-1 distributed over Path 1 and Path 2, with candidate references for frame n at v = 1, 2, 3, 5; long-term memory V = 5]
Prediction modes: v = 1, 2, …, V, I
$J_v = D_v + \lambda R_v$, for $v \in \Omega$
$\Omega = \{v \mid \text{frame } n-v \text{ sent over the same channel as } n\} \cup \{1\} \cup \{I\}$
Here $\Omega = \{1, 2, 3, 5, I\}$, giving $J_1, J_2, J_3, J_5, J_I$
$v_{\mathrm{opt}} = \arg\min_{v \in \Omega} J_v$
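The candidate set Ω can be built mechanically from the record of which path carried each frame. The Python sketch below uses a hypothetical `candidate_modes` helper and a made-up path assignment chosen so that the result matches the set shown on the slide.

```python
def candidate_modes(path_of, n, V):
    """Candidate prediction modes for frame n.

    path_of: dict mapping frame index -> path id used to send that frame.
    Returns every v in 1..V whose reference frame n-v went over the same
    path as frame n, plus v = 1 and INTRA (written "I").
    """
    modes = {v for v in range(1, V + 1)
             if path_of.get(n - v) == path_of[n]}
    modes.add(1)  # the previous frame is always a candidate
    return sorted(modes) + ["I"]


# Made-up assignment: frames 5, 7, 8 and the new frame 10 use path 1.
paths = {5: 1, 6: 2, 7: 1, 8: 1, 9: 2, 10: 1}
print(candidate_modes(paths, n=10, V=5))  # [1, 2, 3, 5, 'I']
```

The selected mode is then the member of this list with the lowest cost J_v, exactly as in the single-path case but over the restricted set.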
Results (1)
Comparing to
RPS-NACK [Lin, ICME'01]
Video redundancy coding (VRC) [H.263++]
Channel loss rate 1 = loss rate 2 = 15%
Avg burst len = 8
Feedback delay = 6
Results (2)
Channel loss rate 1 = loss rate 2 = 15%
Avg burst len = 8
Feedback delay = 6
Path Diversity Gain with Shared Link
TCP-Friendly Streaming
$r = \frac{1.22 \times \mathrm{MTU}}{\mathrm{RTT} \times \sqrt{p}}$
[Mahdavi, Floyd, '97]
[Floyd, Handley, Padhye, Widmer, '00]
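The TCP-friendly rate formula is easy to evaluate numerically. In the Python sketch below, the function name and the choice of units (MTU in bytes, RTT in seconds, so the rate comes out in bytes per second) are assumptions; the example values are illustrative.

```python
import math


def tcp_friendly_rate(mtu_bytes, rtt_s, loss_rate):
    """r = 1.22 * MTU / (RTT * sqrt(p)), in bytes per second."""
    if rtt_s <= 0 or not 0 < loss_rate <= 1:
        raise ValueError("need rtt_s > 0 and 0 < loss_rate <= 1")
    return 1.22 * mtu_bytes / (rtt_s * math.sqrt(loss_rate))


rate = tcp_friendly_rate(mtu_bytes=1500, rtt_s=0.1, loss_rate=0.01)
print(f"{rate * 8 / 1000:.0f} kbit/s")
```

The rate falls with the square root of the loss rate and inversely with the round-trip time, so quadrupling the loss rate halves the allowed rate.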
Long-Term Memory Prediction and Packet Dependency
To manage prediction dependency

Long-term Memory (LTM) prediction on macroblock level
[Wiegand, Zhang, Färber, Girod, ’99, ‘00]

Reference Picture Selection (RPS)
NACK
[Annex N H.263+, Annex U H.263++, H.26L]
NEWPRED
[ISO/IEC MPEG-4]
R-D Optimization
$J_v = D_v + \lambda R_v$
$D_v = \sum_{l=1}^{L(n)} p_{vl} D_{vl}$
$\lambda = 5 e^{0.1 Q} \, \frac{5 + Q}{34 - Q}$
[H.26L TML 8.5]
[Wiegand, Girod, ICIP'01]
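The three formulas on this slide combine into a single cost evaluation. The Python sketch below is illustrative only: it reads Q as the quantizer parameter (an assumption based on the TML reference), and the function names and the sample probabilities, distortions, and rate are made up.

```python
import math


def tml_lambda(q):
    """lambda = 5 * exp(0.1 * Q) * (5 + Q) / (34 - Q), defined for 0 <= Q < 34."""
    if not 0 <= q < 34:
        raise ValueError("Q must satisfy 0 <= Q < 34")
    return 5 * math.exp(0.1 * q) * (5 + q) / (34 - q)


def rd_cost(pattern_probs, pattern_distortions, rate, q):
    """J_v = D_v + lambda * R_v, with D_v the probability-weighted distortion."""
    expected_d = sum(p * d for p, d in zip(pattern_probs, pattern_distortions))
    return expected_d + tml_lambda(q) * rate


print(tml_lambda(0), tml_lambda(20))
print(rd_cost([0.9, 0.1], [10.0, 100.0], rate=2.0, q=20))
```

The multiplier grows quickly with Q, so at coarse quantization the cost is dominated by rate and cheaper prediction modes become more attractive.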
Dynamic PSNRs
Streaming of Pre-Encoded Media

Media pre-coded and pre-stored offline
Bit-stream assembly at streaming time
Pre-coded content benefits a large number of users
One potential problem …
Potential Mismatch Error
[Figure: pre-encoded streams S1 and S2 made of I- and P-frames; the encoded, transmitted, and decoded sequences (P P P I P P …) are shown, with the mismatch marked]
Previous schemes using S-frame [Färber, ICIP'97], SP-frame [H.26L] alleviate or solve the problem at the cost of higher bitrate
Layered Coding Structure for Bitstream Assembly
[Figure: layered coding structure with LAYER I, LAYER II, and LAYER III built from I and P5 frames; TGOP = 25, V = 5]
SYNC-frames: allow switching
P-I for Comparison
[Figure: P-I streams for comparison, each an I-frame followed by a run of P-frames, with the I-frames at staggered positions]
Cost of Error-Resilience (1)

Error-resilience / low-latency is not free
PSNR (dB)   Bitrate increase for 5% loss   Bitrate increase for 10% loss
33.4        17%                            39%
35.9        20%                            43%
37.8        14%                            35%
Distortion at the encoder
$V = 5$, $d_{fb} = 7$
Cost of Error-Resilience (2)
PSNR (dB)   Bitrate increase for 5% loss   Bitrate increase for 10% loss
35.0        20%                            52%
36.4        17%                            45%
39.3        22%                            46%
40.0        16%                            40%
Distortion at the encoder
$V = 5$, $d_{fb} = 7$
Cost of Layered Coding Structure (1)
Lossless channel
[Figure annotations: 23%, 30%, 25%, 32%]
$V = 5$, $d_{fb} = 5$, $p = 0$
Cost of Layered Coding Structure (2)
Channel loss rate = 5%
$V = 5$, $d_{fb} = 5$, $p = 0.05$
Comparing Different Error-Resilience Schemes
Latency
R-D cost
Resilience to burst
loss
ARQ
High
Low
FEC
Medium-low
Medium-high Medium-low,
depending on delay
Dependency Very low
control
Low
Medium-high High