Video Coding For Compression . . . and Beyond Bernd Girod Information Systems Laboratory Department of Electrical Engineering Stanford University.


Video Coding
For Compression
. . . and Beyond
Bernd Girod
Information Systems Laboratory
Department of Electrical Engineering
Stanford University
Bit Consumption of US Households
Bit equivalent, assuming state-of-the-art compression, year 2000
Total for 70M households: ~230 Exabyte/year
Television: 94%
Radio: 1.7%
Recorded Music: 0.4%
Newspaper: 0.0003%
Books: 0.0002%
Magazines: 0.0002%
Home video: 3.3%
Video games: 0.6%
Internet: 0.0003%
[Source: UC Berkeley: How much Information]
Bernd Girod: Video Coding for Compression and Beyond
Desirable Compression Ratios
Source: ITU-R 601, 166 Mbps
SDTV broadcasting, ~2 Mbps: ~100 : 1
DSL, ~200 kbps (CIF): ~1,000 : 1
Dial-up modem or wireless link, ~20 kbps (QCIF): ~10,000 : 1
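As a rough check of these orders of magnitude, here is a minimal sketch. It assumes that every ratio is measured against the 166 Mbps ITU-R 601 rate, even for the CIF and QCIF cases; the function name and dictionary keys are illustrative only.

```python
# Raw ITU-R 601 source rate quoted on the slide, in Mbps.
SOURCE_MBPS = 166.0

# Channel rates quoted on the slide, in Mbps.
CHANNELS = {
    "SDTV broadcasting": 2.0,
    "DSL": 0.2,
    "Dial-up modem / wireless link": 0.02,
}


def compression_ratio(source_mbps: float, channel_mbps: float) -> float:
    """Return the factor by which the source rate exceeds the channel rate."""
    return source_mbps / channel_mbps


ratios = {name: compression_ratio(SOURCE_MBPS, rate) for name, rate in CHANNELS.items()}

for name, ratio in ratios.items():
    print(f"{name}: about {ratio:,.0f} : 1")
```

Under this assumption the exact quotients are 83, 830, and 8,300, which the slide rounds to the nearest power of ten.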
Outline


Video compression – state-of-the-art
Beyond compression
– Rate-scalable video
– Wavelet video coding
– Error-resilient video transmission
– Unequal error protection
– Optimal scheduling for packet networks
– Distributed video coding
“It has been customary in the past to transmit successive
complete images of the transmitted picture.”
[...]
“In accordance with this invention, this difficulty is avoided by
transmitting only the difference between successive images of
the object.”
Motion-Compensated Hybrid Coding
[Block diagram: video in; coder control; transform/quantizer; dequantizer/inverse transform (decoder loop); intra/inter switch (0 for intra); motion-compensated predictor; motion estimator; entropy coding of control data, quantized transform coefficients, and motion data]
Annotations added to the diagram on successive slides:
– ¼-pixel accuracy
– Adaptive block sizes
– Multiple past reference frames
– Generalized B-frames
Standards: H.261, MPEG-1, MPEG-2, H.263, MPEG-4, H.264/AVC
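The motion estimator in the hybrid coder can be illustrated with a minimal sketch of full-search block matching. This is a toy under stated assumptions, not what any of the listed standards mandates: it uses integer-pixel displacements only, a sum-of-absolute-differences cost, and hypothetical names (`sad`, `block_match`, `bx`, `by`).

```python
def sad(cur, ref, bx, by, dx, dy, n):
    """Sum of absolute differences between the n x n block of `cur` at (bx, by)
    and the block of `ref` displaced by (dx, dy)."""
    return sum(
        abs(cur[by + y][bx + x] - ref[by + dy + y][bx + dx + x])
        for y in range(n)
        for x in range(n)
    )


def block_match(cur, ref, bx, by, n=4, search=2):
    """Full search over integer displacements within +/- `search` pixels."""
    height, width = len(ref), len(ref[0])
    best = None
    for dy in range(-search, search + 1):
        for dx in range(-search, search + 1):
            # Skip candidates that would read outside the reference frame.
            if not (0 <= by + dy and by + dy + n <= height):
                continue
            if not (0 <= bx + dx and bx + dx + n <= width):
                continue
            cost = sad(cur, ref, bx, by, dx, dy, n)
            if best is None or cost < best[0]:
                best = (cost, dx, dy)
    return best


# Toy frames: the current frame is the reference shifted right by one pixel.
ref = [[(7 * x + 13 * y) % 31 for x in range(12)] for y in range(12)]
cur = [[ref[y][max(x - 1, 0)] for x in range(12)] for y in range(12)]

cost, dx, dy = block_match(cur, ref, bx=4, by=4)
print(cost, dx, dy)
```

The returned displacement points from the current block to its best match in the reference frame; the prediction residual for that block is what the transform/quantizer stage would then code.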
Rate-Distortion Optimized Coder Control

Minimize Lagrangian cost function
$$J = D + \lambda R = \sum_i D_i + \lambda \sum_i R_i = \sum_i J_i$$
where D is the total distortion, R the total bit-rate, D_i the distortion for block i, R_i the rate for block i, and J_i the Lagrangian cost for block i.
Strategy: minimize J_i for each block i separately, using a common Lagrange multiplier λ
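The per-block strategy can be sketched as follows. The candidate modes and their distortion and rate numbers are made up for illustration, and `choose_modes` is a hypothetical helper rather than part of any codec.

```python
def choose_modes(blocks, lam):
    """For each block, pick the mode whose cost D_i + lam * R_i is smallest."""
    choices = []
    for candidates in blocks:
        best = min(
            candidates,
            key=lambda mode: candidates[mode][0] + lam * candidates[mode][1],
        )
        choices.append(best)
    return choices


# Hypothetical per-block candidates: mode -> (distortion D_i, rate R_i in bits).
blocks = [
    {"skip": (90.0, 1), "inter": (30.0, 20), "intra": (10.0, 60)},
    {"skip": (5.0, 1), "inter": (4.0, 18), "intra": (3.0, 55)},
]

low_lambda = choose_modes(blocks, lam=0.1)
mid_lambda = choose_modes(blocks, lam=1.0)
high_lambda = choose_modes(blocks, lam=5.0)
print(low_lambda, mid_lambda, high_lambda)
```

Because the same multiplier is used for every block, raising it trades distortion for rate consistently across the picture: in this toy data the first block moves from intra to inter to skip as the multiplier grows.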
Multiple Reference Frames in H.264/AVC
[Plot: PSNR Y [dB], 26 to 38, versus R [Mbit/s], 0 to 4; Mobile & Calendar (CIF, 30 fps)]
Curves:
– PBB... with generalized B pictures
– PBB... with classic B pictures
– PPP... with 5 previous references
– PPP... with 1 previous reference
Annotations on the three successive versions of the plot: ~15%, >25%, ~40%
Outline


Video compression – state-of-the-art
Beyond compression
– Rate-scalable video
– Wavelet video coding
– Error-resilient video transmission
– Unequal error protection
– Optimal scheduling for packet networks
– Distributed video coding
Surprising Success of ITU-T Rec. H.263
What H.263 was developed for: analog videophone
. . . and what it was used for: Internet video streaming
Internet Video Streaming
[Diagram: media server connected through the Internet to streaming clients over DSL, dial-up modem, and wireless links]
• How to accommodate heterogeneous bit-rates?
• How to react to network congestion?
• How to mitigate late or lost packets?
Fine Granular Scalability (FGS)
Base layer: 20 kbps
Enhancement layer: variable bit-rate
Efficiency gap: ~2 dB (H.264 with/without FGS option, Foreman sequence, 5 fps)
Wavelet Video Coder
[Block diagram: original video frames 0 to 7 → temporal wavelet transform (temporal subbands LLL, LLH, LH, H) → spatial wavelet transform → embedded quantization & entropy coding]
[Taubman & Zakhor, 1994] [Ohm, 1994]
[Choi & Woods, 1999] [Hsiang & Woods, VCIP ’99] . . . and others
Lifting
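[Diagram, analysis: even frames and odd frames pass through a predict step P and an update step U, both with motion compensation, then gains G0 and G1, yielding the low band and the high band]
[Diagram, synthesis: low band and high band pass through the inverse gains G0⁻¹ and G1⁻¹, then the U and P steps are undone to recover the even frames and odd frames]
[Secker & Taubman, 2001]
[Popescu & Bottreau, 2001]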
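A minimal sketch of the lifting idea, under simplifying assumptions: it uses the Haar predict and update steps, omits motion compensation, and sets both gains to 1. Frames are flat lists of pixel values, and the function names are illustrative.

```python
def haar_lift_analysis(frames):
    """Split frames into even/odd, then apply predict (P) and update (U) steps."""
    even, odd = frames[0::2], frames[1::2]
    # Predict: the high band is the odd frame minus its prediction from the even frame.
    high = [[o - e for o, e in zip(fo, fe)] for fo, fe in zip(odd, even)]
    # Update: the low band is the even frame plus half the high band.
    low = [[e + h / 2 for e, h in zip(fe, fh)] for fe, fh in zip(even, high)]
    return low, high


def haar_lift_synthesis(low, high):
    """Undo the update step, then the predict step, and re-interleave the frames."""
    even = [[l - h / 2 for l, h in zip(fl, fh)] for fl, fh in zip(low, high)]
    odd = [[h + e for h, e in zip(fh, fe)] for fh, fe in zip(high, even)]
    frames = []
    for fe, fo in zip(even, odd):
        frames.extend([fe, fo])
    return frames


# Four toy "frames" of three pixels each (an even number of frames is assumed).
frames = [[10, 20, 30], [12, 18, 30], [50, 40, 30], [54, 44, 31]]
low, high = haar_lift_analysis(frames)
restored = haar_lift_synthesis(low, high)
print(low, high)
```

Synthesis reuses exactly the same P and U operators with the signs reversed, so the transform stays invertible whatever predictor is plugged in. That property is what allows a motion-compensated predictor to be used in the temporal transform.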
MC Wavelet Coding vs. H.264/AVC
[Plot: luminance PSNR (dB), 20 to 38, versus bit-rate (Mbps), 0.2 to 2.0; sequence: Mobile CIF]
Curves:
– Non-scalable H.264/AVC
– Scalable MC 5/3 Wavelet
H.264/AVC
• high complexity RD control
• CABAC
• PBBPBBP . . .
• 5 prev/3 future reference frames
• data courtesy of M. Flierl
[Taubman & Secker, VCIP 2003], courtesy D. Taubman
Wavelet Synthesis with Lossy Motion Vector
[Block diagram: video in → MC wavelet transform → embedded encoding → decoder → inverse wavelet transform → video out. A motion estimator produces motion d, which passes through its own embedded encoding and decoder before reaching the inverse wavelet transform. Each embedded encoder is labeled "Minimize J = D + λR".]
[Taubman & Secker, ICIP03]
R-D Performance with Lossy Motion Vector
[Plot: video PSNR (dB), 24 to 40, versus bit-rate (kbps), 0 to 1200; sequence: CIF Foreman]
Curves:
– Non-embedded, single-rate
– Embedded wavelet coefficients, lossless motion
– Embedded wavelet coefficients, lossy motion
[Taubman & Secker, VCIP 2003], courtesy D. Taubman
Outline


Video compression – state-of-the-art
Beyond compression
– Rate-scalable video
– Wavelet video coding
– Error-resilient video transmission
– Unequal error protection
– Optimal scheduling for packet networks
– Distributed video coding
Priority Encoding Transmission (PET)
[Diagram: base layer and enhancement layer are arranged in a block of packets sent over the packet network; each Reed-Solomon codeword spans the block, with K information symbols and N-K redundancy symbols]
[Albanese, Blömer, Edmonds, Luby, Sudan, 1996]
[Horn, Stuhlmuller, Link, Girod, 1999]
[Mohr, Riskin, Ladner, 2000]
[Chou, Wang, Padmanabhan, 2003]
[Davis & Danskin, 1996]
[Puri, Ramchandran, 1999]
[Stankovic, Hamzaoui, Xiong, 2002]
. . . and many more . . .
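To see why a smaller K protects a layer better, here is a minimal sketch. It assumes independent packet losses with a fixed probability and an ideal erasure code that recovers a codeword whenever at least K of its N symbols arrive; the numbers and the name `decode_probability` are illustrative.

```python
from math import comb


def decode_probability(n: int, k: int, loss: float) -> float:
    """Probability that at least k of n packets arrive under i.i.d. packet loss."""
    return sum(
        comb(n, lost) * loss**lost * (1.0 - loss) ** (n - lost)
        for lost in range(n - k + 1)
    )


N = 10
p_base = decode_probability(N, k=6, loss=0.1)  # strongly protected layer
p_enh = decode_probability(N, k=9, loss=0.1)   # weakly protected layer
print(f"base layer: {p_base:.4f}, enhancement layer: {p_enh:.4f}")
```

With these assumed parameters the base layer survives far more often than the enhancement layer, which is the unequal error protection that priority encoding aims for.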
Packet Delay Jitter and Loss
[Plot: pdf of the packet delay, with total probability (1−ε) for packets that arrive and loss probability ε for packets that never arrive; κ is marked on the delay axis; a packet whose delay exceeds the lead-time counts as lost]
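The effect of lead-time on the effective loss probability can be sketched numerically. This is an illustration under assumptions not stated on the slide: the delay of a packet that arrives follows a Gamma distribution shifted by κ, μ and σ are the mean and standard deviation of the total delay, and a packet is lost outright with probability ε. The helper names are hypothetical.

```python
import math


def gamma_cdf(x: float, shape: float, scale: float) -> float:
    """Gamma CDF via the power series of the regularized lower incomplete gamma function."""
    if x <= 0.0:
        return 0.0
    z = x / scale
    term = 1.0 / shape
    total = term
    n = 1
    while term > 1e-15 * total:
        term *= z / (shape + n)
        total += term
        n += 1
    return total * math.exp(-z + shape * math.log(z) - math.lgamma(shape))


def late_or_lost_probability(lead_time, eps, kappa, mu, sigma):
    """P(packet misses its deadline) = eps + (1 - eps) * P(delay > lead_time)."""
    shape = ((mu - kappa) / sigma) ** 2   # matches the assumed mean and variance
    scale = sigma**2 / (mu - kappa)
    arrived_in_time = gamma_cdf(lead_time - kappa, shape, scale)
    return eps + (1.0 - eps) * (1.0 - arrived_in_time)


# Delay parameters quoted later in the deck (ms), with 20 % packet loss.
for lead in (50.0, 100.0, 200.0):
    p = late_or_lost_probability(lead, eps=0.2, kappa=10.0, mu=50.0, sigma=23.0)
    print(f"lead-time {lead:5.0f} ms: effective loss {p:.3f}")
```

As the lead-time grows, the effective loss probability falls toward ε, the fraction of packets that never arrive at all.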
Smart Prefetching
Idea: Send more important packets earlier to allow for more retransmissions
[Diagram: server and client connected over the Internet; labels: video packets, rate-distortion preamble, request, packet stream, schedule, updated packet schedule]
[Podolsky, McCanne, Vetterli 2000] [Miao, Ortega 2000] [Chou, Miao 2001]
Rate-Distortion Preamble
[Diagram: sequence of I, P, and B frames]
Each media packet n is labeled by
− Bn: size [in bits] of data unit n
− Δdn: distortion reduction if n is decoded
− tn: decoding deadline for n
For video: Δdn must be made “state-dependent” to accurately capture concealment
Markov Decision Tree for One Packet
[Diagram: decision tree over the transmission opportunities at tcurrent, tcurrent+Δt, tcurrent+2Δt, ...; at each opportunity the action is send (1) or do not send (0), and the observation is ack (1) or no ack (0)]
“Policy” minimizing J0 = D + λR
... N transmission opportunities before deadline
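A heavily simplified sketch of choosing a transmission policy for one packet by minimizing a Lagrangian cost. It deliberately ignores acknowledgments, so the policy reduces to how many of the N opportunities to use. It also assumes independent losses with probability `eps` and that every copy sent arrives before the deadline if it arrives at all. The full decision tree with feedback is richer than this; all names and numbers are illustrative.

```python
def expected_cost(k, eps, delta_d, size_bits, lam):
    """J(k) = eps**k * delta_d + lam * k * size_bits for k blind transmissions.

    eps**k is the probability that all k copies are lost, in which case the
    distortion reduction delta_d is forfeited; k * size_bits is the rate spent.
    """
    return eps**k * delta_d + lam * k * size_bits


def best_num_transmissions(n_opportunities, eps, delta_d, size_bits, lam):
    """Exhaustively pick the number of transmissions with the smallest cost."""
    return min(
        range(n_opportunities + 1),
        key=lambda k: expected_cost(k, eps, delta_d, size_bits, lam),
    )


important = best_num_transmissions(5, eps=0.2, delta_d=100.0, size_bits=1, lam=1.0)
unimportant = best_num_transmissions(5, eps=0.2, delta_d=100.0, size_bits=1, lam=100.0)
print(important, unimportant)
```

Even in this reduced form, packets with a large distortion reduction per bit earn more transmissions, and a large enough multiplier means the packet is not sent at all.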









R-D Optimized Streaming Performance
• Foreman
• 120 frames
• 10 fps, I-P-P-…
• H.263+ 2 Layer SNR scalable
• 20 frame GOP
• Copy Concealment
• 20 % loss forward and back
• Γ-distributed delay
– κ = 10 ms
– μ = 50 ms
– σ = 23 ms
• Pre-roll 400 ms
[Plot: PSNR [dB], 24 to 31, versus bit-rate [kbps], 40 to 140; curves: R-D Optimized and Prioritized ARQ; annotation: ~50 %]
Naive Coding Questions
1. To achieve graceful degradation in case of channel error for a digitally encoded signal, is an embedded signal representation (aka layers, aka data partitioning) always needed?
2. Can one, in general, send refinement information for an analog (i.e. uncoded) signal transmission over a noisy channel?
Digitally Enhanced Analog Transmission
[Block diagram: the signal is sent over an analog (uncoded) channel; in parallel, a Wyner-Ziv encoder sends bits over a digital channel to a Wyner-Ziv decoder, which uses the analog channel output as side info]
• Forward error protection of the signal waveform
• Information-theoretic bounds [Shamai, Verdú, Zamir, 1998]
• “Systematic lossy source-channel coding”
Forward Error Protection of Compressed Video
[Block diagram: S → any old video encoder → error-prone channel → video decoder with error concealment → S’; Wyner-Ziv encoders A and B send additional streams; Wyner-Ziv decoders A and B output S* and S**; the diagram also carries the label "Analog channel (uncoded)"]
[Aaron, Rane, Girod, ICIP 2003]
Graceful degradation without a layered signal representation
Wyner-Ziv MPEG Codec
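[Block diagram: main MPEG encoder → channel → decoder (entropy decoding ED, inverse quantization, inverse transform T⁻¹, motion compensation MC), whose output serves as side information. Wyner-Ziv encoder: a coarse MPEG encoder applied to the reconstructed frame at the encoder, followed by an R-S encoder acting as Slepian-Wolf encoder. Decoder side: a coarse MPEG encoder applied to the side information, followed by an R-S decoder. Signals are labeled S, S’, and S*.]
[Rane, Aaron, Girod, VCIP 2004]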
Graceful Degradation with Forward Error Protection
With FEC:
Main Stream @ 1.092 Mbps
FEC (n,k) = (40,36)
FEC bitrate = 120 kbps
Total = 1.2 Mbps
With FEP:
WZ Stream @ 270 kbps
FEP (n,k) = (52,36)
WZ bitrate = 120 kbps
Total = 1.2 Mbps
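The quoted bit-rates are consistent with simple code-rate arithmetic. The sketch below assumes systematic (n, k) codes and, for the Wyner-Ziv stream, that only the parity symbols are transmitted; the function name is illustrative.

```python
def parity_rate_kbps(payload_kbps: float, n: int, k: int) -> float:
    """Parity bit-rate of a systematic (n, k) code protecting `payload_kbps`."""
    return payload_kbps * (n - k) / k


# Conventional FEC applied to the main stream.
fec_parity = parity_rate_kbps(1092.0, n=40, k=36)
# Parity of the coarse Wyner-Ziv description (the description itself is not sent).
wz_parity = parity_rate_kbps(270.0, n=52, k=36)

print(f"FEC parity: {fec_parity:.1f} kbps, WZ parity: {wz_parity:.1f} kbps")
```

Both schemes therefore spend roughly 120 kbps on protection, but the (52,36) code applied to the coarser description can correct many more symbol errors per codeword than the (40,36) code.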
Visual Comparison of Degradation at Same PSNR
Foreman 50 CIF frames @ symbol error rate = 4 × 10^-4
With FEC: 1 Mbps + 120 kbps (38.32 dB)
With FEP: 1 Mbps + 120 kbps (38.78 dB)
Superior Robustness of FEP
Foreman 50 CIF frames @ symbol error rate = 10^-3
With FEC: 1 Mbps + 120 kbps (33.03 dB)
With FEP: 1 Mbps + 120 kbps (38.40 dB)
Lossy Compression with Side Information
[Diagrams: two systems, each coding source X with an encoder and a decoder that outputs X'; in one system the side information Y is available at both encoder and decoder, in the other Y is available at the decoder only]
[Wyner, Ziv, 1976] For mse distortion and Gaussian statistics, rate-distortion functions of the two systems are the same.
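For reference, the Gaussian case can be written out explicitly. The notation below is an assumption for illustration: R^{WZ} denotes the rate-distortion function with side information at the decoder only, R_{X|Y} the one with side information at both ends, and \sigma^2_{X|Y} the conditional variance of X given Y, for jointly Gaussian X and Y.

```latex
R^{\mathrm{WZ}}_{X|Y}(D) \;=\; R_{X|Y}(D)
\;=\; \max\!\left(0,\ \tfrac{1}{2}\log_2\frac{\sigma^2_{X|Y}}{D}\right)
```

For general sources and distortion measures only the inequality R^{WZ}_{X|Y}(D) \ge R_{X|Y}(D) holds, so ignoring the side information at the encoder can cost rate outside this special case.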
Ultra-Low-Complexity Video Coding
[Block diagram: intraframe encoder and interframe decoder. WZ frames X → scalar quantizer → turbo encoder → buffer → turbo decoder → reconstruction → X’; the decoder sends "request bits" back to the buffer; turbo encoder, buffer, and turbo decoder form the Slepian-Wolf codec. Key frames K → conventional intraframe coding → conventional intraframe decoding → K’ → interpolation/extrapolation → side information Y, fed to the turbo decoder and the reconstruction.]
[Aaron, Zhang, Girod, Asilomar 2002]
[Aaron, Rane, Zhang, Girod, DCC 2003]
R-D Performance
Ultra-Low-Complexity Video Coder
[Plot annotations: 3 dB, 8 dB]
• Sequence: Foreman
• WZ frames - even frames
• Key frames - odd frames
• Side information - motion compensated interpolation of key frames
Ultra-Low-Complexity Video Coder
H.263+ Intraframe Coding: 330 kbps, 32.9 dB
Wyner-Ziv Codec: 274 kbps, 39.0 dB
Ultra-Low-Complexity Video Coder
H.263+ I-B-I-B: 276 kbps, 41.8 dB
Wyner-Ziv Codec: 274 kbps, 39.0 dB
Stanford Camera Array
Courtesy Marc Levoy, Stanford Computer Graphics Lab
Light Field Compression
Wyner-Ziv, Pixel-Domain: Rate 0.11 bpp, PSNR 39.9 dB
JPEG-2000: Rate 0.11 bpp, PSNR 37.4 dB
Conclusions



• Video compression is very important . . . but there is more to video coding than compression
• Rate-scalable video representations: MC lifting breakthrough
• Robust video transmission
– Virtual priority mechanisms by packet scheduling
– RD gains easily larger than from super-clever compression
• Distributed video coding: radically different approach
– Graceful degradation w/o layers
– Ultra-low-complexity coders
• Ubiquitous J = D + λR
Acknowledgments
Anne M. Aaron
Jacob Chakareski
Philip A. Chou
J = D + λR
Markus Flierl
Sang-eun Han
Mark Kalman
Marc Levoy
Yi Liang
Shantanu Rane
David Rebollo-Monedero
Andrew Secker
David Taubman
Thomas Wiegand
Xiaoqing Zhu
Rui Zhang
Progress is a wonderful thing,
if only it would stop . . .
Robert Musil