Distributed Video Coding and Its
Application
Abhik Majumdar, Rohit Puri, Kannan Ramchandran, and
Jim Chou
Presented by Lei Sun
Introduction (1/3)
Contemporary digital video coding architectures have
been driven primarily by the “downlink” broadcast model
of a complex encoder and a multitude of light decoders.
However, with the current proliferation of video devices
that have constrained computing ability, memory, and
battery power, we expect future systems to use multiple
video input and output streams captured by a network
of distributed devices and transmitted over a
bandwidth-constrained, noisy wireless transmission medium.
Introduction (2/3)
System requirements:
• robustness to packet/frame loss caused by channel
transmission errors;
• low-power, light-footprint encoding, due to limited
battery power and/or device memory;
• high compression efficiency, due to both bandwidth and
transmission power limitations.
Introduction (3/3)
• PRISM: a video coding paradigm founded on the principles
of source coding with side information
• A flexible distribution of computational complexity between
encoder and decoder
• High compression efficiency
Background on Source Coding with
Side Information (1/3)
• Let X and Y be 3-bit binary words, each equally likely to take any of the
8 possible values. They are correlated such that the Hamming distance between
them is at most 1. Figure 1 shows two scenarios.
Scenario a: Y is available at both the encoder and the decoder, so X can be
encoded in 2 bits by sending X⊕Y, which takes only 4 possible values.
Figure 1
Scenario b: Y is available only at the decoder. X is encoded as a coset
index, and the decoder uses Y to identify X within the indicated coset.
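Scenario b can be checked exhaustively with a short sketch. It assumes the usual construction for this example, in which the cosets are those of the 3-bit repetition code {000, 111}; the function names are illustrative and do not come from the slides.

```python
from itertools import product

def hamming(a, b):
    """Number of positions where two equal-length bit tuples differ."""
    return sum(p != q for p, q in zip(a, b))

def coset_index(x):
    """2-bit syndrome of a 3-bit word with respect to the code {000, 111} (assumed)."""
    return (x[0] ^ x[1], x[0] ^ x[2])

def decode(index, y):
    """Pick the member of the indexed coset that is closest to side information y."""
    coset = [w for w in product((0, 1), repeat=3) if coset_index(w) == index]
    return min(coset, key=lambda w: hamming(w, y))

# Check every (x, y) pair that satisfies the correlation constraint.
words = list(product((0, 1), repeat=3))
pairs = [(x, y) for x in words for y in words if hamming(x, y) <= 1]
all_recovered = all(decode(coset_index(x), y) == x for x, y in pairs)
print(len(pairs), all_recovered)
```

The two words in each coset are at Hamming distance 3 from each other, so a Y within distance 1 of X is always closer to X than to the other coset member. That is why 2 bits suffice even though the encoder never sees Y.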
Background on Source Coding with
Side Information (2/3)
• Compress two or more correlated sources separately, and decode them
jointly by exploiting the correlation between them
• Slepian-Wolf theorem (lossless case)
• Wyner-Ziv theorem (lossy case)
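For reference, the two results are usually stated as follows. The rate symbols R_X, R_Y, R_WZ(D), and R_{X|Y}(D) are standard textbook notation, not notation taken from the slides.

```latex
% Slepian-Wolf (lossless): X and Y are encoded separately and decoded jointly.
R_X \ge H(X \mid Y), \qquad R_Y \ge H(Y \mid X), \qquad R_X + R_Y \ge H(X, Y)

% Wyner-Ziv (lossy): side information Y is available only at the decoder.
% R_{X|Y}(D) is the rate-distortion function when Y is also known at the encoder.
R_{WZ}(D) \ge R_{X \mid Y}(D)
```

In the Wyner-Ziv case the inequality holds with equality for jointly Gaussian sources under mean squared error distortion, so in that setting nothing is lost by withholding Y from the encoder.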
Background on Source Coding with
Side Information (3/3)
• Figures 2 and 3 show the structure of Wyner-Ziv encoding and
decoding
Figure 2: (a) Encoding consists of quantization followed by a binning
operation that encodes U into a bin (coset) index. (b) Structure of
distributed decoders. Decoding consists of “de-binning” followed by
estimation.
Figure 3: (c) Structure of the codebook bins.
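The quantize-then-bin structure in these captions can be sketched for a single scalar sample. This is a minimal illustration that assumes a uniform quantizer and modulo binning; `step` and `num_bins` are hypothetical parameters, and the final estimation stage of the decoder is left out.

```python
def encode(x, step, num_bins):
    """Quantize x to the nearest multiple of `step`, then keep only the bin index."""
    q = round(x / step)       # quantization index
    return q % num_bins       # binning: discard everything except the coset label

def decode(bin_index, y, step, num_bins):
    """De-binning: return the codeword in the indicated bin that is closest to y."""
    # Codewords in a bin are step * (bin_index + k * num_bins) for integer k.
    k = round((y / step - bin_index) / num_bins)
    return step * (bin_index + k * num_bins)

x = 37.2                                   # source sample, quantized to 36
index = encode(x, step=4, num_bins=4)      # only this index is transmitted
close = decode(index, y=40.0, step=4, num_bins=4)   # side information near x
far = decode(index, y=50.0, step=4, num_bins=4)     # side information too far from x
print(index, close, far)
```

With these numbers the codewords in a bin are 16 apart, so de-binning succeeds while the side information stays within 8 of the quantized value. With y = 40.0 the decoder returns 36; with y = 50.0 it lands on the neighbouring codeword 52.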
Architectural Goals of PRISM
• Compression Performance
  - The current macroblock X is encoded as a bin index rather than in
    full, which reduces the encoding rate.
• Robustness
  - As long as |Y-X| < δ (the step size), the decoder is guaranteed to
    recover the correct output.
• Moving Motion-Search Complexity to the Decoder
  - The receiver is uncertain about the exact state of the side
    information, so motion search must be performed at the decoder.
A Theory for Distributed Video Coding
• Sharing Motion Complexity between Encoder and Decoder
  - A Motion-Compensated Video Model
Figure 4: Motion-indexed additive-innovations model for video signals. X
denotes a block of n pixels in the current frame to be encoded, and
{Y1, Y2, ..., YM} is the set of blocks (each of size n) in the previous
decoded frame corresponding to different values of the motion vector,
indexed by T.
Sharing Motion Complexity between
Encoder and Decoder…
• Motion-Compensated Predictive Coding
  - Step 1: The encoder estimates the motion vector and transmits its
    index T to the decoder.
  - Step 2: Once the decoder knows T, the video coding problem reduces
    to compressing the “source” X using the correlated side information
    YT, which is now available to both the encoder and the decoder.
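The two steps can be sketched on a toy one-dimensional "frame". This is an illustration only: it assumes a sum-of-absolute-differences matching cost and a plain residual, neither of which is specified in the slides, and all names are hypothetical.

```python
def sad(a, b):
    """Sum of absolute differences between two equal-length blocks."""
    return sum(abs(p - q) for p, q in zip(a, b))

def motion_search(block, prev_frame, start, search_range):
    """Return the offset T whose candidate block in prev_frame best matches `block`."""
    n = len(block)
    best_t, best_cost = None, None
    for t in range(-search_range, search_range + 1):
        pos = start + t
        if pos < 0 or pos + n > len(prev_frame):
            continue  # candidate falls outside the frame
        cost = sad(block, prev_frame[pos:pos + n])
        if best_cost is None or cost < best_cost:
            best_t, best_cost = t, cost
    return best_t

prev_frame = [10, 12, 50, 60, 70, 80, 15, 11, 9, 10]
cur_block = [51, 59, 71, 80]   # block located at position 4 in the current frame

# Step 1: the encoder estimates T and would transmit it.
t = motion_search(cur_block, prev_frame, start=4, search_range=3)

# Step 2: with T known on both sides, only the difference from Y_T needs coding.
predictor = prev_frame[4 + t:4 + t + len(cur_block)]
residual = [x - y for x, y in zip(cur_block, predictor)]
print(t, predictor, residual)
```

Here the best match is two samples to the left (T = -2), and the residual [1, -1, 1, 0] is far cheaper to code than the block itself. The search loop is the part of the encoder that a distributed scheme aims to move to the decoder.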
Sharing Motion Complexity between
Encoder and Decoder…
• Distributed Video Coding
  - In this case, due to severely limited processing capability (or for
    some other reason), the encoder is not allowed to perform the complex
    motion-compensated prediction task. In effect, this pretends that the
    encoder does not have access to the previous decoded blocks Y1, ..., YM.
A Theory for Distributed Video Coding
• Robustness to Transmission Errors
  - Discrete Data, Lossless Recovery
    The rate is Rpclb = H(Z) + H(Y|Y'). When either the channel noise or
    the accumulated drift is small, correcting the errors costs few
    additional bits; when they are large, the rate penalty is significant.
  - Jointly Gaussian Data, Recovery with MSE ≤ D
    In general, if the channel noise is too large, the system is akin to
    not sending the block at all.
A Theory for Distributed Video Coding
• Complexity-Performance Trade-Offs
  - Typically, the more complexity is invested in the motion
    estimation process, the more accurate the estimate of the
    statistics, and the better the compression performance.
PRISM: Encoding
• Decorrelating Transform (DCT on source block)
• Quantization
• Classification
• Syndrome Encoding
• Hash Generation
PRISM: Encoding
• Classification
Figure 5: A bit-plane view of a block of 64 coefficients. Bit planes
are arranged in increasing order, with 0 corresponding to the
least-significant bit.
Classification…
• Depending on the available complexity budget, as well as the
prevailing channel conditions, the classification module can
perform varying degrees of motion search, ranging from an
exhaustive motion search, to a coarse motion search, to no
motion search at all.
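One way to picture the classification step is as a lookup from block-to-predictor mismatch to the number of least-significant bit planes that must be syndrome-coded. The sketch below rests on assumptions: the use of mean squared error, the threshold values, and the bit-plane table are all hypothetical and are not taken from the slides. The predictor passed in may be the co-located block (no motion search) or the result of a coarse or exhaustive search.

```python
THRESHOLDS = [4.0, 16.0, 64.0, 256.0]  # hypothetical MSE class boundaries
LSB_PLANES = [0, 1, 2, 3, 4]           # hypothetical bit planes to code, per class

def block_mse(block, predictor):
    """Mean squared error between a block and a candidate predictor."""
    return sum((a - b) ** 2 for a, b in zip(block, predictor)) / len(block)

def classify(block, predictor):
    """Return (class index, number of least-significant bit planes to code)."""
    mse = block_mse(block, predictor)
    cls = sum(mse >= t for t in THRESHOLDS)  # count of thresholds reached
    return cls, LSB_PLANES[cls]

block = [51, 59, 71, 80]
well_matched = classify(block, [50, 60, 70, 80])     # MSE 0.75
poorly_matched = classify(block, [55, 55, 75, 76])   # MSE 16.0
print(well_matched, poorly_matched)
```

A better predictor puts the block in a lower class and so costs fewer bits. This is the trade-off between encoder search effort and rate that the slide describes.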
PRISM: Encoding
• Hash Generation
  - A hash signature of the quantized codeword sequence is a
    practical way to let the decoder identify the “best” predictor
    for the block X.
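The role of the hash in decoder-side motion search can be sketched as follows. The example makes several simplifying assumptions: CRC-32 stands in for the hash, modulo binning stands in for syndrome coding, and the transform stage is skipped. All function names and parameter values are illustrative.

```python
import zlib

def quantize(block, step):
    """Uniform quantization of each sample to an integer index."""
    return [round(v / step) for v in block]

def hash_of(indices):
    """CRC-32 of the quantization indices (stand-in for the hash signature)."""
    return zlib.crc32(",".join(map(str, indices)).encode())

def encode_block(block, step, num_bins):
    """Send only the bin index of each sample, plus a hash of the full indices."""
    q = quantize(block, step)
    return [i % num_bins for i in q], hash_of(q)

def decode_block(bins, sig, candidates, step, num_bins):
    """Try each candidate predictor; accept the first whose decoding matches the hash."""
    for y in candidates:
        q = [b + num_bins * round((v / step - b) / num_bins) for b, v in zip(bins, y)]
        if hash_of(q) == sig:
            return [step * i for i in q]
    return None  # no candidate passed the hash check

bins, sig = encode_block([51, 59, 71, 80], step=4, num_bins=4)
candidates = [[10, 12, 50, 60], [50, 60, 70, 80]]   # decoder's motion-search candidates
decoded = decode_block(bins, sig, candidates, step=4, num_bins=4)
print(bins, decoded)
```

The first candidate decodes to the wrong codewords and fails the hash check. The second is close enough to the source block to decode to the quantized block [52, 60, 72, 80], so the decoder finds the usable predictor without the encoder sending a motion vector.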
PRISM: Encoding
Figure 6: Bit stream associated with a block.
Figure 7: Functional block diagram of the encoder.
PRISM: Decoding
Figure 8: Functional block diagram of the decoder.
Simulation Results
Figure 9: Encoding rate comparison
Figure 10: Packet drop rate comparison
Figure 11: Frame number comparison
Summary
• PRISM is a practical video coding framework built on
distributed source coding principles. Based on a generalization
of the classical Wyner-Ziv setup, PRISM is characterized by
inherent system uncertainty about the “state” of the relevant
side information that is known at the decoder. The two main
architectural goals of PRISM make it radically different from
existing video codecs.