Basics of Image and Video Compression

Adaptive Rate-Distortion Based
Wyner-Ziv Video Coding
Lina Karam
Image, Video, and Usability (IVU) Lab
Department of Electrical Engineering
Arizona State University
Tempe, AZ 85287
ivulab.asu.edu
1
Outline
• Motivation
• Existing DVC Approaches
• BLAST-DVC: Rate-distortion based BitpLane
SelecTive decoding for pixel-domain Distributed
Video Coding
• AQT-DVC: Rate-distortion based Adaptive
QuanTization for transform-domain Distributed
Video Coding
• Enhanced AQT-DVC
• Conclusion and future directions
2
Motivation
Mother and Daughter
CIF – 352 x 288
Spatial and Temporal Redundancy
Frame 60
Frame 61
Time
3
Motion Estimation and Compensation
CIF Mother & Daughter
Reference Frame (Frame 197)
Current Frame (Frame 198)
4
Residual Error (No Motion Compensation)
[Figure: Difference (Residual) Frame = Frame 198 – Frame 197]
5
Motion Estimation and Compensation
CIF Mother & Daughter
Reference Frame (Frame 197)
Current Frame (Frame 198) =
Reference Frame + Error
6
Full Search Motion Estimation
[8x8] block motion vectors superimposed on Reference Frame (Frame 197)
7
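As an illustration of full-search motion estimation, the sketch below exhaustively tests every integer displacement in a small window and keeps the one with the lowest sum of absolute differences (SAD) for each block. The function names, the SAD criterion, and the `search` window parameter are assumptions made for this example; the slides do not state which matching cost was used.

```python
def sad(cur, ref, cy, cx, ry, rx, bs):
    """Sum of absolute differences between a bs x bs block of `cur` at (cy, cx)
    and a bs x bs block of `ref` at (ry, rx). Frames are lists of pixel rows."""
    total = 0
    for dy in range(bs):
        for dx in range(bs):
            total += abs(cur[cy + dy][cx + dx] - ref[ry + dy][rx + dx])
    return total


def full_search(cur, ref, bs=8, search=4):
    """Return {(block_row, block_col): (dy, dx)} with the SAD-minimizing
    integer displacement into `ref` for every bs x bs block of `cur`."""
    h, w = len(cur), len(cur[0])
    vectors = {}
    for cy in range(0, h - bs + 1, bs):
        for cx in range(0, w - bs + 1, bs):
            # Start from zero motion so ties favor the shortest vector.
            best = (sad(cur, ref, cy, cx, cy, cx, bs), 0, 0)
            for dy in range(-search, search + 1):
                for dx in range(-search, search + 1):
                    ry, rx = cy + dy, cx + dx
                    # Skip candidates that fall outside the reference frame.
                    if 0 <= ry <= h - bs and 0 <= rx <= w - bs:
                        cost = sad(cur, ref, cy, cx, ry, rx, bs)
                        if cost < best[0]:
                            best = (cost, dy, dx)
            vectors[(cy // bs, cx // bs)] = (best[1], best[2])
    return vectors
```

The cost grows with the square of the search range for every block, which is why motion estimation dominates encoder complexity.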
Motion Compensation
Motion Compensated Reference (Frame 197)
PSNR = 40.8 dB, MSE = 5.4
8
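The PSNR and MSE figures quoted on these slides are tied together by the standard relation PSNR = 10 log10(peak^2 / MSE). A minimal sketch, assuming 8-bit video (peak value 255, which is an assumption not stated on the slide):

```python
import math


def psnr_from_mse(mse, peak=255.0):
    """PSNR in dB for a given mean squared error and peak signal value."""
    return 10.0 * math.log10(peak * peak / mse)
```

With this relation, an MSE of 5.4 corresponds to roughly 40.8 dB, matching the value quoted for the motion-compensated reference.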
Residual Error (No Motion Compensation)
[Figure: residual frame without motion compensation]
9
Residual Error (16x16 blocks, full pixel)
[Figure: residual frame after motion compensation]
PSNR = 39.4 dB, MSE = 7.5
10
Residual Error (4x4 blocks, quarter pixel)
[Figure: residual frame after motion compensation]
PSNR = 45 dB, MSE = 2.1
11
Variable block size (16x16 – 4x4) + quarter-pel + multi-frame motion compensation + R-D optimization (H.264, 2004)
85%
12
So, what is the problem?
13
Power profile of H.264 with QCIF@15fps (H.264 Decoder)
[Pie chart]
• Deblocking filter: 34%
• Motion compensation: 29%
• Misc.: 18%
• Intra predictor: 10%
• Syntax parser and CAVLC+IQ+IZZ+IDCT: 5% and 4% (the extracted labels do not show which slice is which)
From: T.-A. Liu, T.-M. Lin, S. -Z. Wang, et al. “A low-power dual-mode video decoder for mobile applications,”
IEEE Communications Magazine, volume 44, issue 8, pp.119-126, Aug. 2006.
• The encoder performs both Motion Estimation and Motion Compensation
• Motion Estimation is much more computationally complex and consumes much more power than Motion Compensation
14
Distributed Video Coding: Motivation
Conventional video coding
• MPEGx or H.26x
• High complexity video encoder due to motion estimation.
Emerging applications
• Video compression with mobile devices
‒ Low complexity video encoder is preferred to reduce
the hardware cost and to extend battery life.
• Video compression for sensor networks
‒ Low complexity video encoder is also preferred to
reduce the hardware cost and to extend battery life.
‒ Inter–sensor communication may not be allowed or
needs to be minimized.
Two main frameworks
• Multi-View/Multi-Cameras
• Single-View/Single Camera (Wyner-Ziv Video Coding)
15
Distributed Video Coding: Objectives
Intraframe encoding and interframe decoding
• Move complexity (motion estimation) from encoder to decoder
• Achieve interframe compression rate-distortion performance
Distributed source coding
• Compress consecutive frames separately
• Decode the frames jointly at the decoder
• Motivated by the work of Slepian-Wolf (1973) and Wyner-Ziv
(1976)
‒ Slepian-Wolf : possible to compress losslessly two statistically
dependent sources in a distributed fashion at a rate equal to their
joint entropy
‒ Wyner-Ziv: possible to compress in a distributed fashion and
achieve the same rate-distortion performance as when coding in a
non-distributed fashion (Gaussian memoryless sources and mean-square error distortion).
16
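To make the Slepian-Wolf statement concrete, the sketch below works through a toy case that is an assumption of this example, not taken from the slides: X is a uniform binary source and the side information Y equals X flipped with probability p. Separate encoding of X then needs only H(X|Y) = h2(p) bits per symbol when Y is available at the decoder, and the total rate H(Y) + H(X|Y) equals the joint entropy H(X, Y).

```python
import math


def h2(p):
    """Binary entropy function in bits."""
    if p <= 0.0 or p >= 1.0:
        return 0.0
    return -p * math.log2(p) - (1.0 - p) * math.log2(1.0 - p)


def slepian_wolf_rates(p):
    """Rates for X ~ uniform bit, Y = X xor Bernoulli(p) noise.
    Returns (rate for Y alone, rate for X given Y at the decoder, joint entropy)."""
    rate_y = 1.0               # H(Y): Y is also a uniform bit
    rate_x_given_y = h2(p)     # H(X|Y): uncertainty left once Y is known
    return rate_y, rate_x_given_y, rate_y + rate_x_given_y
```

For p near 0.11 the source X can be sent at about half a bit per symbol instead of one full bit, even though the encoder never sees Y.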
How can we do this?
17
Distributed Video Coding (DVC): How?
Back to Mother & Daughter…
Reference Frame (Frame 197)
Current Frame (Frame 198) =
Reference Frame + “Error”
DVC problem becomes: Correct or Reduce “Error” without using
Motion Estimation at the encoder and without knowing what the
“Error” is!
Similar to a channel coding problem => can make use of channel codes
18
Distributed Video Coding (DVC): Example
QCIF (176x144) Foreman
•Encoder:
- Frames 1, 3, 5 (odd-numbered): intra-coded
- Frames 2, 4 (even-numbered): parity bits or syndrome bits
•Decoder:
- Recovers even frames from intra-coded odd-numbered frames
- Odd-numbered frames are considered to be a distorted version of
even-numbered frames; i.e., Frame 2n = Frame 2n-1 + “Error”
- “Error” corrected using parity bits or syndrome bits
19
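The idea of correcting the “Error” with syndrome bits can be illustrated on a toy scale. The sketch below uses a Hamming(7,4) code, which is a stand-in chosen for this example (the systems discussed in the talk use turbo or LDPCA codes): the encoder sends only the 3-bit syndrome of a 7-bit word x, and the decoder recovers x from its side information y as long as y differs from x in at most one bit. All function names are hypothetical.

```python
def syndrome(bits):
    """Hamming(7,4) syndrome of a 7-bit word, as an integer in 0..7.
    The parity-check matrix has the binary representation of j as column j,
    so the syndrome is the XOR of the 1-based positions of the set bits."""
    s = 0
    for pos, b in enumerate(bits, start=1):
        if b:
            s ^= pos
    return s


def sw_encode(x):
    """Encoder: transmit 3 syndrome bits instead of the 7 source bits."""
    return syndrome(x)


def sw_decode(s_x, y):
    """Decoder: the XOR of the two syndromes is the syndrome of the error
    pattern x xor y; for a single-bit error it is the error position."""
    diff = s_x ^ syndrome(y)
    x_hat = list(y)
    if diff:
        x_hat[diff - 1] ^= 1
    return x_hat
```

Here 7 bits are conveyed with 3 transmitted bits, without the encoder ever knowing where the side information is wrong.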
Distributed Video Coding (DVC): Example
•Issue 1: “Error” can be large => need to send a lot of parity bits
=> large bitrate
[Figure: Frame 55 and Frame 56]
• Strategy: at the decoder, try to reconstruct even frames using
received odd frames (e.g., bi-directional motion-compensated
interpolation).
Distributed Video Coding (DVC): Example
•Decoder: Side Information Generation
QCIF (176x144) Foreman
- Frame 2 is interpolated from Frame 1 and Frame 3
- Frame 4 is interpolated from Frame 3 and Frame 5
Interpolated frames are called “side information”
•Issue 2: How to generate high-quality side information?
•Issue 3: How do we determine the number of needed parity or
syndrome bits ?
- Sending too much will waste bits
- Sending too little might leave large distortions uncorrected
Existing Approaches
PRISM (Puri et al., IEEE Trans. IP, Oct 2007)
• Syndrome-based Wyner-Ziv Coding by dividing codeword space
into cosets
• After quantization, bitplane representation used (e.g., 1 0 1 1 0 0 1 1)
• Most significant bits can be inferred from side information
• Least significant bits (syndrome bits) need to be encoded and sent
to decoder
• Issues:
- Syndrome coding rate is fixed in advance
- Coding can stop if CRC check fails => correctness not
guaranteed
- Coding performance decreases significantly if the source
statistics are unknown. In practice, the source correlation is not
known in advance and is hard to estimate
22
Existing Approaches
Feedback-channel-based DVC by Aaron et al. 2004, Girod et al., 2005
• Bitplane coding
• Rate-Compatible Punctured Turbo (RCPT) codes used to
generate parity bits (Slepian-Wolf coding) for each bitplane
• Feedback channel used to request parity bits based on need
• No need to determine number of parity bits to send in advance
• Hybrid FEC/ARQ–like scheme
‒ Feedback channel is to acknowledge the decoding
correctness (e.g., CRC can be used to check correctness)
‒ Bitrate is determined on the fly.
‒ Decoding success can be guaranteed.
23
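Bitplane coding, used by both PRISM and the feedback-channel schemes, splits each quantized value into its binary digits and codes all most significant bits together, then all next bits, and so on. A minimal sketch with hypothetical function names:

```python
def extract_bitplanes(values, num_bits):
    """Return the bitplanes of non-negative integers, most significant first.
    Plane k holds one bit from every value."""
    return [[(v >> k) & 1 for v in values] for k in range(num_bits - 1, -1, -1)]


def merge_bitplanes(planes):
    """Inverse of extract_bitplanes: rebuild the integers from MSB-first planes."""
    values = [0] * len(planes[0])
    for plane in planes:
        values = [(v << 1) | b for v, b in zip(values, plane)]
    return values
```

For example, the value 179 contributes the bits 1 0 1 1 0 0 1 1 to the eight planes, from most to least significant.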
Existing Approaches: Feedback-based DVC (Girod’s Group)
[Block diagram: intraframe encoder, interframe decoder]
• Wyner-Ziv encoder: Wyner-Ziv frames S -> DCT -> Xk -> quantizer with 2^Mk levels -> qk -> extract bitplanes (bitplane 1, bitplane 2, ..., bitplane Mk) -> RCPT Slepian-Wolf encoder -> buffer
• Wyner-Ziv decoder: Slepian-Wolf decoder -> reconstruction -> IDCT -> decoded Wyner-Ziv frames S’; the decoder sends “Request bits” back to the encoder buffer
• Side information: Key frames K -> conventional intraframe encoder -> conventional intraframe decoder -> decoded Key frames K’ -> side information generation (MCTI) -> Ŝ -> DCT -> side information X̂k, fed to the Slepian-Wolf decoder and to the reconstruction
• For pixel-domain, no DCT, IDCT
24
Existing Approaches: DISCOVER (Artigas et al., PCS 2007)
Significant R-D performance improvement
[Block diagram: intraframe encoder, interframe decoder]
• Wyner-Ziv encoder: Wyner-Ziv frames S -> DCT -> Xk -> quantizer with 2^Mk levels -> qk -> extract bitplanes (bitplane 1, bitplane 2, ..., bitplane Mk) -> LDPCA* Slepian-Wolf encoder -> buffer
• Wyner-Ziv decoder: Slepian-Wolf decoder -> reconstruction -> IDCT -> decoded Wyner-Ziv frames S’; the decoder sends “Request bits” back to the encoder buffer
• Side information: Key frames K -> conventional intraframe encoder -> conventional intraframe decoder -> decoded Key frames K’ -> side information generation (hierarchical subpixel ME with smoothing filter) -> Ŝ -> DCT -> side information X̂k
* LDPCA provided by Girod’s Group – Varodayan et al., 2006
25
Issues with Existing Approaches
• Issue 1: Existing DVC schemes do not adapt the Slepian-Wolf
decoding to the local characteristics of the video => every
bitplane is Slepian-Wolf decoded based on bit budget starting
from MSB to LSB.
- Decoding stops when no error detected or when bit budget
exhausted. Some important locations and bitplanes might not
be decoded!
Question:
Can we skip some less important regions and bitplanes without
decoding them?
How do we measure the significance of a bitplane?
Issues with Existing Approaches
•Issue 2: Existing DVC schemes do not adapt the quantization to
the local characteristics of the video => During the encoding, a
single quantizer matrix (one fixed quantizer for each subband) is
selected for the whole video.
Question:
Can we adapt the quantization matrix to the local characteristics
of the video so as to minimize the bits needed for LDPCA decoding while maximizing the quality?
Proposed Strategy
• Divide each video frame into partitions in order to exploit local
characteristics
• Allocate bits to a partition only if they result in sufficient
distortion reduction
- Determined using Distortion-Rate (D-R) ratios: D-R = ΔD/R,
where ΔD = distortion reduction resulting from allocating R
bits.
• Minimum allowed distortion reduction per bit is specified in
terms of a target Distortion-Rate (D-R) ratio T_D-R
- Allocate bits only if ΔD/R of the partition is > T_D-R
• ΔD/R is an indication of how much distortion reduction (quality)
a bit can buy us on average for the considered partition
• Bits can be allocated to a partition via Slepian-Wolf (LDPCA)
decoding and/or by selecting the quantization matrix
• Target T_D-R is used to control the bit-rate: set low for high bit-rate
coding, and high for low bit-rate coding
28
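The allocation rule of the proposed strategy can be sketched as a simple filter: a partition receives its bits only if the distortion reduction per bit exceeds the target ratio. The data layout (tuples of name, distortion reduction, and bits) and the function name are assumptions made for this illustration; in the actual system both quantities are estimates computed at the decoder.

```python
def allocate_bits(partitions, target_dr):
    """partitions: list of (name, delta_d, bits), where delta_d is the estimated
    distortion reduction obtained by spending `bits` on that partition.
    Returns the names of the partitions that get bits and the total bits spent."""
    chosen, total_bits = [], 0
    for name, delta_d, bits in partitions:
        # Spend bits only where each bit buys more than target_dr of distortion.
        if bits > 0 and delta_d / bits > target_dr:
            chosen.append(name)
            total_bits += bits
    return chosen, total_bits
```

Raising the target ratio excludes more partitions and lowers the total bit count, which is how a single threshold steers the operating point.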
Challenge: How to Measure Distortion-Rate Ratio?
• The original source information is not available at the
decoder, so the distortion D cannot be exactly measured.
• The bitrate R cannot be known without decoding.
• Proposed Approach: Distortion-Rate Ratio estimation
performed at the decoder using the side information frames
and the source correlation model
‒ The complexity of the encoder is not increased
‒ More flexibility, as the decoder can selectively decode
the bitplanes based on a target distortion-rate ratio.
The target distortion-rate ratio can be changed so that
different R-D operating points can be achieved.
‒ Error probability needs to be estimated at decoder
29
BLAST-DVC: Pixel-Domain BiTpLAne SelecTive Decoding
[Block diagram: Wyner-Ziv frame encoder and Wyner-Ziv frame decoder]
• Wyner-Ziv Frame Encoder: Wyner-Ziv frames Xi -> divide into sub-images Xi,1, Xi,2, ..., Xi,M -> extract bitplanes xi,m,1, xi,m,2, ..., xi,m,k -> LDPCA encoder and CRC generator -> buffer (parity bits + CRC bits); block indices decoding identifies the requested sub-image bitplanes
• Wyner-Ziv Frame Decoder: Key frames X̂i-1 and X̂i+1 -> motion compensated interpolation -> side information X’i -> divide into sub-images X’i,1, X’i,2, ..., X’i,M (with the co-located Key frame sub-images X̂i-1,m and X̂i+1,m) -> rate-distortion ratio estimation -> block indices encoding -> request bits by block indices
• For each sub-image m: LDPCA decoder on bitplanes x’i,m,k -> minimum distance symbol reconstruction -> minimum-distortion pixel reconstruction -> X̂i,m; merge sub-images -> decoded Wyner-Ziv frames X̂i
30
BLAST-DVC: Distortion-Rate Ratio Estimation
Source Correlation Model
• Let D be the difference of the source information X and its side
information Xside.
• D can be modeled as a random variable with a Laplacian distribution.
$$
P(D = d) =
\begin{cases}
\int_{-\infty}^{-254.5} 0.5\,\alpha \exp(-\alpha |x|)\,dx, & \text{if } d = -255 \\
\int_{d-0.5}^{d+0.5} 0.5\,\alpha \exp(-\alpha |x|)\,dx, & \text{if } -255 < d < 255 \\
\int_{254.5}^{\infty} 0.5\,\alpha \exp(-\alpha |x|)\,dx, & \text{if } d = 255
\end{cases}
$$
• α can be estimated from the co-located blocks of two motion-compensated Key frames X̂i-1 and X̂i+1 (Brites et al., 2006).
$$
\hat{\alpha}_m^2 = \frac{2}{\hat{\sigma}_m^2}, \qquad
\hat{\sigma}_m^2 = \frac{1}{N} \sum_{n=1}^{N} \left( \hat{X}_{i-1,m,n} - \hat{X}_{i+1,m,n} \right)^2,
$$
where m = partition index and n is the pixel location in the partition
31
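The source correlation model of the preceding slide can be sketched numerically: a Laplacian density with parameter α is integrated over unit-width bins to give the probability of each integer difference d between -255 and 255, and α is estimated per partition from the mean squared difference of the two co-located motion-compensated Key frame blocks. The function names and the `min_var` floor that avoids division by zero are assumptions of this example.

```python
import math


def laplacian_cdf(x, alpha):
    """CDF of the density 0.5 * alpha * exp(-alpha * |x|)."""
    if x < 0:
        return 0.5 * math.exp(alpha * x)
    return 1.0 - 0.5 * math.exp(-alpha * x)


def difference_pmf(d, alpha):
    """P(D = d) for integer d in [-255, 255]; the two end bins absorb the tails."""
    if d <= -255:
        return laplacian_cdf(-254.5, alpha)
    if d >= 255:
        return 1.0 - laplacian_cdf(254.5, alpha)
    return laplacian_cdf(d + 0.5, alpha) - laplacian_cdf(d - 0.5, alpha)


def estimate_alpha(prev_block, next_block, min_var=1e-3):
    """alpha = sqrt(2 / variance), with the variance taken as the mean squared
    difference of co-located pixels (min_var is a hypothetical safeguard)."""
    n = len(prev_block)
    var = sum((a - b) ** 2 for a, b in zip(prev_block, next_block)) / n
    return math.sqrt(2.0 / max(var, min_var))
```

A large α means the side information is expected to be close to the source, so few bits should be needed to correct it.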
BLAST-DVC: Rate Estimation
• Let P_{n,k} be the error probability at a pixel n in bitplane k in the considered partition.
• Average of the error probabilities P_{n,k} over the sub-image:
$$
\bar{P}_k = \frac{1}{N} \sum_{n=1}^{N} P_{n,k}.
$$
• The needed bits for the considered kth bitplane can be
computed as:
$$
R_k = -\left( \bar{P}_k \log_2 \bar{P}_k + (1 - \bar{P}_k) \log_2 (1 - \bar{P}_k) \right) \times N + R_{CRC}
$$
32
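The rate estimate amounts to the binary entropy of the average bit-error probability, multiplied by the number of pixels in the sub-image, plus the CRC overhead. A minimal sketch, assuming base-2 logarithms so that the result is in bits; the function name and argument layout are hypothetical:

```python
import math


def estimate_bitplane_rate(error_probs, crc_bits):
    """Estimated bits needed to correct one bitplane of a sub-image.
    error_probs: per-pixel probabilities that the side-information bit is wrong.
    crc_bits: number of CRC bits sent for the bitplane."""
    n = len(error_probs)
    p = sum(error_probs) / n
    if p <= 0.0 or p >= 1.0:
        entropy = 0.0  # a certain outcome carries no information
    else:
        entropy = -(p * math.log2(p) + (1.0 - p) * math.log2(1.0 - p))
    return entropy * n + crc_bits
```

When the side information is expected to be nearly always right, the entropy term is small and the bitplane costs little more than its CRC.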
BLAST-DVC: Error Probability Estimation
The probability of bit error can be expressed as:
$$
P_{n,k} = P(b_{n,k} = 1,\ b'_{n,k} = 0 \mid b'_{n,r} = b_{n,r},\ r \in \text{DBPs},\ 1 \le r \le k-1)
        + P(b_{n,k} = 0,\ b'_{n,k} = 1 \mid b'_{n,r} = b_{n,r},\ r \in \text{DBPs},\ 1 \le r \le k-1)
$$
where b_{n,k} and b'_{n,k} denote a bit in the kth bitplane corresponding to
the nth pixel in the original sub-image and in the side information
(generated through motion compensated interpolation), respectively.
DBP stands for Decoded Bit Planes.
33
BLAST-DVC: Error Probability Estimation
$$
P(b_{n,k} = 1 \mid b'_{n,r} = b_{n,r},\ r \in \text{DBPs},\ 1 \le r \le k-1) =
\frac{\sum_{s=1}^{S} \sum_{d = B_{n,k,s} - 0.5 - X'_{n,k-1}}^{U_{n,k,s} + 0.5 - X'_{n,k-1}} P(D = d)}
     {\sum_{s=1}^{S} \sum_{d = L_{n,k,s} - 0.5 - X'_{n,k-1}}^{U_{n,k,s} + 0.5 - X'_{n,k-1}} P(D = d)}
$$
and
$$
P(b_{n,k} = 0 \mid b'_{n,r} = b_{n,r},\ r \in \text{DBPs},\ 1 \le r \le k-1) =
\frac{\sum_{s=1}^{S} \sum_{d = L_{n,k,s} - 0.5 - X'_{n,k-1}}^{B_{n,k,s} - 0.5 - X'_{n,k-1}} P(D = d)}
     {\sum_{s=1}^{S} \sum_{d = L_{n,k,s} - 0.5 - X'_{n,k-1}}^{U_{n,k,s} + 0.5 - X'_{n,k-1}} P(D = d)}
$$
34
BLAST-DVC: Distortion Estimation
• Estimate distortion reduction if the target bitplane is
decoded.
$$
\Delta D_k = D_k - \hat{D}_k
$$
- ΔD_k: distortion reduction
- D_k: average distortion if the target bitplane is not LDPCA decoded
- D̂_k: average distortion if the target bitplane is LDPCA decoded
• Average distortion estimation for a sub-image
$$
D_k = \sum_{n=1}^{N} E\left[ (X_n - X'_{n,k-1})^2 \right]; \qquad
\hat{D}_k = \sum_{n=1}^{N} E\left[ (X_n - \text{Recon}(X_n, X'_{n,k-1}))^2 \right]
$$
- X'_{n,k-1}: partially reconstructed pixel value based on the previously
determined k-1 bitplanes and side info
- Recon(X_n, X'_{n,k-1}): partially reconstructed pixel value
when the target bitplane is LDPCA-decoded =>
minimum distance symbol reconstruction is used
35
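Minimum Distortion Reconstruction
[Figure: three cases showing the original X_n, the side-info-based value X'_{n,k-1} (Laplacian RV), the reconstruction X'_{n,k}, and the quantization step Δ_k]
$$
X'_{n,k} = \text{Recon}(X_n, X'_{n,k-1}) =
\begin{cases}
\left\lfloor \frac{X_n}{\Delta_k} \right\rfloor \Delta_k + \Delta_k - 1, & \text{if } \left\lfloor \frac{X'_{n,k-1}}{\Delta_k} \right\rfloor > \left\lfloor \frac{X_n}{\Delta_k} \right\rfloor \\
X'_{n,k-1}, & \text{if } \left\lfloor \frac{X'_{n,k-1}}{\Delta_k} \right\rfloor = \left\lfloor \frac{X_n}{\Delta_k} \right\rfloor \\
\left\lfloor \frac{X_n}{\Delta_k} \right\rfloor \Delta_k, & \text{if } \left\lfloor \frac{X'_{n,k-1}}{\Delta_k} \right\rfloor < \left\lfloor \frac{X_n}{\Delta_k} \right\rfloor
\end{cases}
$$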
36
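The minimum-distortion reconstruction keeps the side-information-based value when it already lies in the quantization bin of the source and otherwise clamps it to the nearest edge of that bin. The sketch below assumes this usual reading of the three-case rule, with integer pixel values and a bin width `delta`; the function name is hypothetical.

```python
def recon(x, side, delta):
    """Reconstruct a pixel from the side-information-based value `side`.
    Only the bin index x // delta of the source is used, which is exactly
    what the decoded bitplanes provide at the decoder."""
    bin_x = x // delta
    bin_side = side // delta
    if bin_side > bin_x:
        return bin_x * delta + delta - 1  # side info above the bin: upper edge
    if bin_side < bin_x:
        return bin_x * delta              # side info below the bin: lower edge
    return side                           # same bin: trust the side information
```

Whatever the side information is, the reconstruction error is bounded by the bin width minus one once the bin of the source is known.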
Distortion Estimation – Bitplane not decoded
$$
D_k = \sum_{n=1}^{N} E\left[ (X_n - X'_{n,k-1})^2 \mid b'_{n,r} = b_{n,r},\ r \in \text{DBPs},\ 1 \le r \le k-1 \right]
    = \sum_{n=1}^{N} \frac{\sum_{s=1}^{S} \sum_{y = L_{n,k,s} - 0.5}^{U_{n,k,s} + 0.5} P(X_n = y)\,(y - X'_{n,k-1})^2}
                           {\sum_{s=1}^{S} \sum_{y = L_{n,k,s} - 0.5}^{U_{n,k,s} + 0.5} P(X_n = y)};
\quad S = 2^p \text{ and } p = \text{no. of NDBPs}
$$
[Figure: P(X | Xside) over pixel values 0 to 255, with bins 00 and 01 delimited by L1, B1 and U1; the error y - Xside is shown for a value y in each bin]
Consider that the MSB is 0 and we want to determine the next bit
=> Next bit is 1 (estimated value)
37
Distortion Estimation – Bitplane LDPCA-decoded
Consider that the MSB is 0 and we want to determine the next bit
$$
\hat{D}_k = \sum_{n=1}^{N} E\left[ (X_n - \text{Recon}(X_n, X'_{n,k-1}))^2 \mid b'_{n,r} = b_{n,r},\ r \in \text{DBPs},\ 1 \le r \le k-1 \right]
          = \sum_{n=1}^{N} \frac{\sum_{s=1}^{S} \sum_{y = L_{n,k,s} - 0.5}^{U_{n,k,s} + 0.5} P(X_n = y)\,(y - \text{Recon}(y, X'_{n,k-1}))^2}
                                 {\sum_{s=1}^{S} \sum_{y = L_{n,k,s} - 0.5}^{U_{n,k,s} + 0.5} P(X_n = y)}
$$
[Figure: P(X | Xside) over pixel values 0 to 255, with bins 00, 01, 10 and 11 and the limits L1, B1 and U1; the error y - Recon(y, Xside) is shown for a value y in bin 00 and in bin 01. If y is in bin 01, Recon(y, Xside) is Xside]
38
Bitplane Decoding Selection
Once the rate Rk and the distortion reduction ΔDk are
obtained, a targeted distortion-rate ratio t can be chosen to
determine whether bitplane decoding should be performed.
If ΔDk / Rk < t , the current bitplane is not decoded (NDBP
case)
If ΔDk / Rk ≥ t , CRC bits are requested followed
progressively by parity/syndrome bits, one
parity/syndrome bit at a time, so that error correction can
be applied to the current sub-image bitplane by means of
LDPCA until no errors are detected (DBP case).
39
Proposed BLAST-DVC
[Block diagram: the BLAST-DVC Wyner-Ziv frame encoder and decoder, as on the earlier slide “BLAST-DVC: Pixel-Domain BiTpLAne SelecTive Decoding”]
40
Simulation Setup
QCIF Video Sequences (176x144)
Frame rate: 15 frames per second.
Number of partitions per frame = 64 (22x18 each)
Comparison with following systems:
• H.264 Inter : I-B-I-B
• H.264 Intra only
• DISCOVER by X. Artigas et al.
‒ Transform domain DVC, GOP = 2.
• PDDVC (non-adaptive best pixel-domain system)
‒ Pixel domain DVC, GOP =2.
‒ Special case of the proposed system but no partitions (1
partition per frame)
42
Simulation Results
[Figure: PSNR (dB) versus bitrate (kbps) for Salesman and Hall Monitor; curves: H.264 Inter, BLAST-DVC, DISCOVER, H.264 Intra, PDDVC. Annotations on the plots: 22% reduction, 18% reduction, 1.6 dB, 2.0 dB]
43
Simulation Results
[Figure: PSNR (dB) versus bitrate (kbps) for Container and Foreman; curves: H.264 Inter, BLAST-DVC, DISCOVER, H.264 Intra, PDDVC. Annotations on the plots: 18% reduction, 1.4 dB, 18% reduction, 0.8 dB]
44
Visual Testing Setup
9 subjects took the test.
Two video sequences are randomly placed side by
side on a 19” Dell Ultrasharp screen.
Score
• 1: DISCOVER is much better than BLAST DVC
• 2: DISCOVER is better than BLAST DVC
• 3: same quality
• 4: DISCOVER is worse than BLAST DVC
• 5: DISCOVER is much worse than BLAST DVC
45
Visual testing
[Figure: mean opinion scores (1 to 5) versus operating point (A to D) for Hall Monitor and Foreman]

Operating Point            A        B        C        D
Average Bitrate (kbps)
  DISCOVER               73.60    97.64   167.01   293.73
  BLAST                  71.43    97.62   166.48   291.63
Average PSNR (dB)
  DISCOVER               28.71    29.93    32.38    35.51
  BLAST                  28.19    29.34    31.68    34.59

Operating Point            A        B        C        D
Average Bitrate (kbps)
  DISCOVER               87.62   100.28   140.38   208.25
  BLAST                  83.53    89.69   121.57   185.45
Average PSNR (dB)
  DISCOVER               31.48    32.07    34.31    37.27
  BLAST                  31.49    32.02    34.29    37.25
46
DISCOVER
Frame bits: 5.34 kbits.
Frame PSNR : 33.21 dB
Proposed System
Frame bits: 3.36 kbits.
Frame PSNR: 32.89 dB.
Sequence average bitrate is 140.38 kbps and average PSNR is 34.31 dB for DISCOVER.
Sequence average bitrate is 121.57 kbps and average PSNR is 34.29 dB for the proposed system.
47
DISCOVER
Frame bits: 8.61 kbits
Frame PSNR: 33.16 dB
Proposed System
Frame bits: 5.83 kbits
Frame PSNR: 31.84 dB
Sequence average bitrate is 167.01 kbps, and average PSNR is 32.38 dB for DISCOVER.
Sequence average bitrate is 166.48 kbps, and average PSNR is 31.68 dB for the proposed system.
48
DISCOVER
BLAST-DVC
Compressed at 15fps, 167.01 kbps
Compressed at 15fps, 166.48 kbps
49
AQT-DVC: Transform-Domain Distributed Video Coding with Rate-Distortion Based Adaptive Quantization
Motivation
• Transform domain DVC performance is better than pixel domain DVC
performance, especially for high motion sequences.
• Rate-distortion based adaptive quantization provides a better quantization
scheme in terms of rate-distortion performance.
Considerations:
• Feedback channel
Minimize the traffic on the feedback channel. Bitplane selective scheme is
not applicable because the number of bitplanes might be too large.
-> One quantization matrix for each partition (M 4x4 DCT blocks)
• Partition size versus LDPCA block size
Smaller partition size keeps the flexibility of the quantization scheme.
Larger LDPCA block size provides a better error correction ability and
reduces the feedback channel traffic.
-> One LDPCA code for a bitplane of a subband.
-> Due to the different adaptive quantizers, the resulting bitplanes are not
rectangular (irregular shape) and have undefined values => need to modify
LDPCA
50
Sample Quantizer Matrices
Each matrix describes the number of quantization levels
used for each of the 16 DCT subbands
51
Adaptive Quantization
[Figure: grid of 4x4 DCT blocks, each labeled with its quantizer matrix index Q; neighboring blocks within a partition share the same index]
Q: Quantizer matrix index
52
LDPCA Adaptation
[Figure: LDPCA encoder Tanner graphs with source nodes x1 ... x8, syndrome nodes s1 ... s8, and accumulated syndrome nodes a1 ... a8]
• Tanner graph corresponding to the transmission of only the even-indexed subset of the accumulated syndrome (a2, a4, a6, a8, connected to s1+s2, s3+s4, s5+s6, s7+s8)
• Tanner graph after eliminating redundant nodes
53
AQT-DVC
[Block diagram: Wyner-Ziv frame encoder and Wyner-Ziv frame decoder]
• Wyner-Ziv Frame Encoder: Wyner-Ziv frames Xi -> DCT -> adaptive quantization (driven by the received quantizer set index) -> extract bitplanes -> LDPCA encoder and CRC generator -> buffer -> parity bits + CRC bits sent on request
• Wyner-Ziv Frame Decoder: Key frames Xi-1 and Xi+1 -> side information generation -> DCT -> distortion-rate estimation -> quantizer set selection -> quantizer set index sent to the encoder; LDPCA decoder -> reconstruction -> inverse DCT -> decoded Wyner-Ziv frames X’i
• Same D-R concept but different equations for D and R
54
Quantizer Matrix Selection
• Each RD point corresponds to a quantizer matrix
• Two criteria for quantizer selection
- D/R is larger than the target D-R threshold T_D-R = t
- The quantizer matrix results in the largest
distortion reduction
[Figure: average distortion D versus average bitrate R; labeled points: side information, M0, M1, ..., M7; a line of slope t marks the selected quantizer set]
55
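One way to read the two selection criteria is sketched below: each candidate quantizer matrix is an R-D point, its distortion reduction is measured against the side information alone (zero extra bits), candidates whose reduction per bit does not exceed t are discarded, and among the rest the one with the largest reduction wins. This interpretation, the tuple layout, and the function name are assumptions of this example.

```python
def select_quantizer(side_info_distortion, candidates, t):
    """candidates: list of (matrix_index, rate, distortion) R-D points.
    Returns the chosen matrix index, or None if no candidate clears the
    threshold (the partition is then left at the side information)."""
    best_index, best_reduction = None, 0.0
    for index, rate, distortion in candidates:
        reduction = side_info_distortion - distortion
        # Criterion 1: distortion reduction per bit must exceed the target t.
        if rate > 0 and reduction / rate > t:
            # Criterion 2: keep the eligible matrix with the largest reduction.
            if reduction > best_reduction:
                best_index, best_reduction = index, reduction
    return best_index
```

A small t admits fine quantizers that spend many bits, while a large t leaves only the coarse ones, or none at all.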
Simulation Setup
QCIF Video Sequences (176x144)
Frame rate: 15 frames per second.
Partition size = 16x16 pixels (four 4x4 DCT blocks)
Four LDPCA codes to accommodate variable-size
bitplanes: 396, 792, 1188, and 1984
Comparison with following systems:
• GOP = 2
‒ H.264 Inter : I-B-I-B
‒ H.264 Intra only
‒ DISCOVER by X. Artigas et al. (LDPCA length:
1584)
56
Simulation Results
[Figure: PSNR (dB) versus bitrate (kbps) for Foreman and Hall Monitor; curves: H.264 Inter, AQT-DVC, DISCOVER, BLAST-DVC, H.264 Intra]
Up to 1.4 dB compared to DISCOVER
57
Visual testing
[Figure: mean opinion scores (1 to 5) versus operating point (A to D) for Hall Monitor and Foreman]

Operating Point            A        B        C        D
Average Bitrate (kbps)
  DISCOVER               87.62   100.28   140.38   208.25
  AQT-DVC                85.60    98.51   141.56   207.78
Average PSNR (dB)
  DISCOVER               31.48    32.07    34.31    37.27
  AQT-DVC                32.02    32.78    35.17    38.60

Operating Point            A        B        C        D
Average Bitrate (kbps)
  DISCOVER               73.60    97.64   167.01   293.73
  AQT-DVC                71.99   103.02   168.31   290.52
Average PSNR (dB)
  DISCOVER               28.71    29.93    32.38    35.51
  AQT-DVC                28.70    29.93    32.37    35.49
58
DISCOVER
Compressed at 15fps,167.01 kbps
AQT-DVC
Compressed at 15fps, 168.31 kbps
59
AQT-DVC
Inaccurate estimate of source correlation model might result in inappropriate
quantization matrix selection and might cause significant RD performance loss
in AQT-DVC.
• The model estimation solely depends on two neighboring motion-compensated Key
frames.
Previous Key frame
Motion-compensated
previous Key frame
Original WZ frame
Side information
Next Key frame
Motion-compensated
next Key frame
60
eAQT-DVC Procedure
• Encoder: coarsely quantize and encode all DC coefficients; send the syndromes to the decoder
• Decoder: decode and reconstruct all DC coefficients
• Decoder: estimate the Laplacian model parameters of all DCT coefficients by using the motion-compensated Key frames and the reconstructed coarsely quantized DC coefficients
• Decoder: use the obtained Laplacian models to estimate rate-distortion ratios for all coefficients with respect to all available quantization matrices
• Decoder: select the best quantization matrices (in the R-D sense), one for each partition; send the matrix index to the encoder
• Encoder: receive the quantization matrix index; quantize and encode all DCT coefficients; send the syndromes to the decoder
• Decoder: decode and reconstruct all DCT coefficients
61
Simulation Results (High-motion Sequences)
[Figure: PSNR (dB) versus bitrate (kbps) for Soccer and Foreman; curves: H.264 Inter, eAQT-DVC, AQT-DVC, DISCOVER, H.264 Intra]
62
Visual testing
[Figure: mean opinion scores (1 to 5) versus operating point (A to D) for Hall Monitor and Foreman]

Operating Point            A        B        C        D
Average Bitrate (kbps)
  DISCOVER               87.62   100.28   140.38   208.25
  eAQT                   89.58   100.84   142.51   209.17
Average PSNR (dB)
  DISCOVER               31.48    32.07    34.31    37.27
  eAQT                   32.02    32.74    35.65    38.61

Operating Point            A        B        C        D
Average Bitrate (kbps)
  DISCOVER               73.60    97.64   167.01   293.73
  eAQT                   71.43    97.62   166.48   291.63
Average PSNR (dB)
  DISCOVER               28.71    29.93    32.38    35.51
  eAQT                   28.19    29.34    31.68    34.59
63
DISCOVER
Compressed at 15fps,167.01 kbps
eAQT-DVC
Compressed at 15fps,166.48 kbps
64
Conclusion
Adaptive distributed video coding
Distortion-Rate estimation for distributed video coding
• Allows allocation of more bits to significant regions
• A bitplane selective decoding scheme for pixel-domain
DVC
• An adaptive quantization for transform-domain DVC
• PSNR improvement as much as 2.0 dB on the decoded
video.
• Superior visual quality on the decoded video.
65
Future Research Directions
• Explore more accurate source probability model.
• Variable block size locally-adaptive DVC scheme
• Improved DVC without feedback channel
• Real-time decoding
• Multi-View compression/3D TV
• Perceptual-based DVC
66
Related Publications
Wei-Jung Chien and Lina J. Karam, “Transform-Domain Distributed
Video Coding with Rate-Distortion Based Adaptive Quantization,”
to appear in the IET Journal of Image Processing, Special Issue
on Distributed Video Coding.
Wei-Jung Chien and Lina J. Karam, “BLAST-DVC: BitpLAne
SelecTive Distributed Video Coding,” Springer Journal of
Multimedia Tools and Applications, Special Issue on Distributed
Video Coding, July 2009.
Wei-Jung Chien and Lina J. Karam, “AQT-DVC: Transform-Domain
Distributed Video Coding with Rate-Distortion Based Adaptive
Quantization,” accepted to IEEE International Conference on
Image Processing, 2009.
Wei-Jung Chien and Lina J. Karam “Bitplane Selective Distributed
Video Coding,” Asilomar Conference on Signals, Systems and
Computers, 2008.
67
Related Publications
Wei-Jung Chien, Lina J. Karam, and Glen P. Abousleman, “Rate-Distortion Based Selective Decoding for Pixel-Domain
Distributed Video Coding,” IEEE International Conference on
Image Processing, p. 1132-1135, 2008.
Wei-Jung Chien, Lina J. Karam, and Glen P. Abousleman, “Block
Adaptive Wyner-Ziv Coding for Transform-Domain Distributed
Video Coding,” IEEE International Conference on Acoustics,
Speech, and Signal Processing, p I-525-8, 2007.
Wei-Jung Chien, Lina J. Karam, and Glen P. Abousleman,
“Distributed Video Coding with lossy side information,” IEEE
International Conference on Acoustics, Speech, and Signal
Processing, p II-69-72, 2006.
Wei-Jung Chien, Lina J. Karam, and Glen P. Abousleman,
“Distributed Video Coding with 3-D Recursive Search Block
Matching,” IEEE International Symposium on Circuits and
Systems, p 5415-5418, 2006.
68
Wei-Jung Chien and Lina Karam
Wei-Jung Chien and President Obama
Thank you