
Image and Video Compression Fundamentals

Heejune AHN Embedded Communications Laboratory Seoul National Univ. of Technology Fall 2013 Last updated 2013. 9. 07

1. Driving Force of Video Compression

• Uncompressed video bandwidth
  • Vertical resolution x horizontal resolution x temporal resolution x color depth
  • E.g., CCIR 601 (TV quality): 720 x 480 x 30 x 24 = 248,832,000 bps
• Typical storage and network
  • DVD: 4.7 GB (about 150 sec of CCIR 601 video at this rate)
  • ADSL/VDSL: at most 100 Mbps < CCIR 601 bandwidth

[Figure: video as a 3-D signal with axes x (720 pixels), y (480 lines), and t (30 frames/s)]

Heejune AHN: Image and Video Compression


Typical values

• Typical video bandwidth (L: luma, C: chroma)
  • ITU CCIR 601: L (858x525), C (429x525), 30 fps => 216.0 Mbps
  • CIF: L (352x288), C (176x144), 30 fps => 36.5 Mbps
  • QCIF: L (176x144), C (88x72), 15 fps => 4.6 Mbps
• Typical storage/transmission capacity
  • Terrestrial TV broadcasting channel: ~20 Mbps
  • CD/DVD-5: 640 MB/4.7 GB
  • Ethernet/Fast Ethernet: <10/100 Mbps
  • ADSL/VDSL downlink: 2048 kbps/100 Mbps
  • Wireless cellular (2G/3G/3G+): 9.6/384/2000 kbps
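The bandwidth figures on these two slides can be reproduced with a few lines of arithmetic. The sketch below assumes 8 bits per sample, one luma plane plus two chroma planes per frame, and 1 GB = 10^9 bytes; the function name `raw_bitrate_bps` is made up for this example. With exactly 30 frames/s the CCIR 601 result comes out near 216.2 Mbps, slightly above the 216.0 Mbps listed on the slide.

```python
def raw_bitrate_bps(luma_w, luma_h, chroma_w, chroma_h, fps, bits_per_sample=8):
    """Raw bit rate of one luma plane plus two chroma planes (assumed 8-bit samples)."""
    samples_per_frame = luma_w * luma_h + 2 * chroma_w * chroma_h
    return samples_per_frame * bits_per_sample * fps

# (luma width, luma height, chroma width, chroma height, frames/s) from the slide
formats = {
    "CCIR 601": (858, 525, 429, 525, 30),
    "CIF": (352, 288, 176, 144, 30),
    "QCIF": (176, 144, 88, 72, 15),
}
rates_mbps = {name: raw_bitrate_bps(*spec) / 1e6 for name, spec in formats.items()}

# 720 x 480 pixels, 30 frames/s, 24 bits per pixel
active_bps = 720 * 480 * 30 * 24
# Seconds of such video that fit on a 4.7 GB disc (1 GB taken as 10^9 bytes)
dvd_seconds = 4.7e9 * 8 / active_bps

for name, mbps in rates_mbps.items():
    print(f"{name}: {mbps:.1f} Mbps")
print(f"720x480x30x24 = {active_bps} bps; a 4.7 GB disc holds about {dvd_seconds:.0f} s")
```

Every raw rate is above the 20 Mbps broadcast channel except QCIF, which is the motivation for compression.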

2. Image and Video Compression

• Information theory
  • Pioneered by Claude Shannon (Bell Labs) from 1948 through the 1950s
  • Provides mathematical limits for information processing and communications
• Coding
  • Source coding
    • How to reduce the data needed to represent information
  • Channel coding
    • How to transmit data through noisy/distorted channels
  • Note: TDMA, FDMA, CDMA, OFDMA, and MIMO are all channelization methods

[Photo: Claude Elwood Shannon (April 30, 1916 – February 24, 2001)]

Typical Visual Communication System

• Typical path
  Info source → Source coder → Channel coder → Modulator → Channel (wired/wireless/storage) → Demodulator → Channel decoder → Source decoder → Info output

Codec

• Codec = enCOder & DECoder
• Codec types
  • Lossless compression
    • X == X'
    • Used for document files (ZIP) and medical images (JPEG lossless)
    • Entropy coding (arithmetic coding, Huffman coding), predictive coding
  • Lossy compression
    • X ~ X'
    • Used for entertainment and multimedia communication
    • Transform coding (DCT), quantization

[Diagram: X → Encoder → Y → Decoder → X']
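The difference between X == X' and X ~ X' can be seen in a small round trip. This is only a sketch: `zlib` stands in for a generic lossless coder, the "lossy codec" is nothing more than a uniform quantizer with an assumed step of 8, and the sample values are invented.

```python
import zlib

# Hypothetical 8-bit signal, repeated so the lossless coder has redundancy to remove
samples = bytes([100, 101, 103, 102, 104, 180, 181, 179] * 32)

# Lossless: decoding returns exactly the input (X == X')
packed = zlib.compress(samples)
lossless_ok = zlib.decompress(packed) == samples

# Lossy: keep only the index of the nearest multiple of STEP (X ~ X')
STEP = 8

def lossy_encode(data, step):
    return [round(v / step) for v in data]

def lossy_decode(indices, step):
    # Clamp to the 8-bit range after rescaling
    return bytes(min(255, i * step) for i in indices)

restored = lossy_decode(lossy_encode(samples, STEP), STEP)
max_error = max(abs(a - b) for a, b in zip(samples, restored))

print(f"lossless: {len(samples)} -> {len(packed)} bytes, exact = {lossless_ok}")
print(f"lossy: exact = {restored == samples}, max error = {max_error}")
```

The lossy path never recovers the input exactly, but the error stays within half a step, which is the kind of distortion a codec tries to keep below what a viewer notices.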

• Comparison: uncompressed, zipped, and H.264-encoded versions of the same video
• Video compression system features
  • Source model
    • Note: zip is a source-independent encoding
  • Human Visual System (HVS)
    • The HVS does not notice many distortions

3. Predictive Coding

• DPCM (Differential Pulse Code Modulation)
  • Pixel values are highly correlated in the spatial domain
  • Code the current pixel (S0) using previously coded ones (S1, S2, S3, etc.)

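[Figure: pixel neighborhood. S2, S3, and S4 are on the line of pixels above; S1 and S0 are on the current line of pixels.]

• Coder block diagram

[Figure: DPCM encoder (the prediction p from the predictor is subtracted from the input and the difference goes to the entropy coder) and decoder (the entropy decoder output is added to the prediction p from the predictor).]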
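A minimal sketch of the DPCM loop, assuming the simplest possible predictor (the previous pixel on the same line, S1) and no quantizer, so the scheme stays lossless. The starting prediction of 128 and the pixel values are assumptions made for the example; the entropy coder that would follow is omitted.

```python
def dpcm_encode(line, initial_prediction=128):
    """Replace each pixel by its difference from the previous pixel."""
    residuals = []
    prediction = initial_prediction
    for sample in line:
        residuals.append(sample - prediction)
        prediction = sample  # predictor: previously coded pixel (S1)
    return residuals

def dpcm_decode(residuals, initial_prediction=128):
    """Rebuild the pixels by adding each residual to the running prediction."""
    line = []
    prediction = initial_prediction
    for residual in residuals:
        sample = prediction + residual
        line.append(sample)
        prediction = sample
    return line

line = [120, 121, 123, 123, 124, 126, 125, 125]  # hypothetical line of pixels
residuals = dpcm_encode(line)
print(residuals)  # [-8, 1, 2, 0, 1, 2, -1, 0]
print(dpcm_decode(residuals) == line)  # True
```

The residuals cluster around zero even though the pixel values do not, which is what makes the following entropy coder effective.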

• DPCM example

[Figure: original image and DPCM results; surviving labels: "original", 0 1 0 0, 0 0 1 0, 0 0.5 0.5 0]

Motion-Compensated Prediction

• Temporal-domain prediction
• How can the temporal correlation be used?
  • Model and representation methods

[Figure: two successive video frames and their change detection mask]

• Model-based MC
  • 2D/3D model
    • dx, dy, dz, and rotations
    • Estimate (i.e., calculate) the parameters in the encoder and use them in the decoder
  • Difficulties
    • Shape encoding and estimation complexity are too high for now
    • Used in MPEG-4 object-oriented coding

[Figure legend: background; moving area picked up by change detector; moving areas missed by change detector]

• Block-based MC
  • Segment the frame into fixed-size blocks and find the best-matching displacement for each block
  • Easier to implement in HW and SW

[Figure: frames X(t) and X(t+1), showing the real motion and the motion vector (MV)]
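Finding the best-matching displacement for one block can be sketched as an exhaustive search. The sketch assumes the sum of absolute differences (SAD) as the matching cost and a search range of ±2 pixels; the slide names neither, and the function names and test frames are invented for illustration.

```python
def sad(cur, ref, bx, by, dx, dy, n):
    """Sum of absolute differences between an n x n block of `cur` at (bx, by)
    and the block of `ref` displaced by (dx, dy)."""
    return sum(abs(cur[by + y][bx + x] - ref[by + dy + y][bx + dx + x])
               for y in range(n) for x in range(n))

def best_motion_vector(cur, ref, bx, by, n, search):
    """Full search over all displacements within +/- `search` pixels."""
    height, width = len(ref), len(ref[0])
    best_mv, best_cost = (0, 0), sad(cur, ref, bx, by, 0, 0, n)
    for dy in range(-search, search + 1):
        for dx in range(-search, search + 1):
            # Skip candidates that fall outside the reference frame
            if not (0 <= by + dy and by + dy + n <= height and
                    0 <= bx + dx and bx + dx + n <= width):
                continue
            cost = sad(cur, ref, bx, by, dx, dy, n)
            if cost < best_cost:
                best_mv, best_cost = (dx, dy), cost
    return best_mv, best_cost

# Hypothetical 8x8 reference frame; the current frame is the same content
# moved 2 pixels right and 1 pixel down (uncovered border filled with 0)
ref = [[10 * y + x * x for x in range(8)] for y in range(8)]
cur = [[ref[y - 1][x - 2] if y >= 1 and x >= 2 else 0 for x in range(8)]
       for y in range(8)]

mv, cost = best_motion_vector(cur, ref, bx=3, by=3, n=2, search=2)
print(mv, cost)  # (-2, -1) 0
```

The vector points from the block in the current frame back to its position in the reference frame, so content that moved right and down yields a negative displacement. The nested loops over candidates and pixels also hint at why motion search dominates encoder complexity.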

4. Transform Coding

• Transform
  • Spatial domain to frequency domain
  • Makes quantization easier
    • Energy compaction and HVS properties
    • The transform itself gives no compression

[Figure: transform coding chain. Encoder: image samples x → transform T → y → quantizer Q → indices q → encoder C → bit-stream c. Decoder: c → decoder C^-1 → indices q → dequantizer Q^-1 → ŷ → inverse transform T^-1 → reconstructed image samples x̂.]

Block transform

• (Fixed-size) block transform
  • Easy to implement
  • Normally a 2-D separable transform

[Figure: four surface plots: image block; DCT coefficients of block; quantized DCT coefficients of block; block reconstructed from quantized coefficients]
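The four panels of the figure (block, coefficients, quantized coefficients, reconstruction) can be reproduced on a toy block. The sketch below assumes an orthonormal DCT-II applied separably as C·X·Cᵀ, a single quantizer step of 8 for every coefficient, and an invented smooth 8x8 block; real codecs use fast integer transforms and frequency-dependent steps.

```python
import math

def dct_matrix(n):
    """Orthonormal DCT-II matrix: row k holds the k-th cosine basis vector."""
    rows = []
    for k in range(n):
        scale = math.sqrt((1.0 if k == 0 else 2.0) / n)
        rows.append([scale * math.cos(math.pi * (2 * i + 1) * k / (2 * n))
                     for i in range(n)])
    return rows

def matmul(a, b):
    return [[sum(a[i][k] * b[k][j] for k in range(len(b)))
             for j in range(len(b[0]))] for i in range(len(a))]

def transpose(m):
    return [list(row) for row in zip(*m)]

def dct2(block):
    """Separable 2-D DCT: 1-D transform along columns, then along rows."""
    c = dct_matrix(len(block))
    return matmul(matmul(c, block), transpose(c))

def idct2(coeffs):
    c = dct_matrix(len(coeffs))
    return matmul(matmul(transpose(c), coeffs), c)

QSTEP = 8  # assumed uniform step for all coefficients
block = [[100 + 4 * x + 2 * y for x in range(8)] for y in range(8)]  # smooth ramp
coeffs = dct2(block)
indices = [[math.floor(v / QSTEP + 0.5) for v in row] for row in coeffs]
recon = idct2([[q * QSTEP for q in row] for row in indices])

nonzero = sum(1 for row in indices for q in row if q != 0)
max_error = max(abs(recon[y][x] - block[y][x]) for y in range(8) for x in range(8))
print(f"DC = {coeffs[0][0]:.1f}, nonzero indices = {nonzero} of 64, "
      f"max error = {max_error:.2f}")
```

For a smooth block, almost all of the energy lands in a handful of low-frequency coefficients, so most quantized indices are zero while the reconstruction stays close to the original.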

• Transform types
  • The KL transform is provably optimal in energy compaction, but it depends on the signal statistics
  • The DCT is fixed and close to the KL transform for image signals
  • Wavelet and fractal transforms, etc.

[Figure: panels (1) to (5), one per transform listed below]

(1) Karhunen-Loève transform [1948/1960]
(2) Haar transform [1910]
(3) Walsh-Hadamard transform [1923]
(4) Slant transform [Enomoto, Shibata, 1971]
(5) Discrete Cosine Transform (DCT) [Ahmed, Natarajan, Rao, 1974]

• Transform size
  • The larger the block, the more efficient the coding, but the higher the computational complexity
  • 8x8 or 4x4 blocks are used in the standards

5. Quantization

• Approximation of values
  • Lossy coding (the key data-reduction step)
  • Applied to 2D transform coefficients

• Qstep (or qscale)
  • Determines the distortion range
  • The larger (coarser) the Q step
    • the more compression
    • the larger the distortion
  • Rate-distortion theory
• In video coding
  • Applied to 2D transform coefficients
  • HVS-based step sizes
    • Smaller at low frequencies
    • Larger at high frequencies

[Figure: quantizer characteristic, quantizer output versus quantizer input x, with axis marks D and D+a]
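The trade-off between Q step, compression, and distortion can be checked numerically. The sketch below applies a plain uniform quantizer (round to the nearest multiple of the step, an assumption) to the first 16 coefficient values of the example block shown later in these slides, and uses the count of nonzero indices as a rough stand-in for bit rate.

```python
import math

def quantize(value, qstep):
    """Index of the nearest multiple of qstep."""
    return math.floor(value / qstep + 0.5)

def dequantize(index, qstep):
    return index * qstep

# First 16 values of the transformed 8x8 example block from the slides
coeffs = [1480, 26.0, 9.5, 8.9, -26.4, 15.1, -8.1, 0.3,
          11.0, 8.3, -8.2, 3.8, -8.4, -6.0, -2.8, 10.6]

results = []
for qstep in (4, 8, 32):
    indices = [quantize(c, qstep) for c in coeffs]
    recon = [dequantize(i, qstep) for i in indices]
    nonzero = sum(1 for i in indices if i != 0)
    mse = sum((c - r) ** 2 for c, r in zip(coeffs, recon)) / len(coeffs)
    results.append((qstep, nonzero, mse))
    print(f"Qstep {qstep:2d}: {nonzero:2d} nonzero indices, MSE = {mse:.2f}")
```

As the step grows, fewer indices survive (less to code) and the mean squared error rises, which is the rate-distortion trade-off named on the slide.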

6. Entropy Coding

• Statistical redundancy in video coding
  • Many zeros in quantized transform coefficients
  • Non-uniform histogram of control information, such as motion vectors and coding types
• Entropy coding
  • Principle
    • "Shorter codewords for more frequent events"
    • Variable-length coding (VLC)
  • Huffman coding
    • Integer VLC: every codeword has an integer length
    • Used in most standards
  • Arithmetic coding
    • Fractional-length coding
    • Introduced with H.263+, but used in practice from H.264
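The "shorter codewords for more frequent events" principle can be illustrated with a small Huffman code builder. This is a textbook construction, not the fixed VLC tables that the standards define, and the symbol sequence is an invented set of quantized levels dominated by zeros.

```python
import heapq
from collections import Counter

def huffman_code(symbols):
    """Build a prefix code: repeatedly merge the two least frequent subtrees."""
    freq = Counter(symbols)
    # Heap entries: (weight, unique tie-breaker, {symbol: code so far})
    heap = [(w, i, {s: ""}) for i, (s, w) in enumerate(sorted(freq.items()))]
    heapq.heapify(heap)
    if len(heap) == 1:  # a single distinct symbol still needs one bit
        return {s: "0" for s in heap[0][2]}
    counter = len(heap)
    while len(heap) > 1:
        w1, _, codes1 = heapq.heappop(heap)
        w2, _, codes2 = heapq.heappop(heap)
        merged = {s: "0" + c for s, c in codes1.items()}
        merged.update({s: "1" + c for s, c in codes2.items()})
        heapq.heappush(heap, (w1 + w2, counter, merged))
        counter += 1
    return heap[0][2]

# Hypothetical quantized levels: mostly zeros, a few small values
levels = [0] * 10 + [1] * 3 + [-1] * 2 + [3]
code = huffman_code(levels)
total_bits = sum(len(code[s]) for s in levels)

for symbol, word in sorted(code.items(), key=lambda item: len(item[1])):
    print(f"{symbol:2d} -> {word}")
print(f"{total_bits} bits with Huffman vs {2 * len(levels)} bits with 2-bit fixed codes")
```

Every codeword here has an integer length, so a symbol with probability well above one half still costs a full bit; that gap is what arithmetic coding closes.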

• VLC coding in image coding
  • Zigzag scan is used to exploit more statistical correlation
  • 2-D run-length code: (number of zeros, nonzero value)

Transformed 8x8 block (eight values per line, in source order):
1480 26.0 9.5 8.9 -26.4 15.1 -8.1 0.3
11.0 8.3 -8.2 3.8 -8.4 -6.0 -2.8 10.6
-5.5 4.5 10.7 9.8 9.0 5.3 -8.0 4.0
-5.1 4.9 4.9 -8.3 -2.1 -1.9 2.8 -8.1
1.6 1.4 8.2 4.3 3.4 4.1 -7.9 1.0
-4.5 -5.0 -6.4 4.1 -4.4 1.8 -3.2 2.1
5.9 5.8 2.4 2.8 -2.0 5.9 -3.0 2.5
-1.0 0.7 3.2 4.1 -6.1 6.0 1.1 5.7

After quantization, Q(8) (eight values per line, in source order):
185 1 0 1 0 0 0 0
3 1 0 1 0 0 0 0
1 -1 1 0 1 0 0 0
1 0 0 -1 0 0 0 0
-3 -1 -1 0 0 0 0 0
2 0 0 0 0 0 0 0
-1 0 0 0 -1 0 0 0
0 1 0 -1 0 0 0 0

After zig-zag scan and run-level coding:
Mean of block: 185
(0,3) (0,1) (1,1) (0,1) (0,1) (0,1) (0,-1) (1,1) (1,1) (0,1) (1,-3) (0,2) (0,-1) (6,1) (0,-1) (0,-1) (1,-1) (14,1) (9,-1) (0,-1) EOB
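The scan and run-level steps can be sketched as follows. The example uses a small hypothetical 4x4 block of quantized indices rather than the 8x8 block above, and the conventional zig-zag order that starts by moving right from the DC coefficient; the scan direction used in the slide figure may differ.

```python
def zigzag_order(n):
    """(row, col) positions of an n x n block in zig-zag scan order."""
    order = []
    for s in range(2 * n - 1):  # s = row + col indexes one anti-diagonal
        diagonal = [(r, s - r) for r in range(n) if 0 <= s - r < n]
        # Even diagonals run from bottom-left to top-right, odd ones the reverse
        order.extend(reversed(diagonal) if s % 2 == 0 else diagonal)
    return order

def run_level(block):
    """Return the DC value and (zero run, nonzero level) pairs for the AC part.
    Trailing zeros are dropped; an end-of-block (EOB) symbol would follow."""
    scan = [block[r][c] for r, c in zigzag_order(len(block))]
    pairs, run = [], 0
    for level in scan[1:]:
        if level == 0:
            run += 1
        else:
            pairs.append((run, level))
            run = 0
    return scan[0], pairs

# Hypothetical 4x4 block of quantized indices
block = [
    [185, 3, 0, 0],
    [1, 1, 0, 0],
    [0, -1, 0, 0],
    [0, 0, 0, 0],
]
dc, pairs = run_level(block)
print(dc, pairs)  # 185 [(0, 3), (0, 1), (1, 1), (3, -1)]
```

Sixteen indices collapse to a DC value, four pairs, and an EOB, because the scan groups the zeros into runs.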

7. Codec Design

• Hybrid codec
  • Used by most standard codecs
  • MC => DCT => Quant => Entropy coding

[Figure: hybrid coder block diagram with coder control (control data), intra-frame DCT coder and quantizer (DCT coefficients), intra-frame decoder with dequantizer (DeQ), intra/inter switch, motion-compensated predictor, and motion estimator (motion data)]
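One point the block diagram makes is that the encoder contains a decoder: it predicts from its own reconstructed frames, not from the originals, so both sides stay in step. The sketch below is heavily simplified and should be read under those assumptions: frames are short 1-D lists of invented values, the prediction is the previous reconstructed frame with zero motion, and the DCT and entropy coding stages are left out.

```python
import math

def quantize(values, qstep):
    return [math.floor(v / qstep + 0.5) for v in values]

def dequantize(indices, qstep):
    return [i * qstep for i in indices]

def encode(frames, qstep):
    """Code each frame as the quantized difference from the previous
    reconstructed frame (zero-motion prediction)."""
    reference = [0] * len(frames[0])
    stream = []
    for frame in frames:
        residual = [s - p for s, p in zip(frame, reference)]
        indices = quantize(residual, qstep)
        stream.append(indices)
        # Embedded decoder: rebuild exactly what the receiver will rebuild
        reference = [p + r for p, r in zip(reference, dequantize(indices, qstep))]
    return stream

def decode(stream, qstep):
    reference = [0] * len(stream[0])
    frames = []
    for indices in stream:
        reference = [p + r for p, r in zip(reference, dequantize(indices, qstep))]
        frames.append(reference)
    return frames

QSTEP = 4
frames = [[100, 102, 104, 106], [101, 103, 106, 109], [103, 105, 109, 113]]
stream = encode(frames, QSTEP)
decoded = decode(stream, QSTEP)
worst = max(abs(a - b) for f, d in zip(frames, decoded) for a, b in zip(f, d))
print(stream)
print(f"worst reconstruction error over all frames: {worst}")
```

Because the prediction loop is closed around the quantizer, the error in every frame stays within half a quantizer step instead of accumulating from frame to frame.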

• Complexity considerations
  • Asymmetric complexity
    • Encoders are more complex in most standards
    • Non-real-time encoding but real-time decoding (e.g., broadcasting, storage)
    • One-time encoding, many-time decoding
    • Encoder and decoder cost
  • Parallel processing and HW/SW implementation (in MPEG-2)
    • Motion compensation (~55%)
    • DCT/IDCT (~15%)
    • VLC encoding/decoding (~15%)
    • Other (post-processing) (~15%)