Transform Coding
Heejune AHN
Embedded Communications Laboratory
Seoul National Univ. of Technology
Fall 2013
Last updated 2013. 9. 30
Agenda






Transform Coding Concept
Transform Theory Review
DCT (Discrete Cosine Transform)
DCT in Video coding
DCT Implementation & Fast Algorithms
Appendix: KL Transform
Heejune AHN: Image and Video Compression
p. 2
1. Transform Coding

X1= lum(2n), X2= lum(2n+1), neighbor pixels




X1 ~ U(0, 255), X2~ U(0,255)
Quantizing X1 and X2 separately => nearly the same data is coded twice
Cross-Correlation of X1 and X2
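Y1, Y2: 45 degree rotation of the (X1, X2) axes
Y1 = (X1 + X2) / 2
• Average or DC value
Y2 = (X2 – X1) / 2
• Difference or AC value
Y1 ~ F(0, 255), Y2 ~ F(-127.5, 127.5) with this 1/2 scaling; Y2 is concentrated near 0 for correlated neighbors
[Figure: scatter plot of neighbor pixel pairs in the (X1, X2) plane, 0 to 255 on each axis, with the rotated axes Y1 and Y2 drawn at 45 degrees]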
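The effect of the rotation can be checked numerically. The following is a minimal sketch, not taken from the slides: the synthetic pixel model (X2 equals X1 plus a small bounded change), the `max_step` value, and all function names are assumptions made for illustration.

```python
import random


def rotate_pair(x1, x2):
    """2-point transform from the slide: average (DC) and half-difference (AC)."""
    y1 = (x1 + x2) / 2
    y2 = (x2 - x1) / 2
    return y1, y2


def inverse_pair(y1, y2):
    """Undo rotate_pair: x1 = y1 - y2, x2 = y1 + y2."""
    return y1 - y2, y1 + y2


def variance(values):
    mean = sum(values) / len(values)
    return sum((v - mean) ** 2 for v in values) / len(values)


def make_neighbor_pairs(count, max_step=8, seed=0):
    """Synthetic neighbors (assumption): X2 differs from X1 by at most max_step."""
    rng = random.Random(seed)
    pairs = []
    for _ in range(count):
        x1 = rng.randint(0, 255)
        x2 = min(255, max(0, x1 + rng.randint(-max_step, max_step)))
        pairs.append((x1, x2))
    return pairs


pairs = make_neighbor_pairs(10000)
transformed = [rotate_pair(x1, x2) for x1, x2 in pairs]
var_x1 = variance([p[0] for p in pairs])
var_x2 = variance([p[1] for p in pairs])
var_y1 = variance([t[0] for t in transformed])
var_y2 = variance([t[1] for t in transformed])
print("var X1, X2:", var_x1, var_x2)
print("var Y1, Y2:", var_y1, var_y2)
```

Under this model X1 and X2 both spread over the full range, while almost all of the variance moves into Y1 and Y2 stays small, which is why Y2 can be quantized with few levels.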

Which ones are easier to encode (quantize)?
[Figure: probability densities f(X1) and f(X2) over 0 to 255, f(Y1) over 0 to 255, and f(Y2) on an axis from -255 to 255]

Origins of Transform Coding Benefits

Signal Theory
• Make the representation easier to manipulate
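• Energy concentration
• Bit allocation b_{k,l} for coefficient (k, l) of an N x N block, with bit budget B, coefficient variance \sigma_{k,l}^2, and geometric mean \rho^2 of the variances:
$$ b_{k,l} = \frac{B}{N^2} + \frac{1}{2} \log_2 \frac{\sigma_{k,l}^2}{\rho^2}, \qquad \rho^2 = \Big( \prod_{k,l} \sigma_{k,l}^2 \Big)^{1/N^2} $$
[Photo: Vilfredo Pareto, economist, 1848-1923]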
Image and HVS Properties
• HVS (human visual system) is more sensitive to low frequencies
• Use a finer (denser) quantizer for low frequencies
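The bit-allocation rule on this slide (average bits per coefficient plus half the log-ratio of the coefficient variance to the geometric mean of all variances) can be sketched as follows. The variances and the bit budget are hypothetical values chosen so the arithmetic is easy to follow, and the function name is an assumption.

```python
import math


def allocate_bits(variances, total_bits):
    """b_i = B/M + 0.5 * log2(var_i / geometric_mean), for M coefficients."""
    count = len(variances)
    # log2 of the geometric mean is the arithmetic mean of the log2 variances
    log_geo_mean = sum(math.log2(v) for v in variances) / count
    average = total_bits / count
    return [average + 0.5 * (math.log2(v) - log_geo_mean) for v in variances]


variances = [256.0, 64.0, 16.0, 4.0]   # hypothetical coefficient variances
bits = allocate_bits(variances, total_bits=16)
print(bits)        # high-variance coefficients receive more bits
print(sum(bits))   # the allocation always sums to the bit budget
```

The rule gives real-valued and possibly negative allocations; a practical coder would round them and clip at zero, which this sketch does not attempt.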
2. Transform Theory Review

Definition of Transform


Linear Transform (cf. Non-Linear Transform)



M to N mapping, [Y1, Y2, . . ., YN] = F [X1, X2, . . ., XM]
if [Y11, Y12] = F [X11, X12] and [Y21, Y22] = F [X21, X22], then
[Y11 + Y21, Y12 + Y22] = F [X11 + X21, X12 + X22]
Matrix representation of Linear Transform
Forward: y = T x
• y: N transform coefficients, arranged as a vector
• T: transform matrix of size NxN
• x: input signal block of size N, arranged as a vector
Inverse: x = T^{-1} y

Basis Vectors

Orthogonal



Basis vectors v1, v2, v3, . . ., vN
Vl · Vm = 0 for l ≠ m, for basis vectors V1, V2, . . ., VN
Each pair of basis vectors is perpendicular, so the components do not overlap.
Orthonormal
|| Vl || = 1 for basis vectors V1, V2, . . ., VN
T^{-1} = T^T  =>  x = T^{-1} y = T^T y
Parseval's Theorem
• Signal power/energy is conserved between the signal domain and the transform domain
$$ \|y\|^2 = y^T y = x^T T^T T x = \|x\|^2 $$

Example of Orthonormal transform
$$ T_{45^\circ\ \mathrm{rotation}} = \begin{bmatrix} \cos 45^\circ & -\sin 45^\circ \\ \sin 45^\circ & \cos 45^\circ \end{bmatrix} = \frac{1}{\sqrt{2}} \begin{bmatrix} 1 & -1 \\ 1 & 1 \end{bmatrix} $$
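The orthonormality of the 45 degree rotation and Parseval's theorem can be verified with a few lines of arithmetic. This is an illustrative sketch; the sample vector and the helper names are assumptions.

```python
import math


def matvec(matrix, vector):
    return [sum(a * b for a, b in zip(row, vector)) for row in matrix]


def transpose(matrix):
    return [list(col) for col in zip(*matrix)]


def matmul(a, b):
    b_t = transpose(b)
    return [[sum(x * y for x, y in zip(row, col)) for col in b_t] for row in a]


c = math.cos(math.radians(45))
s = math.sin(math.radians(45))
T = [[c, -s], [s, c]]                  # 45 degree rotation matrix

x = [200.0, 190.0]                     # hypothetical pair of neighbor pixels
y = matvec(T, x)                       # forward transform, y = T x
x_back = matvec(transpose(T), y)       # inverse uses T^T because T^-1 = T^T

identity = matmul(T, transpose(T))     # should be the 2x2 identity
energy_x = sum(v * v for v in x)
energy_y = sum(v * v for v in y)
print(identity)
print(energy_x, energy_y)              # equal up to rounding (Parseval)
```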
2D Transform

Data



2D pixel value matrix, 2D transform coefs matrix
2D matrix => 1D vector
Forward Transform
$$ F(k, l) = \sum_{n=0}^{N-1} \sum_{m=0}^{N-1} f(n, m)\, t(n, m; k, l) $$
where t(n, m; k, l) is the basis function for coefficient (k, l)
Matrix form: y = T x
• y: NxN transform coefficients, arranged as a vector
• T: transform matrix of size N^2 x N^2
• x: input signal block of size NxN, arranged as a vector
Inverse transform
x = T^{-1} y
3. Transforms

Various transforms in image compression






DFT (Discrete Fourier Transform)
DCT (Discrete cosine Transform)
DST (Discrete sine Transform)
Hadamard Transform
Discrete Wavelet Transform
and more (Haar, etc.)
Hadamard transform

Core Matrix
$$ H_1 = \frac{1}{\sqrt{2}} \begin{bmatrix} 1 & 1 \\ 1 & -1 \end{bmatrix} $$
1-D transform of size N
$$ H_n = H_1 \otimes H_{n-1} = \frac{1}{\sqrt{2}} \begin{bmatrix} H_{n-1} & H_{n-1} \\ H_{n-1} & -H_{n-1} \end{bmatrix} $$
where n = log2 N, N = 2^n, n = 1, 2, . . ., and \otimes is the Kronecker product
Example for n = 2
$$ H_2 = \frac{1}{\sqrt{4}} \begin{bmatrix} 1 & 1 & 1 & 1 \\ 1 & -1 & 1 & -1 \\ 1 & 1 & -1 & -1 \\ 1 & -1 & -1 & 1 \end{bmatrix} $$
2-D Transform
$$ Y_{N \times N} = H X_{N \times N} H^t = H X_{N \times N} H, \qquad H^* = H^t = H^{-1} = H $$
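The Kronecker recursion can be written directly. The sketch below builds the normalized Hadamard matrix and applies the 2-D transform Y = H X H; the function names and the constant test block are assumptions for illustration.

```python
import math


def kron(a, b):
    """Kronecker product of two matrices stored as lists of rows."""
    return [[a_ij * b_kl for a_ij in a_row for b_kl in b_row]
            for a_row in a for b_row in b]


def matmul(a, b):
    cols = list(zip(*b))
    return [[sum(x * y for x, y in zip(row, col)) for col in cols] for row in a]


def hadamard(n):
    """Normalized Hadamard matrix H_n of size 2**n, for n >= 1."""
    scale = 1 / math.sqrt(2)
    h1 = [[scale, scale], [scale, -scale]]
    h = h1
    for _ in range(n - 1):
        h = kron(h1, h)        # H_n = H_1 (x) H_{n-1}
    return h


def hadamard_2d(block):
    """2-D transform Y = H X H (H is symmetric, so H^t = H)."""
    n = len(block).bit_length() - 1    # block size must be a power of two
    h = hadamard(n)
    return matmul(matmul(h, block), h)


H2 = hadamard(2)
flat_block = [[1.0] * 4 for _ in range(4)]   # constant block: only DC survives
Y = hadamard_2d(flat_block)
print(H2)
print(Y)
```

Because H is its own inverse, calling `hadamard_2d` on `Y` returns the original block.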
DCT Transform

1D Forward DCT (pixel domain to frequency domain)
$$ F(k) = C(k) \sum_{n=0}^{N-1} f(n) \cos\left[ \frac{\pi k (2n+1)}{2N} \right], \quad k = 0, 1, \ldots, N-1 $$
$$ C(0) = \sqrt{1/N}, \qquad C(k \neq 0) = \sqrt{2/N} $$
1D Inverse DCT (frequency domain to pixel domain)
$$ f(n) = \sum_{k=0}^{N-1} C(k) F(k) \cos\left[ \frac{\pi k (2n+1)}{2N} \right], \quad 0 \le n \le N-1 $$
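A direct, unoptimized implementation of the two 1D formulas is sketched below. It follows the definitions term by term and is not a fast algorithm; the function names and the sample row are chosen for illustration.

```python
import math


def dct_scale(k, size):
    """C(k): sqrt(1/N) for k = 0 and sqrt(2/N) otherwise."""
    return math.sqrt((1 if k == 0 else 2) / size)


def dct_1d(samples):
    size = len(samples)
    return [
        dct_scale(k, size) * sum(
            samples[n] * math.cos(math.pi * k * (2 * n + 1) / (2 * size))
            for n in range(size))
        for k in range(size)
    ]


def idct_1d(coefs):
    size = len(coefs)
    return [
        sum(dct_scale(k, size) * coefs[k]
            * math.cos(math.pi * k * (2 * n + 1) / (2 * size))
            for k in range(size))
        for n in range(size)
    ]


row = [198, 202, 194, 179, 180, 184, 196, 168]   # sample luminance row
coefs = dct_1d(row)
restored = idct_1d(coefs)
flat = dct_1d([10.0] * 8)      # constant input: only the DC term is non-zero
print(coefs)
print(restored)
```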
2D DCT


2D DCT basis Functions
Coef. Distribution

DC ~ Uniform dist., AC ~ Laplacian dist.

Properties




Orthonormal transform
Separable transform
Real valued coefficients
DCT performance

closely approximates the KLT for image input
• Image input model (first-order Markov chain)
• x(n+1) = rho * x(n) + e(n)

DCT complexity



2D DCT = 1D DCT for vertical * 1D DCT for horizontal
Not used in 3D (because of delay and memory size)
DCT size (4x4, 8x8, 16x16, 32x32 …)
• Larger: better coding performance, but more visible blocking artifacts (?) and higher HW complexity
Coding Performance of DCT
Comparison of 1-D basis functions for block size N=8:
• Karhunen-Loève transform [1948/1960]
• Haar transform [1910]
• Walsh-Hadamard transform [1923]
• Slant transform [Enomoto, Shibata, 1971]
• Discrete Cosine Transform (DCT) [Ahmed, Natarajan, Rao, 1974]

Energy concentration Performance



measured for typical natural images, block size 1x32
KLT is optimum
DCT performs only slightly worse than KLT
Complexity Performance of DCT

Separation of 2D DCT



Cascading 1-D DCT
Reduction of the complexity (multiplications) from O(N^4) to O(N^3)
8x8 DCT
• Direct 2D: 64 multiplications for each of the 64 coefficients (64 x 64 = 4096)
• Separable: 2 passes x 64 coefficients x 8 multiplications (2 x 64 x 8 = 1024)
Can you derive this?
[Diagram: NxN block of pixels x -> column-wise N-transform -> A x -> row-wise N-transform -> A x A^T, the NxN block of transform coefficients]
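The multiplication counts can be confirmed by computing the 2D DCT both ways and counting pixel-by-basis products. This is a sketch under assumptions: the test block is synthetic, the 2-D basis values are treated as precomputed (so only the product with the data is counted), and the function names are illustrative.

```python
import math


def dct_matrix(size):
    """Rows of A are the 1-D DCT basis vectors."""
    return [[math.sqrt((1 if k == 0 else 2) / size)
             * math.cos(math.pi * k * (2 * n + 1) / (2 * size))
             for n in range(size)] for k in range(size)]


def dct_2d_direct(block, a):
    """Every coefficient sums over all N*N pixels: N^4 data multiplications."""
    size, count = len(block), 0
    out = [[0.0] * size for _ in range(size)]
    for k in range(size):
        for l in range(size):
            total = 0.0
            for n in range(size):
                for m in range(size):
                    basis = a[k][n] * a[l][m]   # 2-D basis value, assumed precomputed
                    total += block[n][m] * basis
                    count += 1
            out[k][l] = total
    return out, count


def dct_2d_separable(block, a):
    """Column pass (A X) followed by row pass ((A X) A^T): 2 * N^3 multiplications."""
    size, count = len(block), 0
    tmp = [[0.0] * size for _ in range(size)]
    for k in range(size):
        for m in range(size):
            for n in range(size):
                tmp[k][m] += a[k][n] * block[n][m]
                count += 1
    out = [[0.0] * size for _ in range(size)]
    for k in range(size):
        for l in range(size):
            for m in range(size):
                out[k][l] += tmp[k][m] * a[l][m]
                count += 1
    return out, count


A = dct_matrix(8)
block = [[100 + 10 * n + m for m in range(8)] for n in range(8)]  # synthetic ramp
direct, direct_count = dct_2d_direct(block, A)
separable, separable_count = dct_2d_separable(block, A)
print(direct_count, separable_count)
```

Both routines return the same coefficients; only the amount of work differs.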
4. Transform in Image Coding

Transform coding Procedure
• Transform T(x): usually invertible
• Quantization: not invertible, introduces distortion
• Combination of encoder and decoder (C and C^{-1}): lossless
Encoder side: image x -> transform y = T(x) -> samples y -> quantizer q = Q(y) -> indices q -> encoder c = C(q) -> bit-stream c
Decoder side: bit-stream c -> decoder q = C^{-1}(c) -> indices q -> dequantizer ŷ = Q^{-1}(q) -> samples ŷ -> inverse transform x̂ = T^{-1}(ŷ) -> reconstructed image x̂
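The chain T, Q, Q^{-1}, T^{-1} can be traced on a single pixel pair. The sketch below is an illustration under assumptions: it uses an orthonormal 2-point transform, a plain uniform quantizer with a hypothetical step size of 10, and it omits the lossless encoder C and decoder C^{-1} because they do not change the indices.

```python
import math

STEP = 10.0   # hypothetical quantizer step size


def forward(x):
    """Orthonormal 2-point transform: scaled sum and scaled difference."""
    s = 1 / math.sqrt(2)
    return [s * (x[0] + x[1]), s * (x[1] - x[0])]


def inverse(y):
    s = 1 / math.sqrt(2)
    return [s * (y[0] - y[1]), s * (y[0] + y[1])]


def quantize(y, step=STEP):
    """q = Q(y): round each sample to the nearest multiple of step (not invertible)."""
    return [math.floor(v / step + 0.5) for v in y]


def dequantize(q, step=STEP):
    """y_hat = Q^-1(q): map each index back to its reconstruction level."""
    return [index * step for index in q]


x = [198.0, 202.0]
y = forward(x)
q = quantize(y)            # these indices are what the entropy coder would carry
y_hat = dequantize(q)
x_hat = inverse(y_hat)
error = math.sqrt(sum((a - b) ** 2 for a, b in zip(x, x_hat)))
print(q, x_hat, error)
```

Since the transform is orthonormal, the reconstruction error in the pixel domain equals the quantization error in the transform domain, so it cannot exceed STEP / sqrt(2) for two coefficients.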
DCT in Image Coding
Original 8x8 block:
198 202 194 179 180 184 196 168
187 196 192 181 182 185 189 174
188 184 185 188 193 182 179 187
188 183 188 186 187 195 170 174
194 193 189 187 180 183 181 185
193 195 193 192 170 189 187 181
181 185 183 180 175 184 185 176
195 185 177 178 170 179 195 175

DCT => Transformed 8x8 block (rows 1 to 6):
1480  26.0   9.5   8.9 -26.4  15.1  -8.1
11.0   8.3  -8.2   3.8  -8.4  -6.0  -2.8  10.6
-5.5  10.7   4.5   9.8   9.0   4.9   5.3  -8.3
-8.0  -2.1   4.0  -1.9  -5.1   2.8   4.9  -8.1
 1.6   1.4   8.2   4.3   3.4   4.1  -7.9   1.0
-4.5  -5.0  -6.4   4.1  -4.4   1.8  -3.2   2.1
Remaining coefficient values (last entry of row 1 and rows 7 to 8; positions not recoverable from the transcript):
1.1 5.7 5.9 -3.0 5.8 2.5 2.4 -1.0 2.8 0.7 -2.0 4.1 5.9 -6.1 0.3 3.2 6.0

Q => Quantized 8x8 block:
185   3   1   1  -3   2  -1   0
  1   1  -1   0  -1   0   0   1
  0   1   0   1   1   0   0  -1
 -1   0   0   0   0   0   0  -1
  0   0   1   0   0   0  -1   0
  0   0   0   0   0   0   0   0
  0   0   0   0   0   0   0   0
  0   0   0   0   0   0   0   0

Zig-zag scan => Run-level coding:
Mean of Block: 185
(0,3) (0,1) (1,1) (0,1) (0,1) (0,1) (0,-1) (1,1)
(1,1) (0,1) (1,-3) (0,2) (0,-1) (6,1) (0,-1) (0,-1)
(1,-1) (14,1) (9,-1) (0,-1) EOB

Transmission => Inverse zig-zag scan (recovers the quantized 8x8 block above)

Scaling and inverse DCT => Reconstructed 8x8 block:
192 201 195 184 177 184 193 174
189 191 195 182 182 187 190 171
188 189 185 188 190 185 181 183
185 183 187 182 189 190 171 175
191 192 186 189 179 182 188 178
190 191 189 190 177 186 184 179
189 188 185 184 175 186 187 179
189 188 178 176 173 183 193 180
DCT in Image Coding

Uniform deadzone quantizer
• Transform coefficients that fall below a threshold are discarded.
Entropy coding
• Positions of non-zero transform coefficients are transmitted in addition to their amplitude values.
• Efficient encoding of the positions of non-zero transform coefficients: zig-zag scan + run-level coding
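Zig-zag scanning followed by run-level coding can be sketched as below. The scan direction (first step to the right, as in JPEG), the separate handling of the DC value, the function names, and the 4x4 example block are assumptions for illustration; the actual scan tables and code words depend on the standard in use.

```python
def zigzag_order(size):
    """(row, col) visiting order along anti-diagonals, starting (0,0), (0,1), (1,0)."""
    order = []
    for s in range(2 * size - 1):
        diagonal = [(r, s - r) for r in range(size) if 0 <= s - r < size]
        if s % 2 == 0:
            diagonal.reverse()      # even diagonals run from bottom-left to top-right
        order.extend(diagonal)
    return order


def run_level_encode(block):
    """Return the DC value and (zero run, level) pairs for the AC coefficients."""
    scanned = [block[r][c] for r, c in zigzag_order(len(block))]
    pairs, run = [], 0
    for level in scanned[1:]:
        if level == 0:
            run += 1
        else:
            pairs.append((run, level))
            run = 0
    return scanned[0], pairs        # trailing zeros are implied by EOB


def run_level_decode(dc, pairs, size):
    scanned = [dc]
    for run, level in pairs:
        scanned.extend([0] * run)
        scanned.append(level)
    scanned.extend([0] * (size * size - len(scanned)))   # zeros after EOB
    block = [[0] * size for _ in range(size)]
    for value, (r, c) in zip(scanned, zigzag_order(size)):
        block[r][c] = value
    return block


quantized = [
    [12, 3, 0, 0],
    [1, 0, 0, 0],
    [0, -1, 0, 0],
    [0, 0, 0, 0],
]
dc, pairs = run_level_encode(quantized)
decoded = run_level_decode(dc, pairs, size=4)
print(dc, pairs)   # 12 [(0, 3), (0, 1), (5, -1)]
```

The long runs of zeros in the high-frequency corner cost nothing once the last non-zero level has been sent, which is what makes the scan order effective.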
DCT Examples

Note that only a few coefficients have sizable values.
[Figure: four 3-D plots over an 8x8 grid, value axis from -30 to 30: image block; DCT coefficients of block; quantized DCT coefficients of block; block reconstructed from quantized coefficients]
DCT coding with increasingly coarse quantization, block size 8x8
[Figure: three decoded images with quantizer step size for AC coefficients of 25, 100, and 200]
5. Implementation

Implementation issue





HW or SW
Computational Cost, Speed, Implementation Size
Performance Cost
Implementation complexity
SW Implementation decision factors



Computational cost of multiplication
Fixed-point or floating-point operation (esp. multiplication)
Special Coprocessor and Instruction set (e.g. MMX)
Fast DCT Algorithm

Original DCT/IDCT

Computation load
• 8-point 1D DCT: 64 additions + 64 multiplications
• 8 (7) additions + 8 multiplications per coefficient (from the equation)

Scaling
• input range [0, 255] => output range [-2024, 2024]

Fast DCT



Similar to the fast DFT (FFT)
Shares common computations between nodes.
O(N x N) => O(N log2 N)
• N: width (number of coefficients)
• log2 N: steps of the algorithm

Several versions: Chen, Lee, Arai, etc.
Chen’s FDCT
See Code at http://www.cmlab.csie.ntu.edu.tw/~chenhsiu/tech/fastdct.cpp

How does the fast algorithm work?

Exploiting the symmetry of the cosine function (N = 8).
$$ F(2) = \frac{1}{2} \Big[ f_0 \cos\tfrac{\pi}{8} + f_1 \cos\tfrac{3\pi}{8} + f_2 \cos\tfrac{5\pi}{8} + f_3 \cos\tfrac{7\pi}{8} + f_4 \cos\tfrac{9\pi}{8} + f_5 \cos\tfrac{11\pi}{8} + f_6 \cos\tfrac{13\pi}{8} + f_7 \cos\tfrac{15\pi}{8} \Big] $$
$$ 2F(2) = (f_0 - f_4 + f_7 - f_3) \cos\tfrac{\pi}{8} + (f_1 - f_2 - f_5 + f_6) \cos\tfrac{3\pi}{8} $$
$$ 2F(6) = (f_0 - f_4 + f_7 - f_3) \cos\tfrac{3\pi}{8} - (f_1 - f_2 - f_5 + f_6) \cos\tfrac{\pi}{8} $$
STEP 1
$$ D_1 = f_0 - f_4 + f_7 - f_3, \qquad D_2 = f_1 - f_2 - f_5 + f_6 $$
STEP 2
$$ 2F(2) = D_1 \cos\tfrac{\pi}{8} + D_2 \cos\tfrac{3\pi}{8}, \qquad 2F(6) = D_1 \cos\tfrac{3\pi}{8} - D_2 \cos\tfrac{\pi}{8} $$
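The two-step computation of F(2) and F(6) can be compared against the direct DCT definition. The sketch below assumes the orthonormal scaling C(k) = sqrt(2/N) = 1/2 for N = 8; the function names and the input samples are illustrative.

```python
import math


def dct_coefficient(samples, k):
    """Direct evaluation of one DCT coefficient from the definition."""
    size = len(samples)
    scale = math.sqrt((1 if k == 0 else 2) / size)
    return scale * sum(
        samples[n] * math.cos(math.pi * k * (2 * n + 1) / (2 * size))
        for n in range(size))


def fast_f2_f6(f):
    """F(2) and F(6) of an 8-point DCT using the shared terms D1 and D2."""
    cos_1 = math.cos(math.pi / 8)
    cos_3 = math.cos(3 * math.pi / 8)
    # STEP 1: additions only, shared by both coefficients
    d1 = f[0] - f[4] + f[7] - f[3]
    d2 = f[1] - f[2] - f[5] + f[6]
    # STEP 2: four cosine multiplications instead of sixteen
    f2 = 0.5 * (d1 * cos_1 + d2 * cos_3)
    f6 = 0.5 * (d1 * cos_3 - d2 * cos_1)
    return f2, f6


samples = [198, 202, 194, 179, 180, 184, 196, 168]
fast_2, fast_6 = fast_f2_f6(samples)
direct_2 = dct_coefficient(samples, 2)
direct_6 = dct_coefficient(samples, 6)
print(fast_2, direct_2)
print(fast_6, direct_6)
```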
HW Implementation

2D DCT using 1D DCT Function Block
[Block diagram: input samples -> MUX -> 1-D DCT -> output coefficients; an 8x8 RAM with row order input and column order output feeds the 1-D DCT results back to the MUX]

Distributed Arithmetic DCT


Multiplier-less architecture
Lookup, Shift, accumulators only
[Block diagram: 4 bits from the u inputs -> LUT (ROM) -> add or subtract -> accumulator with shift (2^-1) -> output coefficient Fx]
IDCT Mismatch

DCT x IDCT = I ?




DCT is defined in "floating point" and "direct form."
Integer implementation induces an error after the inverse DCT.
Different fast DCT implementations produce different errors.
DCT mismatch in MC-DCT
Different reference images at the encoder and the decoder
Very small error, but it accumulates.
[Block diagram: encoder orgE -> DCT -> Q -> VLC, with local decoding IQ -> IDCT_E -> recE; decoder VLD -> IQ -> IDCT_D -> recD. recE and recD should be equal, but mismatch!]

IDCT Mismatch control


Minimum accuracy of the DCT algorithm is defined in the SPEC.
H.261/3, MPEG-1/2 restrict the sum of coefficient values
• Oddification rule on the sum of all DCT coefficients
• Adjust the LSB of F[63], the last coefficient, to make the sum odd
• Decoder checks and corrects the values

H.264
• (modified) Integer DCT is used
Adds random error cancellation
Appendix
KL Transform, The Optimal Transform
Optimal Transform

Optimality



(No) Redundancy in input signal => (No) Redundant Quantization
Result
No cross-correlation between different components (coefs)
K-L (Karhunen-Loeve) transform

Assumption
• Input covariance is given
$$ R_{X,X} = E[X X^{*t}] $$
Problem Definition
• Find a transform (Y = T X) such that R_{Y,Y} = T R_{X,X} T^{*t} is a diagonal matrix (i.e., completely uncorrelated Y)
$$ R_{Y,Y} = E[Y Y^{*t}] = E[T X (T X)^{*t}] = T R_{X,X} T^{*t} = \mathrm{diag}\{\lambda_k\} = \begin{bmatrix} \lambda_0 & & 0 \\ & \ddots & \\ 0 & & \lambda_{N-1} \end{bmatrix} $$
Optimal Transform

Solution
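• Build T with the eigenvectors \phi_k of R_{X,X} as basis vectors
$$ T_{optimal} = [\phi_0 \cdots \phi_{N-1}]^{*t} $$
• Then, by the definition of eigenvectors and eigenvalues (of R_{X,X})
– $R_{X,X} \phi_k = \lambda_k \phi_k, \quad k = 0, 1, \ldots, N-1$
– $R_{X,X} [\phi_0 \cdots \phi_{N-1}] = [\lambda_0 \phi_0 \cdots \lambda_{N-1} \phi_{N-1}] = [\phi_0 \cdots \phi_{N-1}] \, \mathrm{diag}\{\lambda_0, \ldots, \lambda_{N-1}\}$
• So,
$$ R_{Y,Y} = [\phi_0 \cdots \phi_{N-1}]^{*t} R_{X,X} [\phi_0 \cdots \phi_{N-1}] = [\phi_0 \cdots \phi_{N-1}]^{*t} [\phi_0 \cdots \phi_{N-1}] \, \mathrm{diag}\{\lambda_k\} = I \cdot \mathrm{diag}\{\lambda_k\} = \begin{bmatrix} \lambda_0 & & 0 \\ & \ddots & \\ 0 & & \lambda_{N-1} \end{bmatrix} $$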
Issue in KLT
• R_{X,X} varies from image to image: a new T must be calculated and transmitted to the decoder
• Not separable (vertical, horizontal)
• But good for benchmarking the performance of other transforms.
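For two neighboring pixels the KLT can be written down by hand, which ties the appendix back to the 45 degree rotation at the start of the lecture. The sketch below assumes a unit-variance covariance matrix with a hypothetical correlation rho = 0.95; for this matrix the eigenvectors are (1, 1)/sqrt(2) and (-1, 1)/sqrt(2), with eigenvalues 1 + rho and 1 - rho.

```python
import math

RHO = 0.95   # hypothetical correlation between neighboring pixels


def matmul(a, b):
    cols = list(zip(*b))
    return [[sum(x * y for x, y in zip(row, col)) for col in cols] for row in a]


def transpose(matrix):
    return [list(col) for col in zip(*matrix)]


# Covariance of two unit-variance samples with correlation RHO
R_x = [[1.0, RHO], [RHO, 1.0]]

# Rows of T are the eigenvectors of R_x (real case, so T^{*t} = T^t)
s = 1 / math.sqrt(2)
T = [[s, s], [-s, s]]

# Check the eigenvector property R_x * phi_k = lambda_k * phi_k
eigenvalues = [1 + RHO, 1 - RHO]
residuals = []
for phi, lam in zip(T, eigenvalues):
    product = [sum(r * p for r, p in zip(row, phi)) for row in R_x]
    residuals.append(max(abs(a - lam * b) for a, b in zip(product, phi)))

# Covariance after the transform: R_y = T R_x T^t, expected to be diagonal
R_y = matmul(matmul(T, R_x), transpose(T))
print(R_y)
```

The off-diagonal terms of R_y vanish, and nearly all of the variance ends up in the first coefficient. For real images the covariance has to be estimated per image or per class, which is the practical obstacle listed above.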