Transform Coding
Heejune AHN
Embedded Communications Laboratory
Seoul National Univ. of Technology
Fall 2013
Last updated 2013. 9. 30
Agenda
Transform Coding Concept
Transform Theory Review
DCT (Discrete Cosine Transform)
DCT in Video coding
DCT Implementation & Fast Algorithms
Appendix: KL Transform
Heejune AHN: Image and Video Compression
1. Transform Coding
X1 = lum(2n), X2 = lum(2n+1): luminance of neighboring pixels
X1 ~ U(0, 255), X2 ~ U(0, 255)
Quantizing X1 and X2 separately => nearly the same data is coded twice
X1 and X2 are strongly cross-correlated
45 degree rotation: (X1, X2) => (Y1, Y2)
Y1 = (X1 + X2) / 2
• Average or DC value
Y2 = (X2 - X1) / 2
• Difference or AC value
Y1 ~ F(0, 255), Y2 ~ F(-255, 255)
[Figure: scatter plot of (x1, x2) on axes X1, X2 from 0 to 255, with the 45-degree rotated axes Y1, Y2]
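The effect of the average/difference pair can be sketched numerically. This is a minimal illustration, not data from the slides: `synthetic_luminance` is a hypothetical stand-in for real luminance samples (a slowly varying signal clipped to [0, 255]), and the step size of 4 is an arbitrary assumption.

```python
import random

def synthetic_luminance(length, seed=0):
    """Slowly varying signal in [0, 255]; a stand-in for real luminance samples."""
    rng = random.Random(seed)
    value, samples = 128.0, []
    for _ in range(length):
        value = min(255.0, max(0.0, value + rng.uniform(-4.0, 4.0)))
        samples.append(value)
    return samples

def variance(values):
    mean = sum(values) / len(values)
    return sum((v - mean) ** 2 for v in values) / len(values)

def pair_variances(signal):
    """Variances of neighbor pairs (X1, X2) and of their average/difference (Y1, Y2)."""
    x1, x2 = signal[0::2], signal[1::2]
    y1 = [(a + b) / 2 for a, b in zip(x1, x2)]  # average or DC value
    y2 = [(b - a) / 2 for a, b in zip(x1, x2)]  # difference or AC value
    return {"X1": variance(x1), "X2": variance(x2),
            "Y1": variance(y1), "Y2": variance(y2)}

stats = pair_variances(synthetic_luminance(10000))
for name, value in stats.items():
    print(f"var({name}) = {value:.1f}")
```

For a signal like this, Y1 keeps roughly the variance of X1 and X2 while Y2 has a much smaller variance, which is why Y2 can be quantized with far fewer levels.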
Which ones are easier to encode (quantize)?
[Figure: histograms f(X1) and f(X2) over 0 to 255; f(Y1) over 0 to 255 and f(Y2) over -255 to 255]
Origins of Transform Coding Benefits
Signal Theory
• Make the representation easier to manipulate
• energy concentration
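$b_{k,l} = \frac{B}{N^2} + \frac{1}{2}\log_2\left(\sigma_{k,l}^2 / \rho^2\right)$, with $\rho^2 = \left(\prod_{k,l} \sigma_{k,l}^2\right)^{1/N^2}$
(B: total bits for the N x N block, $\sigma_{k,l}^2$: variance of coefficient (k, l))
[Photo: Vilfredo Pareto, economist, 1848-1923]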
Image and HVS Properties
• The HVS (human visual system) is more sensitive to low frequencies
• So low frequencies get a finer (denser) quantizer
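The link between energy concentration and bit allocation can be sketched with a small calculation. This assumes the usual high-rate allocation rule, b_i = B/M + (1/2) log2(var_i / rho^2), where rho^2 is the geometric mean of the M coefficient variances; the variances below are hypothetical, and `allocate_bits` is an illustrative helper.

```python
import math

def allocate_bits(variances, total_bits):
    """b_i = B/M + 0.5 * log2(var_i / rho^2), rho^2 = geometric mean of the variances."""
    count = len(variances)
    log_geo_mean = sum(math.log2(v) for v in variances) / count
    return [total_bits / count + 0.5 * (math.log2(v) - log_geo_mean)
            for v in variances]

variances = [16.0, 4.0, 1.0, 0.25]   # hypothetical coefficient variances
bits = allocate_bits(variances, total_bits=8)
print(bits)
print(sum(bits))  # the allocations add up to the total budget
```

Coefficients with large variance receive more bits. For very small variances the rule can return negative values, which a practical allocator would have to clip to zero and redistribute.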
2. Transform Theory Review
Definition of Transform
Linear Transform (cf. Non-Linear Transform)
M to N mapping, [Y1, Y2, . . ., YN] = F [X1, X2, . . ., XM]
if [Y11, Y12] = F [X11, X12] and [Y21, Y22] = F [X21, X22], then
[Y11 + Y21, Y12 + Y22] = F [X11 + X21, X12 + X22]
Matrix representation of Linear Transform
Forward: y = T x
• y: N transform coefficients, arranged as a vector
• T: transform matrix of size NxN
• x: input signal block of size N, arranged as a vector
Inverse: x = T^{-1} y
Basis Vectors
Orthogonal
[Figure: basis vectors v1, v2, v3, ..., vN]
• vl · vm = 0 for l ≠ m, for basis vectors v1, v2, . . ., vN
• The basis vectors are mutually perpendicular, so each coefficient measures a separate component.
Orthonormal
• || vl || = 1 for basis vectors v1, v2, . . ., vN
• $T^{-1} = T^T$ => $x = T^{-1} y = T^T y$
Parseval's Theorem
• Signal energy is conserved between the signal domain and the transform domain
• $\|y\|^2 = y^T y = x^T T^T T x = \|x\|^2$
Example of Orthonormal transform
$T_{45^\circ\,\text{rotation}} = \begin{bmatrix} \cos 45^\circ & \sin 45^\circ \\ -\sin 45^\circ & \cos 45^\circ \end{bmatrix} = \frac{1}{\sqrt{2}} \begin{bmatrix} 1 & 1 \\ -1 & 1 \end{bmatrix}$
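The orthonormality of the 45-degree rotation and Parseval's theorem can be checked numerically. This is a minimal sketch; the sign convention of T (second row proportional to X2 - X1) and the sample input are assumptions.

```python
import math

c = math.cos(math.radians(45))
s = math.sin(math.radians(45))
T = [[c, s], [-s, c]]  # assumed sign convention: second output ~ X2 - X1

def matvec(matrix, vector):
    return [sum(a * b for a, b in zip(row, vector)) for row in matrix]

def transpose(matrix):
    return [list(col) for col in zip(*matrix)]

def energy(vector):
    return sum(v * v for v in vector)

x = [200.0, 190.0]                # hypothetical pair of neighboring pixels
y = matvec(T, x)                  # forward: y = T x
x_back = matvec(transpose(T), y)  # inverse: x = T^T y, since T^-1 = T^T

print(y)
print(energy(x), energy(y))       # equal up to rounding (Parseval)
```

Almost all of the energy ends up in the first coefficient, and the inverse needs no matrix inversion because the transpose is the inverse.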
2D Transform
Data
2D pixel value matrix, 2D transform coefs matrix
2D matrix => 1D vector
Forward Transform
$F(k, l) = \sum_{n=0}^{N-1} \sum_{m=0}^{N-1} f(n, m)\, t_{k,l}(n, m)$
y = T x
• y: NxN transform coefficients, arranged as a vector
• T: transform matrix of size N^2 x N^2
• x: input signal block of size NxN, arranged as a vector
Inverse transform
x = T^{-1} y
3. Transforms
Various transforms in image compression
DFT (Discrete Fourier Transform)
DCT (Discrete cosine Transform)
DST (Discrete sine Transform)
Hadamard Transform
Discrete Wavelet Transform
and more (Haar, etc.)
Hadamard transform
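Core Matrix (order 1)
$H_1 = \frac{1}{\sqrt{2}} \begin{bmatrix} 1 & 1 \\ 1 & -1 \end{bmatrix}$
Order N
$H_n = H_1 \otimes H_{n-1} = \frac{1}{\sqrt{2}} \begin{bmatrix} H_{n-1} & H_{n-1} \\ H_{n-1} & -H_{n-1} \end{bmatrix}$, where $n = \log_2 N$, $N = 2^n$, $n = 1, 2, \ldots$
$\otimes$: Kronecker product
Example for n = 2
$H_2 = H_1 \otimes H_1 = \frac{1}{\sqrt{4}} \begin{bmatrix} 1 & 1 & 1 & 1 \\ 1 & -1 & 1 & -1 \\ 1 & 1 & -1 & -1 \\ 1 & -1 & -1 & 1 \end{bmatrix}$
2-D Transform
$Y_{N \times N} = H X_{N \times N} H^t = H X_{N \times N} H$, since $H = H^* = H^t = H^{-1}$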
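The recursive construction can be sketched in a few lines. This is an illustration under the assumption that the normalized recursion H_n = H_1 (x) H_{n-1} starts from H_0 = [1]; `hadamard` and `matmul` are hypothetical helper names, and the flat input block is made up.

```python
import math

def hadamard(n):
    """Normalized Hadamard matrix H_n of size 2^n, built as H_n = H_1 (x) H_{n-1}."""
    h = [[1.0]]                       # H_0
    scale = 1.0 / math.sqrt(2.0)
    for _ in range(n):
        top = [[scale * v for v in row + row] for row in h]          # [ H  H ]
        bottom = [[scale * v for v in row] + [-scale * v for v in row]
                  for row in h]                                      # [ H -H ]
        h = top + bottom
    return h

def matmul(a, b):
    return [[sum(a[i][k] * b[k][j] for k in range(len(b)))
             for j in range(len(b[0]))] for i in range(len(a))]

H2 = hadamard(2)
X = [[1.0] * 4 for _ in range(4)]     # flat 4x4 block (hypothetical input)
Y = matmul(matmul(H2, X), H2)         # 2-D transform: Y = H X H^t = H X H
identity = matmul(H2, H2)             # H is its own inverse

for row in Y:
    print([round(v, 6) for v in row])
```

For the flat block, all of the energy lands in the single coefficient Y[0][0], and applying H twice returns the identity, so the same routine serves as forward and inverse transform.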
DCT Transform
1D Forward DCT (pixel domain to frequency domain)
$F(k) = C(k) \sum_{n=0}^{N-1} f(n) \cos\frac{\pi k (2n+1)}{2N}$, $k = 0, 1, \ldots, N-1$
$C(0) = \sqrt{1/N}$, $C(k \neq 0) = \sqrt{2/N}$
1D Inverse DCT (frequency domain to pixel domain)
$f(n) = \sum_{k=0}^{N-1} C(k) F(k) \cos\frac{\pi k (2n+1)}{2N}$, $0 \le n \le N-1$
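The forward and inverse 1D DCT can be written directly from their definitions. This is a straightforward O(N^2) sketch for illustration, not a fast implementation; the input row is the first row of the 8x8 example block used later in these slides.

```python
import math

def dct_1d(f):
    """Forward DCT: F(k) = C(k) * sum_n f(n) cos(pi k (2n+1) / 2N)."""
    n_pts = len(f)
    coefs = []
    for k in range(n_pts):
        c_k = math.sqrt(1.0 / n_pts) if k == 0 else math.sqrt(2.0 / n_pts)
        acc = sum(f[n] * math.cos(math.pi * k * (2 * n + 1) / (2 * n_pts))
                  for n in range(n_pts))
        coefs.append(c_k * acc)
    return coefs

def idct_1d(coefs):
    """Inverse DCT: f(n) = sum_k C(k) F(k) cos(pi k (2n+1) / 2N)."""
    n_pts = len(coefs)
    samples = []
    for n in range(n_pts):
        acc = 0.0
        for k in range(n_pts):
            c_k = math.sqrt(1.0 / n_pts) if k == 0 else math.sqrt(2.0 / n_pts)
            acc += c_k * coefs[k] * math.cos(math.pi * k * (2 * n + 1) / (2 * n_pts))
        samples.append(acc)
    return samples

row = [198, 202, 194, 179, 180, 184, 196, 168]
F = dct_1d(row)
restored = idct_1d(F)
print([round(v, 1) for v in F])
print([round(v) for v in restored])
```

With this normalization the DC coefficient is the sum of the samples divided by sqrt(N), and the round trip reproduces the input up to floating-point rounding.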
2D DCT
2D DCT basis Functions
Coef. Distribution
DC ~ Uniform dist., AC ~ Laplacian dist.
Properties
Orthonormal transform
Separable transform
Real valued coefficients
DCT performance
closely approximates the KLT for image input
• Image input model: first-order Markov chain
• x(n+1) = rho * x(n) + e(n)
DCT complexity
2D DCT = 1D DCT in the vertical direction, then 1D DCT in the horizontal direction
3D DCT is not used (because of delay and memory size)
DCT size (4x4, 8x8, 16x16, 32x32 …)
• Larger: better coding performance, but more visible blocking artifacts (?) and higher HW complexity
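The first-order Markov model can be simulated to see how the DCT concentrates energy. This is a rough sketch: rho = 0.95, unit-variance Gaussian noise e(n), the seed, and the signal length are all assumptions chosen for illustration.

```python
import math
import random

def dct_1d(f):
    n_pts = len(f)
    return [(math.sqrt(1.0 / n_pts) if k == 0 else math.sqrt(2.0 / n_pts))
            * sum(f[n] * math.cos(math.pi * k * (2 * n + 1) / (2 * n_pts))
                  for n in range(n_pts))
            for k in range(n_pts)]

def ar1_signal(length, rho, seed=0):
    """x(n+1) = rho * x(n) + e(n), with e(n) ~ N(0, 1)."""
    rng = random.Random(seed)
    x, out = 0.0, []
    for _ in range(length):
        x = rho * x + rng.gauss(0.0, 1.0)
        out.append(x)
    return out

def energy_shares(signal, block_size=8):
    """Fraction of the total energy carried by each DCT coefficient index."""
    energy = [0.0] * block_size
    for start in range(0, len(signal) - block_size + 1, block_size):
        for k, coef in enumerate(dct_1d(signal[start:start + block_size])):
            energy[k] += coef * coef
    total = sum(energy)
    return [e / total for e in energy]

shares = energy_shares(ar1_signal(16000, rho=0.95))
print([round(s, 3) for s in shares])
```

For a strongly correlated source, most of the energy falls into the lowest-frequency coefficients, which is the behavior the KLT comparison refers to.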
Coding Performance of DCT
Karhunen Loève transform [1948/1960]
Haar transform [1910]
Walsh-Hadamard transform [1923]
Slant transform [Enomoto, Shibata, 1971]
Discrete Cosine Transform (DCT) [Ahmed, Natarajan, Rao, 1974]
[Figure: comparison of 1-D basis functions for block size N=8]
Energy concentration Performance
measured for typical natural images, block size 1x32
KLT is optimum
DCT performs only slightly worse than KLT
Complexity Performance of DCT
Separation of 2D DCT
Cascading 1-D DCT
Reduction of the complexity (multiplications) from O(N^4) to O(N^3)
8x8 DCT
• Direct 2D: each of the 64 coefs needs 64 multiplications (64 x 64 = 4096)
• Separable: 2 passes x 64 coefs x 8 multiplications (2 x 64 x 8 = 1024)
Can you derive this?
[Block diagram: NxN block of pixels x -> column-wise N-transform -> A x -> row-wise N-transform -> A x A^T, the NxN block of transform coefficients]
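The multiplication counts can be derived by counting them in code. This sketch compares a direct 2D DCT with the cascaded form A x A^T; it assumes the 2D kernel of the direct form is precomputed (so each term costs one multiplication), and the pixel block is a made-up pattern.

```python
import math

def dct_matrix(n):
    """Rows are the 1-D DCT basis vectors (matrix A)."""
    return [[(math.sqrt(1.0 / n) if k == 0 else math.sqrt(2.0 / n))
             * math.cos(math.pi * k * (2 * i + 1) / (2 * n))
             for i in range(n)] for k in range(n)]

def dct2_direct(x, a):
    """Every coefficient sums over all N*N pixels: N^4 multiplications."""
    n, mults = len(x), 0
    y = [[0.0] * n for _ in range(n)]
    for k in range(n):
        for l in range(n):
            for r in range(n):
                for c in range(n):
                    # the 2-D kernel a[k][r] * a[l][c] is assumed to be precomputed
                    y[k][l] += x[r][c] * (a[k][r] * a[l][c])
                    mults += 1
    return y, mults

def dct2_separable(x, a):
    """Two cascaded 1-D passes: 2 * N^3 multiplications."""
    n, mults = len(x), 0
    ax = [[0.0] * n for _ in range(n)]
    for k in range(n):              # column-wise pass: A x
        for c in range(n):
            for r in range(n):
                ax[k][c] += a[k][r] * x[r][c]
                mults += 1
    y = [[0.0] * n for _ in range(n)]
    for k in range(n):              # row-wise pass: (A x) A^T
        for l in range(n):
            for c in range(n):
                y[k][l] += ax[k][c] * a[l][c]
                mults += 1
    return y, mults

A = dct_matrix(8)
block = [[(3 * r + 5 * c) % 17 for c in range(8)] for r in range(8)]  # hypothetical pixels
y_direct, direct_mults = dct2_direct(block, A)
y_sep, sep_mults = dct2_separable(block, A)
print(direct_mults, sep_mults)  # 4096 1024
```

Both routines produce the same coefficients; only the grouping of the sums differs, which is where the factor of N/2 in multiplications is saved.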
4. Transform in Image Coding
Transform coding Procedure
Transform T(x): usually invertible
Quantization: not invertible, introduces distortion
Combination of encoder and decoder (c = C(q), q = C^{-1}(c)): lossless
[Block diagram]
Sender: image x -> transform, y = T(x) -> samples y -> quantizer, q = Q(y) -> indices q -> encoder, c = C(q) -> bit-stream c
Receiver: bit-stream c -> decoder, q = C^{-1}(c) -> indices q -> dequantizer, ŷ = Q^{-1}(q) -> samples ŷ -> inverse transform, x̂ = T^{-1}(ŷ) -> reconstructed image x̂
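The procedure can be sketched end to end for a single 1D block. This is a simplified illustration: it assumes an 8-point DCT as the transform, a plain uniform quantizer with a hypothetical step size of 8, and it leaves out the lossless entropy coder, since that stage does not change the indices q.

```python
import math

def dct_matrix(n):
    return [[(math.sqrt(1.0 / n) if k == 0 else math.sqrt(2.0 / n))
             * math.cos(math.pi * k * (2 * i + 1) / (2 * n))
             for i in range(n)] for k in range(n)]

def matvec(matrix, vector):
    return [sum(a * b for a, b in zip(row, vector)) for row in matrix]

def transpose(matrix):
    return [list(col) for col in zip(*matrix)]

def encode(x, t, step):
    y = matvec(t, x)                        # transform: y = T(x)
    return [round(v / step) for v in y]     # quantizer: q = Q(y), not invertible

def decode(q, t, step):
    y_hat = [level * step for level in q]   # dequantizer: y_hat = Q^-1(q)
    return matvec(transpose(t), y_hat)      # inverse transform, T^-1 = T^t

x = [198, 202, 194, 179, 180, 184, 196, 168]
T = dct_matrix(8)
q = encode(x, T, step=8.0)
x_hat = decode(q, T, step=8.0)
print(q)
print([round(v) for v in x_hat])
```

The only loss comes from the rounding in `encode`. Because the transform is orthonormal, the reconstruction error energy in the pixel domain equals the quantization error energy in the coefficient domain.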
DCT in Image Coding
Original 8x8 block
198 202 194 179 180 184 196 168
187 196 192 181 182 185 189 174
188 184 185 188 193 182 179 187
188 183 188 186 187 195 170 174
194 193 189 187 180 183 181 185
193 195 193 192 170 189 187 181
181 185 183 180 175 184 185 176
195 185 177 178 170 179 195 175
DCT => Transformed 8x8 block (rows 1 to 6)
1480   26.0    9.5    8.9  -26.4   15.1   -8.1      .
11.0    8.3   -8.2    3.8   -8.4   -6.0   -2.8   10.6
-5.5   10.7    4.5    9.8    9.0    4.9    5.3   -8.3
-8.0   -2.1    4.0   -1.9   -5.1    2.8    4.9   -8.1
 1.6    1.4    8.2    4.3    3.4    4.1   -7.9    1.0
-4.5   -5.0   -6.4    4.1   -4.4    1.8   -3.2    2.1
Remaining entries (last entry of row 1 and rows 7 to 8; positions not recoverable from the transcript):
1.1 5.7 5.9 -3.0 5.8 2.5 2.4 -1.0 2.8 0.7 -2.0 4.1 5.9 -6.1 0.3 3.2 6.0
Q => Quantized 8x8 block (first entry is the mean of the block)
185   3   1   1  -3   2  -1   0
  1   1  -1   0  -1   0   0   1
  0   1   0   1   1   0   0  -1
 -1   0   0   0   0   0   0  -1
  0   0   1   0   0   0  -1   0
  0   0   0   0   0   0   0   0
  0   0   0   0   0   0   0   0
  0   0   0   0   0   0   0   0
Zig-zag scan => Run-level coding
Mean of Block: 185
(0,3) (0,1) (1,1) (0,1) (0,1) (0,1) (0,-1) (1,1)
(1,1) (0,1) (1,-3) (0,2) (0,-1) (6,1) (0,-1) (0,-1)
(1,-1) (14,1) (9,-1) (0,-1) EOB
Transmission
Inverse zig-zag scan => Scaling and inverse DCT
Reconstructed 8x8 block
192 201 195 184 177 184 193 174
189 191 195 182 182 187 190 171
188 189 185 188 190 185 181 183
185 183 187 182 189 190 171 175
191 192 186 189 179 182 188 178
190 191 189 190 177 186 184 179
189 188 185 184 175 186 187 179
189 188 178 176 173 183 193 180
DCT in Image Coding
Uniform deadzone quantizer
Transform coefficients that fall below a threshold are discarded (quantized to zero).
Entropy coding
Positions of non-zero transform coefficients are transmitted in addition to their amplitude values.
Efficient encoding of the positions of non-zero transform coefficients: zig-zag scan + run-level coding
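The three steps (deadzone quantization, zig-zag scan, run-level coding) can be sketched as follows. This is an illustrative version under stated assumptions: the deadzone quantizer is modeled as truncation toward zero, the zig-zag order is the common one that starts toward the right, and the 4x4 block is hypothetical rather than taken from the slides.

```python
def deadzone_quantize(value, step):
    """Truncate toward zero, so everything with |value| < step becomes 0."""
    return int(value / step)

def zigzag_order(n):
    """(row, col) positions of an n x n block in zig-zag order."""
    order = []
    for s in range(2 * n - 1):               # s = row + col selects one anti-diagonal
        diag = [(r, s - r) for r in range(n) if 0 <= s - r < n]
        # even diagonals are walked from bottom-left to top-right, odd ones the other way
        order.extend(reversed(diag) if s % 2 == 0 else diag)
    return order

def run_level_encode(block):
    """Return the DC value and (run of zeros, level) pairs for the AC coefficients."""
    scan = [block[r][c] for r, c in zigzag_order(len(block))]
    pairs, run = [], 0
    for level in scan[1:]:
        if level == 0:
            run += 1
        else:
            pairs.append((run, level))
            run = 0
    return scan[0], pairs                    # trailing zeros are implied by EOB

quantized = [[23, 3, 0, 0],                  # hypothetical quantized 4x4 block
             [1, 0, 0, 0],
             [0, -1, 0, 0],
             [0, 0, 0, 0]]
dc, pairs = run_level_encode(quantized)
print(dc, pairs, "EOB")  # 23 [(0, 3), (0, 1), (5, -1)] EOB
```

Because the zig-zag scan visits low frequencies first, the non-zero levels cluster at the start of the scan and the long tail of zeros collapses into the single EOB symbol.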
DCT Examples
Note that only a few coefficients have sizable values.
[Figure: plots over the 8x8 block, vertical axis from -30 to 30: image block; DCT coefficients of block; quantized DCT coefficients of block; block reconstructed from quantized coefficients]
DCT coding with increasingly coarse quantization, block size 8x8
[Figure: decoded images for quantizer step sizes for AC coefficients of 25, 100, and 200]
5. Implementation
Implementation issue
HW or SW
Computational Cost, Speed, Implementation Size
Performance Cost
Implementation complexity
SW Implementation decision factors
Computational cost of multiplication
Fixed-point or floating-point operation (esp. multiplication)
Special coprocessor and instruction set (e.g. MMX)
Fast DCT Algorithm
Original DCT/IDCT
Computation load
• 64 Add + 64 Mult.
• 8 (7) Addition + 8 multiplication / one coeff. (from eqn.)
Scaling
• input range [0, 255] => output range [-2024, 2024]
Fast DCT
Similar to the fast DFT (FFT)
Shares the same computation between nodes.
O(N^2) => O(N log2 N)
• N : Width (num of coeff.)
• log2 N : Steps of algorithm
Several versions: Chen, Lee, Arai, etc.
Chen’s FDCT
See Code at http://www.cmlab.csie.ntu.edu.tw/~chenhsiu/tech/fastdct.cpp
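How does the fast algorithm work?
Exploiting the symmetry of the cosine function (N = 8).
$F(2) = \frac{1}{2}\left[ f_0\cos\frac{\pi}{8} + f_1\cos\frac{3\pi}{8} + f_2\cos\frac{5\pi}{8} + f_3\cos\frac{7\pi}{8} + f_4\cos\frac{9\pi}{8} + f_5\cos\frac{11\pi}{8} + f_6\cos\frac{13\pi}{8} + f_7\cos\frac{15\pi}{8} \right]$
$2F(2) = (f_0 - f_4 + f_7 - f_3)\cos\frac{\pi}{8} + (f_1 - f_2 - f_5 + f_6)\cos\frac{3\pi}{8}$
$2F(6) = (f_0 - f_4 + f_7 - f_3)\cos\frac{3\pi}{8} - (f_1 - f_2 - f_5 + f_6)\cos\frac{\pi}{8}$
STEP 1
$D_1 = f_0 - f_4 + f_7 - f_3$, $D_2 = f_1 - f_2 - f_5 + f_6$
STEP 2
$2F(2) = D_1\cos\frac{\pi}{8} + D_2\cos\frac{3\pi}{8}$, $2F(6) = D_1\cos\frac{3\pi}{8} - D_2\cos\frac{\pi}{8}$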
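The shared-computation idea can be checked numerically for F(2) and F(6) of the 8-point DCT. This sketch compares the definition with the two-step form built on the shared terms D1 = f0 - f4 + f7 - f3 and D2 = f1 - f2 - f5 + f6; the function names and the sample input are illustrative.

```python
import math

def dct_coef_direct(f, k):
    """F(k) of the 8-point DCT straight from the definition (8 multiplications)."""
    c_k = math.sqrt(1.0 / 8) if k == 0 else 0.5   # sqrt(2/8) = 1/2
    return c_k * sum(f[n] * math.cos(math.pi * k * (2 * n + 1) / 16)
                     for n in range(8))

def f2_f6_fast(f):
    """F(2) and F(6) from the shared terms D1, D2 (4 cosine multiplications in total)."""
    d1 = f[0] - f[4] + f[7] - f[3]                # STEP 1: additions only
    d2 = f[1] - f[2] - f[5] + f[6]
    c1, c3 = math.cos(math.pi / 8), math.cos(3 * math.pi / 8)
    f2 = 0.5 * (d1 * c1 + d2 * c3)                # STEP 2
    f6 = 0.5 * (d1 * c3 - d2 * c1)
    return f2, f6

samples = [198, 202, 194, 179, 180, 184, 196, 168]
fast_f2, fast_f6 = f2_f6_fast(samples)
print(fast_f2, dct_coef_direct(samples, 2))
print(fast_f6, dct_coef_direct(samples, 6))
```

The direct form spends 16 cosine multiplications on these two coefficients, while the two-step form needs 4, because the symmetry of the cosine lets both coefficients reuse D1 and D2.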
HW Implementation
2D DCT using 1D DCT Function Block
[Block diagram: Input sample -> MUX -> 1-D DCT -> Output coef; the 1-D DCT result is also written to an 8x8 RAM (row order input, column order output) that feeds back into the MUX]
Distributed Arithmetic DCT
Multiplier-less architecture
Lookup, Shift, accumulators only
[Block diagram: 4 bits from u input -> LUT (ROM) -> add or subtract -> accumulator with shift (2^-1) feedback -> Output coef Fx]
IDCT Mismatch
DCT x IDCT = I ?
The DCT is defined in "floating point" and "direct form."
An integer implementation induces 'error' after the inverse DCT.
Different fast DCT implementations have different 'error's.
DCT mismatch in MC-DCT
Different reference images at encoder and decoder
Very small error, but it accumulates.
[Block diagram]
Encoder: orgE -> DCT -> Q -> IQ -> IDCTE -> recE, and Q -> VLC
Decoder: VLD -> IQ -> IDCTD -> recD
recE and recD should be equal, but mismatch!
IDCT Mismatch control
Minimum accuracy of the DCT algorithm is defined in the SPEC.
H.261/3, MPEG-1/2: restrict the sum of coefficient values
• Oddification rule for the sum of all DCT coefficients
• Adjust the LSB of F[63], the last coef.
• Decoder checks and corrects the values
H.264
• (modified) Integer DCT is used
adding random error cancellation
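A sum-oddification rule of the kind listed above can be sketched as follows. This is a hedged illustration only: it assumes the coefficients are integers held in a flat list of 64 values and that the rule toggles the least significant bit of the last coefficient when the sum is even. The exact rule differs between standards and should be taken from the respective specification; `mismatch_control` is a hypothetical helper name.

```python
def mismatch_control(coefs):
    """Force the sum of the 64 coefficients to be odd by toggling the LSB of the last one."""
    out = list(coefs)
    if sum(out) % 2 == 0:
        out[63] ^= 1   # odd value -> value - 1, even value -> value + 1
    return out

even_sum = [185, 3, 1, 1] + [0] * 60      # hypothetical block, sum = 190 (even)
odd_sum = [185, 3, 1, 0] + [0] * 60       # hypothetical block, sum = 189 (odd)
fixed = mismatch_control(even_sum)
untouched = mismatch_control(odd_sum)
print(sum(fixed), fixed[63])
print(sum(untouched), untouched[63])
```

Only the highest-frequency coefficient is ever modified, so the visible effect is negligible, while encoder and decoder apply the same deterministic correction before their inverse DCTs.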
Appendix
KL Transform, The Optimal Transform
Optimal Transform
Optimality
(No) Redundancy in input signal => (No) Redundant Quantization
Result
No cross-correlation between different components (coefs)
K-L (Karhunen-Loeve) transform
Assumption
• Input covariance is given: $R_{X,X} = E[X X^{*t}]$
Problem Definition
• Find a transform (Y = T X) such that $R_{Y,Y} = T R_{X,X} T^{*t}$ is a diagonal matrix (i.e., completely uncorrelated Y)
$R_{Y,Y} = E[Y Y^{*t}] = E[T X (T X)^{*t}] = T R_{X,X} T^{*t} = \mathrm{diag}\{\lambda_k\} = \begin{bmatrix} \lambda_0 & & 0 \\ & \ddots & \\ 0 & & \lambda_{N-1} \end{bmatrix}$
Optimal Transform
Solution
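• Build T with the eigenvectors of $R_{X,X}$ as basis vectors
$T_{optimal} = [\phi_0 \cdots \phi_{N-1}]^{*t}$
• Then, by the definition of eigenvectors and eigenvalues (of $R_{X,X}$)
– $R_{X,X}\, \phi_k = \lambda_k \phi_k$, $k = 0, 1, \ldots, N-1$
– $R_{X,X}\, [\phi_0 \cdots \phi_{N-1}] = [\phi_0 \cdots \phi_{N-1}]\, \mathrm{diag}\{\lambda_0, \ldots, \lambda_{N-1}\}$
• So
$R_{Y,Y} = [\phi_0 \cdots \phi_{N-1}]^{*t} R_{X,X} [\phi_0 \cdots \phi_{N-1}] = [\phi_0 \cdots \phi_{N-1}]^{*t} [\phi_0 \cdots \phi_{N-1}]\, \mathrm{diag}\{\lambda_0, \ldots, \lambda_{N-1}\} = I\, \mathrm{diag}\{\lambda_0, \ldots, \lambda_{N-1}\}$
Issue in KLT
• $R_{X,X}$ varies from image to image: a new T must be calculated and transmitted to the decoder
• Not separable (vertical, horizontal)
• But good for benchmarking the performance of other transforms.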
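The diagonalization can be checked on the smallest possible case. This sketch assumes a 2x2 covariance matrix with unit variances and a hypothetical correlation rho = 0.9 between neighboring pixels; for this matrix the eigenvectors are known in closed form, so no numerical eigen-solver is needed (a general N x N case would require one).

```python
import math

def matmul(a, b):
    return [[sum(a[i][k] * b[k][j] for k in range(len(b)))
             for j in range(len(b[0]))] for i in range(len(a))]

def transpose(matrix):
    return [list(col) for col in zip(*matrix)]

rho = 0.9                               # assumed correlation of neighboring pixels
R_x = [[1.0, rho], [rho, 1.0]]          # input covariance R_{X,X}

s = 1.0 / math.sqrt(2.0)
T = [[s, s], [-s, s]]                   # rows are the eigenvectors of R_x

R_y = matmul(matmul(T, R_x), transpose(T))   # R_{Y,Y} = T R_{X,X} T^t
for row in R_y:
    print([round(v, 6) for v in row])
```

The result is diagonal with the eigenvalues 1 + rho and 1 - rho, and the KLT for this source is the same 45-degree rotation used in the opening example of the lecture.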