Multimedia Data
Compression
Mee Young Sung
University of Incheon
Department of Computer Science & Engineering
[email protected]
1. Lossless Compression Algorithms
   1. Introduction
   2. Basics of Information Theory
   3. Run-Length Coding
   4. Variable-Length Coding (VLC)
      1. Shannon-Fano Algorithm
      2. Huffman Coding
      3. Adaptive Huffman Coding
   5. Dictionary-Based Coding
   6. Arithmetic Coding
   7. Lossless Image Compression
   8. Differential Coding of Images
      1. Differential Coding of Images
      2. Lossless JPEG
2. Lossy Compression Algorithms
   1. Introduction
   2. Distortion Measures
   3. The Rate-Distortion Theory
   4. Quantization
      1. Uniform Scalar Quantization
      2. Nonuniform Scalar Quantization
      3. Vector Quantization
   5. Transform Coding
      1. Discrete Cosine Transform (DCT)
      2. Karhunen-Loève Transform*
   6. Wavelet-Based Coding
      1. Introduction
      2. Continuous Wavelet Transform*
      3. Discrete Wavelet Transform*
   7. Wavelet Packets
   8. Embedded Zerotree of Wavelet Coefficients
      1. The Zerotree Data Structure
      2. Successive Approximation Quantization
      3. EZW Example
   9. Set Partitioning in Hierarchical Trees (SPIHT)
3. Image Compression Standards
   1. The JPEG Standard
      1. Main Steps in JPEG Image Compression
      2. JPEG Modes
      3. A Glance at the JPEG Bitstream
   2. The JPEG2000 Standard
      1. Main Steps of JPEG2000 Image Compression*
      2. Adapting EBCOT to JPEG2000
      3. Region-of-Interest Coding
      4. Comparison of JPEG and JPEG2000 Performance
   3. The JPEG-LS Standard
      1. Prediction
      2. Context Determination
      3. Residual Coding
      4. Near-Lossless Mode
   4. Bilevel Image Compression Standards
      1. The JBIG Standard
      2. The JBIG2 Standard
4. Basic Video Compression Techniques
   1. Introduction to Video Compression
   2. Video Compression Based on Motion Compensation
   3. Search for Motion Vectors
      1. Sequential Search
      2. 2D Logarithmic Search
      3. Hierarchical Search
   4. H.261
      1. Intra-Frame (I-Frame) Coding
      2. Inter-Frame (P-Frame) Predictive Coding
      3. Quantization in H.261
      4. H.261 Encoder and Decoder
      5. A Glance at the H.261 Video Bitstream Syntax
   5. H.263
      1. Motion Compensation in H.263
      2. Optional H.263 Coding Modes
      3. H.263+ and H.263++
5. MPEG Video Coding I – MPEG-1 and 2
   1. Overview
   2. MPEG-1
      1. Motion Compensation in MPEG-1
      2. Other Major Differences from H.261
      3. MPEG-1 Video Bitstream
   3. MPEG-2
      1. Supporting Interlaced Video
      2. MPEG-2 Scalabilities
      3. Other Major Differences from MPEG-1
6. MPEG Video Coding II – MPEG-4, 7, and Beyond
   1. Overview of MPEG-4
   2. Object-Based Visual Coding in MPEG-4
      1. VOP-Based Coding vs. Frame-Based Coding
      2. Motion Compensation
      3. Texture Coding
      4. Shape Coding
      5. Static Texture Coding
      6. Sprite Coding
      7. Global Motion Compensation
   3. Synthetic Object Coding in MPEG-4
      1. 2D Mesh Object Coding
      2. 3D Model-based Coding
   4. MPEG-4 Object Types, Profiles and Levels
   5. MPEG-4 Part10/H.264
      1. Core Features
      2. Baseline Profile Features
      3. Main Profile Features
      4. Extended Profile Features
   6. MPEG-7
      1. Descriptor (D)
      2. Description Scheme (DS)
      3. Description Definition Language (DDL)
   7. MPEG-21
7. Basic Audio Compression Techniques
   1. ADPCM in Speech Coding
      1. ADPCM
   2. G.726 ADPCM
   3. Vocoders
      1. Phase Insensitivity
      2. Channel Vocoder
      3. Formant Vocoder
      4. Linear Predictive Coding
      5. CELP
      6. Hybrid Excitation Vocoders*
8. MPEG Audio Compression
   1. Psychoacoustics
      1. Equal-Loudness Relations
      2. Frequency Masking
      3. Temporal Masking
   2. MPEG Audio
      1. MPEG Layers
      2. MPEG Audio Strategy
      3. MPEG Audio Compression Algorithm
      4. MPEG-2 AAC (Advanced Audio Coding)
      5. MPEG-4 Audio
   3. Other Commercial Audio Codecs
   4. The Future: MPEG-7 and MPEG-21
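Run-Length Coding
• Bit string
00 ... 00100 ... 001100 ... 00100 ... 001100 ... 00 : 89 bits
Lengths of the runs of 0 bits: 14, 9, 20, 30, 9 (the two arrows mark the "11" pairs, where no 0 bit lies between the 1s)
• An example of run-length encoding
Run length (binary):  1110 1001 0000 1111 0101 1111 1111 0000 0000 1001 : 40 bits
Run length (decimal): 14   9    0    15   5    15   15   0    0    9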
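The 40-bit result above can be reproduced with a small sketch. It assumes, as one reading of the slide, that every run of 0 bits is terminated by a single 1 bit (so each "11" pair contributes a run of length 0) and that a 4-bit symbol of 15 means "15 zeros, and the run continues in the next symbol". The function name `encode_zero_runs` is illustrative only.

```python
def encode_zero_runs(runs, bits=4):
    """Encode the lengths of 0-runs as fixed-width symbols.

    A symbol equal to the maximum value (15 for 4 bits) is read as
    "15 zeros so far, the run continues in the next symbol".
    """
    max_symbol = (1 << bits) - 1
    symbols = []
    for run in runs:
        while run >= max_symbol:
            symbols.append(max_symbol)
            run -= max_symbol
        symbols.append(run)
    return symbols


# Assumption: each 0-run ends with one 1 bit, so the two "11" pairs
# show up as runs of length 0.
runs = [14, 9, 0, 20, 30, 0, 9]
bitstring = "".join("0" * r + "1" for r in runs)
symbols = encode_zero_runs(runs)
encoded = " ".join(format(s, "04b") for s in symbols)

print(len(bitstring), "bits before encoding")
print(symbols)
print(encoded)
```

Under these assumptions the original string has 89 bits and the symbol sequence matches the decimal row of the slide, which suggests the run of 20 is split as 15 + 5 and the run of 30 as 15 + 15 + 0.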
Huffman Coding
• Encoding for Huffman Algorithm: a bottom-up approach
1. Initialization: Put all nodes in an OPEN list, keep it sorted at all times (e.g., ABCDE).
2. Repeat until the OPEN list has only one node left:
(a) From OPEN, pick the two nodes having the lowest frequencies/probabilities and create a parent node for them.
(b) Assign the sum of the children's frequencies/probabilities to the parent node and insert it into OPEN.
(c) Assign codes 0 and 1 to the two branches of the tree, and delete the children from OPEN.

Symbol   Count   log2(1/pi)   Code   Subtotal (# of bits)
A        15      1.38         0      15
B        7       2.48         100    21
C        6       2.70         101    18
D        6       2.70         110    18
E        5       2.96         111    15
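The bottom-up procedure can be sketched in a few lines. This is a minimal illustration, not the slide author's implementation: it keeps the OPEN list in a binary heap (Python's `heapq`) rather than a sorted list, and the name `huffman_codes` is made up for the example. The exact bit patterns depend on how ties are broken, so only the code lengths are expected to agree with the table.

```python
import heapq
from itertools import count


def huffman_codes(freqs):
    """Build a Huffman code for a {symbol: count} mapping."""
    tiebreak = count()  # keeps heap entries comparable when counts are equal
    heap = [(f, next(tiebreak), sym) for sym, f in freqs.items()]
    heapq.heapify(heap)
    while len(heap) > 1:
        # (a) pick the two nodes with the lowest counts
        f1, _, left = heapq.heappop(heap)
        f2, _, right = heapq.heappop(heap)
        # (b) the parent carries the sum and goes back into OPEN
        heapq.heappush(heap, (f1 + f2, next(tiebreak), (left, right)))
    codes = {}

    def walk(node, prefix):
        if isinstance(node, tuple):      # internal node: (left, right)
            walk(node[0], prefix + "0")  # (c) label the branches 0 and 1
            walk(node[1], prefix + "1")
        else:
            codes[node] = prefix or "0"  # single-symbol edge case

    walk(heap[0][2], "")
    return codes


freqs = {"A": 15, "B": 7, "C": 6, "D": 6, "E": 5}
codes = huffman_codes(freqs)
total_bits = sum(freqs[s] * len(codes[s]) for s in freqs)
print(codes, total_bits)
```

For the counts in the table, A receives a 1-bit code and B to E receive 3-bit codes, so the subtotals add up to 87 bits for the 39 symbols.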
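Huffman Coding
• Together with arithmetic coding, a technique that encodes data using statistical methods in order to minimize the amount of data produced.
• Rather than using a fixed number of bits for every character, it exploits the statistical distribution of the characters: values that occur often are encoded with fewer bits, and values that occur rarely with more bits.
• Given the characters to be encoded and the probability of each, an optimal code that uses the minimum number of bits can be generated.
• Huffman coding in binary-tree form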
Fourier Transform
Reinforcement and Interference
Complex Wave
Fundamental and Spectral Frequencies
Line Spectrum
Fundamental Frequency in a Line Spectrum
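DCT
• Discrete Cosine Transform (DCT):
$$F(u,v) = \frac{2\,C(u)\,C(v)}{\sqrt{MN}} \sum_{i=0}^{M-1} \sum_{j=0}^{N-1} \cos\frac{(2i+1)u\pi}{2M} \cos\frac{(2j+1)v\pi}{2N}\, f(i,j)$$
$$C(\xi) = \begin{cases} \dfrac{\sqrt{2}}{2} & \text{if } \xi = 0 \\ 1 & \text{otherwise} \end{cases}$$
For 8 x 8 blocks (M = N = 8):
$$F(u,v) = \frac{C(u)\,C(v)}{4} \sum_{i=0}^{7} \sum_{j=0}^{7} \cos\frac{(2i+1)u\pi}{16} \cos\frac{(2j+1)v\pi}{16}\, f(i,j)$$
• Inverse Discrete Cosine Transform (IDCT):
$$\tilde{f}(i,j) = \sum_{u=0}^{7} \sum_{v=0}^{7} \frac{C(u)\,C(v)}{4} \cos\frac{(2i+1)u\pi}{16} \cos\frac{(2j+1)v\pi}{16}\, F(u,v)$$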
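Example 1
1D DCT:
$$F(u) = \frac{C(u)}{2} \sum_{i=0}^{7} \cos\frac{(2i+1)u\pi}{16}\, f(i)$$
1D IDCT:
$$\tilde{f}(i) = \sum_{u=0}^{7} \frac{C(u)}{2} \cos\frac{(2i+1)u\pi}{16}\, F(u)$$
For the constant signal $f_1(i) = 100$:
$$F_1(0) = \frac{\sqrt{2}}{2 \cdot 2}\,(1 \cdot 100 + 1 \cdot 100 + 1 \cdot 100 + 1 \cdot 100 + 1 \cdot 100 + 1 \cdot 100 + 1 \cdot 100 + 1 \cdot 100) \approx 283$$
$$F_1(1) = \frac{1}{2}\left(\cos\frac{\pi}{16} \cdot 100 + \cos\frac{3\pi}{16} \cdot 100 + \cos\frac{5\pi}{16} \cdot 100 + \cos\frac{7\pi}{16} \cdot 100 + \cos\frac{9\pi}{16} \cdot 100 + \cos\frac{11\pi}{16} \cdot 100 + \cos\frac{13\pi}{16} \cdot 100 + \cos\frac{15\pi}{16} \cdot 100\right) = 0$$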
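Example 2
For the cosine signal $f_2(i) = 100\cos\frac{(2i+1)\pi}{8}$:
$$F_2(0) = \frac{\sqrt{2}}{2 \cdot 2} \cdot 1 \cdot \left(100\cos\frac{\pi}{8} + 100\cos\frac{3\pi}{8} + 100\cos\frac{5\pi}{8} + 100\cos\frac{7\pi}{8} + 100\cos\frac{9\pi}{8} + 100\cos\frac{11\pi}{8} + 100\cos\frac{13\pi}{8} + 100\cos\frac{15\pi}{8}\right) = 0$$
The squared cosines pair up, each pair summing to 1:
$$\cos^2\frac{\pi}{8} + \cos^2\frac{3\pi}{8} = \cos^2\frac{\pi}{8} + \sin^2\frac{\pi}{8} = 1$$
$$\cos^2\frac{5\pi}{8} + \cos^2\frac{7\pi}{8} = \sin^2\frac{\pi}{8} + \cos^2\frac{\pi}{8} = 1$$
$$\cos^2\frac{9\pi}{8} + \cos^2\frac{11\pi}{8} = 1, \qquad \cos^2\frac{13\pi}{8} + \cos^2\frac{15\pi}{8} = 1$$
$$F_2(2) = \frac{1}{2}\left(\cos\frac{\pi}{8}\cos\frac{\pi}{8} + \cos\frac{3\pi}{8}\cos\frac{3\pi}{8} + \cos\frac{5\pi}{8}\cos\frac{5\pi}{8} + \cos\frac{7\pi}{8}\cos\frac{7\pi}{8} + \cos\frac{9\pi}{8}\cos\frac{9\pi}{8} + \cos\frac{11\pi}{8}\cos\frac{11\pi}{8} + \cos\frac{13\pi}{8}\cos\frac{13\pi}{8} + \cos\frac{15\pi}{8}\cos\frac{15\pi}{8}\right) \cdot 100$$
$$= \frac{1}{2}\,(1 + 1 + 1 + 1) \cdot 100 = 200$$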
Example 3
$$F_3(0) \approx 283, \qquad F_3(2) = 200, \qquad F_3(1) = F_3(3) = F_3(4) = \cdots = F_3(7) = 0$$
$$\Rightarrow F_3(u) = F_1(u) + F_2(u)$$
Example 4
f(i) (i = 0...7): 85 -65 15 30 -56 35 90 60
F(u) (u = 0...7): 69 -49 74 11 16 117 44 -5
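The four examples can be checked numerically with a direct implementation of the 8-point 1D DCT formula. This is a plain sketch for verification, not an efficient transform; the signals `f1` (constant 100) and `f2` (a cosine matching the u = 2 basis function with amplitude 100) are the inputs that the worked computations of Examples 1 and 2 imply, and `f3` is assumed to be their sum.

```python
import math

N = 8  # block length used throughout the slides


def c(u):
    """Normalisation constant C(u) of the DCT."""
    return math.sqrt(2) / 2 if u == 0 else 1.0


def dct_1d(f):
    """8-point 1D DCT: F(u) = C(u)/2 * sum_i cos((2i+1)u*pi/16) * f(i)."""
    return [
        c(u) / 2 * sum(math.cos((2 * i + 1) * u * math.pi / 16) * f[i] for i in range(N))
        for u in range(N)
    ]


f1 = [100.0] * N                                                      # Example 1
f2 = [100.0 * math.cos((2 * i + 1) * math.pi / 8) for i in range(N)]  # Example 2
f3 = [a + b for a, b in zip(f1, f2)]                                  # Example 3
f4 = [85, -65, 15, 30, -56, 35, 90, 60]                               # Example 4

F1, F2, F3, F4 = (dct_1d(f) for f in (f1, f2, f3, f4))
for name, F in (("F1", F1), ("F2", F2), ("F3", F3), ("F4", F4)):
    print(name, [round(x) for x in F])
```

The constant signal puts all of its energy into F(0), the cosine signal into F(2), and the transform of the sum equals the sum of the transforms, which is the linearity that Example 3 illustrates.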
Example of 1D IDCT (Inverse DCT)
F(u) (u = 0...7): 69 -49 74 11 16 117 44 -5
$\tilde{f}(i)$ (i = 0...7): 85 -65 15 30 -56 35 90 60
$$T(\alpha p + \beta q) = \alpha T(p) + \beta T(q)$$
F(u) (u = 0...7): 69 -49 74 11 16 117 44 -5
Iteration 0:
$$\tilde{f}(i) = \frac{C(0)}{2} \cdot \cos 0 \cdot F(0) = \frac{\sqrt{2}}{2 \cdot 2} \cdot 1 \cdot 69 \approx 24.3$$
Iteration 1:
$$\tilde{f}(i) = \frac{C(0)}{2} \cdot \cos 0 \cdot F(0) + \frac{C(1)}{2} \cdot \cos\frac{(2i+1)\pi}{16} \cdot F(1) \approx 24.3 + \frac{1}{2} \cdot (-49) \cdot \cos\frac{(2i+1)\pi}{16} = 24.3 - 24.5 \cdot \cos\frac{(2i+1)\pi}{16}$$
Iteration 2:
$$\tilde{f}(i) = \frac{C(0)}{2} \cdot \cos 0 \cdot F(0) + \frac{C(1)}{2} \cdot \cos\frac{(2i+1)\pi}{16} \cdot F(1) + \frac{C(2)}{2} \cdot \cos\frac{(2i+1)\pi}{8} \cdot F(2) \approx 24.3 - 24.5 \cdot \cos\frac{(2i+1)\pi}{16} + 37 \cdot \cos\frac{(2i+1)\pi}{8}$$
$\tilde{f}(i)$ (i = 0...7): 85 -65 15 30 -56 35 90 60
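The iterations above add one frequency component at a time. A minimal sketch of that progressive reconstruction follows; `idct_partial` is a hypothetical helper name, and the coefficients are the rounded values listed on the slide, so the final result matches the original samples only approximately.

```python
import math

N = 8


def c(u):
    """Normalisation constant C(u) of the DCT."""
    return math.sqrt(2) / 2 if u == 0 else 1.0


def idct_partial(F, terms):
    """Reconstruct f(i) from the first `terms` coefficients F(0)..F(terms-1)."""
    return [
        sum(c(u) / 2 * math.cos((2 * i + 1) * u * math.pi / 16) * F[u] for u in range(terms))
        for i in range(N)
    ]


F = [69, -49, 74, 11, 16, 117, 44, -5]   # rounded coefficients from the slide
f = [85, -65, 15, 30, -56, 35, 90, 60]   # original samples from the slide

# Iteration k on the slide uses coefficients F(0)..F(k), i.e. k + 1 terms.
for iteration in range(N):
    approx = idct_partial(F, iteration + 1)
    print("Iteration", iteration, [round(x, 1) for x in approx])
```

Iteration 0 yields the same DC level for every sample (about 24.4 before truncation), and each further iteration superimposes one more cosine until all eight terms are included.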
$$\sum_{i} B_p(i) \cdot B_q(i) = 0 \quad \text{if } p \neq q$$
$$\sum_{i} B_p(i) \cdot B_q(i) = 1 \quad \text{if } p = q$$
$$\sum_{i=0}^{7} \left[\cos\frac{(2i+1) \cdot p\pi}{16} \cdot \cos\frac{(2i+1) \cdot q\pi}{16}\right] = 0 \quad \text{if } p \neq q$$
$$\sum_{i=0}^{7} \left[\frac{C(p)}{2}\cos\frac{(2i+1) \cdot p\pi}{16} \cdot \frac{C(q)}{2}\cos\frac{(2i+1) \cdot q\pi}{16}\right] = 1 \quad \text{if } p = q$$
x · y = (1,0,0) · (0,1,0) = 0
x · z = (1,0,0) · (0,0,1) = 0
y · z = (0,1,0) · (0,0,1) = 0
x · x = (1,0,0) · (1,0,0) = 1
y · y = (0,1,0) · (0,1,0) = 1
z · z = (0,0,1) · (0,0,1) = 1
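The orthonormality of the eight DCT basis functions can be verified numerically in the same way as the dot products of the unit vectors x, y and z. The sketch below builds the table of all pairwise dot products; the names `basis`, `dot` and `gram` are illustrative.

```python
import math

N = 8


def basis(p):
    """Samples of the p-th normalised basis function, C(p)/2 * cos((2i+1)p*pi/16)."""
    cp = math.sqrt(2) / 2 if p == 0 else 1.0
    return [cp / 2 * math.cos((2 * i + 1) * p * math.pi / 16) for i in range(N)]


def dot(a, b):
    return sum(x * y for x, y in zip(a, b))


# gram[p][q] is the dot product of basis functions p and q.
gram = [[dot(basis(p), basis(q)) for q in range(N)] for p in range(N)]
for row in gram:
    print([round(abs(x), 6) for x in row])
```

The printed table is the 8 x 8 identity matrix up to floating-point error: 1 on the diagonal (p = q) and 0 elsewhere (p ≠ q).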
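$$G(i,v) = \frac{1}{2}\, C(v) \sum_{j=0}^{7} \cos\frac{(2j+1)v\pi}{16}\, f(i,j)$$
$$F(u,v) = \frac{1}{2}\, C(u) \sum_{i=0}^{7} \cos\frac{(2i+1)u\pi}{16}\, G(i,v)$$
$$\mathcal{F}(\omega) = \int_{-\infty}^{\infty} f(t)\, e^{-i\omega t}\, dt$$
$$e^{ix} = \cos(x) + i\sin(x)$$
$$F_\omega = \sum_{x=0}^{7} f_x \cdot e^{-\frac{2\pi i \omega x}{8}}$$
$$F_\omega = \sum_{x=0}^{7} f_x \cos\left(\frac{2\pi\omega x}{8}\right) - i \sum_{x=0}^{7} f_x \sin\left(\frac{2\pi\omega x}{8}\right)$$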
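The two-pass factorisation through G(i, v) can be compared against the direct double sum of the 8 x 8 DCT. The following sketch does that on an arbitrary test block; the block contents and all function names are made up for illustration, and neither version is optimised.

```python
import math

N = 8


def c(u):
    return math.sqrt(2) / 2 if u == 0 else 1.0


def dct_1d(f):
    return [
        c(u) / 2 * sum(math.cos((2 * i + 1) * u * math.pi / 16) * f[i] for i in range(N))
        for u in range(N)
    ]


def dct_2d_separable(block):
    # Pass 1: G(i, v) is the 1D DCT of row i (sum over j).
    G = [dct_1d(block[i]) for i in range(N)]
    # Pass 2: F(u, v) is the 1D DCT of column v of G (sum over i).
    columns = [dct_1d([G[i][v] for i in range(N)]) for v in range(N)]
    return [[columns[v][u] for v in range(N)] for u in range(N)]


def dct_2d_direct(block):
    # Direct evaluation of the 8 x 8 double-sum formula.
    return [
        [
            c(u) * c(v) / 4 * sum(
                math.cos((2 * i + 1) * u * math.pi / 16)
                * math.cos((2 * j + 1) * v * math.pi / 16)
                * block[i][j]
                for i in range(N)
                for j in range(N)
            )
            for v in range(N)
        ]
        for u in range(N)
    ]


block = [[(7 * i + 3 * j) % 32 for j in range(N)] for i in range(N)]  # arbitrary data
fast = dct_2d_separable(block)
slow = dct_2d_direct(block)
max_diff = max(abs(fast[u][v] - slow[u][v]) for u in range(N) for v in range(N))
print("largest difference:", max_diff)
```

Both functions return the same coefficients up to rounding error, while the separable version needs 16 one-dimensional transforms instead of a 64-term sum for each of the 64 outputs.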
JPEG
• Joint Photographic Experts Group
• Motivation
– The compression ratio of lossless methods (e.g., Huffman, Arithmetic, LZW) is not high enough for image and video compression, especially when the distribution of pixel values is relatively flat.
– JPEG uses transform coding, which is largely based on the following observations:
• Observation 1: A large majority of useful image content changes relatively slowly across an image, i.e., it is unusual for intensity values to swing up and down several times in a small area, for example, within an 8 x 8 image block. Translated into the spatial frequency domain, this means that lower spatial frequency components generally contain more information than the high frequency components, which often correspond to less useful details and noise.
• Observation 2: Psychophysical experiments suggest that humans are far less likely to notice the loss of higher spatial frequency components than the loss of lower frequency components.
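JPEG (DCT)
• Discrete Cosine Transform (DCT):
$$F(u,v) = \frac{2\,C(u)\,C(v)}{\sqrt{MN}} \sum_{i=0}^{M-1} \sum_{j=0}^{N-1} \cos\frac{(2i+1)u\pi}{2M} \cos\frac{(2j+1)v\pi}{2N}\, f(i,j)$$
$$C(\xi) = \begin{cases} \dfrac{\sqrt{2}}{2} & \text{if } \xi = 0 \\ 1 & \text{otherwise} \end{cases}$$
For 8 x 8 blocks (M = N = 8):
$$F(u,v) = \frac{C(u)\,C(v)}{4} \sum_{i=0}^{7} \sum_{j=0}^{7} \cos\frac{(2i+1)u\pi}{16} \cos\frac{(2j+1)v\pi}{16}\, f(i,j)$$
• Inverse Discrete Cosine Transform (IDCT):
$$\tilde{f}(i,j) = \sum_{u=0}^{7} \sum_{v=0}^{7} \frac{C(u)\,C(v)}{4} \cos\frac{(2i+1)u\pi}{16} \cos\frac{(2j+1)v\pi}{16}\, F(u,v)$$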
JPEG
• Encoding
JPEG
• Major Steps
– DCT (Discrete Cosine Transform)
– Quantization
– Zigzag Scan
– DPCM on DC component
– RLE on AC Components
– Entropy Coding
JPEG (DCT)
• The 64 (8 x 8) DCT
basis functions:
JPEG (DCT)
• Computing the DCT
• Factoring reduces problem to a series of 1D DCTs:
$$F(u) = \frac{C(u)}{2} \sum_{i=0}^{7} \cos\frac{(2i+1)u\pi}{16}\, f(i)$$
$$\tilde{f}(i) = \sum_{u=0}^{7} \frac{C(u)}{2} \cos\frac{(2i+1)u\pi}{16}\, F(u)$$
JPEG (Quantization)
• F'[u, v] = round ( F[u, v] / q[u, v] ). Why? -- To reduce the number of bits per sample.
– Example: 101101 = 45 (6 bits).
q[u, v] = 4 --> Truncate to 4 bits: 1011 = 11.
• Quantization error is the main source of loss in lossy compression.
• Uniform Quantization
– Each F[u,v] is divided by the same constant N.
• Non-uniform Quantization -- Quantization Tables
– The eye is most sensitive to low frequencies (upper left corner) and less sensitive to high frequencies (lower right corner).
• The numbers in the quantization tables can be scaled up (or down) to adjust the so-called quality factor.
• Custom quantization tables can also be put in the image/scan header.
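The quantization rule and the loss it introduces can be sketched as follows. The coefficient values and the uniform step size of 4 are made up for illustration (only 45 with q = 4 comes from the slide), and the rounding helper rounds halves away from zero, which is an assumption because the slide does not say how ties are handled.

```python
import math


def round_half_away(x):
    """Round to the nearest integer, with halves rounded away from zero."""
    return int(math.copysign(math.floor(abs(x) + 0.5), x))


def quantize(F, q):
    """F'[u][v] = round(F[u][v] / q[u][v])."""
    return [[round_half_away(F[u][v] / q[u][v]) for v in range(len(F[u]))] for u in range(len(F))]


def dequantize(Fq, q):
    """Decoder side: multiply each quantized value by its step size."""
    return [[Fq[u][v] * q[u][v] for v in range(len(Fq[u]))] for u in range(len(Fq))]


# Hypothetical 2 x 2 corner of a coefficient block, uniform step size 4.
F = [[45, -30], [7, 2]]
q = [[4, 4], [4, 4]]

Fq = quantize(F, q)
restored = dequantize(Fq, q)
print(Fq)        # quantized values that would be entropy coded
print(restored)  # what the decoder gets back
```

The coefficient 45 becomes 11, as in the slide's example, and comes back as 44 after dequantization; in this sketch the error of every coefficient is at most half the step size, and that error is what makes the scheme lossy.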
JPEG (Quantization)
The Luminance Quantization Table q(u, v)
16   11   10   16   24   40   51   61
12   12   14   19   26   58   60   55
14   13   16   24   40   57   69   56
14   17   22   29   51   87   80   62
18   22   37   56   68  109  103   77
24   35   55   64   81  104  113   92
49   64   78   87  103  121  120  101
72   92   95   98  112  100  103   99
[Figure label: Eye Sensitivity]
The Chrominance Quantization Table q(u, v)
17   18   24   47   99   99   99   99
18   21   26   66   99   99   99   99
24   26   56   99   99   99   99   99
47   66   99   99   99   99   99   99
99   99   99   99   99   99   99   99
99   99   99   99   99   99   99   99
99   99   99   99   99   99   99   99
99   99   99   99   99   99   99   99
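JPEG (Zig-Zag Scan)
• Zig-zag Scan
– Why? -- To group the low frequency coefficients at the top of the vector.
– Maps the 8 x 8 block to a 1 x 64 vector.
[Figure: zig-zag scan path over the 8 x 8 block, starting at the DC coefficient]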
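One way to generate the zig-zag order is to sort the 64 positions by anti-diagonal and alternate the direction of travel on each diagonal. The sketch below assumes the usual JPEG path, which starts at the DC position (0, 0) and moves to (0, 1) next; the function names are illustrative.

```python
def zigzag_order(n=8):
    """Return the (row, column) positions of an n x n block in zig-zag order."""
    def key(rc):
        r, col = rc
        s = r + col  # index of the anti-diagonal
        # Odd diagonals are walked with the row increasing, even ones decreasing.
        return (s, r if s % 2 else -r)

    return sorted(((r, col) for r in range(n) for col in range(n)), key=key)


def zigzag_scan(block):
    """Flatten a square block into a 1 x n*n vector along the zig-zag path."""
    return [block[r][col] for r, col in zigzag_order(len(block))]


order = zigzag_order()
print(order[:10])

# A block whose entries are their own raster index shows the mapping directly.
raster = [[8 * r + col for col in range(8)] for r in range(8)]
print(zigzag_scan(raster)[:10])
```

Because low-frequency coefficients sit near the upper left corner, they end up at the front of the vector, and the mostly zero high-frequency coefficients collect at the end.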
JPEG (DPCM on DC component)
• DPCM (Differential Pulse Code Modulation)
• The DC component is large and varied, but often close to the previous value.
• Encode the difference from the DC component of the previous 8 x 8 block – DPCM
• DPCM encoding/decoding example
– Adjacent signals are generally highly correlated, so instead of encoding each signal individually, this method encodes the differences between them.
(a) Data before encoding:   14   19   25   36   43   55   66   52   48   34
(b) DPCM-encoded data:     +14   +5   +6  +11   +7  +12  +11  -14   -4  -14
(c) Reconstructed data:     14   19   25   36   43   55   66   52   48   34
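The example can be reproduced with a lossless DPCM sketch in which the predictor is simply the previous sample and the first sample is predicted from 0. Both choices are assumptions that fit the first difference of +14 in the example, and the function names are illustrative.

```python
def dpcm_encode(samples):
    """Replace each sample by its difference from the previous sample."""
    previous = 0  # assumed initial prediction
    diffs = []
    for s in samples:
        diffs.append(s - previous)
        previous = s
    return diffs


def dpcm_decode(diffs):
    """Rebuild the samples by accumulating the differences."""
    previous = 0
    samples = []
    for d in diffs:
        previous += d
        samples.append(previous)
    return samples


data = [14, 19, 25, 36, 43, 55, 66, 52, 48, 34]
encoded = dpcm_encode(data)
decoded = dpcm_decode(encoded)
print(encoded)
print(decoded)
```

Note that the ninth difference is 48 − 52 = −4. The differences span a much smaller range than the samples themselves, which is what makes them cheaper to entropy code.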
JPEG (RLE on AC components)
• RLE (Run-Length Encoding)
• The 1 x 64 vector has lots of zeros in it.
• Keeps skip and value, where skip is the number of zeros preceding value, and value is the next non-zero component.
• Send (0,0) as the end-of-block sentinel value.
• Zig-Zag scanned ACs
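The (skip, value) scheme can be sketched as below. The AC coefficient vector is a made-up example, and the sketch leaves out details of the real bitstream, such as the limit on how large a skip count may be.

```python
END_OF_BLOCK = (0, 0)


def rle_ac(ac):
    """Turn zig-zag ordered AC coefficients into (skip, value) pairs."""
    pairs = []
    skip = 0
    for value in ac:
        if value == 0:
            skip += 1
        else:
            pairs.append((skip, value))
            skip = 0
    pairs.append(END_OF_BLOCK)  # any trailing zeros are implied by the sentinel
    return pairs


def rle_ac_decode(pairs, length=63):
    """Expand (skip, value) pairs back into `length` AC coefficients."""
    ac = []
    for skip, value in pairs:
        if (skip, value) == END_OF_BLOCK:
            break
        ac.extend([0] * skip)
        ac.append(value)
    ac.extend([0] * (length - len(ac)))
    return ac


# Hypothetical AC coefficients of one block (63 values after the DC term).
ac = [6, -1, -1, 0, -1, 0, 0, 0, -1] + [0] * 54
pairs = rle_ac(ac)
print(pairs)
```

Here 63 coefficients shrink to five pairs plus the end-of-block marker, and decoding the pairs restores the original vector.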
JPEG (Entropy Coding)
• Categorize DC values into SIZE (number of bits needed to represent the value) and the actual bits.
• Example: if the DC value is 4, 3 bits are needed.
• Send off SIZE as a Huffman symbol, followed by the actual 3 bits.
• For AC components two symbols are used: Symbol_1: (skip, SIZE), Symbol_2: actual bits. Symbol_1 (skip, SIZE) is encoded using Huffman coding; Symbol_2 is not encoded.
• Huffman tables can be custom (sent in the header) or default.
[Figure labels: DC (n-1), DC (n)]
Size   Amplitude
0      0
1      -1, 1
2      -3, -2, 2, 3
3      -7, -6, -5, -4, 4, 5, 6, 7
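The SIZE category in the table is the number of bits in the magnitude of the value, which the sketch below computes with `int.bit_length`. For the "actual bits" it assumes the usual JPEG convention, not spelled out on the slide, that a negative value is sent as the bitwise complement of its magnitude; the function names are illustrative.

```python
def size_category(value):
    """Number of bits needed for |value|; this is the SIZE column of the table."""
    return abs(value).bit_length()


def amplitude_bits(value):
    """Bits sent after the Huffman-coded SIZE symbol.

    Assumed convention: positive values are written in plain binary and
    negative values as the bitwise complement of their magnitude.
    """
    size = size_category(value)
    if size == 0:
        return ""  # the value 0 needs no extra bits
    if value < 0:
        value += (1 << size) - 1
    return format(value, "0{}b".format(size))


# Rebuild the Size/Amplitude table for values from -7 to 7.
table = {}
for v in range(-7, 8):
    table.setdefault(size_category(v), []).append(v)
for size in sorted(table):
    print(size, table[size])

print(size_category(4), amplitude_bits(4), amplitude_bits(-4))
```

A DC difference of 4 falls into SIZE 3 and is followed by the three bits 100; under the assumed convention, -4 would be sent as 011.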
Reference sites
• http://www.cs.sfu.ca/mmbook/demos.htm
• http://www.cs.sfu.ca/CC/365/li/material/misc/demos.html