Temple University – CIS Dept. CIS661 – Principles of Data


CIS750 – Seminar in Advanced Topics in Computer Science
Advanced topics in databases – Multimedia Databases
V. Megalooikonomou
Data Compression
General Overview

- Data Compression
  - Run Length Coding
  - Huffman Coding
  - PCM, DPCM, ADPCM
  - Quantization
    - Scalar quantization
    - Vector quantization
  - Image Compression
  - Video Compression
Data Compression

- Why data compression? Storing or transmitting multimedia data requires large space or bandwidth
- The size of one hour of 44K samples/sec, 16-bit, stereo (two-channel) audio is 3600 x 44000 x 2 x 2 = 633.6 MB, which can be recorded on one CD (650 MB). MP3 compression can reduce this by a factor of 10
- The size of a 500x500 color image is 750 KB without compression (JPEG can reduce this by a factor of 10 to 20)
- The size of a one-minute real-time, full-size, color video clip is 60 x 30 x 640 x 480 x 3 = 1.659 GB. A two-hour movie requires about 200 GB. MPEG-2 compression can bring this down to 4.7 GB (DVD) (see the sketch below)
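As a rough sanity check, these raw sizes can be reproduced in a few lines of Python (a sketch; decimal MB/GB and 3 bytes per color pixel assumed):

    # Raw (uncompressed) sizes for the examples above.
    MB, GB = 10**6, 10**9

    audio = 3600 * 44_000 * 2 * 2           # 1 hour, 44K samples/sec, 16-bit, stereo
    image = 500 * 500 * 3                   # 500x500 pixels, 3 bytes per pixel (RGB)
    video = 60 * 30 * 640 * 480 * 3         # 1 min, 30 frames/sec, 640x480, RGB

    print(audio / MB)        # 633.6 (MB)
    print(image / 10**3)     # 750.0 (KB)
    print(video / GB)        # 1.65888 (GB)
    print(120 * video / GB)  # ~199 (GB) for a two-hour movie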
Data Compression
Run length coding
Example (see the sketch below):
- A scanline of a binary image is 00000 00000 00000 00000 00010 00000 00000 01000 00000 00000, a total of 50 bits
- However, strings of consecutive 0's or 1's can be represented more efficiently: 0(23) 1(1) 0(12) 1(1) 0(13)
- If each count is represented using 5 bits, the amount of data is reduced to 5 + 5x5 = 30 bits (one symbol bit per run plus a 5-bit count for each of the 5 runs), a 40% reduction in size
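A minimal run-length coder along these lines might look as follows (a sketch; the 5-bit count limit from the example is a parameter):

    def rle_encode(bits, count_bits=5):
        """Encode a binary string as (symbol, run length) pairs;
        runs longer than 2**count_bits - 1 are split."""
        max_run = 2 ** count_bits - 1
        runs, i = [], 0
        while i < len(bits):
            j = i
            while j < len(bits) and bits[j] == bits[i] and j - i < max_run:
                j += 1
            runs.append((bits[i], j - i))
            i = j
        return runs

    def rle_decode(runs):
        return "".join(symbol * count for symbol, count in runs)

    line = "0" * 23 + "1" + "0" * 12 + "1" + "0" * 13    # the 50-bit scanline
    runs = rle_encode(line)
    print(runs)            # [('0', 23), ('1', 1), ('0', 12), ('1', 1), ('0', 13)]
    assert rle_decode(runs) == line
    # 5 runs x (1 symbol bit + 5 count bits) = 30 bits, down from 50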
Huffman coding

- The basic idea behind the Huffman coding algorithm is to assign shorter codewords to more frequently used symbols
- Example: let there be 4 letters in the language: "A", "B", "S", "Z"
- To uniquely encode each letter with a fixed-length code, we need two bits: A-00, B-01, S-10, Z-11
- The message "AAABSAAAAZ" is then encoded with 20 bits
- Now consider assigning A-0, B-100, S-101, Z-11
- The same message can be encoded using only 15 bits
Huffman coding

- Given a set of N symbols S = {si, i = 1,…,N} with probabilities of occurrence Pi, i = 1,…,N, find the optimal encoding of the symbols that achieves the minimum transmission rate (bits/symbol)
- Algorithm (see the sketch below):
  - Each symbol is a leaf node in a tree
  - Combine the two symbols or composite symbols with the least probabilities to form a new parent composite symbol, which has the combined probability; assign bits 0 and 1 to the two links
  - Continue this process until all symbols are merged into one root node (bottom up)
  - For each symbol, the sequence of 0s and 1s on the path from the root node to the symbol is the codeword
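A compact sketch of this bottom-up construction in Python, using a heap to locate the two least probable (composite) symbols at each step:

    import heapq
    from collections import Counter

    def huffman_code(freqs):
        """Build {symbol: codeword} by repeatedly merging the two least
        probable (composite) symbols, assigning bits 0 and 1 to the links."""
        # Heap entries: (probability, tie-breaker, {symbol: partial codeword}).
        heap = [(p, i, {s: ""}) for i, (s, p) in enumerate(freqs.items())]
        heapq.heapify(heap)
        tie = len(heap)
        while len(heap) > 1:
            p0, _, zero = heapq.heappop(heap)     # branch labeled 0
            p1, _, one = heapq.heappop(heap)      # branch labeled 1
            merged = {s: "0" + c for s, c in zero.items()}
            merged.update({s: "1" + c for s, c in one.items()})
            heapq.heappush(heap, (p0 + p1, tie, merged))
            tie += 1
        return heap[0][2]

    msg = "AAABSAAAAZ"
    code = huffman_code(Counter(msg))
    print(code)                              # "A" gets a 1-bit codeword
    print(sum(len(code[s]) for s in msg))    # 15 bits for the whole message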
Pulse code modulation

- The process of digitizing an audio signal is called pulse code modulation:
  - Sample the analog waveform at a minimum rate (at least twice the highest frequency present)
  - Quantize each sample using a fixed number of bits
- To reduce the amount of data, we can (see the sketch below):
  - Reduce the sampling rate
  - Reduce the number of bits per sample
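Both reductions are one-liners on a digitized signal; a crude sketch, assuming a NumPy array of 16-bit samples and a target of at most 8 bits:

    import numpy as np

    def reduce_pcm(samples, keep_every=2, bits=8):
        """Crude PCM data reduction: drop samples to lower the sampling
        rate, then requantize the rest to fewer bits per sample
        (assumes bits <= 8 so the result fits in int8)."""
        x = samples[::keep_every]             # reduce the sampling rate
        step = 2 ** (16 - bits)               # coarser quantization step
        return (x.astype(np.int32) // step).astype(np.int8)

    pcm = (np.sin(np.linspace(0, 20, 1000)) * 32767).astype(np.int16)
    small = reduce_pcm(pcm)                   # half the samples, half the bits
    print(pcm.nbytes, "->", small.nbytes)     # 2000 -> 500 bytes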
Differential Pulse Code Modulation (DPCM)

- Encode the changes between consecutive samples
- Example: the values of the differences between samples are much smaller than those of the original samples, so fewer bits are used to encode the signal (e.g., 7 bits instead of 8)
- For decoding, the difference is added to the previous sample to obtain the value of the current sample; lossless coding is achieved (see the sketch below)
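A lossless DPCM round trip is just a difference followed by a running sum; a minimal sketch:

    import numpy as np

    def dpcm_encode(samples):
        """First value is kept as-is (difference from an implicit 0);
        every later value is the change from the previous sample."""
        return np.diff(samples, prepend=0)

    def dpcm_decode(diffs):
        """Add each difference to the previously reconstructed sample."""
        return np.cumsum(diffs)

    x = np.array([100, 102, 105, 104, 101])
    d = dpcm_encode(x)
    print(d)                                  # [100   2   3  -1  -3]: small values
    assert np.array_equal(dpcm_decode(d), x)  # lossless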
Adaptive Differential Pulse Code Modulation (ADPCM)

- One observation is that small differences between samples happen more often than large changes
- An entropy coding method such as Huffman coding can be used to encode the differences for additional efficiency:
  - The probabilities of occurrence of the different differences are first obtained from a large database
  - Huffman coding is used to determine the codeword for each difference
  - The codeword table is fixed and made available to decoders
Linear Predictive Coding (LPC)

- In DPCM, the value of the current sample is guessed based on the previous sample. Can a better prediction be made?
- Yes! For example, we can use the previous two samples to predict the current one
- LPC is more general than DPCM: it exploits the correlation between multiple consecutive samples (see the sketch below)
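A sketch of a second-order predictor with coefficients fit by least squares (the order-2 choice follows the example above; real LPC coders typically use higher orders):

    import numpy as np

    def lpc2(x):
        """Fit x[n] ~ a1*x[n-1] + a2*x[n-2] by least squares and
        return the coefficients and the prediction residual."""
        A = np.column_stack([x[1:-1], x[:-2]])   # columns: x[n-1], x[n-2]
        b = x[2:]                                # targets: x[n]
        (a1, a2), *_ = np.linalg.lstsq(A, b, rcond=None)
        return (a1, a2), b - A @ np.array([a1, a2])

    x = np.sin(0.1 * np.arange(200))        # smooth, strongly correlated signal
    (a1, a2), r = lpc2(x)
    print(a1, a2)           # ~ 2*cos(0.1) and -1 for a pure sinusoid
    print(np.abs(r).max())  # residual is far smaller than the DPCM differences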
General Overview

- Data Compression
  - Run Length Coding
  - Huffman Coding
  - PCM, DPCM, ADPCM
  - Quantization
    - Scalar quantization
    - Vector quantization
  - Image Compression
  - Video Compression
Quantization

[Figure: one-dimensional quantizer interval endpoints and levels indicated on a horizontal line]

- Quantization is the discretization of a continuous-alphabet source (signal)
- X: original value, X': codeword, Q: quantizer
- Distortion:
  - d(X,X'): a measure of the overall quality degradation due to Q
  - Mean Squared Error (MSE): E[(X-X')^2]
Scalar Quantization

[Figure: one-dimensional quantizer interval endpoints and levels indicated on a horizontal line]

- Approximates a source symbol by its closest representative from a codebook
- An N-point scalar quantizer Q is a mapping
  Q: R → C
  where R is the real line and C = {y1, y2,…, yN} ⊂ R is the codebook of size N
- Q(x) = D(E(x))
  where E: R → I is the encoder, D: I → C is the decoder, and I = {1,2,…,N} (see the sketch below)
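A direct transcription of this encoder/decoder view, with an assumed 4-point codebook:

    import numpy as np

    C = np.array([-1.5, -0.5, 0.5, 1.5])   # codebook {y1,...,yN}, N = 4 (assumed)

    def E(x):
        """Encoder R -> I: the index of the nearest codeword."""
        return int(np.argmin(np.abs(C - x)))

    def D(i):
        """Decoder I -> C: look up the codeword."""
        return float(C[i])

    def Q(x):
        return D(E(x))                      # Q(x) = D(E(x))

    xs = [-2.0, -0.3, 0.1, 0.9, 3.0]
    print([Q(x) for x in xs])               # [-1.5, -0.5, 0.5, 0.5, 1.5]
    print(np.mean([(x - Q(x)) ** 2 for x in xs]))   # MSE distortion E[(X-X')^2]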
Vector Quantization

- VQ: a generalization of scalar quantization to the quantization of a vector
- VQ is superior to scalar quantization. Why?
  - It exploits the linear and non-linear dependence that exists among the components of a vector
  - VQ is superior even when the components of the random vector are statistically independent of each other. How?
- A vector quantizer Q of dimension k and size N is a mapping from a vector (a "point" in R^k) into a finite set C = {y1, y2,…, yN}, yi ∈ R^k, the codebook of size N:
  Q: R^k → C
- It partitions R^k into N regions or cells, Ri for i ∈ J = {1,2,…,N}:
  Ri = {x ∈ R^k : Q(x) = yi}
  (see the sketch below)
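The mapping itself is just a nearest-codeword search in R^k; a sketch for k = 2 with an assumed codebook:

    import numpy as np

    # Assumed example codebook: N = 4 codewords in R^2.
    C = np.array([[0.0, 0.0], [0.0, 1.0], [1.0, 0.0], [1.0, 1.0]])

    def vq(x):
        """Map a point x in R^k to its nearest codeword yi (squared-error
        distance); the set of x mapping to yi is exactly the cell Ri."""
        i = np.argmin(((C - x) ** 2).sum(axis=1))
        return C[i]

    print(vq(np.array([0.2, 0.1])))    # [0. 0.]
    print(vq(np.array([0.9, 0.7])))    # [1. 1.]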
Motivation for Vector Quantization

[Figure: scalar vs. vector quantization, plotting two successive values as a single vector (x(1), x(2)); panels contrast the cells produced by scalar quantization with those of vector quantization]
Vector Quantization - Design

- The goal is to find:
  - A codebook (decoder) – representation levels
  - A partition rule (encoder) – decision levels
- so as to maximize an overall measure of performance (i.e., minimize the average distortion)
VQ Design – Optimality Conditions

Nearest Neighbor Condition
- For a given codebook C, the optimal regions {Ri : i = 1,…,N} satisfy the condition:
  Ri ⊆ {x : d(x,yi) ≤ d(x,yj); ∀j}
- That is, Q(x) = yi only if d(x,yi) ≤ d(x,yj) ∀j
VQ Design – Optimality Conditions

Centroid Condition
- For given partition regions {Ri : i = 1,…,N}, the optimal codewords satisfy the condition:
  yi = cent(Ri)
- For the squared-error (SE) measure, the centroid of a set R = {xi : i = 1,…,|R|} is the arithmetic average:
  cent(R) = (1/|R|) Σ_{i=1}^{|R|} xi
VQ Design – The Generalized Lloyd Algorithm (GLA)

- It produces a locally optimal codebook from a training sequence T. It starts with an initial codebook and iteratively improves it.
- Lloyd iteration: given a codebook Cm = {yi}, generate an improved codebook Cm+1 as follows (see the sketch below):
  - Partition T into cells Ri using the Nearest Neighbor Condition:
    Ri = {x : d(x,yi) ≤ d(x,yj); ∀ j ≠ i}
  - Using the Centroid Condition, compute the centroids of the cells just found to obtain the new codebook Cm+1 = {cent(Ri)}
  - Compute the average distortion Dm+1 for Cm+1. If the fractional drop (Dm - Dm+1) / Dm is below a certain threshold, stop; else continue with m → m+1
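The Lloyd iteration maps directly onto a few lines of NumPy (a sketch with squared-error distortion; T holds one training vector per row):

    import numpy as np

    def gla(T, C, eps=1e-3):
        """Generalized Lloyd Algorithm: iteratively improve codebook C
        on training set T until the fractional distortion drop is < eps."""
        D_prev = None
        while True:
            # Nearest Neighbor Condition: assign each vector to its closest codeword.
            d2 = ((T[:, None, :] - C[None, :, :]) ** 2).sum(axis=2)
            idx = d2.argmin(axis=1)
            # Centroid Condition: each codeword becomes the mean of its cell.
            C = np.array([T[idx == i].mean(axis=0) if np.any(idx == i) else C[i]
                          for i in range(len(C))])
            D = ((T - C[idx]) ** 2).sum(axis=1).mean()   # average distortion D_{m+1}
            if D_prev is not None and (D_prev - D) / D_prev < eps:
                return C
            D_prev = D

    rng = np.random.default_rng(0)
    T = rng.normal(size=(1000, 2))                   # training sequence in R^2
    C0 = T[rng.choice(len(T), 4, replace=False)]     # some initial codebook, N = 4
    print(gla(T, C0))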
VQ Design – The Generalized Lloyd Algorithm (GLA)

- To solve the initial codebook generation problem, a partition split mechanism is used. How?
- Start with a codebook containing only one codeword (which one?). In each repetition, and before the application of the Lloyd iteration, double the number of codewords from the previous iteration (see the sketch below)
See: http://www.data-compression.com/vq.html#animation
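A sketch of that splitting schedule, reusing the gla function from the previous sketch; the single starting codeword is taken to be the centroid of the whole training set (a common choice, assumed here), and N is assumed to be a power of two:

    import numpy as np

    def split_design(T, N, delta=0.01):
        """Grow a codebook 1 -> 2 -> 4 -> ... -> N: perturb each codeword
        into two, then run the Lloyd iterations (gla, above) to convergence."""
        C = T.mean(axis=0, keepdims=True)          # one codeword: the centroid
        while len(C) < N:
            C = np.vstack([C * (1 + delta), C * (1 - delta)])  # double the codebook
            C = gla(T, C)
        return C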
General Overview

- Data Compression
  - Run Length Coding
  - Huffman Coding
  - PCM, DPCM, ADPCM
  - Quantization
    - Scalar quantization
    - Vector quantization
  - Image Compression
  - Video Compression
Image Compression

- From the 1D case, we observe that data compression can be achieved by exploiting the correlation between samples
- This idea is applicable to 2D signals as well. Instead of predicting sample values, we can use the so-called transformation method to obtain a more compact representation
Discrete Cosine Transform (DCT)
DCT is the real part of the 2D Fourier Transform
Discrete Cosine Transform (DCT)

- DCT (for an N x N image f(x,y)):
  F(u,v) = α(u) α(v) Σ_{x=0}^{N−1} Σ_{y=0}^{N−1} f(x,y) cos[(2x+1)uπ / 2N] cos[(2y+1)vπ / 2N]
- Inverse DCT:
  f(x,y) = Σ_{u=0}^{N−1} Σ_{v=0}^{N−1} α(u) α(v) F(u,v) cos[(2x+1)uπ / 2N] cos[(2y+1)vπ / 2N]
  where α(0) = √(1/N) and α(u) = √(2/N) for u > 0
DCT transform of 2D Images

DCT Example

- The DCT of an image can also be considered as the projection of the original image onto the DCT basis functions. Each basis function is of the form
  cos[(2x+1)uπ / 2N] cos[(2y+1)vπ / 2N]
DCT transform of 2D Images

[Figure: the basis functions for an 8x8 DCT]
DCT compression of 2D Images

- After the DCT, only a few DCT coefficients have large values
- We need to (see the sketch below):
  - Quantize the DCT coefficients
  - Encode the positions of the large coefficients
  - Compress the values of the coefficients
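These three steps sketch out as follows with SciPy's DCT on an 8x8 block (uniform quantization with an assumed step of 20; a real coder such as JPEG uses a perceptual quantization table and entropy-codes the positions and values):

    import numpy as np
    from scipy.fft import dctn, idctn

    def compress_block(block, step=20.0):
        """DCT an 8x8 block, quantize the coefficients, and keep only
        the nonzero ones as (position, value) pairs."""
        coef = dctn(block.astype(float), norm="ortho")
        q = np.round(coef / step).astype(int)    # quantize the DCT coefficients
        pos = np.argwhere(q != 0)                # positions of the large coefficients
        return pos, q[q != 0]                    # values still to be entropy-coded

    def decompress_block(pos, vals, step=20.0):
        q = np.zeros((8, 8))
        q[pos[:, 0], pos[:, 1]] = vals
        return idctn(q * step, norm="ortho")

    block = np.add.outer(np.arange(8.0), np.arange(8.0)) * 8   # a smooth ramp block
    pos, vals = compress_block(block)
    print(len(vals), "of 64 coefficients kept")                # only a few are large
    print(np.abs(decompress_block(pos, vals) - block).max())   # small error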