Lossy Compression
Lossy Compression
CIS 465
Spring 2013
Lossy Compression
• In order to achieve higher rates of
compression, we give up complete
reconstruction and consider lossy
compression techniques
• So we need a way to measure how good
a compression technique is
That is, how close the reconstructed data is
to the original data
Distortion Measures
• A distortion measure is a mathematical
quantity that specifies how close an
approximation is to its original
Difficult to find a measure which corresponds
to our perceptual distortion
The average pixel difference is given by the
Mean Square Error (MSE)
Distortion Measures
• The size of the error relative to the signal
is given by the signal-to-noise ratio (SNR)
• Another common measure is the peak signal-to-noise ratio (PSNR)
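As a concrete reference, here is a minimal numerical sketch of these three measures, assuming NumPy; SNR is taken relative to the average squared signal value and PSNR relative to the peak value (255 for 8-bit images), which is one common convention.

import numpy as np

def mse(x, y):
    # Mean Square Error: average squared difference between original x and reconstruction y
    return np.mean((np.asarray(x, float) - np.asarray(y, float)) ** 2)

def snr_db(x, y):
    # signal-to-noise ratio in dB: signal power relative to error power
    return 10 * np.log10(np.mean(np.asarray(x, float) ** 2) / mse(x, y))

def psnr_db(x, y, peak=255.0):
    # peak signal-to-noise ratio in dB: peak value squared relative to error power
    return 10 * np.log10(peak ** 2 / mse(x, y))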
Distortion Measures
• Each of these last two measures is
defined in decibel (dB) units
1 dB is a tenth of a bel
If a signal has 10 times the power of the error,
the SNR is 10 dB
The term “decibels” as applied to sounds in
our environment usually is in comparison to a
just-audible sound with frequency 1kHz
Rate-Distortion Theory
We trade off rate (number of bits per symbol)
against distortion; this trade-off is represented by a rate-distortion function R(D)
Quantization
• Quantization is the heart of any lossy compression scheme
The source we are compressing contains a
large number of distinct output values (infinitely
many for an analog source)
We compress the source output by reducing
the distinct values to a smaller set via
quantization
Each quantizer can be uniquely described by
its partition of the input range (encoder side)
and set of output values (decoder side)
Uniform Scalar Quantization
• The input and output values can be either
scalars or vectors
• The quantizer can partition the domain of
input values into either equally spaced or
unequally spaced partitions
• We now examine uniform scalar
quantization
Uniform Scalar Quantization
• The endpoints of partitions of equally
spaced intervals in the input values of a
uniform scalar quantizer are called
decision boundaries
The output value for each interval is the
midpoint of the interval
The length of each interval is called the step
size
A uniform scalar quantizer can be midrise or midtread
Uniform Scalar Quantization
• Midtread quantizer
Has zero as one of its output values
Has an odd number of output values
• Midrise quantizer
Has a partition interval that brackets zero
Has an even number of output values
• For Δ = 1, the midrise quantizer outputs Q(x) = ⌈x⌉ − 1/2 and the midtread quantizer outputs Q(x) = ⌊x + 1/2⌋
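A minimal sketch of both quantizers, assuming NumPy; these are the standard step-size forms, with Δ written as delta.

import numpy as np

def midtread(x, delta):
    # zero is one of the output values; odd number of levels
    return delta * np.floor(x / delta + 0.5)

def midrise(x, delta):
    # zero lies on a decision boundary; outputs are interval midpoints
    return delta * (np.floor(x / delta) + 0.5)

# e.g. with delta = 1: midtread(0.4, 1) -> 0.0, midrise(0.4, 1) -> 0.5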
Uniform Scalar Quantization
• We want to minimize the distortion for a
given input source with a desired number
of output values
Do this by adjusting the step size to match
the input statistics
Let B = {b0, b1, …, bM } be the set of decision
boundaries
Let Y = {y1, y2, …, yM } be the set of
reconstruction or output values
Uniform Scalar Quantization
• Assume the input is uniformly distributed
in the interval [-Xmax, Xmax]
Then the rate of the quantizer is
R = log2 M
R is the number of bits needed to code the M
output values
The step size is given by
Δ = 2Xmax/M
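As a quick worked example of these formulas: with Xmax = 1 and M = 8 output values, the rate is R = log2 8 = 3 bits per sample and the step size is Δ = 2(1)/8 = 0.25.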
Quantization Error
• For bounded input, the quantization error
is referred to as granular distortion
That distortion caused by replacing a whole
range of values, from a maximum value to ∞
(and also on the negative side), is called the
overload distortion
Quantization Error
• The decision boundaries bi for a midrise
quantizer are
[(i − 1)Δ, iΔ], i = 1 .. M/2 (for positive data X)
• Output values yi are the midpoints
iΔ − Δ/2, i = 1 .. M/2 (for positive data)
• The total distortion (after normalizing) is
twice the sum over the positive data
Quantization Error
• Since the reconstruction values yi are the
midpoints of each interval, the quantization
error must lie within the range [−Δ/2, Δ/2]
As shown on a previous slide, the quantization
error is uniformly distributed
Therefore the average squared error is the same
as the variance σd² of the error, computed from just the
interval [0, Δ] with errors in the range shown above
Quantization Error
• The error value at x is e(x) = x − Δ/2, and the mean error ē = 0, so the
variance is given by
σd² = (1/Δ) ∫₀^Δ (e(x) − ē)² dx
σd² = (1/Δ) ∫₀^Δ [x − (Δ/2) − 0]² dx
σd² = Δ²/12
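A quick numerical check of this result, assuming NumPy; the step size 0.25 is an arbitrary illustrative choice.

import numpy as np

delta = 0.25
x = np.random.default_rng(0).uniform(0.0, delta, size=1_000_000)  # uniform inputs over one interval [0, delta]
err = x - delta / 2                                               # quantization error with midpoint reconstruction
print(np.var(err), delta ** 2 / 12)                               # both approximately 0.0052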
Quantization Error
• In the same way, we can derive the signal
variance σx² as (2Xmax)²/12, so if the
quantizer is n bits, M = 2ⁿ, then
SQNR = 10 log10 [σx²/σd²]
SQNR = 10 log10 {[(2Xmax)²/12][12/Δ²]}
SQNR = 10 log10 {[(2Xmax)²/12][12M²/(2Xmax)²]}
SQNR = 10 log10 M² = 20n log10 2
SQNR ≈ 6.02n dB
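For example, an 8-bit uniform quantizer (n = 8) applied to a uniformly distributed source gives SQNR ≈ 6.02 × 8 ≈ 48.2 dB; each additional bit of quantization buys about 6 dB.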
Nonuniform Scalar Quantization
• A uniform quantizer may be inefficient for
an input source which is not uniformly
distributed
Use more decision levels where input is
densely distributed
This lowers granular distortion
Use fewer where sparsely distributed
Total number of decision levels remains the same
This is nonuniform quantization
Nonuniform Scalar Quantization
• Lloyd-Max quantization
iteratively estimates the optimal decision boundaries from the
current estimates of the reconstruction levels, then
updates the levels, repeating until they converge (a sketch follows this slide)
• In companded quantization
The input is mapped through a compressor function G, then
quantized using a uniform quantizer
After transmission, the quantized values are mapped back
using the expander function G⁻¹
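A minimal sketch of the Lloyd-Max iteration mentioned above, in its sample-based form and assuming NumPy; the function name, fixed iteration count, and uniform initialization are illustrative choices, not from the slides.

import numpy as np

def lloyd_max(samples, m, n_iter=50):
    # start with m evenly spaced reconstruction levels
    levels = np.linspace(samples.min(), samples.max(), m)
    for _ in range(n_iter):
        bounds = (levels[:-1] + levels[1:]) / 2   # boundaries midway between adjacent levels
        region = np.digitize(samples, bounds)     # which region each sample falls in
        for j in range(m):                        # move each level to its region's centroid
            if np.any(region == j):
                levels[j] = samples[region == j].mean()
    # final boundaries for the converged reconstruction levels
    return (levels[:-1] + levels[1:]) / 2, levels

# e.g. quantizing Gaussian samples to 4 levels:
# bounds, levels = lloyd_max(np.random.default_rng(0).normal(size=100_000), 4)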
Companded Quantization
• The most commonly used companders are
the μ-law and A-law companders from telephony
Finer quantization (more levels) is used in the amplitude range where the signal occurs most often
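A sketch of standard μ-law companding, with μ = 255 as in North American telephony and the input assumed normalized to [−1, 1]; this is one concrete choice of compressor G and expander G⁻¹, not the only one.

import numpy as np

MU = 255.0

def compress(x):
    # compressor G: stretches small amplitudes so they receive finer quantization
    return np.sign(x) * np.log1p(MU * np.abs(x)) / np.log1p(MU)

def expand(y):
    # expander G^-1: maps values back to the original amplitude scale
    return np.sign(y) * np.expm1(np.abs(y) * np.log1p(MU)) / MU

# uniformly quantizing compress(x) and then applying expand gives a nonuniform
# quantizer with fine steps near zero and coarse steps at large amplitudes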
Transform Coding
• Reason for transform coding
Coding vectors is more efficient than coding
scalars so we need to group blocks of
consecutive samples from the source into
vectors
If Y is the result of a linear transformation T of
an input vector X such that the elements of Y
are much less correlated than X, then Y can
be coded more efficiently than X.
Transform Coding
• With vectors of higher dimensions, if most
of the information in the vectors is carried
in the first few components, we can coarsely
quantize the remaining elements
• The more decorrelated the elements are,
the more we can compress the less
important elements without affecting the
important ones.
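A small numerical illustration of this idea, assuming NumPy; the correlated two-sample source is synthetic, and the 2-point orthogonal transform used here is simply the smallest example of a decorrelating transform.

import numpy as np

rng = np.random.default_rng(0)
x1 = rng.normal(size=10_000)                      # a sample and ...
x2 = 0.95 * x1 + 0.05 * rng.normal(size=10_000)   # ... its strongly correlated neighbor
X = np.stack([x1, x2])

# 2-point orthogonal transform (a 45-degree rotation): y1 = (x1+x2)/sqrt(2), y2 = (x1-x2)/sqrt(2)
T = np.array([[1.0, 1.0], [1.0, -1.0]]) / np.sqrt(2)
Y = T @ X

print(np.var(X, axis=1))  # energy split roughly evenly between the two original components
print(np.var(Y, axis=1))  # almost all energy in y1; y2 can be quantized very coarsely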
Discrete Cosine Transform
• The Discrete Cosine Transform (DCT) is a
widely used transform coding technique
Spatial frequency indicates how many times
pixel values change across an image block
The DCT formalizes this notion in terms of
how much the image contents change in
correspondence to the number of cycles of a
cosine wave per block
Discrete Cosine Transform
• The DCT decomposes the original signal
into its DC and AC components
Following the techniques of Fourier Analysis,
any signal can be described as a sum of
multiple signals that are sine or cosine
waveforms at various amplitudes and
frequencies
• The inverse DCT (IDCT) reconstructs the
original signal
Definition of DCT
• Given an input function f(i, j) over two input variables, the
2D DCT transforms it into a new function F(u, v), with u
and v having the same range as i and j. The general
definition is
F(u, v) = [2C(u)C(v)/√(MN)] Σ(i=0 to M−1) Σ(j=0 to N−1) cos[(2i+1)uπ/2M] cos[(2j+1)vπ/2N] f(i, j)
• where i, u = 0, 1, …, M−1; j, v = 0, 1, …, N−1; and the
constants C(u) and C(v) are defined by
C(ξ) = √2/2 if ξ = 0, and C(ξ) = 1 otherwise
Definition of DCT
• In JPEG, M = N = 8, so we have
F(u, v) = [C(u)C(v)/4] Σ(i=0 to 7) Σ(j=0 to 7) cos[(2i+1)uπ/16] cos[(2j+1)vπ/16] f(i, j)
• The 2D IDCT is quite similar
f(i, j) = Σ(u=0 to 7) Σ(v=0 to 7) [C(u)C(v)/4] cos[(2i+1)uπ/16] cos[(2j+1)vπ/16] F(u, v)
• with i, j, u, v = 0, 1, …, 7
Definition of DCT
• These DCTs work on 2D signals like
images.
• For one-dimensional signals (8 samples) we have
F(u) = [C(u)/2] Σ(i=0 to 7) cos[(2i+1)uπ/16] f(i)
Basis Functions
• The DCT and IDCT use the same set of
cosine functions - the basis functions
Basis Functions
DCT Examples
• The first example on the previous slide
has a constant value of 100
Remember - C(0) = sqrt(2)/2
Remember - cos(0) = 1
F1(0) = [√2/(2·2)] (1·100 + 1·100 + 1·100 + 1·100 +
1·100 + 1·100 + 1·100 + 1·100)
≈ 283
DCT Examples
• For u = 1, notice that cos(π/16) = −cos(15π/16),
cos(3π/16) = −cos(13π/16), etc. Also, C(1) = 1. So we have:
• F1(1) = (1/2) [cos(π/16)·100 + cos(3π/16)·100 +
cos(5π/16)·100 + cos(7π/16)·100 + cos(9π/16)·100
+ cos(11π/16)·100 + cos(13π/16)·100 + cos(15π/16)·
100] = 0
• The same holds for F1(2), F1(3), …, F1(7): each = 0
DCT Examples
• The second example shows a discrete
cosine signal f2(i) with the same frequency
and phase as the second cosine basis
function and amplitude 100
When u = 0, all the cosine terms in the 1D
DCT equal 1.
Each of the first four terms inside the
parentheses has an opposite term (so they cancel),
e.g. cos(π/8) = −cos(7π/8)
DCT Examples
• F2(0) = [√2/(2·2)] · 1 · [100cos(π/8) + 100cos(3π/8)
+ 100cos(5π/8) + 100cos(7π/8) + 100cos(9π/8) +
100cos(11π/8) + 100cos(13π/8) + 100cos(15π/8)]
• = 0
• Similarly, F2(1), F2(3), F2(4), …, F2(7) = 0
• For u = 2, because cos(3π/8) = sin(π/8)
• we have cos²(π/8) + cos²(3π/8) = cos²(π/8) + sin²(π/8) =
1
• Similarly, cos²(5π/8) + cos²(7π/8) = cos²(9π/8) +
cos²(11π/8) = cos²(13π/8) + cos²(15π/8) = 1
DCT Examples
• F2(2) = 1/2 [cos(π/8)·cos(π/8) + cos(3π/8)·cos(3π/8)
+ cos(5π/8)·cos(5π/8) + cos(7π/8)·cos(7π/8) +
cos(9π/8)·cos(9π/8) + cos(11π/8)·cos(11π/8) +
cos(13π/8)·cos(13π/8) + cos(15π/8)·cos(15π/8)] ·
100
• = 1/2 · (1 + 1 + 1 + 1) · 100 = 200
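Both worked examples can be checked numerically with the 8-point 1D DCT convention used above; a sketch assuming NumPy, with f1 and f2 constructed to match the slides' descriptions.

import numpy as np

def dct1d(f):
    # 8-point 1D DCT: F(u) = (C(u)/2) * sum over i of cos((2i+1) u pi / 16) f(i)
    F = np.zeros(8)
    for u in range(8):
        c = np.sqrt(2) / 2 if u == 0 else 1.0
        F[u] = (c / 2) * sum(f[i] * np.cos((2 * i + 1) * u * np.pi / 16) for i in range(8))
    return F

i = np.arange(8)
f1 = 100.0 * np.ones(8)                        # first example: constant signal of value 100
f2 = 100.0 * np.cos((2 * i + 1) * np.pi / 8)   # second example: matches the u = 2 basis function
print(np.round(dct1d(f1)))   # [283.   0.   0.   0.   0.   0.   0.   0.]
print(np.round(dct1d(f2)))   # [  0.   0. 200.   0.   0.   0.   0.   0.]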
DCT Examples
DCT Characteristics
• The DCT produces the frequency
spectrum F(u) corresponding to the spatial
signal f(i)
The 0th DCT coefficient F(0) is the DC
coefficient of f(i) and the other 7 DCT
coefficients represent the various changing
(AC) components of f(i)
0th component represents the average value of f(i)
DCT Characteristics
• The DCT is a linear transform
A transform T is linear iff
T(αp + βq) = αT(p) + βT(q)
where α and β are constants and p and q are
any functions, variables or constants.
Cosine Basis Functions
• For better decomposition, the basis
functions should be orthogonal, so as to
have the least amount of redundancy
• Functions Bp(i) and Bq(i) are orthogonal if
Bp · Bq = Σ(i) Bp(i) Bq(i) = 0 for p ≠ q
Where "·" is the dot product
Cosine Basis Functions
• Further, the functions Bp(i) and Bq(i) are
orthonormal if they are orthogonal and
Bp · Bp = Σ(i) Bp(i) Bp(i) = 1
The orthonormal property guarantees
reconstructability
It can be shown that the scaled basis functions
Bp(i) = [C(p)/2] cos[(2i+1)pπ/16] satisfy both conditions
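This is easy to verify numerically for the 8-point case; a sketch assuming NumPy.

import numpy as np

i = np.arange(8)
# scaled 8-point DCT basis functions: B_p(i) = (C(p)/2) cos((2i+1) p pi / 16)
B = np.array([((np.sqrt(2) / 2 if p == 0 else 1.0) / 2) * np.cos((2 * i + 1) * p * np.pi / 16)
              for p in range(8)])
print(np.round(B @ B.T, 10))  # the identity matrix: 0 for p != q (orthogonal), 1 for p == q (orthonormal)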
Graphical Illustration of 2D DCT Basis Functions
2D Separable Basis
• With block size 8, the 2D DCT can be
separated into a sequence of two 1D DCT
steps, one along the rows and one along the columns (the fast, separable DCT)
This is much more efficient: the work per coefficient
grows linearly with the block size rather than quadratically
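A sketch of this row-column computation, assuming NumPy; the orthonormal DCT matrix below is just the scaled basis functions from the previous slides stacked as rows, and the test block is arbitrary.

import numpy as np

N = 8
i = np.arange(N)
# orthonormal N-point DCT matrix: row u holds the u-th scaled basis function
D = np.array([np.sqrt((1 if u == 0 else 2) / N) * np.cos((2 * i + 1) * u * np.pi / (2 * N))
              for u in range(N)])

block = np.random.default_rng(0).integers(0, 256, size=(N, N)).astype(float)

F = D @ block @ D.T           # 2D DCT as two 1D passes: rows first, then columns
restored = D.T @ F @ D        # inverse transform: D is orthonormal, so its inverse is its transpose
print(np.allclose(restored, block))   # True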
Discrete Fourier Transform
• The DCT is comparable to the more widely
known (in mathematical circles) Discrete
Fourier Transform
Other Transforms
• The Karhunen-Loeve Transform (KLT) is a reversible
linear transform that optimally decorrelates the input
• The wavelet transform uses a set of basis functions
called wavelets which can be implemented in a
computationally efficient manner by means of multiresolution analysis