Transcript Document

Compression
"lossless" : f[x,y]  { g[x,y] = Decompress ( Compress ( f[x,y] ) |
“lossy” : quality measures
• e2rms = 1/MN ( g[x,y] - f[x,y] )2
• SNRrms = 1/MN  g[x,y]2 / e2rms
• subjective: how does it look to the eye
• application: how does it influence the final results
for both:
• the attained compression ratio
• the time and memory needed for compression
• the time and memory needed for decompression
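The two objective quality measures above can be sketched directly; a minimal example on a hypothetical 2x2 image pair (f the original, g the decompressed result):

```python
def e_rms(f, g):
    """Root-mean-square error between original f and decompressed g."""
    n = len(f) * len(f[0])
    total = sum((g[y][x] - f[y][x]) ** 2
                for y in range(len(f)) for x in range(len(f[0])))
    return (total / n) ** 0.5

def snr_ms(f, g):
    """Mean-square SNR: sum of g^2 divided by sum of (g - f)^2."""
    num = sum(v * v for row in g for v in row)
    den = sum((g[y][x] - f[y][x]) ** 2
              for y in range(len(f)) for x in range(len(f[0])))
    return num / den

f = [[10, 12], [14, 16]]   # assumed toy "original"
g = [[10, 13], [13, 16]]   # assumed toy "decompressed"
print(e_rms(f, g))         # sqrt(2/4), about 0.707
```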
2007
Theo Schouten
1
Coding redundancy
Average number of bits for code 2: 2.7 bits
Compression ratio: Cr = 3/2.7 = 1.11
Interpixel redundancy
Run Length Encoding (RLE):
For the whole binary image: Cr = 2.63
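A minimal run-length encoder/decoder for one binary row, as a sketch of the RLE idea (the (value, run-length) pair format is an assumption; real formats pack runs more tightly):

```python
def rle_encode(row):
    """Collapse a row into (value, run_length) pairs."""
    runs = []
    for v in row:
        if runs and runs[-1][0] == v:
            runs[-1] = (v, runs[-1][1] + 1)   # extend the current run
        else:
            runs.append((v, 1))               # start a new run
    return runs

def rle_decode(runs):
    """Expand the pairs back into the original row (lossless)."""
    return [v for v, n in runs for _ in range(n)]

row = [0, 0, 0, 1, 1, 0, 1, 1, 1, 1]
print(rle_encode(row))   # [(0, 3), (1, 2), (0, 1), (1, 4)]
```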
Psycho-visual redundancy
General Model
The "mapper" transforms the data to make it suitable for reducing the inter-pixel
redundancies. This step is generally reversible and can reduce the amount of data,
e.g. RLE, but not in transformations to the Fourier or Discrete Cosinus domains.
The "quantizer" reduces the precision of the output of the mapper according to the
determined reliability criteria. This especially reduces psycho-visual redundancies
and is irreversible.
The "symbol encoder" makes a static or variable length of code to represent the
quantizer's output. It reduces the coding redundancy and is reversible.
Information theory
Questions such as "what is the minimum amount of data needed to
represent an image?" are answered in information theory.
The generation of information is modeled as a statistical process that
can be measured in a way that agrees with our intuition about information.
An event E with a probability P(E) has:
I(E) = -log_r P(E)   r-ary units of information
If P(E) = 1/2, then I(E) = -log₂(1/2) = 1 bit of information
If a source generates symbols ai with a probability of P(ai), then the
average information per output is:
H(z) = - Σ_i P(a_i) log_r P(a_i)
the uncertainty or entropy of the source.
This is maximal when every symbol has an equal probability (1/N). It
indicates the minimal average length (in bits per symbol for r=2)
needed to code the symbols.
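The entropy formula above is a one-liner; the example probabilities are assumptions chosen to show the equal-probability maximum:

```python
from math import log2

def entropy(probs):
    """H(z) = -sum P(a_i) log2 P(a_i), in bits per symbol."""
    return -sum(p * log2(p) for p in probs if p > 0)

print(entropy([0.5, 0.5]))        # 1.0 bit: two equiprobable symbols
print(entropy([0.25] * 4))        # 2.0 bits: the maximum for 4 symbols
print(entropy([0.7, 0.2, 0.1]))   # below log2(3): skewed probabilities
```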
Huffman coding
Under the condition that the symbols are coded one by one, an
optimal code for the set of symbols and probabilities is generated.
block code:
• every source symbol is mapped to a fixed sequence of code symbols
• instantaneous code: every code word can be decoded without reference to
succeeding code symbols
• and it is uniquely decodable
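Huffman's construction repeatedly merges the two least probable nodes; a compact heap-based sketch (the symbols and probabilities are assumed example values):

```python
import heapq

def huffman_code(freqs):
    """Build a prefix-free Huffman code from {symbol: probability}."""
    # Heap entries: (probability, tiebreaker, {symbol: code-so-far}).
    heap = [(p, i, {s: ""}) for i, (s, p) in enumerate(freqs.items())]
    heapq.heapify(heap)
    count = len(heap)
    while len(heap) > 1:
        p1, _, c1 = heapq.heappop(heap)    # two least probable subtrees
        p2, _, c2 = heapq.heappop(heap)
        merged = {s: "0" + c for s, c in c1.items()}
        merged.update({s: "1" + c for s, c in c2.items()})
        heapq.heappush(heap, (p1 + p2, count, merged))
        count += 1
    return heap[0][2]

code = huffman_code({"a": 0.4, "b": 0.3, "c": 0.2, "d": 0.1})
print(code)   # shorter codes go to more probable symbols
```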
Lempel Ziv Welch coding
This translates variable-length runs of source symbols (with
about the same probability) to a fixed (or predictable) code length.
The method is adaptive: the table with symbol arrays is built up in
one pass over the data set during both compression and
decompression.
Just as Huffman, this is a symbol encoder which can be used
directly on the input or after a mapper and quantizer.
It is used in GIF, TIFF, PDF and in Unix compress.
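The one-pass, adaptive table build-up can be sketched as a minimal LZW compressor (the table starts with all 256 single-byte strings; details like code-width limits and table reset are omitted):

```python
def lzw_compress(data):
    """Emit LZW codes for a byte string, growing the table as we go."""
    table = {bytes([i]): i for i in range(256)}   # initial 1-byte entries
    w, out = b"", []
    for byte in data:
        wc = w + bytes([byte])
        if wc in table:
            w = wc                     # keep extending the current match
        else:
            out.append(table[w])       # emit the longest known string
            table[wc] = len(table)     # adapt: add the new string
            w = bytes([byte])
    if w:
        out.append(table[w])
    return out

print(lzw_compress(b"abababab"))   # repeated pattern -> few codes
```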
Predictive coding
1D: p_n = round( Σ_{i=1}^{m} a_i f_{n-i} ); the first value(s) of f must be passed in another way
2D: p[x,y] = round( a₁ f[x,y-1] + a₂ f[x-1,y] )
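A 1D lossless predictive coder with m = 1 and a₁ = 1 (both simplifying assumptions) shows why this helps: the error sequence has much smaller magnitudes than the signal, so a symbol encoder needs fewer bits:

```python
def predict_errors(f, a=1.0):
    """Encode: transmit e_n = f_n - round(a * f_{n-1})."""
    errors = [f[0]]                     # first value passed unchanged
    for n in range(1, len(f)):
        p = round(a * f[n - 1])         # prediction from previous sample
        errors.append(f[n] - p)
    return errors

def reconstruct(errors, a=1.0):
    """Decode: rebuild f exactly from the error sequence."""
    f = [errors[0]]
    for e in errors[1:]:
        f.append(e + round(a * f[-1]))
    return f

f = [100, 102, 104, 103, 103]
e = predict_errors(f)
print(e)   # [100, 2, 2, -1, 0]: small residuals
```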
Lossy predictive coding
Delta modulation
Delta Modulation is a simple but well-known form of lossy predictive coding:
p_n = α f̂_{n-1} with α < 1 (the predictor input is the previous decoded value f̂_{n-1})
q_n = ζ · sign(e_n), which can be represented by a 1-bit value: -ζ or +ζ
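A delta-modulation sketch per the formulas above; a = 1 and z = 4 are simplifying assumptions (the slide takes α < 1), chosen so the granular "hunting" around a constant input is visible:

```python
def delta_modulate(f, a=1.0, z=4):
    """1-bit lossy predictive coding: p_n = a*fhat_{n-1}, q_n = +/-z."""
    fhat, bits = [f[0]], []          # first sample is sent separately
    for n in range(1, len(f)):
        p = a * fhat[-1]             # prediction from last decoded value
        q = z if f[n] - p > 0 else -z
        bits.append(q > 0)           # the transmitted 1-bit stream
        fhat.append(p + q)           # decoder rebuilds fhat the same way
    return bits, fhat

f = [10, 14, 18, 18, 18]
bits, fhat = delta_modulate(f)
print(fhat)   # the quantizer hunts around the constant value 18
```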
Differential Pulse Code modulation
With DPCM, p_n = Σ_{i=1}^{m} α_i f̂_{n-i}. Under the assumption that the
quantization error (e_n - q_n) is small, the optimal values of α_i can be
found by minimizing E{e_n²} = E{ (f_n - p_n)² }.
These calculations are almost never done for each single image but
rather for a few typical images or models of them.
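For m = 1 the minimization has a closed form: setting the derivative of E{(f_n - α f_{n-1})²} to zero gives α = E{f_n f_{n-1}} / E{f_{n-1}²}. A sketch with sample averages over an assumed toy signal:

```python
def optimal_alpha(f):
    """Least-squares optimal first-order prediction coefficient."""
    num = sum(f[n] * f[n - 1] for n in range(1, len(f)))
    den = sum(f[n - 1] ** 2 for n in range(1, len(f)))
    return num / den

f = [100, 101, 103, 102, 104, 105, 103]   # assumed slowly varying signal
a = optimal_alpha(f)
print(round(a, 4))   # close to 1 for highly correlated neighbors
```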
4 prediction methods (fig. 8.24):
• p[x,y] = 0.97 f[x,y-1]
• p[x,y] = 0.5 ( f[x,y-1] + f[x-1,y] )
• p[x,y] = 0.75 ( f[x,y-1] + f[x-1,y] ) - 0.5 f[x-1,y-1]
• adaptive: p[x,y] = 0.97 f[x,y-1] or 0.97 f[x-1,y]
Lloyd-Max quantizer
Instead of one quantization level, more levels can be used. They might be
unequal, e.g. a factor 2 between them. With a Lloyd-Max quantizer
the steps are optimized to achieve a minimum error.
Adjusting the level ζ per block of n (e.g. 16) pixels, with a restricted
number (for example 4) of scale factors, yields a substantial
improvement of the error in the decoded image with a small
reduction of the compression ratio.
Adaption
Using the 3-point prediction with the best of 4 quantizers per block of 16
pixels (error shown ×8 in the figure). Choosing among 4 quantizers costs 2
extra bits per 16 pixels, so the compression in bits/pixel becomes:
1.0 → 1.125
2.0 → 2.125
3.0 → 3.125
Transform coding
JPEG makes use of 8*8 sub-images, a DCT transformation, quantization of the 64
coefficients by dividing with a quantization matrix [e.g. fig. 8.37b], a zigzag ordering
[fig. 8.36d] of the matrix, followed by a Huffman encoder, with the DC
component coded separately.
It uses a YUV color model; for the U and V components, blocks of 2 by 2 pixels are
combined into 1 pixel. The quantization matrices can be scaled to yield several
compression ratios. There are standard coding tables and quantization matrices, but
the user can also supply others to obtain better results for a certain image.
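Two of the per-block steps named above can be sketched directly: quantization by division with a quantization matrix, and the zigzag readout that clusters trailing zeros for the entropy coder. The 4x4 block and the flat quantization matrix are assumptions for brevity (JPEG uses 8x8 and non-uniform matrices):

```python
def quantize(coeffs, qmatrix):
    """Divide each DCT coefficient by its quantization step and round."""
    return [[round(c / q) for c, q in zip(crow, qrow)]
            for crow, qrow in zip(coeffs, qmatrix)]

def zigzag(block):
    """Read an n x n block along anti-diagonals, alternating direction."""
    n = len(block)
    out = []
    for d in range(2 * n - 1):
        rows = (range(min(d, n - 1), max(0, d - n + 1) - 1, -1)
                if d % 2 == 0 else
                range(max(0, d - n + 1), min(d, n - 1) + 1))
        out.extend(block[r][d - r] for r in rows)
    return out

coeffs = [[80, 20, 4, 1],      # assumed DCT coefficients: energy is
          [18,  6, 2, 0],      # concentrated in the low frequencies
          [ 5,  2, 1, 0],
          [ 1,  0, 0, 0]]
q = quantize(coeffs, [[10] * 4 for _ in range(4)])
print(zigzag(q))   # the high-frequency zeros end up at the tail
```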
examples
[Figure: JPEG examples; panels labeled 25%, DCT 8x8, DCT + normalization
array, zoomed, original; sub-image sizes 2x2, 4x4, 8x8; compression ratios
34:1 (3.42) and 67:1 (6.33).]
Wavelet transform
[Table: wavelet type, operations per pixel, number of 0's (< 1.5).]
4 wavelets
Wavelet compression ratios
34:1 (2.29), 67:1 (2.96), 108:1 (3.72), 167:1 (4.73)
JPEG 2000
• uses wavelets (optionally on parts, tiles, of the image)
• different wavelets for error-free and lossy compression
• gray and color images (up to 16-bit signed values)
• conversion to (approximately) the YCbCr color space
  – Cb, Cr components peak around 0
• complicated coding of wavelet values
  – organised in layers and finally packets
  – allowing more and more refined decoding (and storage)
  – and access to parts of the image
Fractal compression
GIF original image (161x261 pixels, 8 bits/pixel); JPEG compression
15:1; JPEG compression 36:1; fractal compression 36:1.
MPEG (1,2,4) video
• I-frame (intraframe or independent frame): JPEG-like
• P-frame (predictive frame): difference between the frame and a motion-compensated
prediction from the previous I- or P-frame
• B-frame (bidirectional frame): predicted from the previous I- or P-frame and the next I- or P-frame
File formats
The header contains information about:
•type: black and white, 8-bit gray level/color, 3-byte color
•size: number of rows, columns and bands, number of images
•compression method, possible parameters thereof
•data format: for example bands/colors interleaved per pixel or stored separately
•origin of the image or conditions during acquisition
•manipulations previously done on the image
A few well-known formats are:
•GIF for binary, gray-level and 8-bit color images
•TIFF: a multi-type format with many possibilities
•JFIF: JPEG coded, for color or gray images of natural origin
•MPEG for a series of images
•PBM, PGM, PPM: the PBMPLUS formats
•BMP: Microsoft's format