CM613 - Multimedia storage and retrieval lecture: Lossy Compression


CM613 Multimedia storage and retrieval
Lecture: Lossy Compression
D. Miller
Slide 1
What is “lossy compression”
Trade-offs
• A design trade-off is necessary when there is a contradiction in
requirements:
• Examples:
• Cost versus quality
• Cost versus performance
• Usability for occasional or inexperienced users versus power
or flexibility for expert users.
• Size or download time versus resolution of image
• Decisions are made using criteria based on sufficiency for purpose.
Slide 2
What is “lossy compression”
In lossy compression, as opposed to "lossless compression",
data is compressed and then decompressed so that the retrieved data,
although different from the original, is still "close enough" to be useful in some way.
Depending on the design of the format, lossy data compression often suffers from
generation loss:
compressing and decompressing multiple times does
more damage to the data than doing it once.
http://en.wikipedia.org/wiki/Lossy_compression
Slide 3
What is “lossy compression”
There are two basic lossy compression schemes:
• Transform
• In lossy transform codecs, samples of picture or sound are taken, chopped into small
segments, transformed into a new basis space, and quantized.
• The resulting quantized values are then given variable-length codes depending on their
frequency of occurrence (entropy coding). A minimal sketch of this pipeline follows the
reference below.
http://en.wikipedia.org/wiki/Lossy_compression
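The sketch below (a minimal illustration, not from the lecture) walks a toy 1-D signal
through these stages in Python; a simple two-point average/difference transform stands in
for a real basis transform such as the DCT, and the final frequency count is what an
entropy coder would work from.

```python
from collections import Counter

# Toy 1-D "image/sound" samples (illustrative values only).
samples = [52, 54, 55, 56, 80, 82, 81, 79, 30, 31, 33, 30]

# 1. Chop the samples into small segments (here: pairs).
segments = [samples[i:i + 2] for i in range(0, len(samples), 2)]

# 2. Transform each segment into a new basis.
#    A two-point average/difference stands in for a real transform such as the DCT.
transformed = [((a + b) / 2, (a - b) / 2) for a, b in segments]

# 3. Quantize: divide by a step size and round (this is the lossy step).
step = 4
quantized = [(round(avg / step), round(diff / step)) for avg, diff in transformed]

# 4. An entropy coder would now give the most frequent values the shortest codes.
frequencies = Counter(value for pair in quantized for value in pair)
print(quantized)
print(frequencies.most_common())
```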
Slide 4
What is “lossy compression”
The second type of scheme is:
• Predictive:
• In lossy predictive codecs, previous and/or subsequent decoded data is used to predict
the current sound sample or image frame; a minimal sketch of this idea follows the
reference below.
• The error between the predicted data and the real data, together with any extra
information needed to reproduce the prediction, is then quantized and coded.
• In some systems transform and predictive techniques are combined, with transform
codecs being used to compress the error signals generated by the predictive stage.
http://en.wikipedia.org/wiki/Lossy_compression
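As a minimal illustration (again, not from the lecture), the following Python sketch applies
the predictive idea to a 1-D signal: each sample is predicted from the previously decoded
sample, and only the quantized prediction error is kept.

```python
# Minimal sketch of a lossy predictive codec on a 1-D signal (illustrative only).
samples = [100, 102, 105, 104, 110, 130, 131, 129]
step = 4  # quantization step for the prediction error

encoded = []      # quantized prediction errors (what would be stored/entropy-coded)
prediction = 0    # encoder and decoder both start from the same known prediction
for s in samples:
    error = s - prediction
    q = round(error / step)              # quantize the prediction error (lossy)
    encoded.append(q)
    prediction = prediction + q * step   # predict from *decoded* data, as the decoder will

# Decoder: rebuild the signal from the quantized errors alone.
decoded, prediction = [], 0
for q in encoded:
    prediction = prediction + q * step
    decoded.append(prediction)

print(encoded)   # small numbers, cheap to entropy-code
print(decoded)   # close to, but not identical to, the original samples
```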
Slide 5
How it works: human perception perspective
JPEG exploits the characteristics of human vision, eliminating or reducing data to
which the eye is less sensitive.
JPEG works well on grayscale and color images, especially on photographs, but it
is not intended for two-tone images.
http://www.planetanalog.com/features/OEG20030122S0063
Original Lena image (12 KB); Lena image, compressed (85% less information, 1.8 KB);
Lena image, highly compressed (96% less information, 0.56 KB)
Slide 6
How it works: technical process
Example: JPEG
Slide 7
JPEG (Joint Photographic Experts Group)
JPEG (pronounced jay-peg) is the most commonly used standard method of lossy
compression for photographic images.
JPEG itself specifies only how an image is transformed into a stream of bytes, but not
how those bytes are encapsulated in any particular storage medium.
A further standard, created by the Independent JPEG Group, called JFIF (JPEG File
Interchange Format) specifies how to produce a file suitable for computer storage and
transmission from a JPEG stream.
In common usage, when one speaks of a "JPEG file" one generally means a JFIF file,
or sometimes an Exif JPEG file.
JPEG/JFIF is the format most used for storing and transmitting photographs on the
web. It is not as well suited to line drawings and other textual or iconic graphics,
because its compression method performs badly on these types of images.
Slide 8
http://www.planetanalog.com/features/OEG20030122S0063
Baseline JPEG compression
Slide 9
Y = luminance
Cr, Cb = chrominance
The YCbCr colour space is based on the YUV colour space.
YUV signals are created from an original RGB (red, green and blue) source. The weighted
values of R, G and B are added together to produce a single Y (luma) signal, representing
the overall brightness, or luminance, of that spot. The U signal is then created by
subtracting Y from the blue signal of the original RGB, and then scaling; and V by
subtracting Y from the red, and then scaling by a different factor. This can be
accomplished easily with analog circuitry.
The following equations can be used to derive Y, U and V from R, G and B:
Y = 0.299R + 0.587G + 0.114B
U = 0.492(B − Y) = −0.147R − 0.289G + 0.436B
V = 0.877(R − Y) = 0.615R − 0.515G − 0.100B
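A direct Python translation of these equations might look as follows (illustrative only;
R, G and B are assumed to be on the same scale, e.g. 0-255):

```python
def rgb_to_yuv(r, g, b):
    """Convert an RGB triple to YUV using the equations above."""
    y = 0.299 * r + 0.587 * g + 0.114 * b   # luminance: weighted sum of R, G and B
    u = 0.492 * (b - y)                      # scaled blue-difference signal
    v = 0.877 * (r - y)                      # scaled red-difference signal
    return y, u, v

print(rgb_to_yuv(255, 0, 0))   # a pure red pixel: low Y, negative U, large positive V
```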
Slide 10
Discrete cosine transform
DCT transforms the image from the spatial
domain into the frequency domain
Next, each component (Y, Cb, Cr) of the image
is "tiled" into sections of eight by eight pixels
each, then each tile is converted to frequency
space using a two-dimensional forward discrete
cosine transform (DCT, type II).
The 64 DCT basis functions
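A sketch of the forward 2-D DCT (type II) on a single 8x8 tile, using NumPy; this is an
illustrative implementation built directly from the standard DCT-II basis, not an optimised
encoder.

```python
import numpy as np

N = 8
n = np.arange(N)

# Orthonormal DCT-II basis matrix: C[k, m] = a(k) * cos(pi * (2m + 1) * k / (2N)).
C = np.sqrt(2.0 / N) * np.cos(np.pi * (2 * n[None, :] + 1) * n[:, None] / (2 * N))
C[0, :] = np.sqrt(1.0 / N)

def dct2(tile):
    """2-D forward DCT (type II) of an 8x8 tile, applied separably to rows and columns."""
    return C @ tile @ C.T

# Example: a flat 8x8 tile of pixel value 140, level-shifted by 128 as JPEG does before the DCT.
tile = np.full((N, N), 140.0) - 128.0
coeffs = dct2(tile)
print(np.round(coeffs, 1))   # all the energy ends up in the DC coefficient at (0, 0)
```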
Slide 11
…the coefficient in the output DCT matrix at
(2,1) corresponds to the strength of the
correlation between the basis function at
(2,1) and the entire 8x8 input image block.
The coefficients corresponding to high-frequency details are located to the right
and bottom of the DCT block, and it is
precisely these weights which we try to
nullify: the more zeroes in the 8x8 DCT
block, the higher the compression that is
achieved.
http://www.planetanalog.com/features/OEG20030122S0063
In the Quantization step below, we'll
discuss how to maximize the number of
zeroes in the matrix.
Slide 12
Quantization
This is the main lossy operation in the whole process.
After the DCT has been performed on the 8x8 image block, the results are quantized
in order to achieve large gains in compression ratio.
Quantization refers to the process of representing the actual coefficient values as one
of a set of predetermined allowable values, so that the overall data can be encoded in
fewer bits (because the allowable values are a small fraction of all possible values).
The aim is to greatly reduce the amount of
information in the high frequency components.
This is done by simply dividing each component in
the frequency domain by a constant for that
component, and then rounding to the nearest integer.
As a result of this, it is typically the case that many of
the higher frequency components are rounded to
zero, and many of the rest become small positive or
negative numbers.
Example of a quantizing matrix
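A NumPy sketch of this step; the table shown is the example luminance quantization table
commonly quoted from the JPEG standard (Annex K), and the function and variable names are
illustrative rather than from the lecture.

```python
import numpy as np

# Example luminance quantization table (JPEG standard, Annex K).
Q = np.array([
    [16, 11, 10, 16,  24,  40,  51,  61],
    [12, 12, 14, 19,  26,  58,  60,  55],
    [14, 13, 16, 24,  40,  57,  69,  56],
    [14, 17, 22, 29,  51,  87,  80,  62],
    [18, 22, 37, 56,  68, 109, 103,  77],
    [24, 35, 55, 64,  81, 104, 113,  92],
    [49, 64, 78, 87, 103, 121, 120, 101],
    [72, 92, 95, 98, 112, 100, 103,  99],
])

def quantize(coeffs, table=Q):
    """Divide each DCT coefficient by its table entry and round to the nearest integer."""
    return np.rint(coeffs / table).astype(int)

def dequantize(quantized, table=Q):
    """Decoder side: multiply back (the rounding error is lost for good)."""
    return quantized * table

# High-frequency coefficients are usually small, so dividing by the larger table entries
# in the bottom-right corner rounds most of them to zero.
demo = np.full((8, 8), 12.0)              # a made-up coefficient block, for illustration
print(quantize(demo))
```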
Slide 13
Quantization
Quantization is the key irreversible step in the JPEG process.
JPEG Quality Settings
Typically the only thing that the user can control in JPEG compression is the
quality setting (and rarely the chroma sub-sampling).
The value chosen is used in the quantization stage, where less common values
are discarded by using tables tuned to visual perception.
This reduces the amount of information while preserving the perceived quality.
http://www.photo.net/learn/jpeg/#percep
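As an illustration of how a single quality number can be turned into a quantization table,
the sketch below follows the scaling formula used by the IJG reference implementation
(libjpeg); this particular mapping is an assumption about one common implementation rather
than part of the JPEG standard itself.

```python
def scale_quant_table(base_table, quality):
    """Scale a base quantization table for a quality setting in 1..100 (IJG-style)."""
    quality = max(1, min(100, quality))
    # Low quality -> large scale factor -> coarser quantization (more data discarded).
    scale = 5000 // quality if quality < 50 else 200 - 2 * quality
    return [[max(1, min(255, (q * scale + 50) // 100)) for q in row] for row in base_table]
```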
Slide 14
Zig-zag sorting
The quantized data needs to be in an
efficient format for encoding.
The quantized coefficients have a
greater chance of being zero as the
horizontal and vertical frequency values
increase.
To exploit this behavior, we can
rearrange the coefficients into a one-dimensional array sorted from the DC
value to the highest-order spatial
frequency coefficient.
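A Python sketch of the zig-zag reordering (illustrative): coefficients on the same
anti-diagonal share the sum of their row and column indices, and the traversal direction
alternates from one diagonal to the next.

```python
def zigzag_order(n=8):
    """Return the (row, col) positions of an n x n block in zig-zag order."""
    positions = [(r, c) for r in range(n) for c in range(n)]
    # Sort by anti-diagonal (r + c); alternate the direction along each diagonal.
    return sorted(positions, key=lambda p: (p[0] + p[1],
                                            p[0] if (p[0] + p[1]) % 2 else p[1]))

def zigzag_scan(block):
    """Flatten an 8x8 block of quantized coefficients into a 64-element list."""
    return [block[r][c] for r, c in zigzag_order(len(block))]

# The first few positions: (0,0), (0,1), (1,0), (2,0), (1,1), (0,2), ...
print(zigzag_order()[:6])
```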
Slide 15
The first element in each 64x1 array represents
the DC coefficient from the DCT matrix, and the
remaining 63 coefficients represent the AC
components.
These two types of information are different
enough to warrant separating them and applying
different methods of coding to achieve optimal
compression efficiency.
All of the DC coefficients (one from each DCT
output block) must be grouped together in a
separate list.
At this point, the DC coefficients will be encoded
as a group and each set of AC values will be
encoded separately.
Slide 16
Coding the DC Coefficients
The DC components represent the intensity of each 8x8 pixel block.
Because of this, significant correlation exists between adjacent blocks.
So, while the DC coefficient of any given input array is fairly unpredictable by itself, real
images usually do not vary widely in a localized area.
As such, the previous DC value can be used to predict the current DC coefficient value.
By using differential pulse-code modulation (DPCM), we can increase the probability that
the value we encode will be small, and thus reduce the number of bits in the
compressed image.
To obtain the coding value we simply subtract the DC coefficient of the previously
processed 8x8 pixel block from the DC coefficient of the current block. This value is
called the "DPCM difference".
Once this value is calculated, it is compared to a table to identify the symbol group to
which it belongs (based on its magnitude), and it is then encoded appropriately using
an entropy encoding scheme such as Huffman coding.
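A minimal Python sketch of the DPCM differencing step; `dc_values` is an assumed name for
the list of DC coefficients collected from successive 8x8 blocks.

```python
def dpcm_differences(dc_values):
    """Replace each DC coefficient by its difference from the previous block's DC value."""
    previous = 0   # the first block is differenced against an initial predictor of 0
    diffs = []
    for dc in dc_values:
        diffs.append(dc - previous)
        previous = dc
    return diffs

# Neighbouring blocks usually have similar intensity, so the differences stay small.
print(dpcm_differences([96, 98, 97, 95, 120]))   # -> [96, 2, -1, -2, 25]
```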
Slide 17
Coding the AC Coefficients (Run-Length Coding)
Because the values of the AC coefficients tend towards zero after the quantization step,
these coefficients are run-length encoded.
The concept of run-length encoding is straightforward.
In real image sequences, pixels of the same value can always be represented as
individual bytes, but it doesn't make sense to send the same value over and over again.
For example, we have seen that the quantized output of the DCT blocks produces many
zero-value bytes.
The zig-zag ordering helps produce these zeros in groups at the end of each sequence.
Instead of coding each zero individually, we simply encode the number of zeros in a
given 'run.'
This run-length coded information is then variable-length coded (VLC), usually using
Huffman codes.
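A simplified Python sketch of the idea; the real JPEG AC symbols also encode the size
category of each nonzero value, which is omitted here. Each nonzero coefficient is emitted
together with the number of zeros that preceded it, and a single end-of-block marker
replaces the trailing run of zeros.

```python
def run_length_encode_ac(ac_coeffs):
    """Encode 63 zig-zag-ordered AC coefficients as (zero_run, value) pairs plus an EOB."""
    pairs = []
    zero_run = 0
    for value in ac_coeffs:
        if value == 0:
            zero_run += 1
        else:
            pairs.append((zero_run, value))
            zero_run = 0
    pairs.append("EOB")   # end of block: everything that follows is zero
    return pairs

# Typical post-quantization AC data: a few small values, then mostly zeros.
ac = [5, -2, 0, 0, 1, 0, 0, 0, -1] + [0] * 54
print(run_length_encode_ac(ac))   # -> [(0, 5), (0, -2), (2, 1), (3, -1), 'EOB']
```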
Slide 18
Entropy encoder
This is a final lossless compression performed on the quantized DCT
coefficients to increase the overall compression ratio achieved.
Entropy encoding is a compression
technique that uses a series of bit codes to
represent a set of possible symbols.
Fixed length codes are most often applied in systems where each of the
symbols occurs with equal probability.
In reality, most symbols do not occur with equal probability.
Example of a fixed-length code
In these cases, we can take advantage of this fact and reduce the average number of bits used to
compress the sequence.
The length of the code that is used for each symbol can be
varied based on the probability of the symbol's occurrence.
By encoding the most common symbols with shorter bit
sequences and the less frequently used symbols with longer
bit sequences, we can easily improve on the average
number of bits used to encode a sequence.
Example of entropy encoding with weighted symbol probabilities.
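A compact Huffman coding sketch in Python, showing how symbol frequencies translate into
variable-length codes (illustrative only; not the exact table format JPEG uses).

```python
import heapq
from collections import Counter
from itertools import count

def huffman_codes(symbols):
    """Build a Huffman code table (symbol -> bit string) from a sequence of symbols."""
    counts = Counter(symbols)
    tie = count()   # unique tie-breaker so the heap never compares the code tables
    heap = [(weight, next(tie), {sym: ""}) for sym, weight in counts.items()]
    heapq.heapify(heap)
    while len(heap) > 1:
        w1, _, left = heapq.heappop(heap)    # two least probable subtrees...
        w2, _, right = heapq.heappop(heap)
        merged = {sym: "0" + code for sym, code in left.items()}
        merged.update({sym: "1" + code for sym, code in right.items()})
        heapq.heappush(heap, (w1 + w2, next(tie), merged))   # ...are merged into one
    return heap[0][2]

data = "aaaaaabbbccd"   # 'a' is most frequent and should get the shortest code
codes = huffman_codes(data)
print(codes)
total = sum(len(codes[s]) for s in data)
print(total, "bits, versus", 2 * len(data), "bits with a fixed 2-bit code")
```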
Slide 19
JPEG File Interchange Format (JFIF)
The encoded data is written into the JPEG File Interchange Format (JFIF), which, as the
name suggests, is a simplified format allowing JPEG-compressed images to be shared
across multiple platforms and applications.
JFIF includes embedded image and coding parameters, framed by appropriate header
information.
Specifically, aside from the encoded data, a JFIF file must store all coding and
quantization tables that are necessary for the JPEG decoder to do its job properly.
Slide 20
So now we can all grow beards!
Quality factor = 20
Quality factor = 5
Quality factor = 3
http://www.imaging.org/resources/jpegtutorial/jpgimag1.cfm
Slide 21
Sources
http://www.answers.com/topic/ycbcr-sampling
http://www.answers.com/main/ntquery?method=4&dsid=1512&dekey=YUV&gwp=8&curtab=1512_1&linktext=YUV
http://www.imaging.org/resources/jpegtutorial/jpgimag1.cfm
http://www.photo.net/learn/jpeg/#percep
http://www.planetanalog.com/features/OEG20030122S0063
http://en.wikipedia.org/wiki/Lossy_compression
end
Slide 22