Addressing Image Compression
Techniques on current Internet
Technologies
By: Eduardo J. Moreira &
Onyeka Ezenwoye
CIS-6931
Term Paper
Introduction
Data compression rests on the premise that when
transmitting data, whether images, music, video,
or anything else, reducing the size of the data
also reduces the time needed to transmit it.
Topics of Discussion
• Compression Types
• Run Length Encoding
• Huffman Coding
• PNG 0 (Portable Network Graphics)
• JPEG Graphics Format
• GIF Graphics Format
• MPEG (Moving Picture Experts Group)
• Conclusion
Compression Types
Lossless – The exact original data can be recovered
after compression.
Lossy – Certain loss of accuracy in exchange for a
substantial increase in compression.
Run Length Encoding
Lossless
A simple technique that can achieve up to an 8:1
compression ratio.
Replaces each run of a repeated symbol with
one copy and a count of how many times that
symbol appears.
Ex. AAABBBCCCCCCCCCDDDDD is encoded as
3A3B9C5D
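The example above can be reproduced with a few lines of Python. This is only a minimal sketch: the function names `rle_encode` and `rle_decode` are illustrative, and the decoder assumes the symbols themselves are never digits.

```python
from itertools import groupby


def rle_encode(text: str) -> str:
    """Replace each run of a repeated symbol with its count followed by the symbol."""
    return "".join(f"{len(list(run))}{symbol}" for symbol, run in groupby(text))


def rle_decode(encoded: str) -> str:
    """Invert rle_encode; assumes the symbols themselves are never digits."""
    out = []
    count = ""
    for ch in encoded:
        if ch.isdigit():
            count += ch               # counts may span several digits
        else:
            out.append(ch * int(count))
            count = ""
    return "".join(out)


original = "AAABBBCCCCCCCCCDDDDD"
encoded = rle_encode(original)
print(encoded)                        # 3A3B9C5D
print(rle_decode(encoded) == original)  # True
```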
Run Length Encoding contd.
Can also be used to compress digital images by
comparing adjacent pixels and storing only the
changes.
Pros – effective for encoding images with large
white spaces or large homogeneous areas.
Run Length Encoding contd.
Cons – Does not work well when encoding files
that contain even a small degree of variation
among pixels. Because each run takes 2 bytes (a
count and a symbol), a file in which most runs are
a single pixel long can actually grow in size.
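The pro and the con can be seen side by side in a short sketch. It assumes 8-bit pixels and a hypothetical `rle_pairs` helper that stores each run as a (count, value) byte pair with runs capped at 255; real image formats use their own variants of this layout.

```python
def rle_pairs(pixels: bytes) -> bytes:
    """Encode a row of 8-bit pixels as (count, value) byte pairs; runs are capped at 255."""
    out = bytearray()
    i = 0
    while i < len(pixels):
        run = 1
        while i + run < len(pixels) and pixels[i + run] == pixels[i] and run < 255:
            run += 1
        out += bytes([run, pixels[i]])
        i += run
    return bytes(out)


flat_row = bytes([255] * 200)   # one large white area
noisy_row = bytes(range(200))   # every pixel differs from its neighbor

print(len(rle_pairs(flat_row)))   # 2   (200 bytes shrink to one pair)
print(len(rle_pairs(noisy_row)))  # 400 (200 bytes double in size)
```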
Huffman Coding
Based on building a variable-length character code
in which frequently occurring characters get the
shortest code words.
Avg. compression 25%, maximum 50% - 60%
Code words are composed of variable-length
binary strings.
Each binary string is mapped to a different
character within the file.
A frequency distribution of the characters is created first.
It decides which code word is used for each symbol.
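One common way to turn a frequency distribution into code words is the classic Huffman construction: repeatedly merge the two least frequent groups of symbols. The sketch below is a minimal illustration, not the only possible implementation; the name `huffman_codes` and the sample text are assumptions made for the example.

```python
import heapq
from collections import Counter


def huffman_codes(text: str) -> dict[str, str]:
    """Build a prefix-free code in which frequent symbols get shorter code words."""
    freq = Counter(text)
    if not freq:
        return {}
    if len(freq) == 1:                      # a single symbol still needs one bit
        return {next(iter(freq)): "0"}
    # Heap entries: (weight, tie_breaker, {symbol: code_so_far}).
    heap = [(w, i, {sym: ""}) for i, (sym, w) in enumerate(freq.items())]
    heapq.heapify(heap)
    tie = len(heap)
    while len(heap) > 1:
        w1, _, left = heapq.heappop(heap)   # two least frequent groups
        w2, _, right = heapq.heappop(heap)
        merged = {s: "0" + c for s, c in left.items()}
        merged.update({s: "1" + c for s, c in right.items()})
        heapq.heappush(heap, (w1 + w2, tie, merged))
        tie += 1
    return heap[0][2]


text = "abracadabra"
codes = huffman_codes(text)
bits = "".join(codes[ch] for ch in text)
print(codes)
print(len(bits), "bits instead of", 8 * len(text))
```

The exact code words depend on how ties between equal frequencies are broken, but the total encoded length is the same for any valid Huffman tree.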
Huffman Coding contd.
Example: USING HUFFMAN CODES
________________________________________________________________________
Character   Codeword   Space required to represent all file characters
                       (codeword length) * (frequency of character in file)
________________________________________________________________________
c           0          1 * 100,000 = 100,000
d           101        3 *  30,000 =  90,000
y           100        3 *   5,000 =  15,000
t           111        3 *   1,000 =   3,000
r           1101       4 *      50 =     200
z           1100       4 *      25 =     100
________________________________________________________________________
Total Bits Required                   208,300
________________________________________________________________________
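The total in the table can be checked with a short calculation. The code words and frequencies below are copied from the example; the comparison against a fixed-length 3-bit code and the `is_prefix_free` helper are additions made here for illustration.

```python
codewords = {"c": "0", "d": "101", "y": "100", "t": "111", "r": "1101", "z": "1100"}
frequency = {"c": 100_000, "d": 30_000, "y": 5_000, "t": 1_000, "r": 50, "z": 25}


def is_prefix_free(codes: dict[str, str]) -> bool:
    """No code word may be the start of another, or decoding would be ambiguous."""
    words = sorted(codes.values())
    return all(not b.startswith(a) for a, b in zip(words, words[1:]))


total_bits = sum(len(codewords[ch]) * frequency[ch] for ch in codewords)
fixed_bits = 3 * sum(frequency.values())   # assumption: 6 symbols at 3 bits each

print(is_prefix_free(codewords))  # True
print(total_bits)                 # 208300
print(fixed_bits)                 # 408225
```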
Dictionary Based Compression
Encode variable length strings of symbols as
single tokens.
The tokens form an index into a phrase dictionary.
If tokens are smaller than the phrases, they replace
the phrases and compression occurs.
LZ77 is a sliding window technique in which the
dictionary consists of a set of fixed length phrases
found in a window into the previously seen text.
LZ78 builds phrases up one symbol at a time,
adding a new symbol to an existing phrase when a
match occurs.
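The LZ78 idea of growing phrases one symbol at a time can be sketched as follows. This is a simplified illustration with an unbounded dictionary and tokens kept as Python tuples rather than packed bits; the names `lz78_encode` and `lz78_decode` are hypothetical.

```python
def lz78_encode(text: str) -> list[tuple[int, str]]:
    """Emit (phrase_index, next_symbol) tokens; index 0 means the empty phrase."""
    dictionary = {"": 0}
    tokens = []
    phrase = ""
    for ch in text:
        if phrase + ch in dictionary:
            phrase += ch                    # keep extending the current match
        else:
            tokens.append((dictionary[phrase], ch))
            dictionary[phrase + ch] = len(dictionary)   # new phrase = old phrase + symbol
            phrase = ""
    if phrase:                              # input ended inside a known phrase
        tokens.append((dictionary[phrase[:-1]], phrase[-1]))
    return tokens


def lz78_decode(tokens: list[tuple[int, str]]) -> str:
    """Rebuild the text by replaying the same dictionary growth."""
    phrases = [""]
    out = []
    for index, ch in tokens:
        phrase = phrases[index] + ch
        out.append(phrase)
        phrases.append(phrase)
    return "".join(out)


tokens = lz78_encode("ABABABA")
print(tokens)               # [(0, 'A'), (0, 'B'), (1, 'B'), (3, 'A')]
print(lz78_decode(tokens))  # ABABABA
```

Seven symbols become four tokens here; compression only occurs once the tokens are stored in fewer bits than the phrases they replace.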
JPEG Graphics Format
Lossy
Developed to compress gray-scale or color
images.
Stores 24-bit color per pixel.
JPEG can:
Achieve 10:1 up to 20:1 compression without visible loss.
Achieve 30:1 up to 50:1 compression with small to
moderate loss of quality.
Achieve up to 100:1 for usage such as previews where
low quality is not an issue.
JPEG Graphics Format contd.
Drawback: the time needed to decode and view the
image.
Well suited for real-world photographs and scenic
depictions of nature.
Has not been shown to work well with line drawings,
cartoon animations, and other similar drawings.
Intended for images viewed by the human eye, not
analyzed by machines.
JPEG Graphics Format contd.
Flexibility: one can create smaller, lower-quality
images or larger, higher-quality ones by changing
the compression parameters.
Extremely useful to a broad scope of real-world
applications.
Example - “What is the lowest amount of quality
we need?”
Decoding speed can also be traded against image
quality by using fast approximations instead of
exact calculations.
GIF (Graphics Interchange Format)
Lossless
Developed by CompuServe in 1987.
Two versions: GIF87a and GIF89a.
Images have a bit depth of up to 8 bits per pixel,
giving a maximum of 256 colors.
Image data is compressed using the LZW
(Lempel-Ziv-Welch) algorithm.
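The core of LZW can be sketched in a few lines. This illustration works on whole bytes and returns the codes as a list of integers; it deliberately leaves out what the GIF format adds on top (variable-width codes, clear and end-of-information codes, and a starting table sized to the image's color depth). The function names are hypothetical.

```python
def lzw_encode(data: bytes) -> list[int]:
    """Replace growing byte phrases with table codes; codes 0-255 are single bytes."""
    table = {bytes([i]): i for i in range(256)}
    phrase = b""
    codes = []
    for value in data:
        candidate = phrase + bytes([value])
        if candidate in table:
            phrase = candidate
        else:
            codes.append(table[phrase])
            table[candidate] = len(table)   # learn the new, longer phrase
            phrase = bytes([value])
    if phrase:
        codes.append(table[phrase])
    return codes


def lzw_decode(codes: list[int]) -> bytes:
    """Rebuild the same table while decoding, so it never has to be transmitted."""
    if not codes:
        return b""
    table = {i: bytes([i]) for i in range(256)}
    previous = table[codes[0]]
    out = [previous]
    for code in codes[1:]:
        if code in table:
            entry = table[code]
        else:                               # code refers to the phrase being defined
            entry = previous + previous[:1]
        out.append(entry)
        table[len(table)] = previous + entry[:1]
        previous = entry
    return b"".join(out)


row = bytes([7] * 100)                      # a solid-color run of 100 pixels
codes = lzw_encode(row)
print(len(codes))                           # 14
print(lzw_decode(codes) == row)             # True
```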
GIF contd.
Animation is accomplished by storing many
GIF images together in one file.
Best performance is reached with images that
have a large percentage of solid colors
across a wide portion of the image area.
GIF contd.
GIF89a – extends the GIF87a specification
and adds transparency, text comments, and
animation support.
Becoming less widely used.
PNG 0 – Portable Network Graphics
Lossless
Uses a modified Lempel-Ziv 77 (LZ77) algorithm, similar
to the one used by WinZip and other zip applications.
Offers 15-35 percent higher compression.
Works well with true color, palette, and grayscale
images.
PNG 0 – Portable Network Graphics contd.
When compared to JPEG, this format offers higher
image quality but the compression ratios are not as
great as with JPEG.
PNG 0 also uses filtering techniques. Filters are
applied to the bytes of data before compression. This
in turn prepares the data for optimal compression. It
works best when applied to true color images.
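The effect of filtering before compression can be illustrated with PNG's "Sub" filter, which replaces each byte with its difference from the corresponding byte of the pixel to its left. The sketch below is a simplification: it treats the data as one long row of 1-byte pixels (`bpp=1`) and uses Python's `zlib` module for the compression step, whereas a real PNG encoder filters each scanline separately and may pick a different filter type per line. The function names are hypothetical.

```python
import zlib


def sub_filter(row: bytes, bpp: int = 1) -> bytes:
    """PNG 'Sub' filter: each byte minus the byte bpp positions to its left (mod 256)."""
    return bytes(
        (row[i] - (row[i - bpp] if i >= bpp else 0)) % 256 for i in range(len(row))
    )


def sub_unfilter(filtered: bytes, bpp: int = 1) -> bytes:
    """Undo sub_filter exactly, so the scheme stays lossless."""
    row = bytearray(filtered)
    for i in range(bpp, len(row)):
        row[i] = (row[i] + row[i - bpp]) % 256
    return bytes(row)


gradient = bytes(range(256)) * 16           # a smooth ramp, repeated
filtered = sub_filter(gradient)

print(sub_unfilter(filtered) == gradient)   # True
print(len(zlib.compress(gradient)), "bytes unfiltered")
print(len(zlib.compress(filtered)), "bytes after the Sub filter")
```

After filtering, the ramp becomes a long run of identical small values, which the compressor handles far better than the original byte sequence.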
MPEG (Moving Picture Experts Group)
Lossy
Among the most popular video formats in use
today.
Major standards: MPEG-1 and MPEG-2
Removes spatial redundancy within a video
frame and temporal redundancy between
video frames.
DCT-based compression is used to
reduce spatial redundancy.
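How a DCT exposes spatial redundancy can be seen on a single 8x8 block. The sketch below uses a plain orthonormal 2-D DCT-II and a single uniform quantization step of 16; both the step value and the helper names are assumptions for illustration, and real encoders use per-coefficient quantization tables and much faster transforms.

```python
import math

N = 8


def dct_1d(v):
    """Orthonormal 1-D DCT-II of N samples."""
    out = []
    for k in range(N):
        scale = math.sqrt(1 / N) if k == 0 else math.sqrt(2 / N)
        out.append(
            scale * sum(v[n] * math.cos(math.pi * (2 * n + 1) * k / (2 * N)) for n in range(N))
        )
    return out


def dct_2d(block):
    """Apply the 1-D DCT to every row, then to every column."""
    rows = [dct_1d(r) for r in block]
    cols = [dct_1d([rows[r][c] for r in range(N)]) for c in range(N)]
    return [[cols[c][r] for c in range(N)] for r in range(N)]   # transpose back


def quantize(coeffs, step=16):
    """Uniform quantization (illustrative): small coefficients collapse to zero."""
    return [[round(c / step) for c in row] for row in coeffs]


smooth = [[100 + 2 * c for c in range(N)] for _ in range(N)]    # gentle horizontal ramp
q = quantize(dct_2d(smooth))
nonzero = sum(1 for row in q for value in row if value != 0)
print(nonzero, "of", N * N, "coefficients are nonzero after quantization")
```

The long runs of zeros that remain are what the later run-length and Huffman stages compress.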
MPEG contd.
Motion compensation is used to exploit
temporal redundancy.
The idea of motion compensation is to
encode a video frame based on other video
frames temporally close to it.
Complicated and CPU intensive.
Uses several algorithms to achieve as much
as a 30:1 compression ratio.
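Motion compensation can be illustrated with exhaustive block matching: for a block in the current frame, search a small window in the previous frame for the best match and store only the displacement. The sketch below is a toy version under stated assumptions (grayscale frames as nested lists, a 4x4 block, a search range of 2 pixels, sum of absolute differences as the cost); the nested search loop also hints at why the technique is CPU intensive.

```python
def get_block(frame, top, left, size):
    return [row[left:left + size] for row in frame[top:top + size]]


def sad(block_a, block_b):
    """Sum of absolute differences between two equally sized blocks."""
    return sum(abs(a - b) for ra, rb in zip(block_a, block_b) for a, b in zip(ra, rb))


def best_motion_vector(reference, current, top, left, size=4, search=2):
    """Return (cost, dy, dx) of the reference block that best predicts the current block."""
    target = get_block(current, top, left, size)
    height, width = len(reference), len(reference[0])
    best = None
    for dy in range(-search, search + 1):
        for dx in range(-search, search + 1):
            y, x = top + dy, left + dx
            if 0 <= y <= height - size and 0 <= x <= width - size:
                cost = sad(target, get_block(reference, y, x, size))
                if best is None or cost < best[0]:
                    best = (cost, dy, dx)
    return best


def make_frame(square_left):
    """8x8 black frame with a bright 2x2 square on rows 2-3."""
    frame = [[0] * 8 for _ in range(8)]
    for y in (2, 3):
        for x in (square_left, square_left + 1):
            frame[y][x] = 255
    return frame


previous_frame = make_frame(2)
current_frame = make_frame(3)               # the square moved one pixel to the right
print(best_motion_vector(previous_frame, current_frame, top=2, left=2))  # (0, 0, -1)
```

A cost of 0 means the block is predicted perfectly, so only the vector (0, -1) has to be stored instead of the 16 pixel values.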
MPEG contd.
Quantized Discrete Cosine Transform
(QDCT).
Run-length encoding
Huffman encoding
Conclusion
With the growing need to transmit more
data faster, data compression is
vital.
The application determines which method is
used.