Compression of the image
Adolf Knoll
National Library of the Czech Republic
General schemes for application of compression
The schemes adapt to the character of the represented objects:
- Bitonal image (1-bit, black-and-white)
- Colour photorealistic image
- Mixed document (the two components mentioned above)
Trends
- Bitonal
  - from CCITT Fax Group 3 and 4 to JBIG variants
- Photorealistic
  - Lossless compression: PNG, TIFF/LZW
  - Lossy: from JPEG DCT to wavelet
- Mixed document
  - Both applied (Mixed Raster Content, usually vertically)
How is it built into formats?
- There are attempts to have it in ISO TIFF (even with JPEG, LZW, or PNG), but this is not enough because tools for conversion and display are lacking.
- That is why other, more suitable formats are used: JPEG, PNG.
- That is also why there is a lot of development in the area of mixed formats; they do not aim to become ISO standards.
Relevant directions
- Bitonal image
  - JBIG2 (ISO): no support (except Xerox), but many similar activities
- Photorealistic image
  - wavelet JPEG2000 and many other non-ISO initiatives (WI, LWF, IW44, SID, Imagepower IW, …)
- Mixed content
  - DjVu, LDF, Imagepower MRC
Aims
- Image Archiving: standardized archival format (TIFF, JPEG, PNG, …)
- Image Delivery: more efficient modern format (JB2, MrSID, DjVu, LDF, …)
What will the relationship between the two be? It will be defined by the goal of the project.
Around compression
- Pre-processing of the image
- Compression
- Encoding in a format
- Decoding from the format
- Decompression
- Display, print-out
Pre-processing of the bitonal image - I
Efficient schemes build on the possibility of applying vocabularies of pixel chunks/groups:
- E.g. a text is an image that can be interpreted as several dozen images of letters; each repeated occurrence of a letter can be represented by its coordinates (x, y) and a reference to a dictionary that holds only one representation of similar letters (digitized only once as a bitmap).
This method is called PATTERN MATCHING, but…
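The dictionary idea can be illustrated with a minimal sketch. It assumes the letters have already been cut out of the page as small bitmaps with known coordinates, and it matches bitmaps only when they are exactly identical. The names `encode_page`, `Bitmap` and the 3x3 sample glyphs are hypothetical and are not taken from any of the formats discussed.

```python
from typing import Dict, List, Tuple

Bitmap = Tuple[str, ...]  # one string of '0'/'1' characters per pixel row


def encode_page(glyphs: List[Tuple[int, int, Bitmap]]):
    """Replace repeated glyph bitmaps with (x, y, dictionary index) references."""
    dictionary: List[Bitmap] = []      # each distinct bitmap is stored only once
    index_of: Dict[Bitmap, int] = {}   # bitmap -> position in the dictionary
    placements: List[Tuple[int, int, int]] = []
    for x, y, bitmap in glyphs:
        if bitmap not in index_of:
            index_of[bitmap] = len(dictionary)
            dictionary.append(bitmap)
        placements.append((x, y, index_of[bitmap]))
    return dictionary, placements


# Hypothetical 3x3 glyphs standing in for scanned letters.
A = ("010", "111", "101")
B = ("110", "111", "110")
page = [(0, 0, A), (4, 0, B), (8, 0, A), (12, 0, A)]
dictionary, placements = encode_page(page)
print(len(dictionary), placements)
```

Four letters on the page need only two stored bitmaps here; the other occurrences are coordinates plus a reference.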
Pre-processing of the bitonal image - II
- However, scanned texts have a lot of information noise in the individual pixel chunks that represent, for instance, letters in text.
- Therefore, it is convenient to reduce the differences between chunks that should be identified as identical:
  - smoothing
  - pixel flipping
  - noise removal
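One simple form of noise removal by pixel flipping can be sketched as follows. This is an assumed rule chosen for illustration (flip a pixel only when all of its neighbours have the opposite value); real pre-processing tools use their own, usually more elaborate, criteria. The function name `remove_isolated_pixels` is hypothetical.

```python
from typing import List


def remove_isolated_pixels(image: List[List[int]]) -> List[List[int]]:
    """Flip every pixel whose neighbours (up to 8) all have the opposite value."""
    height, width = len(image), len(image[0])
    cleaned = [row[:] for row in image]  # decisions are based on the original image
    for y in range(height):
        for x in range(width):
            neighbours = [
                image[ny][nx]
                for ny in range(max(0, y - 1), min(height, y + 2))
                for nx in range(max(0, x - 1), min(width, x + 2))
                if (ny, nx) != (y, x)
            ]
            if neighbours and all(n != image[y][x] for n in neighbours):
                cleaned[y][x] = 1 - image[y][x]
    return cleaned


noisy = [
    [0, 0, 0, 0, 0],
    [0, 1, 0, 0, 0],  # isolated black speck
    [0, 0, 0, 1, 1],
    [0, 0, 0, 1, 1],  # part of a real stroke, must survive
]
cleaned = remove_isolated_pixels(noisy)
```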
Smoothing and pixel flipping
Problems in pattern matching
Česká republika
Low quality original and/or scan + inappropriate processing
Soft pattern matching
- Better work with dictionaries: a pixel chunk is replaced by a dictionary reference only where it satisfies the threshold value.
- If not, the whole small bitmap is stored.
- Tuning these mechanisms is the key to successful application of lossy compression to a bitonal image.
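The thresholded replacement can be sketched as below. The distance measure (number of differing pixels) and the parameter `max_diff` are assumptions made for illustration; they are not the criteria actually used by JB2 or JBIG2 encoders, and `encode_soft` is a hypothetical name.

```python
from typing import List, Tuple

Bitmap = Tuple[str, ...]  # one string of '0'/'1' characters per pixel row


def differing_pixels(a: Bitmap, b: Bitmap) -> int:
    """Count the pixels in which two bitmaps of identical size differ."""
    return sum(pa != pb for ra, rb in zip(a, b) for pa, pb in zip(ra, rb))


def same_shape(a: Bitmap, b: Bitmap) -> bool:
    return len(a) == len(b) and all(len(ra) == len(rb) for ra, rb in zip(a, b))


def encode_soft(glyphs: List[Bitmap], max_diff: int):
    """Reference a dictionary entry when it is close enough, else store the bitmap."""
    dictionary: List[Bitmap] = []
    references: List[int] = []
    for bitmap in glyphs:
        match = None
        for index, entry in enumerate(dictionary):
            if same_shape(entry, bitmap) and differing_pixels(entry, bitmap) <= max_diff:
                match = index
                break
        if match is None:  # threshold not satisfied: keep the whole small bitmap
            dictionary.append(bitmap)
            match = len(dictionary) - 1
        references.append(match)
    return dictionary, references


clean = ("010", "111", "101")
noisy = ("010", "111", "100")  # same letter, one pixel damaged by scanning noise
other = ("111", "100", "111")  # a different letter
soft_dictionary, soft_refs = encode_soft([clean, noisy, other], max_diff=1)
strict_dictionary, strict_refs = encode_soft([clean, noisy, other], max_diff=0)
```

With `max_diff=1` the noisy letter is replaced by the clean one (lossy, smaller dictionary); with `max_diff=0` every variant is stored. A threshold set too high would start replacing letters with different letters, which is why tuning matters.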
How to know…
- Libraries have documents of various qualities, some of them very bad.
- These documents are more difficult to process than the good samples presented by software producers.
- Tests… tests… tests… on typical materials
Bitonal compression
- Lossless: LZW, PNG, …, CCITT Fax Group 3 and 4, JB2, JBIG, JBIG2, Algo Vision/Luratech (1-bit LDF component)
- Lossy modern schemes:
  - AT&T (Lizardtech) (JB2): soft pattern matching
  - ImagePower Inc. JBIG2 (JB2): only pattern matching
  - Summus Inc. (Lightning Strike), ...
GIF would be slightly worse than PNG
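As background to the fax schemes named above: the one-dimensional mode of CCITT Group 3 encodes each scan line as alternating white and black run lengths, beginning with a white run. The sketch below shows only that run-length step and omits the code tables that the standard then applies to the run lengths. The names `run_lengths` and `expand` are hypothetical.

```python
from itertools import groupby
from typing import List


def run_lengths(row: List[int]) -> List[int]:
    """Return alternating white/black run lengths; the first run is always white."""
    runs = [len(list(group)) for _, group in groupby(row)]
    if row and row[0] == 1:
        runs.insert(0, 0)  # line starts with black: emit an empty white run first
    return runs


def expand(runs: List[int]) -> List[int]:
    """Rebuild the scan line from its run lengths (0 = white, 1 = black)."""
    row: List[int] = []
    colour = 0
    for length in runs:
        row.extend([colour] * length)
        colour = 1 - colour
    return row


line = [0, 0, 0, 1, 1, 0, 0, 0, 0, 1]
runs = run_lengths(line)
print(runs)
```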
Květy české – 19th century Czech journal
Impact of the quality of digitized originals on
performance of compression schemes
JB2
- One of the most efficient compression schemes is JB2 from the DjVu format (AT&T).
- It enables compression that is:
  - lossless
  - lossy
  - aggressive, while preserving high quality
JB2 as a component part of the DjVu format
- Several files can be merged and saved into one (as in PDF); they share a common dictionary, so together they are smaller than the sum of the individual files.
- Several files can be joined virtually (they are called one after another from the server).
- More advantages: display, references, OCR, … (DjVu plug-in)
- Expensive software, or free software for Linux or Solaris
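Why a common dictionary makes the merged file smaller can be shown by simply counting stored bitmaps. This is a sketch under the assumption that each page is a list of letter bitmaps matched exactly; the helper names and the sample glyphs are hypothetical and do not describe the actual DjVu file layout.

```python
from typing import List, Tuple

Bitmap = Tuple[str, ...]


def stored_separately(pages: List[List[Bitmap]]) -> int:
    """Bitmaps stored when every page keeps its own dictionary."""
    return sum(len(set(page)) for page in pages)


def stored_shared(pages: List[List[Bitmap]]) -> int:
    """Bitmaps stored when all pages share one dictionary."""
    return len(set().union(*(set(page) for page in pages)))


A = ("010", "111", "101")
B = ("110", "111", "110")
C = ("011", "100", "011")
pages = [[A, B, A], [A, C], [B, C, C]]
print(stored_separately(pages), stored_shared(pages))
```

In this toy case three separate files store six bitmaps in total, while one merged file stores three.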
Samples and résumé
- Monitor and test new approaches for image processing
- They can be very suitable for document delivery services
  - Image servers
  - Scanned content
Which formats to use for bitonal image?
If you have no special tools:
- GIF
- If you want smaller files, use PNG
- Both are recommended for the WWW
- However, TIFF/CCITT Fax Gr. 4 is better
- Use DjVu if you want very small files
Problems
- Good image editing software does not support TIFF with Gr. 4 encoding
- Display is possible with normal Windows tools
- GIF and PNG also support higher brightness resolution (8-bit / 24-bit): take care not to save a bi-level image at a higher bit depth
- DjVu: the authoring software problem has to be solved
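The cost of saving a bi-level image at a higher bit depth is easy to work out before any compression is applied. The sketch below assumes an A4 page scanned at 300 dpi (2480 x 3508 pixels); the function name `raw_size_bytes` is hypothetical.

```python
def raw_size_bytes(width: int, height: int, bits_per_pixel: int) -> int:
    """Uncompressed raster size, with each row padded to a whole byte."""
    bytes_per_row = (width * bits_per_pixel + 7) // 8
    return bytes_per_row * height


WIDTH, HEIGHT = 2480, 3508  # assumption: A4 page at 300 dpi
for depth in (1, 8, 24):
    print(depth, raw_size_bytes(WIDTH, HEIGHT, depth))
# 1 1087480
# 8 8699840
# 24 26099520
```

The same page is 8 times larger as 8-bit data and 24 times larger as 24-bit data. Compression narrows the gap but does not remove the reason to keep bitonal scans at 1 bit.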
Lossy compression –
bitonal image
Compression of colour images
- Lossless
  - LZW
    - GIF (8-bit only)
    - TIFF (5.0)
    - …
  - PNG
  - Wavelet
    - JPEG2000 (JP2)
- Lossy
  - DCT (JPEG)
  - Fractals
  - Wavelet
    - IW44
    - LWF, WI
    - JPEG2000 (JP2)
    - MrSID, …
Classical (LZW, RLE, DCT) versus wavelet approaches.
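To give a feeling for what a wavelet approach does, the sketch below applies one level of the integer Haar transform (the simplest wavelet) to a row of pixel values. It is an illustration only: the formats listed use more elaborate filters and apply them in two dimensions over several levels. The function names `forward` and `inverse` are hypothetical.

```python
from typing import List, Tuple


def forward(samples: List[int]) -> Tuple[List[int], List[int]]:
    """One level of the integer Haar transform on an even-length row."""
    averages, details = [], []
    for a, b in zip(samples[0::2], samples[1::2]):
        averages.append((a + b) // 2)  # low-pass part: rounded-down mean of the pair
        details.append(a - b)          # high-pass part: difference within the pair
    return averages, details


def inverse(averages: List[int], details: List[int]) -> List[int]:
    """Exactly undo forward(); integer arithmetic makes the step lossless."""
    samples: List[int] = []
    for s, d in zip(averages, details):
        a = s + (d + 1) // 2
        samples.extend([a, a - d])
    return samples


row = [100, 102, 101, 99, 20, 22, 200, 200]
averages, details = forward(row)
print(averages, details)
```

The averages form a half-resolution version of the row, and the details are small numbers wherever the image is smooth. Keeping the details exactly gives lossless coding; quantizing or dropping them gives lossy coding.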
True colour image
DCT
wavelet
Testing compression efficiency
- Sample
  - Reference
  - Full-colour (JPEG, wavelet)
  - 1-bit (establish thresholds: Paint Shop Pro, LuraWave)
  - MRC (same sample: DjVu Solo)
Compression efficiency –
bitonal image
Compression efficiency
True colour
How to apply compression?
It depends on the character of the objects in the image:
- Photorealistic image (JPEG, wavelet)
- Text and simple black-and-white graphics (Fax Group 4, JB2, …)
- Colour graphics (hard to compress lossily; better lossless PNG or GIF; an application area for vector graphics, SVG)
- Mixed content (composed solutions: DjVu, LDF, …)

The most efficient solution
Segment the image into two or more groups of objects:
1. Objects good for bitonal conversion
2. Objects good for true colour representation
Then compress each group separately and merge them into one format.
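The segment, compress separately, then merge idea can be sketched on a tiny grey-level image. Everything specific here is an assumption made for illustration: the fixed threshold of 64, the 12-byte header, and the use of zlib as a stand-in for both the bitonal coder and the colour coder. Real mixed formats such as DjVu or LDF use dedicated coders and their own containers.

```python
import struct
import zlib
from typing import List

Image = List[List[int]]  # 8-bit grey values, 0 = black


def encode_layers(image: Image, threshold: int = 64) -> bytes:
    """Split into a text mask and a background, compress each, merge into one blob."""
    height, width = len(image), len(image[0])
    # Group 1: dark pixels become the bitonal mask (one byte per pixel for brevity).
    mask = bytes(1 if px < threshold else 0 for row in image for px in row)
    # Group 2: the background, with masked pixels painted white.
    background = bytes(255 if px < threshold else px for row in image for px in row)
    mask_z = zlib.compress(mask)
    background_z = zlib.compress(background)
    header = struct.pack(">HHII", width, height, len(mask_z), len(background_z))
    return header + mask_z + background_z


def decode_layers(blob: bytes) -> Image:
    """Decompress both layers and draw black text over the background."""
    width, height, mask_len, bg_len = struct.unpack(">HHII", blob[:12])
    mask = zlib.decompress(blob[12:12 + mask_len])
    background = zlib.decompress(blob[12 + mask_len:12 + mask_len + bg_len])
    flat = [0 if m else b for m, b in zip(mask, background)]
    return [flat[r * width:(r + 1) * width] for r in range(height)]


page = [
    [200, 0, 0, 205],
    [210, 0, 201, 202],
]
restored = decode_layers(encode_layers(page))
```

In this sketch the text is always redrawn as pure black, so dark pixels that were not exactly 0 in the original would come back changed.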
Horizontal segmentation/zoning
Horizontally
- Text
- Graphics
- Photographs
Imagepower Inc.
Vertical segmentation/zoning
Vertically
- Foreground
- Background
Lizardtech Inc. (AT&T)
Luratech GmbH
DjVu, LDF
Comparison of DjVu and LDF
DjVu: 6 layers
- Foreground: JB2, IW44
- Background: 4 layers IW44
LDF: 3 layers
- Foreground: LDF 1-bit Comp., LWF
- Background: 1 layer LWF, JP2
Bitonal versus composed image
Grey level
Other DjVu properties
More images in one:
- as in TIFF, PDF, LDF, …, with use of the common dictionary of pixel chunks
- Virtually: pages remain on the server and only the page that is called is delivered
Multiresolution image
MrSID
- In one file, several (up to 8) images in various resolutions
- Sample
- Efficient with an image server
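The idea of several resolutions of one image can be sketched as a simple pyramid built by repeated halving. This only illustrates the concept: it averages 2x2 pixel blocks, which is an assumption made for brevity and not how MrSID derives its resolution levels. The limit of 8 levels follows the figure given above; the function names are hypothetical.

```python
from typing import List

Image = List[List[int]]  # grey values, one list per pixel row


def halve(image: Image) -> Image:
    """Halve the resolution in both directions by averaging 2x2 pixel blocks."""
    return [
        [
            (image[y][x] + image[y][x + 1] + image[y + 1][x] + image[y + 1][x + 1]) // 4
            for x in range(0, len(image[0]) - 1, 2)
        ]
        for y in range(0, len(image) - 1, 2)
    ]


def build_pyramid(image: Image, max_levels: int = 8) -> List[Image]:
    """Return the full-resolution image followed by successively halved versions."""
    levels = [image]
    while len(levels) < max_levels and len(levels[-1]) >= 2 and len(levels[-1][0]) >= 2:
        levels.append(halve(levels[-1]))
    return levels


image = [
    [0, 0, 8, 8],
    [0, 0, 8, 8],
    [4, 4, 12, 12],
    [4, 4, 12, 12],
]
pyramid = build_pyramid(image)
print([(len(level), len(level[0])) for level in pyramid])
```

An image server can then deliver a low-resolution level for an overview and fetch the finer levels only for the region being zoomed.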
SAMPLES
Samples of various compression
solutions