JPEG2000 presentation


Image Processing seminar (2003)

JPEG2000

The next generation still image-compression standard

Presented by:

Eddie Zaslavsky

Contents :

1. Why another standard?
2. JPEG2000
3. Examples
4. Conclusions
5. Add-on: EZW algorithm

Why another standard?

• Low bit-rate compression: At low bit-rates (e.g. below 0.25 bpp for highly detailed gray-level images) the distortion in JPEG becomes unacceptable.

• Lossless and lossy compression: There is a need for a standard which provides lossless and lossy compression in one codestream.

• Large images: JPEG doesn't compress images greater than 64K x 64K without tiling.

Why another standard? (cont'd)

• Single decompression architecture: JPEG has 44 modes, many of them application specific and not used by the majority of JPEG decoders.

• Transmission in noisy environments: in JPEG, quality suffers dramatically when bit errors are encountered.

• Computer generated imagery: JPEG is optimized for natural images and performs badly on computer generated images.

• Compound documents: JPEG fails to compress bi-level (text) imagery.

JPEG2000 - Targets

• Coding standard for:
  - different types of still images (gray-level, color, ...)
  - different characteristics (natural, scientific, ...)
  - different imaging models (client/server, real-time, ...)
  within a unified and integrated system.

• This coding system is intended for low bit-rate applications, exhibiting rate-distortion and subjective image quality performance superior to existing standards.

JPEG2000 (encoder-decoder scheme)


JPEG2000 - Overview

• The source image is decomposed into components (up to 256).

• The image components are (optionally) decomposed into rectangular tiles . The tile-component is the basic unit of the original or reconstructed image.

• A wavelet transform is applied on each tile. The tile is decomposed into different resolution levels.

• The decomposition levels are made up of subbands of coefficients that describe the frequency characteristics of local areas of the tile components, rather than across the entire image component.

• The sub-bands of coefficients are quantized and collected into rectangular arrays of code blocks.

JPEG2000 - Overview (cont'd)

• The bit planes of the coefficients in a code block (i.e. the bits of equal significance across the coefficients in a code block) are entropy coded.

• The encoding can be done in such a way that certain regions of interest (ROI) can be coded at a higher quality than the background.

• Markers are added to the bit stream to allow for error resilience.

• The code stream has a main header at the beginning that describes the original image and the various decomposition and coding styles that are used to locate, extract, decode and reconstruct the image with the desired resolution, fidelity, region of interest or other characteristics.

Pre-Processing

1. Image tiling:
• The image may be quite large in comparison to the amount of memory available to the codec.
• The original image is partitioned into rectangular non-overlapping blocks (tiles), to be compressed independently.

2. DC-level shifting:
• The codec expects its input sample data to have a nominal dynamic range that is approximately centered about zero (0..255 -> -128..127).
• If the sample values are unsigned, the nominal dynamic range of the samples is adjusted by subtracting a bias of 2^(P-1) from each of the sample values, where P is the component's precision.
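The DC-level shift described above can be sketched in a few lines. This is a minimal illustration, assuming unsigned integer samples held in a plain list; the function names are hypothetical and not taken from any codec.

```python
def dc_level_shift(samples, precision):
    """Subtract the bias 2**(precision - 1) from unsigned samples."""
    bias = 1 << (precision - 1)
    return [s - bias for s in samples]


def dc_level_unshift(samples, precision):
    """Undo the shift on the decoder side by adding the same bias back."""
    bias = 1 << (precision - 1)
    return [s + bias for s in samples]


# 8-bit samples: the range 0..255 maps to -128..127
shifted = dc_level_shift([0, 128, 255], precision=8)
restored = dc_level_unshift(shifted, precision=8)
```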

Pre-Processing - Tiling

• All operations, including component mixing, wavelet transform, quantization and entropy coding, are performed independently on the image tiles.

• Tiling affects the image quality both subjectively and objectively.

• Smaller tiles create more tiling artifacts.

Pre-Processing (cont'd)

3. Components transformation:
• Maps data from RGB to YCrCb (Y, Cr, Cb are less statistically dependent and compress better); serves to reduce the correlation between components, leading to improved coding efficiency. There are reversible and irreversible transforms.

[Equations: forward reversible component transform; inverse reversible component transform]

Pre-Processing - Component Transformations

• Component transformations improve compression and allow visually relevant quantization:
• Irreversible component transformation (ICT):
  - Floating point
  - For use with the irreversible (floating point 9/7) wavelet
• Reversible component transformation (RCT):
  - Integer approximation
  - For use with the reversible (integer 5/3) wavelet
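As an illustration of why the RCT is reversible, here is a minimal sketch. The formulas are the ones commonly given for the JPEG2000 Part 1 reversible transform; they are not recoverable from the slide text itself, so treat them as an assumption. Function names are hypothetical.

```python
def rct_forward(r, g, b):
    """Integer-only forward transform; // is floor division."""
    y = (r + 2 * g + b) // 4
    cb = b - g
    cr = r - g
    return y, cb, cr


def rct_inverse(y, cb, cr):
    """Exact inverse: green is recovered first, then red and blue."""
    g = y - (cb + cr) // 4
    r = cr + g
    b = cb + g
    return r, g, b


pixel = (200, 30, 77)
assert rct_inverse(*rct_forward(*pixel)) == pixel  # no rounding loss
```

Because r + 2g + b = 4g + cb + cr, the floor term in the forward step is cancelled exactly by the floor term in the inverse, which is what makes lossless coding possible.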

ICT (example)


Wavelet Transform

• Floating point 9/7 wavelet filter for lossy compression:
  - Best performance at low bit rate
  - High implementation complexity, especially for hardware
• Integer 5/3 wavelet filter for lossless coding:
  - Integer arithmetic, low implementation complexity
• We filter each row and column with a high pass and a low pass filter, followed by downsampling by 2 (to keep the sample rate).
• Now we have divided the tile into sub-bands. All info (index, position, precincts, etc.) regarding the single tile is put together in a contiguous stream of data called a packet.

Code-blocks, precincts and packets


Wavelet Transform

Two filtering modes:

• Convolution based: performing a series of dot products between the two filter masks and the extended 1-D signal.

• Lifting based: a sequence of very simple filtering operations in which odd sample values of the signal are alternately updated with a weighted sum of even sample values, and vice versa.

[Figures: lifting schemes for the lossless 1D DWT and the lossy 1D DWT]

P and U stand for Prediction and Update.

Lifting coefficients for the lossy 1D DWT: α = 1.586, β = 0.052, γ = 0.882, δ = 0.443, K = 1.230

Wavelet Transform

Wavelet Transform

• Symmetric extension: the signal is mirrored at both boundaries so that, for the filtering operations that take place there, a signal sample exists that spatially corresponds to each coefficient of the filter mask.
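The lifting mode and the symmetric extension can be tried out together on the lossless 5/3 filter. The sketch below is one level of a 1D transform on a list of integers, assuming the usual 5/3 predict and update steps and whole-sample mirroring at the borders; the function names are hypothetical and the input must have at least two samples.

```python
def mirror(i, n):
    """Whole-sample symmetric extension: reflect index i into 0..n-1."""
    period = 2 * (n - 1)
    i = abs(i) % period
    return i if i < n else period - i


def dwt53_forward(x):
    """One level of the integer 5/3 DWT: returns (low-pass s, high-pass d)."""
    n = len(x)
    assert n >= 2
    # Predict: each odd sample minus the floor-average of its even neighbours
    d = [x[2 * k + 1] - (x[2 * k] + x[mirror(2 * k + 2, n)]) // 2
         for k in range(n // 2)]

    def detail(k):
        # Detail value at odd position 2k+1, mirrored at the borders
        return d[mirror(2 * k + 1, n) // 2]

    # Update: each even sample plus a rounded quarter of neighbouring details
    s = [x[2 * k] + (detail(k - 1) + detail(k) + 2) // 4
         for k in range((n + 1) // 2)]
    return s, d


def dwt53_inverse(s, d):
    """Undo the update step, then the predict step, in reverse order."""
    n = len(s) + len(d)

    def detail(k):
        return d[mirror(2 * k + 1, n) // 2]

    x = [0] * n
    x[0::2] = [s[k] - (detail(k - 1) + detail(k) + 2) // 4
               for k in range(len(s))]
    for k in range(len(d)):
        x[2 * k + 1] = d[k] + (x[2 * k] + x[mirror(2 * k + 2, n)]) // 2
    return x


signal = [3, 6, 5, 1, 8, 2, 7]
low, high = dwt53_forward(signal)
assert dwt53_inverse(low, high) == signal  # integer lifting is lossless
```

Each lifting step only adds a value computed from the other half of the samples, so the inverse simply subtracts the same value, which is why rounding does not break reversibility.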

DWT (example)

In JPEG2000 multiple stages of the DWT are performed. JPEG2000 supports from 0 to 32 stages. For natural images, usually between 4 and 8 stages are used.

Quantization

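• The wavelet coefficients are quantized using a uniform quantizer with deadzone. For each subband b, a basic quantizer step size Δ_b is used to quantize all the coefficients in that subband according to:

$q_b(u,v) = \operatorname{sign}(y_b(u,v)) \left\lfloor \frac{|y_b(u,v)|}{\Delta_b} \right\rfloor$

where y_b(u,v) is a wavelet coefficient of subband b and q_b(u,v) is its quantizer index.

• Example: Given a quantizer step of 10 and an encoder input value of 21.82, the quantizer index is q = sign(21.82) · floor(21.82 / 10) = 2.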
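The deadzone quantizer and the slide's numeric example can be checked with a short sketch. The quantizer follows the usual sign-times-floor rule; the mid-interval reconstruction offset of 0.5 in the dequantizer is an assumption (a common decoder choice), and the function names are hypothetical.

```python
import math


def quantize(value, step):
    """Deadzone uniform quantizer: sign(value) * floor(|value| / step)."""
    index = math.floor(abs(value) / step)
    return index if value >= 0 else -index


def dequantize(index, step, offset=0.5):
    """Reconstruct inside the quantization interval (offset is an assumption)."""
    if index == 0:
        return 0.0
    sign = 1 if index > 0 else -1
    return sign * (abs(index) + offset) * step


q = quantize(21.82, 10)   # the slide's example: index 2
approx = dequantize(q, 10)
```

Values with magnitude below one step all map to index 0, so the zero bin is twice as wide as the others; this is the "deadzone".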

Coefficient Bit Modeling

• Wavelet coefficients are associated with different sub-bands arising from the 2D separable transform applied.

• These coefficients are then arranged into rectangular blocks within each sub-band, called code-blocks.

Coefficient Bit Modeling (cont'd)

• Code-blocks are then coded one bit-plane at a time, starting from the Most Significant Bit-Plane and ending with the Least Significant Bit-Plane. If some of the top bit-planes contain no 1s, the MSB-plane is set to the topmost bit-plane with at least one 1, and the number of skipped bit-planes is encoded in a header.

[Figure: bit-planes of a code-block with the MSB-plane marked]
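A small sketch of the bit-plane view of a code-block follows. It assumes the coefficients are given as signed integers and that the nominal number of magnitude bit-planes (`total_planes`) is known; both the function name and the sample values are hypothetical.

```python
def bit_planes(coefficients, total_planes):
    """Split magnitudes into bit-planes, MSB first.

    Returns (skipped, planes): the number of all-zero top planes that
    would be signalled in a header, and the remaining planes as 0/1 lists.
    """
    magnitudes = [abs(c) for c in coefficients]
    used = max(magnitudes).bit_length()  # planes holding at least one 1 at the top
    skipped = total_planes - used
    planes = [[(m >> p) & 1 for m in magnitudes]
              for p in range(used - 1, -1, -1)]
    return skipped, planes


skipped, planes = bit_planes([3, -6, 5, 1], total_planes=8)
# skipped is 5; planes[0] is the MSB-plane of the block
```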

Coefficient Bit Modeling (cont'd)

• For each bit-plane in a code-block, a special code-block scan pattern is used for each of three coding passes.

3 Passes Scanning

• Each coefficient bit in the bit-plane is coded in only one of the three coding passes:
1. Significance Propagation
2. Magnitude Refinement
3. Clean-up

3 Passes Scanning

1. Significance Propagation Pass:
• If a bit is insignificant (=0) but at least one of its eight neighbors is significant (=1), then it is encoded.
• If the bit is a 1 at the same time, its significance flag is set to 1 and the sign of the symbol is encoded.

2. Magnitude Refinement Pass:
• Codes samples which are already significant and were not coded in the significance propagation pass.

3. Clean-up Pass:
• Codes all bits which were passed over by the previous two coding passes (insignificant bits). It is the first pass for the MSB plane.

The encoding is done by the MQ-coder, a low complexity entropy coder.
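To make the pass membership rules concrete, the sketch below assigns each bit of one bit-plane to a pass. It is a simplification under stated assumptions: a plain raster scan replaces the standard's stripe-based scan pattern, no context modelling or MQ coding is done, and all names are hypothetical.

```python
def assign_passes(plane_bits, was_significant):
    """Label each position with the pass that codes its bit in this plane.

    plane_bits: 2D list of 0/1 for the current bit-plane.
    was_significant: 2D list of bools, the state before this plane.
    Returns (passes, significance state after the plane).
    """
    rows, cols = len(plane_bits), len(plane_bits[0])
    sig = [row[:] for row in was_significant]
    passes = [[None] * cols for _ in range(rows)]

    def has_significant_neighbor(r, c):
        for dr in (-1, 0, 1):
            for dc in (-1, 0, 1):
                rr, cc = r + dr, c + dc
                if (dr or dc) and 0 <= rr < rows and 0 <= cc < cols and sig[rr][cc]:
                    return True
        return False

    # Pass 1: insignificant bits with a significant neighbour; a 1 bit
    # makes the coefficient significant immediately, so it can propagate.
    for r in range(rows):
        for c in range(cols):
            if not sig[r][c] and has_significant_neighbor(r, c):
                passes[r][c] = "significance"
                if plane_bits[r][c]:
                    sig[r][c] = True

    # Pass 2: coefficients that were already significant before this plane.
    for r in range(rows):
        for c in range(cols):
            if passes[r][c] is None and was_significant[r][c]:
                passes[r][c] = "refinement"

    # Pass 3: everything that the first two passes skipped.
    for r in range(rows):
        for c in range(cols):
            if passes[r][c] is None:
                passes[r][c] = "cleanup"
                if plane_bits[r][c]:
                    sig[r][c] = True
    return passes, sig


passes, state = assign_passes([[1, 0, 1, 1, 0]],
                              [[False, False, True, False, False]])
```

On the MSB-plane nothing is significant yet, so every bit falls through to the clean-up pass, which matches the remark that clean-up is the first pass for that plane.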

Quality layers organization

• The resulting bit streams for each code-block are organized into quality layers. A quality layer is a collection of some consecutive bit-plane coding passes from each code-block in a tile. Each code-block can contribute an arbitrary number of bit-plane coding passes to a layer, but not all coding passes must be assigned to a quality layer. Every additional layer increases the image quality.

Rate Control

• Rate control is the process by which the code-stream is altered so that a target bit rate can be reached.

• Once the entire image has been compressed, a post processing operation passes over all the compressed blocks and determines the extent to which each block's embedded bit stream should be truncated in order to achieve the target bit rate.

• The ideal truncation strategy is one that minimizes distortion while still reaching the target bit-rate.

• The code-blocks are compressed independently, so any bit stream truncation policy can be used.
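One possible truncation policy is a Lagrangian search over per-block truncation points. The sketch below is an illustration under assumptions, not the slides' method: each block is described by hypothetical (rate, distortion) candidates whose first entry has rate 0, and a bisection on the multiplier finds a choice that fits the budget.

```python
def pick_points(blocks, lam):
    """For each block, pick the truncation point minimizing D + lam * R."""
    choice = []
    for points in blocks:
        costs = [dist + lam * rate for rate, dist in points]
        choice.append(costs.index(min(costs)))
    return choice


def total_rate(blocks, choice):
    return sum(points[i][0] for points, i in zip(blocks, choice))


def truncate_to_budget(blocks, budget, iterations=60):
    """Bisect on lam; 'hi' always holds a multiplier that meets the budget."""
    lo, hi = 0.0, 1e12
    for _ in range(iterations):
        mid = (lo + hi) / 2
        if total_rate(blocks, pick_points(blocks, mid)) > budget:
            lo = mid   # too many bytes: penalize rate more strongly
        else:
            hi = mid
    return pick_points(blocks, hi)


# Two code-blocks, each with (rate in bytes, distortion) truncation candidates
blocks = [[(0, 100), (10, 40), (20, 10)],
          [(0, 50), (10, 45), (20, 44)]]
choice = truncate_to_budget(blocks, budget=20)
```

Here the whole budget goes to the first block, because its distortion drops far faster per byte than the second block's does.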

Bit stream organization

• In bit stream organization, the compressed data from the bit-plane coding passes are separated into packets.

• Then, the packets are multiplexed together in an ordered manner to form one code-stream.

• Each precinct generates one packet, even if the packet is empty. A packet is composed of a header and the compressed data.

Bit stream organization


Bit stream organization (cont'd)

• There are 5 ways to order the packets, called progressions, where position refers to the precinct number:
  Quality: layer, resolution, component, position
  Resolution 1: resolution, layer, component, position
  Resolution 2: resolution, position, component, layer
  Position: position, component, resolution, layer
  Component: component, position, resolution, layer

• The sorting keys of each progression are listed from most significant to least significant. It is also possible for the progression order to change arbitrarily in the code-stream.
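Each progression is simply a different nesting of four loops. The sketch below enumerates packets for a given progression, assuming every resolution has the same number of precincts (a simplification) and using the slide's progression names as dictionary keys; `packet_order` is a hypothetical helper.

```python
from itertools import product

# Sort keys from most significant (outer loop) to least significant (inner loop)
PROGRESSIONS = {
    "Quality": ("layer", "resolution", "component", "position"),
    "Resolution 1": ("resolution", "layer", "component", "position"),
    "Resolution 2": ("resolution", "position", "component", "layer"),
    "Position": ("position", "component", "resolution", "layer"),
    "Component": ("component", "position", "resolution", "layer"),
}


def packet_order(progression, counts):
    """List packets as dicts, in the order they would be multiplexed."""
    keys = PROGRESSIONS[progression]
    ranges = [range(counts[k]) for k in keys]
    return [dict(zip(keys, combo)) for combo in product(*ranges)]


counts = {"layer": 2, "resolution": 2, "component": 1, "position": 1}
by_quality = [(p["layer"], p["resolution"])
              for p in packet_order("Quality", counts)]
by_resolution = [(p["layer"], p["resolution"])
                 for p in packet_order("Resolution 1", counts)]
```

With the Quality progression all packets of layer 0 arrive before any of layer 1, so a truncated stream gives the full-size image at lower quality; with Resolution 1 a truncated stream gives a smaller image at full quality.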

Code stream organization (diagram)


Decoding

• The decoder basically performs the opposite of the encoder.

• The code-stream is received by the decoder according to the progression order stated in the header. The coefficients in the packets are then decoded and dequantized, and the inverse DWT and the reverse ICT are performed.

• In the case of irreversible compression, the decompression results in loss of data. The resulting image is not exactly like the original.
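The reverse ICT mentioned above can be sketched as follows. The coefficients are the rounded values commonly quoted for the JPEG2000 irreversible transform and are an assumption here, since the slide's own equation did not survive; function names are hypothetical.

```python
def ict_forward(r, g, b):
    """Irreversible (floating point) RGB to Y, Cb, Cr."""
    y = 0.299 * r + 0.587 * g + 0.114 * b
    cb = -0.16875 * r - 0.33126 * g + 0.5 * b
    cr = 0.5 * r - 0.41869 * g - 0.08131 * b
    return y, cb, cr


def ict_inverse(y, cb, cr):
    """Reverse ICT, applied by the decoder after the inverse DWT."""
    r = y + 1.402 * cr
    g = y - 0.34413 * cb - 0.71414 * cr
    b = y + 1.772 * cb
    return r, g, b


original = (255.0, 0.0, 0.0)
recovered = ict_inverse(*ict_forward(*original))
# recovered is close to, but not bit-exactly, the original
```

The small residual error comes from the rounded floating point coefficients, which is one reason the ICT is paired with the lossy 9/7 wavelet only.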

Characteristics:

So, what is new in JPEG2000 compared to previous encoding protocols?

1. Compress once - decompress many ways
2. Region-Of-Interest encoding
3. Progression
4. Error resilience

JPEG2000 - Markets & Applications


Compress once, decompress many ways

• In JPEG2000, the compressor decides the maximum resolution and maximum image quality to be used.

• It is also possible to perform random access by decompressing only a certain region of the image or a specific component of the image (e.g. the grayscale component of a color image). Both can be performed with varying qualities and resolutions.

• In each case it is possible to locate, extract, and decode the bytes required for the desired image product without decoding the entire code-stream.

Region-of-interest (ROI)

• A ROI is a part of an image that is encoded with higher quality than the rest of the image (the background). The encoding is done in such a way that the information associated with the ROI precedes the information associated with the background.

• 2 methods: Scaling based and Maxshift

Region-of-interest (ROI) - Scaling based

1. The wavelet transform is calculated.
2. The ROI mask is derived, indicating the set of coefficients that are required for up to lossless ROI reconstruction.
3. The wavelet coefficients are quantized.
4. The coefficients that lie outside the ROI are downscaled by a specified scaling value.
5. The resulting coefficients are progressively entropy encoded (with the most significant bit planes first).
6. The ROI's scaling value and coordinates are added to the bit stream.

Region-of-interest (ROI) - Maxshift method

• A ROI mask (a bit map) is created describing which quantized transform coefficients must be encoded with better quality.

• The quantized transform coefficients outside the ROI mask (background coefficients) are scaled down so that the bits associated with the ROI are placed in the highest bit-planes and coded before the background.

• Selection of scaling value S: S ≥ max(Mb), where Mb is the largest number of magnitude bit planes for any background coefficient in any code-block in the current component. After the scaling of the background coefficients, the LSB of all shifted ROI coefficients is above the (non zero) MSB of all background coefficients.

• Advantage: arbitrarily shaped ROIs without the need for shape information at the decoder.
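The Maxshift idea can be sketched on a flat list of quantized magnitudes. The slide describes scaling the background down; the sketch uses the equivalent view of shifting ROI magnitudes up by S bit-planes. The list-based data layout and the function names are assumptions made for illustration.

```python
def maxshift_encode(magnitudes, roi_mask):
    """Shift ROI magnitudes above every background bit-plane.

    Returns (s, shifted): s is the scaling value signalled to the decoder.
    """
    background = [m for m, in_roi in zip(magnitudes, roi_mask) if not in_roi]
    s = max((m.bit_length() for m in background), default=0)
    shifted = [m << s if in_roi else m
               for m, in_roi in zip(magnitudes, roi_mask)]
    return s, shifted


def maxshift_decode(s, shifted):
    """No mask needed: anything at or above 2**s must belong to the ROI."""
    threshold = 1 << s
    return [m >> s if m >= threshold else m for m in shifted]


magnitudes = [5, 12, 0, 3, 7]
roi_mask = [False, True, True, False, False]
s, shifted = maxshift_encode(magnitudes, roi_mask)
restored = maxshift_decode(s, shifted)
```

Since the decoder separates ROI from background by magnitude alone, only S has to be transmitted, which is why arbitrarily shaped ROIs need no shape information.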

ROI - example

[Figure: original image with ROI defined; decoded image with ROI intact]

Scalability and bit-stream parsing

• 2 important modes of scalability:
  - Resolution/Spatial
  - Quality (SNR)

• Bit-stream parsing:
  - A combination of spatial and quality scalability.
  - It is possible to progress by spatial scalability to a given (resolution) level and then change the progression by SNR at a higher level.

Resolution scalability

[Figures: four slides illustrating resolution scalability]

Quality scalability

[Figures: three slides illustrating quality scalability]

Error resilience

Error effects:

1. In a packet body: corrupted arithmetically coded data for some code-block => severe distortion.

2. In a packet head: a wrong body length can be decoded and code-block data can be assigned to the wrong code-blocks => total synchronization loss.

3. Bytes missing (i.e. network packet loss): combined effects of errors in packet head and body.

Protecting code-block data

1. Segmentation symbols: a special symbol sequence is coded at the end of each bit-plane. If the wrong sequence is decoded, an error has occurred and the last bit-plane is corrupted (at least).

2. Regular predictable termination: the arithmetic coder is terminated at the end of each coding pass using a special algorithm (predictable termination). The decoder reproduces the termination and if it does not find the same unused bits at the end, an error has occurred in the last coding pass (at least).

3. Both mechanisms can be freely mixed, but they slightly decrease the compression efficiency.

Protecting packet head

1. SOP resynchronization marker: every packet can be preceded by an SOP marker with a sequence index. If an SOP marker with the correct sequence index isn't found just before the packet head, an error has occurred. In such a case the next unaffected packet is searched for in the codestream, and decoding proceeds from there.

2. PPM/PPT markers: the packet head content can be moved to the main or tile headers in the codestream and transmitted through a channel with a much lower error rate.

3. Precincts: they limit packet head errors to a small image area.
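Resynchronizing on SOP markers amounts to a byte search. The sketch below assumes the SOP segment layout usually given for JPEG2000 (marker 0xFF91, a length field equal to 4, then a 16-bit sequence index); treat those constants as an assumption to be checked against the standard. The helper names are hypothetical.

```python
SOP_MARKER = b"\xff\x91"          # assumed marker code for start of packet
SOP_LENGTH = (4).to_bytes(2, "big")


def make_sop(index):
    """Build an SOP segment carrying a 16-bit packet sequence index."""
    return SOP_MARKER + SOP_LENGTH + (index % 65536).to_bytes(2, "big")


def find_sop(stream, start=0):
    """Return (offset, index) of the next well-formed SOP segment, or None."""
    pos = stream.find(SOP_MARKER, start)
    while pos != -1:
        if pos + 6 <= len(stream) and stream[pos + 2:pos + 4] == SOP_LENGTH:
            return pos, int.from_bytes(stream[pos + 4:pos + 6], "big")
        pos = stream.find(SOP_MARKER, pos + 1)
    return None


# Two packets with dummy bodies; a decoder that loses sync inside the
# first one can search forward for the next SOP and resume there.
stream = make_sop(0) + b"abc" + make_sop(1) + b"de"
first = find_sop(stream)
resumed = find_sop(stream, start=1)
```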

Error resilience (cont'd)


Examples

Reconstructed images compressed at 0.25 bpp by means of (a) JPEG and (b) JPEG2000

Examples

Reconstructed images compressed at 0.125 bpp by means of (a) JPEG and (b) JPEG2000

Examples

Original (979 KB), JPEG (6.21 KB), JPEG 2000 (1.83 KB)

Conclusion:

• Benefits: lossless and lossy compression, higher image quality and compression ratios, viewing the file at multiple resolutions, and closer examination of one area of the image using its Region Of Interest capability.

• JPEG2000 uses wavelet technology to compress images more efficiently. Currently, JPEG2000 uses one wavelet for lossy compression and another wavelet for lossless compression, but in the future other wavelets may be used as the need arises.

• Many applications, including the Internet, medical imaging, digital photography, ...

• Overall, JPEG 2000 is a huge upgrade over current compression methods and looks to be the next image compression standard in the near future.

EZW algorithm (intro)

• The Embedded Zerotree Wavelet algorithm (EZW) is a simple, yet remarkably effective, image compression algorithm, having the property that the bits in the bit stream are generated in order of importance, giving a fully embedded (progressive) code.

• The compressed data stream can have any bit rate desired. Any bit rate is only possible if there is information loss somewhere, so the compressor is lossy. However, lossless compression is also possible, with less spectacular results.

EZW - 2 observations

The EZW encoder is based on two important observations:

1. Natural images in general have a low pass spectrum, so the wavelet coefficients will, on average, be smaller in the higher subbands than in the lower subbands. This shows that progressive encoding is a very natural choice for compressing wavelet transformed images, since the higher subbands only add detail.

2. Large wavelet coefficients are more important than smaller wavelet coefficients.

[Figure: example wavelet coefficients, e.g. 631, 544, 86, 10, -7, 29, 55, -54]

Motivation

• Transform coding needs a "Significance Map" to be sent: At low bit rates a large number of the transform coefficients are quantized to zero (insignificant coefficients). We would like to avoid sending any bits to code these, but the decoder must somehow be informed about which coefficients are insignificant. JPEG does this using run-length coding.

Motivation

• Here is a two-stage wavelet decomposition of an image. Notice the large number of zeros (black):

Shapiro's Idea for Solving Sig. Map Problem

• "Zerotree" - a quad-tree of which all nodes are equal to or smaller than the root. The tree is coded with a single symbol and reconstructed by the decoder as a quad-tree filled with zeroes. The root has to be smaller than the threshold against which the wavelet coefficients are currently being measured.

• Idea: An insignificant coefficient is VERY likely to have all of its "descendants" on its quad tree also be insignificant (wavelet coefficients DECREASE with scale). Such a coefficient is called a "Zerotree Root".

EZW Algorithm

• First step: The DWT of the entire 2-D image is computed.
• Second step: EZW progressively encodes the coefficients by decreasing the threshold.
• Third step: Arithmetic coding is used to entropy code the symbols.

EZW Algorithm (second step)

• Sequence of decreasing thresholds: t0, t1, ..., t(n-1) with ti = t(i-1)/2. The initial threshold t0 is derived from the largest coefficient magnitude, typically $t_0 = 2^{\lfloor \log_2 \mathrm{MAX}(|\gamma(x,y)|) \rfloor}$, where MAX() is the maximum coefficient value in the image and γ(x,y) is the coefficient at position (x,y).

• Maintain two separate lists:
  Dominant List: coordinates of coefficients not yet found to be significant.
  Subordinate List: magnitudes of coefficients already found to be significant.

• For each threshold, perform two passes: Dominant Pass followed by Subordinate Pass.

    threshold = initial_threshold;
    do {
        dominant_pass(image);      /* code significance and signs at this threshold */
        subordinate_pass(image);   /* refine coefficients already found significant */
        threshold = threshold / 2;
    } while (threshold > minimum_threshold);
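The threshold schedule can be sketched as below. It assumes integer coefficients and the common choice of the largest power of two not exceeding the maximum magnitude as the starting threshold; unlike the loop above, the sequence here includes the minimum threshold itself. The function names are hypothetical.

```python
def initial_threshold(coefficients):
    """Largest power of two that does not exceed the maximum magnitude."""
    largest = max(abs(c) for c in coefficients)
    if largest == 0:
        raise ValueError("all coefficients are zero; nothing to encode")
    return 1 << (largest.bit_length() - 1)


def threshold_sequence(coefficients, minimum_threshold=1):
    """Halve the threshold each round until the minimum is passed."""
    floor = max(minimum_threshold, 1)  # guard against looping forever at 0
    t = initial_threshold(coefficients)
    sequence = []
    while t >= floor:
        sequence.append(t)
        t //= 2
    return sequence


thresholds = threshold_sequence([63, -34, 49, 10])
```

Each halving corresponds to one more bit-plane of precision, which is what makes the resulting stream embedded: stopping after any round still leaves a decodable approximation.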

EZW Algorithm - Dominant Pass

• Dominant Pass:
* All the coefficients are scanned in a special order.
* If the coefficient is a zerotree root, it is encoded as ZTR. Its descendants don't need to be encoded; they will be reconstructed as zero at this threshold level.
* If the coefficient is insignificant but one of its descendants is significant, it is encoded as IZ (isolated zero).
* If the coefficient is significant, it is encoded as POS (positive) or NEG (negative), depending on its sign.

At the end, all the coefficients that are larger in absolute value than the current threshold are extracted and placed, without their sign, on the subordinate list, and their positions in the image are filled with zeroes to prevent them from being coded again.
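The symbol decisions of the dominant pass can be illustrated on a single quad-tree. This is a simplified sketch under assumptions: the tree is an explicit `Node` structure rather than subbands of an image, the scan is breadth-first, a coefficient equal to the threshold counts as significant, and the subordinate list and zero-filling steps are left out. All names and values are hypothetical.

```python
from collections import deque


class Node:
    """One wavelet coefficient with its (up to four) children."""

    def __init__(self, value, children=()):
        self.value = value
        self.children = list(children)


def subtree_insignificant(node, threshold):
    """True if the node and all its descendants are below the threshold."""
    return abs(node.value) < threshold and all(
        subtree_insignificant(child, threshold) for child in node.children)


def dominant_pass(root, threshold):
    """Emit POS / NEG / IZ / ZTR symbols, skipping below zerotree roots."""
    symbols = []
    queue = deque([root])
    while queue:
        node = queue.popleft()
        if abs(node.value) >= threshold:
            symbols.append("POS" if node.value > 0 else "NEG")
        elif subtree_insignificant(node, threshold):
            symbols.append("ZTR")
            continue  # descendants are implied zero; do not scan them
        else:
            symbols.append("IZ")
        queue.extend(node.children)
    return symbols


tree = Node(57, [
    Node(-20, [Node(3), Node(-9), Node(2), Node(40)]),
    Node(10, [Node(1), Node(-2), Node(4), Node(0)]),
    Node(-35),
    Node(33),
])
symbols = dominant_pass(tree, threshold=32)
```

The coefficient 10 and its four children cost a single ZTR symbol, while -20 must be coded as IZ because its descendant 40 is significant.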

Dominant pass (scheme)


Scanning order

The wavelet coefficients are scanned in one of the following two orders. The scan order seems to have some influence on the final compression result.

EZW Algorithm - Subordinate Pass

• Subordinate Pass (Refinement Pass):
* Check whether each value in the Subordinate list is larger or smaller than the current threshold:
  If larger - a 1 is sent to the entropy encoder and the current threshold is subtracted from the coefficient.
  If smaller - a 0 is sent to the entropy encoder.
* Sort the Subordinate list to place the larger (important) coefficients in the front (this also helps the entropy encoder).
* Repeat with the next lower threshold until the total bit budget is exhausted. The encoded stream is an embedded stream.
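The refinement rule above can be sketched directly. The code treats a value equal to the threshold as "larger" (an assumption, so that powers of two are handled), returns the remaining magnitudes instead of modifying the list in place, and omits the sorting step; the function name is hypothetical.

```python
def subordinate_pass(residuals, threshold):
    """Emit one refinement bit per entry of the subordinate list.

    Returns (bits, remaining): a 1 means the threshold was subtracted.
    """
    bits, remaining = [], []
    for value in residuals:
        if value >= threshold:
            bits.append(1)
            remaining.append(value - threshold)
        else:
            bits.append(0)
            remaining.append(value)
    return bits, remaining


# Magnitudes found significant at threshold 32, refined over two rounds
bits_32, rest = subordinate_pass([57, 35, 33, 40], threshold=32)
bits_16, rest = subordinate_pass(rest, threshold=16)
```

After these two rounds a decoder knows 57 as 32 + 16 = 48 plus a remainder below 16, and every further round halves that uncertainty.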

EZW - example
