A Parallel Algorithm for Hardware Implementation of Inverse Halftoning

Umair F. Siddiqi¹, Sadiq M. Sait¹ & Aamir A. Farooqui²

¹ Department of Computer Engineering, King Fahd University of Petroleum & Minerals, Dhahran 31261, Saudi Arabia

² Synopsys Inc., Synopsys Module Compiler, Mountain View, California, USA

Analog halftoning

► The process of rendering continuous-tone pictures on media on which only two levels can be displayed.
► The size of the dots is adjusted according to the local print intensity.
► When viewed from a distance, the result gives the impression of the original picture.

Digital halftoning

► In digital halftoning, the input to the system is a gray-level image with more than two levels (for example, 256 levels), and the resulting image has only two levels.
► The halftone image consists of zeros and ones but gives the impression of the original image from a distance.
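The conversion from a gray-level image to zeros and ones can be illustrated with Floyd and Steinberg error diffusion, the halftoning method used for the examples later in these slides. The following is a minimal sketch, not the authors' code: the function name `floyd_steinberg`, the threshold of 128, the convention that 1 is the bright level, and the choice to drop error at the image borders are all assumptions made for illustration.

```python
def floyd_steinberg(gray):
    """Halftone a 2-D list of gray levels (0..255) into 0/1 pixels.

    Assumptions: 1 stands for the bright level (255), the threshold is 128,
    and error that would fall outside the image is simply discarded.
    """
    rows, cols = len(gray), len(gray[0])
    work = [[float(v) for v in row] for row in gray]  # running values + error
    out = [[0] * cols for _ in range(rows)]
    for r in range(rows):
        for c in range(cols):
            old = work[r][c]
            bit = 1 if old >= 128 else 0
            out[r][c] = bit
            err = old - 255 * bit  # quantization error to diffuse
            if c + 1 < cols:
                work[r][c + 1] += err * 7 / 16       # right
            if r + 1 < rows:
                if c > 0:
                    work[r + 1][c - 1] += err * 3 / 16   # below left
                work[r + 1][c] += err * 5 / 16           # below
                if c + 1 < cols:
                    work[r + 1][c + 1] += err * 1 / 16   # below right
    return out
```

A constant mid-gray input produces a pattern in which roughly half of the pixels are ones, which is what gives the impression of the original gray level from a distance.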

Inverse halftoning

► Inverse halftoning is the reconstruction of a continuous-tone picture (e.g. 256 levels) from its halftoned version.
► The input to an inverse halftoning system is an image that consists of zeros and ones, and the output is an image in which each pixel takes one of 256 gray levels.
► Inverse halftoning finds application in image compression, printed image processing, scaling, enhancement, etc.
► Inverse halftoning can also be applied to color images, but we are concerned with gray-level images and their halftones.

Example of Inverse Halftoning

[Figure: a halftone image and its inverse halftone (gray-level) image]

Demonstration of our Inverse halftoning algorithm

► The next few slides show how the inverse halftone operation is performed in our algorithm.

Lookup Table (LUT) based Inverse Halftone operation

► The Lookup Table (LUT) method proposed by Mese and Vaidyanathan is used for the inverse halftone operation.
► The LUT method uses a template, “19pels”, to select pixels from the neighborhood of the pixel that is going to be inverse halftoned.
► The “19pels” pattern then goes into a LUT, which compares it with its stored values and returns a gray level for the input “19pels”.

“19pels” Template

 1  6 11
 2  7 12 15
 3  8  0 16 18
 4  9 13 17
 5 10 14

The pixel numbered 0 is the one going to be inverse halftoned. This pattern is associated with each pixel that is to be inverse halftoned.
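As a rough illustration of how a “19pels” pattern could be gathered and looked up, the sketch below packs the 19 template pixels into one integer and uses it as a key into a table. It reads the template as five rows of 3, 4, 5, 4 and 3 pixels numbered column by column, which is an interpretation of the slide and not something it states outright. The names `OFFSETS`, `pels19` and `inverse_halftone_pixel`, the zero padding outside the image, and the default gray level of 128 for unseen patterns are all assumptions.

```python
# Assumed (row, col) offsets of template pixels 0..18 relative to pixel 0.
OFFSETS = [
    (0, 0),                                          # pixel 0 (the target)
    (-2, -2), (-1, -2), (0, -2), (1, -2), (2, -2),   # pixels 1-5
    (-2, -1), (-1, -1), (0, -1), (1, -1), (2, -1),   # pixels 6-10
    (-2, 0), (-1, 0), (1, 0), (2, 0),                # pixels 11-14
    (-1, 1), (0, 1), (1, 1),                         # pixels 15-17
    (0, 2),                                          # pixel 18
]


def pels19(halftone, row, col):
    """Pack the 19 template pixels around (row, col) into a 19-bit integer.

    Bit k of the result holds template pixel k. Pixels that fall outside
    the image are read as 0 (an assumption of this sketch).
    """
    rows, cols = len(halftone), len(halftone[0])
    word = 0
    for k, (dr, dc) in enumerate(OFFSETS):
        r, c = row + dr, col + dc
        bit = halftone[r][c] if 0 <= r < rows and 0 <= c < cols else 0
        word |= bit << k
    return word


def inverse_halftone_pixel(halftone, row, col, lut, default=128):
    """Return the gray level stored in `lut` for the pattern at (row, col)."""
    return lut.get(pels19(halftone, row, col), default)
```

Here `lut` is an ordinary dictionary from 19-bit patterns to gray levels; a hardware table would be addressed differently, so this only mirrors the lookup logic.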

Demonstration of LUT inverse halftoning

This is the first “19pels” selected

This is the second “19pels” selected

This is the third “19pels” selected

This is the fourth “19pels” selected

Our modification to LUT based Inverse Halftoning

Problem of parallel LUT inverse halftone operation

► The LUT method uses one lookup table that contains inverse halftone values for all “19pels” obtained from a training set of halftones of standard images.
► To fetch the inverse halftone values of more than one 19pels in parallel, we would need to implement multiple copies of the LUT!

Our approach to parallel LUT inverse halftoning

► The single large LUT has been divided into many Smaller LUTs (SLUTs).
► More than one 19pels can now fetch its inverse halftone value from a separate SLUT, independently of the other parallel 19pels.
► The next problem is to develop a method to send incoming 19pels to separate SLUTs.

Method to distinguish 19pels from each other

► The task of sending many incoming 19pels to their separate SLUTs has been accomplished by defining an operator over 19pels.
► This operator is called Relative XOR Change (RXC).
► When all incoming 19pels are passed through this operator, they are converted into distinguishing values in the range -t to +t, where t = 19 in our case; t could be any integer within a range suited to the total number of SLUTs and the hardware complexity.

Demonstration of RXC operation

RXC over gray-level halftones I

Gray-level 230 Corresponding halftone obtained through Floyd and Steinberg Error Diffusion Method

RXC over gray-level halftones II

Gray-level 130 Corresponding halftone obtained through Floyd and Steinberg Error Diffusion Method

Magnified look at the halftones I

Gray-level 210: halftone shows no column-wise periodicity among dots over small 19pels regions
Gray-level 130: halftone shows column-wise periodicity among dots over small 19pels regions

Magnified look at the halftones II

Gray-level 120: halftone shows no periodicity among dots over small 1D 19pels regions
Portion of the halftone from image Boat: halftone shows no periodicity among dots over small 1D 19pels regions

Behavior of RXC over Gray-level halftones

Gray level 210: NOT Periodic Vibratory Response
Gray level 130: Periodic Vibratory Response
Halftones obtained through Floyd & Steinberg Error Diffusion Method

Representation of RXC values on number line

Periodic Vibratory Values

RXC values to be used in SLUT access are calculated by adding the RXC to the RXC of the previous “19pels”. That is:

RXC for SLUT of P_n (Slut_n) = RXC of P_{n-1} + RXC of P_{n-2}

From the number line we can see that, for periodic vibratory values, adding the RXC to previous values gives zero or a constant result; therefore, we need a NOT periodic vibratory response from the RXC operator.

NOT Periodic Vibratory RXC Operator

► The RXC operator has been defined so that it is simple to implement in hardware and gives a NOT periodic vibratory response over most of the gray levels from 0 to 255.
► We have assumed that a gray-level image is a composition of many gray levels, so the performance of RXC over individual gray levels can give a clue about its performance on images.
► This assumption is found to be correct in the simulation results.

RXC Operator for P_n

1. P_{n-1} = “19pels” with pixel 0 at position (row, col-1);
2. P_n = “19pels” with pixel 0 at position (row, col);
3. xor_1 = XOR(P_{n-1}, P_n);
4. Magnitude of RXC = |RXC| = Number of Ones(xor_1);
5. Sign of RXC = sgn(RXC) = + when |P_n| > |P_{n-1}|, - when |P_n| < |P_{n-1}|

Note: pixel 0 is the one that is to be inverse halftoned.
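The five steps above can be sketched in a few lines. This is a hedged illustration only: the slides do not define |P|, so the sketch assumes it means the unsigned integer value of the 19-bit pattern (a population count would be another possible reading), and the function name `rxc` is hypothetical. Under this assumption two equal patterns give an RXC of 0.

```python
def rxc(p_prev, p_cur):
    """Relative XOR Change between two 19pels packed as integers.

    Magnitude: number of ones in XOR(p_prev, p_cur).
    Sign: + when p_cur > p_prev, - when p_cur < p_prev, comparing the
    patterns as unsigned integers (an assumption of this sketch).
    """
    xor_1 = p_prev ^ p_cur
    magnitude = bin(xor_1).count("1")  # number of differing pixels
    if p_cur > p_prev:
        return magnitude
    if p_cur < p_prev:
        return -magnitude
    return 0  # identical patterns: XOR is zero, so RXC is zero
```

With 19-bit patterns the result always lies between -19 and +19, which matches the range -t to +t with t = 19 quoted earlier.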

Parallel application of RXC

Development of parallel table access algorithm with RXC

The addition of Slut values from previous pixels simplifies the hardware design.

Formal Algorithm

Simulation

► The algorithm is implemented in MATLAB, and the performance and quality of inverse halftoning are estimated.
► We assumed the LUT inverse halftone operation to be ideal.
► The simulation results show the quality loss, with respect to the original image, that occurred in the distribution of parallel “19pels” to different SLUTs through RXC.
► This pixel loss is compensated by replicating gray-level values from the neighbors.
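One way to picture pixel coverage and compensation is the sketch below. It is not the authors' MATLAB code: it assumes that a pixel whose value could not be fetched is marked `None`, that compensation copies the nearest resolved value in the same row (left neighbor first, then right), and that a row with no resolved pixels falls back to 128. The names `coverage_percent` and `compensate_row` are hypothetical.

```python
def coverage_percent(row):
    """Percentage of pixels in `row` that received a gray level."""
    resolved = sum(1 for v in row if v is not None)
    return 100.0 * resolved / len(row)


def compensate_row(row, fallback=128):
    """Fill missing pixels (None) by replicating a neighboring gray level."""
    out = list(row)
    last = None
    for i, v in enumerate(out):          # forward pass: copy from the left
        if v is None:
            out[i] = last
        else:
            last = v
    nxt = None
    for i in range(len(out) - 1, -1, -1):  # backward pass: copy from the right
        if out[i] is None:
            out[i] = nxt
        else:
            nxt = out[i]
    # A row with no resolved pixel at all gets the fallback value.
    return [fallback if v is None else v for v in out]
```

For example, `compensate_row([None, 10, None, None, 20, None])` returns `[10, 10, 10, 10, 20, 20]`, while the coverage of the uncompensated row is one third.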

Sample Image I

peppers PSNR= 34.7880

Sample Image II

lena PSNR= 32.5685

Sample Image III

mandrill PSNR= 28.1264
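For reference, PSNR figures of this kind are conventionally computed from the mean squared error between the original and the reconstructed image. The sketch below uses the standard definition for 8-bit images with a peak value of 255; it is an illustration, not the authors' evaluation script, and the function name `psnr` is an assumption.

```python
import math


def psnr(original, restored, peak=255):
    """Peak signal-to-noise ratio in dB between two equally sized images.

    Both images are 2-D lists of gray levels. Identical images give
    infinity because the mean squared error is zero.
    """
    squared_error = 0
    count = 0
    for row_o, row_r in zip(original, restored):
        for a, b in zip(row_o, row_r):
            squared_error += (a - b) ** 2
            count += 1
    mse = squared_error / count
    if mse == 0:
        return math.inf
    return 10 * math.log10(peak ** 2 / mse)
```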

Quality of inverse halftones

Image    Halftone Algorithm  % pixel coverage w/o pixel compensation  PSNR with pixel compensation
Boat     FS ED               65.0864                                  30.3749
Clock    FS ED               70.6667                                  30.1671
Peppers  FS ED               68.9433                                  28.5484
Boat     GN ED               63.7531                                  28.7139
Clock    GN ED               69.8765                                  31.2554
Peppers  GN ED               68.9509                                  29.0077
Boat     EG ED               67.3086                                  32.1370
Clock    EG ED               68.5926                                  29.9289
Peppers  EG ED               69.9905                                  28.5483

Hardware Implementation

► This section shows the hardware implementation of the proposed parallel algorithm in terms of block diagrams.
► The specification of the hardware design is:
1. Parallel pixels to be inverse halftoned = n = 15
2. Number of SLUTs = 19

► Two blocks of hardware implementation. The hardware system can be divided into two blocks:
1. RXC and modulus operators
2. 19pels to gray-level decoders

System Block diagram

RXC and modulus operators

► The RXC and modulus operator components are responsible for the following tasks:

Input: 19pels
Output: SLUT numbers (Slut)

1. Accept 19pels from the halftone image and assign a sequence number to each entered 19pels.
2. Perform the RXC operation on all 19pels.
3. Add the result to the Slut value of the 19pels that has the preceding sequence number.
4. Then take the modulus of the current result with a fixed number, i.e. 19 in our case, to obtain the Slut value for the current 19pels.
5. The above three steps are pipelined, so new 19pels are coming in while the current 19pels are in process.
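The steps above can be mimicked in software as a sequential loop, which is only a behavioral sketch of what the pipelined hardware computes. The names `slut_numbers`, `first_prev` and `first_slut` are hypothetical, the starting values of 0 are assumptions, and the sign rule compares patterns as unsigned integers, which the slides do not confirm.

```python
NUM_SLUTS = 19  # fixed modulus from the slides


def rxc(p_prev, p_cur):
    """RXC of two packed 19pels; sign rule is an assumption of this sketch."""
    magnitude = bin(p_prev ^ p_cur).count("1")
    if p_cur > p_prev:
        return magnitude
    if p_cur < p_prev:
        return -magnitude
    return 0


def slut_numbers(pels, num_sluts=NUM_SLUTS, first_prev=0, first_slut=0):
    """Assign a SLUT number to each 19pels in sequence order.

    Each Slut value is the RXC of the current 19pels added to the Slut
    value of the preceding 19pels, reduced modulo `num_sluts`.
    """
    sluts = []
    prev_pel, prev_slut = first_prev, first_slut
    for p in pels:
        # Python's % returns a value in 0..num_sluts-1 even for negatives.
        slut = (prev_slut + rxc(prev_pel, p)) % num_sluts
        sluts.append(slut)
        prev_pel, prev_slut = p, slut
    return sluts
```

For instance, `slut_numbers([0b111, 0b001, 0b001])` gives `[3, 1, 1]`: the RXC values are +3, -2 and 0, accumulated modulo 19.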

RXC and modulus Block Diagram

RXC calculation for 19pels P_n

P_{n-1} and P_n are two 19pels among all 19pels to be inverse halftoned in parallel. Slut is the Smaller LUT number where the concerned 19pels should go to fetch its inverse halftone value.

Hardware Design of RXC and modulus Operator

► The next slides show the hardware design of the RXC operator for a 19pels pattern named P_n with the following parameters:
► Parallel pixels to be inverse halftoned at a time = 15
► Total number of SLUTs = 19; therefore, Slut is from 0 to 19.

Bit to Bit XOR

The figure shows the bit-to-bit XOR of 19pels P_n and P_{n+1}

Addition of the number of ones in the XOR result to obtain |RXC|

Comparison of |P_n| and |P_{n+1}| to obtain sign(RXC)

Determination of Slut from RXC

Block diagram showing gray-level decoding process

Routing of a 19pels to the 5th SLUT

Routing of a 19pels to the 16th SLUT

Routing of a 19pels to the 3rd SLUT

Routing of a 19pels to the 17th SLUT

[Figure labels: 1p x 19p DEMUX_i (i=2); 19p x 1p MUX_n (n=16); SLUT_i (i=16)]

Method to generate contents of SLUT

► The algorithm is applied to the images in a training set, and the Slut values are obtained.
► Each 19pels is then placed in the SLUT given by its corresponding Slut value.
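A possible software analogue of this training step is sketched below. It assumes each training sample has already been reduced to a triple of packed 19pels, Slut value and original gray level, and that a pattern seen several times stores the rounded mean of its gray levels; the slides do not say how repeated patterns are combined, so that rule and the name `build_sluts` are assumptions.

```python
from collections import defaultdict


def build_sluts(samples, num_sluts=19):
    """Build one dictionary per SLUT from (pattern, slut, gray) triples.

    Each SLUT maps a packed 19pels pattern to a gray level. Repeated
    patterns are averaged (an assumption of this sketch).
    """
    sums = [defaultdict(lambda: [0, 0]) for _ in range(num_sluts)]
    for word, slut, gray in samples:
        acc = sums[slut][word]
        acc[0] += gray   # running total of gray levels
        acc[1] += 1      # number of occurrences
    return [
        {word: round(total / count) for word, (total, count) in table.items()}
        for table in sums
    ]
```

At lookup time, a 19pels with Slut value s would be searched only in the s-th dictionary, which is what lets several 19pels be served by different tables at once.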

Properties of SLUTs

► The SLUTs were developed using a training set composed of FS ED halftone images of Boat and Peppers, each of size 256x256 pixels.
► The size of one SLUT is found to be 2.5K entries.
► The total number of entries in all 19 SLUTs is 42.6K.
► The size of the LUT in the single LUT method is 9.86K entries; however, if the single LUT is replicated for 15 parallel pixels, the total size becomes 148K entries.
► In this way, our method can provide a 3.5 times decrease in lookup table size over the single LUT based method.
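The size comparison can be checked with a short calculation using only the figures quoted on this slide; the variable names are illustrative.

```python
single_lut_entries = 9.86e3   # entries in the single LUT
parallel_pixels = 15          # copies needed for 15 parallel pixels
slut_total_entries = 42.6e3   # entries summed over all 19 SLUTs

# Replicating the single LUT once per parallel pixel.
replicated_entries = single_lut_entries * parallel_pixels

# Ratio of replicated single-LUT size to total SLUT size.
reduction = replicated_entries / slut_total_entries

print(f"replicated LUT: {replicated_entries / 1e3:.1f}K entries")
print(f"reduction factor: {reduction:.2f}")
```

The replicated size comes to about 147.9K entries and the ratio to about 3.47, which the slides round to 148K and 3.5.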

Performance Evaluation of SLUT

► The FS ED halftone image of Barbara is inverse halftoned; among the 19pels of Barbara that are present in the training set {Boat, Peppers}, 95% can fetch their inverse halftone values.

Conclusion

► A method to inverse halftone 15 pixels at a time in parallel is proposed and implemented.
► The results show that it can provide a 3.5 times decrease in lookup table size over the single LUT based method when that method is implemented for parallel inverse halftone operation.