DCS-426: PATTERN RECOGNITION AND IMAGE PROCESSING


IMAGE PROCESSING
Chakravarthy Bhagvati
Dept. of Computer and Information Sciences
University of Hyderabad
OVERVIEW

What is image processing?
What is pattern recognition?
What are computer vision and graphics?

Focus of today’s presentation: image processing


- introduction
- history and progress
- images, image models and representation
INTRODUCTION
INTRODUCTION…
WHAT IS IN A PICTURE?
- regions of different brightnesses and colours
- a 2D flat surface
- a fantastic amount of information about subtle shades and colours (mostly unused?)

WHAT DO WE SEE?
- objects, faces, people, and symbols
- a 3D world
- we interpret and describe the picture in very few words (a thousand?!)

Image processing helps in going from left to right.
How do we do it? Extremely difficult to analyze!
HISTORY

- 1960s and early 1970s
  - computational approaches
  - mainly low-level vision, image models and representations
- Late 1970s and early 1980s
  - model-based image understanding systems
- 1980s
  - applications
- 1990s
  - learning in vision
  - image databases…

DEFINITIONS

- Image processing
  - manipulating basic characteristics of a picture such as colour, brightness, contrast, edges, etc.
  - reduction of information by recognizing essential and non-essential features
- Pattern recognition
  - classification, listing, identification and symbolic description of features or aspects of an image
COMPUTER VISION

- Computer Vision
  - techniques, algorithms, features and research that aim to mimic the human vision system
- D. E. Marr's theory
  - three levels of analysis - low, intermediate and high
  - image processing fits in low-level or early processing
  - complexity increases significantly from low to high

[Diagram: LOW LEVEL -> INTERMEDIATE LEVEL -> HIGH LEVEL]

VISION AND GRAPHICS

- Computer Vision
  - techniques, algorithms, methods, features and research that aim at mimicking the human vision system
  - D. E. Marr's theory: image processing fits in low-level or early processing
  - image processing is more than a part of computer vision
- Computer Graphics
  - techniques, algorithms, methods, features and research that aim at generating pictures
IMAGE ACQUISITION

- Imaging System
  - converts 3D world scenes into 2D digital images (i.e., arrays of numbers)
- Cameras, Digitizers and Frame-Grabbers
  - still photos (SLR, auto-focus, …), video and movie cameras, and digital cameras
  - scanners, digitizers and frame-grabbers provide digital output

[Diagram: World Scene -> Imaging System -> Digital Image]
IMAGE FORMATION


Transforming a scene into an image
- Scene
  - 3-D physical objects
  - light source
  - physical properties such as roughness, type, etc.
- Image
  - 2-D
  - only brightness or colour attributes

P(X, Y, Z, <physical properties …>) -> p(x, y, <brightness/colour>)

IMAGE MODELS

- Geometric Models
  - where does the image of a point in the scene form?
  - orthographic and perspective projections
- Radiometric Models
  - how bright is the image of a point in the scene?
  - bidirectional reflectance distribution function (BRDF) - Lambertian and specular surfaces
  - point-spread function (PSF) that characterizes the imaging system
- Functional Model
  - z = f(x, y)
  - I(x, y) = z, where 0 <= z <= M
PROJECTION GEOMETRIES

- Orthographic Projection
  - rays travel in parallel from object to image
  - (xi, yi) = (Xi, Yi)
  - object's height is independent of distance
- Perspective Projection
  - pin-hole camera geometry
  - (xi, yi) = f(Xi, Yi)/Zi
  - object's height decreases with increasing distance

[Diagrams: orthographic and perspective projection of a point P onto image plane I]
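The two projection models above can be sketched in a few lines of Python; a minimal sketch, with an assumed pinhole focal length f and illustrative coordinates:

```python
def orthographic(X, Y, Z):
    """Orthographic projection: depth Z is simply discarded."""
    return (X, Y)

def perspective(X, Y, Z, f=1.0):
    """Perspective (pin-hole) projection: image coordinates shrink with depth."""
    return (f * X / Z, f * Y / Z)

# The same object point, twice as far away, images at half the height:
near = perspective(0.0, 2.0, 4.0, f=1.0)  # (0.0, 0.5)
far = perspective(0.0, 2.0, 8.0, f=1.0)   # (0.0, 0.25)
```

Under orthographic projection the two calls above would return identical coordinates, which is why height is independent of distance.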
CAN YOU VISUALLY PROCESS STEREO IMAGES?

[Stereo image pair]

TRY THIS PAIR NOW ...

[Stereo image pair]
Which stroke in X was made first: / or \ ?
WHAT ARE STEREO IMAGES?

- Images taken with two identical cameras placed adjacent to one another
  - camera arrangement analogous to our two eyes
- Stereo images or stereo pairs help us recover 3D information from 2D images
  - the two cameras look at the same object from slightly different positions
  - differences are used to calculate depth from simple geometric analysis
DEPTH FROM STEREO

From the pin-hole geometry of the two cameras (focal length f, depth Z):

x1 / f = X1 / Z
x2 / f = X2 / Z

Subtracting, the disparity is

x1 - x2 = f (X1 - X2) / Z = f B / Z

where B = X1 - X2 is the baseline between the cameras, so

Z = f B / (x1 - x2)
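The depth relation above, Z = fB / (x1 - x2), is a one-liner; a minimal sketch with illustrative values for f and B:

```python
def depth_from_disparity(x1, x2, f, B):
    """Depth Z of a scene point from its image positions in the
    left (x1) and right (x2) cameras; f = focal length, B = baseline."""
    disparity = x1 - x2
    if disparity == 0:
        return float("inf")  # no parallax: point at infinity
    return f * B / disparity

# Nearer points have larger disparity, hence smaller Z:
Z_near = depth_from_disparity(12.0, 4.0, f=8.0, B=10.0)  # 8*10/8 = 10.0
Z_far = depth_from_disparity(6.0, 4.0, f=8.0, B=10.0)    # 8*10/2 = 40.0
```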
DEPTH EXTRACTION: DISPARITY IMAGE

- Match corresponding features in the left and right images
- Disparity image
  - disparity = x1 - x2
  - at each pixel, encode intensity as a function of disparity
  - useful in quickly recognizing relative positions of objects
- Which stroke was made first?

[Disparity image (colour coded); brighter areas are nearer to the camera]
RADIOMETRIC MODELS

- Light source - irradiance dE(θi, φi)
- Physical properties - Bidirectional Reflectance Distribution Function (BRDF)
  - BRDF = dL(θr, φr) / dE(θi, φi)
  - Lambertian, specular
- Imaging system - Point Spread Function (PSF)
WHAT IS AN IMAGE?

An array of numbers whose values are proportional to brightness (0 = black, …, 127, …, 255 = white):

... 186 188 189 193 146 153 149 149 121 21 164 177 166 164 170 177 171 ...

The same data can be viewed as a signal (its EE origins?) or as a surface.

[Plots: one image row drawn as a 1-D signal, and the full array drawn as a surface]
IMAGE TYPES

- Binary
  - pure black and white - need only '0's and '1's
  - 1-bit images
  - usually 8 pixels / byte
- Gray-Scale
  - shades of black and white
  - 4- or 8-bit images
  - usually 1 pixel / byte
- Colour Images
  - Red, Green, and Blue layers (RGB)
  - true-colour - 256x256x256 colours: 24-bit images
  - 3 bytes / pixel
CONNECTED COMPONENT

- Pixel neighbourhood
  - 4 or 8 neighbours
- A pixel is connected if it shares some property with its neighbours
- A region of connected pixels is a connected component

[Diagram: neighbours p1 … p8 of a pixel; 4-connectedness vs 8-connectedness]
DISTANCE METRICS

- How do you measure distance between two pixels?
- Distance metrics
  - city-block or Manhattan
  - Euclidean
  - 8-connected (violates the triangular inequality)
  - M-connected

Example, for (x1, y1) = (2, 4) and (x2, y2) = (5, 2):

Manhattan: |x2 - x1| + |y2 - y1| = 5
Euclidean: sqrt((x2 - x1)^2 + (y2 - y1)^2) = sqrt(13)
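The two metrics from the example above, sketched directly in Python:

```python
import math

def manhattan(p, q):
    """City-block distance: sum of absolute coordinate differences."""
    return abs(p[0] - q[0]) + abs(p[1] - q[1])

def euclidean(p, q):
    """Straight-line distance between two pixels."""
    return math.hypot(p[0] - q[0], p[1] - q[1])

p, q = (2, 4), (5, 2)
print(manhattan(p, q))            # 5
print(round(euclidean(p, q), 3))  # 3.606 (= sqrt(13))
```

Note that Manhattan distance is always at least as large as Euclidean distance, since it forbids diagonal moves.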
CONNECTED COMPONENT LABELLING

- Very simple two-pass algorithm
- Four cases of 8-connectedness, examining the already-visited neighbours p1 … p4 of each pixel
- First pass - initial labelling
  - if none of p1 … p4 is labelled, assign a new label
  - if exactly one of p1 … p4 is labelled, assign the same label
  - if more than one of p1 … p4 are labelled and their labels are identical, assign the same label
  - if more than one of p1 … p4 are labelled and their labels are different
    - assign any one of the labels
    - mark the different labels of p1 … p4 as equivalent
- Second pass - renumbering and merging
  - renumber equivalent labels on the second pass

[Diagram: mask of previously visited neighbours p1 … p4 around the current pixel]
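The two-pass algorithm above can be sketched as follows; a minimal version using 4-connectedness (only the neighbours above and to the left) to keep it short, with a union-find structure holding the label equivalences:

```python
def label_components(img):
    """Two-pass connected component labelling of a binary image
    (list of lists, 1 = foreground)."""
    h, w = len(img), len(img[0])
    labels = [[0] * w for _ in range(h)]
    parent = {}  # union-find over labels: the equivalence classes

    def find(a):
        while parent[a] != a:
            a = parent[a]
        return a

    next_label = 1
    # First pass: initial labelling, recording equivalences.
    for y in range(h):
        for x in range(w):
            if not img[y][x]:
                continue
            neigh = []
            if y > 0 and labels[y - 1][x]:
                neigh.append(labels[y - 1][x])
            if x > 0 and labels[y][x - 1]:
                neigh.append(labels[y][x - 1])
            if not neigh:
                labels[y][x] = next_label        # no labelled neighbour: new label
                parent[next_label] = next_label
                next_label += 1
            else:
                labels[y][x] = min(neigh)        # reuse a neighbour's label
                roots = [find(n) for n in neigh]
                for r in roots[1:]:
                    parent[r] = roots[0]         # mark labels equivalent

    # Second pass: renumber equivalent labels.
    for y in range(h):
        for x in range(w):
            if labels[y][x]:
                labels[y][x] = find(labels[y][x])
    return labels

img = [[1, 1, 0, 1],
       [0, 1, 0, 1],
       [0, 0, 0, 1]]
out = label_components(img)
print(len({v for row in out for v in row if v}))  # 2 components
```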
EXAMPLE
IMAGE STORAGE

- Need to interpret the numbers retrieved from storage
  - image dimensions
  - type, number of colours, etc.
- Images require a lot of space
  - a 512x512 full-colour image requires 750 KB!
  - a 2048x2048 BW image requires 4 MB!!
  - compression schemes
IMAGE STORAGE …

- Different applications require different storage schemes
  - WWW - small sizes, portability
  - image processing - pixel oriented
  - on-screen display - resolution, screen colour maps
- Two main schemes
  - raster based: very common, very popular
  - vector, intermediate model based, …: special applications
IMAGE FILE STRUCTURE

- Image files have two major parts
  - header
  - data
- Header contains
  - image type identifier
  - dimensions
  - look-up table
  - file structure information
- Data section contains
  - listing of pixel values
POPULAR IMAGE FORMATS

- Common image formats
  - PGM, PPM, BMP, SunRasterFiles, … (uncompressed)
  - GIF, JPEG, TIFF, … (compressed)
  - MPEG (video), AVI (multi-media)
- World Wide Web formats
  - GIF, animated and transparent GIF
  - JPEG
PGM / PPM FORMAT

- Header
  - magic number, P5 or P6
  - comments
  - width height
  - maximum intensity value (255 in the example below)
- Data
  - intensities, one byte per pixel
  - uncompressed
- PGM is gray scale, PPM is colour

PGM file of the UH logo (size: 142 KB):

P5
# Created by Paint Shop Pro
387 375
255
<binary pixel data follows>
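A minimal reader for the binary PGM (P5) layout sketched above. This assumes whole-line '#' comments and a maximum value of 255; real files can place comments more freely:

```python
def read_pgm(path):
    """Parse a binary PGM (P5) file: header tokens, then raw pixel bytes."""
    with open(path, "rb") as fh:
        data = fh.read()
    tokens, pos = [], 0
    while len(tokens) < 4:                        # magic, width, height, maxval
        while data[pos : pos + 1].isspace():      # skip whitespace
            pos += 1
        if data[pos : pos + 1] == b"#":           # comment: skip to end of line
            pos = data.index(b"\n", pos) + 1
            continue
        end = pos
        while not data[end : end + 1].isspace():  # read one token
            end += 1
        tokens.append(data[pos:end])
        pos = end
    magic, width, height, maxval = tokens[0], int(tokens[1]), int(tokens[2]), int(tokens[3])
    assert magic == b"P5" and maxval == 255
    pixels = data[pos + 1 :]                      # one byte per pixel, row by row
    return width, height, pixels

# Usage (hypothetical file name): width, height, pixels = read_pgm("logo.pgm")
```

A single whitespace byte separates the maxval from the pixel data, which is why the slice starts at pos + 1.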
SUNRASTER FILE

- Header (32 B + 768 B)
  - magic number
  - width, height, depth, size
  - look-up table
- Data
  - compressed or uncompressed
BMP FORMAT

- Header (variable)
  - magic number
  - height, width and depth (1, 4, 8, 16 or 24 bits)
  - screen resolution
  - length of the header
  - look-up table, others
- Data
  - compressed or uncompressed
  - stored in reverse order, i.e., last row first
GIF AND JPEG


- GIF: proprietary format of CompuServe
  - Header (variable)
    - dimensions
    - encoding scheme
  - Data
    - compressed using Lempel-Ziv or Huffman schemes
    - lossless
    - can be interlaced, transparent or animated
- JPEG: allows great flexibility in dealing with images
  - Header (variable)
    - dimensions
    - encoding scheme
  - Data
    - compressed using the discrete cosine transform
    - lossy
    - allows a trade-off between image size and quality
TIFF


TIFF stands for Tagged Image File Format
- Header consists of numerous tags
  - compression type
  - number of bytes / pixel
  - comments, key words
  - resolution, etc.
- Highly standardised, especially for document images
  - almost universally used for fax transmissions
IMAGE LAYERING

- An image comprises different components
  - background
  - foreground
  - special objects, etc.
- Very few formats support layering
  - PSD - Adobe Photoshop format
  - Paint Shop Pro version 4.0 and later
- Useful in questioned document analysis
COMPARISON OF IMAGE FORMATS

FORMAT  PLATFORM        COLOUR  SIZE      RESOL'N   LAYERS
PGM     UNIX            NO      BIG       V. GOOD   NO
PPM     UNIX            YES     BIG       V. GOOD   NO
BMP     WINDOWS         YES     V. BIG    GOOD      NO
TIFF    WINDOWS, UNIX   YES     VARIABLE  VARIABLE  NO
GIF     WINDOWS, UNIX   YES     SMALL     V. GOOD   NO*
JPEG    WINDOWS, UNIX   YES     VARIABLE  VARIABLE  NO
MPEG    WINDOWS, UNIX   YES     SMALL     POOR      NO
PSD     ADOBE P'SHOP    YES     BIG       V. GOOD   YES
COLOUR MAPS

Displaying 256 colours on a 256-colour monitor
STANDARD AND PRIVATE COLOUR MAPS
THANK YOU
End of Block 1
IMAGE PROCESSING

What is Image Processing? (again!)
Performing meaningful operations on arrays of numbers

Techniques:
- Point processes
  - histogram operations
- Spatial operations
- Frequency operations
- Spatial-frequency operations
- Mathematical morphology

Tasks:
- Enhancement & restoration
- Feature extraction
- Texture analysis
- Segmentation
- Image matching
- 3D information extraction
- Interpretation
POINT OPERATIONS

- Point operations work directly on image pixels
  - modify the value of a pixel independently of its neighbours
- All point operations may be written as: I'(x,y) = f[I(x,y)]
- Examples
  - digital negatives
  - contrast enhancement
  - bit-plane and gray scale slicing
  - histogram stretching
  - histogram equalization
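Point operations of the form I'(x,y) = f[I(x,y)] can be sketched with NumPy; a minimal sketch, where the pixel values and transfer functions are illustrative:

```python
import numpy as np

def point_op(image, f):
    """Apply a per-pixel intensity transfer function f to an 8-bit image."""
    return np.clip(f(image.astype(np.int32)), 0, 255).astype(np.uint8)

img = np.array([[0, 60, 100, 140, 255]], dtype=np.uint8)

negative = point_op(img, lambda v: 255 - v)       # digital negative
stretched = point_op(img, lambda v: 2 * v - 100)  # steeper transfer line = more contrast

print(negative.tolist())   # [[255, 195, 155, 115, 0]]
print(stretched.tolist())  # [[0, 20, 100, 180, 255]]
```

Because f is applied element-wise, each output pixel depends only on the corresponding input pixel, exactly as the definition requires.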
CONTRAST ENHANCEMENT

- The intensity transfer function maps input to output intensities
- Default: a straight line with a slope of 45 degrees
- A higher slope gives greater contrast
- It is also possible to make piece-wise modifications

[Plot: a piece-wise transfer function over inputs and outputs 0-255, with the segment from about (100, 10) to (192, 220) steeper than the default 45-degree line]
POINT OPERATIONS…

Digital negative: I'(x,y) = 255 - I(x,y)

[Example images: digital negative and increased contrast]
BIT-PLANE SLICING

- Each pixel in a grayscale image comprises 8 bits
- Gather a specific bit, e.g., bit 7, from each pixel
- We get an array of '0's and '1's
- If the bit is '0,' mark it BLACK, else WHITE
- The result is plane-7 slicing

[Example: the pixel values 78 76 75 66 135 133 143 at columns … 45 46 47 48 49 50 51 …, with their bit planes written out as '0's and '1's]
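Bit-plane slicing as described above is a shift and a mask; a small sketch using the slide's example values:

```python
import numpy as np

def bit_plane(image, k):
    """Return plane k (0 = least significant bit) as 0/255 values."""
    bits = (image >> k) & 1
    return (bits * 255).astype(np.uint8)

row = np.array([78, 76, 75, 66, 135, 133, 143], dtype=np.uint8)

# Plane 7 separates values >= 128 from the rest:
print(bit_plane(row, 7).tolist())  # [0, 0, 0, 0, 255, 255, 255]
```

Plane 7 carries the most visual information; plane 0 usually looks like noise.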
GRAY LEVEL SLICING

- Select upper and lower thresholds on gray levels (e.g., 128 and 164)
- Any pixel with a value lower than the lower threshold or higher than the upper threshold is made '0'
- A pixel with a value between the thresholds is sent unchanged to the output

BIT-PLANE AND GRAY LEVEL SLICING

[Example images: plane 7, plane 0, and the gray-level slice 128 - 164]
POINT OPERATIONS…

Histogram stretch: I'(x,y) = 256 [I(x,y) - 3] / (31 - 3)

[Example images: histogram stretch and histogram equalization]
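Both histogram operations can be sketched with NumPy. The stretch constants (3 and 31) are the minimum and maximum intensities from the formula above; the equalization mapping via the cumulative histogram is the standard technique, not taken verbatim from the slides:

```python
import numpy as np

def stretch(image, lo, hi, levels=256):
    """Linearly map intensities in [lo, hi] onto the full range."""
    out = (image.astype(np.float64) - lo) * levels / (hi - lo)
    return np.clip(out, 0, levels - 1).astype(np.uint8)

def equalize(image, levels=256):
    """Map each intensity through the normalized cumulative histogram."""
    hist = np.bincount(image.ravel(), minlength=levels)
    cdf = np.cumsum(hist) / image.size
    return np.round(cdf[image] * (levels - 1)).astype(np.uint8)

img = np.array([[3, 10, 17, 24, 31]], dtype=np.uint8)
print(stretch(img, 3, 31).tolist())  # [[0, 64, 128, 192, 255]]
```

Stretching preserves the shape of the histogram; equalization flattens it.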
IMAGE OPERATIONS

- Spatial operations
  - apply a mask (e.g., a 3x3 averaging mask of 1s) over the neighbourhood of each pixel
- Frequency operations
  - Fourier analysis: decompose the image into sinusoidal components

[Diagram: a 3x3 averaging mask sliding over a grid of pixel values]
[Plots: an image row expressed as the sum of sinusoids]
SPATIAL OPERATIONS

- Spatial operations are the most widely used operations in image processing
  - far too many to list in their entirety
- Examples:
  - averaging / blurring
  - median / noise removal
  - high-boost / detail enhancement
  - edge detection
    - Sobel's, Prewitt's, Roberts', Canny's, etc.
SPATIAL OPERATIONS

- Modify the value of a pixel based on the values of its neighbours
  - what are the neighbours - how many neighbours?
- Key Idea 1: Mask, Kernel or Filter
  - small array of numbers
  - enables generalization and simplification of operations
  - I'(x,y) = I * Kernel
  - what is '*'?
CONVOLUTION

- Key Idea 2: Convolution makes spatial operations work
- Convolution: a special kind of multiplication
- Example: Image * Mask

f(x) = 4 if x >= 0, 0 otherwise
g(x) = 1 if -1 <= x <= 1

[Plot: f(x), g(x), and the convolution f(x) * g(x)]
EXAMPLE

- Simple average, 1-dimensional mask: 1 1 1
- Image: … 135 140 127 152 85 64 73 69 …
- Output:
- What happens if we use the mask: -1 0 +1 ?
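The 1-D example above can be computed directly; a small sketch that assumes the averaging mask is normalized by its size (1/3), which the slide leaves implicit:

```python
def convolve1d(signal, mask):
    """Slide a 1-D mask over the signal (valid region only)."""
    k = len(mask)
    return [sum(m * s for m, s in zip(mask, signal[i:i + k]))
            for i in range(len(signal) - k + 1)]

row = [135, 140, 127, 152, 85, 64, 73, 69]

averaged = [round(v / 3) for v in convolve1d(row, [1, 1, 1])]
print(averaged)  # [134, 140, 121, 100, 74, 69]

# [-1, 0, +1] estimates the derivative: large magnitudes mark edges.
print(convolve1d(row, [-1, 0, 1]))  # [-8, 12, -42, -88, -12, 5]
```

The large response (-88) sits exactly at the 152 -> 85 -> 64 drop, which is what an edge detector is after.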
SPATIAL OPERATIONS

Average (divide by 9 so overall brightness is preserved):

        | 1 1 1 |
1/9  x  | 1 1 1 |
        | 1 1 1 |

High boost:

| -1 -1 -1 |
| -1  9 -1 |
| -1 -1 -1 |
SPATIAL OPERATIONS…

[Example images: median filter result]

Deblur filter:

| 1 1 2 -1 -1 |
| 1 1 2 -1 -1 |
| 1 1 2 -1 -1 |
EXAMPLES …

Deblurring mask:

| 1 1 0 -1 0 |
| 1 1 0 -1 0 |
| 1 1 0 -1 0 |
EDGE DETECTORS…

Sobel_H:

| -1 -2 -1 |
|  0  0  0 |
|  1  2  1 |

[Example images: original, Sobel, negative, Canny]
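Applying the Sobel_H mask above is a direct 2-D version of the sliding-mask idea; a minimal sketch in plain Python (a library routine such as scipy.signal.convolve2d would normally be used instead):

```python
def apply_mask(image, mask):
    """Slide a 3x3 mask over the image (valid region only)."""
    h, w = len(image), len(image[0])
    out = []
    for y in range(h - 2):
        row = []
        for x in range(w - 2):
            row.append(sum(mask[j][i] * image[y + j][x + i]
                           for j in range(3) for i in range(3)))
        out.append(row)
    return out

sobel_h = [[-1, -2, -1],
           [0, 0, 0],
           [1, 2, 1]]

# A horizontal edge: dark rows on top, bright rows below.
img = [[10, 10, 10],
       [10, 10, 10],
       [200, 200, 200],
       [200, 200, 200],
       [200, 200, 200]]
print(apply_mask(img, sobel_h))  # [[760], [760], [0]]
```

The response is large across the brightness step and zero in the flat region, which is exactly the horizontal-edge behaviour Sobel_H is designed for.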
FOURIER ANALYSIS

A single row from an image (brightness plotted against the x coordinate) can be decomposed into sinusoidal Fourier components, e.g.

2.8 sin(x), 0.5 sin(2x), 0.18 sin(6x), 0.6 cos(x/2), 1.9 cos(5x/2)

[Plots: the image row, its frequency representation, and the individual Fourier components]
FREQUENCY DOMAIN OPERATIONS…

[Example images: original, its FFT, and low-pass, band-pass and high-pass filtered versions]

A REAL EXAMPLE…

[Example images: original, FFT, and the high-pass result]
IMAGE RESTORATION

- Recover the original image from a degraded one
- Sources of degradation - noise, camera shake, wrong focus, etc.
- Restoration is mostly done in the frequency domain
INVERSE FILTERING

- Input image is f, output image is g, degradation function is h
- Linear system theory: g = f * h
- In the Fourier (frequency) domain:

  G = F H
  F = G / H

- Given the output image g, we need H to recover f

[Diagram: f -> h -> g]
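Inverse filtering can be sketched with NumPy's FFT; a minimal 1-D, noise-free sketch with an illustrative signal and blur kernel (in practice, zeros or near-zeros in H need regularization before dividing):

```python
import numpy as np

# Original signal f and a known symmetric blur kernel h (illustrative values).
f = np.array([0.0, 0.0, 4.0, 8.0, 4.0, 0.0, 0.0, 0.0])
h = np.array([0.6, 0.2, 0.0, 0.0, 0.0, 0.0, 0.0, 0.2])

# Degradation, g = f * h (circular convolution via the FFT): G = F H.
F, H = np.fft.fft(f), np.fft.fft(h)
g = np.real(np.fft.ifft(F * H))

# Restoration: F = G / H, then transform back.
G = np.fft.fft(g)
restored = np.real(np.fft.ifft(G / H))

print(np.allclose(restored, f))  # True
```

The kernel here was chosen so that H never vanishes (its smallest magnitude is 0.2); a box kernel of even width would put zeros in H and make the plain division blow up.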
LINEAR MOTION BLUR

- The degradation function h corresponds to camera shake
- Simple example - shake in the x direction
- h is a step (box) of width 2a, where a depends on the extent of the shake
- H is proportional to a sinc function, with zeros at n pi / a, n = 1, 2, 3, …
LINEAR MOTION BLUR …

[Example images: original image; linear motion blur of 16 pixels from left to right; recovered image]
WRONG FOCUS BLUR

- Another example - wrong focus
- Modelled as a Gaussian blur, i.e., h = exp(-x^2 / 2s^2)
- H is also a Gaussian, with width inversely proportional to s: H ∝ exp(-s^2 ω^2 / 2)
HOW DO WE COMPUTE H?

- Model the degradation process
  - only possible for known sources of blur
- Get the impulse-response function
  - if the camera system is accessible, take a picture of a single bright point of light
  - the output is h
- Use approximations
EXTENSION TO NOISY CASE



- We generally assume additive rather than multiplicative noise - easier to handle using linear system theory

  g = (f + n) * h
  G = (F + N) H
  F = (G - NH) / H = G/H - N

- N is assumed to be a constant, i.e., white noise
- The rest is the same as in the noise-free case
IMAGE MOSAICKING



- Combine several small images to generate a big image
- Issue: seamless merging
- Features to consider
  - translation - obtain the sliding vector and superimpose
  - rotation - find the rotation angle, correct and then superimpose
  - scaling - normalize and superimpose
  - different image parameters - a matching problem
MOSAIC EXAMPLES
HUSSAIN SAGAR IMAGE
IMAGE PROCESSING

- enhancement
- modification
- restoration and reconstruction
- mosaicking
August 25, 2002
Dr. Chakravarthy Bhagvati
University of Hyderabad
THREE DAY WORKSHOP ON IMAGE PROCESSING AND APPLICATIONS