Content-Based Image Indexing Joel Ponianto Supervisor: Dr. Sid Ray

Content-Based Image Indexing
Joel Ponianto
Supervisor:
Dr. Sid Ray
Outline







Introduction to Content-Based Image Indexing
Image’s Features Extraction
Tree Structure
System Model
Retrieval Approach
Experiment Results
Conclusion
Introduction to Content-Based Image Indexing

Content-Based Image Indexing (CBII) is closely interrelated with Content-Based Image Retrieval (CBIR).
CBIR depends on CBII and vice versa.
CBIR focuses on how to retrieve images accurately and efficiently.
CBII is concerned with how to support the retrieval process.
Introduction to Content-Based
Image Indexing Cont…





CBII is a pre-processing stage of the CBIR sequence.
A good indexing structure cannot be created while ignoring the retrieval process.
The idea of indexing is similar to that of a library:
Every book has a unique ID.
Every book has properties.
Introduction to Content-Based
Image Indexing Cont…





Examples: title, author, publisher, etc.
Those properties are used to search for the book.
People know them as “keywords”.
The idea is similar for images, but not as simple.
An image cannot be represented with simple text (it can be, but the result would not be meaningful).
Introduction to Content-Based
Image Indexing Cont…




How to represent an image?
By using its properties, such as colour, shape, texture and others.
Choose which properties need to be extracted for indexing (and also for retrieval).
Also choose which method to use to extract those properties / features.
Image’s Features Extraction Cont…




Colour, shape and texture each have their own sub-features.
Colour: grey level, RGB/hue value, grey sigma, local histogram and average colour value.
Shape: area, centroid, circularity and moment invariants.
Texture: contrast, orientation and anisotropy.
Image’s Features Extraction Cont…




The selection of features is also affected by the data set.
What we want to achieve at the retrieval stage is affected by the data set as well.
For example, the data set may be full of house images while a user wants to look for a car image.
Try to select features that can differentiate each class in the data set.
Image’s Features Extraction Cont…

For this project I selected the following features:
– Colour Sigma (Global)
– Edge Density (Global)
– Colour Average (Global)
– Boolean Edge Density (Global)
– Edge Direction (Global)
– Region Area (Region)
– Moment Invariant (Region)
– Grey Level (Region)
Image’s Features Extraction Cont…

Colour Sigma
– Find the standard deviation (σ) of the image for each colour layer.
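
As a rough illustration of the colour sigma feature, the sketch below computes a per-layer standard deviation in plain Python. The image representation (rows of `(r, g, b)` tuples), the function name `colour_sigma` and the choice of the population standard deviation are assumptions made for this example, not details taken from the project.

```python
import math

def colour_sigma(image):
    """Population standard deviation of each colour layer.

    `image` is a list of rows; every pixel is an (r, g, b) tuple.
    (Hypothetical representation chosen for this sketch.)
    """
    pixels = [px for row in image for px in row]
    n = len(pixels)
    sigmas = []
    for c in range(3):  # one pass per colour layer
        mean = sum(px[c] for px in pixels) / n
        variance = sum((px[c] - mean) ** 2 for px in pixels) / n
        sigmas.append(math.sqrt(variance))
    return tuple(sigmas)

tiny = [[(0, 10, 5), (2, 10, 5)],
        [(0, 10, 5), (2, 10, 5)]]
print(colour_sigma(tiny))
```

Only the red layer varies in the tiny test image, so only its sigma is non-zero.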
Image’s Features Extraction Cont…

Edge Density
– Enhance the pixels that belong to edges and boundaries by using a standard edge detector. Pixels far from edges drop to 0 and those near an edge increase towards the maximum. Calculate the mean pixel value of the resulting image.
Colour Average
– Sum all the pixel values for each colour layer and divide by the number of pixels.
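
The two global features on this slide can be sketched as follows. The slide only says "a standard edge detector"; using the 3x3 Sobel operator here is an assumption, as are the function names and the list-of-lists image representation. Border pixels are simply left at 0.

```python
import math

def edge_density(grey):
    """Mean Sobel gradient magnitude over the image (border pixels count as 0)."""
    h, w = len(grey), len(grey[0])
    total = 0.0
    for y in range(1, h - 1):
        for x in range(1, w - 1):
            gx = (grey[y-1][x+1] + 2 * grey[y][x+1] + grey[y+1][x+1]
                  - grey[y-1][x-1] - 2 * grey[y][x-1] - grey[y+1][x-1])
            gy = (grey[y+1][x-1] + 2 * grey[y+1][x] + grey[y+1][x+1]
                  - grey[y-1][x-1] - 2 * grey[y-1][x] - grey[y-1][x+1])
            total += math.hypot(gx, gy)
    return total / (h * w)

def colour_average(image):
    """Mean value of each colour layer; pixels are (r, g, b) tuples."""
    pixels = [px for row in image for px in row]
    return tuple(sum(px[c] for px in pixels) / len(pixels) for c in range(3))

flat = [[7] * 4 for _ in range(4)]               # no edges at all
step = [[0, 0, 255, 255] for _ in range(4)]      # one vertical edge
print(edge_density(flat), edge_density(step))
print(colour_average([[(0, 10, 5), (2, 10, 5)]]))
```

A flat image has edge density 0, while the step image scores higher because its interior pixels sit on the edge.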
Image’s Features Extraction Cont…

Boolean Edge Density
– The edge density image from above is thresholded so that edge pixels are white (1) and non-edge pixels are black (0). Count the white pixels in the image.
Edge Direction
– An edge detector such as the Sobel operator allows a crude estimate of the edge direction for a particular region.
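
A minimal sketch of both features, assuming the Sobel operator for the gradient. The threshold value is a free parameter that the slides do not specify, and the direction estimate used here (sum the gradients, rotate by 90 degrees, fold into [0, 180)) is just one crude possibility, not necessarily the one used in the project.

```python
import math

def sobel(grey, y, x):
    """(gx, gy) of the 3x3 Sobel operator centred on an interior pixel."""
    gx = (grey[y-1][x+1] + 2 * grey[y][x+1] + grey[y+1][x+1]
          - grey[y-1][x-1] - 2 * grey[y][x-1] - grey[y+1][x-1])
    gy = (grey[y+1][x-1] + 2 * grey[y+1][x] + grey[y+1][x+1]
          - grey[y-1][x-1] - 2 * grey[y-1][x] - grey[y-1][x+1])
    return gx, gy

def boolean_edge_density(grey, threshold):
    """Count interior pixels whose gradient magnitude exceeds `threshold`."""
    count = 0
    for y in range(1, len(grey) - 1):
        for x in range(1, len(grey[0]) - 1):
            gx, gy = sobel(grey, y, x)
            if math.hypot(gx, gy) > threshold:
                count += 1
    return count

def edge_direction_deg(grey):
    """Crude edge direction in degrees, or None if no gradient is found.

    Opposite-polarity edges cancel in the sum, so this is only a rough guide.
    """
    sx = sy = 0
    for y in range(1, len(grey) - 1):
        for x in range(1, len(grey[0]) - 1):
            gx, gy = sobel(grey, y, x)
            sx += gx
            sy += gy
    if sx == 0 and sy == 0:
        return None
    # The edge runs perpendicular to the gradient.
    return (math.degrees(math.atan2(sy, sx)) + 90.0) % 180.0

step = [[0, 0, 255, 255] for _ in range(4)]  # vertical edge
print(boolean_edge_density(step, 100), edge_direction_deg(step))
```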
Image’s Features Extraction Cont…

Area, Grey Value and Moment Invariant
– These features are calculated on a regional basis.
– The regions are found with a combination of “k-means clustering” and the “Connected Component Labelling Algorithm”.
– Calculate the grey level values of the image and perform k-means clustering on them.
– Use the connectivity algorithm to group similar grey values by their location.
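
The region extraction steps above can be sketched as below. This is an illustration under assumptions: a plain Lloyd-style k-means on scalar grey values with evenly spread starting centres, and a breadth-first 4-connected labelling rather than the divide-and-conquer algorithm listed in the references. All names are hypothetical.

```python
from collections import deque

def kmeans_1d(values, k, iterations=20):
    """Cluster scalar grey values; returns the k cluster centres."""
    lo, hi = min(values), max(values)
    centres = [lo + (hi - lo) * (i + 0.5) / k for i in range(k)]
    for _ in range(iterations):
        groups = [[] for _ in range(k)]
        for v in values:
            nearest = min(range(k), key=lambda i: abs(v - centres[i]))
            groups[nearest].append(v)
        # Keep the old centre when a cluster ends up empty.
        centres = [sum(g) / len(g) if g else c for g, c in zip(groups, centres)]
    return centres

def label_regions(grey, centres):
    """4-connected labelling of pixels that fall in the same grey cluster."""
    h, w = len(grey), len(grey[0])
    cluster = [[min(range(len(centres)), key=lambda i: abs(v - centres[i]))
                for v in row] for row in grey]
    labels = [[0] * w for _ in range(h)]
    count = 0
    for sy in range(h):
        for sx in range(w):
            if labels[sy][sx]:
                continue
            count += 1
            labels[sy][sx] = count
            queue = deque([(sy, sx)])
            while queue:
                y, x = queue.popleft()
                for ny, nx in ((y - 1, x), (y + 1, x), (y, x - 1), (y, x + 1)):
                    if (0 <= ny < h and 0 <= nx < w and not labels[ny][nx]
                            and cluster[ny][nx] == cluster[y][x]):
                        labels[ny][nx] = count
                        queue.append((ny, nx))
    return labels, count

grey = [[10, 12, 200],
        [11, 210, 205],
        [198, 12, 10]]
centres = kmeans_1d([v for row in grey for v in row], k=2)
labels, count = label_regions(grey, centres)
print(centres, count)
for row in labels:
    print(row)
```

Pixels with similar grey values end up in the same region only when they are also adjacent, which is why the dark pixels in the bottom row form a separate region from those in the top-left corner.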
Image’s Features Extraction Cont…


http://www.cis.rit.edu/class/simg782.old/talkMoments/momentEquations.html
I use the first four of the seven invariant moments for this project.
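
For reference, the first four of Hu's seven moment invariants can be computed as in the sketch below. It assumes a binary region given as a list of `(x, y)` pixel coordinates, so that the zeroth moment is simply the region area; the function name and representation are choices made for this example.

```python
def hu_first_four(pixels):
    """First four Hu moment invariants of a region of (x, y) pixels."""
    n = len(pixels)                      # m00, i.e. the region area
    cx = sum(x for x, _ in pixels) / n   # centroid
    cy = sum(y for _, y in pixels) / n

    def eta(p, q):
        # Normalised central moment of order (p, q).
        mu = sum((x - cx) ** p * (y - cy) ** q for x, y in pixels)
        return mu / n ** (1 + (p + q) / 2)

    n20, n02, n11 = eta(2, 0), eta(0, 2), eta(1, 1)
    n30, n03, n21, n12 = eta(3, 0), eta(0, 3), eta(2, 1), eta(1, 2)
    phi1 = n20 + n02
    phi2 = (n20 - n02) ** 2 + 4 * n11 ** 2
    phi3 = (n30 - 3 * n12) ** 2 + (3 * n21 - n03) ** 2
    phi4 = (n30 + n12) ** 2 + (n21 + n03) ** 2
    return phi1, phi2, phi3, phi4

shape = [(0, 0), (1, 0), (2, 0), (0, 1), (0, 2), (0, 3)]   # an L-shaped region
moved = [(x + 5, y - 3) for x, y in shape]                  # translated copy
rotated = [(-y, x) for x, y in shape]                       # rotated by 90 degrees
print(hu_first_four(shape))
print(hu_first_four(moved))
print(hu_first_four(rotated))
```

The three printed tuples agree up to rounding, which is the property that makes these moments useful as shape features.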
Image’s Features Extraction Cont…

Quantisation
– To be suitable for computer processing and (colour) feature extraction, an image must be digitised in amplitude.
– The idea is to reduce the colour space while gaining the ability to localise colour information spatially.
– This project applies quantisation in the HSV colour space.
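
One simple way to quantise in HSV space is uniform binning, sketched below. The bin counts (18 hue, 3 saturation, 3 value) are hypothetical defaults picked for illustration; the slides do not state which quantisation levels the project used.

```python
def quantise_hsv(h, s, v, h_bins=18, s_bins=3, v_bins=3):
    """Map an HSV triple to a single bin index.

    h is in degrees [0, 360]; s and v are in [0, 1].
    Bin counts are illustrative assumptions, not the project's values.
    """
    hi = min(int(h / 360.0 * h_bins), h_bins - 1)
    si = min(int(s * s_bins), s_bins - 1)
    vi = min(int(v * v_bins), v_bins - 1)
    return (hi * s_bins + si) * v_bins + vi

print(quantise_hsv(10, 0.5, 0.5), quantise_hsv(15, 0.5, 0.5))
print(quantise_hsv(359.9, 1.0, 1.0))
```

Nearby hues such as 10 and 15 degrees land in the same bin, which is the reduction of the colour space that the slide describes.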
Image’s Features Extraction Cont…

RGB to HSV
– Let the RGB values range from 0 to 1, and let MIN and MAX be the smallest and largest of the three RGB values.
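
The conversion formula itself was on the slide image and is not in the transcript. The sketch below uses the commonly cited textbook conversion with the MIN/MAX convention described above; it is assumed, not confirmed, to match the slide.

```python
def rgb_to_hsv(r, g, b):
    """Convert RGB in [0, 1] to (H in degrees [0, 360), S in [0, 1], V in [0, 1])."""
    mx, mn = max(r, g, b), min(r, g, b)
    delta = mx - mn
    v = mx
    s = 0.0 if mx == 0 else delta / mx
    if delta == 0:
        h = 0.0                                   # grey: hue is undefined, use 0
    elif mx == r:
        h = 60 * (((g - b) / delta) % 6)
    elif mx == g:
        h = 60 * ((b - r) / delta + 2)
    else:
        h = 60 * ((r - g) / delta + 4)
    return h, s, v

print(rgb_to_hsv(1, 0, 0))
print(rgb_to_hsv(0.2, 0.4, 0.6))
```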
Image’s Features Extraction Cont…

HSV to RGB
– H ranges from 0 to 360.
– V and S range from 0 to 1.
– If S == 0 then R = G = B = V.
– Else use the next formula.
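
The "next formula" was an image on the following slide and is not in the transcript. A common sector-based conversion is sketched below as a stand-in; it follows the ranges given above and the S == 0 special case, but it is not confirmed to be the exact formula shown in the presentation.

```python
def hsv_to_rgb(h, s, v):
    """Convert H in degrees and S, V in [0, 1] to an (r, g, b) tuple in [0, 1]."""
    if s == 0:
        return v, v, v                 # achromatic: every channel equals V
    sector = (h % 360) / 60.0          # which of the six 60-degree hue sectors
    i = int(sector)
    f = sector - i                     # position inside the sector
    p = v * (1 - s)
    q = v * (1 - s * f)
    t = v * (1 - s * (1 - f))
    return [(v, t, p), (q, v, p), (p, v, t),
            (p, q, v), (t, p, v), (v, p, q)][i]

print(hsv_to_rgb(0, 1, 1), hsv_to_rgb(120, 1, 1), hsv_to_rgb(240, 1, 1))
print(hsv_to_rgb(0, 0, 0.5))
```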
Image’s Features Extraction Cont…
Tree Structure


There are many tree structures that can handle multi-dimensional space, such as the R-Tree, R*-Tree and VP-Tree.
We look at the R-Tree structure:
– This project used the R-Tree to simplify the computation.
– Other tree structures can be used in the system.
Tree Structure Cont…

R-Tree (Antonin Guttman)
– An R-Tree is a height-balanced tree and all leaves are on the same level.
– The root node has at least two children unless it is a leaf node.
– Every non-leaf node contains between m and M entries unless it is the root.
– For each entry (I, childnode-pointer) in a non-leaf node, I is the smallest rectangle that spatially contains all rectangles in the child node.
– Every leaf node contains between m and M index records unless it is the root.
– For each index record (I, tuple-identifier) in a leaf node, I is the smallest rectangle that spatially contains the n-dimensional data object represented by the indicated tuple.
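
Two of the properties above, the "smallest rectangle" I and the m-to-M occupancy rule, can be illustrated with a few helper functions. This is not a full R-Tree (no insertion, splitting or search); the rectangle representation `(mins, maxs)` and all function names are assumptions made for the sketch.

```python
def mbr(rects):
    """Smallest n-dimensional rectangle enclosing all `rects`.

    Each rectangle is a pair (mins, maxs) of equal-length coordinate tuples.
    """
    dims = range(len(rects[0][0]))
    mins = tuple(min(r[0][d] for r in rects) for d in dims)
    maxs = tuple(max(r[1][d] for r in rects) for d in dims)
    return mins, maxs

def contains(outer, inner):
    """True if `outer` spatially contains `inner` in every dimension."""
    return all(outer[0][d] <= inner[0][d] and inner[1][d] <= outer[1][d]
               for d in range(len(outer[0])))

def entry_count_ok(count, m, M, is_root=False):
    """Occupancy rule: between m and M entries, lower bound waived for the root."""
    return count <= M and (is_root or count >= m)

children = [((0, 0), (2, 1)), ((1, -1), (3, 4))]
parent_rect = mbr(children)
print(parent_rect)
print(all(contains(parent_rect, c) for c in children))
```

In this project each image would be a point (or small box) in feature space, so the same helpers apply with one dimension per feature.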
Tree Structure Cont…
System Model

[Flow diagram. Boxes: Original Image; Quantised Image; Binary Threshold; K-means clustering; Connected Component labelling; Apply Global feature extraction; Apply Region feature extraction; Insert into tree structure; Put into database]
System Model Cont…



The system inputs around 300 images into the database.
Those images are divided into 10 different classes: animal, car, flower, face, fruit, house, lake, mountain, plane and sunset.
They are stored in persistent storage.
System Model Cont…


In the “binary threshold” stage, I attempt to separate the object from the background of the image.
This stage is very weak, but for some images the result can be helpful (and for others possibly the other way around).
System Model Cont…

Binary Threshold good result
System Model Cont…

Binary Threshold bad result
Retrieval Approach

[Flow diagram: Query sequence. Boxes: Query Image; Pre-process stage; Global Extraction; Region Extraction; Find similarity with data set; Display the result in ascending order]
Retrieval Approach Cont…

For finding similarity, I use a Euclidean distance measure:
[Equation image not recovered; it involves the weight W”i]

Where:
– p is the database image
– q is the query image
– Pi is the database image’s ith feature
– Qi is the query’s ith feature
– n is the number of features
– W”i is the weight for the ith feature
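
The formula itself is missing from the transcript, so the sketch below assumes the usual weighted Euclidean form, with each squared feature difference multiplied by its weight inside the square root. Where the weight sits in the original formula is not known; treat this placement, and the helper names, as assumptions.

```python
import math

def weighted_distance(p, q, weights):
    """Assumed form: sqrt(sum_i W''_i * (P_i - Q_i)^2)."""
    return math.sqrt(sum(w * (pi - qi) ** 2
                         for w, pi, qi in zip(weights, p, q)))

def rank(query, database, weights):
    """(distance, name) pairs in ascending order of distance to the query."""
    return sorted((weighted_distance(features, query, weights), name)
                  for name, features in database.items())

# Hypothetical feature vectors, for illustration only.
database = {"img_a": [0.0, 0.0], "img_b": [3.0, 4.0], "img_c": [1.0, 1.0]}
print(weighted_distance([0, 0], [3, 4], [1, 1]))
print(rank([0.0, 0.0], database, [0.5, 0.5]))
```

Sorting by distance gives the "display the result in ascending order" step of the query sequence.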
Retrieval Approach Cont…




w’i is the weight of feature i from the relevant images
σi is the standard deviation of feature i from the relevant images
w’t is the total weight of feature i
w”t is the normalised weight
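
The weight formulas were on the slide image and are not in the transcript. A common relevance-feedback scheme that fits the definitions above sets each raw weight to the inverse of the feature's standard deviation over the relevant images, then divides by the total so the weights sum to 1. The sketch below assumes that scheme; the small `eps` guard against a zero deviation is also an addition of this example.

```python
import statistics

def feature_weights(relevant, eps=1e-9):
    """Normalised per-feature weights from the relevant images' feature vectors.

    Assumed scheme: w'_i = 1 / sigma_i, then w''_i = w'_i / sum of all w'_i.
    """
    columns = list(zip(*relevant))                      # one column per feature
    raw = [1.0 / (statistics.pstdev(col) + eps) for col in columns]
    total = sum(raw)
    return [w / total for w in raw]

# Two hypothetical relevant images with two features each.
relevant = [[1, 10], [3, 30]]
weights = feature_weights(relevant)
print(weights)
```

A feature that varies little across the relevant images gets a larger weight, on the reasoning that it is what those images have in common.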
Retrieval Approach Cont…

Gaussian Normalisation (for feature normalisation):
– d’(fi,fj) is the similarity of images fi and fj, in the range [-1, 1]
– σij and μij are the standard deviation and mean of each feature, respectively.
– d”(fi,fj) maps d’(fi,fj) into the range [0, 1]
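
The normalisation equations are not in the transcript. The sketch below assumes the widely used 3-sigma form: subtract the mean, divide by three standard deviations so that nearly all values fall in [-1, 1], clip the rest, then shift and halve to reach [0, 1]. Both the factor of 3 and the clipping are assumptions.

```python
import statistics

def gaussian_normalise(distances):
    """Map raw distances to [0, 1] using an assumed 3-sigma Gaussian normalisation."""
    mu = statistics.mean(distances)
    sigma = statistics.pstdev(distances)
    if sigma == 0:
        return [0.5] * len(distances)       # all values identical
    result = []
    for d in distances:
        d1 = (d - mu) / (3 * sigma)         # d': mostly within [-1, 1]
        d1 = max(-1.0, min(1.0, d1))        # clip the rare outliers
        result.append((d1 + 1) / 2)         # d'': shifted into [0, 1]
    return result

normalised = gaussian_normalise([1, 2, 3])
print(normalised)
```

The ordering of the distances is preserved, so the ranking is unchanged while features with very different scales become comparable.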
Experiment Results

Go to Excel file
m1-m8 use only global features
m3 uses colour average, colour sigma and edge density
m2 uses colour average and colour sigma
m8 uses colour sigma and edge density
m9 uses region features + m3
Conclusion





Indexing depends on retrieval and vice versa.
There is no universal system / method for indexing or retrieval.
We can try to develop something that is robust.
Indexing based on regional features gives better results than global features.
With more time, more results can be produced.
Reference
• Kompatsiaris, I., Triantafillou, E. and Strintzis, M. G., “Region-Based Color Image Indexing and Retrieval”, 2001.
• Parker, J. R. and Behm, B., “Use of Multiple Algorithm in Image Content Searches”, International Conference on Information Technology: Coding and Computing (ITCC’04), Volume 2, p. 246.
• Smith, J. R. and Chang, S., “Single Color Extraction and Image Query”, International Conference on Image Processing (ICIP-95), Washington, DC, Oct. 1995.
• Park, J. M., Looney, C. G. and Chen, H. C., “Fast Connected Component Labeling Algorithm Using A Divide and Conquer Technique”, Technical Report, 2000.
• Chiueh, T., “Content-Based Image Indexing”, in Proceedings of International Very Large DataBase Conference, VLDB ’94, Santiago, Chile, September 1994.
• Gonzalez, R. C. and Woods, R. E., “Digital Image Processing”, 3rd edition, Addison-Wesley Publishing Company, Inc., 1993.
