Representation and Retrieval of Visual Content
Retrieval of Visual Content
Images comprise the vast majority of
data in many application domains
Remote sensing (NASA, 1 terabyte per day)
Astronomy
Geographic Information Systems (GIS)
Medicine (CT, MRI, etc.)
Criminal investigation
Trademark authentication
E.G.M. Petrakis
Visual Content
1
Images in Multimedia Systems
Images co-exist with other types of
data in Multimedia Documents
text
attribute
video
sound
Content-Based Image Retrieval
Descriptions of image content are
extracted and stored
Manually: mainly text descriptions
Difficult
Subjective
Automatically: features from content
Computationally expensive
Inexact
Domain specific
System Architecture
Design Issues
Feature Extraction (functions)
Feature Selection
Organization of stored information, file
structures, indexing
Search and retrieval strategies
Sequential / Indexed search / Query refinement
Query language: conditional / example queries
User interface design
Image Descriptions
Subjective interpretation of content: means
different things to different people
Different features for different applications
Colour is important for outdoor images but not for
X-rays, CT, MRI etc.
Motion features are sometimes important
(ultrasound)
Different systems for different applications
Levels of Representation
Low at pixel level (e.g., intensities,
colors)
Intermediate at region level (e.g.,
region, shape and motion features)
High – Semantic human interpretations
(e.g., a class per object or image or
domain concepts such as diagnosis …)
Conflicting issues
Dependence on image content,
computational overhead and uncertainty
increase from low to high level
Selection depends on application, image
type, user requirements, query types
Reliability Criteria
Uniqueness
Proportionality of variation
Robustness against noise
Invariance under translation, rotation,
scaling
Computationally efficient
Content at various levels of detail
Generic Features
Feature vectors of
intensity / color
texture
spatial relationships
motion
combinations of the above
Two kinds of features
global: computed for the entire image
local: computed for objects or image parts
RGB Color Space
Popular hardware-oriented scheme
Colors form a unit cube
r = R/(R+G+B)
g = G/(R+G+B)
b = B/(R+G+B)
RGB is good for
acquisition and display
but not for the
perception of colors
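The normalized coordinates r, g, b defined on this slide can be sketched in a few lines. This is only an illustration: the function name `normalized_rgb` and the convention chosen for black pixels are assumptions, not part of the slides.

```python
def normalized_rgb(R, G, B):
    """Return (r, g, b) with r = R/(R+G+B), g = G/(R+G+B), b = B/(R+G+B)."""
    total = R + G + B
    if total == 0:
        # Black pixel: the ratios are undefined; equal shares are an assumed convention.
        return (1 / 3, 1 / 3, 1 / 3)
    return (R / total, G / total, B / total)


# Halving the intensity leaves the normalized coordinates unchanged.
print(normalized_rgb(200, 100, 100))
print(normalized_rgb(100, 50, 50))
```

The two calls print the same triple, which shows that r, g, b discard overall intensity and keep only the color proportions.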
Munsell Color Space
Color in cylindrical
coordinates
Brightness: vertical
axis
Hue: angular
displacement
Saturation:
cylindrical radius
Color Definitions
Brightness: intensity of color,
average intensity over all
wavelengths
Hue: proportional to the average
wavelength of the color percept
Saturation: amount of white, highly
saturated colors have no white
deep red has S=1
pale pinks have S close to 0 (white and gray have S=0)
HSV Color Space
Value: $V = \frac{1}{3}(R + G + B)$
Saturation: $S = 1 - \frac{3 \min(R, G, B)}{R + G + B}$
Hue: $H = \cos^{-1}\left[\frac{2R - G - B}{2\sqrt{(R - G)^2 + (R - B)(G - B)}}\right]$
H = undefined for S = 0
H = 360° – H if B/V > G/V
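A minimal sketch of the conversion, assuming the formulas read V = (R+G+B)/3, S = 1 - 3 min(R,G,B)/(R+G+B) and H = arccos[(2R-G-B) / (2 sqrt((R-G)^2 + (R-B)(G-B)))]. The function name `rgb_to_hsv` and the handling of black pixels are assumptions made for illustration.

```python
import math


def rgb_to_hsv(R, G, B):
    """Convert one RGB triple to (H, S, V); H is in degrees, or None when undefined."""
    total = R + G + B
    V = total / 3.0
    if total == 0:
        # Black: S is not defined by the formula; 0 is an assumed convention.
        return None, 0.0, V
    S = 1.0 - 3.0 * min(R, G, B) / total
    den = 2.0 * math.sqrt((R - G) ** 2 + (R - B) * (G - B))
    if S == 0 or den == 0:
        # Achromatic pixel (R = G = B): hue is undefined.
        return None, S, V
    # Clamp to [-1, 1] to guard against floating-point round-off.
    cos_h = max(-1.0, min(1.0, (2 * R - G - B) / den))
    H = math.degrees(math.acos(cos_h))
    if B > G:  # same test as B/V > G/V when V > 0
        H = 360.0 - H
    return H, S, V


print(rgb_to_hsv(255, 0, 0))
print(rgb_to_hsv(0, 0, 255))
```

Under these assumptions pure red maps to a hue of 0 degrees, pure green to about 120 degrees and pure blue to about 240 degrees.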
Color in Retrievals
Color Histograms are very common
Simple to compute and compare
For the entire image or for image parts
3D histogram on RGB or HSV space (2^24 bins!)
1D histogram over the 3 primaries (256 bins)
Use HSV histograms: changes in lighting and
viewing angles may cause major variations in
RGB histograms
Invariant under translation, rotation, viewing
angle and scaling
1D histogram
A.Del Bimbo 99
Histogram Comparison
Histogram intersection
Q, I: histograms of a
query and database
image
N: histogram bins
3D (RGB, HSV)
intersection is defined
accordingly
A.Del Bimbo 99
$S(I, Q) = \frac{\sum_{i=1}^{N} \min(I_i, Q_i)}{\sum_{i=1}^{N} Q_i}$ (normalized intersection)
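The normalized intersection can be sketched as follows, taking S(I,Q) as the sum of bin-wise minima divided by the total count of the query histogram. The helper `gray_histogram` and its bin count are illustrative assumptions; any histogram with matching bins would do.

```python
def gray_histogram(pixels, bins=8, levels=256):
    """Count integer gray values in [0, levels - 1] into equally wide bins."""
    hist = [0] * bins
    for p in pixels:
        hist[p * bins // levels] += 1
    return hist


def normalized_intersection(I, Q):
    """S(I, Q) = sum_i min(I_i, Q_i) / sum_i Q_i, for histograms with the same bins."""
    if len(I) != len(Q):
        raise ValueError("histograms must have the same number of bins")
    denom = sum(Q)
    if denom == 0:
        raise ValueError("query histogram is empty")
    return sum(min(i, q) for i, q in zip(I, Q)) / denom


Q = [4, 2, 0, 2]
I = [2, 2, 2, 2]
print(normalized_intersection(I, Q))  # 0.75
```

A value of 1 means every query bin is fully covered by the database image; smaller values mean less overlap.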
Reducing Complexity
Reduce number of histogram bins
Transform RGB histogram to (rg,by,wb)
rg = R – G, by = 2B – R – G, wb = R + G + B
Intensity wb is more coarsely sampled than
rg, by
wb (8 sections), rg, by (16 sections)
The resulting histogram has 2048 bins
Reduced sensitivity to variations of
intensity
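A sketch of the (rg, by, wb) histogram described above, assuming 8-bit R, G, B values and equally wide sections on each axis. The function names and the bin-index layout are illustrative choices.

```python
def section(value, lo, hi, sections):
    """Map an integer in [lo, hi] to a section index in [0, sections - 1]."""
    return (value - lo) * sections // (hi - lo + 1)


def opponent_histogram(pixels):
    """Histogram over (wb, rg, by) with 8 x 16 x 16 = 2048 bins."""
    hist = [0] * (8 * 16 * 16)
    for R, G, B in pixels:
        rg = section(R - G, -255, 255, 16)           # rg = R - G
        by = section(2 * B - R - G, -510, 510, 16)   # by = 2B - R - G
        wb = section(R + G + B, 0, 765, 8)           # wb = R + G + B, coarser
        hist[(wb * 16 + rg) * 16 + by] += 1
    return hist


# Two grays of slightly different intensity fall into the same bin.
hist = opponent_histogram([(100, 100, 100), (110, 110, 110)])
print(len(hist), max(hist))  # 2048 2
```

Because wb has only 8 sections, moderate intensity changes often leave the bin unchanged, which is the reduced sensitivity the slide refers to.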
Reducing Complexity (cont'd)
Clustering detects the K most
prominent colors (e.g., K-means)
Histogram with K bins (e.g., K=64 or 256)
Each bin is the normalized count of pixels
in the cluster
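A small K-means sketch of this idea in plain Python. The initialization (first K distinct colors), the fixed iteration count and all names are assumptions made to keep the example short and deterministic; a production system would likely use a library implementation.

```python
def sq_dist(a, b):
    """Squared Euclidean distance between two color triples."""
    return sum((x - y) ** 2 for x, y in zip(a, b))


def nearest(p, centers):
    """Index of the center closest to pixel p."""
    return min(range(len(centers)), key=lambda j: sq_dist(p, centers[j]))


def kmeans_color_histogram(pixels, k, iterations=10):
    """Cluster pixel colors into k groups; return (centers, normalized counts)."""
    centers = []
    for p in pixels:                      # assumed init: first k distinct colors
        if p not in centers:
            centers.append(p)
        if len(centers) == k:
            break
    if len(centers) < k:
        raise ValueError("fewer than k distinct colors")
    centers = [tuple(float(c) for c in ctr) for ctr in centers]
    for _ in range(iterations):
        labels = [nearest(p, centers) for p in pixels]
        for j in range(k):
            members = [p for p, lab in zip(pixels, labels) if lab == j]
            if members:                   # keep the old center if a cluster empties
                centers[j] = tuple(sum(ch) / len(members) for ch in zip(*members))
    labels = [nearest(p, centers) for p in pixels]
    hist = [labels.count(j) / len(pixels) for j in range(k)]
    return centers, hist


pixels = [(255, 0, 0)] * 3 + [(250, 5, 5)] + [(0, 0, 255)] * 4
centers, hist = kmeans_color_histogram(pixels, k=2)
print(hist)  # [0.5, 0.5]
```

Each histogram entry is the fraction of pixels assigned to one prominent color, so the K entries sum to 1.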
Reducing Complexity (cont'd)
Recognize that only a small number of
bins capture the majority of pixels
Threshold to take only the large bins
Small bins are likely to be noisy bins thus
distorting the intersection
Does not degrade the performance
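One possible way to keep only the large bins is to zero every bin that holds less than a chosen fraction of the pixels. The 5% cutoff and the function name are assumptions for illustration; the slides do not specify a threshold.

```python
def keep_large_bins(hist, min_fraction=0.05):
    """Zero out bins that hold less than min_fraction of all counted pixels."""
    total = sum(hist)
    return [h if total and h / total >= min_fraction else 0 for h in hist]


print(keep_large_bins([50, 1, 45, 2, 2]))  # [50, 0, 45, 0, 0]
```

The thresholded histograms can then be compared with the same intersection measure, with the small bins no longer contributing.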
Distance Function
Certain pairs of bins correspond to
perceptually similar colors
In intersection all bins are compared
independently of each other
Define new Distance function:
A=(aij) represents bin proximity
aij based on proximity in the L*u*v space
$D^2(I, Q) = (I - Q)^t A (I - Q) = \sum_{i=1}^{K} \sum_{j=1}^{K} a_{ij} (x_i - y_i)(x_j - y_j)$
where x, y are the K-bin histograms of I and Q
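The quadratic-form distance can be sketched directly from its definition, D^2 = (x - y)^t A (x - y). The 3-bin histograms and the similarity matrix A below are made-up values chosen only to show the effect of cross-bin similarity.

```python
def quadratic_form_distance_sq(x, y, A):
    """Return (x - y)^t A (x - y) for K-bin histograms x, y and a K x K matrix A."""
    d = [xi - yi for xi, yi in zip(x, y)]
    k = len(d)
    return sum(A[i][j] * d[i] * d[j] for i in range(k) for j in range(k))


# Hypothetical similarity matrix: bins 0 and 1 hold perceptually close colors.
A = [[1.0, 0.5, 0.0],
     [0.5, 1.0, 0.0],
     [0.0, 0.0, 1.0]]

x = [1, 0, 0]
print(quadratic_form_distance_sq(x, [0, 1, 0], A))  # 1.0 (similar colors)
print(quadratic_form_distance_sq(x, [0, 0, 1], A))  # 2.0 (unrelated colors)
```

With A set to the identity both pairs would be at squared distance 2, so the off-diagonal entries are what make similar colors count as closer.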
L*u*v color space
A.Del Bimbo 99
Color Indexing
Color (feature) vector: histogram
Problems:
K is large (K=64 or 256)
Quadratic complexity of matching
SAMs (spatial access methods) assume independent attributes
Solution: GEMINI
Map to low dimensionality feature space
Lower bound distance: Df(I,Q) <= D(I,Q)
Definition of Df(I,Q)
Take some average color value on color
space (e.g., R,G,B)
average color of image: $x = (R_{avg}, G_{avg}, B_{avg})^t = \left(\frac{1}{P}\sum_{p=1}^{P} R(p), \frac{1}{P}\sum_{p=1}^{P} G(p), \frac{1}{P}\sum_{p=1}^{P} B(p)\right)^t$, where P is the number of pixels
and, for the average colors x of I and y of Q,
$D_f^2(I, Q) = (x - y)^t (x - y) = \sum_{i=1}^{3} (x_i - y_i)^2$
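A sketch of the average-color feature and its distance, assuming each image is given as a list of (R, G, B) pixels and that the distance is the squared Euclidean distance between the two average colors. Function names are illustrative.

```python
def average_color(pixels):
    """Return (Ravg, Gavg, Bavg) over a list of (R, G, B) pixels."""
    P = len(pixels)
    return tuple(sum(p[c] for p in pixels) / P for c in range(3))


def avg_color_distance_sq(pixels_i, pixels_q):
    """Squared Euclidean distance between the average colors of two images."""
    x = average_color(pixels_i)
    y = average_color(pixels_q)
    return sum((xi - yi) ** 2 for xi, yi in zip(x, y))


image = [(10, 20, 30), (30, 40, 50)]
query = [(20, 30, 43)]
print(average_color(image))                 # (20.0, 30.0, 40.0)
print(avg_color_distance_sq(image, query))  # 9.0
```

Every image is reduced to a single 3D point, which is what makes indexing with a spatial access method practical.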
GEMINI Approach
Indexing in the 3D color space
Df(I,Q) <= D(I,Q): see QBIC paper for proof
Map query Q to the same 3D space
Search the feature space
Clean-up answer set to eliminate false
drops
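The filter-and-refine steps can be sketched as a range query. To keep the lower-bounding property easy to verify, this example swaps in a simpler pair of distances than the color ones: the full distance is Euclidean over all dimensions, and the feature distance is Euclidean over the first 3 dimensions, which can never exceed it. The data and names are hypothetical, and the filter step scans a list where a real system would search an index.

```python
import math


def euclidean(a, b):
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))


def gemini_range_query(database, query, eps, dims=3):
    """Return (answers, number of candidates) for all items within eps of query."""
    fq = query[:dims]                      # map the query to the feature space
    # Filter: the feature distance lower-bounds the full distance, so no true
    # answer is lost here (no false dismissals).
    candidates = [(name, vec) for name, vec in database
                  if euclidean(vec[:dims], fq) <= eps]
    # Refine: compute the full distance to discard the false drops.
    answers = [name for name, vec in candidates if euclidean(vec, query) <= eps]
    return answers, len(candidates)


database = [("a", [1, 0, 0, 0]), ("b", [1, 0, 0, 5]), ("c", [9, 9, 9, 9])]
print(gemini_range_query(database, [1, 0, 0, 1], eps=2))  # (['a'], 2)
```

Item "b" passes the filter but fails the refinement step, so it is a false drop; item "c" is rejected by the cheap feature distance alone.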
Texture
Repetitive patterns of local variations of
intensity
Structural: identify placement rules of
structural primitives
the less effective approach
Statistical: characterize spatial
distribution of intensity in terms of
measurements
Haralick, Tamura features etc.
Texture Examples
Ballard and Brown 84
Structural Texture
Ballard and Brown 84
Statistical Texture
Ballard and Brown 84
Haralick Features [Haralick 73]
Set of 4 features characterizing the
intensity transitions of neighboring
pixels in various directions using
Gray-Tone Spatial-Dependence (GTSD)
arrays
One GTSD for each pixel neighborhood
Neighborhood: pixels in direction θ and
distance d
GTSD Array Pd,θ[i,j]
Counts pixel pairs in distance d having gray
levels i, j in direction θ
One GTSD for θ = (0°, 45°, 90°, 135°) and d = (1, 2, ...)
Intensity in range [0,k-1]: Pd,θ[i,j] is a k x k matrix
Example: 5 x 5 image with gray levels in [0, 2]

2 1 2 0 1
0 2 1 1 2
0 1 2 2 0
1 2 2 0 1
2 0 1 0 1

P[i,j] for d=1 (row i, column j), to be normalized by 1/16:

      j=0  j=1  j=2
i=0    0    2    2
i=1    2    1    2
i=2    2    3    2
Computing Pd,θ[i,j]
Count all pairs of pixels in which the first
pixel has value i and its matching pair
displaced by d=1 in θ = 45° or 135° direction
has value j
Enter this count in the (i,j) position of Pd,θ[i,j]
E.g., there are 3 pairs [2,1], then P[2,1] = 3
Pd,θ[i,j] is not symmetric: Pd,θ[i,j] ≠ Pd,θ[j,i]
Normalize Pd,θ[i,j] by the total number of pairs
Pd,θ[i,j]: probability mass function
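The counting procedure can be sketched as follows for a small 5 x 5 example image with gray levels 0 to 2. The displacement is passed as a (row, column) offset; here it is one row down and one column right, and whether that diagonal is called 45° or 135° depends on the axis convention, which is an assumption of this sketch. The function name `gtsd` is illustrative.

```python
def gtsd(image, k, dr, dc, normalize=False):
    """Count pixel pairs (first value i, displaced value j) into a k x k array."""
    rows, cols = len(image), len(image[0])
    P = [[0] * k for _ in range(k)]
    for r in range(rows):
        for c in range(cols):
            r2, c2 = r + dr, c + dc
            if 0 <= r2 < rows and 0 <= c2 < cols:
                P[image[r][c]][image[r2][c2]] += 1
    if normalize:
        total = sum(map(sum, P))
        P = [[v / total for v in row] for row in P]
    return P


image = [[2, 1, 2, 0, 1],
         [0, 2, 1, 1, 2],
         [0, 1, 2, 2, 0],
         [1, 2, 2, 0, 1],
         [2, 0, 1, 0, 1]]

P = gtsd(image, k=3, dr=1, dc=1)
print(P)        # [[0, 2, 2], [2, 1, 2], [2, 3, 2]]
print(P[2][1])  # 3 pairs [2,1]
```

The 16 counted pairs give P[2][1] = 3 and P[1][2] = 2, which also illustrates that the array is not symmetric.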
Texture Features
1) Angular Second Moment (ASM): $f_1 = \sum_{i=0}^{k-1} \sum_{j=0}^{k-1} p[i,j]^2$
Small values for non-homogeneous regions
2) Contrast: $f_2 = \sum_{i=0}^{k-1} \sum_{j=0}^{k-1} (i - j)^2 \, p[i,j]$
Large values for many large transitions or
for many transitions
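Texture Features (cont'd)
3) Correlation: $f_3 = \frac{\sum_{i=0}^{k-1} \sum_{j=0}^{k-1} i \, j \, p[i,j] - \mu_x \mu_y}{\sigma_x \sigma_y}$
where
$\mu_x = \sum_{i=0}^{k-1} i \, p_x[i]$, $\mu_y = \sum_{j=0}^{k-1} j \, p_y[j]$
$\sigma_x^2 = \sum_{i=0}^{k-1} (i - \mu_x)^2 p_x[i]$, $\sigma_y^2 = \sum_{j=0}^{k-1} (j - \mu_y)^2 p_y[j]$
$p_x[i] = \sum_{j=0}^{k-1} p[i,j]$, $p_y[j] = \sum_{i=0}^{k-1} p[i,j]$
Frequency of intensity transitions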
Texture Features (cont'd)
4) Entropy: $f_4 = -\sum_{i=0}^{k-1} \sum_{j=0}^{k-1} p[i,j] \log p[i,j]$
High values for uniform p[i,j], i.e., no preferred
gray-level (no texture)
One vector Tθ,d = (f1,f2,f3,f4) for each θ, d, or
a single vector concatenating all Tθ,d in
sequence
Correlated features: apply the Karhunen-Loève (K-L) transform to decorrelate and to reduce dimensionality
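The four features can be sketched from a normalized GTSD array p, taking ASM as the sum of squared entries, contrast as the (i-j)^2-weighted sum, correlation as the normalized covariance of the row and column indices, and entropy as -sum p log p. The function name, the natural logarithm and the value returned when a variance is zero are assumptions of this sketch.

```python
import math


def haralick_features(p):
    """Return (ASM, contrast, correlation, entropy) for a normalized k x k array p."""
    k = len(p)
    idx = range(k)
    px = [sum(p[i][j] for j in idx) for i in idx]   # marginal over rows
    py = [sum(p[i][j] for i in idx) for j in idx]   # marginal over columns
    mu_x = sum(i * px[i] for i in idx)
    mu_y = sum(j * py[j] for j in idx)
    var_x = sum((i - mu_x) ** 2 * px[i] for i in idx)
    var_y = sum((j - mu_y) ** 2 * py[j] for j in idx)

    asm = sum(v * v for row in p for v in row)
    contrast = sum((i - j) ** 2 * p[i][j] for i in idx for j in idx)
    e_ij = sum(i * j * p[i][j] for i in idx for j in idx)
    denom = math.sqrt(var_x * var_y)
    # Correlation is undefined when a marginal has zero variance; 0.0 is assumed.
    correlation = (e_ij - mu_x * mu_y) / denom if denom > 0 else 0.0
    # Terms with p = 0 contribute nothing to the entropy.
    entropy = -sum(v * math.log(v) for row in p for v in row if v > 0)
    return asm, contrast, correlation, entropy


uniform = [[0.25, 0.25], [0.25, 0.25]]
diagonal = [[0.5, 0.0], [0.0, 0.5]]
print(haralick_features(uniform))
print(haralick_features(diagonal))
```

The uniform array has the lower ASM and the higher entropy of the two, while the diagonal array has zero contrast because every pair has equal gray levels.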
Shape
Assume that objects are extracted
Requires image segmentation
Difficult problem
Criteria for reliable shape recognition
Uniqueness of representation
Robustness against noise and distortion
Proportionality of variation
Invariance under scale, rotation and translation
Efficiency of computation
Occlusion: handle partially visible shapes
Shape Matching Methods
Two categories of methods based on:
Regions: represent and match properties of
regions
Contours: represent and match properties
of boundaries
Techniques: local/global, model based,
fuzzy, statistical, neural networks
Input/Output
For any two shapes, compute:
Their distance
The correspondences between similar parts
Petrakis 02
Moment Invariants
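An object is represented by its binary image R
A set of 7 features can be defined based on central moments
Moments: $m_{pq} = \sum_{(x,y) \in R} x^p y^q$
Center of gravity: $\bar{x} = m_{10} / m_{00}$, $\bar{y} = m_{01} / m_{00}$
Central moments: $\mu_{pq} = \sum_{(x,y) \in R} (x - \bar{x})^p (y - \bar{y})^q$, $p, q = 0, 1, 2, \ldots$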
Central Moments [Hu 62]
Invariant to translation and rotation
Use $\eta_{pq} = \mu_{pq} / \mu_{00}^{\gamma}$, where $\gamma = (p+q)/2 + 1$ for $p + q = 2, 3, \ldots$,
instead of the μ's in the above formulas to achieve scale
invariance
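A sketch of raw, central and normalized central moments for a region given as a list of (x, y) pixel coordinates. Only the first of Hu's seven invariants, φ1 = η20 + η02, is computed here; the region and the function names are made up for illustration.

```python
def raw_moment(R, p, q):
    """m_pq = sum of x^p y^q over the pixels (x, y) of region R."""
    return sum(x ** p * y ** q for x, y in R)


def central_moment(R, p, q):
    """mu_pq, computed about the center of gravity of R."""
    m00 = raw_moment(R, 0, 0)
    xb = raw_moment(R, 1, 0) / m00
    yb = raw_moment(R, 0, 1) / m00
    return sum((x - xb) ** p * (y - yb) ** q for x, y in R)


def normalized_central_moment(R, p, q):
    """eta_pq = mu_pq / mu_00^gamma with gamma = (p + q)/2 + 1."""
    gamma = (p + q) / 2 + 1
    return central_moment(R, p, q) / central_moment(R, 0, 0) ** gamma


def hu_phi1(R):
    """First Hu invariant: eta_20 + eta_02."""
    return normalized_central_moment(R, 2, 0) + normalized_central_moment(R, 0, 2)


shape = [(0, 0), (1, 0), (2, 0), (0, 1)]          # small L-shaped region
shifted = [(x + 10, y - 4) for x, y in shape]     # translated copy
rotated = [(-y, x) for x, y in shape]             # copy rotated by 90 degrees
print(hu_phi1(shape), hu_phi1(shifted), hu_phi1(rotated))
```

All three calls give the same value up to round-off, illustrating invariance to translation and to this rotation. Scale invariance holds only approximately on a pixel grid.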
More Shape Methods
Moments can also be defined on the closed bounding
contours of objects [Gupta and Srinath 87]
Moments can also be defined for open curves [Koch
and Kashyap 89]
Methods based on the Fourier Transform of the
bounding contour have also been used [Wallace and
Wintz 80, Rauber and Steiger 92]
More efficient methods have also been proposed
[Petrakis, Diplaros and Milios 2002]. This paper examines many
of the above methods based on Fourier descriptors and moments,
and reports many experiments and comparisons
Spatial Relationships
Find images showing similar objects in similar
spatial relationships
find X-rays similar to Smith’s examination
find images showing a tree close to a house
one of the two images may contain extra objects
[Figure: query image Q and database image I; Petrakis 02]
Methods
Two main categories of methods
Spatial projections (2D strings and variants
like 2D C strings, Expanded 2D strings etc).
Attributed Relational Graphs (ARGs)
Image distance is defined accordingly
Editing distance on ARGs
2D string matching
Image Segmentation
All methods assume segmented images
images are segmented manually or semi-manually
image segmentation is a difficult problem
Petrakis02
Image Features
Individual objects: 5-dimensional vectors
Size: number of pixels in a region
Perimeter: length of bounding contour
Roundness: ratio of smallest/largest second
moment
Orientation: angle with x direction (sin,cos)
Spatial Relationships: 4-dimensional vectors
Position: inside or outside
Distance: minimum distance of contours
Orientation: angle with x (sin,cos) of c.g.’s
Attributed Relational Graphs
(ARGs)
Objects are
represented by nodes
Relationships are
represented by arcs
Nodes and arcs are
labeled by feature
vectors
Matching: ARG editing
distance, Hungarian
[Petrakis 02]
ARG Editing Distance
Matching: sequence of edit operations
that transform a query Q to an image I
Edit operation: node or arc insertion,
deletion or substitution
$D(Q, I) = \min_{S(I)} F(S(I)) = \min_{S(I)} F(f(k), f(k-1), \ldots, f(1))$
S(I): a sequence of k edit operations that transforms Q to I
F combines the costs of edit operations
f is the cost of an edit operation defined
as a vector distance
Matching Algorithm [Messmer95]
Find the sequence of edit operations
that yield the minimum total cost
Formulated as tree search problem
Expand all possible matching sequences
Branch and bound
Tree node: matching of ARG node
Tree arc: matching of ARG edges
Subtree: matching of subgraphs of Q, I
[Figure: query Q and model I]
Hungarian Method [Petrakis 02]
Matching: assignment problem
The relationships are ignored
$D_F(Q, I) = \min_F \sum_{i=1}^{n} C(i, F(i))$
F: a mapping of the n query objects to image objects; D_F is the cost of the best mapping
C(i,F(i)): vector distance
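The assignment formulation can be illustrated with a brute-force search over all mappings F. This is not the Hungarian method itself, which solves the same problem in polynomial time; the exhaustive search below is only practical for a handful of objects. The feature vectors and names are hypothetical.

```python
import math
from itertools import permutations


def euclidean(a, b):
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))


def assignment_distance(query_nodes, image_nodes):
    """Return (minimum total cost, mapping) where mapping[i] = F(i).

    Each query object is mapped to a distinct image object; relationships
    between objects are ignored, as in the formulation above.
    """
    n, m = len(query_nodes), len(image_nodes)
    if n > m:
        raise ValueError("the image must have at least as many objects as the query")
    best_cost, best_map = math.inf, None
    for perm in permutations(range(m), n):     # every injective mapping F
        cost = sum(euclidean(query_nodes[i], image_nodes[perm[i]]) for i in range(n))
        if cost < best_cost:
            best_cost, best_map = cost, perm
    return best_cost, best_map


query = [(0, 0), (10, 10)]
image = [(10, 11), (0, 1), (50, 50)]           # one extra object
print(assignment_distance(query, image))       # (2.0, (1, 0))
```

Query object 0 is matched to image object 1 and query object 1 to image object 0; the extra image object stays unmatched.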
2D String [Chang 87]
2D string: projections of c.g.’s along x and y
Each object is represented by a name or class
Matching: string matching (type 0,1 and 2)
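A simplified sketch of how the two projection strings could be built from object centers of gravity. It uses only the operators "<" (strictly before) and "=" (same coordinate); the full 2D string notation and its variants have further operators, so treat this as an approximation. Object names and coordinates are made up.

```python
def projection_string(objects, axis):
    """Order object names along one axis, joined by '<' or '='.

    objects: list of (name, (x, y)) pairs; axis: 0 for x, 1 for y.
    """
    if not objects:
        return ""
    ordered = sorted(objects, key=lambda o: o[1][axis])   # stable sort
    parts = [ordered[0][0]]
    for prev, cur in zip(ordered, ordered[1:]):
        parts.append("=" if prev[1][axis] == cur[1][axis] else "<")
        parts.append(cur[0])
    return " ".join(parts)


def two_d_string(objects):
    """Return the (x-projection, y-projection) pair of strings."""
    return projection_string(objects, 0), projection_string(objects, 1)


scene = [("house", (2, 1)), ("tree", (5, 1)), ("sun", (4, 6))]
print(two_d_string(scene))
# ('house < sun < tree', 'house = tree < sun')
```

Two images can then be compared by matching these strings instead of the original pixel data.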
Discussion [Petrakis 02]
The ARG editing distance is the most
accurate method, followed by Hungarian and
2D strings
2D strings is the fastest method, followed by
Hungarian and ARG editing distance
Speed and accuracy are traded off: the more
accurate a method, the slower it is
Indexing: Petrakis 2002, Petrakis & Faloutsos
97 (ARGs), Petrakis 93 (2D strings)
http://www.ced.tuc.gr/~petrakis/publications/publications.htm
Image Segmentation
All methods assumed segmented images
Segmentation is the process of partitioning an
image into groups of connected pixels
(regions) with similar properties
Gray levels
Colors
Textures
Motion characteristics (motion vectors)
Edge continuity
Segmentation Methods
Two approaches
Region segmentation
Edge segmentation
Regions may correspond to objects
Not always perfect (noise, bad
illumination, 3D world etc.)
Further reading: "Machine Vision", R.
Jain, R. Kasturi, B. G. Schunck,
McGraw-Hill, 1995
Region Segmentation
Converts a gray-level image into a
binary one by applying carefully
selected thresholds on intensity
histograms
The image is partitioned into two sets
Black pixels: objects
White pixels: background
Histogram Thresholding
The threshold distinguishes the objects from
the background
The objects have similar gray-level values
Thresholding
Find thresholds automatically by analyzing
the gray value distribution (histogram) of the
image
Objects are dark against a light background
Their gray-value distributions can be separated
by putting thresholds between them
Automatic thresholding is based on
peakiness and valleyness measurements at
each point of the histogram
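A simple stand-in for automatic threshold selection is to locate the two dominant histogram peaks and place the threshold at the lowest point between them. This is a common heuristic, not the peakiness and valleyness measure named above; the second-peak weighting, the function names and the sample histogram are assumptions.

```python
def valley_threshold(hist):
    """Return the gray level of the deepest valley between the two main peaks."""
    levels = range(len(hist))
    p1 = max(levels, key=lambda i: hist[i])                    # highest peak
    # Second peak: high count and far from the first peak (assumed weighting).
    p2 = max(levels, key=lambda i: hist[i] * (i - p1) ** 2)
    lo, hi = sorted((p1, p2))
    return min(range(lo, hi + 1), key=lambda i: hist[i])


def binarize(image, t):
    """Label pixels at or below t as object (1), the rest as background (0)."""
    return [[1 if v <= t else 0 for v in row] for row in image]


hist = [0, 8, 20, 9, 2, 1, 3, 10, 15, 6]   # bimodal: dark objects, light background
t = valley_threshold(hist)
print(t)                                   # 5
print(binarize([[1, 9], [7, 2]], t))       # [[1, 0], [0, 1]]
```

Dark pixels at or below the threshold are labeled as objects, matching the assumption that objects are dark against a light background.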
Further Reading
C. Faloutsos et al., “Efficient and Effective Querying by Image Content”, Journal of Intelligent Information Systems, Vol. 3, No. 3/4, pp. 231-262, 1994
M. Flickner et al., “Query by Image and Video Content: the QBIC System”, IEEE Computer, Vol. 28, No. 9, pp. 13-32, Sept. 1995
R.C. Veltkamp and M. Tanase, “Content-Based Image Retrieval Systems: A Survey”, TR UU-CS-2000-34, Utrecht University, March 2001
A.W.M. Smeulders et al., “Content Based Image Retrieval at the End of the Early Years”, IEEE Transactions on PAMI, 22(12): 1349-1380, 2000
R. Schettini et al., “A Survey on Methods for Colour Image Indexing and Retrieval in Image Databases”, in: R. Luo and L. MacDonald (Eds.), Color Imaging Science: Exploiting Digital Media, John Wiley, 2001
M. Swain, D.H. Ballard, “Color Indexing”, Intern. Journ. of Comp. Vision, Vol. 7, No. 1, pp. 11-32, 1991
D. Androutsos et al., “A Novel Vector-Based Approach to Color Image Retrieval Using a Vector Angular-Based Distance Measure”, Comp. Vision and Image Understanding, Vol. 75, No. 1/2, July/Aug. 1999, pp. 46-58.
References
J.R. Smith, S-F. Chang, “Tools and Techniques for Color Image Retrieval”, IS&T/SPIE Proc., Vol. 2670, Storage and Retrieval for Image and Video Databases IV
R.M. Haralick, K. Shanmugam, I. Dinstein, “Textural Features for Image Classification”, IEEE Trans. on Systems Man and Cybernetics, 1973, pp. 610-621.
E.G.M. Petrakis, A. Diplaros and E. Milios, “Matching and Retrieval of Distorted and Occluded Shapes using Dynamic Programming”, IEEE Trans. on PAMI, Vol. 24, No. 11, Nov. 2002, pp. 1-16.
M.-K. Hu, “Visual Pattern Recogn. by Moment Invariants”, IRE Trans. on Info. Theory, IT-8:179–187, 1962.
T.P. Wallace and P.A. Wintz, “An Efficient Three-Dimensional Aircraft Recognition Algorithm Using Normalized Fourier Descriptors”, Computer Graphics and Image Processing, 13:99–126, 1980.
T.W. Rauber and A.S. Steiger-Carcao, “Shape Description by UNL Fourier Features – An Application to Handwritten Character Recognition”, 11th IAPR Intern. Conf. on Pattern Recogn., 30 Aug.-3 Sept. 1992, The Hague, The Netherlands.
References
L. Gupta and M.D. Srinath, “Contour Sequence Moments for the Classification of Closed Planar Shapes”, Pattern Recognition, Vol. 20, No. 3, pp. 267-272, 1987
M.W. Koch and R.L. Kashyap, “Matching Polygon Fragments”, Pattern Recognition Letters, No. 10, pp. 297-308, 1989.
Euripides G.M. Petrakis: "Fast Retrieval by Spatial Structure in Image
DataBases", Journal of Visual Languages and Computing, Vol. 13, No. 5, October
2002, pp. 545-569.
Euripides G.M. Petrakis: "Design and Evaluation of Spatial Similarity Approaches
for Image Retrieval", Image and Vision Computing, January 2002, Number 1,
Volume 20, pp. 59-76.
Euripides G.M. Petrakis and Christos Faloutsos: "Similarity Searching in Medical
Image Databases", IEEE Transactions on Knowledge and Data Engineering, Vol. 9,
No. 3, pp. 435-447, May/June 1997.