Representation and Retrieval of Visual Content


Retrieval of Visual Content
Images comprise the vast majority of
data in many application domains
Remote sensing (NASA, 1 terabyte per day)
Astronomy
Geographic Information Systems (GIS)
Medicine (CT, MRI, etc.)
Criminal investigation
Trademark authentication
E.G.M. Petrakis
Visual Content
1
Images in Multimedia Systems
Images co-exist with other types of
data in Multimedia Documents
text
attribute
video
sound
Content-Based Image Retrieval
Descriptions of image content are
extracted and stored
Manually: mainly text descriptions
Difficult
Subjective
Automatically: features from content
Computationally expensive
Inexact
Domain specific
System Architecture
Design Issues
Feature Extraction (functions)
Feature Selection
Organization of stored information, file
structures, indexing
Search and retrieval strategies
Sequential / Indexed search / Query refinement
Query language: conditional / example queries
User interface design
Image Descriptions
Subjective interpretation of content: means
different things to different people
Different features for different applications
Colour is important for outdoor images but not for
X-rays, CT, MRI, etc.
Motion features are sometimes important
(ultrasound)
Different systems for different applications
Levels of Representation
Low: at pixel level (e.g., intensities,
colors)
Intermediate: at region level (e.g.,
region, shape, motion features)
High: semantic, human interpretations
(e.g., a class per object or image or
domain concepts such as diagnosis …)
Conflicting issues
Dependence on image content,
computational overhead and uncertainty
increase from low to high level
Selection depends on application, image
type, user requirements, query types
Reliability Criteria
Uniqueness
Proportionality of variation
Robustness against noise
Invariance under translation, rotation,
scaling
Computationally efficient
Content at various levels of detail
Generic Features
Feature vectors of
intensity / color
texture
spatial relationships
motion
combinations of the above
Two kinds of features
global: computed for the entire image
local: computed for objects or image parts
RGB Color Space
Popular hardware-oriented
scheme
Colors form a unit cube
r = R/(R+G+B)
g = G/(R+G+B)
b = B/(R+G+B)
RGB is good for
acquisition and display
but not for the
perception of colors
Munsell Color Space
Color in cylindrical
coordinates
Brightness: vertical
axis
Hue: angular
displacement
Saturation:
cylindrical radius
Color Definitions
Brightness: intensity of color,
average intensity over all
wavelengths
Hue: proportional to the average
wavelength of the color percept
Saturation: amount of white, highly
saturated colors have no white
deep red has S=1
pinks have S=0
HSV Color Space
Value: $V = \frac{1}{3}(R + G + B)$
Hue: $H = \cos^{-1}\left\{\frac{2R - G - B}{2\sqrt{(R - G)^2 + (R - B)(G - B)}}\right\}$
Saturation: $S = 1 - \frac{3}{R + G + B}\min(R, G, B)$
H = undefined for S = 0
H = 360 - H if B/V > G/V
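The conversion above can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the function name `rgb_to_hsv_slide` is hypothetical, inputs are assumed to lie in [0, 1], H is returned in degrees, and `None` stands for "undefined".

```python
import math

def rgb_to_hsv_slide(r, g, b):
    """Return (H, S, V) using the V, S and H formulas of this slide.

    H is in degrees, or None when it is undefined (S = 0).
    """
    total = r + g + b
    v = total / 3.0
    if total == 0:
        # Black pixel: the saturation formula divides by zero,
        # so S is set to 0 here by assumption and H is undefined.
        return None, 0.0, 0.0
    s = 1.0 - 3.0 * min(r, g, b) / total
    root = math.sqrt((r - g) ** 2 + (r - b) * (g - b))
    if s == 0 or root == 0:
        return None, s, v
    ratio = (2 * r - g - b) / (2 * root)
    ratio = max(-1.0, min(1.0, ratio))  # guard against rounding error
    h = math.degrees(math.acos(ratio))
    if b > g:  # same test as B/V > G/V because V > 0 here
        h = 360.0 - h
    return h, s, v
```

With these formulas pure red, green and blue map to hues of 0, 120 and 240 degrees, and any gray pixel has S = 0 and no hue.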
Color in Retrievals
Color Histograms are very common
Simple to compute and compare
For the entire image or for image parts
3D histogram on RGB or HSV space (2^24 bins!)
1D histogram over the 3 primaries (256 bins)
Use HSV histograms: changes in lighting and
viewing angles may cause major variations in
RGB histograms
Invariant under translation, rotation, viewing
angle and scaling
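A coarse 3D colour histogram can be sketched as follows. The function name `color_histogram`, the 8-bit pixel representation as `(R, G, B)` tuples and the default of 4 sections per channel are assumptions made for illustration only.

```python
def color_histogram(pixels, bins_per_channel=4):
    """Normalized 3D RGB histogram, flattened into one list.

    pixels: iterable of (R, G, B) tuples with values in 0..255.
    The result has bins_per_channel ** 3 entries that sum to 1.
    """
    n = bins_per_channel
    hist = [0.0] * (n ** 3)
    count = 0
    for r, g, b in pixels:
        # Quantize each channel into n equal sections.
        ri, gi, bi = r * n // 256, g * n // 256, b * n // 256
        hist[(ri * n + gi) * n + bi] += 1.0
        count += 1
    if count == 0:
        return hist
    return [h / count for h in hist]
```

The same function can be applied to the whole image or to the pixels of a single image part.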
1D histogram
A.Del Bimbo 99
Histogram Comparison
Histogram intersection
Q, I: histograms of a
query and database
image
N: histogram bins
3D (RGB, HSV)
intersection is defined
accordingly
A.Del Bimbo 99
$S(I,Q) = \frac{\sum_{i=1}^{N} \min(I_i, Q_i)}{\sum_{i=1}^{N} Q_i}$
normalized intersection
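A minimal sketch of the normalized intersection, assuming both histograms are plain lists with the same N bins (the function name is hypothetical):

```python
def histogram_intersection(image_hist, query_hist):
    """Normalized intersection S(I, Q) of two histograms with equal bin counts."""
    if len(image_hist) != len(query_hist):
        raise ValueError("histograms must have the same number of bins")
    denom = sum(query_hist)
    if denom == 0:
        return 0.0  # empty query histogram: similarity taken as 0 by assumption
    overlap = sum(min(i, q) for i, q in zip(image_hist, query_hist))
    return overlap / denom
```

For example, I = [2, 0, 3, 1] and Q = [1, 1, 1, 1] give bin-wise minima 1, 0, 1, 1, so S = 3/4. The measure is normalized by the query histogram, so it is not symmetric in I and Q.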
Reducing Complexity
Reduce number of histogram bins
Transform RGB histogram to (rg,by,wb)
rg = R – G, by = 2B – R – G, wb = R + G + B
Intensity wb is more coarsely sampled than
rg, by
wb (8 sections), rg, by (16 sections)
The resulting histogram has 2048 bins
Reduced sensitivity to variations of
intensity
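The (rg, by, wb) quantization can be sketched as below. The slide gives only the number of sections per axis, so the uniform section boundaries, the 8-bit input range and the function name `opponent_bin` are assumptions.

```python
def opponent_bin(r, g, b):
    """Map an 8-bit RGB pixel to one of 16 * 16 * 8 = 2048 histogram bins."""
    rg = r - g          # range -255 .. 255
    by = 2 * b - r - g  # range -510 .. 510
    wb = r + g + b      # range 0 .. 765
    rg_i = (rg + 255) * 16 // 511   # 16 sections
    by_i = (by + 510) * 16 // 1021  # 16 sections
    wb_i = wb * 8 // 766            # 8 coarser sections for intensity
    return (rg_i * 16 + by_i) * 8 + wb_i
```

Two gray pixels of different brightness fall in the same (rg, by) cell and differ only in the coarse wb section, which is where the reduced sensitivity to intensity comes from.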
Reducing Complexity (cont'd)
Clustering detects the K most
prominent colors (e.g., K-means)
Histogram with K bins (e.g., K=64 or 256)
Each bin is the normalized count of pixels
in the cluster
Reducing Complexity (cont'd)
Recognize that only a small number of
bins capture the majority of pixels
Threshold to take only the large bins
Small bins are likely to be noisy bins thus
distorting the intersection
Does not degrade the performance
Distance Function
Certain pairs of bins correspond to
perceptually similar colors
In intersection all bins are compared
independently of each other
Define new Distance function:
A=(aij) represents bin proximity
aij based on proximity in the L*u*v space
$D^2(I,Q) = (I - Q)^t A (I - Q) = \sum_{i=1}^{K}\sum_{j=1}^{K} a_{ij}(x_i - y_i)(x_j - y_j)$
where $x_i$, $y_i$ are the i-th bins of the histograms of I and Q
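The quadratic-form distance can be sketched directly from its definition. The function name and the example similarity values are hypothetical; in practice the matrix A would be derived from colour proximity in the L*u*v space.

```python
def quadratic_distance_sq(x, y, a):
    """Squared distance (x - y)^t A (x - y) for K-bin histograms x and y.

    a is a K x K matrix (list of lists) of bin proximities.
    """
    k = len(x)
    diff = [xi - yi for xi, yi in zip(x, y)]
    return sum(a[i][j] * diff[i] * diff[j] for i in range(k) for j in range(k))
```

With A equal to the identity this is the squared Euclidean distance. Setting a positive off-diagonal entry for two perceptually similar bins lowers the distance between histograms that differ only by moving mass between those bins. The double sum is what makes matching quadratic in K.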
L*u*v color space
A.Del Bimbo 99
Color Indexing
Color (feature) vector: histogram
Problems:
K is large (K=64 or 256)
Quadratic complexity of matching
SAMs (spatial access methods) assume independent attributes
Solution: GEMINI
Map to low dimensionality feature space
Lower bound distance: Df(I,Q) <= D(I,Q)
Definition of Df(I,Q)
Take some average color value on color
space (e.g., R,G,B)
average color of image:
$(R_{avg}, G_{avg}, B_{avg}) = \left(\frac{1}{P}\sum_{p=1}^{P} R(p),\ \frac{1}{P}\sum_{p=1}^{P} G(p),\ \frac{1}{P}\sum_{p=1}^{P} B(p)\right)$
and
$D_f(I,Q) = (x - y)^t (x - y) = \sum_{i=1}^{3} (x_i - y_i)^2$
where x, y are the average color vectors of I and Q, and P is the number of pixels
GEMINI Approach
Indexing in the 3D color space
Df(I,Q) <= D(I,Q): see QBIC paper for proof
Map query Q to the same 3D space
Search the feature space
Clean-up answer set to eliminate false
drops
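The filter-and-refine steps above can be sketched as a range query. Everything named here is hypothetical, and `toy_full_distance` is only a stand-in for the expensive histogram distance D: it is the mean squared pixel difference between equally sized images, chosen because the squared distance between average colours never exceeds it, so the lower-bounding condition holds in this toy setting. A real system would precompute the 3D features and search them through an index rather than a linear scan.

```python
def average_color(pixels):
    """3D feature vector (Ravg, Gavg, Bavg) of a list of (R, G, B) pixels."""
    p = len(pixels)
    return tuple(sum(px[c] for px in pixels) / p for c in range(3))

def d_f(x, y):
    """Cheap distance in the 3D feature space."""
    return sum((xi - yi) ** 2 for xi, yi in zip(x, y))

def toy_full_distance(pixels_a, pixels_b):
    """Stand-in for D (assumption): mean squared pixel difference."""
    p = len(pixels_a)
    return sum(d_f(a, b) for a, b in zip(pixels_a, pixels_b)) / p

def gemini_range_query(query, database, eps, full_distance=toy_full_distance):
    """Return (candidates, answers) for all images within eps of the query.

    database: dict mapping image name to its pixel list.
    """
    q_feat = average_color(query)
    # Filter step: the lower bound guarantees no qualifying image is lost.
    candidates = [name for name, pixels in database.items()
                  if d_f(average_color(pixels), q_feat) <= eps]
    # Clean-up step: the full distance removes the false drops.
    answers = [name for name in candidates
               if full_distance(database[name], query) <= eps]
    return candidates, answers
```

An image whose pixels are a permutation of the query's has the same average colour, so it passes the filter step even when its full distance is large; the clean-up step is what removes it.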
Texture
Repetitive patterns of local variations of
intensity
Structural: identify placement rules of
structural primitives
the less effective approach
Statistical: characterize spatial
distribution of intensity in terms of
measurements
Haralick, Tamura features etc.
Texture Examples
Ballard and Brown 84
Structural Texture
Ballard and Brown 84
Statistical Texture
Ballard and Brown 84
Haralick Features [Haralick 73]
Set of 4 features characterizing the
intensity transitions of neighboring
pixels in various directions using
Gray-Tone Spatial-Dependence (GTSD)
arrays
One GTSD for each pixel neighborhood
Neighborhood: pixels in direction θ and
distance d
GTSD Array Pd,θ[i,j]
Counts pixel pairs at distance d having gray
levels i, j in direction θ
One GTSD for each θ = (0°, 45°, 90°, 135°) and d = (1, 2, ..)
Intensity in range [0,k-1]: Pd,θ[i,j] is a k x k matrix
Example image (gray levels 0, 1, 2):
2 1 2 0 1
0 2 1 1 2
0 1 2 2 0
1 2 2 0 1
2 0 1 0 1
P[i,j] for d=1 (row i, column j; 16 pairs in total):
0 2 2
2 1 2
2 3 2
Computing Pd,θ[i,j]
Count all pairs of pixels in which the first
pixel has value i and its matching pair
displaced by d=1 in the θ = 45° or 135° direction
has value j
Enter this count in the (i,j) position of Pd,θ[i,j]
E.g., there are 3 pairs [2,1], then P[2,1] = 3
Pd,θ[i,j] is not symmetric: in general Pd,θ[i,j] ≠ Pd,θ[j,i]
Normalize Pd,θ[i,j] by the total number of pairs
Pd,θ[i,j]: probability mass function
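The counting procedure can be sketched as follows. The function names are hypothetical, and the displacement is given as a (row, column) offset; using one pixel down and one pixel right for the diagonal direction is an assumption about the image coordinate convention. With that offset, the 5 x 5 example image of the GTSD Array slide yields 16 pairs, three of which are [2,1].

```python
def gtsd_array(image, k, dr, dc):
    """Count pixel pairs (first, second) where second is displaced by (dr, dc).

    image: list of rows with gray levels in 0..k-1. Returns a k x k count matrix.
    """
    rows, cols = len(image), len(image[0])
    counts = [[0] * k for _ in range(k)]
    for r in range(rows):
        for c in range(cols):
            r2, c2 = r + dr, c + dc
            if 0 <= r2 < rows and 0 <= c2 < cols:
                counts[image[r][c]][image[r2][c2]] += 1
    return counts

def normalize(counts):
    """Divide by the total number of pairs to obtain a probability mass function."""
    total = sum(sum(row) for row in counts)
    return [[v / total for v in row] for row in counts]

example = [
    [2, 1, 2, 0, 1],
    [0, 2, 1, 1, 2],
    [0, 1, 2, 2, 0],
    [1, 2, 2, 0, 1],
    [2, 0, 1, 0, 1],
]
P = gtsd_array(example, 3, 1, 1)  # [[0, 2, 2], [2, 1, 2], [2, 3, 2]]
```

Here P[2][1] = 3 while P[1][2] = 2, which illustrates that the array is not symmetric.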
Texture Features
1) Angular Second Moment (ASM):
$f_1 = \sum_{i=0}^{k-1}\sum_{j=0}^{k-1} p[i,j]^2$
Small values for non-homogeneous regions
2) Contrast:
$f_2 = \sum_{i=0}^{k-1}\sum_{j=0}^{k-1} (i - j)^2\, p[i,j]$
Large values for many large transitions or
for many transitions
Texture Features (cont'd)
3) Correlation:
$f_3 = \frac{\sum_{i=0}^{k-1}\sum_{j=0}^{k-1} i\,j\,p[i,j] - \mu_x \mu_y}{\sigma_x \sigma_y}$
$\mu_x = \sum_{i=0}^{k-1} i\,p_x[i]$, $\mu_y = \sum_{j=0}^{k-1} j\,p_y[j]$
$\sigma_x^2 = \sum_{i=0}^{k-1} (i - \mu_x)^2 p_x[i]$, $\sigma_y^2 = \sum_{j=0}^{k-1} (j - \mu_y)^2 p_y[j]$
$p_x[i] = \sum_{j=0}^{k-1} p[i,j]$, $p_y[j] = \sum_{i=0}^{k-1} p[i,j]$
Frequency of intensity transitions
Texture Features (cont'd)
4) Entropy:
$f_4 = -\sum_{i=0}^{k-1}\sum_{j=0}^{k-1} p[i,j] \log p[i,j]$
High values for uniform p[i,j] i.e., no preferred
gray-level (no texture)
A vector $T_{\theta,d} = (f_1, f_2, f_3, f_4)$ for each θ, d, or
a single vector for every θ, d taking all $T_{\theta,d}$ in a
sequence
Correlated features: apply the K-L (Karhunen-Loève) transform to decorrelate and to reduce dimensionality
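The four features can be computed from a normalized GTSD array as sketched below. The function name is hypothetical, the natural logarithm is an assumption (the slides do not state the base), and empty cells are skipped in the entropy sum, following the usual convention that 0 log 0 = 0.

```python
import math

def texture_features(p):
    """Return (ASM, contrast, correlation, entropy) of a k x k array p that sums to 1."""
    k = len(p)
    cells = [(i, j, p[i][j]) for i in range(k) for j in range(k)]
    asm = sum(v * v for _, _, v in cells)
    contrast = sum((i - j) ** 2 * v for i, j, v in cells)
    # Marginal distributions over rows (px) and columns (py).
    px = [sum(p[i][j] for j in range(k)) for i in range(k)]
    py = [sum(p[i][j] for i in range(k)) for j in range(k)]
    mu_x = sum(i * px[i] for i in range(k))
    mu_y = sum(j * py[j] for j in range(k))
    var_x = sum((i - mu_x) ** 2 * px[i] for i in range(k))
    var_y = sum((j - mu_y) ** 2 * py[j] for j in range(k))
    denom = math.sqrt(var_x * var_y)  # sigma_x * sigma_y
    if denom > 0:
        correlation = (sum(i * j * v for i, j, v in cells) - mu_x * mu_y) / denom
    else:
        correlation = 0.0  # constant image: correlation set to 0 by assumption
    entropy = -sum(v * math.log(v) for _, _, v in cells if v > 0)
    return asm, contrast, correlation, entropy
```

For a uniform 2 x 2 array (all entries 0.25) the entropy reaches its maximum of log 4 and the correlation is 0, while an array with all mass on the diagonal has contrast 0 and correlation 1.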
Shape
Assume that objects are extracted
Requires image segmentation
Difficult problem
Criteria for reliable shape recognition
Uniqueness of representation
Robustness against noise and distortion
Proportionality of variation
Invariance under scale, rotation and translation
Efficiency of computation
Occlusion: handle partially visible shapes
Shape Matching Methods
Two categories of methods based on:
Regions: represent and match properties of
regions
Contours: represent and match properties
of boundaries
Techniques: local/global, model based,
fuzzy, statistical, neural networks
Input/Output
For any two shapes, compute:
Their distance
The correspondences between similar parts
Petrakis 02
Moment Invariants
An object is represented by its binary
image R
A set of 7 features can be defined
based on central moments
$m_{pq} = \sum_{(x,y)\in R} x^p y^q$, $\bar{x} = \frac{m_{10}}{m_{00}}$, $\bar{y} = \frac{m_{01}}{m_{00}}$
$\mu_{pq} = \sum_{(x,y)\in R} (x - \bar{x})^p (y - \bar{y})^q$, p, q = 0,1,2...
Central Moments [Hu 62]
Invariant to translation and rotation
Use $\eta_{pq} = \mu_{pq}/\mu_{00}^{\gamma}$, where γ = (p+q)/2 + 1 for p+q = 2,3…,
instead of the μ's in the above formulas to achieve scale
invariance
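The moment definitions can be sketched as below, with a region given as a list of (x, y) pixel coordinates. All function names are hypothetical. Only the first two of the seven invariants are computed, in their standard form φ1 = η20 + η02 and φ2 = (η20 - η02)² + 4η11²; the remaining five are omitted.

```python
def raw_moment(region, p, q):
    """m_pq: sum of x^p * y^q over the pixels (x, y) of the region."""
    return sum((x ** p) * (y ** q) for x, y in region)

def central_moment(region, p, q):
    """mu_pq: moment about the centre of gravity (translation invariant)."""
    m00 = raw_moment(region, 0, 0)
    xb = raw_moment(region, 1, 0) / m00
    yb = raw_moment(region, 0, 1) / m00
    return sum(((x - xb) ** p) * ((y - yb) ** q) for x, y in region)

def normalized_moment(region, p, q):
    """eta_pq = mu_pq / mu_00^gamma with gamma = (p + q) / 2 + 1."""
    gamma = (p + q) / 2 + 1
    return central_moment(region, p, q) / central_moment(region, 0, 0) ** gamma

def first_two_invariants(region):
    """First two moment invariants computed from normalized central moments."""
    n20 = normalized_moment(region, 2, 0)
    n02 = normalized_moment(region, 0, 2)
    n11 = normalized_moment(region, 1, 1)
    return n20 + n02, (n20 - n02) ** 2 + 4 * n11 ** 2
```

Translating the region or rotating it by 90 degrees leaves both values unchanged. On a pixel grid, scale invariance holds only approximately, because a resampled region is not an exact scaled copy.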
More Shape Methods
Moments can also be defined on the closed bounding
contours of objects [Gupta and Srinath 87]
Moments can also be defined for open curves [Koch
and Kashyap 89]
Methods based on the Fourier Transform of the
bounding contour have also been used [Wallace and
Wintz 80, Rauber and Steiger 92]
More efficient methods have also been proposed
[Petrakis, Diplaros and Milios 2002]. That paper examines many
of the above methods based on Fourier descriptors and moments
and reports many experiments and comparisons
Spatial Relationships
Find images showing similar objects in similar
spatial relationships
find X-rays similar to Smith’s examination
find images showing a tree close to a house
one of the two images may contain extra objects
[Figure: query image Q and database image I. Petrakis 02]
Methods
Two main categories of methods
Spatial projections (2D strings and variants
like 2D C strings, Expanded 2D strings etc).
Attributed Relational Graphs (ARGs)
Image distance is defined accordingly
Editing distance on ARGs
2D string matching
Image Segmentation
All methods assume segmented images
images are segmented manually or semi-manually
image segmentation is a difficult problem
Petrakis02
Image Features
Individual objects: 5-dimensional vectors
Size: number of pixels in a region
Perimeter: length of bounding contour
Roundness: ratio of smallest/largest second
moment
Orientation: angle with x direction (sin,cos)
Spatial Relationships: 4-dimensional vectors
Position: inside or outside
Distance: minimum distance of contours
Orientation: angle with x (sin,cos) of c.g.’s
Attributed Relational Graphs
(ARGs)
Objects are
represented by nodes
Relationships are
represented by arcs
Nodes and arcs are
labeled by feature
vectors
Matching: ARG editing
distance, Hungarian
[Petrakis 02]
ARG Editing Distance
Matching: sequence of edit operations
that transform a query Q to an image I
Edit operation: node or arc insertion,
deletion or substitution
$D(Q,I) = \min_{S(I)} F(S(I)) = \min_{S(I)} F\big(f(e_k), f(e_{k-1}), \ldots, f(e_1)\big)$
where $e_1, \ldots, e_k$ are the edit operations of the sequence S(I)
F combines the costs of edit operations
f is the cost of an edit operation defined
as a vector distance
Matching Algorithm [Messmer95]
Find the sequence of edit operations
that yield the minimum total cost
Formulated as tree search problem
Expand all possible matching sequences
Branch and bound
Tree node: matching of ARG node
Tree arc: matching of ARG edges
Subtree: matching of subgraphs of Q, I
[Figure: query Q and model I]
Hungarian Method [Petrakis 02]
Matching: assignment problem
The relationships are ignored
F: a mapping of the objects of Q to objects of I
C(i,F(i)): vector distance
$D_F(Q,I) = \min_F \sum_{i=1}^{n} C(i, F(i))$: cost of the best mapping
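The assignment formulation can be illustrated on tiny inputs with an exhaustive search. This is not the Hungarian algorithm: it enumerates every mapping F and therefore takes factorial time, whereas the Hungarian method reaches the same minimum in polynomial time. The function names and the use of Euclidean distance for C are assumptions.

```python
import math
from itertools import permutations

def vector_distance(a, b):
    """Euclidean distance between two object feature vectors."""
    return math.sqrt(sum((ai - bi) ** 2 for ai, bi in zip(a, b)))

def assignment_distance(query_objs, image_objs):
    """Minimum total cost of mapping each query object to a distinct image object.

    Returns (cost, mapping) where mapping[i] is the image object assigned to
    query object i. Brute force: intended for a handful of objects only.
    """
    n, m = len(query_objs), len(image_objs)
    if n > m:
        raise ValueError("the image must contain at least as many objects as the query")
    best_cost, best_map = math.inf, None
    for perm in permutations(range(m), n):
        cost = sum(vector_distance(query_objs[i], image_objs[perm[i]])
                   for i in range(n))
        if cost < best_cost:
            best_cost, best_map = cost, perm
    return best_cost, best_map
```

Because only object feature vectors enter the cost, the spatial relationships between objects play no role, as noted above. Extra objects in the image are simply left unmatched.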

2D String [Chang 87]
2D string: projections of c.g.’s along x and y
Each object is represented by a name or class
Matching: string matching (type 0,1 and 2)
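Building a 2D string from object centres of gravity can be sketched as below. This is a simplified illustration: the function names are hypothetical, only the operators "<" (precedes along the axis) and "=" (same projection) are used, and the type-0, 1 and 2 matching procedures are not implemented.

```python
def projection_string(objects, axis):
    """Order object names along one axis (0 for x, 1 for y)."""
    if not objects:
        return ""
    # Sort by coordinate, breaking ties by name for a deterministic result.
    ordered = sorted(objects, key=lambda o: (o[1][axis], o[0]))
    parts = [ordered[0][0]]
    for prev, cur in zip(ordered, ordered[1:]):
        parts.append("=" if prev[1][axis] == cur[1][axis] else "<")
        parts.append(cur[0])
    return " ".join(parts)

def two_d_string(objects):
    """Return the (u, v) pair of projection strings along x and y.

    objects: list of (name, (x, y)) with the centre of gravity of each object.
    """
    return projection_string(objects, 0), projection_string(objects, 1)
```

For a house at (2, 1), a tree at (5, 1) and a sun at (4, 6), the x projection is "house < sun < tree" and the y projection is "house = tree < sun".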
Discussion [Petrakis 02]
The ARG editing distance is the most
accurate method followed by Hungarian and
2D strings
2D strings is the fastest method followed by
Hungarian and ARG distance
Speed and accuracy are traded off: the more
accurate a method, the slower it is
Indexing: Petrakis 2002, Petrakis & Faloutsos
97 (ARGs), Petrakis 93 (2D strings)
http://www.ced.tuc.gr/~petrakis/publications/publications.htm
Image Segmentation
All methods assumed segmented images
Segmentation is the process of partitioning an
image into groups of connected pixels
(regions) with similar properties
Gray levels
Colors
Textures
Motion characteristics (motion vectors)
Edge continuity
Segmentation Methods
Two approaches
Region segmentation
Edge segmentation
Regions may correspond to objects
Not always perfect (noise, bad
illumination, 3D world etc.)
Further reading: "Machine Vision", R.
Jain, R. Kasturi, B. G. Schunck,
McGraw-Hill, 1995
Region Segmentation
Converts a gray-level image into a
binary one by applying carefully
selected thresholds on intensity
histograms
The image is partitioned into two sets
Black pixels: objects
White pixels: background
Histogram Thresholding
The threshold distinguishes the objects from
the background
The objects have similar gray-level values
Thresholding
Find thresholds automatically by analyzing
the gray value distribution (histogram) of the
image
Objects are dark against a light background
Their gray-value distributions can be separated
putting thresholds between them
Automatic thresholding is based on
peakiness and valleyness measurements at
each point of the histogram
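A much simplified stand-in for peak and valley analysis is sketched below. It is not the peakiness and valleyness measure itself: it takes the tallest histogram bin as the first peak, picks a second peak by weighting bin height with squared distance from the first, and places the threshold at the lowest bin between them. All names are hypothetical, and pixels at or below the threshold are marked 1 (dark objects) and the rest 0 (light background).

```python
def gray_histogram(image, levels=256):
    """Count how many pixels have each gray value in 0..levels-1."""
    hist = [0] * levels
    for row in image:
        for v in row:
            hist[v] += 1
    return hist

def valley_threshold(hist):
    """Threshold at the deepest valley between the two dominant peaks."""
    levels = range(len(hist))
    p1 = max(levels, key=lambda g: hist[g])
    # Favour tall bins that lie far from the first peak.
    p2 = max(levels, key=lambda g: hist[g] * (g - p1) ** 2)
    lo, hi = min(p1, p2), max(p1, p2)
    return min(range(lo, hi + 1), key=lambda g: hist[g])

def binarize(image, threshold):
    """1 for object pixels (dark, at or below threshold), 0 for background."""
    return [[1 if v <= threshold else 0 for v in row] for row in image]
```

The sketch assumes a clearly bimodal histogram; with noise or uneven illumination the two peaks overlap and a single global threshold no longer separates objects from background.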
Further Reading
C. Faloutsos et al., “Efficient and Effective Querying by Image Content”,
Journal of Intelligent Information Systems, Vol. 3, No. 3/4, pp. 231-262,
1994
M. Flickner et al., “Query by Image and Video Content: the QBIC
System”, IEEE Computer, Vol. 28, No. 9, pp. 13-32, Sept. 1995
R.C. Veltkamp and M. Tanase, “Content-Based Image Retrieval Systems: A
Survey”, TR UU-CS-2000-34, Utrecht University, March 2001
A.W.M. Smeulders et al., “Content Based Image Retrieval at the End of
the Early Years”, IEEE Transactions on PAMI, 22(12): 1349-1380, 2000
R. Schettini et al., “A Survey on Methods for Colour Image Indexing and
Retrieval in Image Databases”, in: R. Luo and L. MacDonald (Eds.), Color
Imaging Science: Exploiting Digital Media, John Wiley, 2001
M. Swain, D.H. Ballard, “Color Indexing”, Intern. Journ. of Comp. Vision,
Vol. 7, No. 1, pp. 11-32, 1991
D. Androutsos et al., “A Novel Vector-Based Approach to Color Image
Retrieval Using a Vector Angular-Based Distance Measure”, Comp.
Vision and Image Understanding, Vol. 75, No. 1/2, July/Aug. 1999, pp. 46-58.
References
J.R. Smith, S-F. Chang, “Tools and Techniques for Color Image Retrieval”,
IS&T/SPIE Proc., Vol. 2670, Storage and Retrieval for Image and Video
Databases IV
R.M. Haralick, K. Shanmugam, I. Dinstein, “Textural Features for Image
Classification”, IEEE Trans. on Systems Man and Cybernetics, 1973, pp. 610-621.
E.G.M. Petrakis, A. Diplaros and E. Milios, “Matching and Retrieval of Distorted
and Occluded Shapes using Dynamic Programming”, IEEE Trans. on PAMI, Vol.
24, No. 11, Nov. 2002, pp. 1-16.
M.-K. Hu, “Visual Pattern Recogn. by Moment Invariants”, IRE Trans. on Info.
Theory, IT-8:179–187, 1962.
T. P. Wallace and P. A. Wintz, “An Efficient Three-Dimensional Aircraft
Recognition Algorithm Using Normalized Fourier Descriptors”, Computer Graphics
and Image Processing, 13:99–126, 1980.
T.W. Rauber and A.S. Steiger-Carcao, “Shape Description by UNL Fourier
Features – An Application to Handwritten Character Recognition”, 11th IAPR
Intern. Conf. on Pattern Recogn., 30.Aug.-3.Sept. 1992, The Hague, The
Netherlands.
References
L. Gupta and M.D. Srinath, “Contour Sequence Moments for the Classification
of Closed Planar Shapes”, Pattern Recognition, Vol. 20, No. 3, pp. 267-272, 1987
M.W. Koch and R.L. Kashyap, “Matching Polygon Fragments”, Pattern Recognition
Letters, No. 10, pp. 297-308, 1989.
Euripides G.M. Petrakis, “Fast Retrieval by Spatial Structure in Image
DataBases”, Journal of Visual Languages and Computing, Vol. 13, No. 5, October
2002, pp. 545-569.
Euripides G.M. Petrakis, “Design and Evaluation of Spatial Similarity Approaches
for Image Retrieval”, Image and Vision Computing, January 2002, Number 1,
Volume 20, pp. 59-76.
Euripides G.M. Petrakis and Christos Faloutsos, “Similarity Searching in Medical
Image Databases”, IEEE Transactions on Knowledge and Data Engineering, Vol. 9,
No. 3, pp. 435-447, May/June 1997.