Computer Vision, Part 1
Topics for Vision Lectures
1. Content-Based Image Retrieval (CBIR)
2. Object recognition and scene “understanding”
Content-Based Image Retrieval
Example: Google “Search by Image”
Basic technique
[Figure: CBIR pipeline. Components: Query Image, Image Database, Extract Features (Primitives), Features Database, Similarity Measure, Matched Results, Relevance Feedback Algorithm]
From http://www.amrita.edu/cde/downloads/ACBIR.ppt
Each image in the database is represented by a feature vector: x1, x2, …, xN, where xi = (xi1, xi2, …, xim)
Query is represented in terms of same features: Q =(Q1, Q2, …, Qm)
Goal: Find stored image with vector xi most similar to query vector Q
• Distance measure:
$x_i \cdot Q = x_{i1} Q_1 + x_{i2} Q_2 + \dots + x_{im} Q_m$
• Possible distance measures d(Q, xi):

– Inner (dot) product
– Histogram distance (for histogram features)
– Graph matching (for shape features)
– ...
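The retrieval goal above (find the stored vector xi most similar to Q) can be sketched in a few lines. This is a minimal illustration, not the method of any particular CBIR system: the function names and the feature values are made up, and Euclidean distance is used as one possible choice of d(Q, xi). Note that the dot product is a similarity (larger means more similar), while a distance is smaller for more similar images.

```python
import math

def dot(q, x):
    """Inner product of two equal-length feature vectors (larger = more similar)."""
    return sum(qj * xj for qj, xj in zip(q, x))

def euclidean(q, x):
    """Euclidean distance between two feature vectors (smaller = more similar)."""
    return math.sqrt(sum((qj - xj) ** 2 for qj, xj in zip(q, x)))

def rank_by_distance(query, database, distance=euclidean):
    """Return database indices ordered from most to least similar to the query."""
    return sorted(range(len(database)), key=lambda i: distance(query, database[i]))

# Hypothetical 3-dimensional feature vectors for four stored images.
database = [
    (0.9, 0.1, 0.0),
    (0.2, 0.7, 0.1),
    (0.1, 0.1, 0.8),
    (0.7, 0.3, 0.0),
]
query = (0.85, 0.15, 0.0)

ranking = rank_by_distance(query, database)
best_by_dot = max(range(len(database)), key=lambda i: dot(query, database[i]))
print(ranking, best_by_dot)
```

A linear scan like this compares the query with every stored vector, which is why speed and indexing show up as design issues for large databases.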
Some issues in designing a CBIR system
• Query format, ease of querying
• Speed
• Crawling, preprocessing
• Interactivity, user relevance feedback
• Visual features — which to use? How to combine?
• Curse of dimensionality
• Indexing
• Evaluation of performance
Types of Features Typically Used
• Intensities
• Color
• Texture
• Shape
• Layout
http://www.clear.rice.edu/elec301/Projects02/artSpy/intensity.html
Intensity histograms
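As a rough sketch of what an intensity histogram feature looks like in code, assuming a grayscale image stored as a list of rows of integers in 0..255. The function name, the number of bins, and the tiny example image are illustrative choices, not taken from the linked project.

```python
def intensity_histogram(image, bins=8, max_value=255):
    """Normalized histogram of gray levels.

    `image` is a list of rows of integers in [0, max_value].
    Returns `bins` fractions that sum to 1.
    """
    counts = [0] * bins
    total = 0
    for row in image:
        for value in row:
            # Map the gray level to one of `bins` equal-width ranges.
            counts[value * bins // (max_value + 1)] += 1
            total += 1
    return [c / total for c in counts]

# Tiny hypothetical image: half dark pixels, half bright pixels.
image = [
    [0, 10, 200, 255],
    [30, 40, 220, 250],
]
hist = intensity_histogram(image, bins=4)
print(hist)
```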
Color Features
Hue, saturation, value
Color Histograms (8 colors)
http://www.owlnet.rice.edu/~elec301/Projects02/artSpy/patmac/mcolhist.gif
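One simple way to get an 8-color histogram is to keep one bit per RGB channel. That quantization is an assumption of this sketch; the linked project may bin colors differently. The L1 distance shown is likewise just one common choice of histogram distance, and all names here are illustrative.

```python
def color_histogram_8(pixels):
    """Normalized 8-bin color histogram.

    `pixels` is an iterable of (r, g, b) tuples with channels in 0..255.
    Each channel is reduced to one bit (upper or lower half of its range),
    giving 2 * 2 * 2 = 8 possible colors.
    """
    counts = [0] * 8
    n = 0
    for r, g, b in pixels:
        index = ((r >= 128) << 2) | ((g >= 128) << 1) | (b >= 128)
        counts[index] += 1
        n += 1
    return [c / n for c in counts]

def l1_distance(h1, h2):
    """Sum of absolute bin differences between two histograms."""
    return sum(abs(a - b) for a, b in zip(h1, h2))

# Two hypothetical 4-pixel images.
mostly_red = [(255, 0, 0)] * 3 + [(0, 0, 255)]
mostly_blue = [(255, 0, 0)] + [(0, 0, 255)] * 3

print(l1_distance(color_histogram_8(mostly_red), color_histogram_8(mostly_blue)))
```

Because the histogram only counts pixels per color, any rearrangement of the same pixels gives distance 0.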
Color auto-correlogram
• Pick any pixel p1 of color Ci in the image I.
• At distance k away from p1 pick another pixel p2.
• What is the probability that p2 is also of color Ci?
[Figure: image I with a red pixel p1 and a second pixel p2 at distance k. Is p2 also red?]
From: http://www.cse.ucsc.edu/classes/ee264/Winter02/xgfeng.ppt
• The auto-correlogram of image I for color Ci , distance k:
$\gamma_{C_i}^{(k)}(I) \equiv \Pr\left[\, |p_1 - p_2| = k,\ p_2 \in I_{C_i} \mid p_1 \in I_{C_i} \,\right]$
• Integrates both color information and space information.
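The definition above can be estimated by brute-force counting. In this sketch the image is a grid of color labels, and the distance |p1 - p2| is taken to be the L-infinity (chessboard) distance. That choice, the function name, and the two toy images are assumptions made for illustration.

```python
def auto_correlogram(image, color, k):
    """Estimate Pr[p2 has `color` | p1 has `color`, |p1 - p2| = k].

    `image` is a list of rows of color labels (e.g. strings of letters).
    Distance is the L-infinity norm: max(|dx|, |dy|) == k.
    Pixels p2 that fall outside the image are not counted.
    """
    height, width = len(image), len(image[0])
    same = total = 0
    for y1 in range(height):
        for x1 in range(width):
            if image[y1][x1] != color:
                continue
            for dy in range(-k, k + 1):
                for dx in range(-k, k + 1):
                    if max(abs(dx), abs(dy)) != k:
                        continue  # keep only pixels exactly k away
                    y2, x2 = y1 + dy, x1 + dx
                    if 0 <= y2 < height and 0 <= x2 < width:
                        total += 1
                        same += image[y2][x2] == color
    return same / total if total else 0.0

# Two toy images with identical color histograms (8 R, 8 B) but different layouts.
blocks = ["RRBB", "RRBB", "RRBB", "RRBB"]
checker = ["RBRB", "BRBR", "RBRB", "BRBR"]

print(auto_correlogram(blocks, "R", 1), auto_correlogram(checker, "R", 1))
```

The two layouts have the same color histogram but clearly different values at k = 1, which is the spatial information the correlogram adds.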
Two images with their autocorrelograms. Note that the
change in spatial layout would be ignored by color
histograms, but causes a significant difference in the
autocorrelograms.
From http://www.cs.cornell.edu/rdz/Papers/ecdl2/spatial.htm
Color histogram rank: 411; Auto-correlogram rank: 1
Color histogram rank: 310; Auto-correlogram rank: 5
Color histogram rank: 367; Auto-correlogram rank: 1
Texture representations
• Gray-level co-occurrence
• Entropy
• Contrast
• Fourier and wavelet transforms
• Gabor filters
Texture Representations
Each image has the same intensity distribution, but different textures
Can use auto-correlogram based on intensity (“gray-level co-occurrence”)
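A gray-level co-occurrence matrix can be sketched as follows, assuming the image has already been quantized to a small number of gray levels. The function names, the offset convention (dx, dy), and the contrast statistic (a common summary computed from such a matrix) are illustrative choices, not taken from the slides.

```python
def cooccurrence_matrix(image, levels, dx=1, dy=0):
    """Count how often gray level a has gray level b at offset (dx, dy).

    `image` is a list of rows of integers in range(levels).
    Entry [a][b] is the number of pixel pairs (p, p + offset) with levels (a, b).
    """
    height, width = len(image), len(image[0])
    matrix = [[0] * levels for _ in range(levels)]
    for y in range(height):
        for x in range(width):
            y2, x2 = y + dy, x + dx
            if 0 <= y2 < height and 0 <= x2 < width:
                matrix[image[y][x]][image[y2][x2]] += 1
    return matrix

def glcm_contrast(matrix):
    """Average squared gray-level difference over all counted pairs."""
    total = sum(sum(row) for row in matrix)
    weighted = sum(
        (a - b) ** 2 * count
        for a, row in enumerate(matrix)
        for b, count in enumerate(row)
    )
    return weighted / total if total else 0.0

# Same intensity distribution (four 0s, four 1s), different textures.
stripes = [[0, 1, 0, 1], [0, 1, 0, 1]]
halves = [[0, 0, 1, 1], [0, 0, 1, 1]]

print(cooccurrence_matrix(stripes, 2), cooccurrence_matrix(halves, 2))
```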
Texture from entropy
Images filtered by entropy:
Each output pixel contains entropy value of 9x9 neighborhood
around original pixel
From: http://www.siim2011.org/abstracts/advanced_visualization_tools_ss_pao.html
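An entropy filter of the kind described above can be sketched like this. The border handling (windows are clipped at the image edge) and the function name are assumptions; the cited source does not say how borders were treated.

```python
import math
from collections import Counter

def local_entropy(image, size=9):
    """Shannon entropy (in bits) of gray levels in a size x size neighborhood.

    `image` is a list of rows of integers. Each output pixel holds the entropy
    of the window centered on the corresponding input pixel. Windows are
    clipped at the image border, so edge pixels use fewer samples.
    """
    height, width = len(image), len(image[0])
    half = size // 2
    output = [[0.0] * width for _ in range(height)]
    for y in range(height):
        for x in range(width):
            window = [
                image[yy][xx]
                for yy in range(max(0, y - half), min(height, y + half + 1))
                for xx in range(max(0, x - half), min(width, x + half + 1))
            ]
            n = len(window)
            output[y][x] = -sum(
                (c / n) * math.log2(c / n) for c in Counter(window).values()
            )
    return output

# A flat region has zero entropy; a region mixing two levels equally has 1 bit.
flat = [[7, 7, 7], [7, 7, 7], [7, 7, 7]]
mixed = [[0, 1], [1, 0]]
print(local_entropy(flat, size=3)[1][1], local_entropy(mixed, size=3)[0][0])
```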
Texture from fractal dimension
From: http://www.cs.washington.edu/homes/rahul/data/iccv07.pdf
Texture from Contrast
• Example: http://www.clear.rice.edu/elec301/Projects02/artSpy/graininess.html
Texture from Wavelets
http://www.clear.rice.edu/elec301/Projects02/artSpy/dwt.html
Shape representations
Some of these need segmentation (another whole story!)
• Area, eccentricity, major axis orientation
• Skeletons, shock graphs
• Fourier transformation of boundary
• Histograms of edge orientations
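For the first item in the list (area, eccentricity, major axis orientation), here is a minimal sketch based on second-order central moments. It assumes segmentation has already produced a binary mask. The function name is illustrative, and the eccentricity is that of the ellipse with the same second moments as the region, which is one common definition.

```python
import math

def region_shape_features(mask):
    """Area, eccentricity and major-axis orientation of a binary region.

    `mask` is a list of rows of 0/1 values with at least one foreground pixel.
    Orientation is in radians, measured from the x axis in image coordinates
    (y grows downward).
    """
    points = [(x, y) for y, row in enumerate(mask) for x, v in enumerate(row) if v]
    area = len(points)
    cx = sum(x for x, _ in points) / area
    cy = sum(y for _, y in points) / area
    # Second-order central moments (the covariance of the pixel coordinates).
    mu20 = sum((x - cx) ** 2 for x, _ in points) / area
    mu02 = sum((y - cy) ** 2 for _, y in points) / area
    mu11 = sum((x - cx) * (y - cy) for x, y in points) / area
    # Eigenvalues of the covariance matrix: variance along major and minor axes.
    common = math.sqrt((mu20 - mu02) ** 2 + 4 * mu11 ** 2)
    lam_major = (mu20 + mu02 + common) / 2
    lam_minor = max(0.0, (mu20 + mu02 - common) / 2)
    eccentricity = math.sqrt(1 - lam_minor / lam_major) if lam_major > 0 else 0.0
    orientation = 0.5 * math.atan2(2 * mu11, mu20 - mu02)
    return area, eccentricity, orientation

bar = [[1, 1, 1, 1, 1]]        # elongated region
square = [[1, 1], [1, 1]]      # compact region
print(region_shape_features(bar), region_shape_features(square))
```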
From: http://www.lems.brown.edu/vision/researchAreas/ShockMatching/shock-ed-match-results1.gif
Histogram of edge orientations
From: http://www.cs.ucl.ac.uk/staff/k.jacobs/teaching/prmv/Edge_histogramming.jpg
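A histogram of edge orientations can be sketched without any segmentation. This version uses central differences for the gradient and a fixed magnitude threshold; both are simplifying assumptions (a Sobel operator or a full edge detector would be more typical), and the function name is illustrative. The angle recorded is the gradient direction, which is perpendicular to the edge itself.

```python
import math

def edge_orientation_histogram(image, bins=4, threshold=1.0):
    """Normalized histogram of gradient orientations at strong-edge pixels.

    `image` is a list of rows of numbers. Orientations are taken modulo
    180 degrees, so opposite gradient directions share a bin. Border pixels
    are skipped because central differences need both neighbors.
    """
    height, width = len(image), len(image[0])
    counts = [0] * bins
    for y in range(1, height - 1):
        for x in range(1, width - 1):
            gx = (image[y][x + 1] - image[y][x - 1]) / 2
            gy = (image[y + 1][x] - image[y - 1][x]) / 2
            if math.hypot(gx, gy) < threshold:
                continue  # too weak to count as an edge
            angle = math.degrees(math.atan2(gy, gx)) % 180.0
            counts[int(angle / (180.0 / bins)) % bins] += 1
    total = sum(counts)
    return [c / total for c in counts] if total else counts

# A vertical step edge and a horizontal step edge.
vertical_edge = [[0, 0, 10, 10] for _ in range(4)]
horizontal_edge = [[0] * 4, [0] * 4, [10] * 4, [10] * 4]
print(edge_orientation_histogram(vertical_edge, bins=3))
print(edge_orientation_histogram(horizontal_edge, bins=3))
```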
Visual abilities largely missing from current CBIR systems
• Object recognition
• Perceptual organization
• Similarity between semantic concepts
Examples of “semantic” similarity
[Three slides, each showing a pair of images labeled Image 1 and Image 2; images not reproduced]
“In general, current systems have not yet had
significant impact on society due to an
inability to bridge the semantic gap between
computers and humans.”
Image Understanding and Analogy-Making
Bongard problems as an idealized domain
for exploring the “semantic gap”