Learning with Ambiguity in Computer Vision


Boris Babenko, Steve Branson, Serge Belongie
University of California, San Diego
ICCV 2009, Kyoto, Japan
• Recognizing multiple categories
– Need meaningful similarity metric / feature space
• Idea: use training data to learn metric, plug into kNN
  – Goes by many names:
    • metric learning
    • cue combination/weighting
    • kernel combination/learning
    • feature selection
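A minimal sketch (not from the talk) of plugging a learned metric into kNN; the `similarity` function and the array names are placeholders for whatever metric gets learned:

```python
import numpy as np

def knn_predict(query, train_feats, train_labels, similarity, k=5):
    """Classify `query` by majority vote over its k most similar training
    examples, using a learned pairwise similarity function."""
    # Score the query against every labeled training image.
    scores = np.array([similarity(query, x) for x in train_feats])
    # Keep the k most similar neighbors (largest similarity scores).
    neighbors = np.argsort(-scores)[:k]
    votes, counts = np.unique(train_labels[neighbors], return_counts=True)
    # Majority vote over the neighbors' category labels.
    return votes[np.argmax(counts)]
```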
• Learn a single global similarity metric
  [Jones et al. '03, Chopra et al. '05, Goldberger et al. '05, Shakhnarovich et al. '05, Torralba et al. '08]
  [Figure: "Monolithic" — one similarity metric compares the query image against the labeled dataset across Categories 1–4]
• Learn a similarity metric for each category (1-vs-all)
  [Varma et al. '07, Frome et al. '07, Weinberger et al. '08, Nilsback et al. '08]
  [Figure: "Category Specific" — one similarity metric per category (Categories 1–4) compares the query image against the labeled dataset]
• Per category:
– More powerful
– Do we really need thousands of metrics?
– Have to train for new categories
• Global/Monolithic:
– Less powerful
– Can generalize to new categories
• Would like to explore the space between these two extremes
• Idea:
– Group categories together
– Learn a few similarity metrics, one for each supercategory
• Learn a few good similarity metrics
  [Figure: MuSL sits between the "Category Specific" and "Monolithic" extremes — a few similarity metrics shared across Categories 1–4 compare the query image against the labeled dataset]
• Need some framework to work with…
• Boosting has many advantages:
– Feature selection
– Easy implementation
– Performs well
• Can treat metric learning as binary classification
• Training data: images with category labels
• Generate pairs:
  – Positive pairs (two images from the same category), labeled 1
  – Sampled negative pairs (two images from different categories), labeled 0
• Train a similarity metric/classifier on the labeled pairs
• Choose the image representation to be binary, i.e. the similarity metric is an L1 distance over binary vectors
  – Can pre-compute for training data
  – Efficient to compute (XOR and sum)
• For convenience, follow the binary-code formulation of [Shakhnarovich et al. '05, Fergus et al. '08]
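A rough sketch, under assumptions of my own, of the pair generation and of the L1-over-binary-vectors similarity (an XOR and a sum); `make_pairs`, `binary_similarity`, and the negative-sampling scheme are illustrative, not the talk's exact recipe:

```python
import numpy as np

def make_pairs(labels, rng=np.random.default_rng(0)):
    """Build (i, j, target) training pairs: every same-category pair gets
    target 1; for each one, a different-category pair is sampled as target 0."""
    pairs, n = [], len(labels)
    for i in range(n):
        for j in range(i + 1, n):
            if labels[i] == labels[j]:
                pairs.append((i, j, 1))
                k = rng.integers(n)          # sample a candidate negative partner
                if labels[k] != labels[i]:
                    pairs.append((i, k, 0))
    return pairs

def binary_similarity(code_a, code_b):
    """Similarity = negative L1 (Hamming) distance between binary codes,
    computed with an element-wise XOR followed by a sum."""
    return -np.count_nonzero(code_a ^ code_b)
```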
• Given some objective function
• Boosting = gradient ascent in function space
• Gradient = example weights for boosting
  [Figure: function-space view — the current strong classifier, the chosen weak classifier, and the other candidate weak classifiers]
  [Friedman '01, Mason et al. '00]
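A small illustration of the functional-gradient view: the weight on each training pair is the gradient of the objective with respect to the current strong classifier's score on that pair. The sigmoid log-likelihood here is an assumed objective, chosen only to make the idea concrete:

```python
import numpy as np

def pair_weights(strong_scores, targets):
    """Boosting as gradient ascent in function space: per-pair weights are the
    gradient of an assumed sigmoid log-likelihood objective with respect to the
    current strong classifier's score on each pair."""
    predicted = 1.0 / (1.0 + np.exp(-strong_scores))
    # Large-magnitude entries mark the pairs the next weak learner should focus on.
    return targets - predicted
```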
• Goal: train K similarity metrics and a mapping from categories to metrics
  [Figure: Categories 1, 2, 3, …, C, each mapped to one of the learned metrics]
• At runtime, recover the assignment for each category:
  – To compute the similarity of a query image to an image of category c, use the metric assigned to c
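A tiny sketch of the runtime rule above, assuming `metrics` holds the K learned similarity functions and `assign` maps each category to the index of its metric (both names hypothetical):

```python
def query_similarity(query, labeled_image, labeled_category, metrics, assign):
    """Compare a query image against a labeled image using the similarity
    metric assigned to that image's category."""
    k = assign[labeled_category]            # which of the K learned metrics to use
    return metrics[k](query, labeled_image)
```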
• Run pre-processing to group categories (e.g. k-means; a sketch follows this list), then train as usual
• Drawbacks:
– Hacky / not elegant
– Not optimal: pre-processing not informed by class
confusions, etc.
• How can we train & group simultaneously?
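For reference, a minimal sketch of the k-means pre-processing baseline; representing each category by its mean feature vector before clustering is my own choice here, not something the talk specifies:

```python
import numpy as np
from sklearn.cluster import KMeans

def group_categories(features, labels, K):
    """Pre-processing baseline: cluster categories into K groups with k-means
    (each category summarized by its mean feature vector), then train one
    similarity metric per group as usual."""
    cats = np.unique(labels)
    means = np.array([features[labels == c].mean(axis=0) for c in cats])
    groups = KMeans(n_clusters=K, n_init=10, random_state=0).fit_predict(means)
    return {c: int(g) for c, g in zip(cats, groups)}
```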
• Definitions:
  – a sigmoid function with a scalar parameter
  – a measure of how well classifier k works with category c
• Objective function:
  – each category is "assigned" to the classifier that works best for it
  – replace the max with a differentiable approximation, controlled by a scalar parameter
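One common differentiable stand-in for the max (an assumption here; the talk does not spell out its exact approximation) is a scaled log-sum-exp, which approaches the true max as the scalar parameter grows:

```python
import numpy as np

def soft_max_approx(values, sigma=10.0):
    """Differentiable approximation of max(values) via a scaled log-sum-exp.
    As the scalar parameter sigma grows, this approaches the true max."""
    values = np.asarray(values, dtype=float)
    m = values.max()                       # subtract the max for numerical stability
    return m + np.log(np.exp(sigma * (values - m)).sum()) / sigma
```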
• Each training pair has K weights, one per classifier
• Intuition: each weight is the product of
  – an approximation of the category-to-classifier assignment
  – the difficulty of the pair (like regular boosting)
[Plots: pair weights w1i, w2i, w3i over boosting iterations, for a difficult pair assigned to one classifier and an easy pair assigned to another]
for t = 1 to T boosting iterations
  for k = 1 to K classifiers
    - Compute weights for all pairs
    - Train weak learner k on the weighted pairs
  end
end
Assign each category to the classifier that works best for it
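A sketch of the loop above under assumptions of my own: the per-category fit measure, the softmax assignment, and the weak-learner interface `train_weak(pairs, weights)` are illustrative stand-ins rather than the authors' exact formulation:

```python
import numpy as np

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

def train_musl(pairs, pair_cat, K, T, train_weak, sigma=10.0):
    """T boosting rounds over K metrics. `pairs` is a list of (x_i, x_j, target);
    `pair_cat` is a NumPy array giving the category each pair is associated with;
    `train_weak(pairs, weights)` fits and returns one weak classifier h(x_i, x_j)."""
    n = len(pairs)
    targets = np.array([t for _, _, t in pairs], dtype=float)
    cats = np.unique(pair_cat)
    cat_idx = {c: i for i, c in enumerate(cats)}
    scores = np.zeros((K, n))            # current strong-classifier scores on every pair
    metrics = [[] for _ in range(K)]     # weak classifiers accumulated per metric

    def category_fit(s):
        # Proxy for "how well metric k works with category c": the mean sigmoid
        # log-likelihood of that category's pairs under metric k (an assumption).
        p = sigmoid(s)
        ll = targets * np.log(p + 1e-12) + (1 - targets) * np.log(1 - p + 1e-12)
        return np.array([[ll[k][pair_cat == c].mean() for k in range(K)] for c in cats])

    for _ in range(T):
        fit = category_fit(scores)
        # Soft assignment of each category to each metric (softmax over metrics).
        soft = np.exp(sigma * (fit - fit.max(axis=1, keepdims=True)))
        soft /= soft.sum(axis=1, keepdims=True)
        probs = sigmoid(scores)
        for k in range(K):
            # Pair weight = soft assignment of the pair's category to metric k
            # times the pair's difficulty (as in regular boosting).
            assign_w = np.array([soft[cat_idx[c], k] for c in pair_cat])
            difficulty = np.abs(targets - probs[k])
            weak = train_weak(pairs, assign_w * difficulty)
            metrics[k].append(weak)
            scores[k] += np.array([weak(xi, xj) for xi, xj, _ in pairs])

    # Finally, assign each category to the metric that fits it best.
    final_fit = category_fit(scores)
    assign = {c: int(np.argmax(final_fit[cat_idx[c]])) for c in cats}
    return metrics, assign
```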
• Created dataset with hierarchical structure of categories
  – Merged categories from Caltech 101 [Griffin et al.], Oxford Flowers [Nilsback et al.], and UIUC Textures [Lazebnik et al.]
  [Plot: accuracy vs. K (number of classifiers) for MuSL+retrain, MuSL, k-means, Rand, Monolithic, and Per Cat]
  [Plots: k-means vs. MuSL accuracy on new categories only and on both new and old categories]
• Training more metrics overfits!
• Studied categorization performance vs. the number of learned metrics
• Presented a boosting algorithm to simultaneously group categories and train metrics
• Observed overfitting behavior for novel categories
• Supported by
– NSF CAREER Grant #0448615
– NSF IGERT Grant DGE-0333451
– ONR MURI Grant #N00014-08-1-0638
– UCSD FWGrid Project (NSF Infrastructure Grant no. EIA-0303622)