Visual Categorization with Bags of Keypoints


Gabriella Csurka, Christopher R. Dance, Lixin Fan, Jutta Willamowski, Cédric Bray. Presented by Yun-hsueh Liu, EE 148, Spring 2006.

What is Generic Visual Categorization?

- Categorization: distinguish different classes.
- Generic Visual Categorization: "generic" means coping with many object types simultaneously, and being readily extended to new object types.
- The method must handle variation in viewpoint, imaging, lighting, and occlusion, as well as typical object and scene variations.

Previous Work in Computational Vision

- Single Category Detection: decide whether a member of one visual category is present in a given image (faces, cars, targets).
- Content Based Image Retrieval: retrieve images on the basis of low-level image features, such as color or texture.
- Recognition: distinguish between images of structurally distinct objects within one class (say, different cell phones).

Bag-of-Keypoints Approach

Pipeline: Interest Point Detection → Key Patch Extraction → Feature Descriptors → Bag of Keypoints → Multi-class Classifier

[Figure: an example image passes through the pipeline and is mapped to a bag-of-keypoints histogram, e.g. (0.1, ..., 0.5, 1.5, ...), which is fed to the multi-class classifier.]
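As a rough illustration of how the stages fit together, here is a minimal end-to-end sketch in Python with scikit-learn; it is not the authors' code. The helpers extract_descriptors, build_vocabulary, and bag_of_keypoints are sketched on the following slides, and train_image_paths and labels are hypothetical placeholders.

```python
# Hypothetical end-to-end sketch of the bag-of-keypoints pipeline.
# extract_descriptors, build_vocabulary, and bag_of_keypoints are the
# per-stage sketches on the following slides; train_image_paths and
# labels are assumed to exist.
import numpy as np
from sklearn.svm import LinearSVC

descriptors = [extract_descriptors(p) for p in train_image_paths]  # per image
kmeans = build_vocabulary(np.vstack(descriptors), k=1000)  # visual vocabulary
hists = np.array([bag_of_keypoints(d, kmeans) for d in descriptors])
classifier = LinearSVC().fit(hists, labels)  # multi-class SVM (one-vs-rest)
```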

SIFT Descriptors

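As a concrete stand-in for this stage (the paper pairs a Harris affine interest point detector with 128-dimensional SIFT descriptors), a minimal sketch using OpenCV's SIFT implementation for both detection and description might look like this; treating SIFT as the detector is a simplification:

```python
# Minimal sketch: detect interest points and compute SIFT descriptors with
# OpenCV. The paper uses a Harris affine detector; using SIFT for detection
# as well is a simplification for illustration.
import cv2

def extract_descriptors(image_path):
    """Return an (n_keypoints, 128) array of SIFT descriptors for one image."""
    gray = cv2.imread(image_path, cv2.IMREAD_GRAYSCALE)
    sift = cv2.SIFT_create()
    keypoints, descriptors = sift.detectAndCompute(gray, None)
    return descriptors  # None if no keypoints were found
```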

Bag of Keypoints (1)

- Construction of a vocabulary: run k-means clustering on all the descriptors found in all the training images to obtain "centroids".
- Define a "vocabulary" as the set of centroids, where every centroid represents a "word".

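A minimal sketch of the vocabulary construction with scikit-learn's KMeans, assuming all_descriptors stacks the descriptors from every training image:

```python
# Minimal sketch: cluster all training descriptors; each centroid is a "word".
from sklearn.cluster import KMeans

def build_vocabulary(all_descriptors, k=1000):
    """Fit k-means on the stacked descriptors; the k centroids form the
    visual vocabulary (kmeans.cluster_centers_)."""
    return KMeans(n_clusters=k, n_init=3, random_state=0).fit(all_descriptors)
```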

Bag of Keypoints (2)

- Histogram: count the number of occurrences of each visual word in each image.
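A minimal sketch of the histogram step, reusing the fitted KMeans object from the previous sketch:

```python
# Minimal sketch: assign each descriptor to its nearest centroid and count
# occurrences, giving the keypoint histogram N(t, i) for one image.
import numpy as np

def bag_of_keypoints(descriptors, kmeans):
    words = kmeans.predict(descriptors)  # nearest visual word per keypoint
    return np.bincount(words, minlength=kmeans.n_clusters)
```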

Multi-class Classifier

- In this paper, classification is based on conventional machine learning approaches:
  - Naïve Bayes
  - Support Vector Machine (SVM)

Multi-class Classifier – Naïve Bayes (1)

- Let V = {v_t}, t = 1, ..., N, be a visual vocabulary, in which each v_t represents a visual word (a cluster centroid) in the feature space.
- Let I = {I_i} be a set of labeled images, and denote the classes by C_j, j = 1, ..., M.
- N(t,i) = number of times v_t occurs in image I_i (the keypoint histogram).
- Scoring approach: we want to determine P(C_j | I_i), where

  P(C_j \mid I_i) \propto P(C_j) \prod_{t=1}^{N} P(v_t \mid C_j)^{N(t,i)}    (*)

Multi-class Classifier – Naïve Bayes (2)

- Goal: find the specific class C_j for which P(C_j | I_i) in (*) is maximal.
- To avoid zero probabilities, use Laplace smoothing:

  P(v_t \mid C_j) = \frac{1 + \sum_{i:\, I_i \in C_j} N(t,i)}{N + \sum_{s=1}^{N} \sum_{i:\, I_i \in C_j} N(s,i)}
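A minimal sketch of this classifier, assuming the keypoint histograms and integer labels are NumPy arrays; it works in log space to avoid underflow in the product (*), and assumes a uniform class prior P(C_j):

```python
# Minimal Naive Bayes sketch over keypoint histograms (uniform class prior).
import numpy as np

def train_naive_bayes(histograms, labels, n_classes):
    """histograms: (n_images, N) word counts; labels: class index per image.
    Returns (n_classes, N) log P(v_t | C_j) with Laplace smoothing."""
    N = histograms.shape[1]
    log_p = np.zeros((n_classes, N))
    for j in range(n_classes):
        counts = histograms[labels == j].sum(axis=0)  # sum over images in C_j
        log_p[j] = np.log((1.0 + counts) / (N + counts.sum()))
    return log_p

def classify(histogram, log_p):
    """argmax_j of sum_t N(t, i) * log P(v_t | C_j)."""
    return int(np.argmax(log_p @ histogram))
```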

Multi-class Classifier – Support Vector Machine (SVM)

- Input: the keypoint histogram for each image.
- Multi-class strategy: one-against-all.
- A linear SVM gives better performance than quadratic or cubic SVMs.
- Goal: find hyperplanes that separate the multi-class data with maximum margin.
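A minimal sketch using scikit-learn's LinearSVC, whose default one-vs-rest scheme corresponds to the one-against-all setup above:

```python
# Minimal sketch: linear SVM on keypoint histograms; LinearSVC trains one
# one-against-all classifier per class by default.
from sklearn.svm import LinearSVC

def train_svm(histograms, labels):
    return LinearSVC(C=1.0).fit(histograms, labels)

# Usage: predicted = train_svm(train_hists, train_labels).predict(test_hists)
```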


Evaluation of Multi-class Classifiers

- Three performance measures:
  - The confusion matrix: each column of the matrix represents the instances in a predicted class; each row represents the instances in an actual class.
  - The overall error rate: Pr(output class ≠ true class).
  - The mean rank: the mean position of the correct label when the labels output by the multi-class classifier are sorted by classifier score.

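A minimal sketch computing all three measures from per-class classifier scores (for the SVM these could come from LinearSVC.decision_function); the array shapes are assumptions for illustration:

```python
# Minimal sketch of the three performance measures.
import numpy as np

def evaluate(scores, true_labels, n_classes):
    """scores: (n_images, n_classes) classifier scores, higher = more likely;
    true_labels: (n_images,) integer class indices."""
    predicted = scores.argmax(axis=1)
    confusion = np.zeros((n_classes, n_classes), dtype=int)
    for actual, pred in zip(true_labels, predicted):
        confusion[actual, pred] += 1  # row = actual class, column = predicted
    error_rate = np.mean(predicted != true_labels)
    order = np.argsort(-scores, axis=1)  # labels sorted by descending score
    ranks = [1 + int(np.where(order[i] == true_labels[i])[0][0])
             for i in range(len(true_labels))]
    return confusion, error_rate, np.mean(ranks)  # mean rank of correct label
```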

n-Fold Cross Validation

- What is a "fold"? Randomly break the dataset into n partitions.
- Example, with n = 10:
  - Train on partitions 2, 3, ..., 10; test on partition 1, giving result 1.
  - Train on partitions 1, 3, ..., 10; test on partition 2, giving result 2.
  - ...
- Answer = average of result 1, result 2, ....

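A minimal sketch of this procedure with scikit-learn's KFold, using the linear SVM from the earlier slide as the classifier:

```python
# Minimal n-fold cross-validation sketch: average the per-fold error rates.
import numpy as np
from sklearn.model_selection import KFold
from sklearn.svm import LinearSVC

def cross_validated_error(histograms, labels, n_folds=10):
    """histograms: (n_images, N) array; labels: (n_images,) array."""
    errors = []
    kfold = KFold(n_splits=n_folds, shuffle=True, random_state=0)
    for train_idx, test_idx in kfold.split(histograms):
        svm = LinearSVC().fit(histograms[train_idx], labels[train_idx])
        errors.append(np.mean(svm.predict(histograms[test_idx])
                              != labels[test_idx]))
    return float(np.mean(errors))  # average of result 1, result 2, ...
```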

Experiment on Naïve Bayes – k’s effect

- Plot the overall error rate as a function of the number of clusters k.
- Results:
  - The error rate decreases as k increases.
  - Selected operating point: k = 1000.
  - Past this point, the error rate decreases only slowly.
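A hypothetical sketch of this experiment, gluing together the earlier helper sketches; per_image_descriptors and labels are assumed to exist, and the grid of k values is illustrative:

```python
# Hypothetical sweep over vocabulary sizes k, reusing the earlier sketches
# (build_vocabulary, bag_of_keypoints, cross_validated_error).
import numpy as np

for k in (100, 300, 600, 1000, 2000):  # illustrative grid of cluster counts
    kmeans = build_vocabulary(np.vstack(per_image_descriptors), k=k)
    hists = np.array([bag_of_keypoints(d, kmeans)
                      for d in per_image_descriptors])
    print(k, cross_validated_error(hists, labels))
```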

Experiment on Naïve Bayes – Confusion Matrix

Rows are actual classes, columns are predicted classes (entries in %):

            faces  buildings  trees  cars  phones  bikes  books   error rate   mean rank
faces          76          2      3     4       9      2      4           24        1.49
buildings       4         44      2     1      15     15     19           56        1.88
trees           2          5     80     0       1     12      0           20        1.33
cars            3          0      0    75      16      0      6           25        1.33
phones          4          5      0     3      70      8      7           27        1.63
bikes           4          1      5     1      14     73      2           27        1.57
books          13          3      0     4      11      0     69           31        1.57

Experiment on SVM – Confusion Matrix

Rows are actual classes, columns are predicted classes (entries in %):

            faces  buildings  trees  cars  phones  bikes  books   error rate   mean rank
faces          98          1      1     0       0      0      0            2        1.04
buildings      14         63     10     1       5      4      3           27        1.77
trees          10          3     81     1       4      1      0           19        1.28
cars           10          0      1    85       3      0      1           15        1.30
phones         34          3      0     5      55      1      2           45        1.83
bikes           0          1      6     0       2     91      0            9        1.09
books          13          6      0     5       3      0     73           27        1.39

Interpretation of Results

- The confusion matrix: in general, SVM makes more correct predictions than Naïve Bayes.
- The overall error rate: in general, Naïve Bayes > SVM, i.e. SVM has the lower error rate.
- The mean rank: in general, SVM < Naïve Bayes, i.e. SVM ranks the correct label closer to first.

Why do we have errors?

- Objects from more than one class can appear in a single image.
- The data set is not totally clean (noise).
- Each image is given only one training label.

Conclusion

- Bag-of-Keypoints is a new and efficient generic visual categorizer.

- Evaluated on a seven-category database, the method proved robust to the choice of clusters, background clutter, and the presence of multiple objects.
- Any questions?
- Thank you for listening to my presentation!! :)