Agenda

• Introduction • Bag-of-words models • Visual words with spatial location • Part-based models • Discriminative methods • Segmentation and recognition • Recognition-based image retrieval • Datasets & Conclusions

Retrieval domains

• Internet image search • Video search for people/objects • Searching home photo collections

• Learning from Internet Image Search • Joint learning of text and images • Large scale retrieval

Noisy labels

Improving Google’s Image Search

• Fergus, Fei-Fei, Perona, Zisserman, ICCV 2005 • Variant of pLSA that includes spatial information

Re-ranking result: Motorbike

Topics in model Automatically chosen topic

Animals on the Web

Berg and Forsyth, CVPR 2006 • Gather images using text search • Use LDA (here, Latent Dirichlet Allocation) to discover “good” images using features based on nearby text, shape, color

Bootstrapping of Image Search

Schroff, Zisserman, Criminisi, Harvesting Image Databases from the Web, ICCV 2007

• Images returned with PENGUIN query • Removal of drawings and abstract images • Naive Bayes ranking using noisy metadata • Train SVM… • Final ranking using SVM
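The metadata-ranking stage of this pipeline can be illustrated with a minimal sketch. It assumes binary metadata features (for example, whether the query word occurs in the filename, the page title, or the alt text) and a small hand-labelled training set; the feature choice, the data, and all function names are hypothetical and are not taken from the paper. The top of the resulting ranking would then serve as noisy positive examples for the SVM stage, which is not shown.

```python
import math

def train_naive_bayes(features, labels, alpha=1.0):
    """Estimate class priors and P(feature=1 | class) with Laplace smoothing."""
    n_feat = len(features[0])
    counts = {0: [0] * n_feat, 1: [0] * n_feat}
    totals = {0: 0, 1: 0}
    for x, y in zip(features, labels):
        totals[y] += 1
        for j, v in enumerate(x):
            counts[y][j] += v
    prior = {c: (totals[c] + alpha) / (len(labels) + 2 * alpha) for c in (0, 1)}
    like = {
        c: [(counts[c][j] + alpha) / (totals[c] + 2 * alpha) for j in range(n_feat)]
        for c in (0, 1)
    }
    return prior, like

def posterior_in_class(x, prior, like):
    """Return P(class=1 | x) under the naive (independent features) assumption."""
    log_p = {}
    for c in (0, 1):
        lp = math.log(prior[c])
        for j, v in enumerate(x):
            p = like[c][j]
            lp += math.log(p if v else 1.0 - p)
        log_p[c] = lp
    m = max(log_p.values())
    z = sum(math.exp(lp - m) for lp in log_p.values())
    return math.exp(log_p[1] - m) / z

def rank_by_metadata(features, prior, like):
    """Indices of the candidate images, most likely in-class first."""
    scores = [posterior_in_class(x, prior, like) for x in features]
    return sorted(range(len(features)), key=lambda i: scores[i], reverse=True)

# Hypothetical features: [query in filename, query in title, query in alt text]
train_x = [[1, 1, 1], [1, 1, 0], [1, 0, 1], [0, 0, 0], [0, 1, 0], [0, 0, 0]]
train_y = [1, 1, 1, 0, 0, 0]
prior, like = train_naive_bayes(train_x, train_y)

candidates = [[0, 0, 0], [1, 1, 1], [0, 1, 0]]
ranking = rank_by_metadata(candidates, prior, like)
```

Here `ranking` places the candidate whose metadata mentions the query everywhere first and the candidate with no mention last.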

OPTIMOL

Li, Wang, Fei-Fei, CVPR 2007

• Learning from Internet Image Search • Joint learning of text and images • Large scale retrieval

Matching Words and Pictures • Barnard, Duygulu, de Freitas, Forsyth, Blei, Jordan, JMLR 2003

Text to Images

Images to text

• Use Blobworld or nCuts to segment images into regions • Need to deduce labels attached to each image

Images to text result

Names and Faces in the News

Berg, Berg, Edwards, Maire, White, Teh, Learned-Miller, Forsyth, CVPR 2004

Collected 500,000 images and text captions from Yahoo! News

1. Find faces (standard face detector), rectify them to same pose.
2. Perform Kernel PCA and Linear Discriminant Analysis (LDA).
3. Extract names from text.
4. Cluster faces, with each name corresponding to a cluster.
5. Use language model to refine results.
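The clustering step (step 4) can be sketched as a k-means-style loop in which each face may only be assigned to a name that appears in its own caption. This is a simplified illustration, not the authors' implementation: it assumes the faces have already been reduced to feature vectors (steps 1 and 2), it omits the language-model refinement of step 5, and the toy vectors, placeholder names, and function names are hypothetical.

```python
def sqdist(a, b):
    """Squared Euclidean distance between two feature vectors."""
    return sum((x - y) ** 2 for x, y in zip(a, b))

def mean(vectors):
    """Component-wise mean of a non-empty list of vectors."""
    return [sum(col) / len(vectors) for col in zip(*vectors)]

def assign_faces(faces, candidates, n_iter=10):
    """Assign each face to one of the names found in its caption.

    faces:      list of feature vectors, one per detected face
    candidates: list of name lists, one per face (names from the caption)
    """
    names = sorted({n for c in candidates for n in c})
    # Initialise each name's cluster mean from every face whose caption mentions it.
    means = {
        n: mean([f for f, c in zip(faces, candidates) if n in c]) for n in names
    }
    assignment = [None] * len(faces)
    for _ in range(n_iter):
        # Each face picks the nearest cluster among its own candidate names only.
        new_assignment = [
            min(c, key=lambda n: sqdist(f, means[n]))
            for f, c in zip(faces, candidates)
        ]
        if new_assignment == assignment:
            break
        assignment = new_assignment
        # Update each cluster mean from the faces currently assigned to it.
        for n in names:
            members = [f for f, a in zip(faces, assignment) if a == n]
            if members:
                means[n] = mean(members)
    return assignment

# Toy 2-D "face features" and placeholder names extracted from captions.
faces = [[0.0, 0.1], [0.1, 0.0], [1.0, 0.9], [0.9, 1.0], [0.05, 0.05], [0.95, 0.95]]
candidates = [["A"], ["A", "B"], ["B"], ["A", "B"], ["A", "B"], ["A", "B"]]
assignment = assign_faces(faces, candidates)
```

Faces with an unambiguous caption anchor the clusters, and the ambiguous faces are then pulled toward whichever named cluster they resemble most.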

• Initial clusters

• Clusters refined with language model

• Learning from Internet Image Search • Joint learning of text and images • Large scale retrieval

Vocabulary tree

Nistér & Stewénius CVPR 2006.

• Hierarchical k-means tree in descriptor space • Inverted-file lookup of features • Specific object recognition, not category-level
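A minimal sketch of the idea: a vocabulary tree in the style of Nistér & Stewénius is built by hierarchical k-means, each descriptor is quantized by descending the tree, and an inverted file maps each leaf (visual word) to the images containing it. The code below is an illustration under simplifying assumptions: it scores images by raw vote counts rather than the TF-IDF weighting used in practice, and the toy 2-D descriptors and all function names are hypothetical.

```python
import random
from collections import defaultdict

def sqdist(a, b):
    return sum((x - y) ** 2 for x, y in zip(a, b))

def kmeans(points, k, rng, n_iter=10):
    """Plain Lloyd's k-means; returns the centers and the last grouping."""
    centers = rng.sample(points, k)
    groups = [[] for _ in range(k)]
    for _ in range(n_iter):
        groups = [[] for _ in range(k)]
        for p in points:
            groups[min(range(k), key=lambda i: sqdist(p, centers[i]))].append(p)
        centers = [
            [sum(col) / len(g) for col in zip(*g)] if g else centers[i]
            for i, g in enumerate(groups)
        ]
    return centers, groups

def build_tree(points, branch, depth, rng):
    """Recursively split the descriptors; None marks a leaf."""
    if depth == 0 or len(points) < branch:
        return None
    centers, groups = kmeans(points, branch, rng)
    children = [build_tree(g, branch, depth - 1, rng) for g in groups]
    return {"centers": centers, "children": children}

def quantize(tree, desc):
    """Descend the tree; the path of branch indices is the visual word."""
    path, node = [], tree
    while node is not None:
        centers = node["centers"]
        i = min(range(len(centers)), key=lambda j: sqdist(desc, centers[j]))
        path.append(i)
        node = node["children"][i]
    return tuple(path)

def build_inverted_file(tree, images):
    """Map each visual word to {image_id: number of occurrences}."""
    inverted = defaultdict(lambda: defaultdict(int))
    for image_id, descs in images.items():
        for d in descs:
            inverted[quantize(tree, d)][image_id] += 1
    return inverted

def query(tree, inverted, descs):
    """Rank database images by votes from shared visual words."""
    votes = defaultdict(int)
    for d in descs:
        for image_id, count in inverted.get(quantize(tree, d), {}).items():
            votes[image_id] += count
    return sorted(votes, key=votes.get, reverse=True)

# Toy database: two images with well-separated 2-D "descriptors".
images = {
    "a": [[0.0, 0.0], [0.0, 1.0], [1.0, 0.0]],
    "b": [[10.0, 10.0], [10.0, 11.0], [11.0, 10.0]],
}
all_descs = [d for descs in images.values() for d in descs]
tree = build_tree(all_descs, branch=2, depth=2, rng=random.Random(0))
inverted = build_inverted_file(tree, images)
result = query(tree, inverted, [[0.2, 0.1]])
```

Quantizing a descriptor costs only `branch` distance computations per level, which is what lets the vocabulary grow very large while lookup stays fast.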

Slide from D. Nister

Pyramid Match Hashing

Grauman & Darrell, CVPR 2007 • Combines the Pyramid Match Kernel (efficient computation of correspondences between two sets of vectors) with Locality Sensitive Hashing (LSH) [Indyk & Motwani 98] • Allows matching of the set of features in a query image to sets of features in other images in time that is sublinear in the number of images • Theoretical guarantees
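The pyramid match kernel half of this combination can be sketched as follows: both feature sets are binned into histograms at successively coarser resolutions, matches are counted by histogram intersection, and matches that first appear at level i (bin width 2^i) are weighted by 1/2^i. This is an unnormalized illustration only; the hashing stage that gives sublinear search is not shown, and the toy point sets and function names are hypothetical.

```python
from collections import Counter

def histogram(points, bin_width):
    """Count points per grid cell of the given width."""
    return Counter(tuple(int(c // bin_width) for c in p) for p in points)

def intersection(h1, h2):
    """Histogram intersection: number of points matched within shared bins."""
    return sum(min(count, h2[b]) for b, count in h1.items())

def pyramid_match(x, y, levels):
    """Unnormalized pyramid match score between two point sets."""
    score = 0.0
    prev = 0
    for i in range(levels):
        width = 2 ** i
        inter = intersection(histogram(x, width), histogram(y, width))
        # Only matches that are new at this level count, down-weighted by bin size.
        score += (inter - prev) / width
        prev = inter
    return score

x = [(0, 0), (3, 3), (6, 6)]
y = [(0, 0), (2, 3), (9, 9)]
score_xy = pyramid_match(x, y, levels=5)
score_xx = pyramid_match(x, x, levels=5)
```

In this toy case one pair matches at the finest level, one more at bin width 2, and the last only at bin width 16, so the score is well below the self-match score of the set.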

Semantic Hashing

• Salakhutdinov and Hinton, SIGIR 2007 • Torralba, Fergus, Weiss, CVPR 2008 • Map images to compact binary codes • Hash codes for fast lookup
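Once images have compact binary codes, lookup reduces to probing a hash table at the query code and at every code within a small Hamming radius. The sketch below illustrates only this lookup step; how the codes are learned is not shown, and the 8-bit codes, image ids, and function names are hypothetical.

```python
from collections import defaultdict
from itertools import combinations

def build_table(codes):
    """Hash table from binary code (stored as an int) to the images with that code."""
    table = defaultdict(list)
    for image_id, code in codes.items():
        table[code].append(image_id)
    return table

def hamming_ball(code, n_bits, radius):
    """Yield every code within the given Hamming distance of `code`."""
    for r in range(radius + 1):
        for bits in combinations(range(n_bits), r):
            flipped = code
            for b in bits:
                flipped ^= 1 << b
            yield flipped

def lookup(table, code, n_bits, radius):
    """Return the images whose code lies within `radius` bit flips of the query."""
    found = []
    for c in hamming_ball(code, n_bits, radius):
        found.extend(table.get(c, []))
    return found

# Hypothetical 8-bit codes for four database images.
codes = {
    "img0": 0b10110010,
    "img1": 0b10110011,
    "img2": 0b01001101,
    "img3": 0b10100010,
}
table = build_table(codes)
neighbours = lookup(table, 0b10110010, n_bits=8, radius=1)
```

The number of probes depends only on the code length and the radius, not on the size of the database, which is why short codes make this kind of lookup fast.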