Agenda
• Introduction • Bag-of-words models • Visual words with spatial location • Part-based models • Discriminative methods • Segmentation and recognition • Recognition-based image retrieval • Datasets & Conclusions
Retrieval domains
• Internet image search • Video search for people/objects • Searching home photo collections
• Learning from Internet Image Search • Joint learning of text and images • Large scale retrieval
Noisy labels
Improving Google’s Image Search
• Fergus, Fei-Fei, Perona, Zisserman, ICCV 2005 • Variant of pLSA that includes spatial information
Re-ranking result: Motorbike
Topics in model Automatically chosen topic
Animals on the Web
Berg and Forsyth, CVPR 2006 • Gather images using text search • Use LDA (Latent Dirichlet Allocation) to discover “good” images using features based on nearby text, shape, color
Bootstrapping of Image Search
Schroff, Zisserman, Criminisi, Harvesting Image Databases from the Web, ICCV 2007
• Images returned with PENGUIN query • Final ranking using SVM • Removal of drawings and abstract images • Naive Bayes ranking using noisy metadata • Train SVM…….
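A minimal sketch of the "Naive Bayes ranking using noisy metadata" stage, assuming each image carries a few binary metadata cues. The two cues used here (query word in the filename, query word in the alt text), the training samples, and the file names are all hypothetical and are not taken from the paper. The class prior is left out because it shifts every score by the same constant and so does not change the ranking. The later SVM stage is not shown.

```python
import math

def train(samples, n_features):
    """Estimate P(feature on | class) with Laplace smoothing.

    samples: list of (binary feature tuple, label), label 1 = in class, 0 = not.
    """
    on = {0: [1] * n_features, 1: [1] * n_features}   # smoothed "feature on" counts
    total = {0: 2, 1: 2}                               # smoothed class totals
    for features, label in samples:
        total[label] += 1
        for i, value in enumerate(features):
            on[label][i] += value
    return {c: [on[c][i] / total[c] for i in range(n_features)] for c in (0, 1)}

def log_odds(model, features):
    """Sum of per-feature log likelihood ratios (class 1 versus class 0)."""
    score = 0.0
    for i, value in enumerate(features):
        p_good = model[1][i] if value else 1 - model[1][i]
        p_bad = model[0][i] if value else 1 - model[0][i]
        score += math.log(p_good / p_bad)
    return score

# Hypothetical cues: (query word in filename, query word in alt text)
training = [((1, 1), 1), ((1, 0), 1), ((1, 1), 1), ((0, 0), 0), ((0, 1), 0)]
model = train(training, n_features=2)

candidates = {"a.jpg": (1, 1), "b.jpg": (0, 0), "c.jpg": (1, 0)}
ranked = sorted(candidates, key=lambda k: log_odds(model, candidates[k]), reverse=True)
print(ranked)
```

In a pipeline of this kind, the top of such a ranking would serve as noisy positive examples for the visual classifier.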
OPTIMOL
Li, Wang, Fei-Fei, CVPR 2007
• Learning from Internet Image Search • Joint learning of text and images • Large scale retrieval
Matching Words and Pictures • Barnard, Duygulu, de Freitas, Forsyth, Blei, Jordan, JMLR 2003
Text to Images
Images to text
• Use Blobworld or nCuts to segment images into regions • Need to deduce labels attached to each image
Images to text result
Names and Faces in the News
Berg, Berg, Edwards, Maire, White, Teh, Learned-Miller, Forsyth, CVPR 2004
• Collected 500,000 images and text captions from Yahoo! News
1. Find faces (standard face detector), rectify them to same pose.
2. Perform Kernel PCA and Linear Discriminant Analysis (LDA).
3. Extract names from text.
4. Cluster faces, with each name corresponding to a cluster.
5. Use language model to refine results.
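The clustering step (step 4) can be sketched as a k-means variant in which a face may only join the cluster of a name that appears in its own caption. This is an illustrative sketch under assumptions, not the authors' implementation: the 2-D vectors stand in for the Kernel PCA + LDA features, the names are hypothetical, and each face starts on its first candidate name purely to keep the example deterministic.

```python
def sq_dist(a, b):
    return sum((x - y) ** 2 for x, y in zip(a, b))

def mean(vectors):
    n = len(vectors)
    return tuple(sum(v[i] for v in vectors) / n for i in range(len(vectors[0])))

def cluster_faces(faces, iterations=10):
    """faces: list of (feature_vector, candidate_names_from_caption).

    Returns one name per face, always chosen from that face's own candidates.
    """
    assignment = [names[0] for _, names in faces]
    for _ in range(iterations):
        # Recompute one centre per name from the faces currently assigned to it.
        members = {}
        for (vec, _), name in zip(faces, assignment):
            members.setdefault(name, []).append(vec)
        centers = {name: mean(vs) for name, vs in members.items()}
        # Reassign each face to the closest centre among its caption names.
        updated = [
            min((n for n in names if n in centers),
                key=lambda n: sq_dist(vec, centers[n]))
            for vec, names in faces
        ]
        if updated == assignment:
            break
        assignment = updated
    return assignment

faces = [
    ((0.0, 0.1), ["Alice"]),
    ((0.1, 0.0), ["Alice", "Bob"]),
    ((5.0, 5.1), ["Alice", "Bob"]),   # starts as "Alice", should move to "Bob"
    ((5.1, 5.0), ["Bob"]),
]
labels = cluster_faces(faces)
print(labels)
```

The caption constraint is what keeps the clusters tied to names. The language-model refinement in step 5 is not modeled here.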
• Initial clusters
• Clusters refined with language model
• Learning from Internet Image Search • Joint learning of text and images • Large scale retrieval
Vocabulary tree
Nistér & Stewénius CVPR 2006.
• Tree in descriptor space, built by hierarchical k-means • Inverse lookup of features • Specific object recognition • Not category-level
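A minimal sketch of the two ideas on this slide: a descriptor is quantized by walking down a tree of cluster centres, and the resulting visual word indexes an inverted file of images. The tree here is hand-built and hypothetical (branch factor 2, two levels, 2-D descriptors); in practice the centres would come from clustering training descriptors. Raw vote counts stand in for whatever word weighting a real system applies.

```python
from collections import Counter, defaultdict

def sq_dist(a, b):
    return sum((x - y) ** 2 for x, y in zip(a, b))

def leaf(center, word):
    return {"center": center, "children": [], "word": word}

def quantize(node, descriptor):
    """Descend to the nearest child at each level; return the leaf's word id."""
    while node["children"]:
        node = min(node["children"], key=lambda c: sq_dist(c["center"], descriptor))
    return node["word"]

def index_images(tree, images):
    """Inverted file: visual word -> Counter of image ids containing it."""
    inverted = defaultdict(Counter)
    for image_id, descriptors in images.items():
        for d in descriptors:
            inverted[quantize(tree, d)][image_id] += 1
    return inverted

def query(tree, inverted, descriptors):
    """Vote for every database image that shares a visual word with the query."""
    votes = Counter()
    for d in descriptors:
        word = quantize(tree, d)
        if word in inverted:
            for image_id, count in inverted[word].items():
                votes[image_id] += count
    return votes.most_common()

# Hypothetical tree: two coarse clusters, each split into two leaves (words 0..3).
tree = {"center": None, "children": [
    {"center": (0.0, 0.0), "children": [leaf((0.0, 0.0), 0), leaf((0.0, 2.0), 1)]},
    {"center": (10.0, 10.0), "children": [leaf((9.0, 9.0), 2), leaf((11.0, 11.0), 3)]},
]}

images = {
    "cd_cover": [(0.1, 0.2), (9.2, 8.9)],
    "book": [(0.2, 1.9), (10.8, 11.1)],
}
inverted = index_images(tree, images)
results = query(tree, inverted, [(0.0, 0.1), (8.8, 9.1)])
print(results)
```

Each descriptor costs only (branch factor × depth) distance computations, and only images that share a word with the query are touched, which is why this suits retrieval of specific objects from large collections.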
Slide from D. Nister
Pyramid Match Hashing
Grauman & Darrell, CVPR 2007 • Combines the Pyramid Match Kernel (efficient computation of correspondences between two sets of vectors) with Locality Sensitive Hashing (LSH) [Indyk & Motwani 98] • Allows matching the set of features in a query image to sets of features in other images in time that is sublinear in the number of images • Theoretical guarantees
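A minimal sketch of the pyramid match score itself, assuming unnormalized scores, bin sides that double at each level, and new matches at level i weighted by 1/2^i. The hashing stage that gives sublinear search is not shown, and the point sets are made up for illustration.

```python
from collections import Counter

def histogram(points, side):
    """Count points per grid cell, for cells of the given side length."""
    return Counter(tuple(int(c // side) for c in p) for p in points)

def intersection(h1, h2):
    """Histogram intersection: number of points that can be paired within bins."""
    return sum(min(count, h2[b]) for b, count in h1.items())

def pyramid_match(xs, ys, levels):
    """Sum over levels of (newly matched pairs) * (1 / bin side)."""
    score, previous = 0.0, 0
    for i in range(levels):
        side = 2 ** i
        matches = intersection(histogram(xs, side), histogram(ys, side))
        score += (matches - previous) / side   # only count matches new at this level
        previous = matches
    return score

xs = [(0, 0), (5, 5)]
ys = [(0, 0), (6, 6)]
print(pyramid_match(xs, ys, levels=3))   # one match at side 1, one more at side 4
print(pyramid_match(xs, xs, levels=3))   # identical sets match fully at the finest level
```

No explicit point-to-point assignment is ever computed: the implicit correspondences fall out of the histogram intersections, which is what makes the kernel cheap to evaluate.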
Semantic Hashing
• Salakhutdinov and Hinton, SIGIR 2007 • Torralba, Fergus, Weiss, CVPR 2008 • Map images to compact binary codes • Hash codes for fast lookup
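A minimal sketch of the lookup side only, assuming the binary codes have already been learned (how they are learned is not covered here). Codes are stored as integers in a hash table, and near neighbours are found by probing every code within a small Hamming radius of the query. The 8-bit codes and image ids are hypothetical.

```python
from collections import defaultdict
from itertools import combinations

def build_table(codes):
    """Map each binary code (stored as an int) to the image ids that share it."""
    table = defaultdict(list)
    for image_id, code in codes.items():
        table[code].append(image_id)
    return table

def lookup(table, query, n_bits, radius=1):
    """Return ids whose code differs from `query` in at most `radius` bits."""
    hits = []
    for r in range(radius + 1):
        for bits in combinations(range(n_bits), r):
            probe = query
            for b in bits:
                probe ^= 1 << b          # flip bit b of the query code
            hits.extend(table.get(probe, []))
    return sorted(hits)

codes = {"img_a": 0b10110010, "img_b": 0b10110011, "img_c": 0b01001100}
table = build_table(codes)
neighbours = lookup(table, 0b10110010, n_bits=8, radius=1)
print(neighbours)
```

The number of probes depends only on the code length and the radius, not on how many images are stored, which is the appeal of short codes for large collections.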