Large Scale Visual Recognition Challenge (ILSVRC) 2013:
Detection spotlights
Toronto A team
Xiaolong Wang
Intelligent Media Computing Laboratory
http://vision.sysu.edu.cn/
Approach for Detection Task:
• The basic method is based on deformable part models (DPM). I implemented the release 4.0 code in C++ and extended it to an MPI version. I trained the models on a distributed system with 200 computers. (Training 200 DPMs took 30 hours.)
• A convolutional neural network (CNN) was trained to classify the 200 categories in the detection dataset. I used its classification results as context information to rescore the detection results from DPM.
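The slide does not say how the CNN classification results were combined with the DPM scores. As a minimal sketch of one possible context rescoring, assuming a simple linear blend between a squashed detector score and the image-level class probability (the function names, the logistic squashing, and `context_weight` are all hypothetical choices, not the author's method):

```python
import math

def squash(score):
    """Map an unbounded detector score to (0, 1) with a logistic function."""
    return 1.0 / (1.0 + math.exp(-score))

def rescore_detections(detections, class_probs, context_weight=0.5):
    """Blend each detection score with the image-level probability of its class.

    detections:  list of (class_id, detector_score, box) for one image
    class_probs: dict mapping class_id -> image-level classification probability
    The linear blend is an illustrative assumption, not taken from the slides.
    """
    rescored = []
    for class_id, det_score, box in detections:
        context = class_probs.get(class_id, 0.0)  # unseen class: no context support
        blended = (1.0 - context_weight) * squash(det_score) + context_weight * context
        rescored.append((class_id, blended, box))
    # Highest rescored detection first
    rescored.sort(key=lambda d: d[1], reverse=True)
    return rescored
```

With this blend, a weak detection of a class the classifier considers likely for the image can outrank a stronger detection of a class it considers unlikely.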
Results:
• DPM without context rescoring, mAP: 7.55%
• DPM with CNN context rescoring, mAP: 10.45% (an improvement of about 2.9 percentage points)
• Three categories won: 69: flute; 116: nail; 151: saxophone
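For readers unfamiliar with the metric, mAP is the mean over categories of the average precision (AP) of the ranked detections. The sketch below assumes uninterpolated AP and that each detection has already been marked as a true or false positive. The official challenge evaluation may differ in details such as the matching rule, so this is only an illustration:

```python
def average_precision(is_true_positive, num_ground_truth):
    """Uninterpolated AP for one category.

    is_true_positive: list of booleans, one per detection, sorted by
                      descending detection score
    num_ground_truth: number of annotated objects of this category
    """
    if num_ground_truth == 0:
        return 0.0
    hits = 0
    precision_sum = 0.0
    for rank, hit in enumerate(is_true_positive, start=1):
        if hit:
            hits += 1
            precision_sum += hits / rank  # precision at this recall point
    return precision_sum / num_ground_truth

def mean_average_precision(per_category):
    """per_category: dict mapping category -> (is_true_positive, num_ground_truth)."""
    aps = [average_precision(flags, n) for flags, n in per_category.values()]
    return sum(aps) / len(aps)
```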
Thanks to Baidu, Inc. for providing the computing resources.
ICCV 2013, Sydney, Australia
ILSVRC 2013 Spotlight
Latent Hierarchical Model with GPU Inference for Object Detection
Yukun Zhu, Jun Zhu, Alan Yuille
UCLA Computer Vision Lab
Thanks to L. Zhu, Y. Chen, A. Yuille and W. Freeman for the work “Latent hierarchical structural learning for object detection” in CVPR 2010.
Figure: Hierarchical model with root-part configuration; example models for car and horse.
• The latent hierarchical model encodes the holistic object and its parts with respect to viewpoint variations
• Supports richer appearance features: HOG, color, etc.
• Fast training with the incremental concave-convex procedure (iCCCP) algorithm
• Quick model inference via a GPU (CUDA) implementation
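The slides do not give the scoring function of the hierarchical model. As a rough illustration of root-part scoring in general, the sketch below assumes a single root with independent parts, a quadratic deformation cost around each part's anchor, and an exhaustive search over part positions. All names and the cost form are hypothetical, and this is plain CPU code rather than the team's CUDA implementation:

```python
def best_part_placement(response, anchor, deform_weight):
    """Find the position maximizing appearance score minus deformation cost.

    response: 2D list of appearance scores, indexed as response[y][x]
    anchor:   (x, y) preferred position of the part relative to the root
    """
    ax, ay = anchor
    best_score, best_pos = float("-inf"), None
    for y, row in enumerate(response):
        for x, appearance in enumerate(row):
            cost = deform_weight * ((x - ax) ** 2 + (y - ay) ** 2)
            score = appearance - cost
            if score > best_score:
                best_score, best_pos = score, (x, y)
    return best_score, best_pos

def root_part_score(root_score, parts, deform_weight=0.1):
    """Total score = root score + best achievable score of every part.

    parts: list of (response, anchor) pairs, one per part
    """
    total = root_score
    placements = []
    for response, anchor in parts:
        score, pos = best_part_placement(response, anchor, deform_weight)
        total += score
        placements.append(pos)
    return total, placements
```

Each candidate position is evaluated independently of the others, which is the kind of computation that maps naturally onto parallel hardware.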
ILSVRC2013 Task 1: Detection
Team name: Delta
Members:
Che-Rung Lee, Hwann-Tzong Chen, Hao-Ping Kang,
Tzu-Wei Huang, Ci-Hong Deng, Hao-Che Kao
National Tsing Hua University
Generic Object Detector: ~15 proposals per image
ConvNet Multiclass Classifier: each proposal gets one of the (200 + background) class labels
Generic object detector: “What is an object” + salient region segmentation
0.28 mAP on the validation images (ignoring class labels)
Multiclass classifier: cuda-convnet [Krizhevsky et al.]
Training: 590,000 bounding boxes, 3 days using 2 GPUs
0.5 error rate for classifying the validation bounding boxes
Overall:
0.057 mAP on validation data, 0.06 mAP on test data
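The two-stage pipeline described above can be sketched as follows. The `propose` and `classify` callables are hypothetical placeholders standing in for the generic object detector and the ConvNet classifier, and the single `"background"` label is an assumption made for brevity:

```python
BACKGROUND = "background"  # assumed label for proposals that contain no object

def detect(image, propose, classify):
    """Run class-agnostic proposals through a multiclass classifier.

    propose(image)       -> list of candidate boxes (class-agnostic)
    classify(image, box) -> (label, confidence) for the cropped proposal
    Proposals labeled as background are discarded; the rest become detections.
    """
    detections = []
    for box in propose(image):
        label, confidence = classify(image, box)
        if label != BACKGROUND:
            detections.append((label, confidence, box))
    return detections
```

In this design the final detection quality is bounded by both stages: a missed proposal can never be recovered by the classifier, and a misclassified proposal turns a good box into a wrong detection.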
Agenda
8:30 Classification & localization
8:50
9:05
9:20
9:35
9:50
Spotlights
10:30 Detection
10:50
11:10
11:30
Spotlights
11:40
Noon Discussion panel
14:00 Invited talk by Vittorio Ferrari: Auto-annotation and self-assessment in ImageNet
14:40 Fine-Grained Challenge 2013
http://www.image-net.org/challenges/LSVRC/2013/iccv2013