Transcript slides
Large Scale Visual Recognition Challenge (ILSVRC) 2013: Detection spotlights

Toronto A team

Xiaolong Wang ([email protected])
Intelligent Media Computing Laboratory
http://vision.sysu.edu.cn/

Approach for Detection Task:
• The basic method is based on deformable part models (DPM). I reimplemented the release 4.0 code in C++ and extended it to an MPI version. I trained the models on a distributed system with 200 computers. (Training 200 DPMs takes 30 hours.)
• A convolutional neural network (CNN) is trained to classify the 200 categories in the detection dataset. I used the classification results as context information to rescore the detection results from DPM.

Results:
• DPM without context rescoring, mAP: 7.55%
• DPM with CNN context rescoring, mAP: 10.45% (an improvement of around 3 percentage points)
• Three categories won: 69: flute; 116: nail; 151: saxophone

Thanks to Baidu, Inc. for providing the computing resources.

ICCV’2013 Sydney, Australia
ILSVRC 2013 Spotlight

Latent Hierarchical Model with GPU Inference for Object Detection
Yukun Zhu, Jun Zhu, Alan Yuille
UCLA Computer Vision Lab

Thanks to L. Zhu, Y. Chen, A. Yuille and W. Freeman for the work “Latent hierarchical structural learning for object detection” in CVPR 2010.

[Figure: Hierarchical Model; Model for Car; Root-Part Configuration; Model for Horse]

• The latent hierarchical model encodes the holistic object and its parts w.r.t. viewpoint variations
• Supports richer appearance features: HOG, color, etc.
• Fast training with the incremental concave-convex procedure (iCCCP) algorithm
• Quick model inference via a GPU (CUDA) implementation

[1] P. Felzenszwalb, D. McAllester, D. Ramanan, “A discriminatively trained, multiscale, deformable part model,” IEEE Conference on Computer Vision and Pattern Recognition (CVPR), 2008, pp. 1-8.
[2] P. F. Felzenszwalb, R. B. Girshick, D. McAllester, “Cascade object detection with deformable part models,” IEEE Conference on Computer Vision and Pattern Recognition (CVPR), 2010, pp. 2241-2248.

ILSVRC2013 Task 1: Detection
Team name: Delta
Members: Che-Rung Lee, Hwann-Tzong Chen, Hao-Ping Kang, Tzu-Wei Huang, Ci-Hong Deng, Hao-Che Kao
National Tsing Hua University

Generic Object Detector: ~15 proposals per image
ConvNet Multiclass Classifier: each proposal gets one of the (200 + background) class labels

Generic object detector: “What is an object” + salient region segmentation
0.28 mAP on the validation images (ignoring class labels)

Multiclass classifier: cuda-convnet [Krizhevsky et al.]
Training: 590,000 bounding boxes, 3 days using 2 GPUs
0.5 error rate for classifying the validation bounding boxes

Overall: 0.057 mAP on validation data, 0.06 mAP on test data

Agenda
8:30 Classification & localization
8:50
9:05
9:20
9:35
9:50 Spotlights
10:30 Detection
10:50
11:10
11:30 Spotlights
11:40
Noon Discussion panel
14:00 Invited talk by Vittorio Ferrari: Auto-annotation and self-assessment in ImageNet
14:40 Fine-Grained Challenge 2013

http://www.image-net.org/challenges/LSVRC/2013/iccv2013
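
Note on context rescoring: the first detection spotlight says that CNN classification results were used as context to rescore DPM detections, but the slides do not give the formula. The following is only a minimal sketch of the general idea, under stated assumptions: the Detection type, the rescore_with_context function, and the combination rule (squashed detector score multiplied by the image-level class probability) are all hypothetical and are not taken from the slides.

```python
import math
from dataclasses import dataclass


@dataclass
class Detection:
    category: int   # class index in the detection dataset
    score: float    # raw detector score (any real number)
    box: tuple      # (x1, y1, x2, y2) in pixels


def sigmoid(x: float) -> float:
    """Map a raw detector score to the range (0, 1)."""
    return 1.0 / (1.0 + math.exp(-x))


def rescore_with_context(detections, class_probs):
    """Rescore detections with image-level class probabilities.

    ASSUMPTION: the rule below (sigmoid(score) * class probability) is an
    illustrative choice, not the method used by the team.
    `class_probs` maps a class index to the classifier's probability that
    the class appears anywhere in the image; missing classes count as 0.
    """
    rescored = []
    for det in detections:
        context = class_probs.get(det.category, 0.0)
        new_score = sigmoid(det.score) * context
        rescored.append(Detection(det.category, new_score, det.box))
    # Highest rescored detection first.
    return sorted(rescored, key=lambda d: d.score, reverse=True)


if __name__ == "__main__":
    detections = [
        Detection(category=69, score=0.2, box=(10, 10, 50, 80)),
        Detection(category=116, score=0.5, box=(30, 20, 90, 60)),
    ]
    # Hypothetical classifier output for one image.
    class_probs = {69: 0.9, 116: 0.1}
    for det in rescore_with_context(detections, class_probs):
        print(det.category, round(det.score, 3))
```

In this toy input the detection for category 116 has the higher raw score, but the classifier considers that class unlikely in the image, so the category 69 detection ranks first after rescoring. That is the kind of reordering a context signal is meant to produce.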