Object Size ↔ Camera Viewpoint

CV Reading Group
Putting Objects in Perspective
Fujiyoshi Laboratory, Masamitsu Tsuchiya (土屋成光)
July 1, 2008
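Background
• Generic object recognition / image scene recognition
– Low resolution
– Variation in appearance
– Size variation due to depth
⇒ Local recognition methods do not work
• Humans exploit relationships between objects
– Model the 3D structure
– Make local recognition methods more accurate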
Putting Objects in Perspective
Derek Hoiem, Alexei A. Efros, Martial Hebert
Carnegie Mellon University Robotics Institute
CVPR2006
Understanding an Image
Today: Local and Independent
Detection results
Local Object Detection
Figure labels: False Detections, True Detection, Missed, Missed, True Detections
Local Detector: [Dalal-Triggs 2005]
Object Support
Surface Estimation
Figure: Image → surface labels: Support, Vertical (V-Left, V-Center, V-Right, V-Porous, V-Solid), Sky
Object: Surface? Support?
[Hoiem, Efros, Hebert ICCV 2005]
Software available online
Object Size in the Image
Image
World
Object Size ↔ Camera Viewpoint
Input Image
Loose Viewpoint Prior
Object Size ↔ Camera Viewpoint
Object Position/Sizes
Viewpoint
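The link between object position/size and viewpoint can be made concrete with a ground-plane approximation. The following is a minimal sketch, assuming a roughly level camera, objects resting on the ground, and image rows measured upward from the bottom of the image. The relation, the function name, and the example numbers are assumptions for illustration and do not appear on the slides.

```python
def world_height(image_height, bottom_row, horizon_row, camera_height):
    """Approximate real-world object height (same unit as camera_height).

    Assumes y ≈ h * y_c / (v0 - v), where h is the object's height in the
    image, v the row of its bottom edge, v0 the horizon row (rows measured
    upward from the image bottom), and y_c the camera height above ground.
    """
    gap = horizon_row - bottom_row
    if gap <= 0:
        raise ValueError("object bottom must lie below the horizon")
    return image_height * camera_height / gap


# Hypothetical example: camera 1.6 m above the ground, horizon at row 300.
# A box 85 px tall whose bottom edge is at row 220 (80 px below the horizon).
print(world_height(85, 220, 300, 1.6))
```

Read in the other direction, the same relation shows why a few confident detections of objects with known typical heights constrain the horizon and camera height.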
Effect of Surfaces and Viewpoint
Image
P(surfaces)
P(viewpoint)
P(object)
P(object | surfaces)
P(object | viewpoint)
Effect of Surfaces and Viewpoint
Image
P(object)
P(surfaces)
P(viewpoint)
P(object | surfaces, viewpoint)
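One way to see how the probabilities listed above fit together is a naive-Bayes style fusion. The sketch below assumes that surface evidence and viewpoint evidence are conditionally independent given the object, so that P(object | surfaces, viewpoint) ∝ P(object | surfaces) P(object | viewpoint) / P(object). This is an illustrative assumption, not necessarily the exact formulation of the paper, and the function name is hypothetical.

```python
def combine_evidence(p_obj, p_obj_given_surf, p_obj_given_view):
    """Fuse two conditional object probabilities for a binary object variable.

    Assumes surfaces and viewpoint are conditionally independent given the
    object, so each hypothesis scores P(o|s) * P(o|v) / P(o), normalized
    over object present / object absent.
    """
    for p in (p_obj, p_obj_given_surf, p_obj_given_view):
        if not 0.0 < p < 1.0:
            raise ValueError("probabilities must lie strictly between 0 and 1")
    present = p_obj_given_surf * p_obj_given_view / p_obj
    absent = (1 - p_obj_given_surf) * (1 - p_obj_given_view) / (1 - p_obj)
    return present / (present + absent)


# Hypothetical numbers: a weak prior, with both cues supporting the object.
print(combine_evidence(0.5, 0.8, 0.8))
```

When both cues agree with the prior, the fused value equals the prior. When both cues favor the object, the fused value is higher than either cue alone.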
Scene Parts Are All Interconnected
Objects
Camera Viewpoint
3D Surfaces
Input to Algorithm
Object Detection
Surface Estimates
Viewpoint Prior
Local Car Detector
Local Ped Detector
Local Detector: [Dalal-Triggs 2005]
Surfaces: [Hoiem-Efros-Hebert 2005]
Approximate Model
Objects
Viewpoint
3D Surfaces
Inference over Tree
Viewpoint: θ
Objects: o1 ... on, each with Local Object Evidence
Local Surfaces: s1 … sn, each with Local Surface Evidence
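Because the model is a tree rooted at the viewpoint, exact inference reduces to passing one message per object. The following is a minimal sketch under simplifying assumptions: the viewpoint θ takes a small number of discrete values, each object is a binary present/absent variable, and the local surface evidence is folded into each object's likelihood. All names and numbers are hypothetical.

```python
from math import prod


def tree_posteriors(prior_theta, p_obj_given_theta, lik_obj):
    """Exact inference on a star-shaped tree: viewpoint -> objects.

    prior_theta:       P(theta = k) for each discrete viewpoint k
    p_obj_given_theta: p_obj_given_theta[i][k] = P(o_i = 1 | theta = k)
    lik_obj:           lik_obj[i] = (P(e_i | o_i = 0), P(e_i | o_i = 1))
    Returns (P(theta | all evidence), P(o_i = 1 | all evidence)).
    """
    n, k_count = len(p_obj_given_theta), len(prior_theta)
    # Message from object i to the viewpoint: sum out o_i.
    msg = [[(1 - p_obj_given_theta[i][k]) * lik_obj[i][0]
            + p_obj_given_theta[i][k] * lik_obj[i][1]
            for k in range(k_count)] for i in range(n)]
    # Viewpoint posterior: prior times all incoming messages, normalized.
    unnorm = [prior_theta[k] * prod(msg[i][k] for i in range(n))
              for k in range(k_count)]
    z = sum(unnorm)
    post_theta = [u / z for u in unnorm]
    # Object marginals: combine the viewpoint belief that excludes the
    # object's own message with the object's local evidence.
    post_obj = []
    for i in range(n):
        on = off = 0.0
        for k in range(k_count):
            cavity = prior_theta[k] * prod(msg[j][k] for j in range(n) if j != i)
            on += cavity * p_obj_given_theta[i][k] * lik_obj[i][1]
            off += cavity * (1 - p_obj_given_theta[i][k]) * lik_obj[i][0]
        post_obj.append(on / (on + off))
    return post_theta, post_obj


# Two candidate viewpoints; both detections fit viewpoint 0 much better.
theta, objs = tree_posteriors([0.5, 0.5],
                              [[0.9, 0.1], [0.9, 0.1]],
                              [(0.2, 0.8), (0.2, 0.8)])
print(theta, objs)
```

Each object's posterior depends on the other objects only through the viewpoint, which is what keeps the computation linear in the number of objects.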
Viewpoint estimation
Figure panels: Viewpoint Prior and Viewpoint Final, each showing the likelihood over Height and Horizon
Object Identities
• Local detector
Surface Geometry
Probability map
Object detection
Car: TP / FP
Initial (Local)
Final (Global)
4 TP / 2 FP
4 TP / 1 FP
3 TP / 2 FP
4 TP / 0 FP
Ped: TP / FP
Car Detection
Ped Detection
Local Detector: [Dalal-Triggs 2005]
Experiments on LabelMe Dataset
• Testing with LabelMe dataset: 422 images
– 923 Cars at least 14 pixels tall
– 720 Peds at least 36 pixels tall
Each piece of evidence improves performance
Panels: Car Detection, Pedestrian Detection
Local Detector from [Murphy-Torralba-Freeman 2003]
Can be used with any detector that outputs confidences
Panels: Car Detection, Pedestrian Detection
Local Detector: [Dalal-Triggs 2005]
(SVM-based)
Accurate Horizon Estimation
Median Error:
Horizon Prior: 8.5%
[Murphy-Torralba-Freeman 2003]: 4.5%
[Dalal-Triggs 2005]: 3.0%
(Plot label: 90% Bound)
Qualitative Results
Car: TP / FP Ped: TP / FP
Initial: 2 TP / 3 FP
Final: 7 TP / 4 FP
Local Detector from [Murphy-Torralba-Freeman 2003]
Qualitative Results
Car: TP / FP Ped: TP / FP
Initial: 1 TP / 14 FP
Final: 3 TP / 5 FP
Local Detector from [Murphy-Torralba-Freeman 2003]
Qualitative Results
Car: TP / FP Ped: TP / FP
Initial: 1 TP / 23 FP
Final: 0 TP / 10 FP
Local Detector from [Murphy-Torralba-Freeman 2003]
Qualitative Results
Car: TP / FP Ped: TP / FP
Initial: 0 TP / 6 FP
Final: 4 TP / 3 FP
Local Detector from [Murphy-Torralba-Freeman 2003]
Geometric Context
• Estimate surfaces
ground: green, sky: blue, vertical: red, o: porous, x: solid
Geometric Cues
Color
Texture
Location
Perspective
Robust Spatial Support
RGB Pixels
Superpixels
[Felzenszwalb and Huttenlocher 2004]
• Oversegmentation
Multiple Segmentations
…
Superpixels
• A single segmentation may contain segmentation errors
• Segment the image several times with different numbers of segments
Labeling Segments
…
…
Integrate the results from each segmentation
Learn from training images
Homogeneity Likelihood
Label Likelihood
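The two learned quantities above suggest a simple way to integrate the segmentations: weight each segment's label likelihood by the likelihood that the segment is homogeneous, then sum over the segments that contain a given superpixel. The following is a minimal sketch of that weighted vote. The function name, data layout, and numbers are hypothetical.

```python
def superpixel_label_confidence(hypotheses, labels=("ground", "vertical", "sky")):
    """Combine label estimates from several segmentations for one superpixel.

    hypotheses: one (homogeneity_likelihood, {label: label_likelihood}) pair
    per segment that contains the superpixel (one segment per segmentation).
    Returns normalized label confidences.
    """
    scores = {label: 0.0 for label in labels}
    for homogeneity, label_likelihood in hypotheses:
        for label in labels:
            # A segment that is probably "mixed" contributes little.
            scores[label] += homogeneity * label_likelihood.get(label, 0.0)
    total = sum(scores.values())
    if total == 0:
        raise ValueError("no segment gives any support to any label")
    return {label: score / total for label, score in scores.items()}


# Hypothetical example: a clean segment says "ground", a doubtful one says "vertical".
confidences = superpixel_label_confidence([
    (0.9, {"ground": 0.80, "vertical": 0.15, "sky": 0.05}),
    (0.3, {"ground": 0.20, "vertical": 0.70, "sky": 0.10}),
])
print(max(confidences, key=confidences.get), confidences)
```

The doubtful segment still contributes, but its vote is discounted, so a single bad segmentation cannot dominate the final pixel labels.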
• Preparation
– Compute multiple segmentations
– Compute a label for each segment: ground, vertical, sky, or "mixed"
• Density estimation with boosted decision trees
– 8 nodes per tree
– Logistic regression version of Adaboost [Collins and Schapire and Singer 2002]
Image Labeling
Labeled Segmentations
…
Learned from
training images
Labeled Pixels
Summary & Future Work
Figure labels: Ped, Ped, Car (axes in meters)
Reasoning in 3D:
• Object to object
• Scene label
• Object segmentation
Conclusion
• Image understanding is a 3D problem
– Must be solved jointly
• This paper is a small step
– Much remains to be done
CV Reading Group
Recovering Occlusion Boundaries from a Single Image,
Closing the Loop in Scene Interpretation
Fujiyoshi Laboratory, Masamitsu Tsuchiya (土屋成光)
August 26, 2008
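Background
• Generic object recognition / image scene recognition
– Low resolution
– Variation in appearance
– Size variation due to depth
⇒ Local recognition methods do not work
• Humans exploit relationships between objects
– Model the 3D structure
– Make local recognition methods more accurate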
Recovering Occlusion Boundaries
from a Single Image
Derek Hoiem, Andrew N. Stein,
Alexei A. Efros, Martial Hebert
Carnegie Mellon University Robotics Institute
ICCV’07
Understanding Occlusion from a Single Image
• Occlusion and boundary understanding
– Essential when searching for objects
– Estimated from edge, region, and depth cues
Method Overview
1. Oversegment the image into roughly a thousand regions
Watershed with Pb soft boundaries
2. Compute region, boundary, and 3D cues
depth: horizon + junction to ground
3. Estimate the boundaries
Conditional random field (CRF)
4. Segment further using the estimated boundaries
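The depth cue in step 2 (the horizon plus the junction where a region meets the ground) can be illustrated with pinhole geometry. The following is a minimal sketch, assuming a level camera, a flat ground plane, a known focal length in pixels, and rows measured upward from the image bottom. None of these parameters are given on the slides, and the function name is hypothetical.

```python
def ground_contact_depth(contact_row, horizon_row, camera_height, focal_length_px):
    """Depth (same unit as camera_height) of a point where a region touches the ground.

    For a level pinhole camera, a ground point at depth z projects
    focal_length_px * camera_height / z rows below the horizon, so
    z = focal_length_px * camera_height / (horizon_row - contact_row).
    """
    rows_below_horizon = horizon_row - contact_row
    if rows_below_horizon <= 0:
        raise ValueError("ground contact must lie below the horizon")
    return focal_length_px * camera_height / rows_below_horizon


# Hypothetical example: 800 px focal length, camera 1.6 m above the ground.
for row in (220, 260, 290):
    print(row, ground_contact_depth(row, 300, 1.6, 800.0))
```

Contact points closer to the horizon map to larger depths, which is why the ground junction of a region gives an ordering cue for deciding which side of a boundary is the occluder.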
Results
• Boundary
• Object popout
Closing the Loop
in Scene Interpretation
Derek Hoiem, Alexei A. Efros, Martial Hebert
Carnegie Mellon University Robotics Institute
CVPR’08
Putting Objects in Perspective
Car: TP / FP
Initial (Local)
Final (Global)
4 TP / 2 FP
4 TP / 1 FP
3 TP / 2 FP
4 TP / 0 FP
Ped: TP / FP
Car Detection
Ped Detection
Local Detector: [Dalal-Triggs 2005]
Scene Parts Are All Interconnected
Objects
Camera Viewpoint
3D Surfaces
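with Occlusions
• Generic object recognition framework
Putting Objects in Perspective
• Scene structure recognition
Automatic Photo Pop-up
Using occlusion and boundary information
Relationship Model
• All components are mutually related
Application to Putting Objects
• Higher accuracy by sharing information between components
Car: Up, Ped: Down
The accuracy of boundaries around crowds is the problem
Initial: Dalal-Triggs; Iter 1: Hoiem et al.; Final: This paper
Application to Photo Pop-up
• Higher accuracy by using occlusion and object information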
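Summary
• Estimating occlusions/boundaries
– Estimated from a single image using geometry, depth, and other cues
– High-accuracy segmentation
• Using occlusions/boundaries
– Reduces errors caused by segmentation
– Useful for generic object recognition
• Open issue:
– Improving the accuracy of boundaries obtained from crowds and similar scenes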
