oral - Oxford Brookes University

Hierarchical Part-Based Human Body Pose Estimation
* Ramanan Navaratnam
* Arasanathan Thayananthan
† Prof. Phil Torr
* Prof. Roberto Cipolla
* University of Cambridge
† Oxford Brookes University
1
Introduction
Input
Output
3
Overview
1. Motivation
2. Hierarchical parts
3. Template search
4. Pose estimation in a single frame
5. Temporal smoothing
6. Summary & Future work
4
Motivation

‘Real-time Object Detection for Smart Vehicles’
– D. M. Gavrila & V. Philomin (ICCV 1999)

‘Filtering using a tree-based estimator’
– Stenger et al. (ICCV 2003)

The number of templates grows exponentially with the number of pose dimensions
9
Motivation


‘Pictorial Structures for Object Recognition’
– P. Felzenszwalb & D. Huttenlocher (IJCV 2005)
‘Human upper body pose estimation in static images’
– M.W. Lee & I. Cohen (ECCV 2004)


Part-based approach
Assembling parts together is complex
11
Motivation

‘Automatic Annotation of Everyday Movements’
– D. Ramanan & D. A. Forsyth (NIPS 2003)

‘3-D model-based tracking of humans in action: a multi-view approach’
– D. M. Gavrila & L. S. Davis (CVPR 1996)

‘State space decomposition’
13
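The scaling argument behind state space decomposition can be made concrete with a small count. The sketch below compares the number of templates needed when all pose dimensions are discretised jointly with the number needed when each part is discretised on its own. The part dimensions and the step count are illustrative assumptions, not figures from the talk.

```python
def joint_template_count(steps_per_dim, dims_per_part):
    """Templates needed if all dimensions are discretised jointly."""
    return steps_per_dim ** sum(dims_per_part)


def decomposed_template_count(steps_per_dim, dims_per_part):
    """Templates needed if each part is discretised separately."""
    return sum(steps_per_dim ** d for d in dims_per_part)


# Hypothetical setup: 10 steps per dimension and three parts with
# 3, 2 and 2 degrees of freedom (assumed values, for illustration only).
dims = [3, 2, 2]
joint = joint_template_count(10, dims)       # 10**7 templates
split = decomposed_template_count(10, dims)  # 10**3 + 10**2 + 10**2 templates
print(joint, split)
```

Under these assumed numbers the joint search needs 10,000,000 templates while the decomposed search needs 1,200, which is the gap that motivates searching part by part.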
Hierarchical Parts
Conditional prior
18
Hierarchical Parts
Head and torso
Upper arm
Lower arm
False positive
19
Hierarchical Parts
Detection threshold = 0.81
Part              Detections
Head and torso    56        61
Lower arm         13 199    44 993
21
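One way to read the hierarchy is that a child part is only accepted where a parent detection makes it plausible, which prunes many of the false positives that a part detector produces on its own. The sketch below is a minimal illustration of that idea and not the authors' implementation. The function names, the box-shaped prior, the toy coordinates and the convention that a higher score is better are all assumptions; only the threshold value 0.81 comes from the slide.

```python
def hierarchical_search(parent_detections, child_candidates, allowed, score, threshold):
    """Pair each parent detection with the child candidates that are
    compatible with it and whose template score reaches the threshold."""
    results = []
    for parent in parent_detections:
        for child in child_candidates:
            if not allowed(parent, child):
                continue  # the conditional prior acts as a hard gate here (assumption)
            if score(child) >= threshold:
                results.append((parent, child))
    return results


# Toy data: one head-and-torso detection and three arm candidates (x, y).
torsos = [(50, 40)]
arms = [(60, 45), (200, 45), (55, 50)]
arm_scores = {(60, 45): 0.90, (200, 45): 0.95, (55, 50): 0.50}


def near_parent(parent, child):
    """Hypothetical prior: the child must lie within 20 pixels of the parent."""
    return abs(parent[0] - child[0]) <= 20 and abs(parent[1] - child[1]) <= 20


matches = hierarchical_search(torsos, arms, near_parent, arm_scores.get, 0.81)
print(matches)
```

In this toy run the candidate at (200, 45) scores well but is rejected because it is far from the torso, and the candidate at (55, 50) is rejected by the threshold, so only one pairing survives.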
Template Search
Features


Chamfer distance
Appearance
25
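For readers unfamiliar with the chamfer distance, the sketch below shows its plain form: the mean distance from each template edge point to the nearest edge point in the image. The talk does not say which variant is used (truncated, oriented, and so on), so this untruncated version and the toy point sets are assumptions for illustration.

```python
from math import hypot


def chamfer_distance(template_points, edge_points):
    """Mean distance from each template point to its nearest image edge point."""
    if not template_points or not edge_points:
        raise ValueError("both point sets must be non-empty")
    total = 0.0
    for tx, ty in template_points:
        total += min(hypot(tx - ex, ty - ey) for ex, ey in edge_points)
    return total / len(template_points)


# Toy example: a vertical line template and an image edge one pixel to its right.
template = [(0, 0), (0, 1), (0, 2)]
edges = [(1, 0), (1, 1), (1, 2), (5, 5)]
shifted = [(x + 1, y) for x, y in template]

print(chamfer_distance(template, edges))  # 1.0
print(chamfer_distance(shifted, edges))   # 0.0
```

This brute-force version is quadratic in the number of points. In practice the nearest-edge distances are usually precomputed once per image with a distance transform, so that scoring a template reduces to a table lookup per template point.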
Template Search
Learning Appearance


Match the ‘T’ pose using edge likelihood alone in the initial frames
Update 3D histograms in RGB space that approximate P(RGB | part) and P(RGB)
34
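The two histograms can be sketched as follows. The class below keeps a quantised 3D RGB histogram that can be updated frame by frame, one instance for pixels inside a matched part and one for the whole image. The bin count, the add-one smoothing and the use of the ratio P(RGB | part) / P(RGB) as an appearance score are assumptions; the slide only states that the histograms approximate these two distributions.

```python
class ColourHistogram:
    """3D histogram over quantised RGB values (8-bit channels assumed)."""

    def __init__(self, bins_per_channel=8):
        self.bins = bins_per_channel
        self.counts = {}
        self.total = 0

    def _bin(self, rgb):
        step = 256 // self.bins
        # Clamp so that bin counts that do not divide 256 stay in range.
        return tuple(min(c // step, self.bins - 1) for c in rgb)

    def update(self, pixels):
        """Add the given (r, g, b) pixels to the histogram."""
        for rgb in pixels:
            key = self._bin(rgb)
            self.counts[key] = self.counts.get(key, 0) + 1
            self.total += 1

    def probability(self, rgb):
        # Add-one smoothing keeps unseen colours from getting zero probability.
        n_cells = self.bins ** 3
        return (self.counts.get(self._bin(rgb), 0) + 1) / (self.total + n_cells)


def appearance_ratio(rgb, part_hist, image_hist):
    """P(RGB | part) / P(RGB); values above 1 favour the part (assumed score)."""
    return part_hist.probability(rgb) / image_hist.probability(rgb)


# Toy data: 10 part pixels of one colour and 90 background pixels of another.
part_pixels = [(200, 150, 120)] * 10
background_pixels = [(20, 20, 20)] * 90

part_hist = ColourHistogram()
image_hist = ColourHistogram()
part_hist.update(part_pixels)
image_hist.update(part_pixels + background_pixels)

print(appearance_ratio((200, 150, 120), part_hist, image_hist))
print(appearance_ratio((20, 20, 20), part_hist, image_hist))
```

In this toy run the part colour gets a ratio above 1 and the background colour a ratio below 1, so the ratio separates the two.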
Pose Estimation in a Single Frame
35
Temporal Smoothing
HMM
Viterbi backtracking
40
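Viterbi backtracking over an HMM can be sketched as below. The function returns the most likely state sequence given log prior, log transition and per-frame log observation scores. Treating each state as a pose candidate and the toy probabilities used here are assumptions, since the slides do not specify the state space or the transition model.

```python
from math import log


def viterbi(log_prior, log_trans, log_obs):
    """Most likely state sequence of an HMM.

    log_prior[s]     log probability of starting in state s
    log_trans[r][s]  log probability of moving from state r to state s
    log_obs[t][s]    log likelihood of the observation at time t in state s
    """
    n_states = len(log_prior)
    score = [log_prior[s] + log_obs[0][s] for s in range(n_states)]
    back = []  # back[t - 1][s] is the best predecessor of state s at time t
    for t in range(1, len(log_obs)):
        prev, score, pointers = score, [], []
        for s in range(n_states):
            best = max(range(n_states), key=lambda r: prev[r] + log_trans[r][s])
            pointers.append(best)
            score.append(prev[best] + log_trans[best][s] + log_obs[t][s])
        back.append(pointers)
    # Backtrack from the best final state to recover the full path.
    state = max(range(n_states), key=lambda s: score[s])
    path = [state]
    for pointers in reversed(back):
        state = pointers[state]
        path.append(state)
    return path[::-1]


# Toy example: two states, three frames. Frame 1 on its own prefers state 1,
# but the transition model favours staying in the same state.
prior = [log(0.5), log(0.5)]
trans = [[log(0.9), log(0.1)], [log(0.1), log(0.9)]]
obs = [[log(0.8), log(0.2)], [log(0.4), log(0.6)], [log(0.8), log(0.2)]]

print(viterbi(prior, trans, obs))  # [0, 0, 0]
```

The middle frame shows the smoothing effect: its observation alone would pick state 1, but the sequence as a whole is better explained by staying in state 0.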
Summary & Future work
Summary
Real-time process (unoptimized code runs at 1 Hz on a 2.4 GHz machine with 1 GB RAM)
3D pose
Automatic initialisation and recovery from failure
Future work
Extend robustness to illumination changes
Non-fronto-parallel poses
Poses when arms are inside the body silhouette
Simple gesture recognition by assigning semantics to regions of articulation space
44