Face Alignment with Part-Based
Modeling
Vahid Kazemi
Josephine Sullivan
CVAP
KTH Institute of Technology
Objective: Face Alignment
• Find the correspondences between landmarks of a
template face model and the target face.
Annotated images (source: IMM dataset)
Test image (source: YouTube)
Why: Possible Applications
• The outcome can be used for:
- Motion Capture: by determining head pose and facial
expressions.
- Face Recognition: by comparing registered facial features
with a database.
- 3D Reconstruction: by determining camera parameters using
correspondences in an image sequence
- Etc.
Global Methods
• Overview:
- Create a constrained generative template model
- Start with a rough estimate of face position.
- Refine the template to match the target face.
• Properties:
- Model deformations more precisely
- Arbitrary number of landmarks
• Examples:
- Active Shape Models [Cootes 95]
- Active Appearance Model [Cootes 98]
- 3D Morphable Models [Blanz 99]
Part-Based Methods
• Overview:
- Train different classifiers for each part.
- Learn constraints on relative positions of parts.
• Properties:
- More robust to partial occlusion
- Better generalization ability
- Sparse results
• Examples:
- Elastic Bunch Graph Matching [Wiskott 97]
- Pictorial Structures [Felzenszwalb 2003]
Our approach to face alignment
• How can we avoid the drawbacks of existing models?
Our approach to face alignment
• Find the mapping, q, from appearance to the landmark
positions:
• But q is complex and non-linear…
Linearizing the model
• Use piece-wise linear functions
qi

Linearizing the model
• Use a part-based model
qi

Linearizing the model
• Use a suitable feature descriptor
Feature Descriptor
Part Selection Criteria
• Detect the parts accurately and reliably
- Contain strong features
• Ensure a simple (linear) model
- Minimum variation
• Capture the global appearance
- Cover the whole object
Part Selection for the face
We chose the nose, eyes, and mouth as good candidates
Image from IMM dataset
Appearance descriptor
• Variation of PHOG descriptor
- Divide the patch into 8 sub-regions
- Recursively repeat for square regions
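As a rough illustration of the pyramid idea, here is a minimal sketch of a standard PHOG-style descriptor. It uses a plain quadtree split (1, 4, 16 cells) as an assumption; the 8-sub-region variation mentioned on the slide is not reproduced, and the names `pyramid_hog`, `levels`, and `n_bins` are hypothetical.

```python
import numpy as np

def orientation_histogram(mag, ang, n_bins=8):
    """Magnitude-weighted histogram of unsigned gradient orientations."""
    hist, _ = np.histogram(ang, bins=n_bins, range=(0.0, np.pi), weights=mag)
    return hist

def pyramid_hog(patch, levels=2, n_bins=8):
    """Concatenate orientation histograms over a quadtree pyramid.

    Assumption: a plain 2^l x 2^l grid per level, not the slide's
    8-sub-region layout.
    """
    patch = np.asarray(patch, dtype=float)
    gy, gx = np.gradient(patch)                # derivatives along rows, columns
    mag = np.hypot(gx, gy)                     # gradient magnitude
    ang = np.mod(np.arctan2(gy, gx), np.pi)    # fold orientation into [0, pi]
    feats = []
    for level in range(levels + 1):
        cells = 2 ** level
        row_blocks = np.array_split(np.arange(patch.shape[0]), cells)
        col_blocks = np.array_split(np.arange(patch.shape[1]), cells)
        for rows in row_blocks:
            for cols in col_blocks:
                block = np.ix_(rows, cols)
                feats.append(orientation_histogram(mag[block], ang[block], n_bins))
    f = np.concatenate(feats)
    norm = np.linalg.norm(f)
    return f / norm if norm > 0 else f         # L2-normalise the descriptor
```

With `levels=2` and `n_bins=8` the descriptor has (1 + 4 + 16) x 8 = 168 entries.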
Part detection
• Build a tree-structured model of the face, with the
nose at the root and the eyes and mouth as the leaves
of the tree.
Part detection
• Detect the parts by sliding a window over the image and
calculating the Mahalanobis distance of each patch
from the mean model
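The sliding-window step can be sketched as follows. This is only an illustration under stated assumptions: `describe` stands for any feature extractor (the slides use a PHOG variation), and `mean` and `cov_inv` are assumed to have been estimated beforehand from training patches of the part. All names are hypothetical.

```python
import numpy as np

def mahalanobis_cost_map(image, mean, cov_inv, patch_size, describe):
    """Mahalanobis distance of every patch descriptor from the mean model.

    image      : 2-D grey-level array
    mean       : mean descriptor of the part (from training data)
    cov_inv    : inverse covariance of the training descriptors
    patch_size : (height, width) of the sliding window
    describe   : function mapping a patch to a 1-D descriptor (assumed)
    """
    h, w = patch_size
    H, W = image.shape
    costs = np.empty((H - h + 1, W - w + 1))
    for r in range(H - h + 1):
        for c in range(W - w + 1):
            d = describe(image[r:r + h, c:c + w]) - mean
            # Clamp at zero to guard against tiny negative round-off.
            costs[r, c] = np.sqrt(max(d @ cov_inv @ d, 0.0))
    return costs
```

The position with the lowest cost is the best match for that part on its own; the full model combines these maps with the constraints between parts.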
Part detection
• Find the optimal solution by minimizing the pictorial
structure cost function:
• We can solve this efficiently with the generalized
distance transform [Felzenszwalb 2003], provided the form
of the cost function is restricted
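In the pictorial structures framework the cost is usually a sum of a per-part appearance cost and a deformation cost for each edge of the tree. If the deformation cost is assumed to be a squared distance, the best child position for every parent position can be found in linear time. The sketch below shows the one-dimensional generalized distance transform under that assumption; the function name is hypothetical, and a 2-D version would apply it along rows and then columns.

```python
import numpy as np

def distance_transform_1d(f):
    """Return d[p] = min_q (f[q] + (p - q)**2) for all p in O(n).

    f holds the appearance cost of a part at each position; the squared
    distance plays the role of the deformation cost (an assumption).
    """
    f = np.asarray(f, dtype=float)
    n = len(f)
    d = np.empty(n)
    v = np.zeros(n, dtype=int)       # positions of parabolas in the lower envelope
    z = np.empty(n + 1)              # boundaries between neighbouring parabolas
    k = 0
    z[0], z[1] = -np.inf, np.inf
    for q in range(1, n):
        while True:
            # Intersection of the parabola rooted at q with the one at v[k].
            s = ((f[q] + q * q) - (f[v[k]] + v[k] * v[k])) / (2 * q - 2 * v[k])
            if s <= z[k]:
                k -= 1               # parabola v[k] is hidden, drop it
            else:
                break
        k += 1
        v[k] = q
        z[k] = s
        z[k + 1] = np.inf
    k = 0
    for p in range(n):
        while z[k + 1] < p:
            k += 1
        d[p] = (p - v[k]) ** 2 + f[v[k]]
    return d
```

For a tree with the nose at the root, each leaf's transformed cost map would be added to the root's own cost map, and the minimum of the sum gives the root position.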
Regression
• Model the mapping between the patch’s appearance
feature (f) and its landmark positions (x) as a linear
function:
• Estimate the weights from the training set using ridge
regression
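A minimal sketch of the regression step, assuming the linear function has the form x = Wᵀf with no separate bias term (a constant 1 can be appended to f if an offset is wanted). The names `fit_ridge`, `predict`, and the regularisation weight `lam` are hypothetical.

```python
import numpy as np

def fit_ridge(F, X, lam=1.0):
    """Closed-form ridge regression.

    F   : (n_samples, n_features) appearance features, one row per patch
    X   : (n_samples, n_outputs) stacked landmark coordinates
    lam : regularisation strength (assumed hyperparameter)
    Solves (F^T F + lam * I) W = F^T X for W.
    """
    n_features = F.shape[1]
    A = F.T @ F + lam * np.eye(n_features)
    return np.linalg.solve(A, F.T @ X)

def predict(W, f):
    """Predict landmark coordinates from one feature vector or a batch."""
    return f @ W
```

The penalty keeps the system well conditioned when the descriptor has more dimensions than there are training patches.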
Regression
• Comparison of different regression methods
Robustify the regression function
• Why
- Compensate for bad part detection
- Deformable parts don’t exactly fit in a box
• How
- Extend the training set by adding noise to part positions
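One way to read this augmentation is sketched below, as an assumption rather than the exact procedure used: each annotated part centre is perturbed with Gaussian noise, and the regression targets are re-expressed relative to the perturbed centre. The function name, `n_copies`, and `sigma` are hypothetical.

```python
import numpy as np

def jitter_part_boxes(centers, landmarks, n_copies=5, sigma=2.0, seed=0):
    """Create noisy copies of part positions for training.

    centers   : (n, 2) annotated part centres in pixels
    landmarks : (n, k, 2) absolute landmark coordinates for each part
    Returns the perturbed centres and the landmark targets expressed
    relative to each perturbed centre.
    """
    rng = np.random.default_rng(seed)
    centers = np.repeat(np.asarray(centers, dtype=float), n_copies, axis=0)
    landmarks = np.repeat(np.asarray(landmarks, dtype=float), n_copies, axis=0)
    noisy = centers + rng.normal(scale=sigma, size=centers.shape)
    targets = landmarks - noisy[:, None, :]   # offsets from the shifted box
    return noisy, targets
```

The appearance feature would then be re-extracted at each perturbed centre, so the regressor learns to recover the landmarks even when the detected box is slightly off.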
Experiments
• Use 240 face images from the IMM dataset.
• The dataset contains still images of 40 individual subjects
with various facial expressions under the same lighting
settings
• 58 landmarks are used to represent the shape of each subject's face
Results
• Comparison of the localization accuracy of our algorithm
with some existing methods on the IMM dataset.
* Mean error is the mean Euclidean distance, in pixels, between the
predicted and ground-truth landmark locations
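The error measure defined above can be computed as in this short sketch; the function name is hypothetical, and the arrays are assumed to hold (x, y) pixel coordinates in their last axis.

```python
import numpy as np

def mean_landmark_error(pred, truth):
    """Mean Euclidean distance in pixels between predicted and true landmarks.

    pred, truth : arrays of shape (..., n_landmarks, 2)
    """
    pred = np.asarray(pred, dtype=float)
    truth = np.asarray(truth, dtype=float)
    return float(np.linalg.norm(pred - truth, axis=-1).mean())
```

Averaging over all landmarks and images gives a single number per method.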
Results
• The results of cross-validation on the IMM dataset
Predicted
Ground truth
Demo
More videos: http://www.csc.kth.se/~vahidk/face/
Conclusion and future work
• Part-based models can be used to simplify complicated
models
• The choice of parts is very important
• HOG descriptors are not fully descriptive
• Questions?