Evaluating AAM Fitting Methods for Facial Expression

Akshay Asthana, Jason Saragih, Michael Wagner and
Roland Göcke
ANU, CMU & U Canberra
In part funded by ARC grant TS0669874
Background
 Thinking Head project
 http://thinkinghead.edu.au/
 5-year multi-institution
(Canberra, UWS, Macquarie,
Flinders) project in Australia
 Develop a research platform for
human communication sciences
 “An Approach for Automatically
Measuring Facial Activity in
Depressed Subjects”, McIntyre, Göcke,
Hyett, Green, Breakspear, ACII 2009
Aim for this Study
 Active Appearance Models (AAM) have
become a popular tool for markerless
face tracking in recent years
 A number of different AAM fitting
methods exist
 Which one should we use?
 We wanted to evaluate these in the context of facial
expression recognition (FER)
 How well do AAMs generalise?
 How robust are these methods w.r.t. initialisation error?
 How does their fitting accuracy affect the FER accuracy?
AAM
 Shape:
 Texture:
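The formulas that accompanied these two bullets are not present in the transcript. As a hedged reminder only, the usual linear AAM formulation (as used in the Baker and Matthews line of work cited later) writes shape and texture as a mean plus a weighted sum of learnt modes. The symbols below are assumed notation, not taken from the slides:

```latex
s = s_0 + \sum_{i=1}^{n} p_i \, s_i ,
\qquad
A(\mathbf{x}) = A_0(\mathbf{x}) + \sum_{j=1}^{m} \lambda_j \, A_j(\mathbf{x})
```

Here $s_0$ and $A_0$ are the mean shape and mean texture, $s_i$ and $A_j$ are the modes of variation (typically obtained by PCA), and $p_i$, $\lambda_j$ are the model parameters that a fitting method has to estimate.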
AAM – Shape Variation
 Shape variation
Mean
AAM – Texture Variation
 Texture variation
Mean
AAM – Modelling Appearance
 Appearance = Shape + Texture
Mean
AAM (cont.)
 Alignment based on finding model parameters that
iteratively fit learnt model to the image
[Figure: fitting progress at initialisation, after 5 iterations, and at convergence]
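To illustrate what "iteratively fit" means, here is a minimal sketch of a one-parameter Gauss-Newton alignment on a toy 1-D signal. It is not any of the AAM fitting methods compared in this study; the functions `render` and `fit`, the Gaussian "model" and all constants are hypothetical and chosen only to show the loop of residual, update and convergence check.

```python
import math

def render(shift, n=41, sigma=4.0):
    """Toy 1-D 'model instance': a Gaussian bump centred at `shift`."""
    return [math.exp(-((x - shift) ** 2) / (2 * sigma ** 2)) for x in range(n)]

def fit(image, shift0, iters=20, eps=1e-3, tol=1e-6):
    """Gauss-Newton on a single parameter, using a numerical Jacobian."""
    shift = shift0
    history = [shift]
    for _ in range(iters):
        model = render(shift)
        # Residual between the current model instance and the image.
        residual = [m - i for m, i in zip(model, image)]
        # Forward-difference derivative of the model w.r.t. the parameter.
        bumped = render(shift + eps)
        jac = [(b - m) / eps for b, m in zip(bumped, model)]
        jtj = sum(j * j for j in jac)
        jtr = sum(j * r for j, r in zip(jac, residual))
        step = jtr / jtj
        shift -= step
        history.append(shift)
        if abs(step) < tol:
            break
    return shift, history

# The "image" is the model rendered at shift 20; start 5 samples off.
target = render(20.0)
estimate, history = fit(target, shift0=15.0)
```

Starting further from the true shift makes the first steps less reliable, which is the toy analogue of the initialisation-error problem examined later in the talk.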
AAM Fitting Methods Compared in this Study
 Fixed Jacobian (FJ): Cootes, Edwards & Taylor, 1998
 Project-Out Inverse Compositional (POIC): Baker &
Matthews, 2001
 Simultaneous Inverse Compositional (SIC): Baker, Gross &
Matthews, 2003
 Robust Inverse Compositional (RIC): Gross, Matthews &
Baker, 2005
 Iterative Error-Bound Minimisation (IEBM), aka
Linear Discriminative-Iterative: Saragih & Goecke, 2006
 Haar-like Feature Based Iterative-Discriminative
Method (HFBID): Saragih & Goecke, 2007
System Overview
Experiments
 (1) Generalisation, (2) Robustness to initialisation error
 Person-dependent models (PDFER): individual models
 Person-independent models (PIFER): general models
 POIC was excluded here, as it has previously been shown
not to generalise well across different people
 Cohn-Kanade database:
 Subset of 30 subjects (15f / 15m)
 Total of 3424 images: 992 Neutral, 448 Anger, 296 Disgust,
346 Fear, 532 Joy, 423 Sorrow and 387 Surprise
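As a quick arithmetic check, the per-class counts can be summed and turned into class shares. The snippet below uses only the numbers quoted on the slide; the variable names are illustrative.

```python
# Image counts per expression class, as listed on the slide.
counts = {
    "Neutral": 992,
    "Anger": 448,
    "Disgust": 296,
    "Fear": 346,
    "Joy": 532,
    "Sorrow": 423,
    "Surprise": 387,
}

total = sum(counts.values())  # 3424, matching the stated total

# Share of each class in percent, rounded to one decimal place.
shares = {name: round(100 * n / total, 1) for name, n in counts.items()}

# Accuracy of a trivial classifier that always predicts the largest class.
majority_baseline = max(counts.values()) / total
```

The classes are unbalanced: Neutral alone accounts for roughly 29% of the images, so that figure is a useful floor when reading 7-class recognition rates.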
Initialisation
 Traditionally, besides generalisation, one of the most
challenging problems for AAMs has been robustness
to initialisation error
 Common face detectors, e.g. Viola-Jones, often give
a translation error of up to 30 pixels
 We simulate this by deliberately misaligning the initial
AAM: ±5, ±10, ±20, ±25 (PIFER) / ±30 pixels (PDFER)
 Multi-class SVM using a linear kernel for PDFER and a
Radial Basis Function kernel for PIFER
 Classify expressions as Neutral or one of the ‘Big 6’
(7-class problem)
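The misalignment experiment can be sketched as follows. This is a minimal illustration under stated assumptions: the slide does not say in which directions the offsets were applied, so the four axis-aligned shifts, the helper `misaligned_inits`, and the reading of ±25 and ±30 as the PIFER and PDFER maxima are all assumptions, and the landmark coordinates are made up.

```python
# Offset magnitudes in pixels, read from the slide; treating 25 and 30
# as the respective maxima for PIFER and PDFER is an interpretation.
OFFSETS_PIFER = [5, 10, 20, 25]
OFFSETS_PDFER = [5, 10, 20, 30]

def misaligned_inits(shape, magnitude):
    """Return copies of `shape` translated by `magnitude` pixels
    left, right, up and down (assumed directions)."""
    directions = [(1, 0), (-1, 0), (0, 1), (0, -1)]
    return {
        (dx * magnitude, dy * magnitude): [
            (x + dx * magnitude, y + dy * magnitude) for x, y in shape
        ]
        for dx, dy in directions
    }

# Made-up three-landmark shape standing in for the initial AAM shape.
mean_shape = [(100.0, 120.0), (140.0, 120.0), (120.0, 160.0)]
starts = misaligned_inits(mean_shape, OFFSETS_PIFER[1])
```

Each entry of `starts` is one perturbed starting shape from which a fitting method would be run, so that accuracy can be compared across offset magnitudes.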
Facial Expression Recognition
 In this study, we were interested in recognising the ‘Big
6’ + Neutral expressions
 Since most vision-based expression recognition systems
rely on changes in appearance, we grouped Action Units
(AUs) together on a ‘regional basis’
 In that way, we did not have to recognise individual
AUs but analysed movement patterns in various facial
regions, which made the FER process more robust
FER (2)
FER Results - Video
Ground truth
“Unstable”
Stable
Results – Person-dependent Models
“Unstable”
Stable
Results – Person-independent Models
Conclusions
 Investigated the utility of different AAM fitting
algorithms in the context of real-time FER
 The Iterative-Discriminative (ID) approach adopted in
IEBM and HFBID boosts the fitting performance
significantly and thus leads to improved FER results
 More robust to initialisation error than other methods
 IEBM and HFBID generalise well
 Rapid fitting (real-time capable), approximately as fast as POIC
 Future work:
 Pose-invariant FER