Lior Wolf and Noga Levy
The SVM-minus Similarity Score for Video Face Recognition
24.06.2013
Makarand Tapaswi
CVPR Reading Group @ VGG
1
Same / Not Same ?
2
One liner
“How similar is the face in one video sequence
to the other, where the similarity is uncorrelated
with pose-induced similarity”
• illumination, expression, image quality, pose
• the classifier should
– discriminate positive from negative pairs, AND
– be uncorrelated with the additional (privileged) feature set
3
Basic Notation
• 𝑋1
• 𝑋2
• 𝐵: the background
(negative) set
5
Matched Background Similarity
[Figure, same person: sequences X1 and X2, background set B, and subsets B1 and B2]
8
MBGS
[Figure, different persons: sequences X1 and X2, background set B, and subsets B1 and B2]
9
MBGS
10
SVM-minus Classifier
• Inputs
– Training set {𝑥𝑖}
– Privileged information {𝑥𝑖′}
– Labels 𝑦𝑖
+ + + + + - - - - -
11
SVM-minus classifier
• 𝑀′ = train(𝑋′, 𝑦)
• 𝑐 = test(𝑋′, 𝑀′): signed distance to the hyperplane
• Decorrelate 𝑐 from 𝑤ᵀ𝑋
12
SVM-minus Classifier (2)
• Split
– 𝑋 into 𝑋𝑝 (𝑦𝑖 = +1) and 𝑋𝑛 (𝑦𝑖 = −1)
– 𝑐 into 𝑐𝑝 and 𝑐𝑛
– necessary since the two classifiers are correlated over the full set (both are trained to predict 𝑦)
• Normalization
– each feature dimension of 𝑋𝑝 and 𝑋𝑛 to zero mean
– 𝑐𝑝 and 𝑐𝑛 to zero mean and unit standard deviation, 𝜎(𝑐𝑝) = 𝜎(𝑐𝑛) = 1
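The split and normalization steps can be sketched as follows. This is a minimal NumPy illustration, not the authors' code: it assumes 𝑋 is stored as a (d, n) array with one column per sample, and the function name `split_and_normalize` and the random toy data are hypothetical.

```python
import numpy as np

def split_and_normalize(X, c, y):
    """X: (d, n) features, c: (n,) privileged scores, y: (n,) labels in {+1, -1}."""
    out = {}
    for name, mask in (("p", y == 1), ("n", y == -1)):
        X_part = X[:, mask]
        c_part = c[mask]
        # Zero-mean every feature dimension within the class.
        X_part = X_part - X_part.mean(axis=1, keepdims=True)
        # Zero-mean, unit-standard-deviation privileged scores within the class.
        c_part = (c_part - c_part.mean()) / c_part.std()
        out["X_" + name] = X_part
        out["c_" + name] = c_part
    return out

# Toy data (random, for illustration only).
rng = np.random.default_rng(0)
X = rng.normal(size=(5, 40))
c = rng.normal(size=40)
y = np.where(np.arange(40) < 20, 1, -1)
parts = split_and_normalize(X, c, y)
```

The statistics are computed separately inside each class, so any correlation measured afterwards is a within-class correlation.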
13
SVM– loss function
• Pearson correlation coefficient
$P(w^T X_p, c_p) = \frac{w^T X_p c_p}{\sigma(w^T X_p)}$
• Convexity: ignore the denominator and square the numerator
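A quick numerical check of the Pearson expression on this slide, assuming the normalization from the previous slide (zero-mean feature dimensions, zero-mean and unit-variance 𝑐𝑝). The data are random and all variable names are illustrative. Under these assumptions the coefficient equals the slide's ratio up to a constant 1/n factor.

```python
import numpy as np

rng = np.random.default_rng(1)
d, n = 4, 30
X_p = rng.normal(size=(d, n))
X_p -= X_p.mean(axis=1, keepdims=True)   # zero-mean feature dimensions
c_p = rng.normal(size=n)
c_p = (c_p - c_p.mean()) / c_p.std()     # zero mean, unit standard deviation
w = rng.normal(size=d)

proj = w @ X_p                           # w^T X_p, one value per sample
pearson = np.corrcoef(proj, c_p)[0, 1]   # reference Pearson coefficient

# With the normalization above the coefficient reduces to this ratio.
ratio = (w @ X_p @ c_p) / (n * proj.std())

# Convex surrogate: drop the denominator and square the numerator.
surrogate = (w @ X_p @ c_p) ** 2
```

The surrogate is a quadratic form in 𝑤, which is why it can be added to the SVM objective without losing convexity.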
14
Reduce to standard SVM
The problem can be reduced to a standard SVM in the dual form.
In the dual, 𝛼 are the dual variables and 𝛼𝑦 is 𝛼 signed by 𝑦 (entries 𝛼𝑖𝑦𝑖).
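The equations for this slide are missing from the transcript. The following is a sketch of how the reduction plausibly goes, assuming the primal is a soft-margin SVM plus the squared-numerator penalties weighted by 𝜆𝑝 and 𝜆𝑛. The form of 𝐴 is an assumption consistent with the next slide, not a quotation from the paper.

```latex
% Assumed primal: soft-margin SVM with the two correlation penalties folded into A.
\begin{align*}
\min_{w,b,\xi}\;& \tfrac12\, w^T A w + C \sum_i \xi_i
  \quad \text{s.t. } y_i (w^T x_i - b) \ge 1 - \xi_i,\; \xi_i \ge 0, \\
A \;=\;& I + \lambda_p X_p c_p c_p^T X_p^T + \lambda_n X_n c_n c_n^T X_n^T .
\end{align*}
% Stationarity of the Lagrangian in w gives A w = X \alpha_y, with (\alpha_y)_i = \alpha_i y_i.
\begin{align*}
\max_{\alpha}\;& \sum_i \alpha_i - \tfrac12\, \alpha_y^T X^T A^{-1} X \alpha_y
  \quad \text{s.t. } 0 \le \alpha_i \le C,\; \sum_i \alpha_i y_i = 0 .
\end{align*}
```

This is the standard SVM dual with the linear kernel replaced by 𝑥ᵀ𝐴⁻¹𝑥′.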
15
Projection Matrix
• 𝜆𝑝 ≥ 0, 𝜆𝑛 ≥ 0, thus 𝐴 is positive definite
• 𝐴⁻¹ is also positive definite, so it has a Cholesky decomposition 𝐴⁻¹ = 𝐿𝐿ᵀ
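A small NumPy sketch of why the Cholesky factor acts as a projection matrix. The definition of 𝐴 is not shown in the transcript, so the code assumes 𝐴 = 𝐼 + 𝜆𝑝(𝑋𝑝𝑐𝑝)(𝑋𝑝𝑐𝑝)ᵀ + 𝜆𝑛(𝑋𝑛𝑐𝑛)(𝑋𝑛𝑐𝑛)ᵀ. The data are random and the variable names are illustrative.

```python
import numpy as np

rng = np.random.default_rng(2)
d, n = 4, 25
X_p, X_n = rng.normal(size=(d, n)), rng.normal(size=(d, n))
c_p, c_n = rng.normal(size=n), rng.normal(size=n)
lam_p = lam_n = 1.0

u_p, u_n = X_p @ c_p, X_n @ c_n
# Assumed form of A: identity plus two non-negatively weighted rank-one terms.
A = np.eye(d) + lam_p * np.outer(u_p, u_p) + lam_n * np.outer(u_n, u_n)
A_inv = np.linalg.inv(A)

# Lower-triangular L with A^{-1} = L L^T.
L = np.linalg.cholesky(A_inv)

w = rng.normal(size=d)
v = np.linalg.solve(L, w)   # v = L^{-1} w
x = rng.normal(size=d)
x_tilde = L.T @ x           # projected feature vector

quad_original = w @ A @ w   # regularizer in the original space
quad_projected = v @ v      # plain squared norm in the projected space
```

With 𝑣 = 𝐿⁻¹𝑤 the regularizer 𝑤ᵀ𝐴𝑤 becomes ‖𝑣‖² and the score 𝑤ᵀ𝑥 becomes 𝑣ᵀ(𝐿ᵀ𝑥), so an off-the-shelf SVM can be trained on the projected features 𝐿ᵀ𝑥.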
16
SVM-minus Similarity
[Figure: same-person example; axis label: pan angle]
• Cancel the influence of pose
– positively scoring poses need not be the same person
– negatively scoring poses need not be different persons
One-Side SVM-minus Similarity
18
SVM-minus Similarity
• Use one-side SVM-minus for online tasks
19
YouTube Faces DB
• DB from [36]
• Video LFW
– 3,425 videos; 1,595 people
– ~2.15 videos / person
– Min. duration: 48 frames
– Average clip length: 181.3 frames
• Evaluation
– 5,000 pairs
– 10-fold cross-validation
– 250+, 250– pairs per split
– Person-exclusive splits (each person appears in only one split)
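The person-exclusive protocol can be illustrated with a short sketch. It is not the official split generator: the function name `person_exclusive_folds`, the toy `video_to_person` mapping, and the round-robin assignment are all assumptions made for illustration.

```python
import random
from collections import defaultdict

def person_exclusive_folds(video_to_person, n_folds=10, seed=0):
    """Assign every person to exactly one fold, then group videos by fold."""
    people = sorted(set(video_to_person.values()))
    random.Random(seed).shuffle(people)
    fold_of_person = {p: i % n_folds for i, p in enumerate(people)}
    folds = defaultdict(list)
    for video, person in video_to_person.items():
        folds[fold_of_person[person]].append(video)
    return [folds[k] for k in range(n_folds)]

# Toy mapping: 100 videos of 37 hypothetical people.
video_to_person = {f"vid{i}": f"person{i % 37}" for i in range(100)}
folds = person_exclusive_folds(video_to_person)

# Sanity check on the protocol size: 10 splits x (250 same + 250 not-same) pairs.
total_pairs = 10 * (250 + 250)
```

Same and not-same pairs would then be sampled inside each fold, so no identity seen in training appears in the test split.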
20
Experimental info
• Detect face, expand bbox, align, resize to 100×100
• Extract features
– LBP
– Center-Symmetric LBP
– Four-Patch LBP
• 3D head orientation (𝑋′) from face.com API*
• 𝐶 = 250
• 𝜆𝑝 = 𝜆𝑛 = 1
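For readers unfamiliar with the LBP descriptor in the feature list above, here is a minimal sketch of the basic 3×3 variant. It is only an illustration of the idea. The descriptors used in the paper (including the Center-Symmetric and Four-Patch variants) differ in details that the slides do not give, and the function name `lbp_histogram` is hypothetical.

```python
import numpy as np

def lbp_histogram(gray):
    """Basic 3x3 LBP on a 2-D grayscale array; returns a normalized 256-bin histogram."""
    gray = np.asarray(gray, dtype=np.float64)
    h, w = gray.shape
    center = gray[1:-1, 1:-1]
    codes = np.zeros(center.shape, dtype=np.int64)
    # Eight neighbors, visited clockwise starting at the top-left pixel.
    offsets = [(-1, -1), (-1, 0), (-1, 1), (0, 1), (1, 1), (1, 0), (1, -1), (0, -1)]
    for bit, (dy, dx) in enumerate(offsets):
        neighbor = gray[1 + dy:h - 1 + dy, 1 + dx:w - 1 + dx]
        # Set this bit wherever the neighbor is at least as bright as the center.
        codes += (neighbor >= center).astype(np.int64) << bit
    hist = np.bincount(codes.ravel(), minlength=256).astype(np.float64)
    return hist / hist.sum()

# Random 100x100 "face" crop, matching the resize step above (toy data).
face = np.random.default_rng(3).integers(0, 256, size=(100, 100))
descriptor = lbp_histogram(face)
flat_descriptor = lbp_histogram(np.full((10, 10), 7))
```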
21
MBGS Results from [36]
Lior Wolf, Tal Hassner and Itay Maoz. Face Recognition in Unconstrained
Videos with Matched Background Similarity. CVPR 2011.
22
Results from this paper
Results where SVM– most outperformed MBGS
23
Results
• MBGS > SVM– in accuracy
• but MBGS + SVM– wins
• Combination done by stacking
– learning yet another SVM on the 2D scores
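Stacking as described above can be sketched as a second-level linear SVM over the two scores. The example below is a toy illustration under stated assumptions: the 2D scores are synthetic stand-ins for MBGS and SVM– outputs, and `train_linear_svm` is a hypothetical sub-gradient trainer written to keep the block self-contained, not the solver used in the paper.

```python
import numpy as np

def train_linear_svm(S, y, lam=0.01, lr=0.1, epochs=500):
    """Sub-gradient descent on 0.5 * lam * ||w||^2 + mean hinge loss."""
    n, d = S.shape
    w, b = np.zeros(d), 0.0
    for _ in range(epochs):
        active = y * (S @ w + b) < 1          # samples violating the margin
        grad_w = lam * w - (active * y) @ S / n
        grad_b = -(active * y).sum() / n
        w -= lr * grad_w
        b -= lr * grad_b
    return w, b

# Synthetic 2D scores: column 0 plays the role of MBGS, column 1 of SVM-minus.
rng = np.random.default_rng(4)
y = np.repeat([1.0, -1.0], 100)               # same / not-same labels
S = rng.normal(scale=0.5, size=(200, 2)) + y[:, None]

w, b = train_linear_svm(S, y)
combined_score = S @ w + b                    # stacked similarity
accuracy = np.mean(np.sign(combined_score) == y)
```

The learned weights say how much each base score contributes to the combined similarity. In a real evaluation the stacker would be fit on training splits only.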
24
Is it really useful?
• Combined score is “statistically significant” for [FP]LBP
• Using the entire background set: AUC drops from 83.6% to 79.9%
• Online applications (one-side): AUC drops from 83.6% to 81.9%
• Correlations:
– Within method higher, different scores
– Across methods, highest for same feature (as expected)
25
Conclusion
• SVM– : unlearn using additional features
• MBGS : be choosy about the negative set
• 3D Pose : a good “privileged” information source
• They don’t talk about pose estimation accuracy
• Different types of privileged info that might work
• Metric learning (and relatives) not compared
Thank You!
26
Some more results from other sources
YouTube Faces DB
Ref   Method              Accuracy ± SE (%)   AUC (%)   EER (%)
[1]   MBGS L2 mean, LBP   76.4 ± 1.8          82.6      25.3
[2]   MBGS+SVM-           78.9 ± 1.9          86.9      21.2
[3]   APEM-FUSION         79.1 ± 1.5          86.6      21.4
[4]   STFRD+PMML          79.5 ± 2.5          88.6      19.9
References:
[1] Lior Wolf, Tal Hassner and Itay Maoz. Face Recognition in Unconstrained Videos with Matched Background Similarity. IEEE Conf. on Computer Vision and Pattern Recognition (CVPR), 2011.
[2] Lior Wolf and Noga Levy. The SVM-minus Similarity Score for Video Face Recognition. IEEE Conf. on Computer Vision and Pattern Recognition (CVPR), 2013.
[3] Haoxiang Li, Gang Hua, Zhe Lin, Jonathan Brandt and Jianchao Yang. Probabilistic Elastic Matching for Pose Variant Face Verification. IEEE Conf. on Computer Vision and Pattern Recognition (CVPR), 2013.
[4] Zhen Cui, Wen Li, Dong Xu, Shiguang Shan and Xilin Chen. Fusing Robust Face Region Descriptors via Multiple Metric Learning for Face Recognition in the Wild. IEEE Conf. on Computer Vision and Pattern Recognition (CVPR), 2013.
27