Hidden Conditional Random Fields - Universidade Federal de São

Transcript Hidden Conditional Random Fields - Universidade Federal de São

- A comparison with Neural Networks and Hidden Markov Models -
César R. de Souza, Ednaldo B. Pizzolato and Mauro dos Santos Anjo
Universidade Federal de São Carlos (Federal University of São Carlos)
IBERAMIA 2012
Cartagena de Índias, Colombia
2012
Context, Motivation, Objectives and
the Organization of this Presentation
3
 Multidisciplinary
 Computing and Linguistics
 Ethnologue lists about 130 sign
languages existent in the world
 (LEWIS, 2009)
Motivation
Objectives
Agenda
4
 Two fronts
 Social
 Aim to improve quality of life for the
deaf and increase the social inclusion
 Scientific
 Investigation of the distinct interaction
methods, computational models and
their respective challenges
Context
Motivation
Objectives
Agenda
5
 This paper
 Investigate the behavior and applicability of SVMs
and HCRFs in the recognition of specific signs from
the Brazilian Sign Language
 Long term
 Walk towards the creation of a full-fledged
recognition system for LIBRAS
 This work represents a small but important
step in achieving this goal
Context
Motivation
Objectives
Agenda
6
Introduction
Libras - Brazilian Sign Language
Literature Review
Methods and Tools
Support Vector Machines
Conditional Random Fields
Experiments
Results
Conclusion
Context
Motivation
Objectives
Agenda
Structures and the manual alphabet
7
8
Natural language
Not mimics
Not universal
LIBRAS
Difficulties
Grammar
 It is not only “a problem of the
deaf or a language pathology”
 (QUADROS & KARNOPP, 2004)
Introduction
Libras
Review
Methods
Experiments
Results
Conclusion
9
 Highly context-sensitive
 Same sign may have distinct meanings
 Interpretation is hard
even for humans
Introduction
Libras
Review
Methods
LIBRAS
Difficulties
Grammar
Experiments
Results
Conclusion
10
 Fingerspelling is only
part of the Grammar
 Needed when explicitly spelling
the name of a person or a location
 Subset of the full-language
recognition problem
Introduction
Libras
Review
Methods
LIBRAS
Difficulties
Grammar
Experiments
Results
Conclusion
11
Introduction
Libras
Review
Methods
Experiments
Results
Conclusion
12
Literature
 Layer architectures are common
 Static gestures x Dynamic gestures
 One of the best works on LIBRAS handles
only the movement aspect of the language
(Dias et al.)
Introduction
Libras
Review
Methods
Experiments
Results
Conclusion
13
 Few studies explore SVMs
 But many use Neural Networks
 No studies on HCRFs and LIBRAS
Introduction
Libras
Review
Methods
Experiments
Results
Conclusion
 Example
 Recognition of a fingerspelled word
using a two-layered architecture
Pato
HMM
Sequence classifier
P
P
P
A
A
A
T
T
T
O
O
Static gesture classifier
ANN
Introduction
Libras
Review
Methods
Experiments
Results
Conclusion
15
 YANG, SCLAROFF e LEE, 2009
 Multiple layers, SVMs
 Elmezain, 2011
 HCRF, in-air drawing recognition
Introduction
Libras
Review
Methods
Experiments
Results
Conclusion
Overview of the chosen techniques
and reasons for their choice
16
Neural Networks and Support Vector
Machines for the detection of static signs
17
18
 Find 𝑓(𝑥) such that…
c
𝑓(𝑥)
b
a
Introduction
Libras
Review
Methods
Experiments
Results
Conclusion
19
Neural Networks
 Biologically inspired
 McCulloch & Pitts, Rosenblatt, Rumelhart
Support Vector Machines
Maximum Margin
Multiple Classes
Introduction
Libras
Review
Methods
Experiments
Results
Conclusion
 Perceptron
 Hyperplane decision
 Linearly separable problems
 Learning is a ill-posed problem
 Multiple local minima, ill-conditioning
 Layer architecture
 Universal approximator
Introduction
Libras
Review
Methods
Experiments
Results
Conclusion
21
Neural Networks
 Strong theoretical basis
 Statistical Learning Theory
 Structural Risk Minimization (SRM)
Support Vector Machines
Maximum Margin
Multiple Classes
Introduction
Libras
Review
Methods
Experiments
Results
Conclusion
22
 Large-margin classifiers
 Risk minimization through margin maximization
 Capacity control through margin control
 Sparse solutions considering only a few support vectors
Neural Networks
Support Vector Machines
Maximum Margin
Multiple Classes
Introduction
Libras
Review
Methods
Experiments
Results
Conclusion
23
 Problem: binary-only classifier
 How to generalize to multiple classes?
Neural Networks
Support Vector Machines
Maximum Margin
Multiple Classes
Introduction
Libras
Review
Methods
Experiments
Results
Conclusion
24
 Problem: binary-only classifier
 How to generalize to multiple classes?
 Drawbacks
 Classical approaches
 Only works when equiprobable
 Evaluation of c(c-1)/2 machines
 Non-guaranteed optimum results
 One-against-all
 One-against-one
 Discriminant functions
With 27 static gestures, this would result in
351 SVM evaluations each time a new
classification is required!
Introduction
Libras
Review
Methods
Experiments
Results
Conclusion
25
 Problem: binary-only classifier
 How to generalize to multiple classes?
 Directed Acyclic Graphs
Neural Networks
 Generalization of Decision Trees,
allowing for non-directed cycles
Support Vector Machines
Maximum Margin
Multiple Classes
 Require at maximum c-1 evaluations
So, for 27 static gestures, only 26
SVM evaluations are required. Only
7.4% of the original effort 
Introduction
Libras
Review
Methods
Experiments
Results
Conclusion
26
 Elimination proccess
A
B
C
D
 One class eliminated at a time
Candidates
A lost
D lost
A
B
C
B
C
D
B lost
C lost
D lost
B lost
Libras
C lost
D lost
D
Introduction
A
B
B
C
C
D
C lost
A lost
A lost
C
Review
Methods
B
Experiments
B lost
A
Results
Conclusion
27
However, no matter the model
 We’ll have (extreme) noise due pose transitions
 How can we cope with that?
Introduction
Libras
Review
Methods
Experiments
Results
Conclusion
Hidden Markov Models, Conditional Random
Field and Hidden Conditional Random Fields for
dynamic gesture recognition.
28
29
 Find 𝑓(𝑥) such that
given extremely noisy sequences of labels,
estimate the word being signed.
blyrei
𝑓(𝑥)
hil
hello
hi
bye
hmeylrlwo
Introduction
Libras
Review
Methods
Experiments
Results
Conclusion
30
Hidden Markov
Models (HMMs)
Conditional Random
Fields (CRFs)
Hidden Conditional
Random Fields (HCRFs)
Introduction
Libras
Review
Methods
Experiments
Results
Conclusion
31
 Hidden Markov Models
 Joint probability model of a observation
sequence and its relationship with time
Hidden Markov
Models (HMMs)
𝑇
𝑝 𝑥, 𝑦 =
Conditional Random
Fields (CRFs)
𝑝 𝑦𝑡 𝑦𝑡−1 𝑝(𝑥𝑡 |𝑦𝑡 )
𝑡=1
A
Introduction
Libras
Review
B
Hidden Conditional
Random Fields (HCRFs)
Methods
Experiments
Results
Conclusion
32
 Hidden Markov Models
 Marginalizing over y, we achieve the
observation sequence likelihood
Hidden Markov
Models (HMMs)
𝑇
𝑝 𝑥 =
𝑝 𝑦𝑡 𝑦𝑡−1 𝑝(𝑥𝑡 |𝑦𝑡 )
Conditional Random
Fields (CRFs)
𝒚 𝑡=1
Hidden Conditional
Random Fields (HCRFs)
 Which can be used for classification
using either the ML or MAP criteria
𝑝 𝜔𝑖 |𝒙 =
Introduction
Libras
𝑝 𝜔𝑖 𝑝 𝒙|𝜔𝑖
𝑐
𝑗𝑝
𝒙|𝜔𝑗
Review
Methods
Experiments
Results
Conclusion
33
Word 1
𝑝(𝒙|ω𝟏 )
Word 2
𝑝(𝒙|ω𝟐 )
ω = max 𝑝 𝒙 𝜔𝑗 𝑝(𝜔𝑗 )
𝜔𝑗 ∈ ω
...
𝑝(𝒙|ω𝒏 )
Word n
One model for each word
Introduction
Libras
Review
Methods
Experiments
Results
Conclusion
34
 Hidden Markov Models have found
great applications in speech recognition
Conditional Random
Fields (CRFs)
 However, a fundamental paradigm
shift recently occurred in this field
Introduction
Libras
Review
Methods
Hidden Markov
Models (HMMs)
Hidden Conditional
Random Fields (HCRFs)
Experiments
Results
Conclusion
35
 Probability distributions governing speech
signals could not be modeled accurately,
turning “Bayes decision theory inapplicable
under those circumstances”
 (Juang & Rabiner, 2005)
Introduction
Libras
Review
Methods
Experiments
Hidden Markov
Models (HMMs)
Conditional Random
Fields (CRFs)
Hidden Conditional
Random Fields (HCRFs)
Results
Conclusion
36
Hidden Markov
Models (HMMs)
Conditional Random
Fields (CRFs)
Hidden Conditional
Random Fields (HCRFs)
Introduction
Libras
Review
Methods
Experiments
Results
Conclusion
37
 Conditional Random Fields
 Generalization of the Markov models
Hidden Markov
Models (HMMs)
Conditional Random
Fields (CRFs)
 Discriminative Models
 Model 𝑝 𝒚|𝒙 without incorporating 𝑝 𝒙
Hidden Conditional
Random Fields (HCRFs)
 Designates a family of MRFs
 Each new observation originates a new MRF
Introduction
Libras
Review
Methods
Experiments
Results
Conclusion
HMM
Directional models
Discriminative
Naïve Bayes
Graphs
Sequence
Generative
38
Logistic Regression
Linear-chain CRF
CRF
Infograph based on the tutorial by Sutton, C., McCallum, A., 2007
Introduction
Libras
Review
Methods
Experiments
Results
Conclusion
1
𝑝 𝒚𝒙 =
𝑍(𝒙)
𝑒𝑥𝑝
𝐶𝑝 ∈ 𝒞 Ψ𝑐 ∈𝐶𝑝
Potential Functions
 Conditional
Potential
CliquesRandom Fields
 Generalization of the Markov models
1
𝑍(𝒙)
𝑝 𝒚𝒙 =
39
𝐾(𝑝)
𝜆𝑝𝑘 𝑓𝑝𝑘 (𝒙𝒄 , 𝒚𝒄 )
𝑘=1
Parameter vector which can
be optimized using gradient
Hidden Markov
methods
Models (HMMs)
Ψ𝑐 (𝒙𝒄 , 𝒚𝒄 ; 𝜃)
Conditional Random
Fields (CRFs)
𝐶𝑝 ∈ 𝒞 Ψ𝑐 ∈𝐶𝑝
Hidden Conditional
Random Fields (HCRFs)
𝐾(𝑝)
Ψ𝑐 𝒙𝒄 , 𝒚𝒄 ; 𝜃𝑝 = 𝑒𝑥𝑝
𝜆𝑝𝑘 𝑓𝑝𝑘 (𝒙𝒄 , 𝒚𝒄 )
𝑘=1
𝑍 𝒙 =
Characteristic
function vector
Ψ𝑐 (𝒙𝒄 , 𝒚𝒄 ; 𝜃)
𝒚 𝐶𝑝 ∈ 𝒞 Ψ𝑐 ∈𝐶𝑝
Partition
function
Introduction
Libras
Review
Methods
Experiments
Results
Conclusion
1
𝑝 𝒚𝒙 =
𝑍(𝒙)
40
𝐾(𝑝)
𝑒𝑥𝑝
𝐶𝑝 ∈ 𝒞 Ψ𝑐 ∈𝐶𝑝
𝜆𝑝𝑘 𝑓𝑝𝑘 (𝒙𝒄 , 𝒚𝒄 )
𝑘=1
𝑇
 Conditional Random Fields
𝑝 𝑥, 𝑦 = of the Markov models
𝑝 𝑦𝑡 𝑦𝑡−1
 Generalization
𝑡=1
𝑨
𝑝(𝑥𝑡 |𝑦𝑡 )
𝑩
 How do we initialize those models?
 Reaching HCRFs from a𝑇HMM
𝑨 𝑦𝑡 , 𝑦𝑡−1 𝑩(𝑥𝑡 , 𝑦𝑡 )
𝑡=1
Introduction
Libras
Review
Hidden Markov
Models (HMMs)
Conditional Random
Fields (CRFs)
Hidden Conditional
Random Fields (HCRFs)
Methods
Experiments
Results
Conclusion
1
𝑝 𝒚𝒙 =
𝑍(𝒙)
𝑇
𝑝 𝑥, 𝑦 =
𝑒
𝑒𝑥𝑝
𝐶𝑝 ∈ 𝒞 Ψ𝑐 ∈𝐶𝑝
𝑇𝑇
𝑘=1
𝑨𝐥𝐧𝑨
𝑦𝑡 , 𝑦𝑦𝑡−1
𝑩(𝑥+𝑡 ,𝐥𝐧𝑩(𝑥
𝑦𝑡 ) 𝑡 , 𝑦𝑡 )
𝑡 , 𝑦𝑡−1
ln
𝑡=1 𝑡=1
𝑡=1
= 𝑒𝑥𝑝
𝑡=1
𝑇
𝑇
𝐥𝐧𝑨 𝑦𝑡 , 𝑦𝑡−1 +
𝑡=1
𝑡=1
Libras
𝜆𝑝𝑘 𝑓𝑝𝑘 (𝒙𝒄 , 𝒚𝒄 )
𝑇
𝑇𝑇
Introduction
41
𝐾(𝑝)
Review
𝐥𝐧𝑩(𝑥𝑡 , 𝑦𝑡 )
𝑡=1
Methods
Experiments
𝑡=1
Results
Conclusion
1
𝑝 𝒚𝒙 =
𝑍(𝒙)
𝑒𝑥𝑝
𝐶𝑝 ∈ 𝒞 Ψ𝑐 ∈𝐶𝑝
𝑇𝑇
𝑝 𝑥, 𝑦 = 𝑒𝑥𝑝
𝜆𝑝𝑘 𝑓𝑝𝑘 (𝒙𝒄 , 𝒚𝒄 )
𝑘=1
𝑇
𝑇
𝐥𝐧𝑨 𝑦𝑡 , 𝑦𝑡−1 +
𝑡=1
𝑡=1
𝑛
42
𝐾(𝑝)
𝐥𝐧𝑩(𝑥𝑡 , 𝑦𝑡 )
𝑡=1
𝑡=1
ln 𝑨
ln 𝑩
a11 a12 a13
b11 b12
a21 a22 a23
𝑛
b21 b22
a31 a32 a33
b31 b32
𝑛
𝑚
𝝀:
𝑘 = 𝑛 ×𝑛 + 𝑛× 𝑚
Introduction
Libras
Review
Methods
Experiments
Results
Conclusion
1
𝑝 𝒚𝒙 =
𝑍(𝒙)
𝑒𝑥𝑝
𝜆𝑝𝑘 𝑓𝑝𝑘 (𝒙𝒄 , 𝒚𝒄 )
𝐶𝑝 ∈ 𝒞 Ψ𝑐 ∈𝐶𝑝
𝑘=1
𝑇𝑇
𝐥𝐧𝑨 𝑦𝑡 , 𝑦𝑡−1 +
𝐥𝐧𝑩(𝑥𝑡 , 𝑦𝑡 )
𝑡=1
𝑡=1
𝑡=1
𝑡=1
𝒇𝒆𝒅𝒈𝒆 𝒚𝒕 , 𝒚𝒕−𝟏 , 𝒙; 𝑖, 𝑗 = 𝟏{𝒚𝒕=𝐢} 𝟏{𝒚𝒕−𝟏 =𝐣}
𝝀:
𝑇
𝑇
𝑝 𝑥, 𝑦 = 𝑒𝑥𝑝
𝒇:
43
𝐾(𝑝)
𝒇𝒏𝒐𝒅𝒆 𝒚𝒕 , 𝒚𝒕−𝟏 , 𝒙; 𝑖, 𝑜 = 𝟏{𝒚𝒕=𝐢} 𝟏{𝒙𝒕 =𝐨}
𝒇𝒆 𝒇𝒆 𝒇𝒆 𝒇𝒆 𝒇 𝒆 𝒇𝒆 𝒇𝒆 𝒇𝒆 𝒇𝒆 𝒇𝒏 𝒇𝒏 𝒇 𝒏 𝒇𝒏 𝒇 𝒏 𝒇𝒏
i=1
i=1
i=1
i=2
i=2
i=2
i=3
i=3
i=3
i=1
i=1
i=2
i=2
i=3
i=3
j=1
j=2
j=3
j=1
j=2
j=3
j=1
j=2
j=1
o=1 o=2 o=1 o=2 o=1 o=2
a11 a12 a13 a21 a22 a23 a31 a32 a33 b11 b12 b21 b22 b31 b32
𝑘 = 𝑛 ×𝑛 + 𝑛× 𝑚
Introduction
Libras
Review
Methods
Experiments
Results
Conclusion
1
𝑝 𝒚𝒙 =
𝑍(𝒙)
𝑇
𝑛
44
𝐾(𝑝)
𝑒𝑥𝑝
𝐶𝑝 ∈ 𝒞 Ψ𝑐 ∈𝐶𝑝
𝜆𝑝𝑘 𝑓𝑝𝑘 (𝒙𝒄 , 𝒚𝒄 )
𝑘=1
𝑛
𝑝 𝑥, 𝑦 = 𝑒𝑥𝑝
𝑇
𝑛
𝑚
𝑎𝑖𝑗 1{𝑦𝑡=𝑖} 1{𝑦𝑡−1 =𝑗} +
𝑡=1 𝑖=1 𝑗=1
𝑏𝑖𝑗 1{𝑦𝑡=𝑖} 1{𝑥𝑡 =𝑗}
𝑡=1 𝑖=1 𝑗=1
𝑇
𝑇
𝒇𝒆𝒅𝒈𝒆 𝒚𝒕 , 𝒚𝒕−𝟏 , =
𝒙; 𝑖,
𝑗 = 𝟏{𝒚𝑎𝒕=𝐢}𝒇𝟏{𝒚
𝒚𝒕 ,(𝒚
𝒚 , 𝒚, 𝒙; 𝑖,, 𝒙)
𝑜 = 𝟏{𝒚𝒕=𝐢} 𝟏{𝒙𝒕 =𝐨}
𝒕−𝟏
𝑒𝑥𝑝
𝒚𝒕−𝟏 , 𝒙) + 𝒇𝒏𝒐𝒅𝒆
𝑏𝑖𝑗 𝒇
𝑖𝑗 𝒊𝒋 (𝒚
𝒕 , =𝐣}
𝒊𝒋 𝒕−𝟏
𝒕 𝒕−𝟏
𝑡=1
𝑡=1
𝐾
= 𝑒𝑥𝑝
𝜆𝑘 𝑓𝑘 𝒙, 𝒚
𝑘=1
Introduction
Libras
Review
Methods
Experiments
Results
Conclusion
1
𝑝 𝒚𝒙 =
𝑍(𝒙)
45
𝐾(𝑝)
𝑒𝑥𝑝
𝐶𝑝 ∈ 𝒞 Ψ𝑐 ∈𝐶𝑝
𝜆𝑝𝑘 𝑓𝑝𝑘 (𝒙𝒄 , 𝒚𝒄 )
𝑘=1
𝐾
𝜆𝑘 𝑓𝑘 1𝒙, 𝒚
𝑝 𝑥, 𝑦 = 𝑒𝑥𝑝
𝑝 𝒚𝒙 =
𝑝 𝑦𝑘=1 𝑍(𝒙)
Introduction
Libras
Review
Methods
Experiments
Results
Conclusion
46
Hidden Markov
Models (HMMs)
 Drawback
 Assumes both 𝒙 and 𝒚 are known
Conditional Random
Fields (CRFs)
Hidden Conditional
Random Fields (HCRFs)
Introduction
Libras
Review
Methods
Experiments
Results
Conclusion
47
Hidden Markov
Models (HMMs)
Conditional Random
Fields (CRFs)
Hidden Conditional
Random Fields (HCRFs)
Introduction
Libras
Review
Methods
Experiments
Results
Conclusion
48
Hidden Markov
Models (HMMs)
Conditional Random
Fields (CRFs)
Hidden Conditional
Random Fields (HCRFs)
Introduction
Libras
Review
Methods
Experiments
Results
Conclusion
49
 Hidden Conditional Random Fields
 Generalization of the hidden Markov classifiers
Hidden Markov
Models (HMMs)
Conditional Random
Fields (CRFs)
Hidden Conditional
Random Fields (HCRFs)
Parameter space
Introduction
Libras
Review
Methods
Experiments
Results
Conclusion
50
 Hidden Conditional Random Fields
 Generalization of the hidden Markov classifiers
Hidden Markov
Models (HMMs)
 Sequence classification
Conditional Random
Fields (CRFs)
 Model 𝑝 𝜔|𝒙 without explicitly modeling 𝑝 𝒙
Hidden Conditional
Random Fields (HCRFs)
 Do not require 𝒚 to be known
 The sequence of states is now hidden
Introduction
Libras
Review
Methods
Experiments
Results
Conclusion
51
 Hidden Conditional Random Fields
 Generalization of the hidden Markov classifiers
𝑝,(𝜔
𝒚 |𝒙) =
𝑝 𝜔𝒙 =
𝒚
𝒚
1
𝑍(𝒙)
𝐾(𝑝)
𝑘=1
Libras
𝐶𝑝 ∈ 𝒞 Ψ𝑐 ∈𝐶𝑝
Conditional Random
Fields (CRFs)
𝐾(𝑝)
𝜽𝒑𝑐
𝜆𝑝𝑘 𝑓𝑝𝑘 𝒙𝒄 , 𝒚𝒄;, 𝜔
Ψ𝑐 𝒙𝒄,, 𝜔
𝒚𝑐𝒄 𝜽
; 𝜽𝒑𝒑 = 𝑒𝑥𝑝
Introduction
Ψ𝑐 (𝒙𝒄 , 𝒚𝒄 ,; 𝜔
𝜽𝑐𝒑 )
Review
Hidden Markov
Models (HMMs)
𝑘=1
Methods
Experiments
Hidden Conditional
Random Fields (HCRFs)
Results
Conclusion
52
Single model for all words
ω
yt-2
yt-1
yt
xt-2
xt-1
xt
ω = max 𝑝(𝜔𝑗 |𝒙)
𝜔𝑗 ∈ ω
Introduction
Libras
Review
Methods
Experiments
Results
Conclusion
Fingerspelling recognition with SVMs
and HCRFs against ANNs and HMMs
53
Pato
HCRF
ω
HMM
yt-2
yt-1
yt
xt-2
xt-1
xt
P
P
Sequence classifier
P
A
A
A
T
T
T
O
O
Static gesture classifier
SVM
ANN
Introduction
Libras
Review
Methods
Experiments
Results
Conclusion
55
Static gesture recognition
Static gestures
(hand postures)
 Database of 8100 grayscale images
 Input instances with 1024 features
 27 classes (manual alphabet signs)
Introduction
Libras
Review
Methods
Dynamic gestures
(spelled words)
Experiments
Results
Conclusion
56
 Neural Networks
 Evaluate initialization heuristics (Nguyen-Widrow)
 Support Vector Machines
 Evaluate the heuristic value for 𝜎 2 based on the interquartile range of the norm statistics for the input
dataset
Introduction
Libras
Review
Methods
Experiments
Static gestures
(hand postures)
Dynamic gestures
(spelled words)
Results
Conclusion
57
 Static sign classification
Kappa
σ2
Kappa
Hidden Neurons
Kappa
0,1
0.000
50
0.851
1
0.106
100
0.887
10
0.569
300
0.921
100
0.950
500
0.922
(heuristic) 392
0.917
1000
0.924
1000
0.863
1
800.00
0.8
600.00
0.6
400.00
0.4
200.00
0.2
0
0.1
1
Busca em grade
Introduction
Libras
Review
0.00
1000
10
100
Heurística
Vetores de Suporte
Methods
Experiments
Support Vectors
(Average)
Resilient Backpropagation ANN
Gaussian kernel SVM
Results
Conclusion
58
 Static sign classification
Kappa
σ2
Kappa
Hidden Neurons
Kappa
0.1
0.000
50
0.851
1
0.106
100
0.887
10
0.569
300
0.921
100
0.959
500
0.922
(heuristic) 392
0.917
1000
0.925
1000
0.863
1
800.00
0.8
600.00
0.6
400.00
0.4
200.00
0.2
0
0.1
1
Busca em grade
Introduction
Libras
Review
0.00
1000
10
100
Heurística
Vetores de Suporte
Methods
Experiments
Support Vectors
(Average)
Resilient Backpropagation ANN
Gaussian kernel SVM
Results
Conclusion
59
Hyperparameter surface for Gaussian SVMs
Kappa
1.00
0.95
0.90
0.85
1000000
10000
C
100
0.85-0.90
50…
15…
10…
20…
0.95-1.00
400
200
Libras
Review
Methods
20…
15…
400-600
50…
200-400
10…
Sigma (σ²)
500
H
0
0-200
Introduction
0.90-0.95
1
600
0.1
1
10
50
75
100
200
300
Average number of SVs
0.80-0.85
500
Sigma (σ²)
H
0.1
1
10
50
75
100
200
300
0.80
Experiments
1000000
10000
C
100
1
Results
Conclusion
60
 Static gesture classification
 Statistically significant results (p < 0.01)
Static gestures
(hand postures)
Dynamic gestures
(spelled words)
 Points of interest




Polynomial machines have increased sparcity but smaller kappa
Neural networks were faster to evaluate, but not to learn – unless using linear SVMs
Sigma plays a much more important role than C in Gaussian machines
Heuristics for choosing sigma and C resulted in great performance values
Introduction
Libras
Review
Methods
Experiments
Results
Conclusion
61
Static gestures
(hand postures)
Dynamic gestures
(spelled words)
Introduction
Libras
Review
Methods
Experiments
Results
Conclusion
62
Static gestures
(hand postures)
Dynamic gestures
(spelled words)
Introduction
Libras
Review
Methods
Experiments
Results
Conclusion
Pato
HCRF
ω
yt-2
yt-1
yt
xt-2
xt-1
xt
Sequence classifier
P
P
P
A
A
A
T
T
Static gesture classifier
SVM
Introduction
Libras
Review
Methods
Experiments
Results
Conclusion
T
64
Dynamic gesture classification
 Database containing 540 signed words
 Containing a total of 63,703 static signs
 The previous layer labels the entire dataset
 Then we tested all possible model combinations
 Estimated kappa (𝜅) sampled from 10-fold CV
Introduction
Libras
Review
Methods
Experiments
Results
Conclusion
65
 Dynamic gesture classification
Labeller
Classification
Algorithm
SVM
HMM
Baum-Welch
SVM
HCRF
RProp
ANN
HMM
Baum-Welch
ANN
HCRF
RProp
Introduction
Libras
Review
Methods
Training
Validation
Kappa
Kappa
Experiments
Results
Conclusion
66
 Dynamic gesture classification
Training
Validation
Kappa
Kappa
Labeller
Classification
Algorithm
SVM
HMM
Baum-Welch
0.95
0.82
SVM
HCRF
RProp
0.98
0.83
ANN
HMM
Baum-Welch
0.95
0.80
ANN
HCRF
RProp
0.99
0.82
Introduction
Libras
Review
Methods
Experiments
Results
Conclusion
67
 Dynamic gesture classification
Training
Validation
Kappa
Kappa
Labeller
Classification
Algorithm
SVM
HMM
Baum-Welch
0.95
0.82
SVM
HCRF
RProp
0.98
0.83
ANN
HMM
Baum-Welch
0.95
0.80
ANN
HCRF
RProp
0.99
0.82
 SVM+HCRF have shown the best validation result (10-fold CV)
 Combinations using HCRF have shown best results in general
 Training results are statistically different
 We have not enough evidence to say validation results are not equivalent
Introduction
Libras
Review
Methods
Experiments
Results
Conclusion
68
 Dynamic gesture classification
Training
Validation
Kappa
Kappa
Labeller
Classification
Algorithm
SVM
HMM
Baum-Welch
0,95
0,82
SVM
HCRF
RProp
0,98
0,83
ANN
HMM
Baum-Welch
0,95
0,80
ANN
HCRF
RProp
0,99
0,82
 Hidden Conditional Random Fields – in this specific problem – had
higher ability to retain knowledge while keeping the same generality
 In other words, achieved less overfitting.
Introduction
Libras
Review
Methods
Experiments
Results
Conclusion
and future works
69
 Fingerspelling experiments
 SVMs & HCRFs vs ANNs & HMMs
 Static Gesture Recognition
 Statistically significant results favoring SVMs
 Linear SVMs on DDAGs:
 Best compromise between speed, accuracy and ease of use
 SVMs have shown easier training, reduced training times
 Heuristic initializations work rather well, less parameter tuning
 Dynamic Gesture Recognition
 Choice of gesture classifier had much more impact
 Linear-chain HCRFs:
 Increased knowledge absorption without overfitting
Introduction
Libras
Review
Methods
Experiments
Results
Conclusion
71
Future works
 Detect standard words rather than fingerspelling (already complete)
 Use Structural Support Vector Machines, which are equivalent to
HCRFs but are trained using a hinge loss function
 Use a mixed language model to categorize full phrases
Introduction
Libras
Review
Methods
Experiments
Results
Conclusion
73
 References
 BOWDEN, R. et al. A Linguistic Feature Vector for the Visual Interpretation of
Sign Language. European Conference on Computer Vision. [S.l.]: Springer-Verlag.
2004. p. 391-401.
 BRADSKI, G. R. Computer Vision Face Tracking For Use in a Perceptual User
Interface. Intel Technology Journal, n. Q2, 1998. Disponivel em:
<http://citeseerx.ist.psu.edu/viewdoc/summary?doi=10.1.1.14.7673>.
 DIAS, D. B. et al. Hand movement recognition for brazilian sign language: a
study using distance-based neural networks. Proceedings of the 2009
international joint conference on Neural Networks. Atlanta, Georgia, USA: IEEE
Press. 2009. p. 2355-2362.
 FERREIRA-BRITO, L. Por uma gramática de Línguas de Sinais. 2nd. ed. Rio de
Janeiro: Tempo Brasileiro, 2010. 273 p. ISBN 85-282-0069-8.
 FERREIRA-BRITO, L.; LANGEVIN, R. The Sublexical Structure of a Sign Language.
Mathématiques, Informatique et Sciences Humaines, v. 125, p. 17-40, 1994.
Introduction
Libras
Review
Methods
Experiments
Results
Conclusion
74
 References
 FEUERSTACK, S.; COLNAGO, J. H.; SOUZA, C. R. D. Designing and Executing
Multimodal Interfaces for the Web based on State Chart XML. Proceedings of
3a. Conferência Web W3C Brasil 2011. Rio de Janeiro: [s.n.]. 2011.
 PIZZOLATO, E. B.; ANJO, M. D. S.; PEDROSO, G. C. Automatic recognition of
finger spelling for LIBRAS based on a two-layer architecture. Proceedings of the
2010 ACM Symposium on Applied Computing. Sierre, Switzerland: ACM. 2010. p.
969-973.
 VIOLA, P.; JONES, M. Robust Real-time Object Detection. International Journal of
Computer Vision. [S.l.]: [s.n.]. 2001.
 VAPNIK, V. N. The nature of statistical learning theory. New York, NY, USA:
Springer-Verlag New York, Inc., 1995. ISBN 0-387-94559-8.
 VAPNIK, V. N. Statistical learning theory. [S.l.]: Wiley, 1998. ISBN 0471030031.
 YANG, R.; SARKAR, S. Detecting Coarticulation in Sign Language using
Conditional Random Fields. Pattern Recognition, 2006. ICPR 2006. 18th
International Conference on. [S.l.]: [s.n.]. 2006. p. 108-112.
Introduction
Libras
Review
Methods
Experiments
Results
Conclusion
Guilherme Cartacho
Appendix A
77
Accord.NET Framework
 Machine Learning and Artificial Intelligence
78
Accord.NET Framework
 Machine Learning and Artificial Intelligence
 Computer Vision / Audition
79
Accord.NET Framework
 Machine Learning and Artificial Intelligence
 Computer Vision / Audition
 Mathematics and Statistics
80
Builds upon well established foundations
83
It has been used to
 Recognize gestures using Wii
84
It has been used to
 Study and evaluate performance in 3D gesture recognition
85
It has been used to
 Predict attacks in computer networks
86
It has been used to
 Compare touch and in-air gestures using Kinect
87
It has been used to
 Provide sensor information in multi-model interfaces
It has been used in
 an increasing number of publications
 Guido Soetens, Estimating the limitations of single-handed multi-touch
input. Master Thesis, Utrecht University. September, 2012.
 K. N. Pushpalatha, A. K. Gautham, D. R. Shashikumar, K. B. ShivaKumar.
Iris Recognition System with Frequency Domain Features optimized
with PCA and SVM Classifier, IJCSI International Journal of Computer
Science Issues, Vol. 9, Issue 5, No 1, September 2012.
 Arnaud Ogier, Thierry Dorval. HCS-Analyzer: Open source software for
High-Content Screening data correction and analysis. Bioinformatics.
First published online May 13, 2012.
It has been used in
 an increasing number of publications
 Ludovico Buffon, Evelina Lamma, Fabrizio Riguzzi, and Davide Forment.
Un sistema di vision inspection basato su reti neurali. In Popularize
Artificial Intelligence. Proceedings of the AI*IA Workshop and Prize for
Celebrating 100th Anniversary of Alan Turing's Birth (PAI 2012), Rome,
Italy, June 15, 2012, number 860 in CEUR Workshop Proceedings, pages
1-6, Aachen, Germany, 2012.
 Liam Williams, Spotting The Wisdom In The Crowds. Master Thesis on
Joint Mathematics and Computer Science. Imperial College London,
Department of Computing. June, 2012.
 Alosefer, Y.; Rana, O.F.; "Predicting client-side attacks via behaviour
analysis using honeypot data," Next Generation Web Services Practices
(NWeSP), 2011 7th International Conference on , vol., no., pp.31-36, 1921 Oct. 2011
It has been used in
 an increasing number of publications
 Brummitt, L. Scrabble Referee: Word Recognition Component, 2011.
Final project report. University of Sheffield, Sheffield, England.
 Cani, V., 2011. Image Stitching for UAV remote sensing application.
Master Degree Thesis. Computer Engineering, School of Castelldefels of
Universitat Politècnica de Catalunya. Barcelona, Spain.
 Hassani, A. Z.; "Touch versus in-air Hand Gestures: Evaluating the
acceptance by seniors of Human-Robot Interaction using Microsoft
Kinect," Master Thesis, University of Twente, Enschede, Netherlands,
2011.
 Kaplan, K., 2011. ADES: Automatic Driver Evaluation System. PhD
Thesis, Boğaziçi University, Istanbul, Turkey.
It has been used in
 an increasing number of publications
 Wright, M., Lin, C.-J., O'Neill, E., Cosker, D. and Johnson, P., 2011. 3D
Gesture recognition: An evaluation of user and system performance. In:
Pervasive Computing - 9th International Conference, Pervasive 2011,
Proceedings. Heidelberg: Springer Verlag, pp. 294-313.
 Lourenço, J., 2010. Wii3D: Extending the Nintendo Wii Remote into 3D.
Final course project report, Rhodes University, Grahamstown. 110p.
 Mendelssohn, T.; 2010. Gestureboard - Entwicklung eines Wiimotebasierten, gestengesteuerten, Whiteboard-Systems für den
Bildungsbereich. Final project report. Hochschule Furtwangen
University, Furtwangen im Schwarzwald, Germany.
http://accord.googlecode.com
92

Hidden Conditional Random Fields - Universidade Federal de São

Transcript Hidden Conditional Random Fields - Universidade Federal de São

Directory