Newton Method for the ICA Mixture Model

Jason A. Palmer (1), Scott Makeig (1),
Ken Kreutz-Delgado (2), Bhaskar D. Rao (2)
(1) Swartz Center for Computational Neuroscience
(2) Dept. of Electrical and Computer Engineering
University of California San Diego, La Jolla, CA
Introduction
• Want to model sensor array data with multiple
independent sources — ICA
• Non-stationary source activity — mixture model
• Want the adaptation to be computationally
efficient — Newton method
Outline
• ICA mixture model
• Basic Newton method
• Positive definiteness of Hessian when model
source densities are true source densities
• Newton for ICA mixture model
• Example applications to analysis of EEG
ICA Mixture Model—toy example
• 3 models in two dimensions, 500 points per
model
• Newton method converges in fewer than 200
iterations; natural gradient fails to converge and
has difficulty on poorly conditioned models
[Figure: two panels; both axes run from -10 to 10]
ICA Mixture Model
• Want to model observations x(t), t = 1, ..., N, with
different models “active” at different times
• Bayesian linear mixture model, h = 1, ..., M:
• Conditionally linear given the model:
• Samples are modeled as independent in time:
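As a sketch of the three statements above in symbols: the notation below (model weights \gamma_h, mixing matrices A_h, model centers c_h, per-model parameters \theta_h) is assumed here, following the usual way an ICA mixture model is written, and is not taken from the slide.

```latex
\begin{align}
p\bigl(x(t);\Theta\bigr) &= \sum_{h=1}^{M} \gamma_h \, p_h\bigl(x(t);\theta_h\bigr),
  \qquad \gamma_h \ge 0, \quad \sum_{h=1}^{M} \gamma_h = 1 \\
x(t) &= A_h \, s(t) + c_h \quad \text{when model } h \text{ is active} \\
p\bigl(x(1),\ldots,x(N);\Theta\bigr) &= \prod_{t=1}^{N} p\bigl(x(t);\Theta\bigr)
\end{align}
```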
Source Density Mixture Model
• Each source density mixture component has
unknown location, scale, and shape:
• Generalizes Gaussian
mixture model, more
peaked, heavier tails
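A minimal sketch of such a source density, assuming each component is a generalized Gaussian with its own location, scale, and shape. The function names, the parameter names (weights, locations, scales, shapes), and the example values are hypothetical and chosen only for illustration. A shape of 2 gives a Gaussian component; smaller shapes give more peaked, heavier-tailed components.

```python
import math

def gen_gauss_pdf(s, rho):
    """Unit-scale generalized Gaussian: exp(-|s|^rho) / (2 * Gamma(1 + 1/rho))."""
    return math.exp(-abs(s) ** rho) / (2.0 * math.gamma(1.0 + 1.0 / rho))

def source_mixture_pdf(s, weights, locations, scales, shapes):
    """Mixture of generalized Gaussians, each with its own location, scale, shape."""
    total = 0.0
    for alpha, mu, beta, rho in zip(weights, locations, scales, shapes):
        # sqrt(beta) rescales the argument and keeps each component normalized.
        root = math.sqrt(beta)
        total += alpha * root * gen_gauss_pdf(root * (s - mu), rho)
    return total

def integrate(f, lo, hi, n=20000):
    """Midpoint-rule integral, used only as a sanity check on normalization."""
    h = (hi - lo) / n
    return h * sum(f(lo + (k + 0.5) * h) for k in range(n))

# Hypothetical two-component density: one Laplacian-like (rho = 1),
# one Gaussian-like (rho = 2).
params = dict(weights=[0.6, 0.4], locations=[-1.0, 2.0],
              scales=[1.0, 4.0], shapes=[1.0, 2.0])
area = integrate(lambda s: source_mixture_pdf(s, **params), -40.0, 40.0)
```

The numerical integral in `area` should come out close to 1, which is a quick check that the scale factor is applied consistently.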
ICA Mixture Model—Invariances
• The complete set of parameters to be
estimated is:
h = 1, ..., M, i = 1, ..., n, j = 1, ..., m
• Invariances: W row norm/source density scale
and model centers/source density locations:
Basic ICA Newton Method
• Transform gradient (1st derivative) of cost
function using inverse Hessian (2nd derivative)
• Cost function is data log likelihood:
• Gradient:
• Natural gradient (positive definite transform):
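The three quantities above can be written out as follows for a single unmixing matrix W. This is a hedged sketch in assumed notation: y = Wx are the source estimates, q_i are the model source densities, f_i is the score function, and the cost is taken as the negative average log likelihood (a sign convention assumed here so that the cost is minimized).

```latex
\begin{align}
L(W) &= -\log\lvert\det W\rvert - \frac{1}{N}\sum_{t=1}^{N}\sum_{i=1}^{n}
        \log q_i\bigl(y_i(t)\bigr), \qquad y(t) = W x(t) \\
G \equiv \nabla L(W) &= -W^{-T} + E\{ f(y)\, x^{T} \}, \qquad
        f_i(y_i) = -\frac{d}{dy_i}\log q_i(y_i) \\
\Delta W_{\text{nat}} &= -\,G\, W^{T} W = \bigl(I - E\{ f(y)\, y^{T} \}\bigr)\, W
\end{align}
```

The natural gradient multiplies the ordinary gradient by W^T W, which is positive definite for any invertible W.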
Newton Method – Hessian
• Take derivative of (i,j)th element of gradient
with respect to (k,l)th element of W :
• This defines a linear transform
• In matrix form, this is:
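A worked form of this derivative, under assumed notation: G = -W^{-T} + E{f(y) x^T} is the gradient of the negative average log likelihood, y = Wx, f_i = -(log q_i)', and B is an arbitrary n x n matrix on which the Hessian acts.

```latex
\begin{align}
\frac{\partial G_{ij}}{\partial W_{kl}}
  &= [W^{-1}]_{jk}\,[W^{-1}]_{li}
   + \delta_{ik}\, E\{ f_i'(y_i)\, x_j\, x_l \} \\
[\mathcal{H}(B)]_{ij}
  &= \sum_{k,l} \frac{\partial G_{ij}}{\partial W_{kl}}\, B_{kl} \\
\mathcal{H}(B)
  &= W^{-T} B^{T} W^{-T} + E\{ \operatorname{diag}\bigl(f'(y)\bigr)\, B\, x\, x^{T} \}
\end{align}
```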
Newton Method – Hessian
• To invert: rewrite the Hessian transformation
in terms of the source estimates:
• Define
• Want to solve linear equation
Newton Method – Hessian
• The Hessian transformation can be simplified
using source independence and zero mean:
• This leads to 2x2 block diagonal form:
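The simplification can be sketched as follows, assuming the Hessian transform H(B) = W^{-T} B^T W^{-T} + E{diag(f'(y)) B x x^T} with y = Wx and score function f. The tilde variables and the names \kappa_i, \eta_i, \sigma_i^2 are notation assumed here.

```latex
\begin{align}
&\text{Substitute } B = \tilde{B} W,\quad G = \tilde{G} W^{-T},\quad x = W^{-1} y: \\
&\qquad \mathcal{H}(B) = G
  \;\Longleftrightarrow\;
  \tilde{B}^{T} + E\{ \operatorname{diag}\bigl(f'(y)\bigr)\, \tilde{B}\, y\, y^{T} \} = \tilde{G} \\
&\text{With } \kappa_i = E\{f_i'(y_i)\},\quad
  \eta_i = E\{y_i^{2} f_i'(y_i)\},\quad
  \sigma_i^{2} = E\{y_i^{2}\}, \\
&\text{independent, zero-mean sources give} \\
&\qquad (1 + \eta_i)\,\tilde{b}_{ii} = \tilde{g}_{ii}, \qquad
  \begin{bmatrix} \kappa_i \sigma_j^{2} & 1 \\ 1 & \kappa_j \sigma_i^{2} \end{bmatrix}
  \begin{bmatrix} \tilde{b}_{ij} \\ \tilde{b}_{ji} \end{bmatrix}
  =
  \begin{bmatrix} \tilde{g}_{ij} \\ \tilde{g}_{ji} \end{bmatrix}
  \quad (i \ne j)
\end{align}
```

Independence and zero mean are what remove the cross terms: for i ≠ j, the only surviving expectation in row i, column j is E{f_i'(y_i)} E{y_j^2}.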
Newton Direction
• Invert Hessian transformation, evaluate at
gradient:
• Leads to the following equations:
• Calculate the Newton direction:
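The inversion can be sketched in a few lines, assuming the block-diagonal system takes the form (1 + eta_i) b_ii = g_ii on the diagonal and a 2x2 system [[kappa_i sigma_j^2, 1], [1, kappa_j sigma_i^2]] for each off-diagonal pair (b_ij, b_ji). The function names, argument names, and numerical values below are hypothetical; the statistics kappa, eta, and sigma2 would in practice be sample averages over the current source estimates.

```python
def newton_direction(G, kappa, eta, sigma2):
    """Solve the block-diagonal system for B given a transformed gradient G.

    G      : n x n list of lists (transformed gradient)
    kappa  : kappa[i]  = E{f_i'(y_i)}
    eta    : eta[i]    = E{y_i^2 f_i'(y_i)}
    sigma2 : sigma2[i] = E{y_i^2}
    """
    n = len(G)
    B = [[0.0] * n for _ in range(n)]
    for i in range(n):
        # Diagonal entries decouple into scalar equations.
        B[i][i] = G[i][i] / (1.0 + eta[i])
        for j in range(i + 1, n):
            # Each off-diagonal pair is a 2x2 system, solved in closed form.
            a = kappa[i] * sigma2[j]
            d = kappa[j] * sigma2[i]
            det = a * d - 1.0
            B[i][j] = (d * G[i][j] - G[j][i]) / det
            B[j][i] = (a * G[j][i] - G[i][j]) / det
    return B

def apply_hessian(B, kappa, eta, sigma2):
    """Forward block-diagonal transform, used here to check the inverse."""
    n = len(B)
    return [[(1.0 + eta[i]) * B[i][i] if i == j
             else kappa[i] * sigma2[j] * B[i][j] + B[j][i]
             for j in range(n)] for i in range(n)]

# Hypothetical statistics and gradient for a 3-source problem.
G = [[0.5, -0.2, 0.1], [0.3, 1.0, 0.4], [-0.6, 0.2, -0.3]]
kappa, eta, sigma2 = [1.2, 0.9, 2.0], [0.8, 1.1, 0.5], [1.0, 1.5, 0.7]

B = newton_direction(G, kappa, eta, sigma2)
G_back = apply_hessian(B, kappa, eta, sigma2)
err = max(abs(G_back[i][j] - G[i][j]) for i in range(3) for j in range(3))
```

Applying the forward transform to `B` should reproduce `G` up to rounding, so `err` is a direct check of the closed-form inverse. Because only scalar and 2x2 systems are solved, the cost of forming the direction is on the order of n^2 rather than that of inverting a general n^2 x n^2 Hessian.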
Positive Definiteness of Hessian
• Conditions for positive
definiteness:
• Always true when model source
densities match true densities:
1)
2)
3)
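For a block-diagonal Hessian with diagonal entries 1 + \eta_i and 2x2 blocks [[\kappa_i \sigma_j^2, 1], [1, \kappa_j \sigma_i^2]], positive definiteness reduces to three conditions. The notation is assumed here (\kappa_i = E{f_i'(y_i)}, \eta_i = E{y_i^2 f_i'(y_i)}, \sigma_i^2 = E{y_i^2}, with f_i = -(log q_i)'), and the correspondence with the slide's numbering 1) to 3) is a guess.

```latex
\begin{align}
&1)\quad 1 + \eta_i > 0 \quad \forall i \\
&2)\quad \kappa_i > 0 \quad \forall i \\
&3)\quad \kappa_i\,\kappa_j\,\sigma_i^{2}\,\sigma_j^{2} - 1 > 0 \quad \forall i \ne j
\end{align}
```

When q_i is the true density of y_i, integration by parts gives E{y_i f_i(y_i)} = 1, \kappa_i = E{f_i(y_i)^2}, and 1 + \eta_i = Var(y_i f_i(y_i)), so 1) and 2) hold with at least non-strict inequality. The Cauchy-Schwarz inequality gives \kappa_i \sigma_i^2 >= (E{y_i f_i(y_i)})^2 = 1, with equality only for a Gaussian density, which gives 3) unless both sources in a pair are Gaussian.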
Newton for ICA Mixture Model
• Similar derivation applies to ICA mixture model:
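One ingredient any mixture-model fit of this kind needs is the posterior probability that each model is active at a given time point. The sketch below computes it from per-model log likelihoods with a log-sum-exp, and shows a posterior-weighted average. It is an assumption here, not something stated on the slide, that the per-model expectations in the Newton update are replaced by averages weighted in this way; all function names and values are hypothetical.

```python
import math

def model_posteriors(log_liks, log_weights):
    """Posterior probability of each model for one sample, via log-sum-exp."""
    scores = [ll + lw for ll, lw in zip(log_liks, log_weights)]
    m = max(scores)  # subtract the max so the exponentials cannot overflow
    exps = [math.exp(s - m) for s in scores]
    z = sum(exps)
    return [e / z for e in exps]

def weighted_mean(values, weights):
    """Posterior-weighted sample average, standing in for a per-model expectation."""
    return sum(v * w for v, w in zip(values, weights)) / sum(weights)

# Hypothetical log likelihoods of one sample under 3 equally weighted models.
post = model_posteriors([-2.0, -5.0, -3.0], [math.log(1.0 / 3.0)] * 3)

# Very negative log likelihoods are handled without underflow to 0/0.
post_extreme = model_posteriors([-1000.0, -1001.0], [0.0, 0.0])

# Hypothetical per-sample statistic, averaged with weights for one model.
avg = weighted_mean([1.0, 3.0], [0.25, 0.75])
```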
Convergence Rates
• Convergence is much faster than with the natural
gradient, and the method works with step size 1
• Requires a correct source density model
[Figure: log likelihood versus iteration; vertical axis from -2.03 to -1.97, horizontal axis ticks from 20 to 180]
Segmentation of EEG experiment trials
[Figure: results for 3 models and 4 models; panels show trial versus time and log likelihood versus iteration]
Applications to EEG—Epilepsy
[Figure: results for 1 model and 5 models; panels show log likelihood versus time and log likelihood difference from single model versus time]
Conclusion
• We applied the method of Amari, Cardoso, and
Laheld to formulate a Newton method for the
ICA mixture model
• Arbitrary source densities are modeled with a
non-Gaussian source mixture model
• Non-stationarity is modeled with the ICA mixture
model (multiple mixing matrices learned)
• It works! The Newton method is substantially
faster (superlinear), and it can converge when
the natural gradient fails
Code
• There is Matlab code available!!
– Generate toy mixture model data for testing
– Full method implemented: mixture sources,
mixture ICA, Newton
• Extended version of paper in preparation, with
derivation of mixture model Newton updates
• Download from:
http://sccn.ucsd.edu/~jason
Acknowledgements
• Thanks to Scott Makeig, Howard Poizner, Julie
Onton, Ruey-Song Hwang, Rey Ramirez, Diane
Whitmer, and Allen Gruber for collecting and
consulting on EEG data
• Thanks to Jerry Swartz for founding and
providing ongoing support for the Swartz Center
for Computational Neuroscience
• Thanks for your attention!