A Null-space Algorithm for Overcomplete ICA Ying Nian Wu UCLA Department of Statistics
A Null-space Algorithm for
Overcomplete ICA
Ying Nian Wu
UCLA Department of Statistics
Joint with Ray-Bing Chen
National Kaohsiung University
Plan
• Independent component analysis
• Overcomplete ICA
• Null-space algorithm
• Experiments
Independent Component Analysis (ICA)
Blind source separation (Comon, 1989, 1994)

[Diagram: unknown sources s_t → mixing matrix A → observations x_t; ICA recovers the sources from the observations]
x_t = A s_t,  t = 1, ..., T
s_t = (s_{t,1}, ..., s_{t,m})': source vector, independent components
x_t = (x_{t,1}, ..., x_{t,m})': receiver vector
A = (a_{ij})_{m×m}: linear mixing matrix
Statistical modeling & inference
x_t = A s_t  ⇒  s_t = A^{-1} x_t = W x_t

s_t ~ P(s_t) = ∏_{i=1}^m p_i(s_{t,i}),  p_i: long-tailed

x_t ~ P(W x_t) |W| dx_t

l(W) = ∑_t log P(W x_t) + T log |W|

Amari's natural gradient: ΔW ∝ (∂l/∂W) W'W
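As a concrete illustration (not from the slides), the natural-gradient update can be sketched in a few lines of NumPy. The mixing matrix, learning rate, and iteration counts below are toy assumptions; the sign nonlinearity comes from assuming a Laplace (long-tailed) source density, for which d/ds log p(s) = -sign(s):

```python
import numpy as np

rng = np.random.default_rng(0)

# Toy setup (hypothetical numbers): m = 2 Laplace-distributed sources,
# square mixing matrix A, T observations x_t = A s_t.
m, T = 2, 5000
S = rng.laplace(size=(m, T))            # independent long-tailed sources
A = np.array([[2.0, 1.0], [1.0, 1.5]])  # assumed true mixing matrix
X = A @ S                               # observations

# Gradient ascent on l(W) = sum_t log P(W x_t) + T log|W| using the
# natural gradient (dl/dW) W'W; with a Laplace prior this becomes
# dW ∝ (I - sign(S_hat) S_hat'/T) W.
W = np.eye(m)
lr = 0.01
for _ in range(500):
    S_hat = W @ X
    W += lr * (np.eye(m) - np.sign(S_hat) @ S_hat.T / T) @ W

P = W @ A   # should approach a scaled permutation matrix
print(np.round(P, 2))
```

Up to the usual permutation and scale ambiguity of ICA, W A approaches a scaled permutation matrix when the sources are recovered.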
Coding/mutual information
Unsupervised learning (Bell & Sejnowski, 1997)
x_t = A s_t  ⇒  s_t = W x_t
Overcomplete ICA
x_t = A s_t + ε_t,  t = 1, ..., T
s_t = (s_{t,1}, ..., s_{t,M})': source vector, independent components
x_t = (x_{t,1}, ..., x_{t,m})': receiver vector
A = (a_{ij})_{m×M}: linear mixing matrix (M > m)
[Diagram: unknown sources → mixing matrix A plus noise → observations x_t; OICA recovers the sources]
Olshausen & Field (1996)
f(A, s) = ∑_t { ||x_t − A s_t||² + sparsity(s_t) }

x_t = (a_1, ..., a_M)(s_{t,1}, ..., s_{t,M})' = s_{t,1} a_1 + ... + s_{t,M} a_M
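The alternating minimization of f(A, s) can be sketched as follows; the L1 sparsity penalty, the dimensions, the step counts, and the random stand-in data are assumptions for illustration, not the authors' implementation:

```python
import numpy as np

rng = np.random.default_rng(0)

# Hypothetical sizes: m = 2 observed channels, M = 3 sources (overcomplete).
m, M, T = 2, 3, 200
A = rng.normal(size=(m, M))
X = rng.normal(size=(m, T))   # stand-in data; real use: sound/image patches
lam = 0.1                     # assumed L1 weight for sparsity(s) = |s|_1

S = np.zeros((M, T))
for _ in range(50):
    # s-step: a few ISTA iterations on (1/2)||x - A s||^2 + lam |s|_1
    L = np.linalg.norm(A, 2) ** 2        # Lipschitz constant of the gradient
    for _ in range(10):
        Z = S - A.T @ (A @ S - X) / L
        S = np.sign(Z) * np.maximum(np.abs(Z) - lam / L, 0.0)  # soft threshold
    # A-step: least-squares dictionary update, columns renormalized
    A = (X @ S.T) @ np.linalg.inv(S @ S.T + 1e-6 * np.eye(M))
    A /= np.maximum(np.linalg.norm(A, axis=0), 1e-8)

energy = np.sum((X - A @ S) ** 2) + lam * np.abs(S).sum()
print(energy)
```

Renormalizing the columns of A is the standard device for removing the scale ambiguity between a basis vector and its coefficient.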
Lewicki & Sejnowski (2000), Lee et al (1999)
x_t = A s_t + ε_t,  ε_t ~ N(0, σ²I)

s_t ~ P(s_t) = ∏_{i=1}^M p_i(s_{t,i}),  p_i: long-tailed

l(A) = ∑_t log P(x_t | A) = ∑_t log ∫ P(x_t | s_t, A) P(s_t) ds_t

P(s_t | A, x_t) ∝ P(s_t) P(x_t | s_t, A)
Comparing ICA and Overcomplete ICA
           ICA                       OICA
Noise:     x_t = A s_t               x_t = A s_t + ε_t
Source:    s_t = A^{-1} x_t          P(s_t | A, x_t)
Our work: null-space representation
A s = x
A(s + s̃) = x  whenever  A s̃ = 0
(every solution of A s = x is a particular solution plus a vector in the null space of A)
A simple case
A = (D, 0) with D = diag(d_1, ..., d_m):

( d_1  0  ...  0    0 ... 0 )  ( s_1 )     ( x_1 )
(  0  d_2 ...  0    0 ... 0 )  ( ... )  =  ( ... )
(  0   0  ... d_m   0 ... 0 )  ( s_M )     ( x_m )

s_i = x_i / d_i,  i = 1, ..., m

s = ( D^{-1} x )
    (    c    ),  with c ∈ R^{M−m} free
General situation: singular value decomposition
A s = x

A = U(D, 0)V' = U(D, 0)(V1, V2)' = U D V1'

Let x̃ = U'x and s̃ = V's, so that (D, 0)s̃ = x̃

s = V ( D^{-1} U'x )
      (     c      )  = V1 D^{-1} U'x + V2 c
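The null-space representation can be checked numerically. The 2×3 matrix below is an assumed example matching the slides' dimensions, with NumPy's SVD playing the role of A = U(D, 0)V':

```python
import numpy as np

rng = np.random.default_rng(0)

# Hypothetical sizes matching the slides: m = 2 equations, M = 3 unknowns.
m, M = 2, 3
A = np.array([[2.0, 1.0, 1.5],
              [1.0, 1.5, 1.0]])
x = rng.normal(size=m)

# Full SVD: A = U (D, 0) V' = U D V1'; the rows of V2' span the null space.
U, d, Vt = np.linalg.svd(A)          # Vt is M x M
V1t, V2t = Vt[:m], Vt[m:]            # V1' (m x M), V2' ((M-m) x M)

# Every solution of A s = x has the form s = V1 D^{-1} U'x + V2 c.
s_particular = V1t.T @ np.diag(1.0 / d) @ U.T @ x
for _ in range(3):
    c = rng.normal(size=M - m)       # free null-space coordinates
    s = s_particular + V2t.T @ c
    assert np.allclose(A @ s, x)     # A s = x for every choice of c
print("A s = x holds for every sampled c")
```

Because A V2 = U(D, 0)V'V2 = 0, varying c moves s inside the solution set without changing A s.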
Bayesian modeling & inference
[Diagram: observations x_t → OICA → recovered sources]

A s = x  ⇒  s = V1 D^{-1} U'x + V2 c
s ~ f(s) ds  ⇒  P(x, c | A) dx dc = f(V1 D^{-1} U'x + V2 c) |D|^{-1} dx dc

Observed data: x_1, ..., x_T
Missing data: c_1, ..., c_T
Unknown parameter: A = U(D, 0)V'
Data augmentation algorithm (Tanner & Wong, 1987)
A s = x  ⇒  s = V1 D^{-1} U'x + V2 c
s ~ f(s) ds  ⇒  P(x, c | A) dx dc = f(V1 D^{-1} U'x + V2 c) |D|^{-1} dx dc

Imputation: P(c | x, A) ∝ exp{−H(c)}
Langevin–Euler moves:
c^{(τ+1)} = c^{(τ)} − (h²/2) (∂H(c)/∂c)|_{c = c^{(τ)}} + h Z^{(τ)},  Z^{(τ)} ~ N(0, I)

Posterior: P(A | x, c) ∝ P(s_1, ..., s_T) exp{−T ∑_{i=1}^m w_i},  where w_i = log d_i
• log D: update w_i = log d_i
• (U, V): repeated Givens rotations
  u_i ← u_i cos θ + u_j sin θ,  u_j ← −u_i sin θ + u_j cos θ,
  with θ sampled from P(θ) ∝ P(x_1, ..., x_T, c_1, ..., c_T | A(θ) = U(i, j, θ)(D, 0)V')
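The Langevin–Euler imputation step can be sketched for a single observation under an assumed Laplace source density f(s) ∝ exp(−∑_i |s_i|), so that H(c) = ∑_i |s0 + V2 c|_i; the step size h, the iteration count, and the particular x are illustrative choices:

```python
import numpy as np

rng = np.random.default_rng(0)

# Reuse the null-space form s = V1 D^{-1} U'x + V2 c (m = 2, M = 3),
# with an assumed Laplace source prior giving H(c) = sum_i |s0 + V2 c|_i.
A = np.array([[2.0, 1.0, 1.5],
              [1.0, 1.5, 1.0]])
m, M = A.shape
x = np.array([1.0, -0.5])            # illustrative observation

U, d, Vt = np.linalg.svd(A)
s0 = Vt[:m].T @ np.diag(1.0 / d) @ U.T @ x   # particular solution V1 D^-1 U'x
V2 = Vt[m:].T                                 # null-space basis, M x (M-m)

def grad_H(c):
    # dH/dc = V2' sign(s) for the Laplace energy H(c) = sum |s0 + V2 c|
    return V2.T @ np.sign(s0 + V2 @ c)

# Langevin-Euler move: c <- c - (h^2/2) dH/dc + h Z,  Z ~ N(0, I)
h = 0.1
c = np.zeros(M - m)
samples = []
for _ in range(2000):
    c = c - 0.5 * h**2 * grad_H(c) + h * rng.normal(size=M - m)
    samples.append(c.copy())
samples = np.array(samples)
print(samples.mean(axis=0))
```

Each trajectory of c explores the posterior P(c | x, A) ∝ exp{−H(c)} along the null space of A, leaving A s = x satisfied throughout.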
AR coefficient known
True mixing matrix:
A = ( 2   1    1.5 )
    ( 1   1.5  1   )

[Figure: waveforms of Mixture 1 and Mixture 2; Speech / Recovered speech; Rain / Recovered rain; Wind / Recovered wind]

Estimated mixing matrix:
Â = ( 1.9006  0.9599  1.6558 )
    ( 1.0430  1.4479  0.7098 )
AR coefficient known
True mixing matrix:
A = ( 2   1    1.5 )
    ( 1   1.5  1   )

[Figure: waveforms of Mixture 1 and Mixture 2; Ocean wave / Recovered wave; Rain / Recovered rain; Wind / Recovered wind]

Estimated mixing matrix:
Â = ( 1.8667  1.0897  1.5760 )
    ( 0.9798  1.5881  0.8877 )
AR coefficient known
True mixing matrix:
A = ( 2   1    1.5 )
    ( 1   1.5  1   )

[Figure: waveforms of Mixture 1 and Mixture 2; Ocean wave / Recovered wave; Rain / Recovered rain; Engine / Recovered engine]

Estimated mixing matrix:
Â = ( 1.6091  1.4415  1.8061 )
    ( 1.1826  1.3737  1.2294 )
AR coefficient unknown
True mixing matrix:
A = ( 2   1    1.5 )
    ( 1   1.5  1   )

[Figure: waveforms of Mixture 1 and Mixture 2; Ocean wave / Recovered wave; Rain / Recovered rain; Wind / Recovered wind]

Estimated mixing matrix:
Â = ( 1.7381  0.8121  1.8467 )
    ( 1.2052  1.3309  0.7981 )
Parameter estimation
True AR coefficients:
Lag:        1      2      3      4      5      6      7      8
Source 1:   0.613 -0.264  0.235 -0.202  0.171 -0.188  0.139 -0.068
Source 2:   1.015 -0.610  0.461 -0.331  0.340 -0.196  0.176  0.011
Source 3:   0.849 -0.448  0.353 -0.216  0.244 -0.133  0.083 -0.034

Estimated AR coefficients:
Lag:         1      2      3      4      5      6      7      8
Recovered 1: 0.568 -0.202  0.156 -0.226  0.113 -0.182  0.173 -0.179
Recovered 2: 1.135 -0.855  0.717 -0.520  0.490 -0.266  0.137  0.096
Recovered 3: 0.810 -0.424  0.321 -0.216  0.231 -0.142  0.079 -0.046