A Null-space Algorithm for Overcomplete ICA Ying Nian Wu UCLA Department of Statistics


A Null-space Algorithm for
Overcomplete ICA
Ying Nian Wu
UCLA Department of Statistics
Joint with Ray-Bing Chen
National Kaohsiung University
Plan
• Independent component analysis
• Overcomplete ICA
• Null-space algorithm
• Experiments
Independent Component Analysis (ICA)
Blind source separation (Comon, 1989, 1994)

[Diagram: unknown sources $s_t$ pass through an unknown mixing matrix $A$ to give the observations $x_t$; ICA recovers both.]

$$x_t = A s_t, \quad t = 1, \ldots, T$$

$s_t = (s_{t,1}, \ldots, s_{t,m})'$: source vector of independent components
$x_t = (x_{t,1}, \ldots, x_{t,m})'$: receiver vector
$A = (a_{ij})_{m \times m}$: linear mixing matrix
Statistical modeling & inference

$$x_t = A s_t \;\Longleftrightarrow\; s_t = A^{-1} x_t = W x_t$$

$$s_t \sim P(s_t) = \prod_{i=1}^{m} p_i(s_{t,i}), \qquad p_i \text{ long-tailed}$$

$$x_t \sim P(W x_t)\, |W|\, dx_t$$

$$l(W) = \sum_t \log P(W x_t) + T \log |W|$$

Amari's natural gradient: $\Delta W \propto \dfrac{\partial l}{\partial W} W' W$
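Below is a minimal numpy sketch of this natural-gradient update. The logistic-style score $\varphi(s) = \tanh(s)$, the learning rate, and the iteration count are illustrative assumptions, not choices made in the talk.

```python
import numpy as np

def natural_gradient_ica(X, n_iter=500, eta=0.01, seed=0):
    """Square-mixing ICA by Amari's natural gradient.

    X: (m, T) array of observations x_t = A s_t.
    Assumes a long-tailed source with score phi(s) = tanh(s)
    (an illustrative choice, not specified in the talk).
    """
    m, T = X.shape
    rng = np.random.default_rng(seed)
    W = np.eye(m) + 0.01 * rng.standard_normal((m, m))
    for _ in range(n_iter):
        S = W @ X                    # current source estimates s_t = W x_t
        phi = np.tanh(S)             # score of the assumed source density
        # natural gradient of l(W): (I - E[phi(s) s']) W
        W += eta * (np.eye(m) - (phi @ S.T) / T) @ W
    return W
```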
Coding / mutual information
Unsupervised learning (Bell & Sejnowski, 1997)

$$x_t = A s_t \;\Longleftrightarrow\; s_t = W x_t$$
Overcomplete ICA

$$x_t = A s_t + \epsilon_t, \quad t = 1, \ldots, T$$

$s_t = (s_{t,1}, \ldots, s_{t,M})'$: source vector of independent components
$x_t = (x_{t,1}, \ldots, x_{t,m})'$: receiver vector, with $M > m$
$A = (a_{ij})_{m \times M}$: linear mixing matrix

[Diagram: more sources than receivers, so OICA must recover $M$ sources from $m$ mixtures.]
Olshausen & Field (1996)

$$f(A, s) = \sum_t \left\{ \| x_t - A s_t \|^2 + \lambda\, \mathrm{sparsity}(s_t) \right\}$$

$$x_t = (a_1, \ldots, a_M) \begin{pmatrix} s_{t,1} \\ \vdots \\ s_{t,M} \end{pmatrix} = s_{t,1} a_1 + \cdots + s_{t,M} a_M$$
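As a quick illustration of this energy, here is a short numpy sketch; the L1 penalty and the weight `lam` are assumed stand-ins for the unspecified sparsity term.

```python
import numpy as np

def sparse_coding_energy(A, S, X, lam=0.1):
    """Olshausen-Field style objective:
    sum_t ||x_t - A s_t||^2 + lam * sparsity(s_t).

    A: (m, M) dictionary, S: (M, T) codes, X: (m, T) data.
    Uses an L1 penalty as one common sparsity choice (assumption).
    """
    residual = X - A @ S
    return np.sum(residual ** 2) + lam * np.sum(np.abs(S))
```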
Lewicki & Sejnowski (2000), Lee et al. (1999)

$$x_t = A s_t + N(0, \sigma^2 I)$$

$$s_t \sim P(s_t) = \prod_{i=1}^{M} p_i(s_{t,i}), \qquad p_i \text{ long-tailed}$$

$$l(A) = \sum_t \log P(x_t \mid A) = \sum_t \log \int P(x_t \mid s_t, A)\, P(s_t)\, ds_t$$

$$P(s_t \mid A, x_t) \propto P(s_t)\, P(x_t \mid s_t, A)$$
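The integral in $l(A)$ is the computational bottleneck. The naive Monte Carlo sketch below makes it concrete, assuming a Laplace source density and Gaussian noise; it is illustrative only, since such an estimator degrades rapidly as $M$ grows, which is precisely why approximate inference schemes are needed.

```python
import numpy as np

def log_marginal_mc(A, x, sigma=0.1, n_samples=100_000, seed=0):
    """Naive Monte Carlo estimate of
    log P(x | A) = log E_{s ~ P(s)} [ N(x; A s, sigma^2 I) ].

    Assumes i.i.d. Laplace sources as the long-tailed prior (assumption).
    """
    rng = np.random.default_rng(seed)
    m, M = A.shape
    S = rng.laplace(size=(n_samples, M))      # draws from the source prior
    resid = S @ A.T - x                       # (n_samples, m) residuals A s - x
    log_lik = (-0.5 * np.sum(resid ** 2, axis=1) / sigma ** 2
               - 0.5 * m * np.log(2 * np.pi * sigma ** 2))
    top = log_lik.max()                       # log-mean-exp for stability
    return top + np.log(np.mean(np.exp(log_lik - top)))
```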
Comparing ICA and Overcomplete ICA

          ICA                     OICA
Noise     $x_t = A s_t$           $x_t = A s_t + \epsilon_t$
Source    $s_t = A^{-1} x_t$      $P(s_t \mid A, x_t)$
Our work: null-space representation

$$A s = x, \qquad A(s + \tilde{s}) = x \ \text{ whenever } \ A \tilde{s} = 0$$

Every solution of $A s = x$ is a particular solution plus an element of the null space of $A$; the algorithm parameterizes the solution set by its null-space coordinates.
A simple case: $A = (D, 0)$, $D = \mathrm{diag}(d_1, \ldots, d_m)$

$$\begin{pmatrix} d_1 & & & 0 & \cdots & 0 \\ & \ddots & & \vdots & & \vdots \\ & & d_m & 0 & \cdots & 0 \end{pmatrix} \begin{pmatrix} s_1 \\ s_2 \\ \vdots \\ s_M \end{pmatrix} = \begin{pmatrix} x_1 \\ x_2 \\ \vdots \\ x_m \end{pmatrix}$$

$$s_i = x_i / d_i, \quad i = 1, \ldots, m, \qquad s = \begin{pmatrix} D^{-1} x \\ c \end{pmatrix}$$

where $c \in \mathbb{R}^{M-m}$ collects the unconstrained coordinates.
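A two-line numpy check of this case: for $A = (D, 0)$, the vector $s = (D^{-1} x;\, c)$ satisfies $A s = x$ no matter how $c$ is chosen. The dimensions and test values are illustrative.

```python
import numpy as np

m, M = 2, 4
d = np.array([2.0, 3.0])
A = np.hstack([np.diag(d), np.zeros((m, M - m))])     # A = (D, 0)
x = np.array([1.0, -0.5])

c = np.random.default_rng(0).standard_normal(M - m)   # free null-space coordinates
s = np.concatenate([x / d, c])                        # s = (D^{-1} x ; c)
assert np.allclose(A @ s, x)                          # holds for every c
```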
General situation: singular value decomposition

$$A s = x, \qquad A = U (D, 0) V' = U (D, 0) \begin{pmatrix} V_1' \\ V_2' \end{pmatrix} = U D V_1'$$

Let $\tilde{x} = U' x$ and $\tilde{s} = V' s$, so that $(D, 0)\, \tilde{s} = \tilde{x}$ and

$$s = V \begin{pmatrix} D^{-1} U' x \\ c \end{pmatrix} = V_1 D^{-1} U' x + V_2 c$$
Bayesian modeling & inference

$$A s = x \;\Longleftrightarrow\; s = V_1 D^{-1} U' x + V_2 c$$

$$s \sim f(s)\, ds \;\Longleftrightarrow\; P(x, c \mid A) = f(V_1 D^{-1} U' x + V_2 c)\, |D|^{-1}\, dx\, dc$$

Observed data: $x_1, \ldots, x_T$
Missing data: $c_1, \ldots, c_T$
Unknown parameter: $A = U (D, 0) V'$
Data augmentation algorithm (Tanner & Wong, 1987)

$$A s = x \;\Longleftrightarrow\; s = V_1 D^{-1} U' x + V_2 c$$

$$s \sim f(s)\, ds \;\Longleftrightarrow\; P(x, c \mid A) = f(V_1 D^{-1} U' x + V_2 c)\, |D|^{-1}\, dx\, dc$$

Imputation: $P(c \mid x, A) \propto \exp\{-H(c)\}$

Langevin-Euler moves:
$$c^{(\tau + 1)} = c^{(\tau)} - \frac{1}{2} \left. \frac{\partial H(c)}{\partial c} \right|_{c = c^{(\tau)}} h + \sqrt{h}\, Z^{(\tau)}$$
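Here is a minimal sketch of these Langevin moves, assuming a Laplace source density $f(s) \propto \exp\{-\sum_i |s_i|\}$ so that $H(c) = \sum_i |s_i|$ with $s = V_1 D^{-1} U' x + V_2 c$; the step size and step count are illustrative choices.

```python
import numpy as np

def langevin_impute(c, x, U, d, V1t, V2t, h=0.01, n_steps=100, seed=0):
    """Langevin-Euler moves targeting P(c | x, A) ~ exp{-H(c)}.

    Assumes a Laplace source density, so H(c) = sum_i |s_i| and
    dH/ds = sign(s); by the chain rule dH/dc = V2' sign(s).
    V1t, V2t are the row blocks V1', V2' of V'; d holds the singular values.
    """
    rng = np.random.default_rng(seed)
    s_part = V1t.T @ ((U.T @ x) / d)        # fixed particular solution
    for _ in range(n_steps):
        s = s_part + V2t.T @ c              # current full source vector
        grad_H = V2t @ np.sign(s)           # gradient of H with respect to c
        c = c - 0.5 * h * grad_H + np.sqrt(h) * rng.standard_normal(c.shape)
    return c
```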
Posterior: $P(A \mid x, c)$

For $\log D$: with $w_i = \log d_i$, move $w$ along the gradient of
$$\log P(s_1, \ldots, s_T) - T \sum_{i=1}^{m} w_i.$$

For $(U, V)$: repeated Givens rotations,
$$u_i \leftarrow u_i \cos\theta + u_j \sin\theta, \qquad u_j \leftarrow -u_i \sin\theta + u_j \cos\theta,$$
with $\theta$ sampled from
$$P(\theta) \propto P(x_1, \ldots, x_T, c_1, \ldots, c_T \mid A(\theta)), \qquad A(\theta) = U(i, j, \theta)\, (D, 0)\, V'.$$
AR coefficients known

$$A = \begin{pmatrix} 2 & 1 & 1.5 \\ 1 & 1.5 & 1 \end{pmatrix}, \qquad \hat{A} = \begin{pmatrix} 1.9006 & 1.0430 & 1.4479 \\ 0.9599 & 1.6558 & 0.7098 \end{pmatrix}$$

[Waveform plots: Mixture 1, Mixture 2; Speech and recovered speech; Rain and recovered rain; Wind and recovered wind.]
AR coefficients known

$$A = \begin{pmatrix} 2 & 1 & 1.5 \\ 1 & 1.5 & 1 \end{pmatrix}, \qquad \hat{A} = \begin{pmatrix} 1.8667 & 1.0897 & 1.5760 \\ 0.9798 & 1.5881 & 0.8877 \end{pmatrix}$$

[Waveform plots: Mixture 1, Mixture 2; Ocean wave and recovered wave; Rain and recovered rain; Wind and recovered wind.]
AR coefficients known

$$A = \begin{pmatrix} 2 & 1 & 1.5 \\ 1 & 1.5 & 1 \end{pmatrix}, \qquad \hat{A} = \begin{pmatrix} 1.6091 & 1.1826 & 1.3737 \\ 1.4415 & 1.8061 & 1.2294 \end{pmatrix}$$

[Waveform plots: Mixture 1, Mixture 2; Ocean wave and recovered wave; Rain and recovered rain; Engine and recovered engine.]
AR coefficients unknown

$$A = \begin{pmatrix} 2 & 1 & 1.5 \\ 1 & 1.5 & 1 \end{pmatrix}, \qquad \hat{A} = \begin{pmatrix} 1.7381 & 0.8121 & 1.8467 \\ 1.2052 & 1.3309 & 0.7981 \end{pmatrix}$$

[Waveform plots: Mixture 1, Mixture 2; Ocean wave and recovered wave; Rain and recovered rain; Wind and recovered wind.]
Parameter estimation

True AR coefficients:

             1       2       3       4       5       6       7       8
Source 1   0.613  -0.264   0.235  -0.202   0.171  -0.188   0.139  -0.068
Source 2   1.015  -0.610   0.461  -0.331   0.340  -0.196   0.176   0.011
Source 3   0.849  -0.448   0.353  -0.216   0.244  -0.133   0.083  -0.034

Estimated AR coefficients:

             1       2       3       4       5       6       7       8
Recover 1  0.568  -0.202   0.156  -0.226   0.113  -0.182   0.173  -0.179
Recover 2  1.135  -0.855   0.717  -0.520   0.490  -0.266   0.137   0.096
Recover 3  0.810  -0.424   0.321  -0.216   0.231  -0.142   0.079  -0.046