BME502 Stochastic Networks Presentation


Stochastic Networks
Introduction to the Boltzmann Machine
Darryl H. Hwang
BME 502
Boltzmann Machine
• Network model
• Input-output relationship is stochastic
– Probabilistic
• One of the first networks to introduce hidden units
• Two phases
– Training phase
– Free phase
Starting Point
• Neurons are treated as binary.
– v_a(t) = 1 if active
– v_a(t) = 0 if inactive
Danger: MATH AHEAD!
• Unit a is determined by its total input current

  I_a(t) = h_a(t) + Σ_{a'=1}^{N_v} M_{aa'} v_{a'}(t)

– M_{aa'} = M_{a'a} (symmetric weights)
– M_{aa} = 0 (no self-connections)
– h_a = total feedforward input into unit a
• At each multiple of Δt, a randomly chosen unit a gets updated
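The input current can be sketched in a few lines of NumPy. The 3-unit network, the h values, and the M entries below are illustrative assumptions, not taken from the slides:

```python
import numpy as np

# Hypothetical small network (3 units); all values are illustrative.
h = np.array([0.5, -0.2, 0.1])          # feedforward input h_a
M = np.array([[0.0, 1.0, -0.5],
              [1.0, 0.0, 0.3],
              [-0.5, 0.3, 0.0]])        # symmetric, zero diagonal
v = np.array([1, 0, 1])                 # current binary state v_a(t)

# Total input current: I_a(t) = h_a(t) + sum_{a'} M_{aa'} v_{a'}(t)
I = h + M @ v
```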
Probability
P[va t  t   1]  F I a t 
1
F (I a ) 
Ia
1 e
• F is a sigmoidal function.
• Larger Ia, more likely unit a =1
P[va t  t   0]  1  F I a t 
Probability
P[va t  t   1]  F I a t 
1
F (I a ) 
Ia
1 e
• Markov chain: v(t + Δt) depends only on v(t), not on the
history of the network.
• Glauber dynamics
– v is described by a probability distribution
and doesn’t converge on a fixed point
Energy Function
  E(v) = -h · v - (1/2) v · M · v

  P[v] = e^{-E(v)} / Z

  Z = Σ_v e^{-E(v)}
• Z = partition function
• P[v] = Boltzmann distribution
– States with lower energy more likely
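For a network small enough to enumerate all 2^{N_v} states, the energy, the partition function, and the Boltzmann distribution can be computed directly. The parameters below are illustrative assumptions:

```python
import itertools
import numpy as np

def energy(v, h, M):
    """E(v) = -h·v - (1/2) v·M·v."""
    return -h @ v - 0.5 * v @ M @ v

# Illustrative parameters (assumed, not from the slides)
h = np.array([0.5, -0.2, 0.1])
M = np.array([[0.0, 1.0, -0.5],
              [1.0, 0.0, 0.3],
              [-0.5, 0.3, 0.0]])

# Enumerate all 2^3 binary states
states = [np.array(s) for s in itertools.product([0, 1], repeat=3)]
E = np.array([energy(v, h, M) for v in states])
Z = np.exp(-E).sum()          # partition function Z = sum_v e^{-E(v)}
P = np.exp(-E) / Z            # Boltzmann distribution P[v]
```

Because P decreases monotonically with E, the lowest-energy state is the most probable one.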
Gibbs Sampling
• Glauber dynamics uses Gibbs sampling to draw from the distribution

  P[v_a(t + Δt) = 1] = F(I_a(t))

  F(I_a) = 1 / (1 + e^{-I_a})
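One way to see that this sampling rule draws from the Boltzmann distribution: run Glauber dynamics on a small network and compare the empirical state frequencies with the exact distribution computed by enumeration. All parameters below are illustrative assumptions:

```python
import itertools
import numpy as np

def F(I):
    return 1.0 / (1.0 + np.exp(-I))

def energy(v, h, M):
    return -h @ v - 0.5 * v @ M @ v

# Illustrative parameters (assumed)
rng = np.random.default_rng(1)
h = np.array([0.5, -0.2, 0.1])
M = np.array([[0.0, 1.0, -0.5],
              [1.0, 0.0, 0.3],
              [-0.5, 0.3, 0.0]])

# Exact Boltzmann distribution by enumeration
states = list(itertools.product([0, 1], repeat=3))
E = np.array([energy(np.array(s), h, M) for s in states])
P = np.exp(-E) / np.exp(-E).sum()

# Glauber dynamics: update one randomly chosen unit per step
v = np.zeros(3, dtype=int)
counts = {s: 0 for s in states}
for step in range(100_000):
    a = rng.integers(3)
    v[a] = 1 if rng.random() < F(h[a] + M[a] @ v) else 0
    if step >= 5_000:                  # discard burn-in samples
        counts[tuple(v)] += 1

emp = np.array([counts[s] for s in states]) / sum(counts.values())
```

After burn-in, the empirical frequencies `emp` should approximate the Boltzmann probabilities `P`.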
Mean-field Approximation
• I is determined by a dynamic equation

  τ_s dI/dt = -I + h + M · F(I)

• Instead of the binary value v_a, use its mean F(I_a), since

  P[v_a(t + Δt) = 1] = F(I_a(t))

  F(I_a) = 1 / (1 + e^{-I_a})
Mean-field Distribution
• Units are independent
• Probability distribution for v
  Q(v) = Π_{a=1}^{N_v} F(I_a)^{v_a} (1 - F(I_a))^{1 - v_a}
This is the mean-field distribution for the Boltzmann machine, and a way of interpreting the network's outputs.
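A sketch of the factorized distribution Q(v): because the units are independent, Q sums to one over all 2^{N_v} states. The current values I below are illustrative assumptions (e.g. taken from a mean-field fixed point):

```python
import itertools
import numpy as np

def F(I):
    return 1.0 / (1.0 + np.exp(-I))

# Illustrative currents (assumed values)
I = np.array([0.8, 0.3, -0.1])

def Q(v, I):
    """Q(v) = prod_a F(I_a)^{v_a} (1 - F(I_a))^{1 - v_a}."""
    p = F(I)                             # P[v_a = 1] for each unit
    v = np.asarray(v)
    return float(np.prod(np.where(v == 1, p, 1.0 - p)))

# Independence makes Q a proper distribution over all binary states
total = sum(Q(v, I) for v in itertools.product([0, 1], repeat=3))
```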