The Bayesian brain, free energy and psychopathology Cambridge– May 23rd 2013 Karl Friston Abstract If we assume that neuronal activity encodes a probabilistic representation.

The Bayesian brain, free energy and psychopathology
Cambridge, May 23rd 2013
Karl Friston
Abstract
If we assume that neuronal activity encodes a probabilistic representation of the world that optimizes free energy in a Bayesian fashion, then this optimization can be regarded as evidence accumulation or
(generalized) predictive coding. Crucially, both predictions about the state of the world generating sensory
data and the precision of (confidence in) those data have to be optimized. In other words, we have to make
predictions (test hypotheses) about the content of the sensorium and predict our confidence in those
hypotheses. I hope to demonstrate the metacognitive aspect of this inference using simulations of action
observation and sensory attenuation - to illustrate the nature of active inference and promote discussion
about its role in making inferences about self and others.
Active inference, predictive coding and precision
Precision and false inference
Simulations of:
Auditory perception (and omission related responses)
Handwriting (and action observation)
Smooth pursuit eye movements (under occlusion)
Sensory attenuation (and the force matching illusion)
“Objects are always imagined as being present in the field of vision as
would have to be there in order to produce the same impression on
the nervous mechanism” - von Helmholtz
Hermann von Helmholtz
Richard Gregory
Geoffrey Hinton
From the Helmholtz machine to the
Bayesian brain and self-organization
Thomas Bayes
Richard Feynman
Ross Ashby
Action and perception minimise surprise
Prediction error: sensations − predictions
Action: change sensations
Perception: change predictions
Action as inference – the “Bayesian thermostat”
Posterior distribution: $p(\psi \mid s)$
Prior distribution: $p(\psi)$
Likelihood distribution: $p(s \mid \psi)$
Figure: prior, likelihood and posterior distributions over temperature (axis from 20 to 120), with action $a(t)$
Perception:
$$\mu = \arg\min_{\mu} F(s, \mu, \pi) = \arg\min_{\mu} \left[ \pi_s \, (s(a) - g(\mu))^2 + \pi_\mu \, (\mu - \eta)^2 \right]$$
Action:
$$a = \arg\min_{a} F(s, \mu, \pi) = \arg\min_{a} \left[ \pi_s \, (s(a) - g(\mu))^2 + \pi_\mu \, (\mu - \eta)^2 \right]$$
$$s = g(\psi) + \omega$$
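The two minimisations on this slide can be sketched numerically. The following is a minimal illustration rather than the simulation used in the talk: it assumes an identity mapping g(μ) = μ, a hypothetical world in which the sensed temperature is `ambient + a`, and hand-picked precisions and step size. Perception and action both descend the same free energy, written here as the sum of precision-weighted squared prediction errors.

```python
def free_energy(s, mu, eta, pi_s, pi_mu):
    """Sum of precision-weighted squared prediction errors."""
    return pi_s * (s - mu) ** 2 + pi_mu * (mu - eta) ** 2


def run_thermostat(ambient=10.0, eta=37.0, pi_s=1.0, pi_mu=4.0,
                   rate=0.01, steps=2000):
    """Gradient descent on F with respect to both mu and a.

    Assumptions (not from the talk): g(mu) = mu, and the sensed
    temperature is s(a) = ambient + a, so ds/da = 1.
    """
    mu, a = ambient, 0.0
    for _ in range(steps):
        s = ambient + a
        sensory_error = s - mu      # s(a) - g(mu)
        prior_error = mu - eta      # mu - eta
        # Perception: dF/dmu changes the prediction.
        d_mu = -2.0 * pi_s * sensory_error + 2.0 * pi_mu * prior_error
        # Action: dF/da changes the sensation (ds/da = 1 here).
        d_a = 2.0 * pi_s * sensory_error
        mu -= rate * d_mu
        a -= rate * d_a
    return mu, ambient + a


if __name__ == "__main__":
    mu, s = run_thermostat()
    print(f"expected temperature {mu:.3f}, sensed temperature {s:.3f}")
```

In this toy setting the expectation is drawn to the prior η, and action then changes the sensed temperature until it matches the expectation, which is the sense in which the thermostat "infers" its own set point.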
From models to perception
Generative model
A simple hierarchy
$$D x^{(i)} = f^{(i)}(x^{(i)}, v^{(i)}) + \omega_x^{(i)}$$
$$v^{(i-1)} = g^{(i)}(x^{(i)}, v^{(i)}) + \omega_v^{(i)}$$
Figure: hierarchical message passing between levels, with descending predictions and ascending prediction errors
Model inversion (inference)
Expectations:
$$\dot{\tilde\mu}_v^{(i)} = D\tilde\mu_v^{(i)} - \partial_{\tilde v}\tilde\varepsilon^{(i)} \cdot \xi^{(i)} - \xi_v^{(i+1)}$$
$$\dot{\tilde\mu}_x^{(i)} = D\tilde\mu_x^{(i)} - \partial_{\tilde x}\tilde\varepsilon^{(i)} \cdot \xi^{(i)}$$
Predictions:
$$g^{(i)} = g^{(i)}(\tilde\mu_x^{(i)}, \tilde\mu_v^{(i)})$$
$$f^{(i)} = f^{(i)}(\tilde\mu_x^{(i)}, \tilde\mu_v^{(i)})$$
Prediction errors:
$$\xi_v^{(i)} = \Pi_v^{(i)} \tilde\varepsilon_v^{(i)} = \Pi_v^{(i)} (\tilde\mu_v^{(i-1)} - g^{(i)})$$
$$\xi_x^{(i)} = \Pi_x^{(i)} \tilde\varepsilon_x^{(i)} = \Pi_x^{(i)} (D\tilde\mu_x^{(i)} - f^{(i)})$$
David Mumford
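The message passing on this slide can be illustrated with a stripped-down sketch. It is not the generalized predictive coding scheme itself: it assumes a static two-level model with linear mappings, no hidden states x and no generalized coordinates, and the weights, precisions and step size are hypothetical. Each expectation is driven by the precision-weighted error it explains (ascending) and by the error on its own prediction from the level above (descending).

```python
def infer(s, eta, w=(1.0, 1.0), pi=(1.0, 1.0, 1.0), rate=0.1, steps=500):
    """Static, linear, two-level predictive coding (illustrative only).

    Level 1 predicts the data:      s   ~ w[0] * mu1
    Level 2 predicts level 1:       mu1 ~ w[1] * mu2
    A prior constrains the top:     mu2 ~ eta
    """
    mu1, mu2 = 0.0, 0.0
    for _ in range(steps):
        # Precision-weighted prediction errors at each level.
        xi1 = pi[0] * (s - w[0] * mu1)
        xi2 = pi[1] * (mu1 - w[1] * mu2)
        xi3 = pi[2] * (mu2 - eta)
        # Each expectation is pushed by the error below it and
        # pulled back by the error on its own prediction from above.
        mu1 += rate * (w[0] * xi1 - xi2)
        mu2 += rate * (w[1] * xi2 - xi3)
    return mu1, mu2


if __name__ == "__main__":
    print(infer(s=3.0, eta=0.0))
```

With unit weights and precisions, the expectations settle between the data and the prior; raising or lowering an entry of `pi` shifts that balance, which is the role precision plays throughout the talk.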
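Predictive coding with reflexes
Figure labels: Action (oculomotor signals, reflex arc, proprioceptive input, pons); Perception (retinal input, geniculate, visual cortex, frontal eye fields); Attention (VTA)
Action:
$$\dot a = -\partial_a \tilde s \cdot \xi_v^{(1)}$$
Prediction error (superficial pyramidal cells), passed bottom-up or forward:
$$\xi_v^{(i)} = \Pi_v^{(i)} \tilde\varepsilon_v^{(i)} = \Pi_v^{(i)} (\tilde\mu_v^{(i-1)} - g^{(i)}(\tilde\mu_x^{(i)}, \tilde\mu_v^{(i)}))$$
$$\xi_x^{(i)} = \Pi_x^{(i)} \tilde\varepsilon_x^{(i)} = \Pi_x^{(i)} (D\tilde\mu_x^{(i)} - f^{(i)}(\tilde\mu_x^{(i)}, \tilde\mu_v^{(i)}))$$
Conditional predictions (deep pyramidal cells), passed top-down or backward:
$$\dot{\tilde\mu}_v^{(i)} = D\tilde\mu_v^{(i)} - \partial_{\tilde v}\tilde\varepsilon^{(i)} \cdot \xi^{(i)} - \xi_v^{(i+1)}$$
$$\dot{\tilde\mu}_x^{(i)} = D\tilde\mu_x^{(i)} - \partial_{\tilde x}\tilde\varepsilon^{(i)} \cdot \xi^{(i)}$$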
Prediction error can be reduced by changing predictions (perception)
Prediction error can be reduced by changing sensations (action)
Perception entails recurrent message passing in the brain to optimize predictions
Action fulfils descending predictions
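The statement that action reduces prediction error by changing sensations can be made explicit with a short derivation. This is a sketch under the assumption that free energy is, up to constants, a sum of precision-weighted squared prediction errors, and that only the lowest-level sensory prediction error depends on action; the symbols $\Pi_v^{(1)}$ and $\xi_v^{(1)}$ follow the precision and prediction-error notation of these slides, and the ellipsis stands for terms that do not depend on $a$.

```latex
\begin{aligned}
F &\approx \tfrac{1}{2}\,\tilde\varepsilon_v^{(1)\top}\,\Pi_v^{(1)}\,\tilde\varepsilon_v^{(1)} + \ldots,
\qquad \tilde\varepsilon_v^{(1)} = \tilde s(a) - g^{(1)} \\
\frac{\partial F}{\partial a} &= \partial_a \tilde s^{\top}\,\Pi_v^{(1)}\,\tilde\varepsilon_v^{(1)}
 = \partial_a \tilde s \cdot \xi_v^{(1)} \\
\dot a &= -\frac{\partial F}{\partial a} = -\,\partial_a \tilde s \cdot \xi_v^{(1)}
\end{aligned}
```

Read this way, action only ever sees the precision-weighted sensory prediction error, which is why it can be implemented by a reflex arc that fulfils descending predictions.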
Neuromodulatory failure (of sensory attenuation)
Decompensation (trait abnormalities): attenuated violation responses, loss of perceptual Gestalt, SPEM abnormalities, psychomotor poverty, resistance to illusions
Compensation (to psychotic state): hallucinations, delusions
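Neuronal hierarchy
Generative process (and model)
$$f^{(2)} = \begin{bmatrix} 18 x_2^{(2)} - 18 x_1^{(2)} \\ 32 x_1^{(2)} - 2 x_3^{(2)} x_1^{(2)} - x_2^{(2)} \\ 2 x_1^{(2)} x_2^{(2)} - \tfrac{8}{3} x_3^{(2)} \end{bmatrix}$$
$$g^{(2)} = \begin{bmatrix} x_2^{(2)} \\ x_3^{(2)} \end{bmatrix} = \begin{bmatrix} v_1^{(1)} \\ v_2^{(1)} \end{bmatrix}$$
$$f^{(1)} = \begin{bmatrix} 18 x_2^{(1)} - 18 x_1^{(1)} \\ v_1^{(1)} x_1^{(1)} - 2 x_3^{(1)} x_1^{(1)} - x_2^{(1)} \\ 2 x_1^{(1)} x_2^{(1)} - v_2^{(1)} x_3^{(1)} \end{bmatrix}$$
$$g^{(1)} = \begin{bmatrix} x_2^{(1)} \\ x_3^{(1)} \end{bmatrix} = \begin{bmatrix} s_1 \\ s_2 \end{bmatrix}$$
Figure label: Syrinx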
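The second-level equations of motion on this slide have the form of a Lorenz attractor, and a short sketch shows how such a system can be integrated to produce the slowly varying causes passed to the level below. The integrator (fourth-order Runge-Kutta), step size, duration and initial state are assumptions made for illustration; only the right-hand side `f2` and the output mapping `g2` are taken from the slide.

```python
def f2(x):
    """Second-level equations of motion (Lorenz form) from the slide."""
    x1, x2, x3 = x
    return (18.0 * x2 - 18.0 * x1,
            32.0 * x1 - 2.0 * x3 * x1 - x2,
            2.0 * x1 * x2 - (8.0 / 3.0) * x3)


def g2(x):
    """Second-level output: two states become the causes v1, v2 below."""
    return (x[1], x[2])


def rk4_step(f, x, dt):
    """One classical Runge-Kutta step for an autonomous system."""
    k1 = f(x)
    k2 = f(tuple(xi + 0.5 * dt * ki for xi, ki in zip(x, k1)))
    k3 = f(tuple(xi + 0.5 * dt * ki for xi, ki in zip(x, k2)))
    k4 = f(tuple(xi + dt * ki for xi, ki in zip(x, k3)))
    return tuple(xi + dt / 6.0 * (a + 2.0 * b + 2.0 * c + d)
                 for xi, a, b, c, d in zip(x, k1, k2, k3, k4))


def simulate(x0=(1.0, 1.0, 1.0), dt=0.005, steps=2000):
    """Integrate the second level; step size and start are arbitrary."""
    trajectory = [x0]
    for _ in range(steps):
        trajectory.append(rk4_step(f2, trajectory[-1], dt))
    return trajectory


if __name__ == "__main__":
    causes = [g2(x) for x in simulate()]
    print(causes[-1])
```

Feeding `g2` of the trajectory into the first level as $v_1^{(1)}$ and $v_2^{(1)}$ would modulate the faster attractor there, which is the coupling the slide describes; that second stage is omitted here to keep the sketch short.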
Model inversion
$$\dot{\tilde\mu}_v^{(i)} = D\tilde\mu_v^{(i)} - \partial_{\tilde v}\tilde\varepsilon^{(i)} \cdot \xi^{(i)} - \xi_v^{(i+1)}$$
$$\dot{\tilde\mu}_x^{(i)} = D\tilde\mu_x^{(i)} - \partial_{\tilde x}\tilde\varepsilon^{(i)} \cdot \xi^{(i)}$$
Figure: sonogram and percept (frequency against time in seconds), and prediction error (LFP in micro-volts against peristimulus time in ms)
Omission related responses, MMN and hallucinosis
Figure: three simulations, each showing the percept (frequency in Hz against time in seconds) and the simulated LFP (micro-volts against peristimulus time in ms):
response to violation
attenuated mismatch negativity (reduced precision at second level)
hallucination (compensatory reduction of sensory precision)
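The effect of the two precision changes named on this slide can be caricatured with a one-dimensional Gaussian example. This is not the birdsong simulation: it assumes a single static expectation with a Gaussian prior (mean `eta`, precision `pi_p`) and a Gaussian likelihood (precision `pi_s`), and all numbers are hypothetical. Lowering `pi_p` stands in for reduced precision at the higher level; lowering `pi_s` stands in for the compensatory reduction of sensory precision.

```python
def posterior_mean(s, eta, pi_s, pi_p):
    """Precision-weighted average of the sensation and the prior expectation."""
    return (pi_s * s + pi_p * eta) / (pi_s + pi_p)


def residual_sensory_error(s, eta, pi_s, pi_p):
    """Sensory prediction error left after the expectation has settled."""
    return s - posterior_mean(s, eta, pi_s, pi_p)


if __name__ == "__main__":
    # A surprising input (s = 1) against a prior expectation of 0.
    print(residual_sensory_error(1.0, 0.0, pi_s=1.0, pi_p=4.0))   # confident prior
    print(residual_sensory_error(1.0, 0.0, pi_s=1.0, pi_p=0.25))  # weak prior
    # Silence (s = 0) where a note is expected (eta = 1).
    print(posterior_mean(0.0, 1.0, pi_s=16.0, pi_p=1.0))    # precise senses
    print(posterior_mean(0.0, 1.0, pi_s=0.0625, pi_p=1.0))  # attenuated senses
```

In this caricature a weaker prior leaves a smaller residual error after a violation, loosely analogous to an attenuated mismatch response, while imprecise sensations let the expected note persist in the percept despite silence, loosely analogous to a hallucination.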
Action as inference – the “Bayesian thermostat”
Figure: prior distribution over temperature (axis from 20 to 120), with action $a(t)$
Perception:
$$\mu = \arg\min_{\mu} F(s, \mu, \pi) = \arg\min_{\mu} \left[ \pi_s \, (s(a) - g(\mu))^2 + \pi_\mu \, (\mu - \eta)^2 \right]$$
Action:
$$a = \arg\min_{a} F(s, \mu, \pi) = \arg\min_{a} \left[ \pi_s \, (s(a) - g(\mu))^2 + \pi_\mu \, (\mu - \eta)^2 \right]$$
$$s = g(\psi) + \omega$$
Heteroclinic cycle (central pattern generator)
Descending proprioceptive predictions
$$\dot a = -\partial_a \tilde s \cdot \xi_v^{(1)}$$
Figure: simulated handwriting trajectories under action and under observation (position (y) against position (x))
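Smooth pursuit eye movements
Figure labels: oculomotor signals, reflex arc, pons, proprioceptive input $s_o = x_o^{(1)}$, retinal input $s_t$ (visual channels against time)
Angular direction of target in extrinsic coordinates: $x_t^{(1)}$
Angular direction of gaze in extrinsic coordinates: $x_o^{(1)}$
Angular position of target in intrinsic coordinates: $x_t^{(1)} - x_o^{(1)}$
Generative process
$$s = \begin{bmatrix} s_o \\ s_t \end{bmatrix} = \begin{bmatrix} x_o \\ O(x_t) \cdot \exp\!\left(-([-8, \ldots, 8] + x_o - x_t)^2\right) \end{bmatrix} + \omega_v^{(1)}$$
$$\dot x = \begin{bmatrix} \dot x_o \\ \dot x'_o \\ \dot x_t \end{bmatrix} = \begin{bmatrix} x'_o \\ \tfrac{1}{4} a - \tfrac{1}{8} x'_o \\ v - x_t \end{bmatrix} + \omega_x^{(1)}$$
Generative model
$$s = \begin{bmatrix} s_o \\ s_t \end{bmatrix} = \begin{bmatrix} x_o^{(1)} \\ O(x_t) \cdot \exp\!\left(-([-8, \ldots, 8] + x_o^{(1)} - x_t^{(1)})^2\right) \end{bmatrix} + \omega_v^{(1)}$$
$$\dot x^{(1)} = \begin{bmatrix} \dot x_o^{(1)} \\ \dot x_o'^{(1)} \\ \dot x_t^{(1)} \end{bmatrix} = \begin{bmatrix} x_o'^{(1)} \\ \tfrac{1}{4} (v^{(1)} - x_o^{(1)}) - \tfrac{1}{2} x_o'^{(1)} \\ v^{(1)} - x_t^{(1)} \end{bmatrix} + \omega_x^{(1)}$$
$$v^{(1)} = x_1^{(2)} + \omega_v^{(2)}$$
$$\dot x^{(2)} = \begin{bmatrix} \dot x_1^{(2)} \\ \dot x_2^{(2)} \end{bmatrix} = \tfrac{1}{8} v^{(2)} \begin{bmatrix} x_2^{(2)} \\ -x_1^{(2)} \end{bmatrix} + \omega_x^{(2)}$$
$$v^{(2)} = \eta + \omega_v^{(3)}$$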
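The retinal part of the sensory mapping on this slide can be sketched in a few lines. The form of the channel response, $O(x_t) \cdot \exp(-([-8, \ldots, 8] + x_o - x_t)^2)$, is read from the slide; treating $O$ as a 0/1 visibility indicator over a hypothetical occluded interval is an assumption made here for illustration, as are the function and parameter names.

```python
import math

# Seventeen visual channels centred at -8, ..., 8 in intrinsic coordinates.
CHANNELS = list(range(-8, 9))


def visibility(x_t, occluded=None):
    """Assumed occluder function O(x_t): 0 behind the occluder, 1 otherwise.

    `occluded` is a hypothetical (low, high) interval in extrinsic coordinates.
    """
    if occluded is not None and occluded[0] <= x_t <= occluded[1]:
        return 0.0
    return 1.0


def retinal_input(x_o, x_t, occluded=None):
    """Channel responses s_t for gaze direction x_o and target direction x_t."""
    gain = visibility(x_t, occluded)
    return [gain * math.exp(-((c + x_o - x_t) ** 2)) for c in CHANNELS]


if __name__ == "__main__":
    s_t = retinal_input(x_o=1.0, x_t=3.0)
    print(CHANNELS[s_t.index(max(s_t))])  # channel nearest the target
```

The most active channel sits at the target's position relative to gaze, so accurate pursuit keeps the response centred on channel 0, and an occluded target silences all channels so that pursuit must rely on the model's predictions.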
Eye movements under occlusion – and reduced precision
Figure: angular position (displacement in degrees) and angular velocity (degrees per second) against time (ms), for the target, the eye, and the eye with reduced precision
Paradoxical responses to violations
Figure: target and oculomotor angles (displacement in degrees) and target and oculomotor velocities (degrees per second) against time (ms), for the target, the eye, and the eye with reduced precision
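Sensory attenuation, illusions and agency
Making your own sensations
Generative process
$$s = \begin{bmatrix} s_p \\ s_s \end{bmatrix} = \begin{bmatrix} x_i \\ x_i + v_e \end{bmatrix} + \omega_s$$
$$\dot x = \dot x_i = \sigma(a) - \tfrac{1}{4} x_i + \omega_x$$
$$\omega_s \sim N(0, e^{-8} I), \qquad \omega_x \sim N(0, e^{-8} I)$$
Generative model
$$s = \begin{bmatrix} s_p \\ s_s \end{bmatrix} = \begin{bmatrix} x_i \\ x_i + x_e \end{bmatrix} + \omega_s$$
$$\dot x = \begin{bmatrix} \dot x_i \\ \dot x_e \end{bmatrix} = \begin{bmatrix} v_i - \tfrac{1}{4} x_i \\ v_e - \tfrac{1}{4} x_e \end{bmatrix} + \omega_x$$
$$v = \begin{bmatrix} v_i \\ v_e \end{bmatrix}, \text{ with fluctuations } \omega_v$$
$$\omega_s \sim N(0, e^{-\pi} I), \qquad \omega_x \sim N(0, e^{-4} I), \qquad \omega_v \sim N(0, e^{-6} I)$$
$$\pi = 8 - \gamma \cdot \sigma(x_i + v_i)$$
Figure labels: prefrontal cortex, sensorimotor cortex, thalamus, motor reflex arc; descending predictions, descending sensory predictions, descending motor predictions, descending modulation, ascending prediction errors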
Self-made acts
High sensory attenuation
Figure: four panels against time (bins): prediction and error ($s_s$, $s_p$), hidden states ($x_i$, $x_e$), hidden causes ($v_i$, $v_e$), and perturbation and action ($a$)
Failure of sensory attenuation and psychomotor poverty
Figure: four panels against time: prediction and error, hidden states, hidden causes, and perturbation and action
Force matching illusion
Sensory attenuation
Figure: two sets of four panels against time (bins): prediction and error, hidden states, hidden causes, and perturbation and action
Failures of sensory attenuation, with compensatory increases in non-sensory precision
Figure: simulated and empirical (Shergill et al) results, each plotting self-generated (matched) force against external (target) force
Failure of sensory attenuation and delusions of control?
Figure: four panels against time (bins): prediction and error, hidden states, hidden causes, and perturbation and action
Neuromodulatory failure (of sensory attenuation)
• Schizophrenia: (dopamine) failure of proprioceptive attenuation
• Autism: (oxytocin) failure of interoceptive attenuation
• Depression: (serotonin) failure of exteroceptive attenuation
• …
Figure labels: attenuated violation responses, loss of perceptual Gestalt, SPEM abnormalities, psychomotor poverty, resistance to illusions; hallucinations, delusions
Thank you
And thanks to collaborators:
Rick Adams
Andre Bastos
Sven Bestmann
Harriet Brown
Jean Daunizeau
Mark Edwards
Xiaosi Gu
Lee Harrison
Stefan Kiebel
James Kilner
Jérémie Mattout
Rosalyn Moran
Will Penny
Lisa Quattrocki Knight
Klaas Stephan
And colleagues:
Andy Clark
Peter Dayan
Jörn Diedrichsen
Paul Fletcher
Pascal Fries
Geoffrey Hinton
James Hopkins
Jakob Hohwy
Henry Kennedy
Paul Verschure
Florentin Wörgötter
And many others
Searching to test hypotheses – life as an efficient experiment
$$H(S, \psi) = H(S \mid m) + H(\psi \mid S)$$
$$= E_t[-\ln p(s(t) \mid m)] + E_t[H(\psi \mid S = s(t))]$$
Free energy principle
minimise uncertainty
$$\tilde\eta(t) = \arg\min_{\tilde\eta} \{ H[q(\tilde\psi \mid \tilde\mu, \tilde\eta)] \}$$
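The entropy decomposition on this slide is the chain rule for joint entropy, and it can be checked numerically on a small discrete example. The joint distribution below is made up purely for illustration, and `decompose` is a hypothetical helper; the marginal entropy of sensations plays the role of $H(S \mid m)$ for a fixed model $m$.

```python
import math


def entropy(probs):
    """Shannon entropy in nats of a discrete distribution."""
    return -sum(p * math.log(p) for p in probs if p > 0.0)


def decompose(joint):
    """Return H(S, psi), H(S) and H(psi | S) for joint[(s, psi)] = probability."""
    h_joint = entropy(joint.values())
    # Marginal distribution over sensations.
    p_s = {}
    for (s, _), p in joint.items():
        p_s[s] = p_s.get(s, 0.0) + p
    h_s = entropy(p_s.values())
    # Expected entropy of the causes given each sensation.
    h_cond = 0.0
    for s, ps in p_s.items():
        conditional = [p / ps for (s2, _), p in joint.items() if s2 == s]
        h_cond += ps * entropy(conditional)
    return h_joint, h_s, h_cond


if __name__ == "__main__":
    toy_joint = {(0, 0): 0.4, (0, 1): 0.1, (1, 0): 0.2, (1, 1): 0.3}
    h_joint, h_s, h_cond = decompose(toy_joint)
    print(h_joint, h_s + h_cond)
```

The two printed numbers agree, which is the sense in which minimising surprise about sensations and minimising posterior uncertainty about their causes together bound the joint uncertainty.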
Time-scale (axis marked at $10^{-3}$ s, $10^{0}$ s, $10^{3}$ s, $10^{6}$ s and $10^{15}$ s)
Free-energy minimisation leading to…
Perception and Action: The optimisation of neuronal and neuromuscular activity to suppress prediction errors (or free-energy) based on generative models of sensory data.
Learning and attention: The optimisation of synaptic gain and efficacy over seconds to hours, to encode the precisions of prediction errors and causal structure in the sensorium. This entails suppression of free-energy over time.
Neurodevelopment: Model optimisation through activity-dependent pruning and maintenance of neuronal connections that are specified epigenetically.
Evolution: Optimisation of the average free-energy (free-fitness) over time and individuals of a given class (e.g., conspecifics) by selective pressure on the epigenetic specification of their generative models.