International Research Training Group
(IRTG) “The Brain in Action”
I am therefore I think
Karl Friston, University College London
Abstract: This overview of the free energy principle offers an account of embodied exchange with the world that
associates conscious operations with actively inferring the causes of our sensations. Its agenda is to link formal
(mathematical) descriptions of dynamical systems to a description of perception in terms of beliefs and goals. The
argument has two parts: the first calls on the lawful dynamics of any (weakly mixing) ergodic system, from a single
cell organism to a human brain. These lawful dynamics suggest that (internal) states can be interpreted as modelling
or predicting the (external) causes of sensory fluctuations. In other words, if a system exists, its internal states must
encode probabilistic beliefs about external states. Heuristically, this means that if I exist (am) then I must have beliefs
(think). The second part of the argument is that the only tenable beliefs I can entertain about myself are that I exist.
This may seem rather obvious; however, if we associate existing with ergodicity, then (ergodic) systems that exist by
predicting external states can only possess prior beliefs that their environment is predictable. It transpires that this is
equivalent to believing that the world – and the way it is sampled – will resolve uncertainty about the causes of
sensations. We will conclude by looking at the epistemic behavior that emerges under these beliefs, using simulations
of active inference.
Key words: active inference ∙ autopoiesis ∙ cognitive ∙ dynamics ∙ free energy ∙ epistemic value ∙ self-organization
What does it mean to be embodied?
The statistics of life
Markov blankets and ergodic systems
simulations of a primordial soup
The anatomy of inference graphical models and predictive coding
canonical microcircuits
Action and perception
action and its observation
simulations of saccadic searches
“How can the events in space and time which take place within the spatial
boundary of a living organism be accounted for by physics and chemistry?”
(Erwin Schrödinger 1943)
The Markov blanket as a statistical boundary
(parents, children and parents of children)
Internal states
External states
Sensory states
Active states
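The Markov blanket in biotic systems
External states: $\dot{\psi} = f_{\psi}(\psi, s, a) + \omega_{\psi}$
Sensory states: $\dot{s} = f_{s}(\psi, s, a) + \omega_{s}$
Active states: $\dot{a} = f_{a}(s, a, \mu) + \omega_{a}$
Internal states: $\dot{\mu} = f_{\mu}(s, a, \mu) + \omega_{\mu}$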
lemma: any (ergodic random) dynamical system (m) that possesses a Markov
blanket will appear to engage in active inference
$\dot{x} = f(x) + \omega$
$p(x \mid m)$
The Fokker-Planck equation
$\dot{p}(x \mid m) = \nabla \cdot (\Gamma \nabla - f)\, p$
And its solution in terms of curl-free and divergence-free components
$\dot{p}(x \mid m) = 0 \Rightarrow f(x) = (\Gamma - Q) \nabla \ln p(x \mid m)$
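A short check of the step from the Fokker-Planck equation to the flow decomposition may help. It is a sketch under assumptions that the slide does not state: the diffusion matrix $\Gamma$ and the matrix $Q$ are taken to be constant, and $Q$ is taken to be antisymmetric ($Q_{ij} = -Q_{ji}$). Substituting the proposed flow into the density dynamics gives:

```latex
\begin{aligned}
f\,p &= (\Gamma - Q)\,(\nabla \ln p)\,p = (\Gamma - Q)\,\nabla p \\
\dot{p} &= \nabla \cdot (\Gamma \nabla p) - \nabla \cdot (f\,p)
         = \nabla \cdot (\Gamma \nabla p) - \nabla \cdot (\Gamma \nabla p) + \nabla \cdot (Q \nabla p) \\
        &= \sum_{i,j} Q_{ij}\, \partial_i \partial_j\, p = 0
\end{aligned}
```

The last sum vanishes because $\partial_i \partial_j p$ is symmetric in $i, j$ while $Q_{ij}$ is antisymmetric. The $\Gamma$ term is the curl-free (gradient) part of the flow and the $Q$ term is the divergence-free part, which circulates on the level sets of $p$ without changing the density.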
But what about the Markov blanket?
$\tilde{s} = (s, a, \mu)$
Perception: $f_{\mu}(\tilde{s}) = (\Gamma - Q) \nabla_{\mu} \ln p(\tilde{s} \mid m)$
Action: $f_{a}(\tilde{s}) = (\Gamma - Q) \nabla_{a} \ln p(\tilde{s} \mid m)$
Value, $\ln p(\tilde{s} \mid m)$: Reinforcement learning, optimal control and expected utility theory (Pavlov)
Surprise, $F \geq -\ln p(\tilde{s} \mid m)$: Infomax, minimum redundancy and the free-energy principle (Barlow)
Entropy, $E_t[-\ln p(\tilde{s} \mid m)]$: Self-organisation, synergetics and homoeostasis (Haken)
Model evidence, $p(\tilde{s} \mid m)$: Bayesian brain, evidence accumulation and predictive coding (Helmholtz)
Overview
The statistics of life
Markov blankets and ergodic systems
simulations of a primordial soup
The anatomy of inference graphical models and predictive coding
canonical microcircuits
Action and perception
action and its observation
simulations of saccadic searches
Simulations of a (prebiotic)
primordial soup
Short-range forces
Strong repulsion
Weak electrochemical attraction
[Figure: simulated primordial soup; axis: Position]
Finding the (principal) Markov blanket
Markov blanket matrix encodes the children, parents and parents of children
$B = A + A^{T} + A^{T} A$
Markov Blanket = $[B \cdot [\mathrm{eig}(B) > \tau]]$
[Figure: two panels, "Adjacency matrix" and "Markov Blanket", with elements partitioned into hidden, sensory, active and internal states; both axes: Element]
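The Markov blanket matrix on this slide can be sketched in a few lines of NumPy. This is an illustration under assumptions, not the code behind the simulations: the convention `A[i, j] = 1` when state `j` influences state `i` is assumed (it is the one under which $A^{T} A$ links parents that share a child), the five-node graph is made up, and the internal state is picked by hand rather than by thresholding the principal eigenvector of $B$ as the slide does.

```python
import numpy as np


def blanket_matrix(A):
    """B = A + A^T + A^T A: children, parents and parents of children."""
    A = np.asarray(A, dtype=float)
    return A + A.T + A.T @ A


def markov_blanket(A, internal):
    """Indices of states coupled to the internal set through B, excluding the set itself."""
    B = blanket_matrix(A)
    internal = np.asarray(internal, dtype=bool)
    touched = B @ internal.astype(float) > 0
    return np.flatnonzero(touched & ~internal)


# Hypothetical toy graph (assumed convention: A[i, j] = 1 if j influences i).
# 0 and 4 are external, 1 is sensory, 2 is internal, 3 is active.
A = np.zeros((5, 5))
A[1, 0] = 1  # external -> sensory
A[2, 1] = 1  # sensory  -> internal
A[3, 2] = 1  # internal -> active
A[3, 1] = 1  # sensory  -> active
A[0, 3] = 1  # active   -> external
A[0, 4] = 1  # external -> external

internal = np.array([False, False, True, False, False])
blanket = markov_blanket(A, internal)
print(blanket)
```

In this toy graph the blanket of the internal state consists of its parent (the sensory state) and its child (the active state); the external states are excluded because they only reach the internal state through the blanket.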
Does action maintain the structural and functional integrity of the Markov blanket (autopoiesis)?
Do internal states appear to infer the hidden causes of sensory states (active inference)?
Autopoiesis, oscillator death and simulated brain lesions
[Figure: four panels, "Markov blanket", "Active lesion", "Sensory lesion" and "Internal lesion"; both axes: Position, from -8 to 8]
Decoding through the Markov blanket and simulated brain activation
[Figure: panels "True and predicted motion", "Motion of external state", "Predictability", "Internal states" and "Modes"; axes: Time and Position]
Christiaan Huygens
The existence of a Markov blanket necessarily implies a partition of states into
internal states, their Markov blanket (sensory and active states) and external or
hidden states.
Because active states change – but are not changed by – external states, they
minimize the entropy of internal states and their Markov blanket. This means action
will appear to maintain the structural and functional integrity of the Markov blanket
(autopoiesis).
Internal states appear to infer the hidden causes of sensory states (by maximizing
Bayesian evidence) and influence those causes through action (active inference).
Overview
The statistics of life
Markov blankets and ergodic systems
simulations of a primordial soup
The anatomy of inference graphical models and predictive coding
canonical microcircuits
Action and perception
action and its observation
simulations of saccadic searches
“Objects are always imagined as being present in the field of
vision as would have to be there in order to produce the same
impression on the nervous mechanism” - von Helmholtz
Hermann von Helmholtz
Richard Gregory
Geoffrey Hinton
The Helmholtz machine and the
Bayesian brain
Thomas Bayes
Richard Feynman
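Impressions on the Markov blanket…
$s \in S$
Bayesian filtering and predictive coding
$f_{\mu}(\tilde{s}) \approx (Q - \Gamma) \nabla_{\mu} F(\tilde{s}, \tilde{\mu})$
$\dot{\tilde{\mu}} = D\tilde{\mu} - \partial_{\tilde{\mu}} \varepsilon \cdot \xi$
prediction: $D\tilde{\mu}$
update: $-\partial_{\tilde{\mu}} \varepsilon \cdot \xi$
prediction error: $\varepsilon = \tilde{s} - g(\tilde{\mu})$, with $\xi = \Pi \varepsilon$ its precision-weighted form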
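The update scheme on this slide can be illustrated with a minimal static example. It is a sketch under simplifying assumptions rather than the scheme used in the talk: there is a single scalar expectation `mu`, the prediction term $D\tilde{\mu}$ is dropped because nothing moves, and the generative mapping `g`, its derivative `dg`, the step size and the precision are all made-up illustrative choices.

```python
def predictive_coding(s, g, dg, precision=1.0, mu=0.0, rate=0.1, steps=200):
    """Descend an expectation mu on precision-weighted prediction error."""
    for _ in range(steps):
        eps = s - g(mu)           # prediction error: sensation minus prediction
        xi = precision * eps      # precision-weighted prediction error
        # Update term -d(eps)/d(mu) * xi; here d(eps)/d(mu) = -dg(mu).
        mu += rate * dg(mu) * xi
    return mu


# Hypothetical generative mapping g(mu) = 2 * mu and a sensation s = 4.
estimate = predictive_coding(s=4.0, g=lambda m: 2.0 * m, dg=lambda m: 2.0)
print(estimate)
```

With these settings the expectation settles where the prediction matches the sensation, that is, where the prediction error is zero.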
Making our own sensations
Prediction error = sensations – predictions
Action: changing sensations
Perception: changing predictions
Hierarchical generative models
A simple hierarchy
[Figure: hierarchy of hidden causes $v^{(i)}$ and hidden states $x^{(i)}$ over levels 0 to 3, labelled "what" and "where", with ascending prediction errors, descending predictions and sensory fluctuations entering at $v^{(0)}$]
$\dot{\tilde{\mu}} = D\tilde{\mu} - \partial_{\tilde{\mu}} \varepsilon \cdot \xi$
David Mumford
Predictive coding with reflexes
Action: $\dot{a} = -\partial_{a} \tilde{s} \cdot \xi^{(1)}$
(oculomotor signals, reflex arc, proprioceptive input, pons)
Perception
(retinal input, geniculate, visual cortex, frontal eye fields)
Top-down or backward predictions
Bottom-up or forward prediction error
Prediction error (superficial pyramidal cells): $\varepsilon^{(i)} = \tilde{\mu}^{(i-1)} - g^{(i)}(\tilde{\mu}^{(i)})$
Expectations (deep pyramidal cells): $\dot{\tilde{\mu}}^{(i)} = D\tilde{\mu}^{(i)} - \partial_{\tilde{\mu}} \varepsilon^{(i)} \cdot \xi^{(i)} - \xi^{(i+1)}$
Biological agents minimize their average surprise (entropy)
They minimize surprise by suppressing prediction error
Prediction error can be reduced by changing predictions (perception)
Prediction error can be reduced by changing sensations (action)
Perception entails recurrent message passing to optimize predictions
Action makes predictions come true (and minimizes surprise)
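The two routes to reducing prediction error listed above can be put side by side in a toy loop. This is a deliberately small sketch under stated assumptions, not one of the simulations from the talk: the world is a single scalar `x` that the agent senses directly, `eta` is a hypothetical prior expectation, action moves `x` directly, and the precisions, step size and number of steps are arbitrary.

```python
def active_inference(x=0.0, mu=0.0, eta=1.0, pi_s=1.0, pi_mu=1.0,
                     act=True, dt=0.1, steps=2000):
    """Euler integration of perception (mu) and, optionally, action on the world (x)."""
    for _ in range(steps):
        s = x                       # sensation: the world state, sensed without noise
        xi_s = pi_s * (s - mu)      # sensory prediction error (precision-weighted)
        xi_mu = pi_mu * (mu - eta)  # prior prediction error (precision-weighted)
        mu += dt * (xi_s - xi_mu)   # perception: change predictions
        if act:
            x += dt * (-xi_s)       # action: change sensations (ds/da assumed to be 1)
    return x, mu


x_act, mu_act = active_inference(act=True)
x_perc, mu_perc = active_inference(act=False)
print(x_act, mu_act, x_perc, mu_perc)
```

With action enabled, both the world and the expectation converge on the prior expectation, so the prediction comes true. With action disabled, the world stays where it was and the expectation settles on a precision-weighted compromise between the sensation and the prior.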
Overview
The statistics of life
Markov blankets and ergodic systems
simulations of a primordial soup
The anatomy of inference graphical models and predictive coding
canonical microcircuits
Action and perception
action and its observation
simulations of saccadic searches
Action with point attractors
[Figure: schematic with origin $(0, 0)$, labels $x_1$, $x_2$, $J_1$, $J_2$ and $V = (v_1, v_2, v_3)$; visual input $(V, J)$ and proprioceptive input $(x_1, x_2)$]
Descending proprioceptive predictions
$\xi_{v}^{(1)} = \Pi_{v}^{(1)} (\tilde{s}(a) - g(\tilde{\mu}))$
$\dot{a} = -\partial_{a} \tilde{s} \cdot \xi_{v}^{(1)}$
Action with itinerant attractors
Heteroclinic cycle (central pattern generator)
Descending proprioceptive predictions
$\dot{a} = -\partial_{a} \tilde{s} \cdot \xi_{v}^{(1)}$
[Figure: two panels, "action" and "observation"; axes: position (x) against position (y)]
Overview
The statistics of life
Markov blankets and ergodic systems
simulations of a primordial soup
The anatomy of inference graphical models and predictive coding
canonical microcircuits
Action and perception
action and its observation
simulations of saccadic searches
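Free energy: $F(\tilde{s}, \mu) = E_q[-\ln p(\tilde{s} \mid \psi, m) - \ln p(\psi \mid u) - \ln p(u \mid m) + \ln q(\psi, u)]$
Energy: the likelihood term $\ln p(\tilde{s} \mid \psi, m)$, the empirical priors $\ln p(\psi \mid u)$ and the prior beliefs $\ln p(u \mid m)$
Entropy: the term in $\ln q(\psi, u)$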
“I am [ergodic] therefore I think [I will minimise free energy]”
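A small discrete example may make the energy and entropy terms concrete. It is a sketch under assumptions: a single hidden variable with two values stands in for $(\psi, u)$, one outcome has already been observed, and the prior and likelihood numbers are invented. It shows the two standard properties of variational free energy: it equals surprise when $q$ is the exact posterior and exceeds it otherwise.

```python
import numpy as np


def free_energy(q, joint):
    """F = E_q[-ln p(s, psi)] + E_q[ln q(psi)] for one observed s."""
    q = np.asarray(q, dtype=float)
    joint = np.asarray(joint, dtype=float)
    nz = q > 0  # terms with q = 0 contribute nothing
    return float(np.sum(q[nz] * (np.log(q[nz]) - np.log(joint[nz]))))


# Hypothetical numbers: prior over two hidden states and the likelihood
# of the observed outcome under each of them.
prior = np.array([0.5, 0.5])
likelihood_of_s = np.array([0.9, 0.2])

joint = likelihood_of_s * prior          # p(s, psi)
evidence = joint.sum()                   # p(s)
surprise = -np.log(evidence)             # -ln p(s)
posterior = joint / evidence             # p(psi | s)

F_posterior = free_energy(posterior, joint)
F_uniform = free_energy([0.5, 0.5], joint)
print(surprise, F_posterior, F_uniform)
```

Minimising free energy with respect to $q$ therefore tightens the bound on surprise, which is the sense in which free energy can stand in for a quantity that cannot be evaluated directly.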
$\ln p(u \mid m) = E_{q(\tilde{s}, \psi \mid u)}[\ln p(\tilde{s} \mid \psi) + \ln p(\psi \mid u) - \ln q(\psi \mid u)]$
(expected energy and expected entropy)
$= E_{q(\tilde{s} \mid u)}[\ln p(\tilde{s} \mid m)] + E_{q(\tilde{s} \mid u)}[D[q(\psi \mid \tilde{s}, u) \,\|\, q(\psi \mid u)]]$
(extrinsic value, plus epistemic value or information gain)
In the absence of prior beliefs about outcomes (Bayesian surprise and Infomax):
$E_q[D[q(\psi \mid \tilde{s}, u) \,\|\, q(\psi \mid u)]]$ (Bayesian surprise)
$= D[q(\psi, \tilde{s} \mid u) \,\|\, q(\psi \mid u)\, q(\tilde{s} \mid u)]$ (predicted mutual information)
In the absence of ambiguity (KL or risk-sensitive control):
$E_q[\ln p(\psi \mid u) - \ln q(\psi \mid u)] = -D[q(\psi \mid u) \,\|\, p(\psi \mid u)]$ (predicted divergence)
In the absence of uncertainty or risk (Expected utility theory):
$E_q[\ln p(\tilde{s} \mid m)]$ (extrinsic value)
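The epistemic term can be computed directly for a discrete toy problem. The sketch below assumes a finite set of hidden states and outcomes, with `likelihood[s, psi]` standing for the predicted probability of outcome `s` given hidden state `psi` under some action; the function names and numbers are illustrative and not from the talk. It evaluates the expected divergence between posterior and prior beliefs and checks that it coincides with the mutual information between hidden states and outcomes.

```python
import numpy as np


def epistemic_value(prior, likelihood):
    """Expected KL divergence from prior to posterior beliefs over hidden states."""
    prior = np.asarray(prior, dtype=float)
    likelihood = np.asarray(likelihood, dtype=float)
    joint = likelihood * prior            # joint[s, psi]
    evidence = joint.sum(axis=1)          # predicted outcome probabilities
    value = 0.0
    for s, q_s in enumerate(evidence):
        if q_s == 0:
            continue
        posterior = joint[s] / q_s
        nz = posterior > 0
        value += q_s * np.sum(posterior[nz] * np.log(posterior[nz] / prior[nz]))
    return float(value)


def mutual_information(prior, likelihood):
    """KL divergence between the joint and the product of its marginals."""
    prior = np.asarray(prior, dtype=float)
    joint = np.asarray(likelihood, dtype=float) * prior
    independent = joint.sum(axis=1, keepdims=True) * prior
    nz = joint > 0
    return float(np.sum(joint[nz] * np.log(joint[nz] / independent[nz])))


uniform = [0.5, 0.5]
informative = [[1.0, 0.0], [0.0, 1.0]]      # outcome identifies the hidden state
uninformative = [[0.5, 0.5], [0.5, 0.5]]    # outcome says nothing about it
noisy = [[0.8, 0.3], [0.2, 0.7]]

print(epistemic_value(uniform, informative))
print(epistemic_value(uniform, uninformative))
print(epistemic_value([0.6, 0.4], noisy), mutual_information([0.6, 0.4], noisy))
```

An action whose outcomes fully disambiguate two equally likely hidden states carries an epistemic value of $\ln 2$ nats, and an action with uninformative outcomes carries none.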
$\ln p(u \mid m) = E_{q(\tilde{s} \mid u)}[D[q(\psi \mid \tilde{s}, u) \,\|\, q(\psi \mid u)]] = \sigma(u)$
Epistemic value or information gain
$\psi(t) \in \Psi$, $s(t) \in S$
Extrinsic value
Sampling the world to resolve uncertainty
[Figure: panels "stimulus", "visual input", "salience" and "sampling", with a schematic linking Visual cortex, Fusiform (what), Parietal (where), Pulvinar salience map $\sigma(u)$, Frontal eye fields, Superior colliculus and the oculomotor reflex arc (action $a$)]
Saccadic eye movements
Saccadic fixation and salience maps
[Figure: panels "Action (EOG)", "Hidden (oculomotor) states", "Visual samples", "Posterior belief", "Conditional expectations about hidden (visual) states" and "And corresponding percept"; horizontal axis: time (ms), 200 to 1400]
Hermann von Helmholtz
“Each movement we make by which we alter the appearance of
objects should be thought of as an experiment designed to test
whether we have understood correctly the invariant relations of
the phenomena before us, that is, their existence in definite
spatial relations.”
‘The Facts of Perception’ (1878) in The Selected Writings of Hermann von
Helmholtz, Ed. R. Kahl, Middletown: Wesleyan University Press, 1971, p. 384
Thank you
And thanks to collaborators:
Rick Adams
Ryszard Auksztulewicz
Andre Bastos
Sven Bestmann
Harriet Brown
Jean Daunizeau
Mark Edwards
Chris Frith
Thomas FitzGerald
Xiaosi Gu
Stefan Kiebel
James Kilner
Christoph Mathys
Jérémie Mattout
Rosalyn Moran
Dimitri Ognibene
Sasha Ondobaka
Will Penny
Giovanni Pezzulo
Lisa Quattrocki Knight
Francesco Rigoli
Klaas Stephan
Philipp Schwartenbeck
And colleagues:
Micah Allen
Felix Blankenburg
Andy Clark
Peter Dayan
Ray Dolan
Allan Hobson
Paul Fletcher
Pascal Fries
Geoffrey Hinton
James Hopkins
Jakob Hohwy
Mateus Joffily
Henry Kennedy
Simon McGregor
Read Montague
Tobias Nolte
Anil Seth
Mark Solms
Paul Verschure
And many others
Time-scale
Free-energy minimisation leading to…
Time axis: $10^{-3}$ s, $10^{0}$ s, $10^{3}$ s, $10^{6}$ s, $10^{15}$ s
Perception and Action: The optimisation of neuronal and
neuromuscular activity to suppress prediction errors (or free-energy)
based on generative models of sensory data.
Learning and attention: The optimisation of synaptic gain and
efficacy over seconds to hours, to encode the precisions of
prediction errors and causal structure in the sensorium. This
entails suppression of free-energy over time.
Neurodevelopment: Model optimisation through activity-dependent
pruning and maintenance of neuronal connections that
are specified epigenetically.
Evolution: Optimisation of the average free-energy (free-fitness)
over time and individuals of a given class (e.g., conspecifics) by
selective pressure on the epigenetic specification of their
generative models.
Searching to test hypotheses – life as an efficient experiment
$H(S, \Psi) = H(S \mid m) + H(\Psi \mid S)$
$= E_t[-\ln p(\tilde{s}(t) \mid m)] + E_t[H(\Psi \mid S = \tilde{s}(t))]$
Free energy principle
minimise uncertainty
$\eta(t) = \arg\min_{\eta} \{ H[q(\psi \mid \mu, \eta)] \}$
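The idea of sampling so as to minimise uncertainty can be sketched for a discrete case. This is an illustration under assumptions, not the saccadic search simulation itself: candidate sampling locations are represented by made-up likelihood tables `likelihood[s, psi]`, beliefs are a simple categorical distribution, and the location names are hypothetical.

```python
import numpy as np


def entropy(p):
    """Shannon entropy in nats, ignoring zero-probability entries."""
    p = np.asarray(p, dtype=float)
    nz = p > 0
    return float(-np.sum(p[nz] * np.log(p[nz])))


def expected_posterior_entropy(prior, likelihood):
    """Average entropy of beliefs about hidden states after seeing an outcome."""
    joint = np.asarray(likelihood, dtype=float) * np.asarray(prior, dtype=float)
    evidence = joint.sum(axis=1)
    return float(sum(q_s * entropy(joint[s] / q_s)
                     for s, q_s in enumerate(evidence) if q_s > 0))


def choose_sample(prior, likelihoods):
    """Pick the candidate location whose outcomes leave the least expected uncertainty."""
    scores = {name: expected_posterior_entropy(prior, lik)
              for name, lik in likelihoods.items()}
    return min(scores, key=scores.get), scores


prior = [0.5, 0.5]
candidates = {
    "informative": [[0.9, 0.1], [0.1, 0.9]],    # outcomes depend on the hidden state
    "uninformative": [[0.5, 0.5], [0.5, 0.5]],  # outcomes are the same either way
}
best, scores = choose_sample(prior, candidates)
print(best, scores)
```

Sampling the uninformative location leaves the prior entropy of $\ln 2$ nats untouched, so the informative location is selected.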

