MVPD – Multivariate pattern decoding
23.4.2009
Christian Kaul
MATLAB for Cognitive Neuroscience
Outline
• What is MVPD
– What types of classifiers are there?
• MVPD in fMRI
• How to design an experiment – a few examples
• The MVPD MatLab toolbox
• Common problems when thinking about MVPD of fMRI data
• Relevant introduction papers
MVPD – Multivariate pattern decoding
• What is MVPD?
– Methodology in which an algorithm is trained to tell two
or more conditions from each other.
– The algorithm is then presented with a new set of data
and categorises/classifies it into the conditions
previously learned.
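The train-then-classify idea can be sketched in a few lines. This is a minimal illustration only, assuming simulated four-voxel patterns and a nearest-centroid rule; the function names, the condition labels and the noise level are hypothetical and not part of the toolbox presented later.

```python
import random

def train_centroids(patterns, labels):
    """Training: average the patterns of each condition into one centroid."""
    grouped = {}
    for pattern, label in zip(patterns, labels):
        grouped.setdefault(label, []).append(pattern)
    return {
        label: [sum(column) / len(group) for column in zip(*group)]
        for label, group in grouped.items()
    }

def classify(centroids, pattern):
    """Testing: assign a new pattern to the condition with the nearest centroid."""
    def squared_distance(a, b):
        return sum((x - y) ** 2 for x, y in zip(a, b))
    return min(centroids, key=lambda label: squared_distance(centroids[label], pattern))

def simulate(template, n_samples, noise_sd, rng):
    """Hypothetical data: a fixed voxel pattern plus Gaussian noise."""
    return [[value + rng.gauss(0.0, noise_sd) for value in template]
            for _ in range(n_samples)]

rng = random.Random(0)
templates = {"condition_A": [1.0, 0.0, 1.0, 0.0],
             "condition_B": [0.0, 1.0, 0.0, 1.0]}

train_x, train_y, test_x, test_y = [], [], [], []
for label, template in templates.items():
    train_x += simulate(template, 20, 0.3, rng)
    train_y += [label] * 20
    test_x += simulate(template, 20, 0.3, rng)   # new data, never used for training
    test_y += [label] * 20

centroids = train_centroids(train_x, train_y)
predictions = [classify(centroids, pattern) for pattern in test_x]
accuracy = sum(p == t for p, t in zip(predictions, test_y)) / len(test_y)
print(f"decoding accuracy on new data: {accuracy:.2f}")
```

The important design point is that the test patterns never enter `train_centroids`; accuracy is always measured on data the algorithm has not seen.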
– MVPD is a relatively new tool in fMRI; note, however, that pattern classification as such has long been developed and used in artificial intelligence and neural networks.
What types of classifiers are there?
• The most common classifiers used for fMRI
data are
– LDA (Linear Discriminant Analysis)
– SVM (Support Vector Machines)
– SLR (Sparse Logistic regression)
[Figure: SVM algorithm: maximize the margin!]
• All generally perform well.
• SLR and LDA find solutions based on linear combinations of features only.
• SVMs, however, can also take non-linear effects into account. This is largely done by mapping the data into a higher-dimensional space (feature space).
Non-linear SVMs - Feature space
[Figure: 2 examples of a mapping into feature space]
Downside of non-linear SVMs:
More parameters have to be optimized during learning.
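The feature-space idea can be shown with a toy example. The sketch below is not an SVM; it only assumes four hypothetical XOR-like points and one hand-picked extra feature (the product of the two inputs) to show why a mapping into a higher-dimensional space can make a problem linearly solvable.

```python
import itertools

# Hypothetical XOR-like data: no straight line separates the two classes.
points = [(-1.0, -1.0), (1.0, 1.0), (-1.0, 1.0), (1.0, -1.0)]
labels = [1, 1, -1, -1]

def feature_map(x):
    """Map a 2-D input into a 3-D feature space by adding the product term."""
    return (x[0], x[1], x[0] * x[1])

def linear_predict(weights, bias, x):
    score = sum(w * value for w, value in zip(weights, x)) + bias
    return 1 if score > 0 else -1

def accuracy(weights, bias, data):
    hits = sum(linear_predict(weights, bias, x) == y for x, y in zip(data, labels))
    return hits / len(labels)

# Try a coarse grid of linear rules in the original 2-D space.
grid = [i / 2 for i in range(-4, 5)]
best_input_space = max(
    accuracy((w1, w2), b, points)
    for w1, w2, b in itertools.product(grid, repeat=3)
)

# In feature space, a single weight on the product term is enough.
mapped = [feature_map(x) for x in points]
feature_space_accuracy = accuracy((0.0, 0.0, 1.0), 0.0, mapped)

print(best_input_space, feature_space_accuracy)
```

A kernel SVM performs such a mapping implicitly, and the kernel's own parameters are part of what has to be tuned.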
MVPD in fMRI
• In situations where we do find a univariate effect, a multivariate analysis is unlikely to reveal anything new!
• But when conventional analysis is not feasible, multivariate analysis might be an option.
• What are we actually measuring?
• What does a “pattern of brain activity” mean?
• Example:
• Information sensitive to visual features is present in the BOLD signal
fMRI of basic visual features
- Conventional analysis was thought not to be feasible because fMRI lacks spatial resolution compared to invasive single-cell recordings.
[Figure] Haynes & Rees (2006)
fMRI of basic visual features
Multivariate Pattern Decoding!
[Figure panels: mean signal, LDA] Haynes & Rees (2005)
Multivariate results are often presented per ROI...
Kamitani & Tong (2006)
Multivariate pattern analysis – how to design
an experiment
Does the pattern of activity contain meaningful information we
can extract?
• It is not the level of brain activity that is addressed, but the pattern of information within the activity.
• Questions that can be answered with multivariate pattern analysis:
– “What have I seen?” Decoding of visual input; the majority of publications.
– “What have I heard/felt/…?” Decoding of other sensory input should be possible.
– “What am I going to do next?” Decisions seem to be coded in distinctive patterns of brain activity.
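The difference between the level of activity and the pattern within it can be made concrete with simulated data. The sketch assumes two hypothetical conditions whose eight-voxel patterns are mirror images of each other, so their mean signal is identical, and uses a simple correlation-based decoder chosen only for illustration.

```python
import random

def pearson(a, b):
    """Pearson correlation between two equally long voxel patterns."""
    mean_a, mean_b = sum(a) / len(a), sum(b) / len(b)
    cov = sum((x - mean_a) * (y - mean_b) for x, y in zip(a, b))
    var_a = sum((x - mean_a) ** 2 for x in a)
    var_b = sum((y - mean_b) ** 2 for y in b)
    return cov / (var_a * var_b) ** 0.5

def average_pattern(patterns):
    return [sum(column) / len(patterns) for column in zip(*patterns)]

def simulate(template, n_trials, noise_sd, rng):
    """Hypothetical trials: a fixed voxel pattern plus Gaussian noise."""
    return [[value + rng.gauss(0.0, noise_sd) for value in template]
            for _ in range(n_trials)]

rng = random.Random(1)
template_a = [1.0, -1.0] * 4      # same mean level (zero) ...
template_b = [-1.0, 1.0] * 4      # ... but the opposite spatial pattern

train_a, train_b = simulate(template_a, 20, 0.5, rng), simulate(template_b, 20, 0.5, rng)
test_a, test_b = simulate(template_a, 20, 0.5, rng), simulate(template_b, 20, 0.5, rng)

# Univariate view: mean signal over all voxels and trials per condition.
level_a = sum(map(sum, train_a)) / (len(train_a) * len(template_a))
level_b = sum(map(sum, train_b)) / (len(train_b) * len(template_b))
level_difference = abs(level_a - level_b)

# Multivariate view: compare each test pattern with the training averages.
mean_a, mean_b = average_pattern(train_a), average_pattern(train_b)

def decode(pattern):
    return "A" if pearson(pattern, mean_a) > pearson(pattern, mean_b) else "B"

hits = sum(decode(p) == "A" for p in test_a) + sum(decode(p) == "B" for p in test_b)
pattern_accuracy = hits / (len(test_a) + len(test_b))
print(f"level difference: {level_difference:.3f}, pattern accuracy: {pattern_accuracy:.2f}")
```

In this simulation the mean signal barely differs between conditions, while the pattern-based decoder separates them well.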
More interesting Questions?
Does feature-selective information contained in the BOLD signal for an irrelevant stimulus change under different levels of attentional load in a central task?
Experiment 1
• Prediction (from load theory):
– Feature-selective information should be reduced in the high load condition
[Figure: predicted decoding accuracy (% correct decoded) for low and high load relative to chance, per ROI (V1, V2, V3)]
Decoding Result
[Figure: actual and expected decoding accuracy for low and high load relative to chance, per ROI (V1, V2, V3)]
Result:
Feature selective information NOT reduced
Example 2 – Intentions
Question:
• At the beginning of each trial, the word “select” was presented, instructing the subjects to freely and covertly choose one of two possible tasks, addition or subtraction. From the button press, it was possible to determine the covert intention of the subject during the preceding delay period.
• Decoding objective:
– Can the subject’s decision be decoded?
Haynes et al, 2007c
Example 2, Result
• In the anterior medial prefrontal cortex, decoding was highest during the delay (green bars) but at chance level during task execution (red bars), after onset of the task-relevant stimuli. In contrast, the posterior and superior medial prefrontal cortex (MPFCp) encoded the chosen task only once it had entered the stage of execution, not during the delay period.
• Results are presented using the “searchlight” approach: a spherical searchlight centered on one voxel is used to define a local neighborhood.
Haynes et al, 2007c
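How a spherical searchlight defines a local neighborhood can be sketched as follows. The function name, the voxel-unit radius and the 10 x 10 x 10 volume are assumptions made for illustration; this is not the code used in the cited study.

```python
import math

def searchlight_neighbors(center, radius, volume_shape):
    """Return the voxel coordinates inside a sphere around `center`.

    `radius` is given in voxel units; the sphere is clipped at the
    borders of the volume.
    """
    reach = math.floor(radius)
    ranges = [range(max(0, c - reach), min(size, c + reach + 1))
              for c, size in zip(center, volume_shape)]
    return [
        (x, y, z)
        for x in ranges[0] for y in ranges[1] for z in ranges[2]
        if (x - center[0]) ** 2 + (y - center[1]) ** 2 + (z - center[2]) ** 2
        <= radius ** 2
    ]

volume_shape = (10, 10, 10)
inner = searchlight_neighbors((5, 5, 5), 2, volume_shape)
corner = searchlight_neighbors((0, 0, 0), 1, volume_shape)
print(len(inner), len(corner))
```

In a full searchlight analysis the classifier would be run once per center voxel on the voxels returned here, and the resulting accuracy would be written back to the center voxel to build an accuracy map.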
Example 3 – Voxel based tuning functions
Tuning functions like those known from monkey data, with fMRI!
Serences et al, 2008
Example 4 – Real time reconstruction of seen
images
Miyawaki et al, 2008
The MVPD MatLab “toolbox”
• MatLab functions to perform MVPD with “any” suitable data.
• Presented is the basic control script.
– It is quite easy to follow the workflow in this control script as a demonstration of what MVPD using SLR can look like.
• If anyone is interested in working with the code, please contact me directly: [email protected]
The common problems when thinking about
MVPD of fMRI data
• Decoding of what? TRs, block averages, betas.
• Overfitting: too many features for too few data samples.
• Voxel selection.
The common problems when thinking about
MVPD of fMRI data: TR, BLOCK or BETA?
• In principle there are 3 different strategies for obtaining your brain pattern: single TRs (raw data), averaged blocks of TRs, or betas (SPM estimates).
[Figure: single TRs, avg. BLOCKs and BETAs compared in terms of noise and number of observations]
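The trade-off between noise and number of observations can be illustrated with simulated data. The sketch assumes pure, temporally independent Gaussian noise, which real fMRI time series do not strictly satisfy; the block length of 10 TRs and all names are hypothetical.

```python
import random
from statistics import pstdev

def block_average(samples, block_length):
    """Average consecutive single-TR patterns in blocks of `block_length` TRs."""
    blocks = []
    for start in range(0, len(samples) - block_length + 1, block_length):
        block = samples[start:start + block_length]
        blocks.append([sum(column) / block_length for column in zip(*block)])
    return blocks

rng = random.Random(2)
n_trs, n_voxels, block_length = 4000, 5, 10

# Hypothetical data: unit-variance noise only, one pattern per TR.
single_trs = [[rng.gauss(0.0, 1.0) for _ in range(n_voxels)] for _ in range(n_trs)]
blocks = block_average(single_trs, block_length)

noise_single = pstdev([value for pattern in single_trs for value in pattern])
noise_block = pstdev([value for pattern in blocks for value in pattern])

print(f"single TRs: {len(single_trs)} observations, noise sd {noise_single:.2f}")
print(f"avg blocks: {len(blocks)} observations, noise sd {noise_block:.2f}")
```

Under the independence assumption, averaging 10 TRs lowers the noise by roughly the square root of 10 but leaves only a tenth of the observations for training and testing.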
The common problems when thinking about
MVPD of fMRI data - OVERFITTING
– (1) An SVM classifier is unstable on a small training set.
– (2) The SVM’s optimal hyperplane may be biased when there are far fewer positive samples than negative samples.
– (3) Overfitting happens because the number of feature dimensions is much higher than the size of the training set.
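Point (3) can be demonstrated on data that contain no information at all. The sketch uses a plain perceptron as a stand-in for a linear classifier (it is not an SVM) and assumes 10 training samples with 200 random features and random labels; all names and sizes are hypothetical.

```python
import random

def train_perceptron(xs, ys, max_epochs=1000):
    """Plain perceptron: update the weights on every misclassified sample."""
    weights = [0.0] * len(xs[0])
    for _ in range(max_epochs):
        mistakes = 0
        for x, y in zip(xs, ys):
            score = sum(w * value for w, value in zip(weights, x))
            if y * score <= 0:
                weights = [w + y * value for w, value in zip(weights, x)]
                mistakes += 1
        if mistakes == 0:          # every training sample is classified correctly
            break
    return weights

def accuracy(weights, xs, ys):
    hits = 0
    for x, y in zip(xs, ys):
        score = sum(w * value for w, value in zip(weights, x))
        hits += (1 if score > 0 else -1) == y
    return hits / len(ys)

def random_data(n_samples, n_features, rng):
    """Pure noise features with random labels: there is nothing to learn."""
    xs = [[rng.gauss(0.0, 1.0) for _ in range(n_features)] for _ in range(n_samples)]
    ys = [rng.choice([-1, 1]) for _ in range(n_samples)]
    return xs, ys

rng = random.Random(3)
train_x, train_y = random_data(10, 200, rng)    # far more features than samples
test_x, test_y = random_data(200, 200, rng)

weights = train_perceptron(train_x, train_y)
train_accuracy = accuracy(weights, train_x, train_y)
test_accuracy = accuracy(weights, test_x, test_y)
print(f"training accuracy: {train_accuracy:.2f}, test accuracy: {test_accuracy:.2f}")
```

With many more features than samples the classifier can fit the training labels perfectly even though they are random, while accuracy on new data stays around chance.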
Over-fitting and Under-fitting
• To avoid overfitting, cross-validation is used to evaluate the fit provided by each set of parameter values tried during the grid or pattern search.
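Cross-validated parameter selection can be sketched as below. To stay self-contained the example swaps in a k-nearest-neighbour classifier, with the number of neighbours as the parameter being searched, instead of an SVM; the data, the grid and the five folds are assumptions for illustration.

```python
import random

def knn_predict(train_x, train_y, x, k):
    """Majority vote among the k nearest training samples (labels are +1/-1)."""
    def squared_distance(a, b):
        return sum((p - q) ** 2 for p, q in zip(a, b))
    nearest = sorted(range(len(train_x)),
                     key=lambda i: squared_distance(train_x[i], x))[:k]
    return 1 if sum(train_y[i] for i in nearest) > 0 else -1

def cross_validated_accuracy(xs, ys, k, n_folds=5):
    """Each sample is predicted once by a model that never saw it."""
    hits = 0
    for fold in range(n_folds):
        train_idx = [i for i in range(len(xs)) if i % n_folds != fold]
        test_idx = [i for i in range(len(xs)) if i % n_folds == fold]
        fold_x = [xs[i] for i in train_idx]
        fold_y = [ys[i] for i in train_idx]
        hits += sum(knn_predict(fold_x, fold_y, xs[i], k) == ys[i] for i in test_idx)
    return hits / len(xs)

rng = random.Random(4)
# Hypothetical two-voxel data: the two classes differ in their mean pattern.
ys = [1 if i % 2 == 0 else -1 for i in range(60)]
xs = [[y + rng.gauss(0.0, 1.0), y + rng.gauss(0.0, 1.0)] for y in ys]

parameter_grid = [1, 3, 5, 7]     # odd values avoid tied votes
scores = {k: cross_validated_accuracy(xs, ys, k) for k in parameter_grid}
best_k = max(scores, key=scores.get)
print(scores, "selected k:", best_k)
```

Note that the accuracy of the winning parameter value is itself optimistically biased; an unbiased estimate needs a further held-out set or an outer cross-validation loop.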
The common problems:
VOXEL SELECTION (LDA & SVM)
• To reduce the dimensionality of the feature input (# of voxels) it is common to preselect voxels:
– ROI-based selection of voxels
• But: the ROI must be defined independently of the classification
– Threshold-based selection of voxels
• But: the threshold must be independent of the classification
– Searchlight approach: a fixed sphere is moved over the brain, voxel by voxel
• But: multiple comparisons!
• SLR does not have this problem due to automatic relevance determination
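Why the selection has to be independent of the classification can be shown on pure noise. The sketch assumes 20 samples, 500 noise voxels, a simple mean-difference ranking and a nearest-centroid classifier; all of these are hypothetical choices for illustration, not the selection used in any particular toolbox.

```python
import random

def select_voxels(xs, ys, n_keep):
    """Rank voxels by the absolute mean difference between the two classes."""
    def class_mean(voxel, label):
        values = [x[voxel] for x, y in zip(xs, ys) if y == label]
        return sum(values) / len(values)
    ranked = sorted(range(len(xs[0])),
                    key=lambda v: abs(class_mean(v, 1) - class_mean(v, -1)),
                    reverse=True)
    return ranked[:n_keep]

def nearest_centroid_predict(train_x, train_y, x, voxels):
    def squared_distance_to_class(label):
        rows = [r for r, y in zip(train_x, train_y) if y == label]
        centroid = [sum(r[v] for r in rows) / len(rows) for v in voxels]
        return sum((x[v] - m) ** 2 for v, m in zip(voxels, centroid))
    return 1 if squared_distance_to_class(1) < squared_distance_to_class(-1) else -1

def leave_one_out(xs, ys, n_keep, select_inside_fold):
    all_data_voxels = select_voxels(xs, ys, n_keep)   # has seen every test sample
    hits = 0
    for i in range(len(xs)):
        train_x, train_y = xs[:i] + xs[i + 1:], ys[:i] + ys[i + 1:]
        if select_inside_fold:
            voxels = select_voxels(train_x, train_y, n_keep)   # training data only
        else:
            voxels = all_data_voxels                            # circular selection
        hits += nearest_centroid_predict(train_x, train_y, xs[i], voxels) == ys[i]
    return hits / len(xs)

rng = random.Random(5)
ys = [1, -1] * 10
xs = [[rng.gauss(0.0, 1.0) for _ in range(500)] for _ in ys]   # no real effect

circular = leave_one_out(xs, ys, 10, select_inside_fold=False)
independent = leave_one_out(xs, ys, 10, select_inside_fold=True)
print(f"selection on all data: {circular:.2f}, selection inside folds: {independent:.2f}")
```

Selecting voxels on all data lets information about the test samples leak into the classifier, so the apparent accuracy rises well above what the same pipeline achieves when selection is repeated inside each training fold.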
Relevant introduction papers
• Revealing representational content with pattern-information fMRI--an introductory guide.
– Mur M, Bandettini PA, Kriegeskorte N
• Machine learning classifiers and fMRI: a tutorial overview.
– Pereira F, Mitchell T, Botvinick M
• Sparse estimation automatically selects voxels relevant for the decoding of fMRI activity patterns.
– Yamashita O, Sato MA, Yoshioka T, Tong F, Kamitani Y
Thanks – enjoy this sunny afternoon!