MVPA / MVPD – Multivariate pattern decoding
Session 13, 3.7.2008
Christian Kaul
MATLAB for Cognitive Neuroscience
Outline
• What is MVPD
– What types of classifiers are there?
• MVPD in fMRI
• How to design an experiment – 2 Examples
• The MVPD MatLab “toolbox”
• Common problems when thinking about MVPD of fMRI data
MVPD – Multivariate pattern decoding
• Also known as MVPA (with A for Analysis);
however, if you enter MVPA into Google you end
up here: [screenshot not included in transcript]
MVPD – Multivariate pattern decoding
• What is MVPD?
– Methodology in which an algorithm is trained to tell two
or more conditions from each other.
– The algorithm is then presented with a new set of data
and categorises/classifies it into the conditions
previously learned.
– MVPD is a relatively new tool in fMRI; note, however,
that pattern classification as such has long been
developed and used in artificial intelligence and
neural networks.
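The train-then-classify idea above can be sketched in a few lines. This is a minimal Python illustration, not the toolbox code: it assumes a nearest-mean classifier (simpler than LDA or SVM) and simulated data in which hypothetical condition "A" raises the even voxels and "B" the odd voxels. All function names are made up for the example.

```python
import random

def train_nearest_mean(patterns, labels):
    """Training: learn one mean pattern (centroid) per condition."""
    sums, counts = {}, {}
    for pattern, label in zip(patterns, labels):
        if label not in sums:
            sums[label] = [0.0] * len(pattern)
            counts[label] = 0
        sums[label] = [s + v for s, v in zip(sums[label], pattern)]
        counts[label] += 1
    return {label: [s / counts[label] for s in sums[label]] for label in sums}

def classify(centroids, pattern):
    """Classification: pick the condition whose centroid is closest."""
    def sq_dist(a, b):
        return sum((x - y) ** 2 for x, y in zip(a, b))
    return min(centroids, key=lambda label: sq_dist(centroids[label], pattern))

def simulate(label, rng, n_voxels=20):
    """Simulated pattern (assumption): 'A' raises even voxels, 'B' odd voxels."""
    offset = 0 if label == "A" else 1
    return [rng.gauss(1.0 if i % 2 == offset else 0.0, 1.0) for i in range(n_voxels)]

rng = random.Random(0)
train_labels = ["A", "B"] * 20
train_patterns = [simulate(label, rng) for label in train_labels]
test_labels = ["A", "B"] * 10          # new data, never seen during training
test_patterns = [simulate(label, rng) for label in test_labels]

centroids = train_nearest_mean(train_patterns, train_labels)
predictions = [classify(centroids, p) for p in test_patterns]
accuracy = sum(p == t for p, t in zip(predictions, test_labels)) / len(test_labels)
print(f"decoding accuracy on new data: {accuracy:.2f}")
```

The important point is that accuracy is computed only on patterns the classifier has not seen during training.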
What types of classifiers are there?
• The most common classifiers used
for fMRI data are
– LDA (Linear Discriminant Analysis) and
– SVM (Support Vector Machines; algorithm: maximize the margin!).
• Both generally do a good job.
• LDA finds a solution based on
linear combinations of features
only.
• SVMs, however, can also take
non-linear effects into account.
This is largely done by
mapping the information into a
higher dimensional space
(feature space).
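A toy illustration of why mapping into a feature space helps, as a Python sketch with made-up data: the class depends on how far a value is from zero, so no single threshold on x separates the classes, but after the (assumed) mapping x → x² one threshold does. A real SVM does this implicitly through its kernel; here the mapping is written out by hand.

```python
xs = [-2.0, -1.5, -0.5, 0.0, 0.5, 1.5, 2.0]
labels = [1, 1, 0, 0, 0, 1, 1]   # class 1 lies on both sides of class 0

def best_threshold_accuracy(values, labels):
    """Best accuracy of any rule 'class 1 if value > t' or its mirror image."""
    ordered = sorted(values)
    thresholds = [ordered[0] - 1.0] + [(a + b) / 2 for a, b in zip(ordered, ordered[1:])]
    best = 0
    for t in thresholds:
        correct = sum((v > t) == bool(l) for v, l in zip(values, labels))
        best = max(best, correct, len(labels) - correct)
    return best / len(labels)

linear_in_input = best_threshold_accuracy(xs, labels)
linear_in_feature_space = best_threshold_accuracy([x ** 2 for x in xs], labels)
print(linear_in_input, linear_in_feature_space)
```

In the original space the best linear rule misclassifies some points; in the squared space a single threshold separates all of them.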
Non-linear SVMs - Feature space
2 examples:
Downside of non-linear SVMs:
More parameters have to be
optimized during learning.
MVPD in fMRI
• In situations where we do find a univariate effect, a multivariate
analysis is unlikely to reveal anything new!
• But when conventional analysis is not feasible, multivariate
analysis might be an option.
• What are we actually measuring?
• What does a “pattern of brain activity” mean?
• Assumption:
• Feature-sensitive information is present in the BOLD signal
From fMRI data to a brain pattern.
[Figure: multivoxel fMRI data → pattern. The distribution of selective cells might be uneven (biased sampling).]
From fMRI data to a brain pattern.
[Figure: raw fMRI data → multivoxel fMRI data → pattern → classifier (N-fold cross-validation) → result: accuracy of decoding (%-accuracy result).]
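The N-fold cross-validation step in this pipeline could look roughly like the following Python sketch. It assumes a simple nearest-mean classifier and simulated patterns (label 1 shifts every voxel by one noise standard deviation); the function names are hypothetical and this is not the toolbox implementation.

```python
import random

def nearest_mean_fit(X, y):
    """Mean pattern per label, computed from the training patterns only."""
    means = {}
    for label in sorted(set(y)):
        rows = [x for x, l in zip(X, y) if l == label]
        means[label] = [sum(col) / len(rows) for col in zip(*rows)]
    return means

def nearest_mean_predict(means, x):
    """Label whose mean pattern is closest to x."""
    return min(means, key=lambda l: sum((a - b) ** 2 for a, b in zip(means[l], x)))

def cross_validate(X, y, n_folds):
    """Mean % accuracy over n_folds splits; each pattern is tested exactly once."""
    indices = list(range(len(X)))
    folds = [indices[i::n_folds] for i in range(n_folds)]
    accuracies = []
    for test_idx in folds:
        held_out = set(test_idx)
        train_idx = [i for i in indices if i not in held_out]
        means = nearest_mean_fit([X[i] for i in train_idx], [y[i] for i in train_idx])
        hits = sum(nearest_mean_predict(means, X[i]) == y[i] for i in test_idx)
        accuracies.append(100.0 * hits / len(test_idx))
    return sum(accuracies) / n_folds

rng = random.Random(0)
y = [i % 2 for i in range(40)]                                       # two conditions
X = [[rng.gauss(float(label), 1.0) for _ in range(10)] for label in y]  # 10 voxels
accuracy = cross_validate(X, y, n_folds=5)
print(f"{accuracy:.1f} % correct (chance: 50 %)")
```

With real fMRI data the folds would usually follow scanning runs, so that training and test patterns never come from the same run.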
What questions can (& cannot) be answered with
multivariate pattern analysis?
• Feature-sensitive information is present in the
BOLD signal (biased sampling)
• Multivariate decoding extracts this info
• Examples:
• “What have I seen?” Decoding of visual input
makes up the majority of publications (basic
features, categories, entire natural pictures).
• “What have I heard/felt/…?” Decoding of
other feature-specific input should
be possible.
• “What am I going to do next?” Decisions
seem to be coded in distinctive patterns of
brain activity.
[Figure: mean signal vs. LDA; Haynes & Rees (2005)]
Multivariate pattern analysis – how to design
an experiment
• Unlike conventional analysis, we are
asking a different question:
Does the pattern of activity contain meaningful
information we can extract?
→ It is not the level of brain activity that is addressed, but the
pattern of information within the activity.
Example 1 – visual features
Question:
• Does feature-selective
information (left vs. right tilted
orientation, as measured by
decoding from the BOLD signal) for
the irrelevant annulus change
between the two central load
conditions?
• Decoding objective:
– Is feature-selective information
reduced in the high load condition?
Multivariate Decoding example Result:
[Figure: accuracy (% correct decoded) plotted against number of voxels for low and high load, actual vs. expected.]
Result:
Feature selective info present
and decoded
(…but slightly confusing)
Example 2 – intentions
Haynes et al., 2007
Question:
• At the beginning of each trial, the
word “select” was presented, which
instructed the subjects to freely
and covertly choose one of two
possible tasks, addition or
subtraction. From the button
press, it was possible to
determine the covert intention of
the subject during the previous
delay period.
• Decoding objective:
– Can the subject’s decision be decoded?
Example 2, Result
• In the anterior medial prefrontal cortex, decoding during the delay
(green bars) was highest but was at chance level during task
execution (red bars) after onset of the task-relevant stimuli. In contrast,
posterior & superior medial prefrontal cortex (MPFCp) encoded the
chosen task only once it had entered the stage of execution, but not
during the delay period.
• Results are presented with the “searchlight” approach: a spherical searchlight
centered on one voxel is used to define a local neighborhood.
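How a spherical searchlight defines a local neighborhood can be sketched as follows. This is a Python illustration under assumptions: voxels are addressed by integer (x, y, z) indices, the radius is given in voxels, and both function names are invented for the example.

```python
def sphere_offsets(radius):
    """All integer (dx, dy, dz) offsets within `radius` voxels of the centre."""
    r = int(radius)
    return [(dx, dy, dz)
            for dx in range(-r, r + 1)
            for dy in range(-r, r + 1)
            for dz in range(-r, r + 1)
            if dx * dx + dy * dy + dz * dz <= radius * radius]

def searchlight_neighbourhood(centre, shape, offsets):
    """Voxel coordinates of the sphere around `centre`, clipped to the volume."""
    cx, cy, cz = centre
    return [(cx + dx, cy + dy, cz + dz)
            for dx, dy, dz in offsets
            if 0 <= cx + dx < shape[0]
            and 0 <= cy + dy < shape[1]
            and 0 <= cz + dz < shape[2]]

offsets = sphere_offsets(2)
inner = searchlight_neighbourhood((5, 5, 5), (10, 10, 10), offsets)
corner = searchlight_neighbourhood((0, 0, 0), (10, 10, 10), offsets)
print(len(inner), len(corner))   # spheres at the edge of the volume contain fewer voxels
```

In a full searchlight analysis the classifier is run once per centre voxel on the voxels in its neighbourhood, and the resulting accuracy is written back to the centre voxel.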
The MVPD MatLab “toolbox”
• During the last few months I have written a set of MatLab functions to perform multivariate data analysis with “any”
suitable data (work in progress).
• As the package is quite complex and lengthy, I’m only
introducing the basic control script. It is quite easy to follow
the workflow in this control script as a demonstration of
what MVPD can look like.
• If anyone is interested in working with the code, please
refer to the course webpage, or,
• as the “toolbox” is not complete yet: for the newest version
contact me directly: [email protected]
The common problems when thinking about
MVPD of fMRI data
• Decoding of what? TR, block average, betas.
• Overfitting: too many features for too few data
samples.
• Voxel selection.
The common problems when thinking about
MVPD of fMRI data: TR, BLOCK or BETA?
• In principle there are 3 different strategies for
obtaining your brain pattern: single TRs (raw data),
averaged blocks of TRs, betas (SPM estimates).
[Figure: single TRs, avg. BLOCKs and BETAs compared in terms of noise and number of observations.]
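The trade-off between noise and number of observations can be made concrete with a small Python sketch for one voxel. It assumes simulated, independent Gaussian noise per TR and a hypothetical block length of 8 TRs; real fMRI noise is temporally correlated, so the gain from averaging is smaller in practice.

```python
import random
import statistics

def average_blocks(tr_values, block_length):
    """Average consecutive TRs into one observation per block."""
    return [statistics.fmean(tr_values[i:i + block_length])
            for i in range(0, len(tr_values) - block_length + 1, block_length)]

rng = random.Random(1)
single_trs = [rng.gauss(0.0, 1.0) for _ in range(400)]   # one voxel, 400 TRs of noise
block_means = average_blocks(single_trs, block_length=8)

print("observations:", len(single_trs), "single TRs vs.", len(block_means), "blocks")
print("noise (sd):  ", round(statistics.stdev(single_trs), 2), "vs.",
      round(statistics.stdev(block_means), 2))
```

Averaging reduces the noise per observation but leaves fewer observations to train and test the classifier on.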
The common problems when thinking about
MVPD of fMRI data - OVERFITTING
– (1) An SVM classifier is unstable on a small
training set.
– (2) An SVM’s optimal hyperplane may be biased when the
positive feedback samples are far fewer than the
negative samples.
– (3) Overfitting happens because the number of feature
dimensions is much higher than the size of the training
set.
Over-fitting and Under-fitting
• To avoid overfitting, cross-validation is used to
evaluate the fit provided by each set of parameter
values tried during the grid or pattern search
process.
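A minimal Python sketch of a grid search evaluated by cross-validation. It assumes a k-nearest-neighbour classifier as a stand-in (its single parameter k is easy to search), simulated two-voxel data, and hypothetical function names. Accuracy on the training set itself is shown for comparison, because that is where overfitting hides.

```python
import random

def knn_predict(train_X, train_y, x, k):
    """Majority vote among the k nearest training patterns (labels 0/1, k odd)."""
    nearest = sorted(range(len(train_X)),
                     key=lambda i: sum((a - b) ** 2 for a, b in zip(train_X[i], x)))[:k]
    votes = sum(train_y[i] for i in nearest)
    return 1 if 2 * votes > k else 0

def training_accuracy(X, y, k):
    """Accuracy when testing on the training data itself (overly optimistic)."""
    return sum(knn_predict(X, y, x, k) == label for x, label in zip(X, y)) / len(X)

def cv_accuracy(X, y, k, n_folds=5):
    """Accuracy when every pattern is classified by a model that never saw it."""
    hits = 0
    for fold in range(n_folds):
        train_idx = [i for i in range(len(X)) if i % n_folds != fold]
        test_idx = [i for i in range(len(X)) if i % n_folds == fold]
        train_X = [X[i] for i in train_idx]
        train_y = [y[i] for i in train_idx]
        hits += sum(knn_predict(train_X, train_y, X[i], k) == y[i] for i in test_idx)
    return hits / len(X)

rng = random.Random(2)
y = [i % 2 for i in range(60)]
X = [[rng.gauss(float(label), 1.0) for _ in range(2)] for label in y]

grid = [1, 3, 5, 9, 15]                       # parameter values to try
cv_scores = {k: cv_accuracy(X, y, k) for k in grid}
best_k = max(grid, key=cv_scores.get)
print("training accuracy for k=1:", training_accuracy(X, y, 1))
print("cross-validated accuracy per k:", cv_scores, "-> best k:", best_k)
```

With k = 1 every training pattern is its own nearest neighbour, so training accuracy is perfect and says nothing about generalisation. Note also that the cross-validated score of the selected parameter is itself slightly optimistic; a separate held-out set or nested cross-validation is needed for an unbiased final estimate.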
The common problems when thinking about
MVPD of fMRI data: VOXEL SELECTION
• To reduce feature input dimensionality it is
common to preselect voxels:
– ROI-based selection of voxels
• The ROI must be defined independently of the classification
– Threshold-based selection of voxels
• The threshold must be independent of the classification
– Searchlight approach: a fixed sphere is moved over the
brain, voxel by voxel
• Results have to be corrected for multiple comparisons, as each data
point is used multiple times.
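Why voxel selection must be independent of the classification can be demonstrated on pure noise. The Python sketch below is an illustration under assumptions: simulated data with no real effect, a nearest-mean classifier, threshold-like selection of the 10 voxels with the largest class difference, and hypothetical function names. Selecting voxels on all data (including the test patterns) is compared with selecting them inside each training fold.

```python
import random

def top_voxels(X, y, n_keep):
    """Indices of the n_keep voxels with the largest absolute class-mean difference."""
    def mean_diff(v):
        a = [x[v] for x, label in zip(X, y) if label == 1]
        b = [x[v] for x, label in zip(X, y) if label == 0]
        return abs(sum(a) / len(a) - sum(b) / len(b))
    return sorted(range(len(X[0])), key=mean_diff, reverse=True)[:n_keep]

def nearest_mean_predict(train_X, train_y, x, voxels):
    """Label (0 or 1) whose training mean is closest to x on the chosen voxels."""
    def dist(label):
        rows = [r for r, l in zip(train_X, train_y) if l == label]
        return sum((sum(r[v] for r in rows) / len(rows) - x[v]) ** 2 for v in voxels)
    return min((0, 1), key=dist)

def cv_accuracy(X, y, n_keep, select_inside_fold, n_folds=5):
    # Selecting on all data lets the test patterns influence the voxel choice.
    all_data_voxels = None if select_inside_fold else top_voxels(X, y, n_keep)
    hits = 0
    for fold in range(n_folds):
        train_idx = [i for i in range(len(X)) if i % n_folds != fold]
        test_idx = [i for i in range(len(X)) if i % n_folds == fold]
        train_X = [X[i] for i in train_idx]
        train_y = [y[i] for i in train_idx]
        voxels = top_voxels(train_X, train_y, n_keep) if select_inside_fold else all_data_voxels
        hits += sum(nearest_mean_predict(train_X, train_y, X[i], voxels) == y[i]
                    for i in test_idx)
    return hits / len(X)

rng = random.Random(3)
y = [i % 2 for i in range(20)]
X = [[rng.gauss(0.0, 1.0) for _ in range(1000)] for _ in y]   # pure noise, no effect

biased = cv_accuracy(X, y, n_keep=10, select_inside_fold=False)
independent = cv_accuracy(X, y, n_keep=10, select_inside_fold=True)
print("selection on all data:      ", biased)
print("selection on training folds:", independent)
```

Because the data contain no signal, any accuracy well above 50 % in the first case is produced by the non-independent selection alone.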
Thanks – enjoy this sunny afternoon!