
Neuropsychology of Vision
Anthony Cate
April 19, 2001
Dolphins have amazing brains!
Holy cow!
The problem of vision and how
it relates to PDP modeling
Narrowing down the problem:
Perception vs. Action
Look at the P, the D and the P
Parallel?
The visual system is parallel
on at least two levels:
The early visual system has a
massively parallel architecture
Visual input from early visual
areas is processed in 2 distinct
ways in parallel
The dorsal and the ventral
stream can be dissociated by
brain injury
Distributed?
No grandmother cells.
Distributed?
Localist or distributed
representations of form?
Processing?
How best to describe the processing
performed by higher-order brain areas?
In terms of function or computation?
Is there a limit to the parallel
stage of visual processing?
Effects of context at all stages
of processing.
High level:
Model via constraint
satisfaction?
Effects of context at all stages
of processing.
Extremely low level:
The result of either
feedback from “higher
layers” or from lateral
connections.
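To make the constraint-satisfaction idea concrete, here is a minimal sketch (not from the talk): a four-unit network in which feedback from a word-level hypothesis resolves an ambiguous letter. All unit labels, weights, and evidence values are invented for illustration.

```python
import numpy as np

# Units: 0 = "middle letter is A", 1 = "middle letter is H",
#        2 = word hypothesis "CAT", 3 = letter-string hypothesis "CHT".
# Symmetric weights encode the constraints: A supports CAT, H supports CHT,
# and the alternatives within each level compete with one another.
W = np.array([[ 0., -2.,  1.,  0.],
              [-2.,  0.,  0.,  1.],
              [ 1.,  0.,  0., -2.],
              [ 0.,  1., -2.,  0.]])

def settle(external, steps=200, rate=0.2):
    """Let each unit drift toward the sigmoid of its net input
    (bottom-up evidence plus feedback from the other units)."""
    a = np.full(4, 0.5)
    for _ in range(steps):
        net = W @ a + external
        a += rate * (1.0 / (1.0 + np.exp(-net)) - a)
    return a

# An ambiguous middle letter gives equal bottom-up evidence for A and H,
# but the surrounding letters C_T lend top-down support to the word CAT.
evidence = np.array([0.5, 0.5, 1.0, 0.0])
print(settle(evidence).round(2))   # the A interpretation wins via feedback
```

The same settling process can be read as feedback from a "higher layer" or as lateral competition; the point is only that context biases how identical bottom-up input is interpreted.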
Point:
Think of vision not as
- reproducing patterns of light
- matching inputs to stored templates
But as a process wherein visual input
interacts with many kinds of information
stored in a network.
No single process.
A short PDP tour of the visual system:
Attention (Posner vs. Cohen)
Object Recognition (RBC vs. Edelman)
Category specificity (Kanwisher vs.
Gauthier)
Visual attention as the product of a
distributed network of brain areas:
Damage to the parietal lobe produces what
appears to be a “disengage” problem:
But how to
characterize the
behavior produced by
this network?
By assigning
functions to each
anatomical region…
Posner et al., 1984
… or by describing the computational
process by which the network might
produce the behavior?
Cohen et al., 1994
Damage to the simple model also produces
a “disengage” problem, without the need for
any subdivision of processing:
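As a rough illustration of how a "disengage" deficit can fall out of a lesion without any dedicated disengage mechanism, here is a toy two-unit competition network. This is not the actual Cohen et al. (1994) model; the weights, input gains, lesion parameter, and threshold are all invented for the sketch.

```python
import numpy as np

def rt(cue_side, target_side, lesion=1.0, thresh=0.7, dt=0.1, max_steps=1000):
    """Two units, one per location, with self-excitation and mutual inhibition.
    `lesion` scales the input gain of side 1 (1.0 = intact, <1.0 = damaged).
    Returns steps after target onset until the target unit exceeds `thresh`,
    a stand-in for reaction time."""
    gain = np.array([1.0, lesion])       # hypothetical input gains
    w_self, w_inhib = 0.5, -1.0          # hypothetical connection strengths
    act = np.zeros(2)

    def step(ext):
        net = gain * ext + w_self * act + w_inhib * act[::-1]
        act[:] = np.clip(act + dt * (net - act), 0.0, 1.0)

    cue = np.eye(2)[cue_side]
    for _ in range(100):                 # cue period: attention settles on the cue
        step(cue)
    target = np.eye(2)[target_side]
    for t in range(max_steps):           # target period
        step(target)
        if act[target_side] > thresh:
            return t
    return max_steps

for lesion, label in [(1.0, "intact  "), (0.5, "lesioned")]:
    valid   = rt(cue_side=1, target_side=1, lesion=lesion)
    invalid = rt(cue_side=0, target_side=1, lesion=lesion)
    print(f"{label}: valid cue {valid:4d} steps, invalid cue {invalid:4d} steps")
```

Validly cued targets are detected equally fast either way; the lesion's cost shows up only when attention must be pulled away from the intact side, which looks like a "disengage" problem even though no disengage operation was built in.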
Disadvantages of the PDP account:
The “disengage” model exists for a reason:
different lesions produce somewhat different
kinds of disorders.
Perhaps inappropriate to capture entire
network in such a simple model.
Object recognition
How do we generalize across different
viewpoints of one object?
How do we generalize across the many
individual objects that make up a
category?
Great differences in the images
produced by both of these factors
Marr’s solution:
Represent 3-D volumes, not 2-D images
Marr was unfortunate, so…
Recognition by Components (RBC)
JIM
(“John & Irv’s
Model”)
Hummel &
Biederman, 1992
Recognition by Components (RBC)
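A toy rendering of the structural-description idea behind RBC (not the JIM network itself): an object is a small set of geon-plus-relation triples, and recognition is a match over those triples. The geon and relation labels are made up for the example.

```python
# Hypothetical encoding: an object is a set of (geon, relation, geon) triples.
MUG       = frozenset({("cylinder", "side-attached", "curved-cylinder")})  # body + side handle
PAIL      = frozenset({("cylinder", "top-attached",  "curved-cylinder")})  # body + handle on top
BRIEFCASE = frozenset({("brick",    "top-attached",  "curved-cylinder")})

memory = {"mug": MUG, "pail": PAIL, "briefcase": BRIEFCASE}

def match(probe, memory):
    """Return the stored structural description sharing the most triples with the probe."""
    return max(memory, key=lambda name: len(memory[name] & probe))

# A probe derived from a novel viewpoint still yields the same geons and
# categorical relations, so the match is viewpoint-invariant by construction.
probe = frozenset({("cylinder", "side-attached", "curved-cylinder")})
print(match(probe, memory))   # -> mug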
So perhaps object recognition is not based on
representations of 3-D structure, but on 2-D views.
Difficulties in representing objects via 2-D
images:
How to associate different views of
the same object?
How to generalize based on a view?
The space of all possible 2-D images is of
a much higher dimensionality than the
space of geons.
Dimensionality reduction
Dimensionality reduction by PCA
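A minimal numpy sketch of dimensionality reduction by PCA (not from the talk), using synthetic "images" whose 1024 pixels actually vary along only a few underlying dimensions.

```python
import numpy as np

rng = np.random.default_rng(0)
# Hypothetical data: 200 "images" of 32x32 = 1024 pixels that vary along
# only 5 underlying dimensions, plus a little pixel noise.
latent = rng.normal(size=(200, 5))
mixing = rng.normal(size=(5, 1024))
images = latent @ mixing + 0.1 * rng.normal(size=(200, 1024))

# PCA via the singular value decomposition of the centred data matrix.
X = images - images.mean(axis=0)
U, S, Vt = np.linalg.svd(X, full_matrices=False)
explained = S**2 / (S**2).sum()
k = np.searchsorted(np.cumsum(explained), 0.95) + 1
print(f"{k} components capture 95% of the variance of the 1024-pixel images")

# Each image becomes a k-number code, from which it can be reconstructed.
codes = X @ Vt[:k].T
recon = codes @ Vt[:k] + images.mean(axis=0)
print("mean reconstruction error:", np.abs(recon - images).mean())
```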
Potential problem with principal component
representations:
What about similarity?
As long as second-order isomorphism is preserved,
this is not necessarily a problem
Second-order isomorphism
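One way to see second-order isomorphism in the toy PCA setting above (again an illustration, not the talk's own demo): the low-dimensional codes look nothing like the images, but the pairwise distances among codes closely track the pairwise distances among the images.

```python
import numpy as np

rng = np.random.default_rng(1)
# Hypothetical stimuli: 50 "images" (1024 pixels) with 5 underlying dimensions.
latent = rng.normal(size=(50, 5))
images = latent @ rng.normal(size=(5, 1024)) + 0.1 * rng.normal(size=(50, 1024))

X = images - images.mean(axis=0)
_, _, Vt = np.linalg.svd(X, full_matrices=False)
codes = X @ Vt[:5].T                           # 5-number internal code per stimulus

def pairwise(a):
    """Upper triangle of the matrix of Euclidean distances between rows."""
    d = np.linalg.norm(a[:, None, :] - a[None, :, :], axis=-1)
    return d[np.triu_indices(len(a), k=1)]

# Second-order isomorphism: the *relations* (distances) among the codes
# mirror the relations among the stimuli, even though a 5-number code
# bears no first-order resemblance to a 1024-pixel image.
r = np.corrcoef(pairwise(images), pairwise(codes))[0, 1]
print(f"correlation between stimulus distances and code distances: r = {r:.2f}")
```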
Autoencoders can do PCA
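A hedged sketch of the classic result that a linear autoencoder trained on reconstruction error finds the principal subspace: after training, its reconstruction error approaches the PCA optimum for the same number of hidden units. The data, layer sizes, and learning rate are arbitrary choices for the demo.

```python
import numpy as np

rng = np.random.default_rng(2)
X = rng.normal(size=(500, 4)) @ rng.normal(size=(4, 20))   # 20-d data of rank ~4
X -= X.mean(axis=0)

k = 3                                       # hidden layer size
W_enc = 0.1 * rng.normal(size=(20, k))      # encoder weights
W_dec = 0.1 * rng.normal(size=(k, 20))      # decoder weights
lr = 2e-3

for _ in range(5000):
    H = X @ W_enc                           # linear hidden layer (no nonlinearity)
    E = H @ W_dec - X                       # reconstruction error
    g_dec = H.T @ E / len(X)                # gradient of mean squared error
    g_enc = X.T @ (E @ W_dec.T) / len(X)
    W_enc -= lr * g_enc
    W_dec -= lr * g_dec

# The optimal k-dimensional linear reconstruction is given by PCA.
U, S, Vt = np.linalg.svd(X, full_matrices=False)
pca_err = ((X - X @ Vt[:k].T @ Vt[:k]) ** 2).mean()
ae_err  = ((X - X @ W_enc @ W_dec) ** 2).mean()
print(f"PCA error {pca_err:.4f}  vs  trained autoencoder error {ae_err:.4f}")
```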
Problem with stored views approach:
Do we represent prototypes?
How do we “decide” which prototypes to
represent?
(Is there really a “chorus?”)
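The "chorus" here refers to Edelman's Chorus of Prototypes proposal. A minimal sketch of the idea (prototypes and numbers invented for illustration): represent a stimulus by its graded similarities to a handful of stored reference views rather than by the image itself.

```python
import numpy as np

rng = np.random.default_rng(3)
# Hypothetical stored prototype views: 7 reference "images" of 1024 pixels each.
prototypes = rng.normal(size=(7, 1024))

def chorus_code(image, sigma=20.0):
    """Represent an image by its graded similarity to each stored prototype
    (a radial-basis-style 'chorus' of responses), not by the image itself."""
    d = np.linalg.norm(prototypes - image, axis=1)
    return np.exp(-(d / sigma) ** 2)

# A novel stimulus near prototype 2 yields a code dominated by unit 2, and
# nearby stimuli get similar codes, so generalization falls out of similarity.
novel = prototypes[2] + 0.3 * rng.normal(size=1024)
print(chorus_code(novel).round(2))
```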
Do we represent prototypes?
At least for faces, yes!
Do we represent prototypes?
How distributed are prototype
representations?
Sparse population coding of faces:
Suggests that neurons do represent a
distributed code, but…
Only a very small subset of neurons
necessary to encode a particular
category of stimuli (faces)
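To quantify "a very small subset," a standard population-sparseness index (in the style of Treves & Rolls) can be computed from a vector of firing rates; the simulated rates below are invented for illustration.

```python
import numpy as np

def sparseness(r):
    """Treves-Rolls population-sparseness index: a = (mean rate)^2 / mean(rate^2).
    Close to 1 for a dense code with equal rates; close to m/N when only
    m of N units respond."""
    r = np.asarray(r, dtype=float)
    return r.mean() ** 2 / (r ** 2).mean()

rng = np.random.default_rng(4)
n = 100
dense = rng.uniform(5, 15, size=n)                     # every unit fires to the stimulus
sparse = np.zeros(n)
sparse[rng.choice(n, 5, replace=False)] = rng.uniform(20, 40, 5)   # only 5 units fire

print(f"dense code  sparseness: {sparseness(dense):.2f}")   # near 1
print(f"sparse code sparseness: {sparseness(sparse):.2f}")  # near 5/100
```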
Does this imply that the brain is
organized in terms of function?
The PCA and prototype encoding discussed
above are entirely holistic.
The image is not segmented into parts
which are represented independently.
This is not how we represent most images,
except for faces.
Larry
Tanaka & Farah, 1993
•Subjects more accurate at identifying whole faces
than parts
Isolated part
Whole face
Tanaka & Farah, 1993
•This part/whole difference did not hold for other
kinds of stimuli: scrambled faces, inverted faces,
houses
So faces are special, then, right?
Perhaps, but examine what faces as a
class of stimuli have in common with
other classes of stimuli
Subordinate level categorization
(We don’t look at a person and say
“face!”)
Expertise
(We are all “face experts”)
Subordinate level categorization and
expertise lead to more holistic
representations of non-face objects
“Greeble experts” show the whole-over-part
advantage that is found for faces
Greebles and several other kinds of
“subordinate” stimuli activate the “fusiform
face area”
What does this have to do with PDP?
Describe functions both of ensembles of
neurons and gross brain areas in terms of
computational principles (e.g., PCA), not
in terms of functional goals.
No accident that most object-selective
cells/regions found are face-selective?
Problem:
How are objects with recognizable parts
represented?
Unclear:
Ensembles of prototypes of parts?
Would these be tantamount to geons?
Resources:
Object recognition:
http://comp9.psych.cornell.edu/faculty/people/Edelman_Shimon.htm
Face specificity:
http://web.mit.edu/afs/athena.mit.edu/org/b/bcs/kanwisher.html
http://www.psy.vanderbilt.edu/faculty/gauthier