A Neurobiological framework for Auditory Images and the

CNBH, PDN, University of Cambridge
Part II: Lent Term 2015 (3 of 4)
Central Auditory Processing
Roy Patterson
Centre for the Neural Basis of Hearing
Department of Physiology, Development and Neuroscience
University of Cambridge
email [email protected]
Lecture slides on CamTools
https://camtools.cam.ac.uk/portal.html
Lecture slides, sounds and background papers on
http://www.pdn.cam.ac.uk/groups/cnbh/teaching/lectures/
Contents/Progress
Act I: the information in communication sounds
(animal calls, speech, musical notes)
Act II: the perception of communication sounds
(the robustness of perception)
Act III: the processing of communication sounds in the
auditory system (signal processing)
Act IV: the processing of communication sounds
(anatomy, physiology, brain imaging)
How does the auditory system process
communication sounds and musical notes
to extract and separate Ss, Sf and the
message?
Visual Images and Auditory Images
[Figure: parallel diagrams of vision and audition. Labels: Light, Retina, LGN, VC, Visual Image; Sound, LL, Auditory Image.]
Stabilized Auditory Images
Illustrations of what the representation of sound
might be like in auditory cortex
- for a word: a-i-u-a
- for an arpeggio: C-E-G-C
In the beginning:
• Fish have been communicating with sound for over 400 million years
• Frogs have been communicating with sound for over 300 million years
• Fish and frogs do not have a cochlea
• The cochlea is a relatively modern invention.
• Developed about 100 Mya
• Developed separately by the reptiles, birds and mammals
So what are fish and frogs listening to?
How did/do they analyse sound?
Evolution of Hearing from the Lateral Line System
[Figure labels: lateral line canal, neuromasts]
Neuromasts and Hair Cells
Primary information is time intervals
[Figure: hair cell, adapted from Manley (1990), Fig. 2.2; hair bundle links, from Nayak et al. (2007)]
The basic element of communication sounds
Pulse: the pulse marks the start of the communication.
Resonance: the resonance provides distinctive information about the shape and structure of resonators in the communicator’s body.
[Figure: amplitude against time]
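The pulse-plus-resonance unit described on this slide can be sketched as a damped sinusoid. This is a minimal illustration, assuming the resonator behaves like a simple second-order system; the function name `pulse_resonance` and the parameter values (1-kHz resonance, 2-ms decay constant, 16-kHz sample rate) are hypothetical and do not come from the lecture.

```python
import math

def pulse_resonance(freq_hz=1000.0, decay_s=0.002, dur_s=0.008, fs=16000):
    """Damped sinusoid: a pulse at t = 0 exciting a single resonance.

    Assumption: the resonator is a simple second-order system, so the
    response is an exponentially decaying cosine.
    """
    n = round(dur_s * fs)
    return [math.exp(-(i / fs) / decay_s) * math.cos(2 * math.pi * freq_hz * i / fs)
            for i in range(n)]

wave = pulse_resonance()  # 8 ms of signal; the largest value is at the pulse
```

The first sample marks the pulse; the spacing and decay of the later peaks carry the information about the resonator.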
Phase locking in hair cells
Isolated transient: initiated by pulse from source, characterised by the resonance
Firing in response to transient is synchronised to wave (phase-locked)
[Figure panels: amplitude of sound wave (pulse and resonance marked), probability of neural firing, number of time intervals; horizontal axes in ms]
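One common way to illustrate phase locking is to treat the firing probability as a half-wave rectified copy of the waveform, so that firing is only likely on one phase of each cycle. The sketch below makes that assumption; `firing_probability` and the toy transient are hypothetical, not the lecture's model.

```python
import math

def firing_probability(wave):
    """Half-wave rectify and normalise the waveform.

    Assumption: firing is possible only on positive half-cycles, which is
    what ties the firing times to the phase of the wave.
    """
    rect = [max(0.0, x) for x in wave]
    peak = max(rect)
    return [x / peak for x in rect] if peak > 0 else rect

fs = 16000
# Hypothetical transient: 1-kHz resonance with a 2-ms decay constant
wave = [math.exp(-(i / fs) / 0.002) * math.sin(2 * math.pi * 1000 * i / fs)
        for i in range(128)]
prob = firing_probability(wave)
```

In this toy case the probability peaks recur once per cycle of the resonance (every 16 samples, i.e. 1 ms), each smaller than the last.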
Information provided by hair cells
Isolated transient: initiated by pulse from source, characterised by the resonance
Information in the neural pattern:
• time intervals re onset time
• peak levels re onset peak
[Figure panels: amplitude of sound wave, probability of neural firing, number of time intervals; horizontal axes in ms]
Auditory time-interval histograms: 1
Isolated transient: initiated by pulse from source, characterised by the resonance
Information in the neural pattern:
• time intervals re onset time
• peak levels re onset peak
Capture and store information in an auditory interval histogram
[Figure panels: amplitude of sound wave and probability of neural firing against time in ms; number of time intervals against time interval in ms]
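The capture step can be sketched as follows, assuming the neural pattern is available as a list of firing probabilities sampled at a fixed rate and that the onset sample is already known. `interval_histogram` is a hypothetical helper written for illustration.

```python
def interval_histogram(prob, onset, n_bins):
    """Store activity by its time interval from the onset (in samples).

    Bin k holds the firing probability k samples after the onset, so the
    histogram keeps both the time intervals and the peak levels re onset.
    """
    hist = [0.0] * n_bins
    for i in range(onset, min(len(prob), onset + n_bins)):
        hist[i - onset] += prob[i]
    return hist

# Toy neural pattern: silence, then an onset peak followed by smaller peaks
prob = [0.0, 0.0, 1.0, 0.0, 0.5, 0.0, 0.25, 0.0]
hist = interval_histogram(prob, onset=2, n_bins=6)
```

For a single transient the histogram is simply the neural pattern re-referenced to the onset, which is why the absolute arrival time of the transient no longer matters.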
Waveform of a child’s /a/
The waveform and spectrum of the vowel /a/
Auditory time-interval histograms: 2
Two transients from the same source at times a and b
A histogram of time intervals, each measured from the most recent pulse in the sound
[Figure panels: amplitude of sound wave and probability of neural firing, with a and b marked; number of time intervals against time interval in ms]
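Measuring each interval from the most recent pulse can be sketched like this, assuming the pulse times are already known (detecting them is a separate problem). `strobed_histogram` is a hypothetical name used only for this example.

```python
def strobed_histogram(prob, pulse_times, n_bins):
    """Add each sample to the bin given by its delay from the most recent pulse."""
    hist = [0.0] * n_bins
    pulses = sorted(pulse_times)
    for i, p in enumerate(prob):
        earlier = [t for t in pulses if t <= i]
        if not earlier:
            continue  # activity before the first pulse is ignored
        interval = i - earlier[-1]
        if interval < n_bins:
            hist[interval] += p
    return hist

# Two identical toy transients starting at samples a = 0 and b = 4
prob = [1.0, 0.0, 0.5, 0.0, 1.0, 0.0, 0.5, 0.0]
hist = strobed_histogram(prob, pulse_times=[0, 4], n_bins=4)
```

Because both transients come from the same source, their contributions fall in the same bins and add, so the histogram has the shape of one transient at twice the height.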
Auditory time-interval histograms: 3
Multi-pulse processing and the auditory image
• Aids recognition of recurring isolated transients
• Produces a stable image of repeating transients
• Enhances signal-to-noise ratio of resonance
[Figure panels: amplitude of sound wave and probability of neural firing against time in ms; number of time intervals against time interval in ms and against log time interval in ms]
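The signal-to-noise benefit of accumulating repeated transients can be illustrated with a small simulation. This is a sketch under a loud assumption: the noise is additive, Gaussian and independent from one repeat to the next, which the lecture does not state. All names and values are hypothetical.

```python
import random

def averaged_image(template, n_repeats, noise_sd, rng):
    """Average n_repeats noisy copies of a transient, aligned on their pulses."""
    total = [0.0] * len(template)
    for _ in range(n_repeats):
        for k, value in enumerate(template):
            total[k] += value + rng.gauss(0.0, noise_sd)
    return [t / n_repeats for t in total]

def rms_error(image, template):
    """Root-mean-square difference between an image and the clean pattern."""
    return (sum((a - b) ** 2 for a, b in zip(image, template)) / len(template)) ** 0.5

rng = random.Random(0)
template = [1.0, 0.0, 0.5, 0.0, 0.25, 0.0, 0.125, 0.0]  # toy resonance pattern
single = averaged_image(template, 1, 0.5, rng)    # one noisy transient
stable = averaged_image(template, 100, 0.5, rng)  # 100 transients accumulated
```

Comparing `rms_error(single, template)` with `rms_error(stable, template)` shows the accumulated image lying much closer to the clean resonance pattern than any single noisy transient does.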
Basilar partition in the mammalian cochlea
[Figure labels: cochlea, basilar membrane, inner hair cell, outer hair cells, hair bundle]
Slide provided by Prof. Andrew King
Communication by streams of pulses (with resonances)
[Figure: amplitude against time in ms (ticks at 8, 16, 24, 32), with pulses and resonances marked]
pitch period = 8 ms
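A stream of pulses with resonances can be sketched by starting one damped sinusoid at every pulse. Only the 8-ms pitch period comes from the slide; the resonance frequency, decay constant, sample rate and the name `pulse_resonance_stream` are illustrative assumptions.

```python
import math

def pulse_resonance_stream(period_ms=8.0, n_pulses=4, freq_hz=1000.0,
                           decay_ms=1.5, fs=16000):
    """Sum of damped sinusoids, one started by each pulse in the stream."""
    period = round(period_ms * fs / 1000)  # pitch period in samples
    n = period * n_pulses
    out = [0.0] * n
    for p in range(n_pulses):
        start = p * period
        for i in range(start, n):
            t = (i - start) / fs
            out[i] += math.exp(-t * 1000 / decay_ms) * math.cos(2 * math.pi * freq_hz * t)
    return out

stream = pulse_resonance_stream()  # pulses at 0, 8, 16 and 24 ms
```

The pulse rate sets the pitch period, while the shape of each resonance is the same after every pulse.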
Basilar membrane motion in response to a vowel
[Figure: repeated pulses and resonances; vertical axis Centre Frequency of Auditory Filter (RBs), horizontal axis Time (ms)]
Patterson (1994a)
Patterson et al. (1995)
http://www.pdn.cam.ac.uk/groups/cnbh/teaching/lectures/Pjasj00.pdf
Anatomy of the Auditory Pathway: 1
Basilar membrane motion
Neural activity pattern in response to the /ae/ in ‘hat’
[Figure: glottal pulses and resonances/formants; vertical axis Centre Frequency of Auditory Filter (RBs), horizontal axis Time (ms)]
Patterson (1994a)
Patterson et al. (1995)
http://www.pdn.cam.ac.uk/groups/cnbh/teaching/lectures/Pjasj00.pdf
Neural activity pattern
STI in three NAP channels
[Figure: the 2.4-kHz, 1.2-kHz and 0.6-kHz channels, each shown with its strobe time]
Stabilised auditory image of the /ae/ in ‘hat’: 1
0-ms time interval in all channels
[Figure: the 2.4-kHz, 1.2-kHz and 0.6-kHz channels along the tonotopic axis of cochlea; horizontal axis time interval, ms]
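The alignment of every channel on a 0-ms time interval can be sketched as below. This is a simplified stand-in for strobed temporal integration: the strobe rule (local maxima above a fixed threshold) and the names `find_strobes` and `stabilised_image` are assumptions made for illustration, not the rule used in AIM.

```python
def find_strobes(nap, threshold):
    """Hypothetical strobe rule: local maxima of the NAP above a fixed threshold."""
    return [i for i in range(1, len(nap) - 1)
            if nap[i] > threshold and nap[i] >= nap[i - 1] and nap[i] > nap[i + 1]]

def stabilised_image(nap_channels, threshold, n_bins):
    """One time-interval histogram per channel, with each strobe placed at 0 ms."""
    image = []
    for nap in nap_channels:
        hist = [0.0] * n_bins
        for s in find_strobes(nap, threshold):
            # Copy the activity that follows the strobe into the histogram
            for k in range(min(n_bins, len(nap) - s)):
                hist[k] += nap[s + k]
        image.append(hist)
    return image

# Toy NAP: two channels whose largest peaks occur at different absolute times
nap = [
    [0.0, 1.0, 0.2, 0.0, 0.0, 1.0, 0.2, 0.0],
    [0.0, 0.0, 1.0, 0.3, 0.0, 0.0, 1.0, 0.3],
]
sai = stabilised_image(nap, threshold=0.5, n_bins=3)
```

Although the peaks arrive at different times in the two channels, each channel strobes on its own peaks, so both histograms have their maximum in the 0-ms bin.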
Stabilised auditory image of the /ae/ in ‘hat’: 2
[Figure: formants along the tonotopic axis of cochlea; glottal pulses and pitch along the time-interval axis (ms)]
Patterson (1994b)
Patterson et al. (1995)
http://www.pdn.cam.ac.uk/groups/cnbh/teaching/lectures/Pjasj00.pdf
Strobed temporal integration
Auditory Image
The final stage of acoustic scale processing
I: Auditory perception is amazingly robust to the scale variability in communication sounds.
II: This is true of fish and frogs, as well as mammals.
III: Moreover, the sounds of fish and frogs change scale (Ss and Sf) with their body temperature, because colder things vibrate more slowly.
IV: This suggests that all animals construct a scale-invariant representation of communication sounds to improve categorization.
V: That is, to know, when a new sound occurs, whether the individual is a different species, or the same species but a different size.
http://www.pdn.cam.ac.uk/groups/cnbh/teaching/lectures/PDIica07.pdf
Stabilised auditory image of the /ae/ in ‘hat’
time interval, ms
Auditory Images with linear time-interval axes are not scale-shift invariant
Time-Interval Axis (carrier frequency runs from low to high)
linear: Pattern is regular for a tuned resonance. Pattern expands as resonance frequency decreases.
log: Pattern is normalised on a log-time-interval axis. Pattern shifts but it does not expand.
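The difference between the two axes follows from log(a·t) = log(a) + log(t): scaling every interval by the same factor becomes a constant shift on a log axis. A small numerical check, assuming a toy resonance with one peak per cycle (the function name and the frequencies are hypothetical):

```python
import math

def peak_intervals_ms(freq_hz, n_peaks=4):
    """Time intervals (ms) from the pulse to successive peaks of a resonance."""
    return [1000.0 * k / freq_hz for k in range(1, n_peaks + 1)]

high = peak_intervals_ms(2000.0)  # peaks every 0.5 ms
low = peak_intervals_ms(1000.0)   # peaks every 1.0 ms

# Linear axis: the spacing between peaks doubles when the frequency halves
linear_spacing = (high[1] - high[0], low[1] - low[0])

# Log axis: every peak moves by the same amount, so the pattern only shifts
log_shift = [math.log2(b) - math.log2(a) for a, b in zip(high, low)]
```

Every entry of `log_shift` is one octave (up to floating-point rounding), whereas the linear spacing changes from 0.5 ms to 1.0 ms.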
Single-channel Auditory Image
Transient:
• initiated by a pulse
• characterised by the resonance
Information in the neural pattern:
• time intervals re onset time
• peak levels re onset peak
Capture, store and normalise information for scale in a log-interval histogram
[Figure panels: amplitude of sound wave and probability of neural firing against time in ms; number of time intervals against log time interval in ms]
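A log-interval histogram can be sketched by binning intervals on a log2 axis. The bin range (0.25 to 32 ms) and resolution (4 bins per octave) are assumed values chosen for the example, and `log_interval_histogram` is a hypothetical helper.

```python
import math

def log_interval_histogram(intervals_ms, weights, lo_ms=0.25, hi_ms=32.0,
                           bins_per_octave=4):
    """Histogram of time intervals on a log2 axis (bin edges are assumed values)."""
    n_bins = round(math.log2(hi_ms / lo_ms) * bins_per_octave)
    hist = [0.0] * n_bins
    for t, w in zip(intervals_ms, weights):
        if lo_ms <= t < hi_ms:
            hist[int(math.log2(t / lo_ms) * bins_per_octave)] += w
    return hist

weights = [1.0, 0.5, 0.25]  # toy peak levels re onset peak
hist_high = log_interval_histogram([0.5, 1.0, 2.0], weights)  # faster resonance
hist_low = log_interval_histogram([1.0, 2.0, 4.0], weights)   # one octave slower
```

The slower version produces the same histogram moved along by four bins (one octave), so a change of scale appears as a shift rather than a stretch.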
The Problem: scaled versions of four vowels
[Figure: four panels combining high or low pitch with a short or long vocal tract]
Patterson, van Dinther and Irino (2007) (ICA, Madrid)
Irino and Patterson (2002)
Patterson, van Dinther and Irino (2007) (ICA, Madrid)
Neural activity patterns for 2-formant vowels
Patterson, van Dinther and Irino (2007) (ICA, Madrid)
Expanded Neural activity patterns for 2-formant vowels
scale-shift covariant Auditory Image
Patterson, van Dinther and Irino (2007) (ICA, Madrid)
{log2cycles, log2scale}
Normalized, Stabilized Auditory Images
Illustrations of what the scale-shift covariant
representation of sound might be like in auditory cortex
- for a word: a-i-u-a
- for an arpeggio: C-E-G-C
Conclusion:
AIM provides a functional model of the neural activity that accompanies the conversion of natural sounds into our perception of those sounds – Auditory Images.
The cochlea performs a frequency analysis with an auditory filterbank, and primary nerve fibers record the times of amplitude peaks.
The IC and/or the MGB creates synchronized time-interval histograms (one for each filter channel). This stabilizes the repeating patterns produced by tonal (periodic) sounds.
If the time-interval dimension is expressed as cycles of the impulse response, the resulting SAI presents a scale-covariant representation of the message, and orthogonal representations of Ss and Sf.
http://www.pdn.cam.ac.uk/groups/cnbh/teaching/lectures/PDIica07.pdf
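Expressing the time-interval dimension in cycles can be sketched by multiplying each interval by the centre frequency of its channel. That reading of "cycles of the impulse response" is an assumption made here for illustration, and `intervals_in_cycles` is a hypothetical helper.

```python
def intervals_in_cycles(intervals_ms, centre_freq_hz):
    """Convert time intervals (ms) to cycles of the channel's centre frequency."""
    return [t * centre_freq_hz / 1000.0 for t in intervals_ms]

# Toy resonance peaks, one per cycle, in a 2-kHz and a 1-kHz channel
fast = intervals_in_cycles([0.5, 1.0, 1.5], 2000.0)
slow = intervals_in_cycles([1.0, 2.0, 3.0], 1000.0)
```

Both channels give peaks at 1, 2 and 3 cycles, so in these units the pattern is the same whatever the scale of the resonance.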
Anatomy of the Auditory Pathway: 1
End of Act III
Thank you
Patterson, R.D., Allerhand, M., and Giguere, C. (1995). "Time-domain modelling of peripheral auditory processing: A modular architecture and a software platform," J. Acoust. Soc. Am. 98, 1890-1894.
http://www.pdn.cam.ac.uk/groups/cnbh/teaching/lectures/PAG95.pdf
Patterson, R.D. (2000). "Auditory images: How complex sounds are represented in the auditory system," J. Acoust. Soc. Japan (E) 21(4), 183-190.
http://www.pdn.cam.ac.uk/groups/cnbh/teaching/lectures/Pjasj00.pdf
Patterson, R.D., van Dinther, R., and Irino, T. (2007). "The robustness of bioacoustic communication and the role of normalization," Proc. 19th International Congress on Acoustics, Madrid, Sept, ppa-07-011.
http://www.pdn.cam.ac.uk/groups/cnbh/teaching/lectures/PDIica07.pdf