MScComplexSoundsV5


Hearing Complex Sounds

MSc Neuroscience, Dr. Jan Schnupp

Sound Signals

Many physical objects emit sounds when they are “excited” (e.g. hit or rubbed). Sounds are just pressure waves rippling through the air, but they carry a lot of information about the objects that emitted them.

(Example: what are these two objects? Which one is heavier, object A or object B?)

The sound (or signal) emitted by an object (or system) when hit is known as the impulse response.

Impulse responses of everyday objects can be quite complex, but the sine wave is a fundamental ingredient of these (or any) complex sounds (or signals).

Vibrations of a Spring Mass System

Undamped

1. $F = -k \cdot y$ (Hooke’s Law)

2. $F = m \cdot a$ (Newton’s 2nd)

3. $a = dv/dt = d^2y/dt^2$

$\Rightarrow -k \cdot y = m \cdot d^2y/dt^2$

$\Rightarrow y(t) = y_0 \cdot \cos(t \cdot \sqrt{k/m})$

Damped

4. $-k \cdot y - r \cdot dy/dt = m \cdot d^2y/dt^2$

$\Rightarrow y(t) = y_0 \cdot e^{-r t/(2m)} \cdot \cos\left(t \cdot \sqrt{k/m - (r/(2m))^2}\right)$

Don’t worry about the formulae! Just remember that mass-spring systems like to vibrate at a rate proportional to the square root of their “stiffness” and inversely proportional to the square root of their mass.
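As a rough numerical illustration of the take-home message above, the following sketch evaluates the vibration frequency of a mass-spring system. The function names and the example values for k, m and r are hypothetical and chosen only for illustration; the formulas are the ones given for the undamped and damped cases.

```python
import math

def natural_frequency_hz(k, m):
    """Undamped vibration frequency in Hz for stiffness k (N/m) and mass m (kg)."""
    return math.sqrt(k / m) / (2 * math.pi)

def damped_frequency_hz(k, m, r):
    """Vibration frequency in Hz with damping constant r (N*s/m).

    Only defined for under-damped systems, where k/m > (r/2m)^2.
    """
    radicand = k / m - (r / (2 * m)) ** 2
    if radicand <= 0:
        raise ValueError("system is not under-damped, so it does not oscillate")
    return math.sqrt(radicand) / (2 * math.pi)

def displacement(t, y0, k, m, r=0.0):
    """y(t) = y0 * exp(-r*t/(2m)) * cos(t * sqrt(k/m - (r/(2m))^2))."""
    omega = 2 * math.pi * damped_frequency_hz(k, m, r)
    return y0 * math.exp(-r * t / (2 * m)) * math.cos(omega * t)

# Hypothetical example values
f_reference = natural_frequency_hz(k=100.0, m=1.0)
f_stiffer = natural_frequency_hz(k=400.0, m=1.0)   # 4x the stiffness
f_heavier = natural_frequency_hz(k=100.0, m=4.0)   # 4x the mass
f_damped = damped_frequency_hz(k=100.0, m=1.0, r=2.0)

print(f_reference, f_stiffer, f_heavier, f_damped)
```

Quadrupling the stiffness doubles the frequency, quadrupling the mass halves it, and damping lowers the frequency slightly while the amplitude decays over time.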

https://mustelid.physiol.ox.ac.uk/drupal/?q=acoustics/simple_harmonic_motion

Resonant Cavities

In resonant cavities, “lumps of air” at the entrance/exit of the cavity oscillate under the elastic forces exerted by the air inside the cavity.

The preferred resonance frequency is inversely proportional to the square root of the volume. (Large resonators => deeper sounds).
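The slide does not give a formula, but the volume dependence can be illustrated with the standard textbook expression for an idealised Helmholtz resonator, f = (c / 2π) · sqrt(A / (V · L)). This is a sketch under assumptions: the formula ignores end corrections for the neck, and the bottle-like dimensions below are hypothetical.

```python
import math

SPEED_OF_SOUND_M_S = 343.0  # approximate value for air at room temperature

def helmholtz_frequency_hz(volume_m3, neck_area_m2, neck_length_m):
    """Resonance frequency of an idealised Helmholtz resonator (no end correction)."""
    return (SPEED_OF_SOUND_M_S / (2 * math.pi)) * math.sqrt(
        neck_area_m2 / (volume_m3 * neck_length_m)
    )

# Hypothetical dimensions: same neck, cavity volume of 1 litre vs. 4 litres
small = helmholtz_frequency_hz(volume_m3=0.001, neck_area_m2=0.0003, neck_length_m=0.05)
large = helmholtz_frequency_hz(volume_m3=0.004, neck_area_m2=0.0003, neck_length_m=0.05)

print(small, large, large / small)
```

With the neck unchanged, a fourfold increase in volume halves the resonance frequency, which is the inverse square-root relationship stated above.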

The Ear

Figure labels: Organ of Corti; cochlea “unrolled” and sectioned.

Modes of Vibration and Harmonic Complex Tones

An “ideal” string would maintain the triangular shape set up when the string is plucked throughout the oscillation.

Fourier analysis approximates such vibrations as a sum (series) of sine waves. For large numbers of sine waves (“Fourier components”) the approximation becomes very good, for an infinite number of components it is exact.

Natural sounds are often made up of sine components that are harmonically related, i.e. their frequencies are all integer multiples of a “fundamental frequency”.

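$y = \sin(x) \pm \tfrac{1}{4}\sin(3x) \pm \tfrac{1}{8}\sin(5x) \pm \tfrac{1}{16}\sin(7x) \pm \tfrac{1}{32}\sin(9x) \pm \tfrac{1}{64}\sin(11x) \pm \dots$

(The signs of the individual terms, and any frequency scaling of x, are not legible in the slide.)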

https://mustelid.physiol.ox.ac.uk/drupal/?q=acoustics/modes_of_vibration
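To see how a sum of harmonically related sine waves converges on a triangular shape, the sketch below uses the standard textbook Fourier series of a symmetric triangle wave (odd harmonics with amplitudes falling as 1/n² and alternating signs). This is an assumption made for illustration: the coefficients differ from those shown on the slide, and a real plucked string need not be plucked at its midpoint.

```python
import math

def triangle(x):
    """Ideal triangle wave with period 2*pi and peak amplitude 1."""
    return (2.0 / math.pi) * math.asin(math.sin(x))

def fourier_triangle(x, n_components):
    """Sum of the first n_components odd harmonics of the triangle wave."""
    total = 0.0
    for k in range(n_components):
        harmonic = 2 * k + 1                 # 1, 3, 5, ... (odd multiples only)
        sign = -1.0 if k % 2 else 1.0        # signs alternate from term to term
        total += sign * math.sin(harmonic * x) / harmonic ** 2
    return (8.0 / math.pi ** 2) * total

def max_error(n_components, n_points=1000):
    """Largest deviation between the partial sum and the ideal triangle wave."""
    xs = [2 * math.pi * i / n_points for i in range(n_points)]
    return max(abs(triangle(x) - fourier_triangle(x, n_components)) for x in xs)

errors = [max_error(n) for n in (1, 3, 10, 50)]
print(errors)
```

The maximum error shrinks as more Fourier components are added, in line with the statement that the approximation becomes exact for an infinite number of components.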

Click Trains & the “30 Hz Transition”

At click rates up to ca. 30 Hz, each click in a click train is perceived as an isolated event.

At click rates above ca. 30 Hz, individual clicks fuse, and one perceives a continuous hum with a strong pitch.

https://mustelid.physiol.ox.ac.uk/drupal/?q=pitch/click_train

The Impulse (or “Click”)

The “ideal click”, or impulse, is an infinitesimally short signal. The Fourier Transform encourages us to think of this click as an infinite series of sine waves, which have started at the beginning of time, continue until the end of time, and all just happen to pile up at the one moment when the click occurs.
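A discrete-time analogue of this idea can be checked numerically. The sketch below, which uses a plain (slow but dependency-free) discrete Fourier transform, shows that a single-sample click contains every frequency component at equal strength. The signal length and click position are arbitrary choices for illustration.

```python
import cmath

def dft_magnitudes(signal):
    """Magnitude of each discrete Fourier component of a signal."""
    n = len(signal)
    magnitudes = []
    for k in range(n):
        component = sum(
            x * cmath.exp(-2j * cmath.pi * k * t / n) for t, x in enumerate(signal)
        )
        magnitudes.append(abs(component))
    return magnitudes

click = [0.0] * 64
click[10] = 1.0  # a single-sample "click" at an arbitrary moment

click_spectrum = dft_magnitudes(click)
print(min(click_spectrum), max(click_spectrum))
```

All components have the same magnitude; only their phases differ, and those phases are what makes the sine waves "pile up" at the moment of the click and cancel everywhere else.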

Basilar Membrane Response to Clicks

http://auditoryneuroscience.com/ear/bm_motion_2

Why Click Trains Have Pitch

If we represent each click in a train by its Fourier Transform, then it becomes clear that certain sine components will add (top) while others will cancel (bottom). This results in a strong harmonic structure.
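The adding and cancelling of components can be made concrete with a short sketch. It builds a regular click train and lists the frequencies at which it has energy, using a plain discrete Fourier transform. The sampling rate, duration and click rate are hypothetical values picked so that the numbers come out round.

```python
import cmath

SAMPLE_RATE_HZ = 2400.0   # hypothetical sampling rate
N_SAMPLES = 240           # 0.1 s of signal
CLICK_INTERVAL = 24       # samples between clicks, i.e. 100 clicks per second

click_train = [1.0 if t % CLICK_INTERVAL == 0 else 0.0 for t in range(N_SAMPLES)]

def dft_magnitude(signal, k):
    """Magnitude of the k-th discrete Fourier component of a signal."""
    n = len(signal)
    return abs(sum(x * cmath.exp(-2j * cmath.pi * k * t / n)
                   for t, x in enumerate(signal)))

# Frequencies above 0 Hz and up to the Nyquist limit with non-negligible energy
harmonic_freqs = [
    k * SAMPLE_RATE_HZ / N_SAMPLES
    for k in range(1, N_SAMPLES // 2 + 1)
    if dft_magnitude(click_train, k) > 1e-6
]
print(harmonic_freqs)
```

Energy survives only at integer multiples of the click rate (100 Hz here); at all other frequencies the contributions of the individual clicks cancel. This is the harmonic structure referred to above.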

Basilar Membrane Response to Click Trains

auditoryneuroscience.com | The Ear

AN Phase Locking to Artificial “Single Formant” Vowel Sounds

Figure labels: phase locking to modulator (envelope); phase locking to carrier. AN recordings: Cariani & Delgutte.

https://mustelid.physiol.ox.ac.uk/drupal/?q=ear/bm_motion_3

Vocal Folds in Action

auditoryneuroscience.com | Vocalizations

Articulation

Articulators (lips, tongue, jaw, soft palate) move to change resonance properties of the vocal tract.

auditoryneuroscience.com | Vocalizations

Harmonics & formants of a vowel

Figure labels: formants; harmonics.

Formants Arise From Resonances in the Vocal Tract

Speakers change formant frequencies by making resonance cavities in their mouth, nose and throat larger or smaller.

The “Neurogram”

As a crude approximation, one might say that it is the job of the ear to produce a spectrogram of the incoming sounds, and that the brain interprets the spectrogram to identify sounds. This figure shows histograms of auditory nerve fibre discharges in response to a speech stimulus. Discharge rates depend on the amount of sound energy near the neuron’s characteristic frequency.

Phase Locking

The discharges of cochlear nerve fibres to low frequency sounds are not random; they occur at particular times (phase locking).

https://mustelid.physiol.ox.ac.uk/drupal/?q=ear/phase_locking

Evans (1975)

Quantifying Phase-Locking

Phase locking is usually measured as a “vector strength”, also known as “synchronicity index” (SI). To calculate this, a Period Histogram of the neural response is normalized, and each bin of the histogram is thought of as a vector.

The SI is given by the length of the vector sum for all bins.
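A minimal sketch of the vector strength calculation follows. Instead of building a period histogram first, it treats every spike as a unit vector pointing at its stimulus phase and takes the length of the mean vector, which is what the histogram method approaches as the bins become very narrow. The function name and the two artificial spike trains are hypothetical.

```python
import cmath
import math

def vector_strength(spike_times_s, stimulus_freq_hz):
    """Length of the mean unit vector of spike phases (0 = no locking, 1 = perfect)."""
    if not spike_times_s:
        raise ValueError("need at least one spike")
    vector_sum = sum(
        cmath.exp(2j * math.pi * stimulus_freq_hz * t) for t in spike_times_s
    )
    return abs(vector_sum) / len(spike_times_s)

FREQ_HZ = 500.0
period = 1.0 / FREQ_HZ

# Artificial data: every spike at the same stimulus phase ...
locked = [n * period + 0.0003 for n in range(100)]
# ... versus spikes spread evenly over four phases of the cycle
spread = [(n + phase) * period for n in range(25) for phase in (0.0, 0.25, 0.5, 0.75)]

vs_locked = vector_strength(locked, FREQ_HZ)
vs_spread = vector_strength(spread, FREQ_HZ)
print(vs_locked, vs_spread)
```

Perfectly phase-locked spikes give a value close to 1, while spikes distributed evenly around the cycle give a value close to 0. Real auditory nerve data fall somewhere in between.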

Phase Locking Limits

Figure panels: hair cell receptor potentials; AN fibre responses. Axis labels: 1000 Hz, 2000 Hz, 3000 Hz, 4000 Hz; frequency (kHz).

AN fibres are unable to phase lock to signal fluctuations at rates faster than 3 kHz.

This is due to low-pass filtering in the hair cell receptors.
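How low-pass filtering removes fast fluctuations can be illustrated with the gain of a simple first-order low-pass filter. This is only a sketch: the first-order filter is a simplification of the hair cell membrane, and the corner frequency used below is a hypothetical value, not a measured one.

```python
import math

def lowpass_gain(freq_hz, cutoff_hz):
    """Amplitude gain of a first-order low-pass filter at freq_hz."""
    return 1.0 / math.sqrt(1.0 + (freq_hz / cutoff_hz) ** 2)

CUTOFF_HZ = 1000.0  # hypothetical corner frequency, for illustration only

gains = {f: lowpass_gain(f, CUTOFF_HZ) for f in (250.0, 1000.0, 2000.0, 4000.0)}
for freq, gain in gains.items():
    print(f"{freq:6.0f} Hz: fluctuation passed at {gain:.2f} of its amplitude")
```

The faster the fluctuation, the smaller the part of it that survives in the filtered signal, leaving progressively less for the nerve fibre to lock onto.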

Frequency Coding

The Place Theory stipulates that frequencies are encoded by activity across the tonotopic array of fibers in the AN, as well as in tonotopic nuclei along the lemniscal auditory pathway.

The Timing Theory posits that temporal information conveyed through phase locking provides the dominant cue to frequency information.

Neither the Place nor the Timing Theory can account for all psychophysical data.

The Pitch of “Complex” Sounds (Examples)

Figure panels (frequency against time): pure tone; click trains; AM tones; iterated rippled (comb filtered) noise.

The Periodicity of a Signal is the Major Determinant of its Pitch

Iterated rippled noise can be made more or less periodic by increasing or decreasing the number of iterations. The less periodic the signal, the weaker the pitch.
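The effect of the number of iterations can be sketched as follows. The code assumes one common way of making iterated rippled noise, namely repeatedly adding a delayed copy of the signal to itself, and measures periodicity as the normalised autocorrelation at the delay. The function names, the delay of 50 samples and the noise length are hypothetical choices.

```python
import random

def iterated_rippled_noise(n_samples, delay, n_iterations, seed=0):
    """Gaussian noise passed n_iterations times through a delay-and-add stage."""
    rng = random.Random(seed)
    signal = [rng.gauss(0.0, 1.0) for _ in range(n_samples)]
    for _ in range(n_iterations):
        signal = [
            signal[t] + (signal[t - delay] if t >= delay else 0.0)
            for t in range(n_samples)
        ]
    return signal

def periodicity_strength(signal, lag):
    """Normalised autocorrelation at the given lag (near 0 = aperiodic, 1 = periodic)."""
    energy = sum(x * x for x in signal)
    cross = sum(signal[t] * signal[t + lag] for t in range(len(signal) - lag))
    return cross / energy

DELAY = 50  # samples; sets the period of the resulting sound
ITERATIONS = (0, 1, 4, 16)

strengths = [
    periodicity_strength(iterated_rippled_noise(20000, DELAY, n), DELAY)
    for n in ITERATIONS
]
print(list(zip(ITERATIONS, strengths)))
```

Plain noise (zero iterations) shows essentially no correlation at the delay, and the correlation grows towards 1 as iterations are added. This mirrors the growing pitch strength described above.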

Sinusoidally Amplitude Modulated (SAM) Tones

Example with carrier frequency c = 5000 Hz, modulator frequency m = 400 Hz.

Make a sinusoidally modulated tone by multiplying the carrier with the modulator:

$\sin(2\pi c t) \cdot (0.5 \cdot \cos(2\pi m t) + 0.5)$

or by adding sine components of frequencies c, c + m and c - m:

$0.5 \cdot \sin(2\pi c t) + 0.25 \cdot \sin(2\pi (c+m) t) + 0.25 \cdot \sin(2\pi (c-m) t)$

(!) The SAM tone has no energy at the modulation frequency.

Nevertheless the modulator influences the perceived pitch.
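Both claims can be checked numerically: that the product and the sum describe the same waveform, and that there is no energy at the modulation frequency. The sketch assumes a cosine-phase modulator, for which the sum of three sine components is exactly equal to the product. The sampling rate and duration are hypothetical and chosen so that all components fall on exact analysis frequencies.

```python
import cmath
import math

SAMPLE_RATE_HZ = 20000.0   # hypothetical sampling rate
N_SAMPLES = 200            # 10 ms, so analysis frequencies are 100 Hz apart
CARRIER_HZ = 5000.0
MODULATOR_HZ = 400.0

times = [n / SAMPLE_RATE_HZ for n in range(N_SAMPLES)]

def sam_product(t, c=CARRIER_HZ, m=MODULATOR_HZ):
    """Carrier multiplied by a raised cosine modulator."""
    return math.sin(2 * math.pi * c * t) * (0.5 * math.cos(2 * math.pi * m * t) + 0.5)

def sam_sum(t, c=CARRIER_HZ, m=MODULATOR_HZ):
    """The same tone built from three sine components at c, c+m and c-m."""
    return (0.5 * math.sin(2 * math.pi * c * t)
            + 0.25 * math.sin(2 * math.pi * (c + m) * t)
            + 0.25 * math.sin(2 * math.pi * (c - m) * t))

max_difference = max(abs(sam_product(t) - sam_sum(t)) for t in times)

def dft_magnitude(signal, k):
    """Magnitude of the k-th discrete Fourier component of a signal."""
    n = len(signal)
    return abs(sum(x * cmath.exp(-2j * cmath.pi * k * t / n)
                   for t, x in enumerate(signal)))

signal = [sam_product(t) for t in times]
bin_width_hz = SAMPLE_RATE_HZ / N_SAMPLES
spectrum = {k * bin_width_hz: dft_magnitude(signal, k)
            for k in range(N_SAMPLES // 2 + 1)}
peaks = sorted(freq for freq, mag in spectrum.items() if mag > 1e-6)

print(max_difference, peaks, spectrum[MODULATOR_HZ])
```

The spectrum contains only the carrier and the two sidebands at 4600 Hz and 5400 Hz. The component at 400 Hz is zero up to rounding error, even though the envelope of the waveform repeats 400 times per second.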

AN Phase Locking to SAM Tones

Figure labels: spontaneous rate (high, medium, low); sound level (high, low).

Medium SR fibers phase lock better to amplitude modulation than high or low SR fibers.

The ability to phase lock to amplitude modulation declines with high sound levels.

Cell Types of the Cochlear Nucleus

Figure labels: subdivisions DCN, AVCN, PVCN; cell types spherical bushy, globular bushy, stellate, fusiform, octopus.

CN Phase Locking to Amplitude Modulations

Certain cell types in the CN (including Onset and some Chopper types) exhibit much stronger phase locking to AM than is seen in their AN inputs.

Encoding of Envelope Modulations in the Midbrain

Neurons in the midbrain or above show much less phase locking to AM than neurons in the brainstem.

Transition from a timing to a rate code.

Some neurons have bandpass MTFs (modulation transfer functions) and exhibit “best modulation frequencies” (BMFs).

Topographic maps of BMF may exist within isofrequency laminae of the ICc (“periodotopy”).

Periodotopic maps via fMRI

Baumann et al Nat Neurosci 2011 described periodotopic maps in monkey IC obtained with fMRI.

They used stimuli from 0.5 Hz (infra-pitch) to 512 Hz (mid-range pitch).

Their sample size is quite small (3 animals: false positive?).

The observed orientation of their periodotopic map (medio-dorsal to latero-ventral for high to low) appears to differ from that described by Schreiner & Langner (1988) in the cat (predominantly caudal to rostral).

Proposed Periodotopy in Gerbil A1

SAM tones should only activate the high frequency parts of the tonotopically organized A1.

However, activity (presumed to correspond to the low pitch of these signals) is also seen in the low frequency parts of A1. This activity is thought to be organized in a concentric periodotopic map.

However... Periodotopy inconsistent in ferret cortex

Figure panels: animal 1, animal 2; stimuli: SAM tones, hp clicks, hp IRN.

Nelken, Bizley, Nodal, Ahmed, Schnupp, King (2008) J. Neurophysiol 99(4)

A pitch area in primates?

In marmosets, pitch-sensitive neurons are most commonly found on the boundary between fields A1 and R.

Fig 2 of Bendor & Wang, Nature 2005

A pitch sensitive neuron in marmoset A1?

Apparently pitch sensitive neurons in marmoset A1.

Fig 1 of Bendor & Wang, Nature 2005

Mapping cortical sensitivity to sound features

Figure axis values: 200, 336, 565, 951; -45°, -15°, 15°, 45°; timbre: /ɑ/, /ɛ/, /u/, /i/.

Bizley et al., J Neurosci, 2009

Responses to Artificial Vowels

Bizley et al J Neurosci 2009

Joint Sensitivity to Formants and Pitch

Figure axis: vowel type (timbre).

Bizley, Walker, Silverman, King & Schnupp - J Neurosci 2009

Mapping cortical sensitivity to sound features

Figure labels: timbre; neural sensitivity.

Nelken et al., J Neurophys, 2004; Bizley et al., J Neurosci, 2009

Summary

Periodic signals (click trains, harmonic complexes) with repetition rates between ca. 30 and 3000 Hz tend to have a strong pitch. Aperiodic signals (noises, isolated clicks) have “weak” pitches or no pitch at all.

Speech sounds include periodic (vowels, voiced consonants) and aperiodic (unvoiced consonants) sounds.

There are competing place and timing (temporal) theories of pitch. Place theories depend on the representation of spectral peaks in tonotopic parts of the auditory pathway. Some important features (e.g. formants of speech sounds) are represented tonotopically, but place cannot represent harmonic structure well.

Timing theories postulate that early stages of the auditory system measure spike intervals in phase-locked discharges.

In midbrain and cortex, pitch appears to be represented through rate codes. Whether there are “periodotopic maps” or specialized pitch areas in the central auditory system is highly controversial.