Modeling Auditory Localization of Subwoofer Signals in

Applied Psychoacoustics
Lecture 3: Masking
Jonas Braasch
From ATH to masked detection thresholds
(Fig.: tonmeister.ca, after Zwicker & Fastl 1999)
So far we have measured the absolute threshold of hearing (ATH) throughout
the auditory frequency range for sinusoids.
Now we would like to investigate how we detect sounds if other sounds are present as
well.
Preliminary thoughts
• We have seen that a sine sound excites not only the
part of the basilar membrane that corresponds to its
frequency but also parts that correspond to other
frequencies (traveling waves).
• The traveling wave moves from the base (high freqs.) to
the apex (low freqs.) and declines after passing the
resonance frequency.
• Therefore, we expect that a sound at a given frequency
also affects the detection of a sound at another
frequency.
• We will utilize this effect to determine the shape of
auditory filters.
Method
We now re-measure the threshold of hearing, but
in this case we present two sinusoids to the
listeners.
• One, the masker, is fixed in frequency (1 kHz)
and level (60 dB SPL).
• The second one, the target, is varied in level as
before.
• We measure at various frequencies the
minimum sound pressure level at which the
second tone is detected.
Absolute Threshold of Hearing
(Fig.: tonmeister.ca, after Zwicker & Fastl 1999)
Our absolute threshold of hearing (ATH) for a single tone now changes to …
Masked Detection Threshold
(Fig.: tonmeister.ca, after Zwicker & Fastl 1999. Annotations: steep slope for low frequencies, shallow slope for high frequencies.)
… to this one.
We now speak of the masked detection threshold, with the sinusoid at 1 kHz, 60 dB
SPL, being the masker. Note that we find a steep slope just below 1 kHz, because in
this range the basilar membrane is not much affected by the masker, while the slope is
shallow for frequencies just above 1 kHz.
Masked Detection Threshold
(Fig.: tonmeister.ca, after Zwicker & Fastl 1999)
Of course, the masked detection threshold depends on the characteristics of the
masker. In this graph, several threshold curves are shown for various masker levels
(20-100 dB SPL). 100 dB SPL is a very high value. I recommend NOT going above 85
dB SPL if you want to repeat this measurement at home.
Auditory Filters
• Fletcher (1940) postulated that the
auditory system behaves like a bank of
band-pass filters with overlapping
passbands.
• Helmholtz (1865) already had similar
ideas.
• Auditory filters can be measured in
amplitude and phase as functions of
frequency.
Measurement of auditory filters
• We will first restrict ourselves to the amplitude
of auditory filters.
• We can use a paradigm similar to the one in our
previous experiment to measure auditory filters:
– Present two sinusoids with the same level and the same
frequency to the listener. The level is adjusted to just
above threshold.
– Next, vary the frequencies of the two sinusoids in
opposite directions.
– The two sinusoids become inaudible at the point
where they no longer fall into one auditory filter.
In this case, the energy within each of the
two auditory filters becomes too small to be detected.
Critical Bands: Zwicker (1961)
(Fig.: Terhardt 1998)
• Zwicker (1961) measured the critical bandwidth using two narrow-band
maskers which masked a sine target at the center of the critical band.
• Zwicker recorded the detection threshold of the sine tone (varied in level)
as a function of the frequency gap between both maskers.
• Unfortunately, the interaction between the lower-frequency noise masker
and the sine target produces combination tones at different frequencies.
These become audible while the signal itself remains undetected.
• This leads to the abrupt decrease in the detection threshold at 0.3 kHz.
Zwicker’s critical bands
Linear range contradicts new findings
Zwicker’s critical band rate
(Fig. annotations: errors between measured and predicted values, from the fitted equations; previous slide.)
The graph shows the critical band rate in Bark as a function of frequency f. The
equations were established to fit the data.
Zwicker’s critical bandwidth
(Fig. annotations: errors between measured and predicted values, from the fitted equation; previous slide.)
The graph shows the critical bandwidth in Hz as a function of frequency f. The
equation was established to fit the data.
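The fitted equations themselves are not reproduced in this transcript. As a minimal sketch, assuming they are the widely used analytic approximations of Zwicker and Terhardt (1980), the critical band rate is z = 13 arctan(0.76 f) + 3.5 arctan((f/7.5)^2) in Bark and the critical bandwidth is 25 + 75 (1 + 1.4 f^2)^0.69 in Hz, with f in kHz. The function names are illustrative only.

```python
import math

def critical_band_rate_bark(f_khz):
    """Critical band rate z in Bark (assumed Zwicker & Terhardt 1980 fit)."""
    return 13.0 * math.atan(0.76 * f_khz) + 3.5 * math.atan((f_khz / 7.5) ** 2)

def critical_bandwidth_hz(f_khz):
    """Critical bandwidth in Hz (assumed Zwicker & Terhardt 1980 fit)."""
    return 25.0 + 75.0 * (1.0 + 1.4 * f_khz ** 2) ** 0.69

# Tabulate both quantities at a few frequencies (f in kHz).
for f_khz in (0.1, 0.5, 1.0, 4.0, 10.0):
    z = critical_band_rate_bark(f_khz)
    cb = critical_bandwidth_hz(f_khz)
    print(f"{f_khz:5.1f} kHz: z = {z:5.2f} Bark, bandwidth = {cb:7.1f} Hz")
```

With these fits the bandwidth stays close to 100 Hz at low frequencies, which is the linear range mentioned on the earlier slide.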
Patterson ‘74: Measurement method
(Fig.: from Patterson 1974. Shaded area: part of the noise that is effectively masking the test tone.)
Patterson (1974) used a broadband noise masker to prevent
harmonicity from influencing the results.
Patterson ‘74: Measurement method
Measuring the ATH one more time
(Fig.: step function. Number of correct responses [%] over sound pressure level, jumping from 0% to 100% at the hearing threshold/masked threshold.)
… would be nice, but in reality …
(Fig.: psychometric function. Probability of a correct response over log-normalized stimulus intensity (e.g., sound pressure level), with the 50% threshold and the 75% threshold marked.)
… it looks like this.
In signal detection theory, we explain this variation with internal noise
in the central nervous system.
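The gradual transition can be sketched with a psychometric function. The example below assumes Gaussian internal noise, so that the probability of a correct response follows a cumulative Gaussian of the stimulus level. The listener parameters (mean 20 dB, spread 3 dB) are hypothetical and chosen only for illustration.

```python
import math

def p_detect(level_db, mu_db, sigma_db):
    """Cumulative-Gaussian psychometric function (assumed shape)."""
    return 0.5 * (1.0 + math.erf((level_db - mu_db) / (sigma_db * math.sqrt(2.0))))

def level_at(p_target, mu_db, sigma_db, lo=-100.0, hi=200.0):
    """Level at which p_detect reaches p_target, found by bisection."""
    for _ in range(100):
        mid = 0.5 * (lo + hi)
        if p_detect(mid, mu_db, sigma_db) < p_target:
            lo = mid
        else:
            hi = mid
    return 0.5 * (lo + hi)

MU_DB, SIGMA_DB = 20.0, 3.0  # hypothetical listener, illustration only

thr50 = level_at(0.50, MU_DB, SIGMA_DB)
thr75 = level_at(0.75, MU_DB, SIGMA_DB)
print(f"50% threshold: {thr50:.2f} dB, 75% threshold: {thr75:.2f} dB")
```

The 75% threshold lies about 0.67 standard deviations above the 50% threshold, so the reported threshold depends on the criterion as well as on the listener.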
Patterson 74: Results
Off-Frequency Listening
(Fig.: two panels showing the tone, the masker, and the auditory filter, one for on-frequency listening and one for off-frequency listening.)
By placing the center frequency of the auditory filter above the test-tone frequency, the signal-to-noise ratio between the tone and the masker
can be increased. This makes the test tone easier to detect.
Avoiding off-Frequency Listening
(Fig.: two panels showing the tone, the auditory filter, the masker, and a second masker, each masker separated from the tone by Δf, one panel for on-frequency listening and one for off-frequency listening.)
In this two-masker case, off-frequency listening no longer pays off.
Shifting the auditory filter reduces the influence of one masker
but increases the influence of the second masker. Overall, the signal-to-noise ratio decreases. In this experiment, it is assumed that the
auditory filter is symmetrical, which is a good-enough approximation.
Noise Gap Masking
(Fig.: Moore 2004)
To avoid off-frequency listening, Patterson (1976) measured the threshold
of the sinusoidal signal as a function of the width of the spectral notch in
the noise masker. The shaded areas show the amount of noise passing
through the auditory filter.
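The link between notch width and threshold can be sketched numerically. This example is an illustration under stated assumptions, not Patterson's procedure: the filter is taken to have a symmetric roex(p) shape W(g) = (1 + p g) exp(-p g) with g = |f - fc| / fc, the noise has a flat spectrum on both sides of the notch, and the threshold is assumed proportional to the noise power passing the filter. The bandwidth is set through ERBN = 24.7 (4.37 f + 1) Hz (f in kHz), and all function names are hypothetical.

```python
import math

def erb_n_hz(fc_hz):
    """ERB_N in Hz after Glasberg and Moore (1990); fc_hz in Hz."""
    return 24.7 * (4.37 * fc_hz / 1000.0 + 1.0)

def roex_weight(g, p):
    """Assumed roex(p) filter weight at normalized frequency distance g."""
    return (1.0 + p * g) * math.exp(-p * g)

def noise_passed(g_notch, p, g_max=0.8, steps=20000):
    """Noise power passed by the filter from both noise bands (midpoint rule).
    Each band runs from the notch edge g_notch out to g_max."""
    dg = (g_max - g_notch) / steps
    one_band = sum(roex_weight(g_notch + (i + 0.5) * dg, p) for i in range(steps)) * dg
    return 2.0 * one_band

def noise_passed_closed_form(g_notch, p):
    """Same quantity for infinitely wide bands: 2 (2 + p g) exp(-p g) / p."""
    return 2.0 * (2.0 + p * g_notch) * math.exp(-p * g_notch) / p

FC_HZ = 1000.0
P = 4.0 * FC_HZ / erb_n_hz(FC_HZ)  # a roex(p) filter has an ERB of 4 fc / p

reference = noise_passed(0.0, P)
for g_notch in (0.0, 0.1, 0.2, 0.3, 0.4):
    change_db = 10.0 * math.log10(noise_passed(g_notch, P) / reference)
    print(f"notch half-width {g_notch * FC_HZ:4.0f} Hz: "
          f"predicted threshold change {change_db:6.1f} dB")
```

Read the other way around, the slope of the measured threshold curve over notch width gives the filter weight at the notch edge, which is how the filter shape is derived from such data.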
Auditory filter shape
(Fig.: Moore 2004)
Typical shape of an auditory filter as measured by Patterson (1976). The
center frequency is 1 kHz.
Auditory Filter non-linearity
(Fig. curve labels: 30 dB and 80 dB)
This graph shows the non-linearity of the auditory filter. In the left graph the filter curves
are normalized to 0 dB. The filters were measured for several 2-kHz sine tones from 30
to 80 dB. Note how the filter broadens toward low frequencies with increasing level. In
the right graph the filters were not normalized.
(Fig.: Moore 2004)
Auditory Filter Bandwidths
(Fig.: Moore 2004)
Width of auditory filters measured with different techniques. The dashed curve shows
the values of Zwicker (1961), the solid line the ERBN values, which were measured
using Patterson’s (1976) notched-noise method.
Note the large deviations of Zwicker’s results at low frequencies, which are based on
indirect measures.
ERB calculations
$\mathrm{ERB}_N = 24.7\,(4.37 f + 1)$
ERBN in Hz, f=frequency in kHz
$\mathrm{ERB}_N\text{ number} = 21.4 \log_{10}(4.37 f + 1)$
ERBN number in units of ERBN (Cams), f=frequency in kHz
Glasberg and Moore (1990)
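A short sketch that evaluates both Glasberg and Moore (1990) expressions, the bandwidth ERBN = 24.7 (4.37 f + 1) in Hz and the ERBN number 21.4 log10(4.37 f + 1), with f in kHz. The function names are illustrative.

```python
import math

def erb_n_hz(f_khz):
    """Equivalent rectangular bandwidth in Hz; f_khz in kHz."""
    return 24.7 * (4.37 * f_khz + 1.0)

def erb_n_number(f_khz):
    """ERB_N number: position on the ERB_N scale, in units of ERB_N; f_khz in kHz."""
    return 21.4 * math.log10(4.37 * f_khz + 1.0)

for f_khz in (0.1, 0.5, 1.0, 2.0, 4.0, 8.0):
    print(f"{f_khz:4.1f} kHz: ERB_N = {erb_n_hz(f_khz):6.1f} Hz, "
          f"ERB_N number = {erb_n_number(f_khz):5.2f}")
```

At 1 kHz this gives a bandwidth of about 133 Hz and an ERBN number of about 15.6.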
Psychophysical Tuning Curves
Fig.: Moore 2004
The psychophysical tuning curves (PTC) were determined for 6 sine-tone signals,
each presented at 10 dB sensation level (black circles). The masker was a sine tone
as well; at each masker frequency, its level was varied to find the level that just
masks the signal (Data from Vogten, 1974).
Tuning curves of the auditory nerve
(Fig. annotations: CFs; neurons tuned toward higher frequencies)
Each curve shows the tuning curve for a specific auditory nerve neuron of an
anesthetized cat. The curves show the minimal sound pressure level of a sinusoid
that is needed to excite the neuron above a given threshold (also called
frequency-threshold curves). The minimum of each curve shows the characteristic
frequency (CF) of the neuron.
from Palmer (1987)
Isointensity contours
(Fig. annotations: CF; curve near threshold level; possible firing rate threshold to determine the tuning curve)
Isointensity curves are another way of describing the response patterns of
neurons. Here, the firing rate in spikes per second is depicted for sinusoids at
different sound pressure levels (in this case for one auditory nerve neuron of a
squirrel monkey). Note how the increase in sound pressure level increases the
firing rate and broadens the frequency range. Note also the shift of the CF with
increasing SPL.
from Rose et al. (1971)
Tuning curves vs. Isointensity contours
• Tuning curves show the sound pressure level that is
needed at a particular frequency to exceed a threshold
firing rate. They often describe behavior at low sound
pressure levels.
• Isointensity contours show the firing rate for a sinusoid
with a given sound pressure level as a function of
frequency. They often describe behavior at higher sound
pressure levels.
Phase Sensitivity
• Human responses to phase variation can
be measured with stimuli that differ
only in their phase, not in their amplitude.
Amplitude Modulation
$(1 + m \sin(2\pi g t)) \sin(2\pi f_c t)$
m=modulation index (m=0 no modulation, m=1 100% modulation),
fc=carrier frequency in Hz, g=modulation frequency in Hz
(Fig.: spectrum over frequency f with the carrier at fc and sidebands at fc−g and fc+g.)
Frequency Modulation (FM)
$\sin(2\pi f_c t - \beta \cos(2\pi g t))$
β=modulation index (β=0 no modulation; β is the ratio of the peak frequency deviation to the modulation frequency g),
fc=carrier frequency in Hz, g=modulation frequency in Hz.
For low modulation indexes, we can simulate the FM signal using quasi-frequency modulation (QFM), which consists of three sinusoids with
appropriate amplitudes and phases:
$A_1 \sin(2\pi f_{c1} t + \varphi_1) + A_2 \sin(2\pi f_{c2} t) + A_3 \sin(2\pi f_{c3} t + \varphi_3)$
with fc1 = fc2−g, fc3 = fc2+g
Quasi-Frequency Modulation (QFM)
AM and QFM differ only in the phases of their components, not in their amplitude
spectra. This feature makes the stimuli interesting for psychoacoustic experiments.
If the listeners do not respond differently to the two stimuli, the underlying
processes are most likely not dependent on phase.
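The phase-only difference can be checked numerically. This is a minimal sketch with assumed parameters (1-kHz carrier, 50-Hz modulation, index 0.3, 8-kHz sample rate), none of which come from the lecture. The QFM signal is built from the narrow-band approximation sin(2π fc t − β cos(2π g t)) ≈ sin(2π fc t) − β cos(2π g t) cos(2π fc t), with β set equal to m.

```python
import cmath
import math

FS = 8000          # sample rate in Hz (arbitrary choice for this sketch)
N = 8000           # one second of signal, so DFT bin k corresponds to k Hz
FC, G = 1000, 50   # carrier and modulation frequency in Hz (assumed values)
M = 0.3            # AM modulation index; QFM uses beta = M

def am(t):
    return (1.0 + M * math.sin(2 * math.pi * G * t)) * math.sin(2 * math.pi * FC * t)

def qfm(t):
    # Carrier plus two sidebands of amplitude M/2 (narrow-band FM approximation).
    return (math.sin(2 * math.pi * FC * t)
            - 0.5 * M * math.cos(2 * math.pi * (FC - G) * t)
            - 0.5 * M * math.cos(2 * math.pi * (FC + G) * t))

def dft_bin(x, k):
    """Complex amplitude of the component at bin k (scaled to peak amplitude)."""
    n = len(x)
    return sum(v * cmath.exp(-2j * math.pi * k * i / n) for i, v in enumerate(x)) * 2.0 / n

times = [i / FS for i in range(N)]
x_am = [am(t) for t in times]
x_qfm = [qfm(t) for t in times]

for k in (FC - G, FC, FC + G):
    a, q = dft_bin(x_am, k), dft_bin(x_qfm, k)
    print(f"{k:5d} Hz: |AM| = {abs(a):.3f}, |QFM| = {abs(q):.3f}, "
          f"phase AM = {cmath.phase(a):+.2f} rad, phase QFM = {cmath.phase(q):+.2f} rad")

print(f"peak |AM| = {max(abs(v) for v in x_am):.3f}, "
      f"peak |QFM| = {max(abs(v) for v in x_qfm):.3f}")
```

The component amplitudes agree (1, m/2, m/2), while the lower sideband is shifted in phase by π between the two signals. As a result the AM envelope fluctuates strongly, whereas the QFM envelope is nearly flat.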
Amplitude vs. Frequency Modulation
(Fig.: Moore 2004)
Amplitude Modulation (AM)
Frequency Modulation (FM)
Perception of Modulation
• For low modulation frequencies (e.g., g=5 Hz):
– Amplitude modulation is perceived as loudness
fluctuation
– Frequency modulation is perceived as frequency
fluctuation
Critical Modulation Frequency (CMF)
(Fig.: Moore 2004; 1000-Hz carrier)
AM and QFM modulation thresholds for a
1-kHz sinusoidal tone as a function of the
modulation frequency g. In both cases the
threshold decreases with modulation
frequency.
The lower graph shows the ratio β/m. At
90 Hz, the so-called critical modulation
frequency (CMF), the ratio becomes one.
Above this frequency, which correlates
highly with the width of the auditory
filter, the auditory system becomes
insensitive to the phase of the
components:
-> The phase of different frequency
components only plays a role if they are
processed by the same frequency band!
Audibility of single partials of a complex tone
(Fig.: Moore 2004, after
data of Plomp, 1964a;
Plomp and Mimpen,
1968).
The x’s and open circles show the minimal frequency separation as a
function of partial frequency above which the partial can be heard out with
75% accuracy. The long-dashed curve shows the ERBN function ×1.25.
Basically, a partial cannot be heard out if it shares the same auditory filter
band with other partials.
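As a rough illustration of this rule, the sketch below treats a partial of a harmonic complex as resolvable when the spacing to its neighbours, which equals the fundamental frequency f0, exceeds 1.25 × ERBN at the partial's frequency. The 200-Hz fundamental and the function names are assumptions made for the example. The criterion is a simplification of the data in the figure.

```python
def erb_n_hz(f_hz):
    """ERB_N in Hz after Glasberg and Moore (1990); f_hz in Hz."""
    return 24.7 * (4.37 * f_hz / 1000.0 + 1.0)

def resolved_harmonics(f0_hz, n_max=20, factor=1.25):
    """Harmonic numbers whose spacing f0 exceeds factor * ERB_N at n * f0."""
    return [n for n in range(1, n_max + 1)
            if f0_hz > factor * erb_n_hz(n * f0_hz)]

print(resolved_harmonics(200.0))  # -> [1, 2, 3, 4, 5, 6]
```

Under these assumptions only the lowest six harmonics of a 200-Hz complex tone are separated widely enough to be heard out.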
Masking Patterns
(Fig.: Moore 2004,
data from Egan and
Hake, 1950)
Masking patterns (audiograms) for a narrow band of noise centered at 410 Hz.
The curves show the increase in threshold for a sinusoidal signal as a function
of frequency. The number above each curve gives the SPL of the noise masker.
Excitation Patterns
(Fig.: Moore 2004)
Estimation of the excitation pattern from auditory filterbank data for a 1-kHz
sinusoid. For each filter band the filter amplitude at the frequency of the test
tone is determined (points a-e). Afterwards, these points are plotted at the
center frequency of the corresponding filter, which represents the excitation
pattern.
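The construction can be sketched in a few lines. The example assumes symmetric, level-independent roex(p) filters, W(g) = (1 + p g) exp(-p g) with g = |f - fc| / fc and p = 4 fc / ERBN, where ERBN = 24.7 (4.37 f + 1) Hz (f in kHz). Real auditory filters are asymmetric and level dependent, so this only shows the bookkeeping. The list of center frequencies is arbitrary.

```python
import math

def erb_n_hz(fc_hz):
    """ERB_N in Hz after Glasberg and Moore (1990); fc_hz in Hz."""
    return 24.7 * (4.37 * fc_hz / 1000.0 + 1.0)

def roex_gain_db(f_hz, fc_hz):
    """Gain in dB of an assumed roex(p) filter centered at fc_hz, evaluated at f_hz."""
    p = 4.0 * fc_hz / erb_n_hz(fc_hz)
    g = abs(f_hz - fc_hz) / fc_hz
    return 10.0 * math.log10((1.0 + p * g) * math.exp(-p * g))

def excitation_pattern(tone_hz, level_db, center_freqs_hz):
    """For each filter, take its gain at the tone frequency and plot it at fc."""
    return [(fc, level_db + roex_gain_db(tone_hz, fc)) for fc in center_freqs_hz]

pattern = excitation_pattern(1000.0, 60.0, [500, 700, 850, 1000, 1200, 1500, 2000])
for fc, excitation_db in pattern:
    print(f"filter at {fc:5d} Hz: excitation {excitation_db:7.1f} dB")
```

Because the filters widen with center frequency, the pattern falls off more gradually above the tone frequency than below it, even though each individual filter is symmetric in this sketch.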
Excitation Patterns
The figure shows the excitation
patterns for the 1-kHz sinusoid
for various sound pressure
levels from 20 to 90 dB in
steps of 10 dB.
(Fig.: Moore 2004)
Co-Modulation Masking Release
1. The masker increases the detection threshold of the target.
2. The threshold is not affected if a second masker is presented far enough away in frequency.
3. However, if we co-modulate both maskers, the detection threshold is lowered (co-modulation masking release).
(Fig.: three spectra over frequency f, showing 1. target and masker, 2. target with an additional distant masker, 3. target with co-modulated maskers.)
Co-Modulation Masking Release
• Co-modulation masking release is a phenomenon that
tells us how the auditory system integrates information
over several frequency bands.
• It is important to understand how the
auditory system groups information to
form one or several auditory events (See
upcoming lecture on Auditory Scene
Analysis).
Co-Modulation Masking Release
(Fig.: Hall et al. 1984. Test tone: 1-kHz sinusoid. Masker: band-pass filtered noise. Curves: masker components NOT co-modulated vs. masker components co-modulated.)
Temporal masking
Temporal masking occurs because of the sluggishness of the basilar
membrane and the time-dependent processing of the hair cells and other
neurons (refractory time).
Simultaneous masking: masking induced by a simultaneous masker.
Forward masking: masking induced by a masker that precedes the test tone in time.
Backward masking: masking that occurs for a target that precedes the masker in time.
Overshoot: initial increase of the masked threshold during the onset phase of the masker.
Noise vs. tonal masker
(Fig.: Egan and Hake 1950. Noise masker: 410 Hz, 90 Hz bandwidth. Tonal masker: 400 Hz.)