Spectral Analysis of Sound

Robert Mannell
Macquarie University
1
Spectral Analysis of Sound
Sub-topics:
• Complex Waves and Line Spectra
• Fourier Transforms
• Linear Prediction Analysis
• Filtering
• Spectrograms: Time, Frequency & Intensity
2
Complex Waves and Line Spectra
• The addition of more than one pure tone
produces complex waveforms.
• These waveforms are not readily analysed
by eye.
• As complex waves increase in complexity
it becomes increasingly difficult to
determine anything from their waveform
except for the fundamental frequency.
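As a minimal sketch of the first point, the following Python adds three pure tones sample by sample to produce a complex wave. The sample rate and the (frequency, amplitude) pairs are illustrative assumptions, not values taken from the slides.

```python
import math

def pure_tone(freq_hz, amplitude, n_samples, sample_rate):
    """Return samples of a sine wave with the given frequency and peak amplitude."""
    return [amplitude * math.sin(2 * math.pi * freq_hz * n / sample_rate)
            for n in range(n_samples)]

def complex_wave(components, n_samples, sample_rate):
    """Add several pure tones, given as (frequency, amplitude) pairs, sample by sample."""
    tones = [pure_tone(f, a, n_samples, sample_rate) for f, a in components]
    return [sum(samples) for samples in zip(*tones)]

SAMPLE_RATE = 8000  # Hz (assumed for illustration)
# Hypothetical components: a 100 Hz fundamental plus two weaker harmonics.
COMPONENTS = [(100, 1.0), (200, 0.5), (300, 0.25)]
wave = complex_wave(COMPONENTS, n_samples=SAMPLE_RATE, sample_rate=SAMPLE_RATE)

# The sum repeats every 1/100 s (80 samples here), the period of the fundamental.
period = SAMPLE_RATE // 100
```

The summed wave still repeats at the rate of the 100 Hz fundamental, which is why the fundamental frequency remains readable from the waveform even when the individual components are not.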
3
Complex Waves and Line Spectra
• A line spectrum is a spectral representation that
displays the frequencies and relative intensities
of the component sine waves.
• Each sine wave is displayed as a single vertical
line placed at the appropriate frequency on the
x-axis.
• The height of the line represents the amplitude
of the component sine wave.
• The amplitude is usually displayed as relative
sound pressure level in dB.
• Phase information is absent in such a display.
4
Complex Waves and Line Spectra
In this diagram we can see the
waveform and the line
spectrum of two 100 Hz pure
tones, one with an amplitude
of “1” and the second with an
amplitude of “2”. On a line
spectrum this amplitude could
be expressed in pascals (Pa) or dB.
The associated line spectrum
clearly displays this difference,
but it could also easily be
deduced from the waveforms.
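If the amplitudes are taken as sound pressures, the difference between the two tones can also be expressed in dB. The short sketch below uses the standard relation 20·log10(ratio) for amplitude ratios; the function name is hypothetical.

```python
import math

def amplitude_ratio_to_db(amplitude, reference):
    """Level of `amplitude` relative to `reference`, in decibels."""
    return 20 * math.log10(amplitude / reference)

# The tone with amplitude "2" relative to the tone with amplitude "1".
level_difference = amplitude_ratio_to_db(2.0, 1.0)
print(f"{level_difference:.2f} dB")  # prints "6.02 dB"
```

Doubling the amplitude therefore raises the line by about 6 dB on a dB-scaled line spectrum.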
5
Complex Waves and Line Spectra
These two waves are
complex sounds that have
been derived by adding
together 2 or 3 pure tones.
We can’t easily tell from the
waveforms what these
tones were, but if we have
a line spectrum we can
easily see the frequencies
and relative amplitudes of
these tones.
6
Fourier Transforms and FFT
• The addition of pure tones (sine waves) results
in a complex sound.
• A frequency analysis of such a sound attempts
to determine the original pure tones.
• The Fourier Transform (Fourier, 1820s) is the
main way of doing this.
• The Fast Fourier Transform (FFT) is a very fast
and commonly used method of computing a
Fourier transform on a digital computer.
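The idea of recovering the original pure tones can be sketched with NumPy's FFT routines. The sample rate and the three tones are assumptions chosen so that each tone falls exactly on an FFT bin; with real speech the peaks would be smeared across neighbouring bins.

```python
import numpy as np

SAMPLE_RATE = 8000                           # Hz (assumed)
t = np.arange(SAMPLE_RATE) / SAMPLE_RATE     # one second of sample times

# Hypothetical complex sound: frequency (Hz) -> amplitude of each pure tone.
components = {100: 1.0, 200: 0.5, 300: 0.25}
signal = sum(a * np.sin(2 * np.pi * f * t) for f, a in components.items())

# FFT of a real signal: one complex value per frequency bin (1 Hz apart here).
spectrum = np.fft.rfft(signal)
freqs = np.fft.rfftfreq(len(signal), d=1 / SAMPLE_RATE)

# Scale bin magnitudes back to sine-wave amplitudes.
amplitudes = 2 * np.abs(spectrum) / len(signal)

# Keep only bins with a clearly non-zero amplitude: this is the line spectrum.
found = {int(round(f)): round(float(a), 3)
         for f, a in zip(freqs, amplitudes) if a > 0.01}
print(found)
```

The dictionary `found` holds the frequencies and amplitudes of the tones that were added together, which is exactly the information a line spectrum displays.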
7
Fourier Transforms and FFT
FFT analysis of
the centre of
the Australian
English vowel
/ɜː/ spoken by
an adult male
In this FFT spectrum we have intensity (in dB) on the
y axis and frequency (in kHz, kilohertz) on the x axis.
The fine, closely spaced peaks are harmonics (multiples of
F0, here ~105 Hz) superimposed on the broader
spectral peaks.
8
Linear Prediction Analysis (LPC)
• A Linear Prediction Analysis is a method
that selects the main resonance peaks (or
“formants”) of speech sounds. Formant
peaks tell us about the position of the
tongue, lips, etc.
• LPC analysis, if done correctly, provides a
smoothed spectrum with easily analysable
formants. (We’ll talk more about formants
in another topic.)
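One common way of carrying out a linear prediction analysis is the autocorrelation method solved with the Levinson-Durbin recursion. The sketch below is an assumption about method, not necessarily what produced the plots in these slides. It analyses a synthetic frame with two hypothetical resonances (500 Hz and 1500 Hz); real speech analysis would normally also window the frame and use a higher order.

```python
import numpy as np

def resonator(freq_hz, pole_radius, sample_rate):
    """Denominator coefficients [1, a1, a2] of a single two-pole resonance."""
    theta = 2 * np.pi * freq_hz / sample_rate
    return np.array([1.0, -2 * pole_radius * np.cos(theta), pole_radius ** 2])

def impulse_response(a, n_samples):
    """Impulse response of the all-pole filter 1 / A(z)."""
    h = np.zeros(n_samples)
    for n in range(n_samples):
        past = sum(a[j] * h[n - j] for j in range(1, len(a)) if n - j >= 0)
        h[n] = (1.0 if n == 0 else 0.0) - past
    return h

def lpc(frame, order):
    """Autocorrelation-method LPC; returns [1, a1, ..., a_order]."""
    # Autocorrelation at lags 0..order.
    r = np.array([np.dot(frame[:len(frame) - k], frame[k:]) for k in range(order + 1)])
    a = np.zeros(order + 1)
    a[0] = 1.0
    error = r[0]
    for i in range(1, order + 1):  # Levinson-Durbin recursion
        k = -(r[i] + np.dot(a[1:i], r[i - 1:0:-1])) / error
        previous = a.copy()
        a[1:i] = previous[1:i] + k * previous[i - 1:0:-1]
        a[i] = k
        error *= 1.0 - k * k
    return a

SAMPLE_RATE = 8000  # Hz (assumed)
# Synthetic "vowel-like" frame: two hypothetical resonances in cascade.
true_a = np.convolve(resonator(500, 0.97, SAMPLE_RATE),
                     resonator(1500, 0.97, SAMPLE_RATE))
frame = impulse_response(true_a, 1024)

a = lpc(frame, order=4)

# The smoothed LPC spectrum is the response of 1 / A(z), here in dB.
n_fft = 1024
envelope_db = -20 * np.log10(np.abs(np.fft.rfft(a, n_fft)))
freqs = np.fft.rfftfreq(n_fft, d=1 / SAMPLE_RATE)

# Local maxima of the smoothed spectrum are the estimated resonance peaks.
peaks = [float(freqs[i]) for i in range(1, len(freqs) - 1)
         if envelope_db[i - 1] < envelope_db[i] > envelope_db[i + 1]]
print(peaks)
```

Because the model has only a few coefficients, its spectrum can follow the broad resonance peaks but not fine detail such as individual harmonics.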
9
Linear Prediction Analysis (LPC)
LPC analysis of
the centre of
the Australian
English vowel
/ɜː/ spoken by
an adult male
In this LPC spectrum, the four clear peaks are the first four
resonance peaks (formants) for this vowel.
They tell us that the vowel is a mid central unrounded
vowel spoken by an adult male. The pattern of harmonics
has been ignored by this analysis.
10
FFT/LPC spectra
FFT + LPC
analysis of the
centre of the
Australian
English vowel
/ɜː/ spoken by
an adult male
It’s often useful to display both the FFT and LPC together
and this kind of plot is used in a number of topics. It’s an
easy way of seeing harmonic and formant information
together.
11
Acoustic Filtering
• It’s sometimes necessary to “filter” sounds
so that some frequencies are available for
analysis (“passed” by the filter) and other
frequencies are removed.
• Low pass (LP) filters pass sound below a
certain frequency, high pass (HP) filters
pass sound above a certain frequency and
band pass (BP) filters pass sound between
two frequencies. All other frequency
components are blocked (“stopped”).
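The three filter types can be sketched as an idealised filter that zeroes FFT bins outside the pass band. This is an illustration only: the function name, sample rate, tone frequencies and cut-off values are all assumptions.

```python
import numpy as np

def ideal_filter(signal, sample_rate, low_hz=None, high_hz=None):
    """Pass components with low_hz <= f <= high_hz and zero all others.

    low_hz only   -> high pass
    high_hz only  -> low pass
    both          -> band pass
    """
    spectrum = np.fft.rfft(signal)
    freqs = np.fft.rfftfreq(len(signal), d=1 / sample_rate)
    passed = np.ones(len(freqs), dtype=bool)
    if low_hz is not None:
        passed &= freqs >= low_hz
    if high_hz is not None:
        passed &= freqs <= high_hz
    # Stopped bins are set to zero before transforming back to a waveform.
    return np.fft.irfft(np.where(passed, spectrum, 0), n=len(signal))

SAMPLE_RATE = 8000  # Hz (assumed)
t = np.arange(SAMPLE_RATE) / SAMPLE_RATE

def tone(freq_hz):
    return np.sin(2 * np.pi * freq_hz * t)

# Hypothetical test sound containing a low, a middle and a high component.
signal = tone(100) + tone(1000) + tone(3000)

low_passed = ideal_filter(signal, SAMPLE_RATE, high_hz=500)                # keeps 100 Hz
high_passed = ideal_filter(signal, SAMPLE_RATE, low_hz=2000)               # keeps 3000 Hz
band_passed = ideal_filter(signal, SAMPLE_RATE, low_hz=500, high_hz=2000)  # keeps 1000 Hz
```

Each output contains only the tone that lies inside the pass band of the corresponding filter.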
12
Acoustic Filtering
High pass (HP), low pass (LP)
and band pass (BP) filters.
Note that BP filters pass
frequencies between the HP
and LP frequencies. Green is
passed and white is stopped.
13
Acoustic Filtering
• In most practical acoustic filters there is a
region around the cut-off frequency where
frequencies are partially allowed to pass.
• This provides a more gentle transition
between the pass-band (the frequencies
which are unattenuated) and the stop-band
(the frequencies which are attenuated).
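One familiar filter family with this kind of gradual transition is the Butterworth low pass filter, whose gain is 1/sqrt(1 + (f/fc)^(2n)) for cut-off frequency fc and order n. It is used here purely as an illustration; the slides do not name a filter type, and the 1000 Hz cut-off is an assumed value.

```python
import math

def butterworth_gain_db(freq_hz, cutoff_hz, order=1):
    """Gain in dB of an order-`order` Butterworth low pass filter at freq_hz."""
    ratio = freq_hz / cutoff_hz
    magnitude = 1.0 / math.sqrt(1.0 + ratio ** (2 * order))
    return 20 * math.log10(magnitude)

CUTOFF_HZ = 1000  # assumed cut-off frequency

# Compare a gentle (order 1) and a steeper (order 4) transition.
for freq in (250, 500, 1000, 2000, 4000):
    gentle = butterworth_gain_db(freq, CUTOFF_HZ, order=1)
    steep = butterworth_gain_db(freq, CUTOFF_HZ, order=4)
    print(f"{freq:5d} Hz  order 1: {gentle:7.2f} dB   order 4: {steep:7.2f} dB")
```

At the cut-off frequency the gain is about -3 dB whatever the order; a higher order narrows the region in which frequencies are only partially passed.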
14
Spectrograms
• FFT and LPC spectra are two-dimensional
(2D) spectra with the dimensions
amplitude (usually y axis) and frequency
(usually x axis). They display the spectrum
for a short window of time.
• Spectrograms are three-dimensional
spectra showing an additional time
dimension.
15
Spectrograms
Broad band
spectrogram of the
word “heard”
spoken by an adult
male speaker of
Australian English.
A spectrogram displays three acoustic dimensions. Here, the
y axis is frequency (kHz), the x axis is time (s) and intensity is
shown as grey scale (with black being the most intense). A broad
band spectrogram has good time resolution (vertical bars
show glottal cycles) and poor frequency resolution.
16
Spectrograms
Narrow band
spectrogram of the
word “heard”
spoken by an adult
male speaker of
Australian English.
A narrow band spectrogram shows the same three dimensions as a
broad band one, except that it has poor time resolution (the
vertical bars don’t show) and good frequency resolution (the
visible horizontal bars represent harmonics - multiples of F0).
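The trade-off between broad band and narrow band analysis comes from the length of the analysis window: a short window gives widely spaced frequency bins, a long window gives closely spaced ones. The sketch below computes a simple spectrogram with NumPy. The window lengths (5 ms and 40 ms), hop size, sample rate and the synthetic 100 Hz harmonic signal are all assumptions for illustration.

```python
import numpy as np

def spectrogram(signal, sample_rate, window_s, hop_s):
    """Short-time FFT magnitudes in dB: rows are time frames, columns are frequency bins."""
    win_len = int(round(window_s * sample_rate))
    hop = int(round(hop_s * sample_rate))
    window = np.hanning(win_len)
    starts = range(0, len(signal) - win_len + 1, hop)
    frames = np.array([signal[s:s + win_len] * window for s in starts])
    magnitudes = np.abs(np.fft.rfft(frames, axis=1))
    return 20 * np.log10(magnitudes + 1e-12)  # small offset avoids log(0)

SAMPLE_RATE = 8000  # Hz (assumed)
t = np.arange(SAMPLE_RATE) / SAMPLE_RATE
# Hypothetical voiced sound: ten harmonics of a 100 Hz fundamental.
signal = sum(np.sin(2 * np.pi * 100 * k * t) / k for k in range(1, 11))

broad = spectrogram(signal, SAMPLE_RATE, window_s=0.005, hop_s=0.001)
narrow = spectrogram(signal, SAMPLE_RATE, window_s=0.040, hop_s=0.001)

for name, spec, window_s in (("broad band", broad, 0.005), ("narrow band", narrow, 0.040)):
    bin_spacing_hz = 1 / window_s
    print(f"{name}: {spec.shape[0]} frames, {spec.shape[1]} bins, "
          f"{bin_spacing_hz:.0f} Hz between bins")
```

With the 5 ms window the bins are 200 Hz apart, too coarse to separate harmonics spaced 100 Hz apart; with the 40 ms window they are 25 Hz apart, so each harmonic appears as its own horizontal band.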
17
Spectrograms
• In the spectrograms on the previous two
slides the four parallel horizontal bands
represent the first four formants (numbered
1-4 from bottom to top).
• If we take a single time slice through the
middle of the vowel (the part with the four
prominent dark bands) we see the same
peaks that we saw in the FFT and LPC
plots (it’s the same vowel and speaker).
18
Readings
• For more detail and additional suggested
readings go to the topic web site at:
http://www.ling.mq.edu.au/speech/acoustics/frequency/spectral.html
19