Transcript IAFPA 2006

Formant-Pattern Estimation
Guided by Cepstral Compatibility
Frantz Clermont
Philip Harrison
Peter French
J.P. French Associates & University of York
United Kingdom
IAFPA Conference
Plymouth, UK
23-25 July 2007
A Long-Standing Problem:
How “reliable” is the measured formant-pattern?
• Central to Acoustic Phonetics
• Crucial to Forensic Phonetics
• Central to a Major Debate in Speech & Speaker Recognition
SHORT ANSWER:
• STILL NO OBJECTIVE WAY OF ENSURING/CHECKING RELIABILITY OF MEASURED F-PATTERNS
• APPRECIABLE VARIABILITY AMONGST SOFTWARE PACKAGES (HARRISON, 2004)
WAYS FORWARD – WHAT ARE WE TO DO?
• THROW OUR HANDS IN DESPAIR & SIMPLY USE AVAILABLE TOOLS?
• STILL HOPE FOR SOME VIABLE SOLUTIONS?
A New Approach: Particulars & Aims
• Objectivity & Reliability of F-pattern Estimation
• Compatibility with the observed spectrum
• A “Smart” Measure of Compatibility
• We use a related representation: The Cepstrum
    • Efficient approximation to the exact spectrum
    • More readily available from speech signal
    • Contains vocal-tract resonance information
    • Robustness in speech and speaker recognition
• We propose a Cepstral Analysis-by-Synthesis Method
    • To generate Candidate Cepstra
    • To determine most compatible candidate w.r.t. observed cepstrum
Pathway to Speech Spectrum
Source-Filter
MODEL
Linear Prediction (LP)
ALL-POLE FILTER
FILTER ORDER
M
N.B. “OPTIMUM” M → UNRESOLVED ISSUE!
Pathways to Formant Patterns:
Prominences of Spectral Shapes
[Figure: spectral shapes, panels labelled EXACT RAW and EXACT LP]
Pathways to Formant Patterns (cont’d):
LP-derived Cepstrum (order M)
A Vocal-Tract Parameter
A Fourier-Series Model of EXACT LP-Spectrum
M÷2 POLES
{BROAD & NARROW BANDWIDTHS}
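As a hedged illustration of the two pathways on this slide, the sketch below converts LP coefficients to the LP-derived cepstrum using the standard all-pole recursion and, separately, to pole frequencies and bandwidths by root-solving. The function names, the sign convention A(z) = 1 + sum of a_k z^-k, and the sampling rate `fs` are assumptions made for illustration; the slides do not specify an implementation.

```python
import numpy as np

def lp_to_cepstrum(a, n_ceps):
    """Cepstrum c_1..c_n_ceps of the all-pole filter 1/A(z).

    Assumes a = [1, a_1, ..., a_M] with A(z) = 1 + sum_k a_k z^-k.
    """
    a = np.asarray(a, dtype=float)
    M = len(a) - 1
    c = np.zeros(n_ceps + 1)          # c[0] is unused (gain term)
    for n in range(1, n_ceps + 1):
        acc = a[n] if n <= M else 0.0
        for k in range(1, n):
            if n - k <= M:
                acc += (k / n) * c[k] * a[n - k]
        c[n] = -acc
    return c[1:]

def poles_to_formants(a, fs):
    """Frequencies and bandwidths (Hz) of the complex poles of 1/A(z)."""
    roots = np.roots(a)
    roots = roots[np.imag(roots) > 0]             # one pole per conjugate pair
    freqs = np.angle(roots) * fs / (2.0 * np.pi)
    bws = -np.log(np.abs(roots)) * fs / np.pi
    order = np.argsort(freqs)
    return freqs[order], bws[order]
```

An order-M filter yields at most M÷2 conjugate pole pairs, each described by a frequency and a bandwidth; real poles are discarded in this sketch.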
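The Cepstral Distance (Euclidean):
Un-Weighted versus Index-Weighted
Un-weighted, between $C_k^A$ and $C_k^B$, $k \in [1, 2, \ldots, M]$:
$$D = \sqrt{\sum_{k=1}^{M} \left( C_k^A - C_k^B \right)^2}$$
Index-weighted, between $k\,C_k^A$ and $k\,C_k^B$, $k \in [1, 2, \ldots, M]$:
$$D_w = \sqrt{\sum_{k=1}^{M} \left( k\,C_k^A - k\,C_k^B \right)^2}$$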
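A minimal sketch of the two distances named on this slide, assuming both cepstra are given as coefficient sequences for k = 1..M (the function name and the `index_weighted` flag are illustrative choices, not taken from the slides):

```python
import numpy as np

def cepstral_distance(c_a, c_b, index_weighted=False):
    """Euclidean distance between two cepstra holding coefficients k = 1..M."""
    c_a = np.asarray(c_a, dtype=float)
    c_b = np.asarray(c_b, dtype=float)
    diff = c_a - c_b
    if index_weighted:
        k = np.arange(1, len(diff) + 1)   # quefrency index 1..M
        diff = k * diff                   # compares k*C_k^A with k*C_k^B
    return float(np.sqrt(np.sum(diff ** 2)))
```

Index weighting scales each difference by k, so higher-index coefficients contribute more to the distance than in the un-weighted form.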
Cepstral Analysis-by-Synthesis (CAbS)
1) ACOUSTIC SIGNAL → FRAMES
2) LP CEPSTRUM (order M = 12) → OBSERVED CEPSTRUM C^o_k, k ∈ [1..12]
3) LP CEPSTRUM TO POLES (M÷2 = 6) → P1 [F1,B1], P2 [F2,B2], ..., P6 [F6,B6]
4) GENERATE ALL UNIQUE QUARTETS, BINOMIAL (6,4) = 15:
    q1 = [P1 P2 P3 P4]
    q2 = [P1 P2 P3 P5]
    ...
    q15 = [P3 P4 P5 P6]
5) CEPSTRUM CONVERSION → CANDIDATE CEPSTRA C^q1_k, C^q2_k, ..., C^q15_k
6) CEPSTRAL DISTANCE between OBSERVED CEPSTRUM and each candidate → D1, D2, ..., D15
7) min{Dj, j = 1...15}
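The CAbS loop can be sketched as follows, under stated assumptions. Each pole Pi is taken to be a conjugate pair given as (F, B) in Hz, and the candidate cepstrum uses the standard closed form for an all-pole model, c_k = (2/k) Σ exp(-πkB/fs) cos(2πkF/fs). The function names, the sampling rate `fs`, and the index-weighted default are illustrative choices. Restricting the comparison to a chosen spectral range is not modelled here.

```python
from itertools import combinations
import numpy as np

def poles_to_cepstrum(poles, fs, n_ceps):
    """Cepstrum c_1..c_n_ceps of an all-pole model built from (F, B) pairs in Hz."""
    k = np.arange(1, n_ceps + 1)
    c = np.zeros(n_ceps)
    for f, b in poles:
        c += (2.0 / k) * np.exp(-np.pi * k * b / fs) * np.cos(2.0 * np.pi * k * f / fs)
    return c

def cabs_select(observed, poles, fs, n_select=4, index_weighted=True):
    """Return (distance, quartet) for the candidate closest to the observed cepstrum."""
    observed = np.asarray(observed, dtype=float)
    k = np.arange(1, len(observed) + 1)
    weights = k if index_weighted else np.ones_like(k)
    best = None
    for quartet in combinations(poles, n_select):   # 15 candidates for 6 poles
        candidate = poles_to_cepstrum(quartet, fs, len(observed))
        d = float(np.sqrt(np.sum((weights * (observed - candidate)) ** 2)))
        if best is None or d < best[0]:
            best = (d, quartet)
    return best
```

With six poles sorted by frequency, `combinations` yields the quartets in the same order as q1 to q15 above, and the returned quartet is the estimated F-pattern for that frame.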
Some Vowel Data to Test the Approach
Harrison (2004)
THIS STUDY (1 speaker)

        FLEECE   TRAP   PALM    GOOSE    SCHWA
Zero    he       ha     Har     who      hisser
/t/     heat     hat    heart   hoot     hurt
/d/     heed     had    hard    who’d    herd
/s/     cease    pass   Haas    Soos     hearse
/z/     he’s     has    SARS    who's    hers
/n/     seen     Hann   Hahn    Hoon     Hearn

MICROPHONE
TELEPHONE
2 adult-male, native speakers
3 contemporaneous tokens
Implementation Methodology
[Figure: POLE-GRAM (LP-ORDER M = 10) + CAbS RESULTS SUPERIMPOSED (Upper Bound = 3.5 kHz)]
Candidate Values of Upper Bound (kHz): 2.5, 2.6, ..., 4.5
Candidate Values of M: 10, 11, ..., 18
CALCULATE AVERAGE INTRA-VOWEL DISPERSION
CLUSTERING QUALITY (CQ)
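The parameter search outlined above might look like the sketch below. The slides do not give the exact definition of CQ, so it is assumed here to be the mean distance of each token's F-pattern from its vowel centroid, averaged over vowels, with smaller values taken as better clustering. `estimate` is a hypothetical callable standing in for a CAbS estimator that returns an F-pattern for one token at a given LP order and upper bound.

```python
import numpy as np

def intra_vowel_dispersion(patterns_by_vowel):
    """Average, over vowels, of the mean token-to-centroid distance (same units as input)."""
    per_vowel = []
    for patterns in patterns_by_vowel.values():
        x = np.asarray(patterns, dtype=float)        # shape (n_tokens, n_formants)
        centroid = x.mean(axis=0)
        per_vowel.append(np.mean(np.linalg.norm(x - centroid, axis=1)))
    return float(np.mean(per_vowel))

def optimise(estimate, tokens_by_vowel, orders, upper_bounds_khz):
    """Grid search over LP order M and upper bound; returns (cq, M, upper_bound_khz)."""
    best = None
    for m in orders:
        for ub in upper_bounds_khz:
            patterns = {
                vowel: [estimate(tok, m, ub) for tok in tokens]   # hypothetical estimator
                for vowel, tokens in tokens_by_vowel.items()
            }
            cq = intra_vowel_dispersion(patterns)
            if best is None or cq < best[0]:
                best = (cq, m, ub)
    return best
```

The grids on the slide would correspond to `orders = range(10, 19)` and upper bounds from 2.5 to 4.5 kHz in 0.1 kHz steps.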
Results 1: Optimisation of Analysis Parameters
(LP order & Spectral Range)
MICROPHONE
Opt. M = 10
[3300 – 3700] Hz
CQ = 27.80 Hz
Opt. M = 13
[3300 – 4200] Hz
CQ = 27.05 Hz
TELEPHONE
Results 2a (MICROPHONE DATA):
Min. Distance vs Min. Mean Bandwidth
Results 2b (TELEPHONE DATA):
Min. Distance vs Min. Mean Bandwidth
Concluding Summary
1) Introduced a New Approach to F-pattern Estimation
    a. Two major schools of parameterisation – the formant & the cepstrum
    b. F-patterns estimated by requiring compatibility w/ observed cepstra
    c. Compatibility is achieved using a “smart” measure of compatibility
        i. sensitivity to spectral peaks that provide the best spectral match
        ii. flexibility to select any spectral range
2) Current Results are Promising
    a. Compatibility Measure exhibits Robustness
    b. Estimated F-patterns are Consistent with Phonetic Expectations
    c. LP-Order and Spectral Range are Concurrently Optimised
3) Proposed Approach – A Serious First Step towards
    a. Objectivity
    b. Reliability
    c. A Very Useful Tool
Looking towards the Future
1) More Challenging Data
    a. Beyond Steady-State Vowels
    b. Wide Range of Speakers
    c. Variety of Voice Qualities
    d. Differing Recording Conditions
2) Speculations
    a. The All-Pole LP-Model is likely to be insufficient for all cases
    b. The Approach of Cepstral Compatibility holds the potential of a long survival!