Coarticulation Analysis of Dysarthric Speech
Xiaochuan Niu,
advised by Jan van Santen
Outline
Goal of the Dysarthria Project
Problems
Hypotheses
Analysis approach
Results
Conclusions
Goal of the Dysarthria Project
Dysarthria: Motor speech impairment
Dysarthric speech:
Normal speech:
Improve the intelligibility of dysarthric speech:
Dysarthric speech --> Speech transformation system --> Intelligible speech
Problems
Previous results of intelligibility test (by Hosom, Kain, et al.):
Improvement potential:
  Dysarthric: 68% --> Normal: 99%
  Spectral feature replacement: 87%
Baseline transformation system:
  GMM + linear transformation
  NO improvement: 67%
Poor spectral separability of dysarthric speech
Hypotheses
Poor spectral separability caused by
H1 - Target shift
Dysarthric speakers develop special vocal-tract configurations for certain phonemes.
H2 - Coarticulation effect
The degree of context influence on articulation is greater in dysarthric speech.
H3 - Random variation
Dysarthric speakers cannot repeat the target vocal-tract configuration accurately.
Analysis approach
Acoustic measure
Speech data
Coarticulation model
Estimation method
Acoustic measure
Formants
Natural frequencies of certain vocal-tract
configurations
Assumption: each phoneme has a target
formant pattern
Formant trajectories
Dynamic characteristics of articulation
Acoustic measure (example)
[Figure: formant trajectories (F1, F2, F3) of the CVC’ segments /b/-/i:/-/t/ and /b/-/u/-/t/]
Speech data
Utterance
One dysarthric speaker and one normal speaker from a dysarthric database
74 nonsense sentences per speaker
Phoneme
Manually labeled with time alignments
Formants
First three formant frequencies at the midpoints of vowels in CVC’ segments
Automatically extracted and manually checked
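The midpoint measurement can be illustrated with a minimal sketch. It assumes a formant track stored as (time, F1, F2, F3) frames and labeled vowel boundaries in seconds. The function name `midpoint_formants`, the frame layout, and all numbers are hypothetical and are not taken from the project's tools.

```python
def midpoint_formants(track, start, end):
    """Return (F1, F2, F3) from the frame nearest the vowel's temporal midpoint.

    track: list of (time_s, f1_hz, f2_hz, f3_hz) frames (hypothetical layout)
    start, end: manually labeled vowel boundaries in seconds
    """
    if end <= start:
        raise ValueError("vowel end must come after its start")
    mid = 0.5 * (start + end)
    # Only consider frames that fall inside the labeled vowel
    inside = [frame for frame in track if start <= frame[0] <= end]
    if not inside:
        raise ValueError("no formant frames inside the vowel")
    nearest = min(inside, key=lambda frame: abs(frame[0] - mid))
    return nearest[1:]


# Made-up formant track for one vowel, 10 ms frame step
track = [
    (0.10, 310.0, 2100.0, 2900.0),
    (0.11, 300.0, 2200.0, 3000.0),
    (0.12, 295.0, 2250.0, 3050.0),
    (0.13, 300.0, 2230.0, 3020.0),
    (0.14, 320.0, 2150.0, 2950.0),
]
print(midpoint_formants(track, 0.10, 0.14))
```

Picking the nearest frame rather than interpolating keeps the sketch short. A manual check of the extracted values, as the slide describes, would still be needed.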
Coarticulation model
$$F(t) = \alpha(t)\,(T_C - T_V) + T_V + \beta(t)\,(T_{C'} - T_V)$$
$$F(t) = \alpha(t)\,T_C + (1 - \alpha(t) - \beta(t))\,T_V + \beta(t)\,T_{C'}$$
Notations:
Observed formant vectors: $F(t)$
Target formant vectors: $T_C$, $T_V$, and $T_{C'}$
Coarticulatory factors: $\alpha(t)$ and $\beta(t)$
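As a rough illustration, the model treats the observed formant vector as a weighted combination of the three targets, with the two coarticulatory factors as the weights on the consonant targets and the remainder on the vowel target. The sketch below evaluates that combination. The function name `formant_vector` and the target values are hypothetical, chosen only to show the arithmetic.

```python
def formant_vector(alpha, beta, t_c, t_v, t_c2):
    """Blend three target formant vectors with coarticulatory weights.

    alpha, beta: weights on the preceding (C) and following (C') consonant targets
    t_c, t_v, t_c2: target formant vectors [F1, F2, F3] in Hz for C, V, and C'
    """
    gamma = 1.0 - alpha - beta  # weight left for the vowel target
    return [alpha * c + gamma * v + beta * c2
            for c, v, c2 in zip(t_c, t_v, t_c2)]


# Hypothetical targets in Hz, for illustration only
T_C = [400.0, 1100.0, 2400.0]
T_V = [300.0, 2300.0, 3000.0]
T_C2 = [400.0, 1800.0, 2700.0]

print(formant_vector(0.0, 0.0, T_C, T_V, T_C2))  # both factors zero: vowel target
print(formant_vector(0.1, 0.2, T_C, T_V, T_C2))  # pulled toward both consonants
```

With both factors at zero the vowel reaches its target. Larger factors move the observed formants toward the neighboring consonant targets, which is the effect H2 is about.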
Estimation method
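Given N samples of observed formant vectors $F^{(i)}$ at midpoints of vowels in CVC’ segments, assume target formant-vectors are known and solve for the coarticulatory factors $\alpha^{(i)}$ and $\beta^{(i)}$ in
$$F^{(i)} - T_V^{(i)} = \alpha^{(i)}\,(T_C^{(i)} - T_V^{(i)}) + \beta^{(i)}\,(T_{C'}^{(i)} - T_V^{(i)}) \quad (i = 1 \sim N).$$
With $\alpha^{(i)}$ and $\beta^{(i)}$ fixed, jointly estimate target formant-vectors from equations
$$F^{(i)} = \begin{bmatrix} \alpha^{(i)} I & (1 - \alpha^{(i)} - \beta^{(i)})\, I & \beta^{(i)} I \end{bmatrix} \begin{bmatrix} T_C^{(i)} \\ T_V^{(i)} \\ T_{C'}^{(i)} \end{bmatrix} \quad (i = 1 \sim N), \qquad I = \begin{bmatrix} 1 & 0 & 0 \\ 0 & 1 & 0 \\ 0 & 0 & 1 \end{bmatrix}.$$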
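The two estimation steps can be sketched as ordinary least-squares problems. This is a simplified illustration, not the authors' implementation. It assumes all samples come from a single CVC’ context and therefore share one set of targets, it uses unconstrained least squares, and it runs on synthetic data. All function names and numbers are hypothetical.

```python
def blend(alpha, beta, t_c, t_v, t_c2):
    """Model prediction: alpha*TC + (1 - alpha - beta)*TV + beta*TC'."""
    gamma = 1.0 - alpha - beta
    return [alpha * c + gamma * v + beta * c2
            for c, v, c2 in zip(t_c, t_v, t_c2)]


def estimate_factors(f, t_c, t_v, t_c2):
    """Step 1: least-squares (alpha, beta) with the targets held fixed."""
    a = [c - v for c, v in zip(t_c, t_v)]    # TC - TV
    b = [c - v for c, v in zip(t_c2, t_v)]   # TC' - TV
    r = [x - v for x, v in zip(f, t_v)]      # F - TV
    aa = sum(x * x for x in a)
    bb = sum(x * x for x in b)
    ab = sum(x * y for x, y in zip(a, b))
    ar = sum(x * y for x, y in zip(a, r))
    br = sum(x * y for x, y in zip(b, r))
    det = aa * bb - ab * ab
    if det <= 1e-12 * aa * bb:
        raise ValueError("TC - TV and TC' - TV are (nearly) collinear")
    # Cramer's rule on the 2x2 normal equations
    return (ar * bb - ab * br) / det, (aa * br - ab * ar) / det


def solve_linear(matrix, rhs):
    """Solve a small square system by Gauss-Jordan elimination with pivoting."""
    n = len(rhs)
    aug = [row[:] + [rhs[i]] for i, row in enumerate(matrix)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda k: abs(aug[k][col]))
        if abs(aug[pivot][col]) < 1e-12:
            raise ValueError("singular system: factors do not identify the targets")
        aug[col], aug[pivot] = aug[pivot], aug[col]
        for k in range(n):
            if k != col:
                factor = aug[k][col] / aug[col][col]
                aug[k] = [x - factor * p for x, p in zip(aug[k], aug[col])]
    return [aug[i][n] / aug[i][i] for i in range(n)]


def estimate_targets(samples, factors):
    """Step 2: least-squares (TC, TV, TC') with the factors held fixed.

    Assumes every sample shares the same three targets (one CVC' context).
    """
    rows = [[a, 1.0 - a - b, b] for a, b in factors]
    gram = [[sum(r[i] * r[j] for r in rows) for j in range(3)] for i in range(3)]
    dims = len(samples[0])
    per_dim = []
    for k in range(dims):  # each formant dimension decouples
        rhs = [sum(r[i] * f[k] for r, f in zip(rows, samples)) for i in range(3)]
        per_dim.append(solve_linear(gram, rhs))
    # per_dim[k] holds [TC_k, TV_k, TC'_k]; regroup into three target vectors
    return [[per_dim[k][j] for k in range(dims)] for j in range(3)]


# Synthetic check with made-up targets (Hz) and factors
true_targets = ([400.0, 1100.0, 2400.0],
                [300.0, 2300.0, 3000.0],
                [400.0, 1800.0, 2700.0])
true_factors = [(0.1, 0.1), (0.3, 0.1), (0.1, 0.3), (0.2, 0.2)]
samples = [blend(a, b, *true_targets) for a, b in true_factors]

factors = [estimate_factors(f, *true_targets) for f in samples]
targets = estimate_targets(samples, factors)
print(factors)
print(targets)
```

On noise-free synthetic data both steps recover the values used to generate the samples. With real measurements the residual of each fit would give a rough handle on the random variation in H3, and the initial targets and target constraints mentioned in the conclusions would need to be added.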
Results – H1
Vowel space (/i:, @, A, u/): dysarthric ~ normal
Results – H2
Coarticulation effects: 1
Conclusions
An approach to decompose the contributions of three factors:
target shift / coarticulatory effect / random variation
Practical aspects of the approach
Initial targets
Target constraints in estimation
Future work
Analysis of the entire trajectory
Apply results in the transformation system