Coarticulation Analysis of Dysarthric Speech

Xiaochuan Niu,
advised by Jan van Santen
Outline

Goal of the Dysarthria Project
Problems
Hypotheses
Analysis approach
Results
Conclusions
Goal of the Dysarthria Project

Dysarthria: motor speech impairment
  Dysarthric speech:
  Normal speech:
Improve the intelligibility of dysarthric speech:
  Dysarthric speech --> Speech transformation system --> Intelligible speech
Problems

Previous results of intelligibility test (by Hosom, Kain, et al.):
  Dysarthric: 68% --> Normal: 99%
Improvement potential:
  Spectral feature replacement: 87%
Baseline transformation system:
  GMM + linear transformation
  No improvement: 67%
Poor spectral separability of dysarthric speech
Hypotheses

Poor spectral separability caused by:
H1 - Target shift
  Dysarthric speakers develop special vocal-tract configurations for certain phonemes.
H2 - Coarticulation effect
  The degree of context influence on articulation is greater in dysarthric speech.
H3 - Random variation
  Dysarthric speakers cannot repeat the target vocal-tract configuration accurately.
Analysis approach

Acoustic measure
Speech data
Coarticulation model
Estimation method
Acoustic measure

Formants
  Natural frequencies of certain vocal-tract configurations
  Assumption: each phoneme has a target formant pattern
Formant trajectories
  Dynamic characteristics of articulation
Acoustic measure (example)

[Figure: formant trajectories (F1, F2, F3) of CVC’ segments; panels /b i: t/ and /b u t/]
Speech data

Utterance
  One dysarthric speaker and one normal speaker from a dysarthric database
  74 nonsense sentences per speaker
Phoneme
  Manually labeled with time alignments
Formants
  First three formant frequencies at the midpoints of vowels in CVC’ segments
  Automatically extracted and manually checked
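Coarticulation model

$$
\begin{aligned}
\vec{F}(t) &= \alpha(t)\,(\vec{T}_C - \vec{T}_V) + \vec{T}_V + \beta(t)\,(\vec{T}_{C'} - \vec{T}_V) \\
           &= \alpha(t)\,\vec{T}_C + [1 - \alpha(t) - \beta(t)]\,\vec{T}_V + \beta(t)\,\vec{T}_{C'}
\end{aligned}
$$

[Figure labels: $\vec{T}_C$, $\vec{T}_V$, $\vec{F}(t)$, $\vec{T}_{C'}$]

Notations:
  Observed formant vectors: $\vec{F}(t)$
  Target formant vectors: $\vec{T}_C$, $\vec{T}_V$, and $\vec{T}_{C'}$
  Coarticulatory factors: $\alpha(t)$ and $\beta(t)$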
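The model above can be read as a weighted combination of the three targets. A minimal Python sketch follows, assuming NumPy; the function name `model_formants`, the names `alpha` and `beta` for the two coarticulatory factors, and the target values in Hz are hypothetical and chosen only for illustration.

```python
import numpy as np

def model_formants(alpha, beta, t_c, t_v, t_c2):
    """Combine the targets of C, V and C' with weights alpha, 1 - alpha - beta, beta."""
    t_c, t_v, t_c2 = (np.asarray(x, dtype=float) for x in (t_c, t_v, t_c2))
    return alpha * t_c + (1.0 - alpha - beta) * t_v + beta * t_c2

# Hypothetical (F1, F2, F3) targets in Hz, not taken from the slides.
T_C = np.array([400.0, 1100.0, 2300.0])   # preceding consonant C
T_V = np.array([300.0, 2300.0, 3000.0])   # vowel V
T_C2 = np.array([400.0, 1800.0, 2600.0])  # following consonant C'

# Formant vector at one time point, e.g. the vowel midpoint.
F_mid = model_formants(0.1, 0.2, T_C, T_V, T_C2)
```

With both factors at zero the function returns the vowel target unchanged, which matches the reading that larger factors mean stronger influence from the neighboring consonants.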
Estimation method

Given N samples of observed formant vectors $\vec{F}^{(i)}$ at midpoints of vowels in CVC’ segments, assume target formant vectors are known,

$$
\vec{F}^{(i)} - \vec{T}_V^{(i)} =
\begin{bmatrix} \vec{T}_C^{(i)} - \vec{T}_V^{(i)} & \vec{T}_{C'}^{(i)} - \vec{T}_V^{(i)} \end{bmatrix}
\begin{bmatrix} \alpha^{(i)} \\ \beta^{(i)} \end{bmatrix},
\quad (i = 1 \sim N).
$$
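With known targets, each sample gives three equations (one per formant) in two unknown factors. The slides do not state which solver was used; the sketch below assumes an ordinary least-squares fit with NumPy, and the function name `estimate_factors` and its argument names are hypothetical.

```python
import numpy as np

def estimate_factors(f, t_c, t_v, t_c2):
    """Least-squares estimate of the two coarticulatory factors for one sample.

    f is the observed formant vector; t_c, t_v, t_c2 are the targets of C, V, C'.
    """
    f, t_c, t_v, t_c2 = (np.asarray(x, dtype=float) for x in (f, t_c, t_v, t_c2))
    # Columns are the consonant targets expressed relative to the vowel target.
    A = np.column_stack([t_c - t_v, t_c2 - t_v])  # shape (3, 2)
    coeffs, *_ = np.linalg.lstsq(A, f - t_v, rcond=None)
    return coeffs[0], coeffs[1]
```

The fit is only well determined when the two columns are not parallel, that is, when the two consonant targets differ from the vowel target in different directions.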
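With $\alpha^{(i)}$ and $\beta^{(i)}$ fixed, jointly estimate target formant vectors from equations

$$
\vec{F}^{(i)} =
\begin{bmatrix} \alpha^{(i)} I & \left(1 - \alpha^{(i)} - \beta^{(i)}\right) I & \beta^{(i)} I \end{bmatrix}
\begin{bmatrix} \vec{T}_C^{(i)} \\ \vec{T}_V^{(i)} \\ \vec{T}_{C'}^{(i)} \end{bmatrix}
\quad (i = 1 \sim N), \qquad
I = \begin{bmatrix} 1 & 0 & 0 \\ 0 & 1 & 0 \\ 0 & 0 & 1 \end{bmatrix}.
$$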
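The target-estimation step can be sketched as a linear least-squares problem. The sketch below makes a simplifying assumption that is not in the slides: all N samples come from the same CVC’ context, so they share one set of three targets. The name `estimate_targets` is hypothetical, and the choice of `numpy.linalg.lstsq` as the solver is also an assumption.

```python
import numpy as np

def estimate_targets(F, alpha, beta):
    """Jointly estimate the targets of C, V and C' with the factors held fixed.

    F: (N, 3) observed formant vectors; alpha, beta: length-N fixed factors.
    Assumes every sample shares the same three targets (single CVC' context).
    """
    F = np.asarray(F, dtype=float)
    alpha = np.asarray(alpha, dtype=float)
    beta = np.asarray(beta, dtype=float)
    # Each block of the stacked system is a scalar times the identity, so the
    # system decouples per formant: F = W @ T, where the rows of T are the targets.
    W = np.column_stack([alpha, 1.0 - alpha - beta, beta])  # shape (N, 3)
    T, *_ = np.linalg.lstsq(W, F, rcond=None)               # shape (3, 3)
    return T[0], T[1], T[2]
```

At least three samples with sufficiently different factor pairs are needed for a unique solution. Alternating this step with a per-sample estimate of the factors would require initial targets to start from.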

Results – H1

Vowel space (/i:, @, A, u/): dysarthric ~ normal
Results – H2

Coarticulation effects: $1 - \alpha - \beta$
Conclusions

An approach to decompose the contributions of three factors
  target shift / coarticulatory effect / random variation
Practical aspects of the approach
  Initial targets
  Target constraints in estimation
Future work
  Analysis of the entire trajectory
  Apply results in the transformation system