Vowels and speech production:
gender differences
Presentation by Lina Hecker
Speaker Characteristics
Venice International University
Prof. Dr. Jonathan Harrington
17 October 2007
Introduction:
• some analyses of female speech exist, but the focal
point has been the male voice
• the female voice has a higher frequency range
• men have been studied more and are regarded as the
standard to which everything else is compared
• this presentation covers some differences between
the speech of adult females and males
• focus on the dynamic articulatory and acoustic
consequences of differences in male and female vocal
tract dimensions
• and on the relationship between formant change and
tongue movement
1. What are the dynamic articulatory and
acoustic consequences of differences in
male and female vocal tract dimensions?
(Simpson 2001)
• Can be illustrated simply using the mid-sagittal
vocal tract dimensions in Goldstein (1980)
• Goldstein models vocal tract growth from infant to
adult and its acoustic products
• Goldstein draws together available anatomical
dimension data from a number of qualitatively and
quantitatively different sources
Conclusion of Figure 1:
• In the figure female stricture sizes are calculated
as 80% of the male values.
• It shows superimposed tongue positions for
female (solid) and male (dashed) [i] and [a].
• The distance from male [a] to [i] is ~11% greater
than the analogous female distance.
• If you assume the same nominal articulatory
speed and neglect inertia and acceleration, then
the male V–V movement will also take 11% longer.
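The arithmetic behind the last point can be sketched as follows. The 10 mm female distance is a hypothetical value chosen only for illustration, and the 200 mm/s speed is borrowed from the nominal value used for Figure 2; only the 11% ratio comes from the slide.

```python
def travel_time_ms(distance_mm, speed_mm_per_s):
    """Time in ms to cover a distance at constant speed (no inertia, no acceleration)."""
    return 1000.0 * distance_mm / speed_mm_per_s


# Hypothetical female [a]-[i] distance; NOT a value from Goldstein (1980).
female_distance_mm = 10.0
# Male distance is ~11% greater (figure from the slide).
male_distance_mm = female_distance_mm * 1.11
# Same nominal articulatory speed for both groups (assumed here: 200 mm/s).
speed_mm_per_s = 200.0

female_time_ms = travel_time_ms(female_distance_mm, speed_mm_per_s)
male_time_ms = travel_time_ms(male_distance_mm, speed_mm_per_s)
time_ratio = male_time_ms / female_time_ms

print(f"female: {female_time_ms:.1f} ms, male: {male_time_ms:.1f} ms")
print(f"male/female duration ratio: {time_ratio:.2f}")
```

Because time is distance divided by speed, the 11% difference in distance carries over unchanged to duration whenever the speed is held constant.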
2. The relationship between the size of oral
structures and its implications for
articulatory displacement and articulatory
velocity (Kuehn & Moll 1976)
• They showed that subjects with larger oral
structures had larger articulatory displacements
and employed greater articulatory velocity to
traverse larger articulatory spaces.
→ focused on the general consequences of
differences in oral structure size
→ did not discuss the more wide-ranging
implications of their findings for gender
specific consequences in articulatory
behavior and its acoustic products
Explanation of Figure 2:
• In the next figure you can see hypothetical male and
female F1 paths for an open–close vowel movement,
assuming the same nominal tongue body movement
speed of 200 mm/s.
→ the male acoustic trajectory lasts longer than the
female one
→ the linear acoustic rate of change of F1 for females
is ~35% greater than the male value
==> the female tongue covers a shorter distance to
achieve analogous targets, yet this corresponds to a
greater acoustic distance
Conclusion of Figure 2:
• males and females aim for analogous phonetic vowel
targets in CVC sequences
• if they move their articulators at the same speed, and
if they are operating within the same durational
framework
→ females reach their target earlier
• female degree of openness is greater than the male
one
• females exhibit less undershoot than males
→ undershoot increases from close to open vowel
categories
==> despite dimensional differences, targets can be
attained at approximately the same time, given a
difference in articulatory speed
3. The Relationship between formant change
and tongue movement
main articulatory-acoustic patterns found in diphthongs
(Simpson 2001)
• average male and female pellet and formant tracks
are similar in form
• female speakers cover a greater acoustic space
both in linear (Hz) and nonlinear (Bark) terms
• The articulatory distance covered by the two
posterior lingual pellets during the vocalic stretch is
greater for male speakers
• the dorso-tectal stricture size defined by the two
posterior lingual pellets is smaller for female
speakers throughout the vocalic stretch
• mean pellet speeds are greater for male than female
speakers
3.1. Data: UW-XRMBDB (Westbury, 1994)
• data set for examining gender-specific differences
in the relationship between articulation and its
acoustic products
• contains acoustic and articulatory records from 26
female and 22 male speakers (age 18-37) of an
Upper Midwest dialect of American English
• linguistic (e.g. reading text) and non-linguistic (e.g.
swallowing) tasks
• articulatory data consist of the positions of 8 gold
pellets
• 4 lingual pellets are placed along the midline of the
tongue
3.2. Method
Stretches of utterance were used to investigate the
dynamic relationship between acoustic and
articulatory activity; each had to fulfil 3 criteria:
1. large amounts of articulatory and acoustic
movement;
2. continuous voicing throughout the stretch to
facilitate reliable automatic formant tracking;
3. repetition by the same speaker of the same
expression containing a suitable stretch.
“The coat has a blend of both light and
dark fibers.”
“They all know what I said.”
A: Formant analysis
• analysis of the vocalic stretch of “they all” was
carried out with the ESPS program formant
• the nominal default value of F1 was increased by 10%
to 550 Hz for female speakers
• analysis times were extended by 25 ms beyond the
segment start and end times
• formant tracks of the 239 tokens were visually
checked for tracking errors
• each set of formant tracks was resampled to
provide 11 temporally equidistant formant records
• 11 points provide a good definition of formant
movement throughout the vocalic stretch
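The resampling step can be sketched as below. The slides do not say how the tracks were resampled, so linear interpolation is an assumption here, and `resample_track` and the example track are hypothetical names and values, not part of ESPS.

```python
def resample_track(times, values, n_points=11):
    """Linearly interpolate a formant track at n_points equally spaced times.

    Assumes `times` is strictly increasing and matches `values` in length.
    """
    if len(times) != len(values) or len(times) < 2:
        raise ValueError("need two or more (time, value) pairs of equal length")
    start, end = times[0], times[-1]
    step = (end - start) / (n_points - 1)
    resampled = []
    j = 0  # index of the frame at or before the current sample time
    for i in range(n_points):
        # Use the exact end time for the last point to avoid rounding drift.
        t = end if i == n_points - 1 else start + i * step
        while j < len(times) - 2 and times[j + 1] < t:
            j += 1
        weight = (t - times[j]) / (times[j + 1] - times[j])
        resampled.append(values[j] + weight * (values[j + 1] - values[j]))
    return resampled


# Hypothetical F1 track: one frame every 10 ms over a 200 ms vocalic stretch.
frame_times_ms = [10.0 * k for k in range(21)]
f1_hz = [400.0 + 1.5 * t for t in frame_times_ms]

f1_11 = resample_track(frame_times_ms, f1_hz)
print(f1_11)
```

Resampling every token to the same number of points makes tokens of different durations directly comparable point by point, which is what allows means and standard deviations to be computed across speakers.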
B: Pellet position
• pellet positions in the UW-XRMBDB are given in a
coordinate system
• The normalization method redefines the position of
the pellets on the tongue surface with respect to their
distance from the tip of the upper incisors.
• normalization makes it possible to compare values
from speakers with different palate outline lengths
• raw pellet positions were averaged separately for
males and females
• male and female average palate outlines were
created using individual palate outlines, resampled at
0.5 mm intervals
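A rough sketch of how individual palate outlines might be resampled and averaged is given below. It assumes that resampling is done along the horizontal axis, that every outline starts at the same x position, and that linear interpolation is adequate; none of this is confirmed by the slides, and the function names and outline values are hypothetical.

```python
def resample_outline(xs, ys, step_mm=0.5):
    """Linearly interpolate outline heights at every step_mm along x.

    Assumes `xs` is strictly increasing.
    """
    n = int((xs[-1] - xs[0]) / step_mm + 1e-9) + 1
    grid = [xs[0] + i * step_mm for i in range(n)]
    heights = []
    j = 0  # index of the outline point at or before the current grid position
    for x in grid:
        while j < len(xs) - 2 and xs[j + 1] < x:
            j += 1
        weight = (x - xs[j]) / (xs[j + 1] - xs[j])
        heights.append(ys[j] + weight * (ys[j + 1] - ys[j]))
    return grid, heights


def average_outlines(outlines, step_mm=0.5):
    """Average several outlines point by point over their shared length."""
    resampled = [resample_outline(xs, ys, step_mm)[1] for xs, ys in outlines]
    shared = min(len(h) for h in resampled)
    return [sum(h[i] for h in resampled) / len(resampled) for i in range(shared)]


# Two made-up outlines as (x positions in mm, heights in mm).
outline_a = ([0.0, 1.0, 2.0], [0.0, 2.0, 2.0])
outline_b = ([0.0, 2.0], [0.0, 4.0])

mean_outline = average_outlines([outline_a, outline_b])
print(mean_outline)
```

Putting every outline on a common 0.5 mm grid is what makes point-by-point averaging across speakers possible.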
3.3. Results
3.3.1. Duration:
• a one-tailed t-test for the V-V stretches shows that
the mean female duration is greater than the male
one
→ by contrast, no significant difference was found
between the male and female durations of the whole
utterances
• other studies have also found longer female
durations for diphthongs (Simpson 2001) and
monophthongs (Hillenbrand, Getty, Clark & Wheeler
1995)
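The kind of comparison behind the duration result can be sketched as follows. The slides do not state which t-test variant was used, so Welch's unequal-variance form is an assumption, and the durations are invented values, not data from the UW-XRMBDB.

```python
from math import sqrt
from statistics import mean, variance


def welch_t(sample_a, sample_b):
    """Return Welch's t statistic and its approximate degrees of freedom."""
    var_a = variance(sample_a) / len(sample_a)  # squared standard error of mean a
    var_b = variance(sample_b) / len(sample_b)  # squared standard error of mean b
    t = (mean(sample_a) - mean(sample_b)) / sqrt(var_a + var_b)
    df = (var_a + var_b) ** 2 / (
        var_a ** 2 / (len(sample_a) - 1) + var_b ** 2 / (len(sample_b) - 1)
    )
    return t, df


# Hypothetical V-V stretch durations in ms (illustration only).
female_ms = [210.0, 225.0, 240.0, 230.0, 220.0]
male_ms = [200.0, 205.0, 215.0, 195.0, 210.0]

t_value, dof = welch_t(female_ms, male_ms)
print(f"t = {t_value:.3f}, df = {dof:.1f}")
```

For a one-tailed test of "female duration is greater", the hypothesis is supported when t is positive and exceeds the one-tailed critical value for the computed degrees of freedom.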
3.3.2. Formant tracks
• means and standard deviations of F1-F3 were
calculated at the 11 equidistant measurement points
for male (right) and female (left) tokens (next figure)
• formant values for the V-V stretch of “they all” can
only be compared cautiously with the results found in
the literature, for two reasons:
1. speakers in the UW-XRMBDB speak an Upper
Midwest American English
2. vowels are from the initial part of the utterance,
particularly “they” being utterance-initial, unstressed
and preceding a stressed back open vowel →
expect a more centralized vowel than you would
find in isolation or utterance finally
• In the next figure you can see a graphical comparison
of the mean male and female formant tracks,
converted to the Bark scale.
• In linear (Hz) terms, the female acoustic excursion
within the vocalic stretch is greater for both F1 and F2
• In non-linear (Bark) terms, the situation is different.
The mean tracks for F2 and F3 run parallel, with little
change in the distance between them throughout the
vocalic stretch.
• the difference in mean F1 is 0.74 Bark at the beginning
and 1.58 Bark (more than twice as large) by the end of
the stretch
→ suggesting a closer male vowel or a more open
female quality
→ the more open the vowel quality, the larger the
difference becomes between female and male F1
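The Hz-to-Bark conversion can be sketched as below. Traunmüller's (1990) approximation is used here as an assumption, since the slides do not say which formula the study applied, and the frequencies are hypothetical.

```python
def hz_to_bark(freq_hz):
    """Convert frequency in Hz to Bark using Traunmueller's (1990) approximation."""
    return 26.81 * freq_hz / (1960.0 + freq_hz) - 0.53


# The same 100 Hz step is a larger auditory distance at low frequencies
# (F1 region) than at high frequencies (F2/F3 region).
low_step_bark = hz_to_bark(600.0) - hz_to_bark(500.0)
high_step_bark = hz_to_bark(2100.0) - hz_to_bark(2000.0)

print(f"500 -> 600 Hz:   {low_step_bark:.2f} Bark")
print(f"2000 -> 2100 Hz: {high_step_bark:.2f} Bark")
```

This compression at higher frequencies is one reason why a difference that looks large in Hz for F2 or F3 can shrink on the Bark scale, while F1 differences remain comparatively prominent.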
Explanation of figure 5:
• during the vocalic stretch the tongue body makes a
small upward movement before moving backwards and
downwards
• F1 is determined by the apico-dental stricture of “they”
over the initial part of the stretch (t1-t4)
• at the final part of the stretch (t5-t11) the tongue body is
lowered, resulting in an increase in the size of the
dorso-palatal stricture defined by T2–T4
• F2 rises (t2-t4) to reach a plateau at (t3-t4) for the
closing phase of the diphthong
• F2 falls continuously as dorso-palatal stricture size
increases and the tongue moves back
• rise in F3 can be related to the lowering and backing of
the tongue body causing pharyngeal narrowing
3.3.3. Pellet position and speeds
• Fig. 6 shows the positions of the 4 lingual pellets
T1-T4 at each of the 11 measurement points for
female and male speakers
• transformed and normalized values are shown in (a)
• in (b) raw values are plotted together with average
palate outlines and pharynx line segments
• arrows indicate the direction of movement over time
• (b) shows the mean size, shape and location of the
male and female pellet trajectories
• in the transformed data (a), the palate has been
‘flattened’
→ must be interpreted more carefully
Explanation of Figure 6:
• both transformed and raw data bring out the larger
male dorso-palatal strictures defined by T3 and T4
• laminal and apical strictures are not different for
males and females
• transformed data encode the distance between
the palate and the pellets
• higher location of the female trajectories shows
the different stricture size (T3-T4)
• a t-test shows that for females the palate-pellet
distance for T3-T4 is smaller
• average lengths of the pellet trajectories during the
vocalic stretch are shorter for females
• the posterior male lingual pellets T3-T4 travel a
greater distance than the female pellets; this stands in
contrast to the smaller acoustic space traversed by
the male speakers
• these gender differences stand in contrast to findings
in Hashi et al. (1998), where no gender influence on
isolated vowel tokens was found
• the male dorsum travels a greater distance in a shorter
time period (see 3.3.1. Duration) than the female one
because the mean speed of the male posterior pellets
(T3-T4) is higher
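Trajectory length and mean speed for a single pellet can be sketched as follows, assuming the trajectory is approximated by straight segments between successive measurement points. The coordinates and duration are made up, and the function names are hypothetical.

```python
from math import hypot


def path_length_mm(points):
    """Sum of straight-line distances between successive (x, y) positions in mm."""
    return sum(
        hypot(x1 - x0, y1 - y0)
        for (x0, y0), (x1, y1) in zip(points, points[1:])
    )


def mean_speed_mm_per_s(points, duration_s):
    """Average pellet speed over the stretch: path length divided by duration."""
    return path_length_mm(points) / duration_s


# Hypothetical pellet positions (mm) at three successive measurement points.
pellet_track = [(0.0, 0.0), (3.0, 4.0), (6.0, 8.0)]
stretch_duration_s = 0.05  # hypothetical duration

length_mm = path_length_mm(pellet_track)
speed = mean_speed_mm_per_s(pellet_track, stretch_duration_s)
print(f"length: {length_mm:.1f} mm, mean speed: {speed:.0f} mm/s")
```

A longer trajectory covered in a shorter time necessarily gives a higher mean speed, which is the pattern reported for the male posterior pellets.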
Explanation of Figure 7:
• the next figure summarizes the average pellet speeds
at each of the 11 measurement points
• for the anterior pellets T1-T2 the male and female
speed is not significantly different over the whole
vocalic stretch
• for T3-T4 the initial and final portions are similar as
well
• where the mean speeds of T3-T4 are at their highest,
however, male speeds are significantly higher
→ compensation by both males and females is
necessary to achieve the same targets, despite
differences in articulatory space
Conclusion of Figure 7:
• gender-specific stricture differences are restricted to
the posterior region of the oral cavity
→ the degree of male palatal doming is higher and
creates a greater articulatory space to cross
• there are nonuniform differences in the relation of oral
to pharyngeal cavity length and nonuniform
differences in palate shape
→ this has nonuniform dynamic consequences for
tongue movement
4. Discussion
• for the same V-V sequences male and female tongue
movements and their acoustic and perceptual
products are similar in shape and structure
• difference between male and female F1 increased
acoustically with the degree of vowel openness
• male speakers had a shorter stretch duration
→ the speed of tongue dorsum displacement was higher
• size of male and female articulatory spaces is different
and stands in an inverse relationship to the size of
their acoustic products
• for the V-V sequences male and female pellet tracks
have a similar form and differ only in size and position
• male and female speakers are operating with similar
speeds of tongue movement
• assume that the slower (female) articulatory
movements require more time and faster (male) ones
less
• larger vowel space for women
→ women speak more clearly and articulate more,
because this is the prestige form for female speakers
• women produce longer vowels than men
• possibly speakers adopt different articulatory
strategies to arrive at tokens of the same phonological
categories
→ many of the hypothetical consequences are
speculation
→ no proof whether 2 speakers aim for similar targets
when they produce tokens of the same phonological
categories in a language
→ no classification that tokens of the same phonological
categories are equivalent in articulatory, acoustic and
perceptual terms
• several experiments draw conclusions based on a few
informants
→ tendencies might be individual rather than gender
based
• many reasons for difference between male and
female speech
→ women tend to have a greater variation in their
speech
→ female speech has been seen as more difficult to
analyse
5. References
• Simpson, A. P. (2002). Gender-specific
articulatory-acoustic relations in vowel sequences.
Journal of Phonetics, 30(3):417-435.
• Simpson, A. P. (2001). Dynamic consequences of
differences in male and female vocal tract dimensions.
Journal of the Acoustical Society of America,
109(5):2153-2164.
• Samuelsson, Y. (2006) Gender effects on phonetic
variation and speaking styles: A literature study. GSLT
Speech Technology Term Paper, autumn 2006.
• Goldstein, U. (1980) An articulatory model for the
vocal tracts of growing children. Ph. D. Thesis, MA:
M.I.T.
• Hashi, M., Westbury, J. R. & Honda, K. (1998)
Vowel posture normalization, Journal of the
Acoustical Society of America, 104, 2426–2437.
• Johnson, K., Ladefoged, P. & Lindau, M. (1993)
Individual differences in vowel production, Journal of
the Acoustical Society of America, 94, 701–714.
• Kuehn, D. P. & Moll, K. L. (1976) A cineradiographic
study of VC and CV articulatory velocities, Journal
of Phonetics, 4, 303–320.
Thank you for listening!