Perceptual compensation for /u/-fronting in American English
KATAOKA, Reiko
Department of Linguistics, University of California at Berkeley, 1203 Dwinelle Hall, Berkeley, California 94720-2650
Materials and Methods (cont.)
Listeners' identification of speech sounds is influenced by both
perceived and expected characteristics, owing to the influence of
surrounding sounds. For example, a vowel ambiguous between /i/
and /e/ is heard more often as /e/ when the precursor sentence has
low F1, but more often as /i/ when the precursor has high F1
(Ladefoged & Broadbent 1957), and a greater degree of vowel
undershoot is perceptually accepted in fast speech than in slow
speech (Lindblom & Studdert-Kennedy 1967).
Later, Ohala and Feder (1994) showed that American
listeners judge a vowel stimulus that is ambiguous between /i/
and /u/ more frequently as /u/ in an alveolar context than in a
bilabial context, and do so both when the context is heard and when
it is “restored”. The current study attempts to extend Ohala &
Feder’s study with additional measures to reveal the locus of
perceptual compensation in the human speech processing system.
Presentation:
• Each stimulus was played after the precursor “I guess the word is…”.
• Each stimulus was tested 5 times.
• Fillers ([CiC], [CɨC], [CuC], and [CʉC], each three times) were
mixed into the trials to create genuine vowel quality variation.
• Test stimuli and fillers were presented in random order.
• During a trial, the two alternative words (/CiC/ and /CuC/ forms)
appeared on the screen (see Table 1). The words on the screen
prompted subjects to ‘restore’ the onset and coda phonemes
even when they heard the stimulus with white noise.
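The presentation scheme above can be sketched as a randomized trial list. This is only an illustration: the stimulus labels are hypothetical names for the six test stimuli (3 contexts × 2 statuses), and the sketch assumes a single set of four fillers, since the poster does not say whether fillers were built separately for each consonantal context.

```python
import random

def build_trials(test_stimuli, fillers, test_reps=5, filler_reps=3, seed=None):
    """Repeat test stimuli and fillers, then shuffle them into one random order."""
    trials = [s for s in test_stimuli for _ in range(test_reps)]
    trials += [f for f in fillers for _ in range(filler_reps)]
    random.Random(seed).shuffle(trials)
    return trials

# Hypothetical labels: context_status for test items, vowel frame for fillers.
test_stimuli = [f"{c}_{s}" for c in ("alveolar", "bilabial", "zero")
                for s in ("acoustic", "restored")]
fillers = ["CiC", "CɨC", "CuC", "CʉC"]

trials = build_trials(test_stimuli, fillers, seed=1)
print(len(trials), trials[:5])
```

Passing a seed makes the order reproducible for a given subject; omitting it gives a fresh order on each run.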
Task: Two-alternative forced choice. Subjects indicated which of the
two words on the screen they had just heard.
Subjects: 31 native speakers of American English (10-M, 21-F)
Factors:
1) Consonantal context (3 conditions: Alveolar, Bilabial, Zero)
2) Status of context (2 conditions: Acoustic or Restored)
Stimuli: Three continua of 10 syllables each, varying in vowel backness
(i.e., /dit/ - /dut/, /bip/ - /bup/, and /i/ - /u/ continua), were
created by the following process:
• Create a 10-step, equal-step /i/ - /u/ continuum. The voice source was
extracted from a sustained vowel produced by the speaker of the
precursor sentence, by inverse filtering with the LPC object of the
vowel, so that the voice of the stimuli matches the voice of the
precursor. All 10 vowels have identical formant structures except
for the F2 value, which varied from 2,300 Hz (step 1) to 860 Hz
(step 10). See Fig. 2.
• Add amplitude contour and F0 contour as in Exp. 1.
• Add onset and coda consonant bursts as in Exp. 1.
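As a rough illustration of the continuum described above, the sketch below lists F2 for each step. It assumes the ten steps are equally spaced in Hz; the poster says only "equal-step", so spacing on another scale (e.g., Bark or log) is possible. The function name is hypothetical.

```python
def f2_continuum(f2_start=2300.0, f2_end=860.0, n_steps=10):
    """Return F2 (Hz) for each step, assuming equal spacing in Hz."""
    step = (f2_end - f2_start) / (n_steps - 1)
    return [f2_start + i * step for i in range(n_steps)]

f2_values = f2_continuum()
for i, f2 in enumerate(f2_values, start=1):
    print(f"step {i:2d}: F2 = {f2:.0f} Hz")
```

Under this assumption, adjacent steps differ by 160 Hz.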
Presentation and Task: Same as Experiment 1 except that each
stimulus was tested 4 times.
Subjects: Same as Experiment 1.
Factor: Consonantal context (3 conditions: Alveolar, Bilabial, Zero).
Status         Alveolar           Bilabial           Zero
Acoustic       [dyt]              [byp]              [y]
Restored*      [NyN]              [NyN]              [NyN]
Alternatives   ‘deet’ or ‘doot’   ‘beep’ or ‘boop’   ‘ee’ or ‘oo’
EXPERIMENT 2
Methods were the same as in Experiment 1, except that the precursor
was uttered at 3 different speech rates.
Subjects: Same as Experiment 1.
Factors: Precursor speech rate (3 conditions: fast, medium, slow).
EXPERIMENT 3
Production
Subjects uttered the sentence “Say who again.” 10 times. The F2
value was measured at the midpoint of the vowel in ‘who’. Then,
using each subject’s additional vowel productions (/i, ɪ, e, æ, a,
ɔ, ʌ, ʊ, u/, 10 times), the normalized mean F2 value for /u/ was
obtained as the difference from the mean F2 of all the vowels on the
natural-log scale (LN).
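The normalization can be sketched as follows. This is one possible reading, not necessarily the exact procedure used: it assumes each F2 value is log-transformed before averaging, and the token values are made up for illustration.

```python
import math

def normalized_f2(u_f2_hz, all_vowel_f2_hz):
    """Mean ln(F2) of /u/ tokens minus mean ln(F2) over all vowel tokens."""
    mean_ln_u = sum(math.log(f) for f in u_f2_hz) / len(u_f2_hz)
    mean_ln_all = sum(math.log(f) for f in all_vowel_f2_hz) / len(all_vowel_f2_hz)
    return mean_ln_u - mean_ln_all

# Hypothetical F2 measurements (Hz) for one speaker; not data from the study.
u_tokens = [1100.0, 1150.0, 1050.0]
all_tokens = [2300.0, 1900.0, 1800.0, 1700.0, 1200.0, 1000.0, 1300.0, 1100.0] + u_tokens

score = normalized_f2(u_tokens, all_tokens)
print(f"normalized F2 of /u/: {score:.2f}")
```

A more negative score corresponds to a backer /u/ relative to the speaker's overall vowel space, and a score closer to zero to a more fronted /u/.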
Perception
Figure 1. Waveforms of a /dVt/ stimulus used in Acoustic
Context condition (a) and Restored Context condition (b).
Table 2 data (Experiment 1):
Cond.      Resp.   Alveolar   Bilabial   Zero   Total
Restored   /i/         9         22       16      47
           /u/       145        132      138     415
           Total     154        154      154     462
Acoustic   /i/         9         36       10      55
           /u/       146        119      145     410
           Total     155        155      155     465
G. Total             309        309      309     927

Table 3 data (Experiment 2; the Fast/Med./Slow label of each /i/, /u/
row pair is not recoverable from the extracted layout):
Resp.      Alveolar   Bilabial   Zero   Total
/i/            8         52       35      95
/u/          147        103      120     370
/i/            7         53       11      71
/u/          148        101      144     393
/i/            3         26        6      35
/u/          152        129      149     430
G. Total     465        464      465    1395

[Figure 4: bar chart of mean Reaction Time (ms) for /u/ responses by
Context (Alveolar, Bilabial, Zero) and precursor Rate (Fast, Med, Slow).
Error bars show mean +/- 1.0 SE.]
Effect of Context
Fast: [F(2, 48) = 6.0, p<0.05]
Med: [F(2, 52) = 9.3, p<0.001]
Slow: [F(2, 60) = 10.2, p<0.001]
Figure 4. Reaction Time for /u/ response.
Results: Experiment 3
Chi-Square test for association: number of /u/ responses and Contexts
Acoustic Condition [χ2(2) = 3.429, p = 0.180]
Restored Condition [χ2(2) = 0.612, p = 0.736]
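The reported statistics can be checked with a short calculation. The sketch below assumes the test is a one-way (goodness-of-fit) chi-square on the /u/-response counts across the three contexts, with equal expected counts. The counts are the /u/ rows of Table 2, and the pairing of each row with its condition is inferred from the fact that this reading reproduces the reported χ2 values. For 2 degrees of freedom the p-value has the closed form exp(-χ2/2).

```python
import math

def chi_square_equal_expected(counts):
    """Chi-square statistic against equal expected counts, with its df."""
    expected = sum(counts) / len(counts)
    stat = sum((c - expected) ** 2 / expected for c in counts)
    return stat, len(counts) - 1

def p_value_df2(stat):
    """Upper-tail p-value of a chi-square statistic; valid only for df = 2."""
    return math.exp(-stat / 2.0)

# /u/-response counts in Alveolar, Bilabial, Zero contexts (from Table 2).
u_counts = {"Acoustic": [146, 119, 145], "Restored": [145, 132, 138]}

results = {}
for condition, counts in u_counts.items():
    stat, df = chi_square_equal_expected(counts)
    results[condition] = (stat, df, p_value_df2(stat))
    print(f"{condition}: chi2({df}) = {stat:.3f}, p = {results[condition][2]:.3f}")
```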
[Figure 5: line plot of % /u/ Response (0 to 100) against Stimulus #
(1 to 10), one line per Consonant context (Alveolar, Bilabial, Zero).
Error bars: 95% CI.]
Figure 5. % /u/-response as a function of stimulus number, in three
different consonant conditions.
[Figure 6 linear fit: R Sq Linear = 0.168]
[Figure 3: bar chart of mean Reaction Time (ms) for /u/ responses by
Context (Alveolar, Bilabial, Zero) and Condition (Restored context,
Acoustic context). Error bars show mean +/- 1.0 SE.]
Effect of Contexts
Restored: [F(1.5, 43.3) = 4.5, p<0.05]
Acoustic: [F(1.8, 52.9) = 3.0, p=0.06]
Figure 3. Reaction Time for /u/ response.
•Alveolar context shortens the RT for /u/-identification (Experiments
1 & 2). This suggests that the perceptual system not only shifts the
perceptual category boundary but also facilitates perceptual
processing for a category that is contrastive to the context.
•Speech rate of the precursor influenced vowel identification in the
Bilabial and Zero contexts (Experiment 2), suggesting that listeners
can perform not only categorical but also gradient compensation. The
effect was absent in the Alveolar context. This could be due to a
ceiling effect in this particular experimental setting. The effect
needs to be re-examined with an improved method.
•A mild correlation was found between /u/ identification in perception
and F2 measured from /u/ production (Experiment 3). Among those
who produced /u/ with relatively low F2 (back /u/), many also
accepted fronted /u/s as category members in the perceptual vowel
categorization task. Although previous studies (e.g. Harrington et
al., 2008) demonstrated a robust link between production and
perception, language users seem to be much more tolerant of
speech variation (e.g. dialects, sociolects, style shifts, etc.) that
differs from their own production patterns.
Citation
Figure 6. Scatter plot of the
perceptual /u/ space against
normalized F2 of /u/.
•Listeners heard an ambiguous stimulus more often as /u/ in the
Alveolar than in the Bilabial or Zero context, and did so in both the
Acoustic and Restored phoneme conditions (Experiment 1). Also, the /u/
identification function shifted as a function of consonantal context
(Experiment 3). These results confirm listeners’ ability to
perceptually compensate for coarticulatory fronting of /u/ in alveolar
contexts by shifting the category boundary. They also seem to show
listeners’ ability to utilize mentally stored acoustic images of
linguistic contexts to perform compensation.
back [u] ------------------------------------------- fronted [u]
Normalized F2 of /u/
Figure 2. Spectrograms of the 10-step /i/ - /u/ continuum. Formant
frequencies (Hz) of the vowels are: F1 = 375, F2 = variable, F3 =
2500, F4 = 3500, and F5 = 4500.
Table 2. Number of responses by Context and Condition
(5 repetitions × 31 listeners = 155 responses per cell)
narrow /u/ space ------------------ wide /u/ space
EXPERIMENT 1
Stimuli: Re-synthesized syllables [dyt], [byp], and [y] (used in the
Acoustic condition) and another set in which white noise replaced the
consonants (used in the Restored condition) were created by:
• Iterating a single vowel period from the CV transition of ‘dude’ to
obtain a steady vowel (i.e. no formant transition) of 100 ms
• Adding an amplitude contour over the first and last 15 ms of the vowel
• Adding an F0 contour: 130 Hz at onset to 90 Hz at offset
• Adding an excised natural stop burst (/b/ or /d/) at vowel onset and
another (/p/ or /t/) 70 ms after the vowel offset (Fig. 1, a), or
• Adding white noise in place of the natural stop bursts (Fig. 1, b)
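The amplitude and F0 contours in the steps above can be sketched numerically. The poster gives only the durations and endpoint values, so the linear ramp shape and the linear F0 fall are assumptions, and both function names are hypothetical.

```python
def amplitude_envelope(t_ms, duration_ms=100.0, ramp_ms=15.0):
    """Gain in [0, 1]: linear fade-in and fade-out over ramp_ms (assumed shape)."""
    if t_ms < ramp_ms:
        return t_ms / ramp_ms
    if t_ms > duration_ms - ramp_ms:
        return (duration_ms - t_ms) / ramp_ms
    return 1.0

def f0_contour(t_ms, duration_ms=100.0, f0_onset=130.0, f0_offset=90.0):
    """F0 in Hz, falling linearly (assumed) from onset to offset."""
    return f0_onset + (f0_offset - f0_onset) * (t_ms / duration_ms)

for t in (0, 15, 50, 85, 100):
    print(f"t = {t:3d} ms  gain = {amplitude_envelope(t):.2f}  F0 = {f0_contour(t):.0f} Hz")
```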
Perceptual /u/ space
Materials and Methods
Results: Experiment 1
Summary and Discussion
Table 3. Number of responses by Context and Condition
(5 repetitions × 31 listeners = 155 responses per cell)
•To replicate Ohala & Feder’s findings and further investigate the
context effect on Reaction Time (RT) (Experiment 1)
•To test the effect of speech rate on the degree of perceptual
compensation (Experiment 2)
•To investigate the relationship between speech production and
perceptual compensation (Experiment 3)
Table 1. Factors tested, stimuli presented (e.g. [dyt]), and two
alternatives shown on the screen for each stimulus.
Objectives
Results: Experiment 2
Introduction
Figure 7. Metric for obtaining
the perceptual space for the
vowel /u/. The space is
defined as an area under the
% /u/-response function.
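The metric in Figure 7 can be sketched as a numerical integration. The example below assumes the area is taken with the trapezoidal rule over stimulus numbers 1 to 10 with unit spacing; the poster does not state the integration method or units, and the response percentages are made up for illustration.

```python
def perceptual_space(percent_u):
    """Area under the % /u/-response function (trapezoidal rule, unit step spacing)."""
    return sum((a + b) / 2.0 for a, b in zip(percent_u, percent_u[1:]))

# Hypothetical % /u/ responses for stimuli 1..10 (not data from the study).
narrow = [0, 0, 0, 5, 20, 60, 90, 100, 100, 100]
wide = [0, 5, 20, 50, 80, 95, 100, 100, 100, 100]

print(perceptual_space(narrow), perceptual_space(wide))
```

A listener who starts accepting /u/ at more fronted (lower-numbered) stimuli gets a larger area, i.e. a wider perceptual /u/ space.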
Harrington, J., Kleber, F., and Reubold, U. (2008). “Compensation
for coarticulation, /u/-fronting, and sound change in standard
southern British: An acoustic and perceptual study,” J. Acoust.
Soc. Am. 123, 2825-2835.
Ladefoged, P., and Broadbent, D. E. (1957). “Information conveyed
by vowels,” J. Acoust. Soc. Am. 29, 98-104.
Lindblom, B., and Studdert-Kennedy, M. (1967). “On the role of
formant transitions in vowel recognition,” J. Acoust. Soc. Am. 42,
830-843.
Ohala, J. J., and Feder, D. (1994). “Listeners’ identification of
speech sounds is influenced by adjacent ‘restored’ phonemes,”
Phonetica 51, 111-118.