Linguistics 460


Li6 Phonology and Morphology
Rules
Lecture plan
Key point of tension between symbolic rationalists and numerical reductionists:
 Do humans extract generalisations from the data in their perceptual worlds?
 Put differently, is the mind a Turing Machine or a recurrent switching network?
Evidence for rules
What form rules take
 Degree of specificity
 Formalism
Turing machine vs switch network
 Turing machine: + memory, + rules/algorithms/generalisations
 Switch network: -
Arguments for Turing machine
(or against connectionism)

Gallistel 2006
 Dead reckoning
 Bee dances
 Temporal learning in conditioning experiments
  truly random control (Rescorla 1968)
  Blocking (Kamin 1969)
Minsky and Papert 1969 on 2-layer networks:
 Exclusive OR (thanks to Marc)
 Can’t correctly indicate at its output neuron (or neurons) whether there are an even or an odd number of neurons firing in its input layer
Berent et al. 2006 on plurals in English compounds
Vaux
 MSCs, as we’ll see later
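The Exclusive OR point can be illustrated with a minimal sketch. It assumes that "2-layer" means an input layer wired directly to one threshold output unit, with no hidden units; the names `fires`, `find_unit` and `GRID` are illustrative only. A search over a grid of weights finds a unit for AND but none for XOR. The grid search only illustrates the limitation; it does not prove it.

```python
from itertools import product

def fires(w1, w2, bias, x1, x2):
    """Single threshold unit: fires (1) when the weighted input sum exceeds 0."""
    return int(w1 * x1 + w2 * x2 + bias > 0)

def find_unit(truth_table, grid):
    """Return the first (w1, w2, bias) on the grid that reproduces the table, else None."""
    for w1, w2, bias in product(grid, repeat=3):
        if all(fires(w1, w2, bias, x1, x2) == y
               for (x1, x2), y in truth_table.items()):
            return (w1, w2, bias)
    return None

AND = {(0, 0): 0, (0, 1): 0, (1, 0): 0, (1, 1): 1}
XOR = {(0, 0): 0, (0, 1): 1, (1, 0): 1, (1, 1): 0}  # odd number of inputs firing

# Assumed search space: weights and bias from -4.0 to 4.0 in steps of 0.5.
GRID = [i / 2 for i in range(-8, 9)]

print("AND:", find_unit(AND, GRID))
print("XOR:", find_unit(XOR, GRID))
```

XOR on two inputs is the smallest case of the even/odd problem: the output has to track whether an odd number of input neurons is firing.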
URSR mappings and rules


We saw in lecture 1 that humans store both abstract
underlying representations (URs) and more concrete
surface representations (SRs)
How does one get from one type of representation to
the other?

Hypothesis 1:
Hypothesis 2:

Hypothesis 3a:

Hypothesis 3b:

Why favor
this one?
each is simply memorized
URSR mappings encoded in
associative/connectionist network
All URs are transformed into SRs (and
perhaps vice versa) by an ordered series of
rules
Only regular URSR mappings involve rules
Animals extract
generalisations
Generalisation by animals
Generalisation by infants

Marcus et al 1999

Question


Do infants extract linguistic generalisations, and in what form?
Method
 16 infants randomly assigned to one of two groups, each familiarized with 2-minute speech sample
  ABA group: 3 reps of each of 16 3-word sentences from ABA grammar (ga ti ga, li na li, etc.)
  ABB group: same with ABB grammar (ga ti ti, etc.)
 After habituation, testing on sentences of 3 novel nonce words
  test sentences varied as to whether they were consistent or inconsistent with the grammar of the habituation sentences.
  Because none of the test words appeared in the habituation phase, infants could not distinguish the test sentences based on transitional probabilities, and because the test sentences were the same length and were generated by a computer, the infant could not distinguish them based on statistical properties such as number of syllables or prosody.
Results
 [Figure: mean time spent looking in the direction of the consistent and inconsistent stimuli in each condition for experiments 1, 2, and 3.]
 The infants attend longer to sentences with unfamiliar structures.
Conclusions

“Results suggest that infants can represent, extract, and generalize abstract
algebraic rules.”
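One way to picture an "abstract algebraic rule" is as a pattern over variables rather than over particular words. The sketch below is an illustration only, not the authors' model: the helper `shape` and the test words `wo` and `fe` are hypothetical. It maps any 3-word sentence to its ABA/ABB shape, so words never heard during habituation can still be classified as consistent or inconsistent.

```python
def shape(sentence):
    """Label each word by order of first appearance: 'ga ti ga' -> 'ABA'."""
    letters = {}
    out = []
    for word in sentence.split():
        if word not in letters:
            letters[word] = chr(ord("A") + len(letters))
        out.append(letters[word])
    return "".join(out)

def consistent(habituation, test):
    """A test sentence is consistent if it has the same abstract shape."""
    return shape(habituation) == shape(test)

print(shape("ga ti ga"), shape("ga ti ti"))
print(consistent("ga ti ga", "wo fe wo"))   # novel words, same ABA shape
print(consistent("ga ti ga", "wo fe fe"))   # novel words, ABB shape
```

Because `shape` never looks at which words occur, transitional probabilities between particular words play no part in the classification.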
Conclusions about
generalisation extraction



Ample evidence that humans extract
generalisations from patterns of data in the
real world
These are directly captured in rules
These are not captured insightfully (or
sometimes at all) by switch-network models
(surface constraints, connectionist networks)
Evidence for rules
Internal evidence
A typical line of argumentation

When does glottalization occur?




sat
Atlantic, atmosphere, coat-tails
tap, atrocious
Since glottalization/unrelease is predictable, we don’t
want this to be part of the underlying representation,
under the assumption that speakers don’t store
redundant information.
If this is the case, we need a rule to glottalize stops in
the appropriate environments.



What form should this rule take?
External evidence

Productivity
 Child and adult Wug tests, e.g. Pinker and Ullmann on novel plurals
Speech therapy
 Click girl undoing her problem with lightning quickness
Syllable deletion in speech errors
 unanímity [junnImRi] → unámity [junQmRi]
 treméndously [tHrmEndsi] → trémenly [tHrEmni]
 specifícity [spEsfIsRi] → specífity [spEsIfRi]
 What is the error in each case?
 What sort of rules do we need to account for the outcome of these errors?
 We need a rule to assign a new stress in these words; if there were no rules, we should expect the forms to be stressless
First-language acquisition phenomena
 Over-regularization (goed for went, etc.)
Transfer in second-language acquisition
 Speakers have trouble suppressing L1 rules
  Japanese/Korean palatalization, epenthesis
  English aspiration
What about extraction of
generalisations from
less clear patterns?
Morphophonemic rules
Static patterns
What about generalisations that
have exceptions?

English Vowel Shift is productive for some
speakers for some vowels


Cena 1978, Jaeger 1980, McCawley 1986
Pierrehumbert 2002

English /k/ → [s] / _ i in Latinate contexts



electric-ity vs cheek-y → *chee[s]y
Is the rule active, or just a historical remnant?
Method

ADJN (back formation)


NADJ (forward formation)


In Pierre’s entire career as a curator, he had never
before seen such a perfect example of hovacity. It
was an electrifyingly ______ sculpture.
Before Pierre stood an electrifyingly hovac sculpture.
In his entire career as curator, he had never before
seen such a perfect example of ______.
Results

The alternation was productive, but only for Latinate
and semi-Latinate targets.
[Figure: %s (0 to 100) for latinate, semi-latinate, and non-latinate targets.]
What about generalisations that
show no alternations?







Esper 1925
 Test subjects break up nonce words into morphemes based on
phonotactics of their L1
Moreton 1999
 Speakers have active knowledge of constraint on monosyllables ending
in lax vowel, which they use in speech perception
Pater and Tessier 2003
 toy grammars easier to acquire when their alternations conform to
phonotactic generalizations in their L1
Dell et al 2000
 speech errors conform to phonotactics of data in toy language
Cebrian 2002
 native English speakers, and Catalan learners of English, use this
restriction in interpreting the morphological composition of nonce words.
Vaux 2003
 Productivity of MSCs
Kaun and Harrison 1999 on Tuvan reduplication…
Tuvan overwriting reduplication

Common assumption among phonologists:




Non-alternating structure is stored as such in underlying forms.
Alternating structure is not stored in URs.
Alternation Condition (Kiparsky ‘68), Lexicon Optimization (P&S ‘93)
Kaun and Harrison 1999:
 Observation: Tuvan VH: all vowels in a root agree wrt [back]
 Question: does vowel harmony apply to non-alternating forms?
 Method: teach subjects Jocular Reduplication; see if new V triggers root harmony
  Replace first vowel of root with [a]: nom ‘book’ → nom-nam
  If root vowel is [a], replace it with [u]: at ‘name’ → at-ut
 Results: harmonic forms reharmonize, disharmonic forms don’t
  Harmonic words: idik ‘boot’ → idik-adık (not *adik)
  Disharmonic words: mašina ‘car’ → mašina-mušina (*mušı/una)
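The reduplication procedure and the reported outcome can be sketched as follows. Several things here are assumptions for illustration: the eight-vowel inventory and the front/back pairings in `FRONT_TO_BACK`, the helper name `jocular_copy`, and the decision to model "reharmonizing" as backing every later vowel, since the overwriting vowels [a] and [u] are both back.

```python
# Assumed pairing of front vowels with their back counterparts.
FRONT_TO_BACK = {"i": "ı", "e": "a", "ü": "u", "ö": "o"}
BACK = set(FRONT_TO_BACK.values())
VOWELS = set(FRONT_TO_BACK) | BACK

def jocular_copy(root):
    """Return root plus its reduplicant with the first vowel overwritten."""
    positions = [i for i, ch in enumerate(root) if ch in VOWELS]
    # A root is harmonic if all its vowels agree in backness.
    harmonic = len({root[i] in BACK for i in positions}) == 1
    first = positions[0]
    copy = list(root)
    copy[first] = "u" if root[first] == "a" else "a"
    if harmonic:
        # Harmonic roots reharmonize to the new (back) vowel.
        for i in positions[1:]:
            copy[i] = FRONT_TO_BACK.get(copy[i], copy[i])
    return root + "-" + "".join(copy)

for word in ["nom", "at", "idik", "mašina"]:
    print(jocular_copy(word))
```

In this sketch harmony is applied only to roots whose vowels already agree, which is how the contrast between idik and mašina falls out.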
Tuvan overwriting reduplication

Conclusions:



Disharmonic forms are fully specified underlyingly
Harmonic forms are not (“Free Ride”, McCarthy 2004)
Theoretical implication:

Generalisations can be formed over non-alternating
phonological material
idik
 |
[-bk]

m  a  š  i  n  a
   |     |     |
 [+b]  [-b]  [+b]
Technical aspects of
rule formalism
The formal statement of rules
Rules take the general form A → B / X _ Y
 A      target of rule, an element in UR
 →      becomes
 B      what the segment containing A becomes
 /      in the environment of
 _      position of the target A
 X      element left-adjacent to A (can be absent)
 Y      element right-adjacent to A (can be absent)
 #      word boundary
 Ø      zero/nothing
 /X/    underlying form
 [X]    surface form
 <X>    stray segment
 (X)    optional segment
 α,β,γ  variables
Key rule types

Insertion
 Ø → A / B _ C
  Insert A between any BC sequence
 Ø → A / _ #
  Insert A word-finally
Deletion
 A → Ø / B _ C
  Delete A between B and C
 A → Ø / # _
  Delete A word-initially
Alpha Rule
 [αX] → [-αX] / B _ ]σ
  Invert the feature specification for X when it occurs after B at the end of a syllable
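The general form A → B / X _ Y, with insertion and deletion as special cases, can be sketched as a small function over strings of one-character segments. This is a minimal illustration under stated assumptions: `apply_rule` is a hypothetical helper, `None` stands for Ø (as A or B) or for an absent context, `"#"` marks the word boundary, and the rule applies simultaneously to all matching positions. Feature-based rules such as the Alpha Rule are not covered.

```python
def apply_rule(word, a, b, left=None, right=None):
    """Apply A -> B / X _ Y to a string of single-character segments."""
    segs = ["#"] + list(word) + ["#"]

    def fits(want, seg):
        # An absent context (None) matches anything.
        return want is None or want == seg

    out = []
    for i in range(1, len(segs)):
        prev, cur = segs[i - 1], segs[i]
        # Insertion (A is Ø): the target is the gap between prev and cur.
        if a is None and fits(left, prev) and fits(right, cur):
            out.append(b)
        if cur == "#":
            break
        # Substitution or deletion of a segment matching A in context.
        if a == cur and fits(left, prev) and fits(right, segs[i + 1]):
            if b is not None:
                out.append(b)
        else:
            out.append(cur)
    return "".join(out)

print(apply_rule("bc", None, "a", left="b", right="c"))   # Ø → A / B _ C
print(apply_rule("bc", None, "a", right="#"))             # Ø → A / _ #
print(apply_rule("bac", "a", None, left="b", right="c"))  # A → Ø / B _ C
print(apply_rule("abc", "a", None, left="#"))             # A → Ø / # _
```

Contexts are checked against the input form, not against the output under construction, so the rule does not feed itself.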
Desiderata in rules
Keys:







Elsewhere Case = UR
Use as few rules as possible

This includes trying to collapse rules dealing with (seemingly)
separate phenomena, such as the English plural and other
voice assimilation processes
Be as general as possible

E.g. try “stops” rather than “{p t}”
Be as predictive as possible

a rule that merely describes the facts is essentially useless
The last two points normally boil down to the same
thing (use as few features as possible, etc.)
Linguists generally temper their rule formulations with
consideration of what is (typologically) plausible
Choosing a UR




Relevance of Elsewhere Case
 English aspiration
Insertion of material is less common than deletion
 Generalisation: avoid insertion/creation of arbitrary elements
Consideration of rule typology
 Final devoicing, palatalization, etc.
An example
 Sound X occurs only at the ends of words, while sound Y occurs
anywhere but at the ends of words. Which of the following rules
is most likely to be involved?



X → Y / ___ #
Y → X / ___ #
X → Y everywhere but / ___ #
Use as few rules as possible

English aspiration


p → pʰ, t → tʰ, k → kʰ (3 rules)
[-voice, -cont] → [+spread glottis] (1 rule)
Use as few features as possible

Voicing neutralization

Russian voiced obstruents become voiceless word-finally

Voiced obstruents = [+voice, -sonorant, +consonantal…]
Relatively specific formulation:
 [+voice, -son, +cons] → [-voice, -son, +cons] / _ #
More general/predictive formulation, using fewer features:
 [-son] → [-voice] / _ #
 voiceless obstruents vacuously undergo the rule
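The two formulations can be compared in a short sketch. The obstruent inventory below is a simplified assumption, and `specific_rule` and `general_rule` are illustrative names. The specific version targets only voiced obstruents; the general version targets every word-final obstruent and leaves voiceless ones unchanged, which is the vacuous application noted above.

```python
# Assumed simplified inventory: voiced obstruents and their voiceless counterparts.
DEVOICE = {"b": "p", "d": "t", "g": "k", "v": "f", "z": "s", "ž": "š"}
OBSTRUENTS = set(DEVOICE) | set(DEVOICE.values()) | {"x"}

def specific_rule(word):
    """[+voice, -son, +cons] -> [-voice, -son, +cons] / _ #"""
    last = word[-1]
    return word[:-1] + DEVOICE[last] if last in DEVOICE else word

def general_rule(word):
    """[-son] -> [-voice] / _ #  (voiceless obstruents undergo it vacuously)"""
    last = word[-1]
    if last in OBSTRUENTS:
        return word[:-1] + DEVOICE.get(last, last)
    return word

for ur in ["sad", "xleb", "kot", "dom"]:
    print(ur, specific_rule(ur), general_rule(ur))
```

Both functions give the same outputs on these forms; the general rule simply states fewer features to get there.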


Spanish spirantization
Noun              definite                gloss
banca [baŋka]     la banca [la βaŋka]     bank
demora [demoɾa]   la demora [la ðemoɾa]   delay
gana [gana]       la gana [la ɣana]       desire
 What are the segments targeted by the rule?
 In what environment(s) do they undergo the rule?
The set of sounds that undergoes this change is the voiced stops, i.e. the natural class of [+consonantal, -sonorant, -continuant, +voice] segments.
The set of sounds produced by the rule is the voiced fricatives, i.e. the natural class of [+consonantal, -sonorant, +continuant, +voice] segments.
The set of sounds that triggers the change is vowels and r, i.e. the natural class of [+continuant] segments.
We could therefore say:
[+cons, -son, -cont, +voice] → [+cons, -son, +cont, +voice] / [+cont] _
However, we want to be as general and efficient as possible. Therefore:
[+voice] → [+continuant] / [+continuant] _
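The spirantization rule can be sketched as follows. The sketch implements the more specific formulation, in which only voiced stops are targets, and it assumes the rule applies across word boundaries, as the definite forms suggest. `SPIRANT`, `CONTINUANT` and `spirantize` are illustrative names, and the trigger set is limited to the vowels and ɾ.

```python
# Voiced stops and their fricative counterparts (targets and outputs of the rule).
SPIRANT = {"b": "β", "d": "ð", "g": "ɣ"}
# Triggers: vowels and r, i.e. [+continuant] segments.
CONTINUANT = set("aeiouɾ")

def spirantize(phrase):
    """Turn a voiced stop into a fricative after a [+continuant] segment."""
    out = []
    prev = None  # last non-space segment of the input
    for seg in phrase:
        if seg == " ":
            out.append(seg)  # spaces are kept but ignored as context
            continue
        if seg in SPIRANT and prev in CONTINUANT:
            out.append(SPIRANT[seg])
        else:
            out.append(seg)
        prev = seg
    return "".join(out)

for phrase in ["baŋka", "la baŋka", "la demoɾa", "la gana"]:
    print(phrase, "->", spirantize(phrase))
```

The phrase-initial stop in baŋka has no preceding continuant, so it surfaces unchanged.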
English plural formation


Formation of regular plurals of nouns in English:
 cat : cat[s]
 dog : dog[z]
 ash : ash[əz]
Possible analyses:
1. Memorize each word and its plural form.
2. Memorize 3 plural endings; assign each word to class 1, 2, or 3.
3. {plural} → [s] after {p t k ...}
              [z] after {b d g ...}
              [əz] after …
4. several general rules (holding over domains broader than the plural):
 Plural selection: {plural} → /-z/
 Epenthesis: Ø → [ə] / _ <C> (cf. knish)
 Voicing Assimilation: [-son] → [αvoice] / _ [αvoice] ]σ (cf. fif-th)
Predictions?

Analyses 1 and 2 predict that speakers will be unable to deal with foreign and made-up words.
What does each model predict?
Some theories of English plural formation

rule-based
 unordered
 1. [+pl] → [-z] / {aeioubdgmnŋð…} _
            [-s] / {ptkθf} _
            [-əz] / {szčĵšž} _
 2. [+pl] → [-z] / [+voice, -strident] _
            [-s] / [-voice, -strident] _
            [-əz] / [+strident] _
 ordered
 3. [+pl] → [-əz] / [+strident] _
            [-s] / [-voice] _
            [-z] / elsewhere
 4. rule 1: [+pl] → /-z/
    rule 2: Ø → [ə] / _ <C>
    rule 3: [+cons] → [-voice] / [-voice] _

probabilistic (analogical, connectionist)
 1. wug + PL → 70% of g-final words take -z → 70% wugz
 2. wug + PL → 70% of g-final words take -z → 100% wugz

memory-based
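The ordered rule-based theory 4 can be sketched as a three-step derivation. The transcriptions are rough, and the sketch assumes that the stray consonant <C> arises exactly when /-z/ follows a sibilant; `SIBILANTS`, `VOICELESS` and `pluralize` are illustrative names. Because the rules refer to sounds and not to listed words, a novel stem such as wug gets a plural without further stipulation.

```python
# Assumed segment classes (rough phonemic transcription).
SIBILANTS = ("s", "z", "š", "ž", "č", "ĵ")
VOICELESS = ("p", "t", "k", "f", "θ", "s", "š", "č")

def pluralize(stem):
    """Derive a regular plural by three ordered rules."""
    # Rule 1: [+pl] -> /-z/
    form = stem + "z"
    # Rule 2: Ø -> [ə] / _ <C>  (assumed: z is stray after a sibilant)
    if stem.endswith(SIBILANTS):
        form = stem + "əz"
    # Rule 3: [+cons] -> [-voice] / [-voice] _
    if form[:-1].endswith(VOICELESS):
        form = form[:-1] + "s"
    return form

for stem in ["kæt", "dɔg", "æš", "wʌg"]:
    print(stem, "->", pluralize(stem))
```

Ordering matters here: epenthesis applies before devoicing, so the schwa separates the suffix from a voiceless sibilant and the suffix stays voiced.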
Conclusions
Avoiding insertion/creation
Verb    Passive    Gerundive    Gloss
awhi    awhitia    awhitaŋa     embrace
hopu    hopukia    hopukaŋa     catch
aru     arumia     arumaŋa      follow
tohu    tohuŋia    tohuŋaŋa     point out
mau     mauria     mauraŋa      carry
wero    werohia    werohaŋa     stab
patu    patua      patuŋa       strike, kill
kite    kitea      kiteŋa       see, find
Proto-Polynesian *C → Ø / _ #
Synchronic analysis:
 Passive /-ia/, gerundive /-aŋa/
 V → Ø / V _
Better to have C-deletion rule than to have many allomorphs for the passive and the gerundive
The allomorphy analysis also incorrectly predicts the existence of roots selecting -tia but -maŋa
NB Maori actually did later choose the allomorphy analysis, and then made -tia its default form
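The C-deletion analysis can be sketched with consonant-final underlying roots. Two points are assumptions made for the illustration: the underlying roots are inferred from the passive forms, and vowel deletion is restricted to the boundary between root and suffix, since an unrestricted V → Ø / V _ would also remove the a of /-ia/. The helper `surface` is hypothetical.

```python
VOWELS = set("aeiou")

def surface(root, suffix=""):
    """Derive a surface form from an underlying root and an optional suffix."""
    # V -> Ø / V _ , applied only at the root-suffix boundary (assumption).
    if suffix and root[-1] in VOWELS and suffix[0] in VOWELS:
        suffix = suffix[1:]
    form = root + suffix
    # C -> Ø / _ #
    if form[-1] not in VOWELS:
        form = form[:-1]
    return form

# Assumed underlying roots, inferred from the passive forms.
for root in ["awhit", "hopuk", "arum", "tohuŋ", "maur", "weroh", "patu", "kite"]:
    print(surface(root), surface(root, "ia"), surface(root, "aŋa"))
```

A single /-ia/ and a single /-aŋa/ cover all eight verbs, and a root cannot take one consonant in the passive and a different one in the gerundive.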
Hale, Ken. 1973. Deep-surface canonical disparities in relation to analysis and change:
An Australian example. Current Trends in Linguistics 11:401-458.
References
Armbruster, Thomas. 1978. The Psychological Reality of the Vowel Shift and Laxing Rules Dissertation Abstracts International. 39:1516A-17A.
Aske, Jon. 1990. Disembodied Rules versus Patterns in the Lexicon: Testing the Psychological Reality of Spanish Stress Rules Berkeley Ling. Soc.. Berkeley;
30-45. Proceedings of the Sixteenth Annual Meeting of the Berkeley Linguistics Society, February 16-19, 1990: General Session and Parasession on the
Legacy of Grice. Hall, Kira (ed.); Koenig, Jean-Pierre (ed.); Meacham, Michael (ed.); Reinman, Sondra (ed.); Sutton, Laurel A. (ed.).
Berent, Iris, Steven Pinker, G. Ghavami, and S. Murphy. 2006. The Dislike of Regular Plurals in Compounds: Phonological Familiarity or Morphological
Constraint? Manuscript, Harvard University.
Bernstein Ratner, N. 1984 Phonological rule usage in mother-child speech. Journal of Phonetics 12:245-254.
Cena, R. 1978. When is a phonological generalization psychologically real? Bloomington: Indiana University Linguistics Club.
Dell, Gary, Reed, K.D., Adams, D.R., & Meyer, A. 2000. Speech errors, phonotactic constraints, and implicit learning: A study of the role of experience in
language production. Journal of Experimental Psychology: Learning, Memory, and Cognition 6:1355-1367.
Gallistel, C. Randy. 2003. Conditioning from an information processing perspective. Behavioural Processes 61.3:1234 1-13.
Gallistel, C.Randy. 2006. The nature of learning and the functional architecture of the brain. In Q. Jing, et al (Eds) Psychological Science Around the World, vol
1. Proceedings of the 28th International Congress of Psychology. Sussex: Psychology Press.
Hale, Ken. 1973. Deep-surface canonical disparities in relation to analysis and change: An Australian example. Current Trends in Linguistics 11:401-458.
Hauser, Marc, Daniel Weiss, and Gary Marcus. 2002. Rule learning by cotton-top tamarins. Cognition 86:B15–B22.
Hetzron, Robert. 1972. The Shape of a Rule and Diachrony. Bulletin of the School of Oriental and African Studies 35.3:451-475.
Iverson, Greg. 1994. The Reality of Linguistic Rules (Studies in Language Companion Series, no. 26), ed. with S. Lima & R. Corrigan. Amsterdam: John
Benjamins.
Janda, Richard, Brian Joseph, and Neil Jacobs. 1992. Systematic hyperforeignisms as maximally external evidence for linguistic rules. In Iverson et al, the
reality of linguistic rules.
Marcus, Gary, S. Vijayan, S. Bandi Rao, and P. Vishton. 1999. Rule learning by seven-month-old infants. Science 283.5398.
Minsky, Marvin and Seymour Papert. 1969. Perceptrons. Cambridge: MIT Press.
Moreton, Elliott. 1999. Evidence for phonological grammar in speech perception. In: J. J. Ohala, Y. Hasegawa, M. Ohala, D. Granville, and A. C. Bailey (eds.),
Proceedings of the 14th International Congress of Phonetic Sciences, San Francisco, pp. 2215-2217.
Pater, J. and A.-M. Tessier. 2003. Phonotactic Knowledge and the Acquisition of Alternations. In M.J. Solé, D. Recasens, and J. Romero (eds.) Proceedings of
the 15th International Congress on Phonetic Sciences, Barcelona. 1777-1180.
Pierrehumbert 2002, an unnatural process. LabPhon 8.
Pinker and Prince. 1994. Regular and irregular morphology and the psychological status of rules of grammar. In: S. D. Lima, R. L. Corrigan, G. K. Iverson (eds.),
The reality of linguistic rules, 321-51. Amsterdam: Benjamins.
Pinker and Ullmann
Trammell, Robert. 1978. The Psychological Reality of Underlying Forms and Rules for Stress Journal of Psycholinguistic Research. 7:79-94.
Truly random control
Shows that:
1. statistical correlations of the sort “if CS then US” do not drive generalisation formation
2. Categorical generalisations can be extracted from gradient distributions
 vs 33% response as one might expect for Group 1
Blocking

Shows that something beyond statistical association is
taking place
Esper 1925

Method

Ss learn names of 16 objects, each having one of four different shapes and one of four different colors

Ss trained on 14 object-name associations but tested on 16 to see if they generalize what they learned

3 experimental conditions:
Names presented to Group 1:
 naslig, sownlig, nasdeg, sowndeg, where nas- and sown- coded color and -lig and -deg coded shape
 Since these names consisted of two phonologically legal morphemes, this group could simplify their task by learning not 16 names but 8 morphemes (if they could discover them) plus the simple rule that the color morpheme preceded the shape morpheme in each name.
Names presented to Group 2:
 bi-morphemic names, as with Group 1
 unlike Group 1, the morphemes were not phonologically legal for English, e.g., nulgen, nuzgub, pelgen, pezgub (where nu- and pe- were color morphemes and -lgen and -zgub were shape morphemes, the latter two violating English morpheme structure constraints)
Names presented to Group 3 (a control group):
 names with no morphemic structure
 no recourse but to learn 16 idiosyncratic names
Results




As expected, group 1 learned their names much faster and more accurately than group 3.
Performance of Group 2 was similar to (and marginally worse than) that of group 3
Analysis of the errors of group 2, including how they generalized what they’d learned to the two object-name associations
excluded from the training session, revealed that they tried to make phonologically legal morphemes from the ill-formed
ones.
Demonstrates (i) psychological reality of MSCs; (ii) ability to conduct morphological analysis
Korean borrowing of Coda [t]


Korean word-final [t̚] ← /t, tʰ, t’, č, čʰ, č’, s, s’/
Surface word-final postvocalic [t] in loans and nonce words invariably assigned to /s/ (Martin 1992, Kang 1998, Hayes 1998, Iverson & Lee 2004)
 supermarket → nom. [supəmakʰet̚], dat. [supəmakʰese]
What appears to be involved in the Korean case is that
speakers know that surface word-final [t]s most often
come from underlying /s/ in their native lexicon, and
they therefore assign all new words to the same
pattern.
