Abstract Genomic Encoding of Universal Grammar in

Formal Typology:
Explanation in Optimality Theory
Paul Smolensky
Cognitive Science Department
Johns Hopkins University
with:
Géraldine Legendre
Donald Mathis
Melanie Soderstrom
Alan Prince
Suzanne Stevenson
Peter Jusczyk†
Advertisement
The Harmonic Mind:
From neural computation to optimality-theoretic grammar
Paul Smolensky
& Géraldine Legendre
 Blackwell 2002 (??)
 Develop the Integrated
Connectionist/Symbolic (ICS) Cognitive
Architecture
 Apply to the theory of grammar
Chomsky 1988
“1. What is the system of knowledge?
2. How does this system of knowledge
arise in the mind/brain?
3. How is this knowledge put to use?
4. What are the physical mechanisms
that serve as the material basis for
this system of knowledge and for the
use of this knowledge?” (p. 3)
Responsibilities of Grammatical Theory
Chomsky’s “Big 4” questions
concerning knowledge of grammar
① Structure
② Acquisition
③ Processing
④ Neuro-genetics
[Diagram labels: OT; Nativist hypothesis]
Not new to Chomsky or generative grammar …
Jakobson’s Program
 Linguistic theory is not just for
theoretical linguists
 The same principles that explain formal
cross-linguistic and language-internal
distributional patterns can also explain
• Acquisition
• Processing
• Neurological breakdown
Jakobson’s Program
Markedness enables a Grand Unified Theory for
the cognitive science of language: Avoid α
① Structure
 Inventories lack α
 Alternations eliminate α
② Acquisition
 α is acquired late
③ Processing
 α is processed poorly
④ Neural
 Brain damage most easily disrupts α
Talk Plan
OT Explanation
① Structure
② Acquisition
③ Processing
④ Neurogenetics
Responsibilities of Grammatical Theory
① Structure (OT)
Structure of UG: Captured in a
general formalism for grammars
and their variation
Possible strong version – Explanatory Goal ①, Inherent typology:
Analysis of phenomenon Φ in language L
⇒ Universal typology of phenomenon Φ
From Markedness to OT
 Formalizing markedness  ⋯  OT
• Markedness constraints
• Faithfulness constraints
• Competition
• Strict domination
• Strong universality & Richness of the Base
 Structure: Formal Result
Formalizing Markedness:
Two Problems
 Goal: Change epiphenomenal explanatory
status of markedness

• Markedness “explains grammars (e.g.,
rules)”; informal commentary about grammar
vs.
• Markedness IS grammar: markedness-grammars formally determine languages
 Structure: Formal Result
Formalizing Markedness:
Two Problems
 Problem 1: Multidimensional integration
Each dimension of linguistic structure
independently has its own marked pole, but
how do these dimensions combine?
Turns out to be related to another
fundamental problem:
 Structure: Formal Result
Formalizing Markedness:
Two Problems
 “α is marked” ⇝ “Avoid α”
 But when & how does “avoidance” happen?
Problem 2: Pervasive variability in “avoidance”
• Inventories: If [θ] is absent in French “because it is
marked” how can it be present in English “despite
being marked”?
¿ The grammar of every language turns on or off “No α” = *α,
a markedness constraint ? OT: a more subtle version that also solves:
• Alternations: If in environment E, α → β “because α is
more marked than β”, how do we explain that in E′ α
↛ β “even though” α is more marked than β?
 Structure: Formal Result
Formalizing Markedness
 Most crudely: Why aren’t marked
elements always avoided?
 Something must oppose markedness
forces.
 Markedness cannot be the sole basis of
a formal grammatical theory: it is only
one half of the complete story.
 Structure: Formal Result
The Great Dialectic
Phonological representations serve two masters
MARKEDNESS: Phonetic interface, [surface form]
Often: ‘minimize effort (motoric & cognitive)’; ‘maximize discriminability’
FAITHFULNESS: Lexical interface, /underlying form/
‘be this invariant form’
[Diagram: Phonetics ↔ Phonological Representation ↔ Lexicon]
Locked in eternal conflict
 Structure: Formal Result
The Core Constraints of Con
 MARKEDNESS: *α (“minimize effort; maximize distinctiveness”)
• “constraint *α ∈ Con” ⇒ α meets empirical criteria for ‘marked’
• Freedom? Empirically constrained by universal patterns
 FAITHFULNESS (“be this invariant form”):
• /input/ → [output] is the identity map, i.e.,
• elements /x/ and [x] are in one-to-one correspondence
and identical (McCarthy & Prince ’95)
• Constraints: MAX(x), DEP(x), IDENT(x), …
• Essentially determined by elements {x} of representation
• Freedom? Representations — as always: empirically
constrained to allow statement of markedness constraints
¿ “In OT you can invent any constraint you want” ?
 Structure: Formal Result
Conflict
 Dialectic: MARK vs. FAITH conflict
• Why aren’t marked elements always avoided?
Because sometimes MARK is over-ruled by FAITH
• Why aren’t words always pronounced in their
invariant, lexical form?
Because sometimes FAITH is over-ruled by MARK
 ℂ1 over-rules (dominates) ℂ2: ℂ1 ≫ ℂ2
 Whether M gets violated (whether marked elements
fail to ‘be avoided’) varies by
• Language (in some, M ≫ F; in others, F ≫ M)
• Context (in some, M ≫ F2; in others F1 ≫ M)
 Structure: Formal Result
Conflict
 Why is there cross-linguistic variation?
• Phonetic vs. Lexical ~ MARK vs. FAITH: the dialectic gets
resolved differently
• Typology by re-ranking: Factorial Typology
{possible human languages} ↔ {rankings of Con}
(n constraints give n! rankings — many are equivalent)
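Factorial typology can be sketched on a toy scale. The sketch below assumes a hypothetical Con = {NOCODA, MAX, DEP} and a hand-listed candidate set for an input /CVC/; the candidates and their violation counts are illustrative assumptions, not data from the talk. It enumerates all n! rankings and groups them by the output they select, showing that several rankings are equivalent.

```python
from itertools import permutations

# Hypothetical violation profiles for input /CVC/ (illustrative assumption).
CANDIDATES = {
    "CVC":   {"NOCODA": 1, "MAX": 0, "DEP": 0},  # faithful, keeps the coda
    "CV":    {"NOCODA": 0, "MAX": 1, "DEP": 0},  # deletes the final C
    "CV.CV": {"NOCODA": 0, "MAX": 0, "DEP": 1},  # epenthesizes a V
}

def winner(ranking):
    """Return the candidate whose violation vector is lexicographically smallest."""
    return min(CANDIDATES, key=lambda c: [CANDIDATES[c][k] for k in ranking])

def factorial_typology(constraints):
    """Map each predicted output to the list of rankings that select it."""
    typology = {}
    for ranking in permutations(constraints):
        typology.setdefault(winner(ranking), []).append(ranking)
    return typology

typology = factorial_typology(["NOCODA", "MAX", "DEP"])
for output, rankings in sorted(typology.items()):
    print(output, len(rankings))
```

With three constraints there are 3! = 6 rankings, but only three distinct outputs in this toy case: each output is selected by the two rankings that place the constraint it violates at the bottom.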
 Structure: Formal Result
Formalizing Markedness
 Problem 2: ‘Avoidance of the marked’ is
pervasively variable; exactly where does
marked material appear?
• Solution: Constraint ranking
(MARK w.r.t. FAITH)
Will now see this also solves:
 Problem 1: Multidimensional markedness
• Solution: single constraint ranking for all
constraints in a given language
 Structure: Formal Result
Formalizing Markedness
 Markedness is multidimensional
• Each dimension has its universally marked pole
• How do dimensions combine? (M1, *M2) vs. (*M1, M2)
e.g., CV́C.CV (STRESSHEAVY, *MAINSTRESSRIGHT) vs. CVC.CV́
• Integrate via a common markedness currency: Harmony
 Numerical: *M1 = 3.2; *M2 = 2.8
 Symbolic: *M1 absolutely worse than *M2
see below
 OT:
•For a given language, there is a single constraint ranking
for all constraints
•Strict domination hierarchy: markedness on higher-ranked
constraints can never be compensated for by
unmarkedness on lower-ranked ones
 Structure: Formal Result
Competition for Optimality
 Given an input, an OT grammar does not
provide a procedure for how to construct the
output, but rather a description of the
output: the structure that best satisfies the
constraint ranking
 Best-satisfies is a comparative criterion; outputs
compete and the grammar identifies the
winner: the optimal — grammatical —
highest Harmony — output for that input
 Structure: Formal Result
Harmonic Competition
 Numerical Harmony
Candidates        STRESSHEAVY   MAINSTRESSRIGHT   Harmony
a. σH́σ…σσ                       **…* (n)          −n · w_MAINSTRESSRIGHT
b. σHσ…σσ́        *                                −w_STRESSHEAVY
 Stress is on the initial heavy syllable iff the number of
light syllables n obeys
n < w_STRESSHEAVY / w_MAINSTRESSRIGHT
 The threshold can be any number ⇒ Pathological grammars
 “Grammars can’t count”
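The counting pathology of numerical Harmony can be sketched as follows. The weights are hypothetical values chosen for illustration, and the two candidates are reduced to their violation counts: stressing the heavy syllable incurs n violations of MAINSTRESSRIGHT, while stressing the final syllable incurs one violation of STRESSHEAVY.

```python
def harmony(violations, weights):
    """Numerical Harmony: negative weighted sum of violation counts."""
    return -sum(weights[c] * count for c, count in violations.items())

def stress_position(n, weights):
    """Compare stress on the heavy syllable vs. on the final syllable."""
    on_heavy = harmony({"STRESSHEAVY": 0, "MAINSTRESSRIGHT": n}, weights)
    on_final = harmony({"STRESSHEAVY": 1, "MAINSTRESSRIGHT": 0}, weights)
    return "heavy" if on_heavy > on_final else "final"

# Hypothetical weights: the ratio 3.2 / 1.0 sets the counting threshold.
weights = {"STRESSHEAVY": 3.2, "MAINSTRESSRIGHT": 1.0}
pattern = [stress_position(n, weights) for n in range(1, 6)]
print(pattern)
```

Under these assumed weights the stress pattern flips between n = 3 and n = 4, so the grammar in effect counts syllables.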
 Structure: Formal Result
Harmonic Competition
 Symbolic Harmony: Strict domination
• STRESSHEAVY ≫ MAINSTRESSRIGHT: Stress the initial heavy syllable
Candidates        STRESSHEAVY   MAINSTRESSRIGHT
☞ a. σH́σ…σσ                     **…* (n)
   b. σHσ…σσ́     *!
• MAINSTRESSRIGHT ≫ STRESSHEAVY: Stress the final syllable
Candidates        MAINSTRESSRIGHT   STRESSHEAVY
   a. σH́σ…σσ     **…*! (n)
☞ b. σHσ…σσ́                        *
 Strict domination ⇒ “Grammars can’t count”
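Strict domination can be sketched as lexicographic comparison of violation vectors. This is a minimal illustration, assuming the same two stress candidates reduced to violation counts; the function names are hypothetical.

```python
def stress_candidates(n):
    """Violation counts for stress on the heavy vs. the final syllable."""
    return {
        "heavy": {"STRESSHEAVY": 0, "MAINSTRESSRIGHT": n},
        "final": {"STRESSHEAVY": 1, "MAINSTRESSRIGHT": 0},
    }

def strict_winner(candidates, ranking):
    """Pick the candidate with the lexicographically smallest violation tuple."""
    return min(candidates, key=lambda name: tuple(candidates[name][c] for c in ranking))

for n in (1, 10, 1000):
    cands = stress_candidates(n)
    print(n,
          strict_winner(cands, ("STRESSHEAVY", "MAINSTRESSRIGHT")),
          strict_winner(cands, ("MAINSTRESSRIGHT", "STRESSHEAVY")))
```

Whatever the value of n, the outcome depends only on the ranking, never on how many lower-ranked violations accumulate.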
 Structure: Formal Result
OT: ‘Formal’ definition
 Gen: Specifies candidate outputs for any
given input
 Con: The constraint set
 A grammar: A hierarchical ranking of Con
 H-Eval: Given two candidates and a ranking,
a formal definition employing strict
domination of which has higher Harmony —
which better-satisfies the ranking
 I → O mapping: I ↦ The maximal-Harmony
candidate[s] in Gen(I)
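The four components above can be sketched in a few lines. Everything here is a toy assumption made for illustration: Gen only deletes segments, Con contains just NOCODA and MAX, and a C counts as a coda whenever it is not immediately followed by a V.

```python
from itertools import combinations

def gen(inp):
    """Toy Gen: every output obtainable by deleting a subset of input segments."""
    cands = set()
    for k in range(len(inp) + 1):
        for drop in combinations(range(len(inp)), k):
            out = "".join(s for i, s in enumerate(inp) if i not in drop)
            cands.add((out, k))  # k = number of deleted segments
    return cands

def nocoda(cand):
    """Count Cs that are not immediately followed by a V (crude coda test)."""
    out, _ = cand
    return sum(1 for i, s in enumerate(out) if s == "C" and out[i + 1:i + 2] != "V")

def max_(cand):
    """Count deleted input segments."""
    return cand[1]

CON = {"NOCODA": nocoda, "MAX": max_}

def h_eval(a, b, ranking):
    """True if a has higher Harmony than b under strict domination."""
    return [CON[c](a) for c in ranking] < [CON[c](b) for c in ranking]

def optimal(inp, ranking):
    """Return the outputs in Gen(inp) that no other candidate beats."""
    cands = gen(inp)
    return sorted(a[0] for a in cands if not any(h_eval(b, a, ranking) for b in cands))

print(optimal("CVC", ["NOCODA", "MAX"]))
print(optimal("CVC", ["MAX", "NOCODA"]))
```

A grammar here is nothing more than the ranking passed to `optimal`; re-ranking the same two constraints changes the output for the same input.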
 Structure: Formal Result
Richness of the Base
 Universality: All systematic cross-linguistic variation
arises from differences in constraint ranking
 Therefore:
• Con is universal; H-Eval is universal
• Gen is universal, including the space of possible inputs as
well as possible outputs
 i.e.: No systematic cross-linguistic variation is due to differences in
inputs
 e.g.: Languages with no surface codas cannot get this property
from limitations on the lexicon (e.g., a morpheme structure
constraint *Cwd]) — but rather from the ranking
 i.e.: The grammar must have the property that even if there were C-final inputs, there would still be no surface codas
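Richness of the Base can be checked mechanically on a toy grammar. The sketch below assumes a deletion-only Gen and the two constraints NOCODA and MAX (both simplifications for illustration), feeds every C/V string up to length 3 to the grammar, and confirms that the ranking alone removes codas.

```python
from itertools import combinations, product

def codas(out):
    """Count Cs that are not immediately followed by a V (crude coda test)."""
    return sum(1 for i, s in enumerate(out) if s == "C" and out[i + 1:i + 2] != "V")

def surface(inp, ranking):
    """Optimal output under strict domination, with a deletion-only toy Gen."""
    cands = {("".join(s for i, s in enumerate(inp) if i not in drop), len(drop))
             for k in range(len(inp) + 1)
             for drop in combinations(range(len(inp)), k)}
    score = {"NOCODA": lambda c: codas(c[0]), "MAX": lambda c: c[1]}
    return min(cands, key=lambda c: [score[r](c) for r in ranking])[0]

# The rich base: all inputs over {C, V} up to length 3, including C-final ones.
rich_base = ["".join(p) for n in range(1, 4) for p in product("CV", repeat=n)]
outputs = {inp: surface(inp, ["NOCODA", "MAX"]) for inp in rich_base}
print(outputs["CVC"], all(codas(o) == 0 for o in outputs.values()))
```

No restriction is placed on the inputs; the absence of surface codas follows from NOCODA ≫ MAX.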
Aside
 Richness of the Base is a principle for
inducing a grammar (generalizing)
from a set of grammatical items
 It can be justified by the central
principle of John Goldsmith’s
presentation:
 Maximize the probability of the data
 Structure: Conceptual “Question”
Explanatory Power
“OT is as unexplanatory as extrinsically-ordered
rule-theory”
Stipulating ranking ~ stipulating ordering
 Structure: Explanatory Goal
Inherent Typology
Actually, OT achieves Explanatory Goal ①, Inherent
Typology: In the analysis of phenomenon Φ in one
language is inherent a typology of Φ in all languages
 Structure: Conceptual “Question”
Analytic Restrictiveness
“You can make up any constraint you want in OT ”
 Structure: Explanatory Goal
Robust Falsifiability
Actually, in OT, positing ℂ in the analysis of a
language L necessarily has a huge number of
empirically falsifiable implications (one
consequence of Inherent Typology)
E.g., Two pervasive patterns generated by ‘ℂ ∈ Con’
 Structure: Explanatory Goal
Consequences of ‘ℂ ∈ Con’ – I:
The Subordination Pattern
 E.g., ℂ = NOCODA
 Recall:
• If ‘No codas’ is in UG, why do codas ever appear?
• Conflict
 With faithfulness constraints
 With other markedness constraints – other dimensions of
markedness
 Cross-linguistic variation: codas are less and
less restricted as NOCODA is subordinated to
more and more conflicting constraints (i.e.,
dimensions of markedness)
 Structure: Empirical Application
Subordination Pattern: Codas
NOCODA: No codas at all
STRESS-TO-WEIGHT: Codas only in stressed syllables
MAXμ: … + Geminate codas
MAX: Codas unrestricted … except prohibited inter-vocalically [~V.CV~]
 Structure: Conceptual “Question”
Multiplicity of Constraints
For second pervasive pattern generated by ‘ℂ ∈ Con’:
“Any framework which leads to the morass of
constraints found in OT analyses in phonology
cannot possibly be explanatorily adequate.”
 Structure: Explanatory Goal
Factorial Interaction
Actually, OT interaction-via-domination replaces
many rules by fewer constraints
 Structure: Explanatory Goal
Consequences of ‘ℂ ∈ Con’ – II:
Factorial Interaction
 ‘Factorial interaction’: with varying
interaction (re-ranking), n simple
modular constraints correspond to
• Multiplicity of rules (many more than n)
• Complex, non-modular rules
• Rules + representational/notational tricks
• Rules + constraints
 E.g., ℂ = NOCODA
 Structure: Empirical Application
Factorial Interaction: Codas
 Consider Con ⊇ {MAX} ↪ {MAX, DEP}
 Number of constraints increases by 1
 Number of corresponding rules doubles
as set of ‘repairs’ now includes
epenthesis as well as deletion:
NOCODA ≫ MAX           ~  C → Ø / _σ]
  ↪ NOCODA ≫ DEP       ~  Ø → V / Cσ]_
ONSET ≫ MAX            ~  V → Ø / [σ_
  ↪ ONSET ≫ DEP        ~  Ø → C / [σ_V
 Structure: Empirical Application
Factorial Interaction: Codas
MARKEDNESS ≫ FAITHFULNESS
                 MARKEDNESS
FAITHFULNESS     NOCODA             ONSET
MAX              C → Ø / _σ]        V → Ø / [σ_
DEP              Ø → V / Cσ]_       Ø → C / [σ_V
In general, the number of comparable rules increases
much faster than the number of constraints
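The growth claim can be illustrated with a small count. In this sketch each ranking of a markedness constraint over a faithfulness constraint is paired with one rule-like repair; the plain-English repair glosses are paraphrases added for illustration.

```python
from itertools import product

# One repair per (markedness, faithfulness) ranking pair; glosses are paraphrases.
REPAIR = {
    ("NOCODA", "MAX"): "delete coda C",
    ("NOCODA", "DEP"): "epenthesize V after coda C",
    ("ONSET", "MAX"): "delete onsetless V",
    ("ONSET", "DEP"): "epenthesize C before onsetless V",
}

def comparable_rules(markedness, faithfulness):
    """One rule-like repair for each ranking M >> F."""
    return [(m, f, REPAIR[(m, f)]) for m, f in product(markedness, faithfulness)]

before = comparable_rules(["NOCODA", "ONSET"], ["MAX"])
after = comparable_rules(["NOCODA", "ONSET"], ["MAX", "DEP"])
print(len(before), len(after))
```

Adding one faithfulness constraint takes the constraint count from 3 to 4 while the number of comparable rules goes from 2 to 4: constraints add, rules multiply.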
 Structure: Empirical Application
Factorial Interaction: Codas
 STRESS-TO-WEIGHT ≫ NOCODA
• Codas only in stressed syllables
• C → Ø / _σ̆]: segmental rule sensitive to foot structure
[‘non-modular rules’]
 ANCHOR-R ≫ NOCODA
• Codas only word-finally
• C → Ø / _σ] plus final-C extrametricality
[‘representational trick’]
 MAXμ ≫ NOCODA
• Only geminate codas: /Cμ/
• C → Ø / _σ] plus Hayes’ exclusivity of association
[‘notational trick’]
 Structure: Empirical Application
Factorial Interaction
 STRESS-TO-WEIGHT ≫ NOCODA: Codas only in stressed syllables
• STRESS-TO-WEIGHT ≫ *Cμ
• Geminates only after stressed V
• μ → Ø / _σ̆]
 ANCHOR-R ≫ NOCODA: Codas only word-finally
• ANCHOR-R ≫ *[+voi, −son]
• Obstruent devoicing except word-finally
• [+voi] → [−voi] / [_, −son] plus ?? to block word-finally
 MAXμ ≫ NOCODA: Only geminate codas, /Cμ/
• MAXμ ≫ WEIGHT-TO-STRESS
• Geminates are the only codas in unstressed syllables
• C → Ø / _σ̆] plus exclusivity of association
 Structure: Jakobson’s Program
Markedness + Faithfulness = Harmony
In summary:
 Jakobson’s key insight concerning linguistic
structure: the central organizing principle of
grammar is: Minimize Markedness
 OT formalizes this as Maximize Harmony
 OT formalizes Markedness via violable constraints
 OT adds the crucial notion of Faithfulness – the
other (lexical) half of the phonological dialectic
 OT Harmony combines Markedness with
Faithfulness; their conflict is adjudicated via ranking
 Ranking unifies multiple dimensions of markedness
 Structure: Summary
 OT achieves the explanatory goals of
• Changing the epiphenomenal status of
markedness in grammatical theory:
markedness is now in grammar, not about
grammar
• A strongly universalist formalism
exhibiting Inherent Typology
• Robust falsifiability
Possible strong version – Explanatory Goal ②:
Substantive structure (①) of a UG module
governing phenomenon Φ
⇒ ② Acquisition theory (initial state, learning algorithm)
for phenomenon Φ
General Learning Theory
 Acquisition: Formal Result I
Learning Theory
 Learning algorithm
• Provably correct and efficient (when part of a general
decomposition of the grammar learning problem)
• Sources:
 Tesar 1995 et seq.
 Tesar & Smolensky 1993, …, 2000*
 * See for how to exploit the analogy to ‘weighted OT’
(Goldsmith, today)
• If you hear A when you expected to hear E, increase
the Harmony of A above that of E by minimally
demoting each constraint violated by A below a
constraint violated by E
 Acquisition: Formal Result I
Constraint Demotion Algorithm
If you hear A when you expected to hear E, increase the
Harmony of A above that of E by minimally demoting each
constraint violated by A below a constraint violated by E
Correctly handles difficult case: multiple violations in E
/in + possible/          Faith    Mark (NPA)    Faith (after demotion)
☹☞ E: inpossible                  *
☺☞ A: impossible        *                      *
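The demotion step can be sketched on a stratified hierarchy. This is a simplified illustration rather than the published algorithm in full: the hierarchy is a list of sets (highest stratum first), marks shared by both candidates are cancelled, and the function name and data layout are assumptions.

```python
def demote(hierarchy, heard_marks, expected_marks):
    """Demote constraints violated by the heard form A below the highest-ranked
    constraint violated by the expected form E.

    hierarchy: list of sets of constraint names, highest stratum first.
    Assumes E violates at least one constraint more than A does.
    """
    # Mark cancellation: keep only constraints on which the two forms differ.
    a_only = {c for c, n in heard_marks.items() if n > expected_marks.get(c, 0)}
    e_only = {c for c, n in expected_marks.items() if n > heard_marks.get(c, 0)}
    stratum_of = {c: i for i, s in enumerate(hierarchy) for c in s}
    pivot = min(stratum_of[c] for c in e_only)  # highest stratum violated by E
    new = [set(s) for s in hierarchy] + [set()]
    for c in a_only:
        if stratum_of[c] <= pivot:  # not yet dominated by the pivot constraint
            new[stratum_of[c]].discard(c)
            new[pivot + 1].add(c)   # minimal demotion: just below the pivot
    return [s for s in new if s]

# Learner ranks FAITH over NPA, expects "inpossible", but hears "impossible".
learned = demote([{"FAITH"}, {"NPA"}], {"FAITH": 1}, {"NPA": 1})
print(learned)
```

After one step NPA dominates FAITH, so the heard form now has higher Harmony than the expected one.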
 Acquisition: Conceptual “Question”
Large Grammar Space
 “Huge number of grammars” — “OT is
too unrestrictive”
 Acquisition: Explanatory Goal
General Learning Theory
 Actually, OT achieves Explanatory Goal ②:
General Learning Theory: A theory-general,
UG-informed learning algorithm, provably
correct and efficient (under strong assumptions)
 Acquisition: Formal Result II
Learnability & the Initial State
 M ≫ F is learnable with
/in+possible/→impossible
• ‘not’ = in- except when followed by …
• “exception that proves the rule”: M = NPA
 M ≫ F is not learnable from data if there are
no ‘exceptions’ (alternations) of this sort, e.g.,
if there are no affixes and all underlying morphemes
have mp: M and F are then both satisfied, so there is no
M vs. F conflict and no evidence for their ranking
 Thus must have M ≫ F in the initial state, ℌ0
 Acquisition: Empirical Application
Initial State: Experimental Test
 Collaborators
 Peter Jusczyk
 Theresa Allocco
 (Elliott Moreton, Karen Arnold)
 Here, only a thumbnail sketch (more in
the OT Workshop Thursday)
 Acquisition: Empirical Application
Initial State: Experimental Test
 Linking hypothesis:
More harmonic phonological stimuli ⇒
Longer listening time
 More harmonic:
• M ≻ *M, when equal on F
• F ≻ *F, when equal on M
• When one must choose one or the other, it is more
harmonic to satisfy M: M ≫ F
 M = Nasal Place Assimilation (NPA)
 Acquisition: Empirical Application
4.5 Months (NPA)
[Bar chart: listening time (sec), Higher H vs. Lower H; conditions: Faithfulness, Markedness, M ≫ F]
Faithfulness: Higher Harmony um…ber…umber (15.36) vs. Lower Harmony um…ber…iŋgu (12.31); p = .006 (11/16)
 Acquisition: Empirical Application
4.5 Months (NPA)
[Bar chart: listening time (sec), Higher H vs. Lower H; conditions: Faithfulness (15.36 vs. 12.31), Markedness, M ≫ F]
Markedness: Higher Harmony um…ber…umber (15.23) vs. Lower Harmony un…ber…unber (12.73); p = .044 (11/16)
 Acquisition: Empirical Application
4.5 Months (NPA)
[Bar chart: listening time (sec), Higher H vs. Lower H; conditions: Faithfulness (15.36 vs. 12.31), Markedness (15.23 vs. 12.73), M ≫ F]
M ≫ F ???: un…ber…umber (✓ Markedness, * Faithfulness) vs. un…ber…unber (* Markedness, ✓ Faithfulness)
 Acquisition: Empirical Application
4.5 Months (NPA)
[Bar chart: listening time (sec), Higher H vs. Lower H; conditions: Faithfulness (15.36 vs. 12.31), Markedness (15.23 vs. 12.73), M ≫ F]
M ≫ F: Higher Harmony un…ber…umber (16.75) vs. Lower Harmony un…ber…unber (14.01); p = .001 (12/16)
 Acquisition: Jakobson’s Program
Markedness = Distance from Initial State
 X is universally more marked than Y ~
 In addition to the constraints M1, M2, …, Mk
violated by Y, X also violates markedness
constraints M′1, M′2, …, M′n
 Y will be acquired – become admitted into the
child’s inventory – after M1, M2, …, Mk are all
demoted below relevant faithfulness constraints
 These demotions are all necessary for X to be
acquired, and additional demotions of M′1, M′2,
…, M′n are also required ~
 X will require more time to be acquired
Possible strong version – Explanatory Goal ③ :
Substantive structure (①) of a UG module
governing phenomenon Φ
⇒ ③ Processing theory (e.g., parsing algorithm)
for phenomenon Φ
General Processing Theory
 Processing: Formal Results
Context-Free Parsing Algorithm
Theorem (Tesar 1994, 1995b, a, 1996). Suppose
• Gen parses a string of input symbols into structures
specified via a context-free grammar
• Con constraints meet a tree-locality condition and
penalize empty structure
Then a given dynamic programming algorithm is
• Left-to-right
• General (any such Gen, Con)
• Guaranteed to find the optimal outputs
• As efficient as parsers for conventional context-free
grammars.
 Processing: Formal Results
Finite-State Parsing Algorithm
Theorem (Ellison 1994). Suppose
• Gen(I) is representable as a (non-deterministic) finite-state
transducer (particular to I) mapping the input
string to a set of output candidates
• Con constraints are reducible to multiply-violable
binary constraints each representable as a finite-state
transducer mapping an output candidate to a
sequence of violation marks
Then composing the Gen(I) and rank-sequenced
constraint-transducers yields a transducer that
• Directly maps I to its optimal outputs
• Can be efficiently pruned by dynamic programming
 Processing: Formal Results
Complexity of Violable Constraints
Theorem (Frank and Satta 1998). Suppose
• Gen is representable as a (non-deterministic) finite-state
transducer mapping an input string to a set of output
candidates
• Con: the set of structures incurring n violations of each
constraint is generable by a finite-state machine, and n
can be finitely bounded for each constraint
Then the mapping from inputs to optimal outputs
has the complexity of a finite-state transducer.
Theorem (Hiller 1996, Smolensky 1997).
If n is unbounded there are (extremely simple) OT
grammars with greater computational complexity.
 Processing: Conceptual “Question”
Processing (Symbolic): Theory
 “Infinite candidate set uncomputable”
 Processing: Conceptual “Question”
Processing (Symbolic): Theory
 Actually, achieves Explanatory Goal ③ (computational)
Substantive structure (①) of a UG module
governing phenomenon Φ
⇒ ③ Processing theory (e.g., parsing algorithm)
for phenomenon Φ
General Processing Theory
 Processing: Empirical Application
Sentence Processing
 Because an OT grammar assigns a parse
to any input, no additional principles
(e.g., ‘parsing heuristics’) are needed for
parsing the initial, incomplete segment
of a sentence
 Linking hypothesis:
Processing difficulty arises when previously
established structure needs to be abandoned
in the face of further input
 Processing: Empirical Application
PP Attachment
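The servant of the actress who… (Cuetos & Mitchell 88)
[Assuming who is ambiguous for Case.]
[Tree: [NP [NP the servant] [PP [P of] [NP the actress [+gen]]]], with three attachments for who]
• who [+nom] attached high (modifying the servant): Violates *NOM, LOCALITY2
• who [+nom] attached low (modifying the actress [+gen]): Violates *NOM, AGRCASE
• who [+gen] attached low (modifying the actress [+gen]): Violates *GEN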
• LOCALITY: If XP c-commands YP, then XP precedes YP.
• AGRCASE: A relative pronoun must agree in Case with the modified NP.
• *CASE: *GEN ≫ *DAT ≫ *ACC ≫ *NOM (universal)
 Processing: Empirical Application
PP Attachment
• If *GEN, AGRCASE ≫ LOCALITY2, then who [+nom] modifying the servant wins: attach high
• If LOCALITY2 ≫ *GEN or AGRCASE, then who modifying the actress wins (as [+gen] or [+nom]): attach low
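The two ranking conditions can be checked by brute force. The sketch below takes the three parses and their violations as listed on the slide, treats LOCALITY2 as a single constraint label exactly as printed (an assumption), and enumerates every ranking consistent with the universal *GEN ≫ *NOM.

```python
from itertools import permutations

# Violation sets as listed on the slide; candidate labels are descriptive glosses.
CANDIDATES = {
    "high, who[+nom]": {"*NOM", "LOCALITY2"},
    "low, who[+nom]": {"*NOM", "AGRCASE"},
    "low, who[+gen]": {"*GEN"},
}
CON = ["*GEN", "*NOM", "AGRCASE", "LOCALITY2"]

def attachment(ranking):
    """Return 'high' or 'low' for the optimal parse under strict domination."""
    best = min(CANDIDATES, key=lambda c: [int(k in CANDIDATES[c]) for k in ranking])
    return best.split(",")[0]

def predictions():
    """Attachment preference for every ranking that respects *GEN >> *NOM."""
    return {r: attachment(r) for r in permutations(CON)
            if r.index("*GEN") < r.index("*NOM")}

preds = predictions()
print(sum(1 for v in preds.values() if v == "high"), "of", len(preds), "rankings attach high")
```

High attachment is predicted exactly when both *GEN and AGRCASE outrank LOCALITY2; every other admissible ranking predicts low attachment.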
 Processing: Empirical Application
PP Attachment
 Preliminary result: A cross-linguistic
typology of PP attachment patterns
(across differences in case and
embedding depth)
 Empirically promising, but not perfect
 Unclear yet how rankings determining
parsing preferences relate to rankings in
the pure ‘competence grammar’
 Processing: Jakobson’s Program
Processing and Markedness
 Phonological analogy: Incrementally parse
C…V…C…
• /C/ → [C•]
• /CV/ → [CV]
• /CVC/ → [CV][C•]
 Now ‘expect’ a V … if get it, no ‘reanalysis’
• But if get a C, need reanalysis ⇒ difficulty:
• /CVCC/ → [CVC][C•]
 Processing marked material (coda C) creates
difficulty because it is initially analyzed as
unmarked (as an onset)
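The reanalysis idea can be sketched with a toy incremental parser. The role-assignment rule is a deliberate simplification assumed for illustration: a C is parsed as an onset unless the next segment already seen is another C, in which case it becomes a coda (word-initial clusters are not handled sensibly).

```python
def roles(prefix):
    """Assign a syllable role to each segment of the input seen so far."""
    out = []
    for i, seg in enumerate(prefix):
        if seg == "V":
            out.append("nucleus")
        elif i + 1 < len(prefix) and prefix[i + 1] == "C":
            out.append("coda")
        else:
            out.append("onset")  # includes a prefix-final C, still awaiting a V
    return out

def reanalyses(word):
    """Count how often a role assigned on an earlier prefix must be revised."""
    count, prev = 0, []
    for k in range(1, len(word) + 1):
        cur = roles(word[:k])
        count += sum(1 for a, b in zip(prev, cur) if a != b)
        prev = cur
    return count

print(reanalyses("CVCV"), reanalyses("CVCC"))
```

Parsing CVCV never revises earlier structure, while CVCC forces one revision: the second C is first taken as an onset and only later reparsed as a coda.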
 Processing: Conceptual “Question”
Processing (Symbolic): Theory
 “OT not psychologically plausible”
 Processing: Conceptual “Question”
Processing (Symbolic): Theory
 Actually, achieves Explanatory Goal ③
(empirical perspective): a competence theory
automatically entails an empirically fruitful
performance (processing) theory
Possible strong version – Explanatory Goal ④:
Substantive structure (①) of a UG module M
⇒ ④ Neural network instantiating M (nativism:
with genetic encoding)
General Biological Realization
 Neuro-genetics: Formal Results
Neural Representations (Gen)
Activation patterns: cat and its constituents
[σ k [æ t]] = Σ_i f_i ⊗ r_i, with filler/role bindings { f_i / r_i }: σ/r_ε, k/r_0, æ/r_01, t/r_11
[Figure: activation pattern for each binding and for the whole; x-axis: unit; area = activation level]
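A filler/role representation of this kind can be sketched with plain lists. The sketch assumes the bindings are combined as a sum of outer (tensor) products, and the particular filler and role vectors are hypothetical values; with orthonormal role vectors, each filler can be recovered exactly from the summed pattern.

```python
def outer(f, r):
    """Outer product of a filler vector and a role vector."""
    return [[fi * rj for rj in r] for fi in f]

def add(a, b):
    """Elementwise sum of two matrices of the same shape."""
    return [[x + y for x, y in zip(ra, rb)] for ra, rb in zip(a, b)]

def bind(pairs):
    """Superimpose all filler/role bindings into one activation pattern."""
    total = None
    for f, r in pairs:
        term = outer(f, r)
        total = term if total is None else add(total, term)
    return total

def unbind(pattern, r):
    """Recover the filler bound to role r (exact when roles are orthonormal)."""
    return [sum(x * rj for x, rj in zip(row, r)) for row in pattern]

# Hypothetical filler vectors and orthonormal role vectors.
FILLERS = {"σ": [1, 1, 1], "k": [1, 0, 1], "æ": [0, 1, 1], "t": [1, 1, 0]}
ROLES = {"r_ε": [1, 0, 0, 0], "r_0": [0, 1, 0, 0], "r_01": [0, 0, 1, 0], "r_11": [0, 0, 0, 1]}

cat = bind([(FILLERS["σ"], ROLES["r_ε"]), (FILLERS["k"], ROLES["r_0"]),
            (FILLERS["æ"], ROLES["r_01"]), (FILLERS["t"], ROLES["r_11"])])
print(unbind(cat, ROLES["r_01"]))
```

The whole structure is a single pattern over 3 × 4 units, yet each constituent remains recoverable from it.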
OT & Connectionism
 OT derives from the numerical
formalism, derived from connectionist
Harmony maximization, of
• Harmonic Grammar (Legendre, Miyata, &
Smolensky, 1990)
 Neuro-genetics: Formal Results
Neural Constraints (Con)
NOCODA: A syllable has no coda
[Figure: tree [σ k [æ t]] with a violation mark * on the coda t; activation pattern a_[σ k [æ t]]; weight matrix W]
H(a_[σ k [æ t]]) = −s_NOCODA < 0
 Neuro-genetics: Formal Results
UGenome for CV Theory
 The game: take a first shot at a concrete
example of a genetic encoding of UG in a
Language Acquisition Device
¿ Proteins ⇝ Universal grammatical principles ?
 Case study: Basic CV Syllable Theory
 Introduce an ‘abstract genome’ notion
parallel to (and encoding) ‘abstract neural
network’
 Collaborators
• Melanie Soderstrom
• Donald Mathis
 Neuro-genetics: Formal Results
Network Architecture
 /C1 C2/ → [C1 V C2]
[Figure: network with input units /C1 C2/, output units [C1 V C2], and C and V unit types]
 Neuro-genetics: Formal Results
PARSE
 All connection coefficients are +2
1
C
1
1
V
1
1
3
3
3
3
3
3
3
3
3
3
3
3
3
3
3
3
1
3
3
 Neuro-genetics: Formal Results
ONSET
 All connection coefficients are 1
C
V
 Neuro-genetics: Formal Results
Connectivity geometry
 Assume 3-d grid geometry
C
V
‘N’
‘E’
‘back’
 Neuro-genetics: Formal Results
Constraint: PARSE
 Input units grow south and connect
 Output units grow east and connect
 Correspondence units grow north & west
and connect with input & output units.
[Figure: PARSE connections among C and V units; numeric labels 1 and 3]
 Neuro-genetics: Formal Results
Connectivity Genome
 Contributions from ONSET and PARSE, by source unit type (Direction Extent Target):
CI: S L CC
VI: S L VC
CO: E L CC
VO: E L VC; N&S S VO; N S x0
CC: N L CI; W L CO
VC: N L VI; W L VO
x0: S S VO
 Key:
Direction: N(orth) S(outh) E(ast) W(est) F(ront) B(ack)
Extent: L(ong) S(hort)
Target: Input: CI VI; Output: CO VO x(0); Corr: VC CC
 Neuro-genetics: Formal Results
Processing
[P1] ∝ s1
1
R1  c
0
1
1
w1  [ P1 ] R
 s1c
Φ
2
R2  c 
0
W =i wi
Ψ
 Neuro-genetics: Formal Results
Learning
(during phase P+; reverse during P )
[ P1 ]  K1
1
1
1
 K1  L1  c
L1  G
 c
Φ
2
2
[ P2 ]   K2   L2   G
 c
When  and  are simultaneously active,
1
1
si  [ P1 ]   K1  L1  G
 c
Ψ
 Neuro-genetics: Formal Results
Learning Behavior
 A simplified system can be solved
analytically
 Learning algorithm turns out to ≈
si() = e [# violations of constrainti P ]
Conclusion
OT is enabling progress on several
explanatory goals for linguistic theory
 Inherent typology
 General learning theory
 General processing theory
 General biological realization
Often, OT formalizes Jakobson’s program
Thank you for your attention