General Morphology Thoughts

Morphology, Part 1
September 24, 2010
For Starters
• The “Turing Test”
• Conceived by the English mathematician/philosopher Alan Turing (1912-1954).
• Turing developed much of the theoretical groundwork for modern-day computing machines.
• He also worked on cracking enemy codes during World War II.
• The Turing Test: don’t ask whether or not a machine can
“think”; ask whether or not it can fool someone into thinking
it’s human in a natural language conversation.
• Check out ELIZA: http://psych.fullerton.edu/mbirnbaum/psych101/Eliza.htm
Moving On…
• First: a Simpsons-based Quick Write
• Second: review of concepts from last time…
• The set of rules that we know for creating sentences in a
language is the grammar of that language.
• The rules of grammar that we know are very abstract.
(patterns of patterns)
• Strings of words which do not adhere to these rules are
ungrammatical.
• Q: If these rules are so abstract, how did we figure out
what they are?
• How do we learn language?
Beneath the Surface
• Note: we learn the language that we hear as we grow up,
but…
• We never hear the rules.
• We can only learn from examples.
• Our knowledge of language is subconscious.
• Analogy: driving a car.
• This knowledge is difficult to characterize.
• (It is not explicitly taught to us.)
How is that possible?
• Theory: language acquisition is so hard that we can’t do it
by just observing other language users.
• (we need help)
• (Chomsky) Claim: every human being has a “Language
Acquisition Device” (LAD)
• LAD = innate knowledge of language.
• The LAD helps us learn language as we grow up.
• Interacts with experience.
Predictions
• The LAD theory makes some important predictions.
1. Universal Grammar (UG)
• All languages should share certain features in
common
• …due to the workings of LAD.
• A basic example:
• All languages have nouns and verbs.
2. Poverty of the Stimulus
• There should be properties of language that people
“know” without ever having experienced them.
An Impoverished Example
• How do you turn the following sentence into a yes/no
question?
• The boy who is sleeping is dreaming of a new car.
• = Is the boy who is sleeping dreaming of a new car?
• Not: *Is the boy who sleeping is dreaming of a new car?
• “The boy” is linked to the second “is”.
• Kids understand this connection without ever being
taught about the link.
• They never form the question the wrong way.
• Think: baby turtles crawling towards the ocean.
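The contrast between the two ways of forming the question can be sketched in a few lines of Python. This is only an illustration: the split of the sentence into subject and predicate is written in by hand, and the function names `linear_rule` and `structural_rule` are made up for this example.

```python
# Sketch: two candidate rules for turning a statement into a yes/no question.
# The subject/predicate split is supplied by hand; no real parser is involved.

def linear_rule(words):
    """Structure-blind rule: front the first "is" found in the string."""
    i = words.index("is")
    return ["is"] + words[:i] + words[i + 1:]

def structural_rule(subject, predicate):
    """Structure-dependent rule: front the auxiliary that begins the predicate."""
    aux, rest = predicate[0], predicate[1:]
    return [aux] + subject + rest

subject = "the boy who is sleeping".split()
predicate = "is dreaming of a new car".split()

wrong = " ".join(linear_rule(subject + predicate))
right = " ".join(structural_rule(subject, predicate))
print("*" + wrong)  # the form kids never produce
print(right)        # the form they do produce
```

The structure-blind rule only looks at word order, so it picks the “is” inside “who is sleeping”. The structure-dependent rule needs to know where the subject ends, which is exactly the knowledge nobody teaches explicitly.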
Half of the Story
• Recall that we can be creative with language because:
• We know the rules for putting sounds and words
together to form sentences.
• Patterns (Sentence = Noun + Verb)
• Patterns of Patterns (Recursive sentences)
• These rules = the grammar of the language we know.
• Q: What else do we need to know to be a competent
speaker of a language?
The Rest of the Story
• We need to know what units can be put together by the
rules of grammar.
• Including: the units of a sentence
• color, green, idea, sleep, furious, brown, dog, odor,
bark, angry, large, lizard...
• These units = the lexicon of the language we know
• From Ancient Greek: lexikon “dictionary”
• lexis = “word”
• Remember: language is discrete.
Knowledge of Language
Grammar (RULES):
1) Sentence = Noun + Verb
etc.
Lexicon (UNITS):
1) ragamuffin (N)
2) rotund (Adj)
3) rutabaga (N)
etc.
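As a rough sketch of how rules and units work together, the toy program below stores a few units with their parts of speech and applies the single rule Sentence = Noun + Verb. The dictionary `LEXICON` and the helper names are assumptions made for illustration, and the sketch ignores everything else a real grammar needs (agreement, articles, and so on).

```python
# Sketch: a lexicon (units) plus one rule of grammar (Sentence = Noun + Verb).
from itertools import product

# Toy lexicon: each unit is stored with its part of speech.
LEXICON = {
    "dog": "N",
    "lizard": "N",
    "bark": "V",
    "sleep": "V",
    "rotund": "Adj",
}

def words_of(category):
    """Return every lexical item with the given part of speech."""
    return [word for word, pos in LEXICON.items() if pos == category]

def sentences():
    """Apply the rule Sentence = Noun + Verb to every Noun/Verb pair."""
    return [f"{n} {v}" for n, v in product(words_of("N"), words_of("V"))]

print(sentences())
```

Adding a unit to the lexicon yields new sentences without touching the rule, and adding a rule yields new sentences without touching the lexicon.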
What’s in the Lexicon?
• Generally speaking, the lexicon contains:
• all the words in the language you know
• the building blocks of grammatical sentences
• Note, however:
• not only do lexical items differ from language to
language: (tree, Baum, arbre)
• …but one person’s lexicon might be different from
another’s
• It also happens to be a bit tricky to define exactly what a
“word” is…
Words, words, words
• Here’s a working definition--words are the smallest free
form elements of language:
• They do not have to occur in a fixed position with
respect to their neighbors.
• Example words:
bird, birds
cycle, recycle
talk, talked
happy, happiness
• Example “non-words”:
“-s”, “re-”, “-ed”, “-ness”
• The “non-words” cannot stand on their own:
• They have to be attached to something else.
Morphemes
• Words consist of one or more morphemes.
• Morphemes
• = the smallest meaningful unit of speech
• = a string of sound(s) that carries some information
about meaning or function.
• An example (non-word) morpheme: [-s] = plural marker
• Note the pattern:
bird    birds
dog     dogs
cat     cats
cow     cows
...etc.
Plural Formation
• Plural nouns in English are formed by rule:
Singular noun + [-s] → Plural noun
• So: plural nouns contain two morphemes:
• the singular noun (e.g., “bird”)
• the plural marker (e.g., “s”)
• The rule for putting them together is a word-formation
rule.
• Q: Are “bird” and “birds” two different words?
• Do we need two different entries for them in the
lexicon?
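One way to think about the question is to write the word-formation rule as a function. The sketch below works on spelling only and uses made-up names (`pluralize`, `morphemes_of_plural`); it deliberately ignores irregular plurals and pronunciation.

```python
# Sketch: the plural word-formation rule, applied to spelling only.

def pluralize(singular_noun, plural_marker="s"):
    """Singular noun + [-s] -> Plural noun."""
    return singular_noun + plural_marker

def morphemes_of_plural(singular_noun, plural_marker="s"):
    """List the two morphemes that make up a regular plural."""
    return [singular_noun, "-" + plural_marker]

for noun in ["bird", "dog", "cat", "cow"]:
    print(noun, "->", pluralize(noun), morphemes_of_plural(noun))
```

If the rule can build “birds” from “bird” on demand, then storing “birds” as a separate entry would be redundant.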
Language Model, version 2.0
Grammar (RULES):
Word-formation rules
Singular N + /-s/ → Plural N
Lexicon (MORPHEMES):
[bird]
[-s]
Morpheme Types
• Free morpheme: a morpheme that can stand on its own
• bird, toast
• cycle, happy
• Bound morpheme: a morpheme that must attach to another morpheme
• -s, -er
• re-, -ness
• Another distinction:
• simple words contain only one morpheme
• complex words contain more than one morpheme
Simple and Complex
simple
complex
Language Model, version 3.0
Grammar (RULES):
Word-formation rules
Singular N + /-s/ → Plural N
Lexicon (MORPHEMES):
Bound: [-s], [re-]
Free: [bird], [cycle]
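The version 3.0 lexicon can be pictured as a small data structure in which each morpheme carries a bound/free flag. The class `Morpheme` and the helpers below are hypothetical names chosen for this sketch, not part of any linguistic software.

```python
# Sketch: a lexicon of morphemes, each marked as bound or free.
from dataclasses import dataclass

@dataclass(frozen=True)
class Morpheme:
    form: str
    bound: bool  # True = must attach to another morpheme

LEXICON = {
    "bird": Morpheme("bird", bound=False),
    "cycle": Morpheme("cycle", bound=False),
    "-s": Morpheme("-s", bound=True),
    "re-": Morpheme("re-", bound=True),
}

def is_complex(parts):
    """A word is complex if it contains more than one morpheme."""
    return len(parts) > 1

def can_stand_alone(parts):
    """A single morpheme can be a word by itself only if it is free."""
    return len(parts) == 1 and not LEXICON[parts[0]].bound

print(is_complex(["bird"]), is_complex(["re-", "cycle"]))
print(can_stand_alone(["bird"]), can_stand_alone(["-s"]))
```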
Roots and Affixes
• Bound morphemes are also known as affixes
• Affixes attach to roots in word-formation rules
• Ex. 1: “birds”
• root = [bird] + affix = [-s]
• Ex. 2: “recycle”
• affix = [re-] + root = [cycle]
• Affixes which precede the root are known as prefixes
• Affixes which follow the root are known as suffixes
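The two examples above can be sketched as one function that reads the hyphen in the affix notation to decide which side of the root it goes on. The name `attach` is made up for this illustration, and the sketch only handles affixes written as “re-” or “-s”.

```python
# Sketch: attach an affix to a root and report whether it is a prefix or suffix.

def attach(root, affix):
    """The hyphen marks the side of the affix that faces the root."""
    if affix.endswith("-"):      # e.g. "re-": precedes the root, so a prefix
        return affix[:-1] + root, "prefix"
    if affix.startswith("-"):    # e.g. "-s": follows the root, so a suffix
        return root + affix[1:], "suffix"
    raise ValueError("affix must be written like 're-' or '-s'")

print(attach("bird", "-s"))
print(attach("cycle", "re-"))
```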
Infixes
• When affixes are inserted into the middle of a root, they
are known as infixes.
Bontoc (Philippines):
fikas “strong”    fumikas “to be strong”
kilad “red”       kumilad “to be red”
fusul “enemy”     fumusul “to be an enemy”
• Can this sort of thing happen in English?
• Abso-freakin’-lutely!
• (but it’s not particularly common)
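The Bontoc pattern above can be sketched as a tiny rule. The assumption here, read off the three examples only, is that “um” goes right after the first letter of the root; the function name `infix_um` is hypothetical.

```python
# Sketch: Bontoc-style infixation, based only on the three examples above.

def infix_um(root, infix="um"):
    """Insert the infix after the first letter of the root."""
    return root[0] + infix + root[1:]

for root in ["fikas", "kilad", "fusul"]:
    print(root, "->", infix_um(root))
```

Unlike a prefix or suffix, the affix here lands inside the root, so the rule has to say where the root gets split.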
Circumfixes
• In some languages, there are even circumfixes.
• Circumfixes attach to both sides of the root.
Chickasaw (Oklahoma)
chokma “he is good”     ikchokmo “he isn’t good”
lakna “it is yellow”    iklakno “it isn’t yellow”
palli “it is hot”       ikpallo “it isn’t hot”
German
lieb- “love” (root)     geliebt “loved”
frag- “ask” (root)      gefragt “asked”
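Both data sets can be sketched as rules that add material on each side of the root at once. The generalizations are read off the examples only (in the first set, “ik” is added at the front and the final vowel is replaced by “o”; in the German set, “ge” and “t” surround the root), and the function names are made up for this illustration.

```python
# Sketch: circumfixation, based only on the examples above.

def negate_ik_o(verb):
    """ik- ... -o: add "ik" in front and replace the final vowel with "o"."""
    return "ik" + verb[:-1] + "o"

def german_participle(root):
    """ge- ... -t: wrap the root, written here without its trailing hyphen."""
    return "ge" + root + "t"

for verb in ["chokma", "lakna", "palli"]:
    print(verb, "->", negate_ik_o(verb))
for root in ["lieb", "frag"]:
    print(root + "-", "->", german_participle(root))
```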
Hand in Hand
• Note: affixes are always bound morphemes.
• In English, roots tend to be free morphemes.
• However, this is not always the case:
• For instance: blueberry, blackberry…
• but: cranberry, huckleberry, raspberry.
• What do [cran-], [huckle-] and [rasp-] mean?
• Bound roots in English are called cranberry morphemes
• (technical term)
Cranberry Morphemes
• Cranberry morphemes are bound root morphemes.
• They have no independent meaning.
• They also have no part of speech.
• Some more examples:
• perceive, receive, deceive
• -ceive?
• infer, refer, defer
• -fer?
• commit, permit, submit
• -mit?
• Also: the liberation of cran?