The Sounds and Shapes of Words


The North American
Computational Linguistics
Olympiad
Haven’t you always wanted to see
Bulgaria?
What NACLO Is
 A challenge where high school students
compete on the field of linguistic,
computational, and analytic battle.
 Based around using creativity and logic to
solve puzzles that involve human
languages and language technologies.
 No prior knowledge of linguistics or
computer science is required--all you need
is a brain.
Linguistic Olympiads
 This family of contests began in the Soviet
Union during the mid-1960s.
 Several years ago, an international
competition was started.
 Last year was the first year a US team
participated.
 Despite the newness of this competition in
North America, our team won! The top score
was from a member of the US team.
 Let’s continue to show the world that we are
smarter than our television programming and
test scores would seem to indicate!
NACLO Pittsburgh
 First Round:
 February 5, at CMU [location]
 Registration at 9:00, contest starts at 10:00
 3 hours of challenging puzzles.
 Second Round
 March 11, at CMU [location]
 Harder puzzles, higher stakes
 The best contestants in the second round
(nationwide) will be selected for the American
National Team
ILO
 The International Linguistic Olympiad
will be held in Bulgaria in late July.
 Pending budgetary approval, the basic
travel expenses of the US Team will be
paid by NACLO.
 The team will also attend a summer
training camp, either in the US or in
Bulgaria.
What Linguistics Is
Or, how, if somebody asks me
how many languages I speak one
more time, I will strangle her.
What is Linguistics?
 Linguistics is, ostensibly (SAT word!)
“the scientific study of language.”
 I would argue that this is technically
untrue, but it is a useful myth to
perpetuate because it makes us look
less silly when we ask the National
Science Foundation for grant money.
 What, then, is linguistics?
Linguists like languages, but
they love Language
 Language is full of what I call “mundane
mysteries.”
 We use language every day, yet none of us really
knows for certain what we are doing when we
use it.
 Language use seems effortless for us, but it
involves knowing and following a broad range
of rules and conventions that govern...
 how to form words
 how to form sentences
 what kinds of words, sentences, and
pronunciations can be used in a particular context
Rules you know you know
 We know some rules of language:
 Don’t end sentences with prepositions (unless you
have nothing else to end them with).
 Only use a double negative if you mean a positive.
 What else?
 Most linguists are not particularly interested in
these rules.
 They will tell you that some of them were just
made up by bitter English teachers in order to
make the writing of essays less pleasurable.
Rules you know but don’t
know you know
 There are other rules, often more intricate,
that we all know as speakers of a language,
but of which we are not usually aware.
 Example: you cannot put reflexive pronouns
(himself, herself, itself, yourself, etc.) in the same
places where you can put normal pronouns (him,
her, it, you).
 Where do you put each kind?
 Not so easy, is it? You follow a rule 99% of the
time, without even being able to say what it is!
Linguistics is about saying it.
 Linguists try to make precise and falsifiable
statements about what the unspoken rules of
Language are, in many areas:
 The patterns in language sounds.
 The structure of words.
 The structure of sentences.
 The structure of discourse.
 The social use of language.
 The ways and reasons languages change.
Expletive Infixation
It's not just for swearing anymore.
Expletive Infixation
 Expletive infixation is a fancy name for
what you do when you put an expletive
(a swearword, curse, etc.) inside of
another word.
 You know these expletives, and it’s a
good thing, because we can’t say most of
them in a high school.
 We’ll use bloomin’ in order to downplay
our edgy, hipster image.
Examples
 Here are some examples of
expletive infixation:
 Pennsyl-bloomin’-vania
 Minne-bloomin’-sota
 exo-bloomin’-skeleton
 impe-bloomin’-cunious
Where does the expletive go?
 California
 Massachusetts
 Alabama
 Indiana
 Based on these examples, where in
the word do you put the expletive?
Where does the expletive go?
 Cà.li.fór.nia
 Mà.ssa.chú.setts
 À.la.bá.ma
 Ìn.di.á.na
 Based on these examples, where in
the word do you put the expletive?
But what about these?
 Vermont
 Nevada
 biology
 cohesion
 macguyverism
But what about these?
 Ver.mónt
 Ne.vá.da
 bi.ó.lo.gy
 co.hé.sion
 mac.gúy.ver.ism
Ah, but there are problems
 Where do expletive infixes go in the
following words?
 Cárrick
 Mífflin
 Téxas
 And what about these?
 Gréenfield
 Hómewood
 What's the problem here?
The story so far
 We try to place the expletive immediately
before a stressed syllable (but not
necessarily immediately before a stressed
vowel).
 We try to place the expletive after the first
syllable, though not always immediately
after the first syllable.
 Sometimes we cannot satisfy both of
these constraints.
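For the programmers in the room, here is a minimal sketch of the two constraints above. It is only an illustration of the story so far, not a full account of the rule. The function name `infix` and the input format (a list of syllables, each marked stressed or not) are assumptions made for this sketch. It returns `None` when no spot satisfies both constraints, which is exactly the case the slide says we still have to think about.

```python
def infix(syllables, expletive="bloomin'"):
    """Insert the expletive before the first stressed syllable that is
    not word-initial. Return None if no such position exists.

    syllables: list of (text, is_stressed) pairs, in order.
    """
    for i, (_, stressed) in enumerate(syllables):
        # i > 0 guarantees at least one syllable precedes the expletive.
        if i > 0 and stressed:
            before = "".join(text for text, _ in syllables[:i])
            after = "".join(text for text, _ in syllables[i:])
            return f"{before}-{expletive}-{after}"
    # Both constraints cannot be met at once (e.g. only the first
    # syllable is stressed), so this sketch declines to answer.
    return None


california = [("Ca", True), ("li", False), ("for", True), ("nia", False)]
vermont = [("Ver", False), ("mont", True)]
texas = [("Te", True), ("xas", False)]

print(infix(california))
print(infix(vermont))
print(infix(texas))
```

With the stress marks from the earlier slides, this gives Cali-bloomin'-fornia and Ver-bloomin'-mont, and gives up on Texas.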
The story so far
 When we can't do both, we have to
decide where to compromise.
 In what cases do we put the expletive at the
very beginning of a word?
 In what cases do we put the expletive before
an unstressed syllable?
 In compound words (words made out of
two smaller words), we seem to prefer
putting the expletive at the break between
the smaller words.
A final puzzle
 Consider two final examples
 ìrrespónsible
 ùnrelíable
 Same number of syllables
 Same stress pattern
 s = strong (stressed)
 w = weak (unstressed)
 swsww
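If you would like to check a stress pattern mechanically, the sketch below reads the accent marks used on these slides (a grave or acute accent marks a stressed syllable). The function name `stress_pattern` and the dotted syllable divisions for the two words are assumptions made for illustration; the slide gives the words without syllable breaks.

```python
import unicodedata


def stress_pattern(dotted_word):
    """Turn a dot-separated, accent-marked word into a string of
    's' (stressed) and 'w' (unstressed), one letter per syllable."""
    pattern = ""
    for syllable in dotted_word.split("."):
        # NFD splits an accented letter into base letter + combining mark.
        decomposed = unicodedata.normalize("NFD", syllable)
        # U+0300 is the combining grave accent, U+0301 the combining acute.
        stressed = any(ch in "\u0300\u0301" for ch in decomposed)
        pattern += "s" if stressed else "w"
    return pattern


print(stress_pattern("ìr.re.spón.si.ble"))
print(stress_pattern("ùn.re.lí.a.ble"))
```

Both words come out as swsww, so the stress pattern alone does not tell them apart.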
A final puzzle
 Most speakers of English prefer:
 irre-bloomin'-sponsible
 un-bloomin'-reliable
 Why the difference?
 sw-EXPL-sww (what we expected)
 s-EXPL-wsww (not what we expected)
A final puzzle
 “Ah, ha!” you say, “unreliable has a
prefix, and the expletive goes between
the prefix and the rest of the word.”
 Problem: irresponsible also has a prefix.
 responsible    irresponsible
 regular        irregular
 redeemable     irredeemable
 However, it turns out that un- and ir- are
different kinds of prefixes.
A final puzzle
 Doesn't change.
 unbelievable
 unmentionable
 undecided
 unnerving
 unrealistic
 unleavened
 ungrateful
 Does change
 imbalance
 immobile
 indefinite
 innocuous
 irreligious
 illegible
 ingratitude
Compare unnerving and innocuous: is the
n the same in both words, or is it longer in
one? Which one?
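The contrast between the two lists can be sketched in a few lines of code. This is a spelling-based approximation only: it looks at the first letter of the stem and ignores pronunciation, so it says nothing about how long the n sounds. The function names `attach_in` and `attach_un` are made up for this illustration.

```python
def attach_in(stem):
    """Attach the prefix in-, letting its final consonant adapt to the
    first letter of the stem (spelling only)."""
    first = stem[0]
    if first in "bmp":
        return "im" + stem
    if first == "l":
        return "il" + stem
    if first == "r":
        return "ir" + stem
    return "in" + stem


def attach_un(stem):
    """Attach the prefix un-, which keeps the same shape everywhere."""
    return "un" + stem


for stem in ["balance", "mobile", "definite", "religious", "legible"]:
    print(attach_in(stem))
for stem in ["believable", "mentionable", "realistic", "leavened"]:
    print(attach_un(stem))
```

The first loop reproduces forms from the "does change" list; the second reproduces forms from the "doesn't change" list with no special cases at all.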
Expletive infixation and our
knowledge of language
 At first glance, expletive infixation looks
quite inane, and not terribly complicated.
 However, on closer examination, we find
that...
 We have clear intuitions about how to do it.
 These intuitions are based on a rather clear
set of rules, of which we are not consciously
aware.
 These rules intersect with others in intricate
ways.
Because Swahili is for
Learners
A puzzle to get started.
Match the Words
 Swahili
 mbuzi
 kibuzi
 mgeni
 jito
 mtu
 jitu
 English
 ‘man’
 ‘giant (large man)’
 ‘kid (young goat)’
 ‘goat’
 ‘big river’
 ‘guest’
Strategies
 Look for recurring elements:
 recurring sequences of sounds (“letters”)
 recurring aspects of meaning
 Group like with like.
 Assume the principle of least coincidence:
 When choosing between hypotheses (about how
to divide words up, about what parts mean, etc.)
choose the hypothesis that makes the fewest
patterns look accidental.
 The more patterns you are able to “factor out” of
the data by applying your hypothesis, the more
likely it is to be on the right track.
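Here is a rough sketch of "group like with like" applied to the six Swahili words. The list `PREFIXES` is a hypothesis about how to divide the words up, supplied for this illustration and not given by the puzzle; the point is only to see how many patterns one hypothesis factors out. It does not assign any meanings.

```python
from collections import defaultdict

WORDS = ["mbuzi", "kibuzi", "mgeni", "jito", "mtu", "jitu"]

# Hypothesized word beginnings (an assumption to be tested, not a fact).
PREFIXES = ["ki", "ji", "m"]


def split(word):
    """Split a word into (prefix, rest) using the first matching prefix."""
    for prefix in PREFIXES:
        if word.startswith(prefix) and len(word) > len(prefix):
            return prefix, word[len(prefix):]
    return "", word


def group_by_rest(words):
    """Map each remainder to the list of prefixes it appears with."""
    groups = defaultdict(list)
    for word in words:
        prefix, rest = split(word)
        groups[rest].append(prefix)
    return dict(groups)


for rest, prefixes in group_by_rest(WORDS).items():
    print(rest, prefixes)
```

Under this hypothesis every word splits cleanly, and two remainders each show up with two different beginnings. Those recurring pairs are the places to look for recurring aspects of meaning in the English list.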
A Problem from a Mysterious
Language of Ancient Mongolia
Or, who did what to whose what?
Baby Steppes
 You will see some sentences from Orkhono-Yeniseyan translated into English.
 Orkhono-Yeniseyan was a language anciently
spoken in parts of Central Asia.
 Scrolls containing the language were found in
Mongolia near the confluence of the Orkhon
and Yenisey rivers (with which, I assume, we
are all familiar), thus the name.
 You will figure out the meanings of the words,
and a little bit of the grammar, so that you can
translate two sentences from OY to English
and from English to OY (because you are
just that devious).
1. Oghuling baliqigh alti.
‘Y’all’s son conquered the city.’
2. Baz oghuligh yangilti.
‘The vassal betrayed the son.’
3. Siz baliqimizin buzdingiz.
‘Y’all destroyed our city.’
4. Qaghanimiz oghulingin yangilti.
‘Our king betrayed y’all’s son.’
5. Oghulim barqingin buzdi.
‘My son destroyed y’all’s house.’
6. Siz qaghanigh yangiltingiz.
‘Y’all betrayed the king.’
7. Biz baliqigh altimiz.
‘We conquered the city.’
8. Bazim qaghanimizin yangilti.
‘My vassal betrayed our king.’
The sentences above are there to get you started. Here are the sentences you should translate:
1. Qaghan baliqigh alti.
2. Men barqigh buzdim.
3. The son conquered your city.
4. The king betrayed the vassal.
5. Y’all’s vassal destroyed my house.
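One possible first step on the eight starter sentences is to look for recurring elements in the verbs. The sketch below takes the last word of each sentence, groups it by the English verb in the translation, and finds the longest shared beginning in each group. The pairing of each final word with the English verb is an assumption about word order, and the function name `shared_beginnings` is made up for this illustration. The leftover endings are left for you to interpret.

```python
import os
from collections import defaultdict

# (last word of the sentence, English verb), copied from sentences 1-8.
VERBS = [
    ("alti", "conquered"),
    ("yangilti", "betrayed"),
    ("buzdingiz", "destroyed"),
    ("yangilti", "betrayed"),
    ("buzdi", "destroyed"),
    ("yangiltingiz", "betrayed"),
    ("altimiz", "conquered"),
    ("yangilti", "betrayed"),
]


def shared_beginnings(pairs):
    """For each English verb, return (shared beginning, leftover endings)."""
    forms = defaultdict(set)
    for form, gloss in pairs:
        forms[gloss].add(form)
    result = {}
    for gloss, group in forms.items():
        ordered = sorted(group)
        # os.path.commonprefix compares strings character by character.
        start = os.path.commonprefix(ordered)
        result[gloss] = (start, sorted(f[len(start):] for f in ordered))
    return result


for gloss, (start, endings) in shared_beginnings(VERBS).items():
    print(gloss, start, endings)
```

Comparing which leftover ending appears with which subject in the English translations is the natural next step, following the principle of least coincidence.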
Conclusion
 If you thought that was fun (and to be
perfectly frank, it was), please join us on
February 5th, 2008, for the second
annual NACLO-Pittsburgh.
 For more information on NACLO, visit
our new website at:
http://www.naclo.cs.cmu.edu/