
Historical Linguistics
• Language history
– drift: change by internal development
– contact: change by external borrowing
• Possible relations among languages
– family tree:
• similarity due to separate development from common ancestor
– diffusion of traits
• similarity due to borrowing in period of contact
– or, no provable relationship
• Tasks of historical linguistics
– inference of historical connections
– reconstruction of “proto” languages
July 16, 2015
Colonial Philology
• Thomas Jefferson corresponded with
many sources to obtain word lists in
Indian languages
• Examined and compared the results of
Peter the Great’s Siberian expeditions
• Benjamin Franklin also collected Indian
word lists
How many ages have elapsed since the English, Dutch, the Germans, the
Swiss, the Norwegians, Danes and Swedes have separated from their
common stock? Yet how many more must elapse before the proofs of
their common origin, which exist in their several languages, will
disappear? It is to be lamented then … that we have suffered so many
of the Indian tribes already to extinguish, without our having previously
collected and deposited in the records of literature, the general
rudiments at least of the languages they spoke. Were vocabularies
formed of all the languages spoken in North and South America,
preserving their appellations of the most common objects in nature, of
those which must be present to every nation barbarous or civilised, with
the inflections of their nouns and verbs, their principles of regimen and
concord, and these deposited in all the public libraries, it would furnish
opportunities to those skilled in the languages of the old world to
compare them with these, now or at a future time, and hence to
construct the best evidence of the derivation of this part of the human
race.
Thomas Jefferson,
Notes on the State of Virginia. [Written 1781-82].
Benjamin Barton sees a pattern
By a careful inspection of the vocabularies, the reader will find no
difficulty in discovering that in Asia the languages of the … tribes of
the Delaware-stock may be all traced to ONE COMMON SOURCE.
Nor do I limit this observation to the languages of the American
tribes just mentioned… HITHERTO, WE HAVE NOT DISCOVERED
IN AMERICA… ANY TWO, OR MORE LANGUAGES BETWEEN
WHICH WE ARE INCAPABLE OF DETECTING AFFINITIES (AND
THOSE VERY OFTEN STRIKING) EITHER IN AMERICA, OR IN
THE OLD WORLD.
New Views of the Origin of the Tribes and Nations of America
Benjamin Smith Barton M.D., Professor of Materia Medica, Natural
History and Botany, in the University of Pennsylvania (1798)
Barton’s hypothesis:
My inquiries seem to render it probable, that all
the languages of the countries of America may
… be traced to one or two great stocks…
Jefferson disagreed:
…imperfect as is our knowledge of the tongues spoken in
America, it suffices to discover the following remarkable fact.
Arranging them under the radical ones to which they may be
palpably traced, and doing the same by those of the red men of
Asia, there will be found probably twenty in America, for one in
Asia, of those radical languages, so called because, if they were
ever the same, they have lost all resemblance to one another. A
separation into dialects may be the work of a few ages only, but
for two dialects to recede from one another till they have lost all
vestiges of their common origin, must require an immense course
of time; perhaps not less than many people give to the age of the
earth. A greater number of those radical changes of language
having taken place among the red men of America, proves them
of greater antiquity than those of Asia.
Notes on the State of Virginia [Written 1781-82]
Though later,
Jefferson considered a sociolinguistic explanation…
Having heard that some Indians considered it
dishonorable to use any language but their own,
he suggested that when a part of a tribe
separated itself, the seceded group might refuse
to use the original language and invent their own.
“Perhaps this hypothesis presents less difficulty
than that of so many radically distinct languages
preserved by such handfuls of men from an antiquity
so remote that no data we possess will enable us
to calculate it.” [Ms. notes circa 1800]
Jefferson’s plans
• By 1801, he had collected vocabularies
for dozens of indigenous languages
– and began to arrange this for publication
“lest by some accident it might be lost”
• He put off publication in 1803
– due to the opportunity to include the results
of the Lewis & Clark expedition
The sad end of Jefferson’s linguistic career
• His linguistic papers were packed in a large
trunk and shipped back to Monticello in
1809 with his other effects
• The trunk was stolen during the trip up the
James River
– The disappointed thief dumped the contents in
the river
– Only a few items floated to shore and were
recovered
Jefferson to Barton (1809),
sent with Lewis’ vocabulary of Pani:
It is a specimen of the condition of the little that was
recovered. I am the more concerned at this accident,
as of the two hundred and fifty words of my
vocabularies, and the one hundred and thirty words
of the great Russian vocabularies … seventy three
were common to both, and would have furnished
materials… from which something might have resulted.
Perhaps I may make another attempt to collect,
although I am too old to expect to make much progress
in it.
Sir William (“Oriental”) Jones
• Lawyer appointed in 1783 to superintend
British jurisprudence in India
• Founded the Asiatic Society in Calcutta “for
Inquiring into the History, Civil and Natural,
the Antiquities, Arts, Sciences, and Literature,
of Asia”
• Learned Sanskrit because “the laws of the
natives must be preserved inviolate; but the
learning and vigilance of the English judge
must be a check upon the native interpreters”
One of the early European “orientalists”
– Cross-cultural pioneers?
– Agents of colonial domination?
Historical Context
• The British in India
– piecemeal conquest 1750-1900
• began with trade concessions in Calcutta and Bombay
• expanded one principality at a time
– mixture of direct and indirect rule
• many Indian institutions left in place
• rule mainly administered and enforced by Indians
– until 1850s, administration was in the hands of the
East India Company rather than the British Crown
India in 1785
Jones learns Sanskrit (1783-1786)
• Sanskrit
– Language of Hindu holy texts (1000 BC)
– Formalized by grammarians c. 600 BC
– Preserved to the present day as a
language of religion and learning
• No Brahman would teach a foreigner
– Jones hired a vaidya (doctor) as tutor while
the Brahmanic scholars were away on a
religious retreat
Jones’ Third Discourse (1786)
• Anniversary addresses to the Asiatic Society
– First Discourse: purposes and procedures of the Society
– Second Discourse: a detailed research program
– Third Discourse: on the nations of Asia
The five principal nations, who have in different ages divided among
themselves, as a kind of inheritance, the vast continent of Asia, with
the many islands depending on it, are the Indians, the Chinese, the
Tartars, the Arabs, and the Persians; who they severally were, whence
and when they came, where they now are settled, and what advantage
a more perfect knowledge of them all may bring to our European world,
will be shown, I trust, in five distinct essays; the last of which will
demonstrate the connexion or diversity between them, and solve the great
problem, whether they had any common origin, and whether that origin
was the same, which we generally ascribe to them.
The Indo-European Hypothesis
The Sanskrit language, whatever be its antiquity, is of a
wonderful structure; more perfect than the Greek; more
copious than the Latin, and more exquisitely refined than
either, yet bearing to both of them a stronger affinity, both
in the roots of verbs and in the forms of grammar, than
could possibly have been produced by accident; so
strong indeed, that no philologer could examine them all
three, without believing them to have sprung from some
common source, which, perhaps, no longer exists; there
is a similar reason, though not quite so forcible, for
supposing that both the Gothick and the Celtick, though
blended with a very different idiom, had the same origin
with the Sanskrit, and the old Persian might be added to
the same family.
Jones’ American connection
• Jones was a radical Whig and an early political
supporter of the American Revolution
• Met Benjamin Franklin at the Royal Society in 1771
• Visited Franklin in Paris in 1779, 1780, and 1782
– To explore compromise peace plans
– To deal with a client’s property claims in Virginia
– To obtain a pass for travel to America
• considered emigration to Charleston or Philadelphia!
• Many weeks of political and philosophical
conversations
• Indirect communication with Jefferson
Relations to the Virginia manuscript?
Indo-European Examples
English    Latin      Greek                        Sanskrit
father     pater      patêr                        pitar
brother    frater     phrater (fellow tribesman)   bhratar
two        duo        duo                          dva
three      tres       treis                        tryas
four       quattuor   tettares                     catvaras
seven      septem     hepta                        sapta
Jones’ methods
• Analyst must be “perfectly acquainted” with the
languages compared
• Meanings of proposed cognates must be nearly
identical
• Vowels should not be disregarded
• No metathesis or unexplained consonant insertions
• Transliterations must be systematic and careful
• Use basic vocabulary, not exotic words more likely to
be borrowed
Remember Barton
By a careful inspection of the vocabularies, the reader will find no
difficulty in discovering that in Asia the languages of the … tribes of
the Delaware-stock may be all traced to ONE COMMON SOURCE.
Nor do I limit this observation to the languages of the American
tribes just mentioned… HITHERTO, WE HAVE NOT DISCOVERED
IN AMERICA… ANY TWO, OR MORE LANGUAGES BETWEEN
WHICH WE ARE INCAPABLE OF DETECTING AFFINITIES (AND
THOSE VERY OFTEN STRIKING) EITHER IN AMERICA, OR IN
THE OLD WORLD.
New Views of the Origin of the Tribes and Nations of America
Benjamin Smith Barton M.D., Professor of Materia Medica, Natural
History and Botany, in the University of Pennsylvania (1798)
Thomas Jefferson again:
…imperfect as is our knowledge of the tongues spoken in
America, it suffices to discover the following remarkable fact.
Arranging them under the radical ones to which they may be
palpably traced, and doing the same by those of the red men of
Asia, there will be found probably twenty in America, for one in
Asia, of those radical languages, so called because, if they were
ever the same, they have lost all resemblance to one another. A
separation into dialects may be the work of a few ages only, but
for two dialects to recede from one another till they have lost all
vestiges of their common origin, must require an immense course
of time; perhaps not less than many people give to the age of the
earth. A greater number of those radical changes of language
having taken place among the red men of America, proves them
of greater antiquity than those of Asia.
Notes on the State of Virginia, 1787
The controversy continues
• (Like Barton) Joseph Greenberg (1987):
– All American languages in three groups:
• Eskimo-Aleut
• Na-Dene
• Amerind
• (Like Jefferson) Other scholars:
– The Amerind category is a fiction
– There are
• ~60 unrelated families in N. America
• ~19 unrelated families in C. America
• ~80 unrelated families in S. America
Different methods
• Mass comparison
– Cognate ratios (lexicostatistics)
– Glottochronology
– Typological features
• e.g. classifier systems
• Comparative reconstruction
– Determination of systematic sound laws
– Lexical and morphological reconstruction
“Laws” of sound change
• Meaning change is usually sporadic
• Sound change is usually systematic, e.g.
– t/d deletion (best, past, lost, etc.)
– short a raising (camera, man, vanish, etc.)
• “Neogrammarian hypothesis” (1870):
– All sound change is systematic
– Apparent exceptions: analysis is incomplete
– Article of faith with scholars known as
“the young grammarians”
Grimm’s Law
• Jakob Grimm (1822)
• Gradation of consonant manner
bh dh gh -> b d g
b d g -> p t k
p t k -> f th h
pater     father
tres      three
canis     hound
labium    lip
duo       two
ager      acre
bhratar   brother
dha       do
vah       wagon
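The three consonant series can be sketched as a single simultaneous mapping. This is only a toy illustration: the name `apply_grimm` is hypothetical, the Latin and Sanskrit forms stand in for the ancestral consonants, and vowel changes and the Verner's Law cases are ignored.

```python
# The three Grimm's Law series, applied as ONE simultaneous mapping so that
# the output of one rule is never fed into another
# (otherwise bh -> b -> p -> f would collapse the three series into one).
GRIMM = {
    "bh": "b", "dh": "d", "gh": "g",
    "b": "p", "d": "t", "g": "k",
    "p": "f", "t": "th", "k": "h",
}

def apply_grimm(segments):
    """Map each segment exactly once; vowels and other segments pass through."""
    return [GRIMM.get(seg, seg) for seg in segments]

print(apply_grimm(["p", "a", "t", "e", "r"]))
print(apply_grimm(["bh", "r", "a", "t", "a", "r"]))
```

Working on a list of segments rather than a raw string keeps "bh" and "th" as single units, which is what makes the one-pass lookup safe.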
Verner’s Law
• Karl Adolf Verner (1875)
• Fixes “gaps” in Grimm’s Law:
– voicing after accentless vowels
– applies to non-Grimm’s Law cases as well
– from PIE to Gothic in four algorithmic steps:
PIE                   pətér
GL (Grimm’s Law)      fəthér
(vowels)              fathár
VL (Verner’s Law)     fadár
AS (accent shift)     fádar
More on sound change
• Well attested in recent history
– e.g. English Great Vowel Shift
• Can study sound change in progress today
• Tends to produce tree-like histories
– operates on the system as a whole
– isn’t easily borrowed across languages
Problems with comparative reconstruction
• Requires detailed knowledge of
languages involved
• Must be enough cognates for patterns
to emerge
– and layers of borrowing to be identified and
discarded
• Maximum time depth of 5-10K years
– (Jefferson was right)
Cognate percentages
• Catherine the Great’s method
– make a list of appellations of the most common
objects in nature, of those which must be present
to every nation barbarous or civilised
• Standard lists devised by Morris Swadesh around 1950
– For each pair of languages, estimate the
proportion of cognate words
• Raw result is a table of percentages
– like a table of trip distances
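A minimal sketch of how such a table of percentages could be computed. It assumes the hard linguistic step is already done, i.e. each word has been coded by hand with a cognate-class label. The language names, labels, and the function name `cognate_percent` are hypothetical.

```python
from itertools import combinations

# Hypothetical toy data: for each language, one cognate-class label per
# meaning slot (same label = judged cognate). None marks a missing word.
word_lists = {
    "LangA": ["c1", "c2", "c3", "c4"],
    "LangB": ["c1", "c2", "c9", "c4"],
    "LangC": ["c7", "c2", "c9", None],
}

def cognate_percent(a, b):
    """Percentage of meaning slots attested in both lists that are cognate."""
    pairs = [(x, y) for x, y in zip(a, b) if x is not None and y is not None]
    if not pairs:
        raise ValueError("no meanings attested in both lists")
    shared = sum(1 for x, y in pairs if x == y)
    return 100.0 * shared / len(pairs)

# One percentage per unordered pair of languages.
table = {
    (p, q): cognate_percent(word_lists[p], word_lists[q])
    for p, q in combinations(word_lists, 2)
}
for (p, q), pct in table.items():
    print(f"{p}-{q}: {pct:.0f}%")
```

Skipping slots that are missing in either list is one possible convention; counting them as non-cognate would lower the percentages.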
Example
Gunu [two lists]
82 Elip
85 90 Mmala [two lists]
78 90 89 Yangben [two lists]
77 81 81 88 Baca [two lists]
66 72 72 77 78 Mbule [two lists]
58 63 64 66 70 69 Bati
42 41 42 42 42 46 45 Hijuk [two lists]
39 38 41 38 37 40 41 88 Basaa
Central Yambasa languages (Cameroon)
Questions about lexicostatistics
• “Genetic descent” vs. borrowing
– borrowing creates non-tree structures
• Variability of rate of change
– Swadesh: 14% per millennium
• Expected rate of false cognates
• How to combine with other evidence
• Inference of tree structure
– from cognate percentages
– from detailed account of shared traits
Historical inference
from linguistic and genetic data
Potentially “…the best evidence of the
derivation of … the human race”
(Thomas Jefferson)
BUT
Inferences are complex
methods and results from several disciplines
Intellectual stakes are high
Work has often been careless
sometimes spectacularly so
dangers of overinterpretation and “scientism”
General methodological problems
• Not all graphs are trees
– “treeness” tests often left out
– “treeness” hypothesis can often be rejected
• Tree inference may be underdetermined
– Branching structure
– Root choice
• Rates of change may not be constant
– for different markers
– across time
• Gene trees (and language trees) may not be
population trees
• Biology and language are complicated
– simplifying assumptions are sometimes perniciously
mistaken
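One standard "treeness" check is the four-point condition on additive distances. The slides do not say which test is meant, so this is an assumed example: cognate proportions are turned into distances with -log, and for every quartet the two largest of the three pair-sums should be equal if the data fit a tree. All data and function names below are hypothetical.

```python
import math
from itertools import combinations

def four_point_gap(p, quartet):
    """Gap between the two largest pair-sums of log-distances.

    p maps frozenset({x, y}) to a cognate proportion in (0, 1].
    On an exactly additive tree the gap is 0 for every quartet.
    """
    a, b, c, d = quartet

    def dist(x, y):
        return -math.log(p[frozenset((x, y))])

    sums = sorted([dist(a, b) + dist(c, d),
                   dist(a, c) + dist(b, d),
                   dist(a, d) + dist(b, c)])
    return sums[2] - sums[1]

def max_gap(p, taxa):
    """Worst four-point violation over all quartets of taxa."""
    return max(four_point_gap(p, q) for q in combinations(taxa, 4))

# Hypothetical tree ((A,B),(C,D)): leaf branches 0.9, internal branch 0.8.
tree_like = {frozenset(k): v for k, v in {
    ("A", "B"): 0.81, ("C", "D"): 0.81,
    ("A", "C"): 0.648, ("A", "D"): 0.648,
    ("B", "C"): 0.648, ("B", "D"): 0.648,
}.items()}
# Hypothetical non-tree data: similarity falls off unevenly along A-B-C-D.
chain_like = {frozenset(k): v for k, v in {
    ("A", "B"): 0.8, ("B", "C"): 0.8, ("C", "D"): 0.8,
    ("A", "C"): 0.7, ("B", "D"): 0.7, ("A", "D"): 0.6,
}.items()}

print(max_gap(tree_like, "ABCD"))
print(max_gap(chain_like, "ABCD"))
```

Real cognate counts are noisy, so in practice the gap would be compared against a tolerance rather than against exactly zero.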
Trees vs. Clines (etc.)
• A tree structure represents the results of a
sequence of splits in population (or language)
– no further influences among separate branches
– if rates of change are constant, distances should
be quantized
• Within an interbreeding (intercommunicating)
population, distances reflect the amount of
gene flow (transmission of linguistic traits)
– should correlate strongly with accessibility
– e.g. geographical distance in the simplest case
The… procedures outlined here provide a rigorous method for
inferring whether the geographical pattern of variation is consistent
with an historical split (fragmentation) or no split (recurrent gene
flow) using criteria that are completely explicit. For example, in
analyzing the mtDNA of tiger salamanders, a clear split into eastern
and western lineages was detected for mtDNA. Using the same
explicit criteria, there was no split among any human populations.
Quite the contrary, the present analysis documents recurrent and
continual genetic interchange among all Old World human
populations throughout the entire time period marked by mtDNA.
Accordingly, estimating a date for a 'split' of Africans from non-Africans
based on evidence from mtDNA is certainly allowed by
many computer programs, but the results are meaningless because
a date is being assigned to an 'event' that never occurred.
Templeton (1997)
Methods for tree inference
(“phylogeny”)
• Two general approaches
– clustering (easier but cruder)
– generate and evaluate alternative trees
• Distance-based methods
– based on matrix of distances/similarities
• Parsimony
– based on set of partly-shared characters or traits
http://evolution.genetics.washington.edu/phylip/software.html
documents 193 different phylogeny packages
Cognate percentages
for 8 Vanuatu languages
Toga
64 Mosina
64 58 Peterara
57 51 65 Nduindui
29 28 34 32 Sakao
51 45 55 52 40 Malo
39 39 45 41 43 50 Fortsenal
52 48 57 60 31 48 45 Raga
Data from Guy (1994)
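As an illustration of the "clustering (easier but cruder)" approach, here is a sketch of greedy average-linkage clustering on the table above. This is not Guy's reconstruction algorithm, and it implicitly treats rates of change as roughly equal across branches. The function names are hypothetical.

```python
from itertools import combinations

names = ["Toga", "Mosina", "Peterara", "Nduindui",
         "Sakao", "Malo", "Fortsenal", "Raga"]
# Lower triangle of the cognate-percentage table (row i vs. columns 0..i-1).
lower = [
    [],
    [64],
    [64, 58],
    [57, 51, 65],
    [29, 28, 34, 32],
    [51, 45, 55, 52, 40],
    [39, 39, 45, 41, 43, 50],
    [52, 48, 57, 60, 31, 48, 45],
]
sim = {}
for i, row in enumerate(lower):
    for j, pct in enumerate(row):
        sim[frozenset((names[i], names[j]))] = pct

def average_similarity(c1, c2):
    """Mean cognate percentage over all cross-cluster language pairs."""
    vals = [sim[frozenset((a, b))] for a in c1 for b in c2]
    return sum(vals) / len(vals)

def average_linkage(leaves):
    """Repeatedly merge the two most similar clusters; return the merges."""
    clusters = [(name,) for name in leaves]
    merges = []
    while len(clusters) > 1:
        c1, c2 = max(combinations(clusters, 2),
                     key=lambda pair: average_similarity(*pair))
        merges.append((c1, c2, average_similarity(c1, c2)))
        clusters = [c for c in clusters if c not in (c1, c2)] + [c1 + c2]
    return merges

merges = average_linkage(names)
for c1, c2, s in merges:
    print(f"{'+'.join(c1)}  |  {'+'.join(c2)}  ({s:.1f}%)")
```

The first merge joins Peterara and Nduindui, simply because 65 is the largest entry in the table. Nothing in the procedure checks whether the resulting tree actually predicts the other 27 percentages.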
Reconstruction Algorithm
(Guy 1994)
“A message is input at the root of a tree-shaped transmission
network, whence it is transmitted to the terminal nodes. As they
travel, copies of the original message are affected by errors
consisting in randomly selected segments of the message being
replaced by other segments randomly drawn from a pool of
possible segments (the "alphabet" of the message). The
problem is: from the garbled versions of the original message
collected at the terminal nodes, reconstruct the network and the
history of the transmission of the message.”
“Additive-distance” tree with weights on branches rather
than on nodes -- doesn’t assume constant rate of change…
Explanatory force of the model
• Set of distances grows as $\frac{N^2 - N}{2}$
• Set of binary-tree branch labels
grows as $2(N - 1)$
• For 8 languages: we predict 28 numbers
(the inter-language cognate proportions)
with 14 numbers
(the binary tree branch proportions)
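The two counts can be tabulated for a few values of N to see how quickly the gap widens. A small sketch, with hypothetical function names:

```python
def n_pairwise(n):
    """Number of distinct language pairs: N(N-1)/2."""
    return n * (n - 1) // 2

def n_branches(n):
    """Number of branches in a rooted binary tree with n leaves: 2(N-1)."""
    return 2 * (n - 1)

print("N  pairs  branches")
for n in (4, 8, 16, 32):
    print(n, n_pairwise(n), n_branches(n))
```

The number of observations grows quadratically while the number of model parameters grows only linearly, so a good fit becomes harder to achieve by accident as languages are added.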
Inferred tree
Toga      -830-----:-919-----:-972-----:-947-----:
Mosina    -770-----'         |         |         |
Peterara  -----829-----------'         |         |
Nduindui  -----795-----------:-949-----'         |
Raga      -----755-----------'                   |
Sakao     -----567-----------:-883-----:-895-----'
Fortsenal -----759-----------'         |
Malo      ----------772----------------'
Mosina/Toga:     .77*.83 = .6391 (really 64%)
Peterara/Mosina: .829*.919*.77 = .5866 (really 58%)
Peterara/Toga:   .829*.919*.830 = .6323 (really 64%)
from Guy (1994)
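The worked products generalize to any pair of languages: the predicted cognate proportion is the product of the branch proportions along the path joining them. A sketch, with the branch values read off the tree (in thousandths) and hypothetical names (`j1`..`j4`, `k1`, `s1`, `s2`) for the internal nodes:

```python
# Branch retention proportions read off the inferred tree (Guy 1994).
# Internal node names are hypothetical labels, not from the source.
edges = {
    ("Toga", "j1"): 0.830, ("Mosina", "j1"): 0.770,
    ("j1", "j2"): 0.919, ("Peterara", "j2"): 0.829,
    ("j2", "j3"): 0.972, ("k1", "j3"): 0.949,
    ("Nduindui", "k1"): 0.795, ("Raga", "k1"): 0.755,
    ("j3", "j4"): 0.947, ("s2", "j4"): 0.895,
    ("s1", "s2"): 0.883, ("Malo", "s2"): 0.772,
    ("Sakao", "s1"): 0.567, ("Fortsenal", "s1"): 0.759,
}

# Undirected adjacency map: the tree is treated as unrooted.
graph = {}
for (u, v), w in edges.items():
    graph.setdefault(u, {})[v] = w
    graph.setdefault(v, {})[u] = w

def predicted(a, b):
    """Product of branch proportions along the unique path from a to b."""
    stack = [(a, None, 1.0)]
    while stack:
        node, parent, prod = stack.pop()
        if node == b:
            return prod
        for nxt, w in graph[node].items():
            if nxt != parent:
                stack.append((nxt, node, prod * w))
    raise ValueError(f"no path from {a} to {b}")

for a, b in [("Mosina", "Toga"), ("Peterara", "Mosina"), ("Peterara", "Toga")]:
    print(f"{a}/{b}: {predicted(a, b):.4f}")
```

Because the computation only walks the path between two leaves, it never needs to know where the root is.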
True - predicted
cognate percentages
Toga
 0 Mosina
 1 -1 Peterara
 1 -1  4 Nduindui
-2 -1  0  0 Sakao
 2  0  2  3  1 Malo
-3  0 -1 -2  0 -2 Fortsenal
-1 -1 -1  0  1  1  4 Raga
The model fits very well!
Where’s the root?
Isn’t it obvious?
Toga      -830-----:-919-----:-972-----:-947-----:--Protolanguage
Mosina    -770-----'         |         |         |
Peterara  -----829-----------'         |         |
Nduindui  -----795-----------:-949-----'         |
Raga      -----755-----------'                   |
Sakao     -----567-----------:-883-----:-895-----'
Fortsenal -----759-----------'         |
Malo      ----------772----------------'
Oops: other options
protolanguage
Toga      -830-----:-919-----:-972-----:-947-----:
Mosina    -770-----'         |         |         |
Peterara  -----829-----------'         |         |
Nduindui  -----795-----------:-949-----'         |
Raga      -----755-----------'                   |
Sakao     -----567-----------:-883-----:-895-----'
Fortsenal -----759-----------'         |
Malo      ----------772----------------'
And some more…
protolanguage
Toga      -830-:-919-:-972-:-947-:-895-:-883-:-567- Sakao
Mosina    -770-'     |     |           |     `-759- Fortsenal
Peterara  -----829---'     |           `---772----- Malo
Nduindui  -----795---:-949-'
Raga      -----755---'
In the absence of other constraints, the root can be placed anywhere
in the tree without changing the model’s fit!
Possible “other constraints”
• Historical evidence
– about earlier forms
– about structure of relationships among
contemporary forms
• “outgroup”
• Constraints on rate of change
– linguistic (or genetic) “clock”
A universal constant
for glottochronology?
Thirteen sets of data, presented in partial justification of
these assumptions, serve as a basis for calculating a
universal constant to express the average rate of
retention k of the basic-root morphemes:
k = 0.8048 ± 0.0176 per millennium,
with a confidence limit of 90%.
Lees (1953)
Some of Lees’ data:
Language                 Years   Words   Cognates   Rate (per millennium)
English                   1000     209        160   .766
Latin/Spanish             1800     200        131   .790
Latin/French              1850     200        125   .776
German                    1100     214        180   .854
Middle Egyptian/Coptic    2200     200        106   .760
Greek                     2070     213        147   .836
Chinese                   1000     210        167   .795
Swedish                   1050     207        176   .853
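The Rate column can be related to the other three under the usual glottochronological assumption of constant exponential decay: if a fraction c/n of the list survives after t years, the per-millennium retention is (c/n) raised to the power 1000/t. The slide does not spell the formula out, so it is an assumption here, as is the function name.

```python
def retention_per_millennium(words, cognates, years):
    """Per-millennium retention r solving cognates/words = r ** (years/1000)."""
    return (cognates / words) ** (1000.0 / years)

# (language, years, words, cognates) taken from the table above.
rows = [
    ("English", 1000, 209, 160),
    ("German", 1100, 214, 180),
    ("Greek", 2070, 213, 147),
]
for name, years, words, cognates in rows:
    rate = retention_per_millennium(words, cognates, years)
    print(f"{name}: {rate:.3f}")
```

For the English, German and Greek rows this gives the tabulated rate to three decimals.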
Some more retentive languages
(rates per 1000 years)
Language            100-word list   200-word list
Icelandic (rural)   99%             97.6%
Icelandic (urban)   98%             96.2%
Georgian            96.5%           89.9%
Armenian            97.8%           94%
Bergsland & Vogt (1962)
Some less retentive ones
Bergsland & Vogt estimate of vocabulary retention in East
Greenlandic as .722 in 600 years, or .34 per millenium.
David Lithgow (pers. com. circa 1970) has observed a
replacement of some 20% of the basic vocabulary in
Muyuw (Woodlark island) in one generation. Raise 0.8
to the 33rd power, and that gives you the retention rate
of Muyuw per 1000 years should it continue to evolve
at that rate: 0.06%.
Jacques Guy (1994)
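Guy's Muyuw arithmetic is easy to check. The sketch below assumes, as the quote implies, about 33 generations per millennium, each retaining 80% of the basic vocabulary. The variable names are hypothetical.

```python
# Assumption implied by the quote: roughly 30-year generations, so about
# 33 generations per millennium, each retaining 80% of basic vocabulary.
retention_per_generation = 0.8
generations_per_millennium = 33

retention = retention_per_generation ** generations_per_millennium
print(f"{retention:.5f} = {100 * retention:.2f}%")
```

Compounding matters here: a loss that looks modest per generation leaves almost nothing after a thousand years.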
“Language chains”
A
.77 B
.65 .76 C
Configurations like this are taken as prima facie evidence of
“non-treeness”, to be attributed to borrowing/mixing/cline
types of situations. But in fact they can also easily be generated
by variable rates of change:
A ----------- 90% -----------.
                             |____ protolanguage
B ---- 95% ----.             |
               |---- 90% ----'
C ---- 80% ----'
Note that the required difference in mean rate of change
is only (.9-.9*.8)/.9 = .2 , or 20%
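A quick check that the variable-rate tree really does reproduce the chain-like table. The branch values come from the diagram; the variable names are hypothetical.

```python
# Branch retentions from the diagram: A hangs directly off the root;
# B and C share an internal branch (90%) below the root.
a, b, c, internal = 0.90, 0.95, 0.80, 0.90

# Shared-cognate proportion = product of retentions along the joining path.
shared = {
    ("A", "B"): a * internal * b,
    ("A", "C"): a * internal * c,
    ("B", "C"): b * c,
}
for pair, p in shared.items():
    print(pair, f"{p:.2f}")
```

A strict tree with unequal branch rates therefore yields the same pattern that is usually read as evidence of borrowing or a cline.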
Mitochondrial Genome
Mitochondrial family tree
Mitochondrial phylogeny
Three fascinating “results”
• Mitochondrial Eve
• Mitochondrial Clans
• The three-wave theory: converging
linguistic and genetic evidence
Mitochondrial Eve
Cann, Stoneking, and Wilson (1987):
mtDNA comparisons of 147 people from
Europe, Africa, Asia, Australia, and New
Guinea show that all present human
mtDNA is descended from a single
African woman who lived about 200,000
years ago.
First problem
• Computer program was used to find a tree
consistent with the mtDNA data
• But so were many other (unreported)
trees!
– order of answers depended on order of data
– root could be effectively anywhere in the
dataset
• e.g. Melanesian Eve, Asian Eve, European Eve…
Other problems
• mtDNA may not change at a constant rate
• mtDNA changes may be adaptive
• Gene trees may not be population trees
– DNA (including mtDNA) can spread by
gradual flow or by range expansion
– spread can be influenced by other factors
Early results: Native Americans come from four genetic lineages,
labeled A through D.
Amerinds have all four lineages,
NaDene only A, and Eskaleuts A and D.
Current results:
The four mtDNA lineages divide into nine distinct genetic subtypes.
All four lineages are in all three language groups.
Many local populations have all four lineages and a number even have
all the subtypes.
All subtypes can be found in North, Central and South America.
“It isn't realistic to believe that the same lineages ended up in all these
populations across two continents by separate migrations."
http://www.oxfordancestors.com/:
Oxford Ancestors
We put the Genes in Genealogy
Oxford Ancestors is the World's first organization to harness
the power and precision of modern DNA-based genetics in
the service of genealogy.
MatriLine™ interprets your deep maternal ancestry, linking
you - if your roots are in Europe - to one of seven women:
Ursula, Tara, Helena, Katrine, Velda, Xenia or Jasmine.
And mtDNA inheritance
may not even be entirely clonal!
• Mice
– demonstration of “paternal leakage”
• Hagelberg
– rare mtDNA mutation in Vanuatu
• Eyre-Walker
– statistics of mtDNA “homoplasies”
Island evidence
• Erika Hagelberg (Proc. R. Soc. 1999)
– Island of Nguna (Vanuatu, Melanesia)
– 3 main mtDNA population groups
• as expected for the region
– In all three groups, the same mutation is
sometimes found
• previously known only from one Northern European
– Repeated chance mutation is unlikely
• local spread by recombination seems more probable
Statistics of mtDNA “homoplasies”
• Mutations that occur in different mtDNA
haplogroups around the world
• Assuming purely maternal inheritance, these
were thought to represent chance recurrence of
mutations in “hypervariable” regions
• Eyre-Walker et al. (Proc. R. Soc. 1999):
– regions are not statistically more variable than others
– mutations cluster geographically
• Macaulay (1999) counters
– much of the result comes from a dataset that may be
errorful
– “no need to panic”
Reaction of another mtDNA aficionado:
…I am reminded of a comment by a bishop’s wife
in Victorian England, also concerning human origins:
“Let us hope that it isn’t true, and if it is, that it will
not become generally known.”