The Evolution of Colour Terms
Explaining Typology
Mike Dowman
Language, Evolution and Computation Research Unit,
University of Edinburgh
3 September, 2005
Colour Term Typology
There are clear typological patterns in how
languages name colour.
→ Neurophysiology of the visual system
→ or a cultural explanation?
• Constraints on learnable languages
• or an evolutionary process?
Basic Colour Terms
Most studies look at a subset of all colour
terms:
• Terms must be psychologically salient
• Known by all speakers
• Meanings are not predictable from the
meanings of their parts
• Don’t name a subset of colours named by
another term
Number of Basic Terms
English has red, orange, yellow, green, blue,
purple, pink, brown, grey, black and white.
crimson, blonde, taupe are not basic.
All languages have 2 to 11 basic terms
• Except Russian and Hungarian, which may have 12
Prototypes
Colour terms have good and marginal
examples → prototype categories
• People disagree about the boundaries of
colour word denotations
• But agree on the best examples – the
prototypes
Berlin and Kay (1969) found that this was
true both within and across languages.
World Colour Survey
110 minor languages (Kay, Berlin and Merrifield,
1991; Kay et al., 1997; Kay and Maffi, 1999)
• All surveyed using Munsell arrays
Black, white, red, yellow, green and blue
seem to be fundamental colours
• They are more predictable than derived
terms (orange, purple, pink, brown and
grey)
Evolutionary Trajectories
• white-red-yellow + black-green-blue
• white + red-yellow + black-green-blue
• white + red-yellow + black + green-blue
• white + red + yellow + black + green-blue
• white + red + yellow + black + green + blue
• white + red + yellow + black-green-blue
• white + red + yellow + green + black-blue
• white + red + yellow-green-blue + black
• white + red + yellow-green + blue + black
Derived Terms
• Brown and purple terms often occur
together with green-blue composites
• Orange and pink terms don’t usually occur
unless green and blue are separate
• But sometimes orange occurs without
purple
• Grey is unpredictable
• No attested turquoise or lime basic terms
Exceptions and Problems
• 83% of languages on main line of trajectory
• 25 languages were in transition between stages
• 6 languages didn’t fit trajectories at all
  • Kuku-Yalanji (Australia) has no consistent term for green
  • Waorani (Ecuador) has a yellow-white term that does not include red
  • Gunu (Cameroon) contains a black-green-blue composite and a separate blue term
Neurophysiology and Unique Hues
Red and green, yellow and blue are opposite
colours
De Valois and Jacobs (1968):
→ There are cells in the retina that respond
maximally to either one of the unique hues,
black or white
Heider (1971):
→ The unique hues are especially salient
psychologically
Tony Belpaeme (2002)
• Ten artificial people
• Colour categories represented with
adaptive networks
• CIE-LAB colour space used (red-green,
yellow-blue, light-dark)
• Agents try to distinguish target from
context colours (the guessing game).
• Correction given in case of failure
Emergent Languages
• Coherent colour categories emerged that
were shared by all the artificial people
• Colour space divided into a number of
regions – each named by a different colour
word
• But some variation between speakers
→ No explanation of typology
Belpaeme and Bleys (2005)
Colour terms represented using points in the
colour space
Colours chosen from natural scenes, or at random
→ Few highly saturated colours
Emergent colour categories tend to be clustered at
certain points in the colour space
Similarity with the World Colour Survey (WCS) was
greatest when natural colours were used and
communication was simulated
Colour Space in Bayesian Acquisitional Model
[Figure: colour space with the unique hue positions labelled: red 7, yellow 19, green 26, blue 30; orange and purple are also marked.]
Possible Hypotheses
[Figure: example hypotheses labelled low probability, medium probability and high probability.]
Equations
Bayes’ Rule
$$P(h \mid d) = \frac{P(d \mid h)\,P(h)}{P(d)}$$
Probability of an accurate example at
colour c within h, if hypothesis h is correct:
$$\frac{R_c}{R_h}$$
Probability of an erroneous example at
colour c:
$$\frac{R_c}{R_t}$$
$R_c$ is the probability of remembering an example at colour c
$R_h$ is the sum of $R_c$ for all c in hypothesis h
$R_t$ is the sum of $R_c$ for the whole of the colour space
Probability of the data
Problem – we don’t know which examples are accurate
p is the probability for each example that it is accurate
e is an example
E is the set of all examples
$$P(d \mid h) = \prod_{e \in E} P(e \mid h)$$
Probability for examples outside of the
hypothesis (must be inaccurate):
$$P(e \mid h) = \frac{(1 - p)\,R_c}{R_t}$$
Probability for examples inside of the
hypothesis (may be accurate or
inaccurate):
$$P(e \mid h) = \frac{p\,R_c}{R_h} + \frac{(1 - p)\,R_c}{R_t}$$
Summing over the set H of all hypotheses:
$$P(d) = \sum_{h \in H} P(h)\,P(d \mid h)$$
Hypothesis Averaging
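Substituting into Bayes’ rule:
$$P(h \mid d) = \frac{P(h)\,P(d \mid h)}{\sum_{h_i \in H} P(h_i)\,P(d \mid h_i)} = \frac{P(d \mid h)}{\sum_{h_i \in H} P(d \mid h_i)}$$
The second form holds when every hypothesis has the same prior probability, so that P(h) cancels.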
We really want to know the probability that
each colour can be denoted by the colour
term
So, sum probabilities for all hypotheses
that include the colour in their denotation
Doing this for all colours produces fuzzy
sets
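Hypothesis averaging can be sketched as follows. The function name `fuzzy_membership`, the set-of-indices representation of hypotheses, and the toy numbers in the usage note are assumptions for illustration; equal priors are assumed, so each posterior is the likelihood divided by the sum of all likelihoods.

```python
def fuzzy_membership(hypotheses, likelihoods, n_colours):
    """Degree to which each colour can be denoted by the colour term.

    hypotheses: list of sets of colour indices (assumed representation)
    likelihoods: P(d | h_i) for each hypothesis, in the same order
    n_colours: number of colours in the colour space
    """
    total = sum(likelihoods)
    posteriors = [lik / total for lik in likelihoods]   # P(h_i | d), equal priors
    membership = [0.0] * n_colours
    for h, post in zip(hypotheses, posteriors):
        for c in h:
            membership[c] += post        # sum over hypotheses that include c
    return membership
```

With three hypothetical hypotheses {0, 1}, {1, 2} and {2, 3} whose likelihoods are 0.2, 0.6 and 0.2, colours 1 and 2 receive membership 0.8 and colours 0 and 3 receive 0.2, which gives the graded boundaries of a fuzzy set.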
Urdu
[Figure: curves for the Urdu colour terms Lal, Pila, Hara, Nila and Banafshai, plotted against hue (red at left to purple at right) on a vertical scale from 0 to 1, with the unique hues marked.]
Evolutionary Model
• Start: a speaker is chosen.
• A hearer is chosen.
• A colour is chosen.
• Decide whether the speaker will be creative.
• Yes (P = 0.001): the speaker makes up a new word to label the colour.
• No (P = 0.999): the speaker says the word which they think is most likely
to be a correct label for the colour, based on all the
examples that they have observed so far.
• The hearer hears the word and remembers
the corresponding colour. This example will be
used to determine the word to choose when it
is the hearer’s turn to be the speaker.
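One interaction of this loop can be sketched in Python. This is a simplified illustration under stated assumptions, not the model itself: the Bayesian word choice is replaced by a nearest-remembered-example rule, `N_COLOURS = 40` and the wrap-around hue distance are hypothetical, a speaker with no examples is assumed to invent a word, and ageing and replacement of people are omitted.

```python
import random

CREATIVITY = 0.001   # P(speaker is creative), as in the flowchart
N_COLOURS = 40       # hypothetical number of hues (assumption)

def best_word(memory, colour):
    """Stand-in for the Bayesian choice: word with the nearest remembered example."""
    def distance(a, b):
        d = abs(a - b)
        return min(d, N_COLOURS - d)          # hues assumed to wrap around
    return min(memory, key=lambda w: min(distance(colour, c) for c in memory[w]))

def interact(population, rng, new_word):
    """Run one speaker-hearer interaction.

    population: list of memories, each a dict mapping word -> list of colours
    rng: a random.Random instance
    new_word: callable returning a fresh word each time it is called
    """
    speaker, hearer = rng.sample(population, 2)
    colour = rng.randrange(N_COLOURS)
    if not speaker or rng.random() < CREATIVITY:
        word = new_word()                     # creative: make up a new word
    else:
        word = best_word(speaker, colour)     # otherwise use the best known word
    hearer.setdefault(word, []).append(colour)  # hearer remembers the example
    return word, colour
```

Calling `interact` repeatedly on a list of empty dictionaries, with a `new_word` callable that returns unused strings, builds up each person's remembered examples one at a time.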
Evolutionary Simulations
• Average lifespan (number of colour
examples remembered) set at:
18, 20, 22, 24, 25, 27, 30, 35, 40, 50, 60, 70,
80, 90, 100, 110 or 120
• 25 simulation runs in each condition
Languages spoken at the end were analysed:
• Only people over half average lifespan
included
• Only terms for which at least 4 examples
had been remembered were considered
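The two inclusion criteria can be sketched as a filter. The data layout is an assumption for illustration: each person is an `(age, memory)` pair, where `age` is the number of examples remembered so far and `memory` maps each word to its list of remembered colours.

```python
def analysable_terms(people, average_lifespan, min_examples=4):
    """Keep only the people and terms that meet the inclusion criteria.

    people: list of (age, memory) pairs (assumed representation)
    average_lifespan: average number of colour examples remembered in a lifetime
    min_examples: minimum number of remembered examples for a term to count
    """
    result = []
    for age, memory in people:
        if age <= average_lifespan / 2:
            continue                      # only people over half the average lifespan
        kept = {w: cs for w, cs in memory.items() if len(cs) >= min_examples}
        result.append(kept)
    return result
```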
Analyzing the Results
Speakers didn’t have identical languages
→ Criteria needed to classify the language
spoken in each simulation
• For each person, terms classified as red,
yellow, green, blue, purple, orange, lime,
turquoise or a composite (e.g. blue-green)
• Terms must be known by most adults
• Classification favoured by the most people
chosen
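The last two criteria amount to a vote over speakers, sketched below. The function name and input layout are hypothetical, "most adults" is read here as a strict majority, and ties between classifications are broken arbitrarily; none of these details is stated on the slide.

```python
from collections import Counter

def classify_language(per_person):
    """Combine per-person term classifications into one language.

    per_person: list of dicts, one per adult, mapping word -> classification
                label such as 'red' or 'blue-green' (assumed representation)
    """
    n = len(per_person)
    votes = {}
    for labels in per_person:
        for word, label in labels.items():
            votes.setdefault(word, Counter())[label] += 1
    language = {}
    for word, counter in votes.items():
        if sum(counter.values()) * 2 > n:                   # known by most adults
            language[word] = counter.most_common(1)[0][0]   # most favoured label
    return language
```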
Typological Results
[Figure: Percentage of colour terms of each type in the simulations and the World Colour Survey. x-axis: type of colour term (Red, Yellow, Green, Blue, R-Y, Y-G, G-B, B-R, R-Y-G, Y-G-B, G-B-R); y-axis: percent of terms of this type (0 to 30); series: WCS, Simulations.]
Derived Terms
• 80 purple terms
• 20 orange terms
• 0 turquoise terms
• 4 lime terms
Divergence from Trajectories
• 1 Blue-Red term
• 1 Red-Yellow-Green term
• 3 Green-Blue-Red terms
Most emergent systems fitted trajectories:
• 340 languages fitted trajectories
• 9 contained unattested colour terms
• 35 had no consistent name for a unique hue
• 37 had an extra term
Adding Random Noise
[Figure: percent of terms of each type. x-axis: type of colour term (Red, Yellow, Green, Blue, R-Y, Y-G, G-B, B-R, R-Y-G, Y-G-B, G-B-R); y-axis: percent of terms of this type (0 to 30); series: WCS, No noise, 50% noise.]
Derived terms with noise
• 60.6% purple
• 26.8% orange
• 0.3% turquoise
• 9.9% lime
The model is very robust to noise
Number of Colour Terms Emerging
[Figure: mean number of basic colour terms in emergent languages (y-axis, 2 to 5.5) against the number of accurate colour examples remembered during an average lifetime (x-axis, 0 to 120); series: No noise, 50% noise.]
Implications of Number of Words Emerging
→ Languages are complex because we talk
a lot
→ Not because complex languages help us
to communicate
• No communication ever takes place
• So no truly functional pressures
Conclusions
(1) Colour term typology is a product of the uneven
spacing of unique hues in the conceptual colour
space.
Problem: we might be able to obtain similar results
with a significantly different model.
(2) Colour term typology can be explained as a
product of learning biases and cultural
evolution.