The Evolution of Conventions
H. Peyton Young
Presented by Na Li and Cory Pender
What is a convention?
Customary behavior
Self-enforcing
Not always symmetric
People follow it given that others do
Examples
• Driving on the right
• Eating with utensils
• Men propose to women
How are conventions “chosen”?
A convention is an equilibrium, but there
could be others
Some equilibria are inherently more
reasonable (Harsanyi and Selten)
One equilibrium more prominent
(Schelling)
Evolutionary explanation
Past plays influence players’ choices
One equilibrium eventually becomes
more prevalent
This paper shows that behavior will
converge over time to a Nash equilibrium,
given some limitations on the game
The model
n people randomly selected from large
population
Base actions on sampling of plays from
recent past
No individual learning
Mistakes possible
“Adaptive play”
Goals
In weakly acyclic games:
– If samples are sufficiently incomplete and
memory is finite, play converges to a Nash equilibrium
With mistakes:
– Almost always converges to a particular
equilibrium
Adaptive play
n-person game G; player i has strategy set S_i
Population N divided into classes C_1, C_2, ..., C_n
G played once per period; t = 1, 2, ...
Play at time t is s(t) = (s_1(t), s_2(t), ..., s_n(t))
An agent in class C_i has utility u_i(s)
History of plays is h(t) = (s(1), s(2), ..., s(t))
Choosing strategies
Choose m, k such that 1≤k≤m
In period t+1, where t ≥ m:
– Each player sees k plays from past m
periods
– k/m is completeness of information
– Plays are not necessarily equally likely to
be seen
First m plays random
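H consists of all sequences of length m
drawn from ∏ S_i
Finite Markov chain on H with initial state h(m)
Successor of h ∈ H is h' ∈ H
For s ∈ S_i, p_i(s|h) is the probability that agent i chooses s given history h
p_i( · |h) is a best-reply distribution
– p_i(s|h) > 0 iff s is i's best reply to some sample of size k
– p_i(s|h) independent of t
Probability of moving from h to h' is ∏_{i=1..n} p_i(s_i|h), where s = (s_1, ..., s_n) is the newest play in h'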
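The transition described above can be sketched in Python as follows. This is a minimal illustration, not the paper's formal construction: the function names (`best_replies`, `adaptive_play_step`), the uniform sampling of k plays, and the uniform tie-breaking among best replies are all assumptions made for the example. Each agent averages its payoff over the sampled plays, which for two players is the same as replying to the opponent's sampled frequencies.

```python
import random

def best_replies(payoff, i, sample, strategies):
    """Strategies of player i with the highest average payoff against the sample.

    payoff(i, profile) returns player i's utility; sample is a list of profiles.
    """
    def average(s_i):
        total = 0.0
        for profile in sample:
            trial = list(profile)
            trial[i] = s_i          # replace i's own past action with the candidate
            total += payoff(i, tuple(trial))
        return total / len(sample)

    scores = {s: average(s) for s in strategies[i]}
    best = max(scores.values())
    return [s for s, v in scores.items() if v >= best - 1e-12]

def adaptive_play_step(history, payoff, strategies, k, rng):
    """Return the successor h' of h: drop the oldest play, append the new one."""
    new_play = []
    for i in range(len(strategies)):
        sample = rng.sample(history, k)   # assumption: uniform draw of k of the m plays
        new_play.append(rng.choice(best_replies(payoff, i, sample, strategies)))
    return history[1:] + [tuple(new_play)]

# Hypothetical 2x2 coordination game: payoff 1 if both choose the same action.
def coordination(i, profile):
    return 1.0 if profile[0] == profile[1] else 0.0

h = [(0, 0)] * 3                          # m = 3, a strict Nash repeated m times
h_next = adaptive_play_step(h, coordination, [[0, 1], [0, 1]], 2, random.Random(0))
print(h_next == h)                        # the state reproduces itself
```

In the coordination example the history of m identical strict Nash plays maps to itself, which is the absorbing-state behavior discussed on the next slide.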
Convergence of adaptive play
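h is an absorbing state iff it consists of a strict Nash equilibrium
played m times in succession
h = h' = (s, s, ..., s)
Convergence implies a strict Nash equilibrium
– But the existence of a strict Nash equilibrium does not guarantee convergence
– Play can cycle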
Use weakly acyclic games
Best-reply graph
[Figure: best-reply graph with vertices s, s' and s*]
Theorem
G is a weakly acyclic n-person game
L(s) = length of the shortest path from s to a strict Nash equilibrium
L_G = max_s L(s)
If k ≤ m/(L_G + 2), adaptive play “almost
surely” converges to a convention
Main idea: If information is sufficiently
incomplete, adaptive play converges
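The bound in the theorem can be computed for a small game with the sketch below. It assumes the usual definition of the best-reply graph, which the slides do not spell out: an edge s → s' exists when exactly one player changes strategy and the new strategy is a best reply to the others' strategies in s. The function names and the Yield/Not Yield payoffs (taken from the example used later in the talk) are illustrative choices.

```python
import math
from collections import deque
from itertools import product

def best_reply_graph(payoff, strategies):
    """Map each strategy profile to the profiles reachable by one best-reply move."""
    edges = {s: [] for s in product(*strategies)}
    for s in edges:
        for i in range(len(strategies)):
            def u(a, s=s, i=i):
                return payoff(i, s[:i] + (a,) + s[i + 1:])
            best = max(u(a) for a in strategies[i])
            for a in strategies[i]:
                if a != s[i] and u(a) == best:
                    edges[s].append(s[:i] + (a,) + s[i + 1:])
    return edges

def max_path_length_to_sink(edges):
    """L_G: the largest shortest-path length to a sink, or None if some profile has no path."""
    reverse = {s: [] for s in edges}
    for s, out in edges.items():
        for t in out:
            reverse[t].append(s)
    sinks = [s for s, out in edges.items() if not out]
    dist = {s: 0 for s in sinks}
    queue = deque(sinks)
    while queue:                       # breadth-first search backwards from all sinks
        t = queue.popleft()
        for s in reverse[t]:
            if s not in dist:
                dist[s] = dist[t] + 1
                queue.append(s)
    return max(dist.values()) if len(dist) == len(edges) else None

def largest_sample_size(m, L_G):
    """Largest integer k with k <= m / (L_G + 2)."""
    return m // (L_G + 2)

# Yield ("Y") / Not Yield ("N") game with the payoffs from the example slide.
BOS = {("Y", "Y"): (0, 0), ("Y", "N"): (1, math.sqrt(2)),
       ("N", "Y"): (math.sqrt(2), 1), ("N", "N"): (0, 0)}

def bos_payoff(i, profile):
    return BOS[profile][i]

L_G = max_path_length_to_sink(best_reply_graph(bos_payoff, [("Y", "N"), ("Y", "N")]))
print(L_G, largest_sample_size(6, L_G))
```

For this game every profile is at most one best-reply move from a strict Nash equilibrium, so the condition reduces to k ≤ m/3.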
Proof
Positive probability that:
– At some t + 1, all agents sample last k plays
(call this µ)
– From periods t + 1 to t + k, all agents choose
sample µ
– Each agent chooses the same best reply to µ,
k times in a row
So positive probability of a run (s, s, ..., s)
from t + 1 to t + k
If s is a strict Nash:
Positive probability that from t + k + 1 to
t + m, each agent samples last k plays
s is played for m - k more periods, then
absorbing state has been reached
If s is not a strict Nash:
There is a best-reply path from s to a strict Nash equilibrium s_r
along the path s → s_1 → s_2 → ... → s_r
For s → s_1:
– Player i samples from periods t + 1 to t + k (i.e.
samples s)
– Everyone else samples µ
– Positive probability that these will occur for the next k
periods
By similar argument, you can move from s1 to s2,
and so on to sr
Hence the limit on the size of k: all the required samples must fit within the last m periods
Example
Battle of the sexes
– Opera vs. football game - yield or not yield
                      Man
                  Yield     Not Yield
Woman  Yield       0,0        1,√2
       Not Yield   √2,1       0,0
Why must we limit k?
Let k = m
Consider initial sequence where they both
yielded/both didn’t yield
To decide the next round: pick the choice with
the highest expected payoff (in this case,
each yields if 1 - f > f√2, where f is the fraction
of sampled plays in which the other player yielded)
What would happen if k is bounded as
specified by the theorem?
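The k = m case can be traced with a short deterministic simulation. This is a sketch under the assumptions stated on this slide: both players see the entire memory, the initial m plays are identical for both, and each yields exactly when 1 - f > f√2. The function name and the "Y"/"N" encoding are illustrative.

```python
import math

def simulate_full_sampling(m, periods, start="Y"):
    """Adaptive play with k = m, starting from m rounds in which both played `start`."""
    history = [(start, start)] * m
    plays = []
    for _ in range(periods):
        new_play = []
        for i in (0, 1):
            # f: fraction of remembered rounds in which the other player yielded
            f = sum(1 for p in history if p[1 - i] == "Y") / m
            new_play.append("Y" if 1 - f > f * math.sqrt(2) else "N")
        history = history[1:] + [tuple(new_play)]
        plays.append(tuple(new_play))
    return plays

plays = simulate_full_sampling(m=4, periods=12)
print(plays)
```

Because both players observe the same symmetric history, they always compute the same f and choose the same action, so every round is either (Y, Y) or (N, N) and play never settles on an equilibrium.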
Is this the best we can do?
Note that the theorem guarantees
convergence to an equilibrium
– But which equilibrium?
Also, it seems unlikely that people
would always play best response
perfectly
Back to our example...
With slightly different payoffs
                      Man
                  Yield       Not Yield
Woman  Yield       0,0          1,√2
       Not Yield   √2/2, 1/2    0,0
Let k = 1, m = 3
We can imagine a situation where
– Both yield in the first round
– Neither yields in the second round
– In the 3rd round, the woman samples the yielding
round, the man the not-yielding round
– What would be each player’s best reply?
– Next round?
– Get stuck in suboptimal equilibrium
Perhaps introducing mistakes could
solve this problem
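The questions on this slide can be worked through in a few lines of Python. The sketch assumes that the woman chooses the row and receives the first payoff in each cell, and the man chooses the column and receives the second; the slides do not state this ordering explicitly, and the helper names are illustrative.

```python
import math

# PAYOFFS[(woman, man)] = (woman's payoff, man's payoff); ordering is an assumption.
PAYOFFS = {
    ("Y", "Y"): (0.0, 0.0),
    ("Y", "N"): (1.0, math.sqrt(2)),
    ("N", "Y"): (math.sqrt(2) / 2, 0.5),
    ("N", "N"): (0.0, 0.0),
}

def woman_best_reply(man_action):
    return max("YN", key=lambda w: PAYOFFS[(w, man_action)][0])

def man_best_reply(woman_action):
    return max("YN", key=lambda a: PAYOFFS[(woman_action, a)][1])

history = [("Y", "Y"), ("N", "N")]        # rounds 1 and 2
# Round 3 with k = 1: the woman samples round 1, the man samples round 2.
third = (woman_best_reply(history[0][1]), man_best_reply(history[1][0]))
print(third, PAYOFFS[third])
```

Under this reading the third round is (Not Yield, Yield) with payoffs (√2/2, 1/2). Each action is a best reply to the other, so if both players keep sampling such rounds the profile repeats until it fills the memory, even though both would be better off at (Yield, Not Yield) with payoffs (1, √2).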
Simulation