The Evolution of Conventions

H. Peyton Young
Presented by Na Li and Cory Pender
What is a convention?
Customary behavior
Self-enforcing
Not always symmetric
People follow it given that others do
Examples
• Driving on the right
• Eating with utensils
• Men propose to women
How are conventions “chosen”?

A convention is an equilibrium, but there could be others

Some equilibria are inherently more reasonable (Harsanyi and Selten)

One equilibrium is more prominent (Schelling)
Evolutionary explanation
Past plays influence players’ choices
One equilibrium eventually becomes more prevalent
This paper shows that behavior converges over time to a Nash equilibrium, given some restrictions on the game

The model
n people randomly selected from a large population
They base their actions on a sample of plays from the recent past
No individual learning
Mistakes are possible
This process is called “adaptive play”

Goals

In weakly acyclic games:
– If samples are sufficiently incomplete and memory is finite, play converges to a Nash equilibrium

With mistakes:
– Play almost always converges to a particular equilibrium
Adaptive play
n-person game G; player i has strategy set $S_i$
Population N divided into classes $C_1, C_2, \dots, C_n$
G played once per period; t = 1, 2, ...
Play at time t is $s(t) = (s_1(t), s_2(t), \dots, s_n(t))$
Each agent in class $C_i$ has utility $u_i(s)$
History of plays is $h(t) = (s(1), s(2), \dots, s(t))$
Choosing strategies
Choose m, k such that 1 ≤ k ≤ m
In period t + 1, where t ≥ m:

– Each player sees k plays from the past m periods
– k/m is the completeness of information
– Plays are not necessarily equally likely to be seen
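First m plays random
H consists of all sequences of length m drawn from $\prod_i S_i$
Finite Markov chain on H with initial state h(m)
A successor of $h \in H$ is an $h' \in H$ obtained by dropping the oldest play and appending a new one
For $s \in S_i$, $p_i(s \mid h)$ is the probability that agent i plays s given history h
$p_i(\cdot \mid h)$ is a best-reply distribution

– $p_i(s \mid h) > 0$ iff s is i’s best reply to some sample of size k drawn from h
– $p_i(s \mid h)$ is independent of t

The probability of moving from h to h’ is $\prod_{i=1}^{n} p_i(s_i \mid h)$, where $(s_1, \dots, s_n)$ is the play appended to form h’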
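One transition of this Markov chain can be sketched in code. The sketch below is a minimal illustration and not the paper's own implementation: it assumes a two-player game, assumes samples are drawn uniformly without replacement (the slides note that plays need not be equally likely to be seen), and breaks ties among best replies uniformly. The names `best_replies`, `adaptive_play_step`, `u`, `ACTIONS` and `PAYOFFS` are hypothetical, and the payoffs are those of the yield/not-yield example used later in the talk.

```python
import random
from math import sqrt

def best_replies(payoff, own_actions, opp_sample):
    """Actions maximising average payoff against the sampled opponent actions."""
    scores = {a: sum(payoff(a, b) for b in opp_sample) / len(opp_sample)
              for a in own_actions}
    top = max(scores.values())
    return [a for a in own_actions if abs(scores[a] - top) < 1e-12]

def adaptive_play_step(history, m, k, actions, payoffs, rng):
    """Return the successor h' of h: drop the oldest play, append the new one."""
    recent = history[-m:]
    new_play = []
    for i in (0, 1):
        sample = rng.sample(recent, k)            # assumption: uniform sampling
        opp_sample = [play[1 - i] for play in sample]
        choices = best_replies(payoffs[i], actions[i], opp_sample)
        new_play.append(rng.choice(choices))      # assumption: uniform tie-break
    return recent[1:] + [tuple(new_play)]

def u(own, opp):
    """Yield game payoff: the yielder gets 1, the non-yielder sqrt(2), else 0."""
    if own == opp:
        return 0.0
    return 1.0 if own == "Y" else sqrt(2)

ACTIONS = (("Y", "N"), ("Y", "N"))
PAYOFFS = (u, u)
```

Starting from a history made of the same strict Nash play repeated m times, every sample leads each player back to the same action, so the step returns the history unchanged.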
Convergence of adaptive play
h is an absorbing state iff it consists of a strict Nash equilibrium played m times
h = h’ = (s, s, ..., s)
Convergence ⇒ strict Nash equilibrium

– But a strict Nash equilibrium does not guarantee convergence
– Play may cycle

Use weakly acyclic games
Best-reply graph
[Figure: best-reply graph with vertices s, s’, and s*]
Theorem
G is a weakly acyclic n-person game
L(s) = length of the shortest path from s to a strict Nash equilibrium
$L_G = \max_s L(s)$
If $k \le m/(L_G + 2)$, adaptive play “almost surely” converges to a convention
Main idea: if information is sufficiently incomplete, adaptive play converges
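The quantities in the theorem can be computed for a small game. The sketch below is an illustration under stated assumptions: it assumes an edge s → s’ exists when exactly one player switches to a different strategy that is a best reply to the others' strategies in s, and it treats vertices with no outgoing edge as the strict Nash equilibria. The function names and the `yield_game` payoff function (the yield/not-yield example from later in the talk) are hypothetical.

```python
from collections import deque
from itertools import product
from math import sqrt

def best_reply_graph(strategies, payoff):
    """Map each profile s to the profiles reachable by one player's best reply."""
    graph = {}
    for s in product(*strategies):
        succ = []
        for i, S_i in enumerate(strategies):
            def u(a, i=i, s=s):
                return payoff(i, s[:i] + (a,) + s[i + 1:])
            top = max(u(a) for a in S_i)
            for a in S_i:
                if a != s[i] and abs(u(a) - top) < 1e-12:
                    succ.append(s[:i] + (a,) + s[i + 1:])
        graph[s] = succ
    return graph

def path_lengths(graph):
    """L(s): shortest path length from s to a sink, or None if no sink is reachable."""
    lengths = {}
    for start in graph:
        seen, queue, found = {start}, deque([(start, 0)]), None
        while queue:
            node, d = queue.popleft()
            if not graph[node]:          # sink: nobody has a different best reply
                found = d
                break
            for nxt in graph[node]:
                if nxt not in seen:
                    seen.add(nxt)
                    queue.append((nxt, d + 1))
        lengths[start] = found
    return lengths

def yield_game(i, s):
    """Yielder gets 1, non-yielder sqrt(2); matching actions give 0."""
    if s[0] == s[1]:
        return 0.0
    return 1.0 if s[i] == "Y" else sqrt(2)

STRATS = (("Y", "N"), ("Y", "N"))
lengths = path_lengths(best_reply_graph(STRATS, yield_game))
weakly_acyclic = all(v is not None for v in lengths.values())
L_G = max(lengths.values())
max_k = 3 // (L_G + 2)    # largest k allowed by the theorem when m = 3
```

For this game every profile is at most one best-reply step from a strict Nash equilibrium, so with m = 3 the bound allows only k = 1.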

Proof

Positive probability that:
– At some t + 1, all agents sample the last k plays (call this sample µ)
– From periods t + 1 to t + k, all agents draw the sample µ
– Each agent makes the same best reply to µ k times in a row

So there is positive probability of a run (s, s, ..., s) from t + 1 to t + k
If s is a strict Nash:

Positive probability that from t + k + 1 to t + m, each agent samples the last k plays

s is played for m - k more periods, then an absorbing state has been reached
If s is not a strict Nash:
There is a best-reply path from s to a strict Nash $s_r$: $s \to s_1 \to s_2 \to \dots \to s_r$
For $s \to s_1$:

– Player i samples from periods t + 1 to t + k (i.e., samples s)
– Everyone else samples µ
– Positive probability that this occurs for the next k periods
By a similar argument, play can move from $s_1$ to $s_2$, and so on to $s_r$
Hence the limit on the size of k: the sampled plays must still be within the last m periods

Example

Battle of the sexes
– Opera vs. football game: yield or not yield

                     Man: Yield    Man: Not Yield
Woman: Yield         0, 0          1, √2
Woman: Not Yield     √2, 1         0, 0
Why must we limit k?
Let k = m
Consider an initial sequence where they both yielded or both didn’t yield
To decide the next round: pick the choice with the highest expected payoff (in this case, each yields if 1 - f > f√2, where f is the fraction of sampled rounds in which the other player yielded)
What would happen if k were bounded as the theorem specifies?
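The k = m case can be traced with a short deterministic sketch. It assumes the original yield game payoffs (the yielder gets 1, the non-yielder √2, and 0 otherwise), memory m = 2, and an initial history of one round where both yield followed by one where neither does. These specifics, and the name `full_memory_play`, are illustrative assumptions rather than details from the slides.

```python
from math import sqrt

def full_memory_play(history, rounds):
    """k = m: each player best-replies to the other's yield frequency over all of memory."""
    m = len(history)
    plays = list(history)
    for _ in range(rounds):
        recent = plays[-m:]
        new = []
        for i in (0, 1):
            # f = fraction of remembered rounds in which the other player yielded
            f = sum(1 for p in recent if p[1 - i] == "Y") / m
            new.append("Y" if 1 - f > f * sqrt(2) else "N")
        plays.append(tuple(new))
    return plays

plays = full_memory_play([("Y", "Y"), ("N", "N")], rounds=12)
miscoordinated = all(p[0] == p[1] for p in plays)
```

Because both players see the same complete history and apply the same rule, they keep choosing the same action as each other, so in this sketch play cycles and never reaches an equilibrium in which exactly one of them yields.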

Is this the best we can do?

Note that the theorem guarantees
convergence to an equilibrium
– But which equilibrium?

Also, it seems unlikely that people
would always play best response
perfectly
Back to our example...

With slightly different payoffs

                     Man: Yield     Man: Not Yield
Woman: Yield         0, 0           1, √2
Woman: Not Yield     √2/2, 1/2      0, 0
Let k = 1, m = 3
We can imagine a situation where

– Both yield in the first round
– Neither yields in the second round
– In the 3rd round, the woman samples the yielding round and the man samples the not-yielding round
– What would be each player’s best reply?
– Next round?
– Play gets stuck in a suboptimal equilibrium

Perhaps introducing mistakes could solve this problem
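The scenario above can be walked through in code. This sketch rests on an assumed reading of the payoff table: rows are the woman's action, columns are the man's, and each cell lists (woman's payoff, man's payoff). It also assumes that after round 3 both players happen to sample the most recent play. The names `PAYOFF`, `woman_reply` and `man_reply` are hypothetical.

```python
from math import sqrt

# Assumed reading: key = (woman's action, man's action),
# value = (woman's payoff, man's payoff).
PAYOFF = {
    ("Y", "Y"): (0.0, 0.0),
    ("Y", "N"): (1.0, sqrt(2)),
    ("N", "Y"): (sqrt(2) / 2, 0.5),
    ("N", "N"): (0.0, 0.0),
}

def woman_reply(man_action):
    """Woman's best reply to a single sampled action of the man (k = 1)."""
    return max("YN", key=lambda w: PAYOFF[(w, man_action)][0])

def man_reply(woman_action):
    """Man's best reply to a single sampled action of the woman (k = 1)."""
    return max("YN", key=lambda a: PAYOFF[(woman_action, a)][1])

history = [("Y", "Y"), ("N", "N")]   # (woman, man) in rounds 1 and 2

# Round 3: the woman samples round 1 (the man yielded),
# the man samples round 2 (the woman did not yield).
history.append((woman_reply(history[0][1]), man_reply(history[1][0])))

# Assumption: from now on both players sample the most recent play.
for _ in range(3):
    last = history[-1]
    history.append((woman_reply(last[1]), man_reply(last[0])))

stuck = history[-3:] == [("N", "Y")] * 3
```

Under these assumptions the last m = 3 plays are all (woman not yield, man yield), which pays (√2/2, 1/2), less for both players than the (1, √2) available at the other equilibrium.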
Simulation