Minimax Equilibrium in Zero-Sum Game, mixed strategy

SCIT1003
Chapter 4: Minimax Equilibrium
in Zero Sum Game

Prof. Tsang
Maximin & Minimax Equilibrium in
a zero-sum game
• Minimax - minimizing the maximum loss
(loss-ceiling, defensive)
• Maximin - maximizing the minimum gain
(gain-floor, offensive)
• Minimax = Maximin
The Minimax Theorem
“Every finite, two-person, zero-sum game
has a rational solution in the form
of a pure or mixed strategy.”
John von Neumann, 1926
For every two-person, zero-sum game with finite strategies, there
exists a value V and a mixed strategy for each player, such that (a)
Given player 2's strategy, the best payoff possible for player 1 is V,
and (b) Given player 1's strategy, the best payoff possible for
player 2 is −V.
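One common way to write the theorem compactly is sketched below. The notation is an assumption, not from the slide: A is the m-by-n matrix of payoffs to player 1, p and q are the mixed strategies of players 1 and 2, and \Delta_m, \Delta_n are the sets of probability vectors of length m and n.

```latex
\max_{p \in \Delta_m} \; \min_{q \in \Delta_n} \; p^{\top} A \, q
\;=\;
\min_{q \in \Delta_n} \; \max_{p \in \Delta_m} \; p^{\top} A \, q
\;=\; V
```

The left side is player 1's gain-floor (maximin) and the right side is player 2's loss-ceiling (minimax); the theorem says the two coincide once mixed strategies are allowed.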
Pure strategy game: Saddle point
Is this a Nash Equilibrium?
[Figure: a zero-sum game with a saddle point; MaxiMin = 3, MiniMax = 3]
Pure & mixed strategies
A pure strategy provides a complete definition of how
a player will play a game. It determines the move a
player will make for any situation they could face.
A mixed strategy is an assignment of a probability to
each pure strategy. This allows for a player to randomly
select a pure strategy.
In a pure strategy a player chooses an action for sure,
whereas in a mixed strategy, he chooses a probability
distribution over the set of actions available to him.
All you need to know about
Probability
If E is an outcome of action, then P(E) denotes the
probability that E will occur, with the following
properties:
1. 0 ≤ P(E) ≤ 1 such that:
If E can never occur, then P(E) = 0
If E is certain to occur, then P(E) = 1
2. The probabilities of all the possible outcomes
must sum to 1
Mixed strategy
• In some zero-sum games, there is no pure
strategy solution (no saddle point)
• The player’s best way to win is to mix all
possible moves together in a random
(unpredictable) fashion.
• E.g. Rock-Paper-Scissors
Mixed strategies
Some games, such as Rock-Paper-Scissors, do not
have a pure strategy equilibrium.
In this game, if Player 1 chooses R, Player 2 should choose p, but if Player 2
chooses p, Player 1 should choose S. This continues with Player 2 choosing r
in response to the choice S by Player 1, and so forth.
In games like Rock-Paper-Scissors, a player will want
to randomize over several actions, e.g. he/she can
choose R, P & S with equal probabilities.
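A minimal sketch of why the equal-probability mix works, assuming the usual scoring of +1 for a win, -1 for a loss and 0 for a tie (the scoring and all names below are illustrative assumptions, not from the slide). It computes the expected payoff of the 1/3-1/3-1/3 mix against each pure action of the opponent, and the worst case of a pure strategy for comparison.

```python
from fractions import Fraction

ACTIONS = ["R", "P", "S"]
# What each action beats: Rock beats Scissors, Paper beats Rock, Scissors beats Paper.
BEATS = {"R": "S", "P": "R", "S": "P"}


def payoff(a, b):
    """Payoff to player 1 (assumed scoring: +1 win, -1 loss, 0 tie)."""
    if a == b:
        return 0
    return 1 if BEATS[a] == b else -1


def expected_payoff(mix, opponent_action):
    """Expected payoff of a mixed strategy against one pure action."""
    return sum(prob * payoff(a, opponent_action) for a, prob in mix.items())


# Equal probabilities on R, P and S.
uniform = {a: Fraction(1, 3) for a in ACTIONS}
uniform_payoffs = {b: expected_payoff(uniform, b) for b in ACTIONS}

# A pure strategy, by contrast, can always be exploited.
always_rock = {"R": Fraction(1), "P": Fraction(0), "S": Fraction(0)}
worst_case_rock = min(expected_payoff(always_rock, b) for b in ACTIONS)

print(uniform_payoffs)   # expected payoff against r, p and s
print(worst_case_rock)   # what "always Rock" earns against its best counter
```

Under these assumptions the uniform mix earns the same expected payoff whatever the opponent does, while any pure strategy has a losing counter.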
A soccer penalty shot at 12 yards
left or right?
p.145; payoffs are winning probabilities (%), listed as kicker / goalie

                 Goalie Left    Goalie Right
Kicker Left        58 / 42         95 / 5
Kicker Right       93 / 7          70 / 30
A penalty shot at 12 yards
left or right?
If you are the kicker, which side do you choose?
The best chance you have is 95%. So you kick left.
But the goalie anticipates that because he knows that’s
your best chance. So his anticipation reduces your chance
to 58%.
What if you anticipate that he anticipates … so you kick
right & that increases your chance to 93%.
What if he anticipates that you anticipate that he
anticipates …
If you use a pure strategy, he always has a way to reduce
your chance to win.
A penalty shot at 12 yards
left or right?
• To end this circular reasoning, you do
something that the goalie cannot anticipate.
• What if you mix the 2 choices randomly with
50-50 chance?
• Your chance of winning is
(58+93)/2 = 75.5% if the goalie moves to left
(95+70)/2 = 82.5% if the goalie moves to right
Is this better?
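The arithmetic on this slide can be checked with a short sketch. It uses the kicker's success percentages from the penalty-shot table (58, 95, 93, 70); the function and variable names are illustrative assumptions.

```python
# Kicker's success probability in percent, indexed [kicker side][goalie side].
SUCCESS = {
    "Left": {"Left": 58, "Right": 95},
    "Right": {"Left": 93, "Right": 70},
}


def kicker_success(p_left, goalie_side):
    """Expected success when the kicker goes Left with probability p_left."""
    return (p_left * SUCCESS["Left"][goalie_side]
            + (1 - p_left) * SUCCESS["Right"][goalie_side])


# Expected success of the 50-50 mix against each goalie move.
fifty_fifty = {side: kicker_success(0.5, side) for side in ("Left", "Right")}

# What the kicker can guarantee: the worse of the two cases.
guaranteed_mixed = min(fifty_fifty.values())

# Best guarantee from a pure strategy: max over kicks of the worst case.
guaranteed_pure = max(min(row.values()) for row in SUCCESS.values())

print(fifty_fifty)
print(guaranteed_mixed, guaranteed_pure)
```

Comparing the two guarantees answers the question on the slide: the 50-50 mix secures at least 75.5%, whereas the best pure strategy (always Right) secures only 70%.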
Kicker’s mixture
p.166 graphical solution
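A graphical solution of this kind plots the kicker's success against each goalie move as a straight line in the kicker's Left probability, and picks the highest point of the lower envelope. The sketch below imitates that numerically with a simple grid scan; the grid step and all names are assumptions for illustration, and the percentages are those of the penalty-shot table.

```python
# Kicker's success probability in percent, indexed [kicker side][goalie side].
SUCCESS = {
    "Left": {"Left": 58, "Right": 95},
    "Right": {"Left": 93, "Right": 70},
}


def guaranteed_success(p_left):
    """Lower envelope: the kicker's success if the goalie picks the better reply."""
    vs_left = p_left * SUCCESS["Left"]["Left"] + (1 - p_left) * SUCCESS["Right"]["Left"]
    vs_right = p_left * SUCCESS["Left"]["Right"] + (1 - p_left) * SUCCESS["Right"]["Right"]
    return min(vs_left, vs_right)


# Scan the Left probability from 0 to 1 in steps of 0.001.
grid = [i / 1000 for i in range(1001)]
best_p = max(grid, key=guaranteed_success)
best_value = guaranteed_success(best_p)

print(f"kick Left with probability ~{best_p:.3f}")
print(f"guaranteed success ~{best_value:.1f}%")
```

The scan lands where the two lines cross, at a Left probability of about 0.383 (exactly 23/60) and a guaranteed success of about 79.6%.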
Goalie’s mixture
p.168 graphical solution
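The goalie's mixture can also be found algebraically rather than graphically: the goalie picks the probability of moving Left that makes the kicker indifferent between kicking Left and Right. The sketch below solves that one-line equation exactly; the helper name and argument order are illustrative assumptions, and the numbers are the kicker's success percentages from the penalty-shot table.

```python
from fractions import Fraction


def goalie_left_share(a, b, c, d):
    """Probability q that the goalie moves Left, chosen so the kicker is indifferent.

    a = kicker Left / goalie Left,   b = kicker Left / goalie Right,
    c = kicker Right / goalie Left,  d = kicker Right / goalie Right.
    Solves a*q + b*(1-q) = c*q + d*(1-q) for q.
    """
    return Fraction(d - b, (a - b) - (c - d))


q_left = goalie_left_share(58, 95, 93, 70)

# The kicker's success is then the same whichever side he chooses.
success_if_kick_left = 58 * q_left + 95 * (1 - q_left)
success_if_kick_right = 93 * q_left + 70 * (1 - q_left)

print(q_left)                       # goalie's probability of moving Left
print(float(success_if_kick_left))  # kicker's success rate at equilibrium
```

With these numbers the goalie moves Left 5/12 of the time (about 41.7%), holding the kicker to about 79.6% whichever side he picks.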
If the goalie improves his skill at saving kicks to the Right side
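A sketch of how such an improvement shifts the kicker's mixture. The slide's own figure is not reproduced here, so the improved value used below (the kicker's Right/Right success dropping from 70 to 60) is a hypothetical illustration, and the function names are assumptions. The other three percentages are those of the penalty-shot table.

```python
from fractions import Fraction


def kicker_left_share(right_right):
    """Kicker's Left probability p that makes the goalie indifferent.

    Solves 58p + 93(1-p) = 95p + r(1-p), giving p = (93 - r) / (130 - r),
    where r is the kicker's success when both go Right.
    """
    return Fraction(93 - right_right, 130 - right_right)


def kicker_success(right_right):
    """Kicker's equilibrium success rate for a given Right/Right value."""
    p = kicker_left_share(right_right)
    return 58 * p + 93 * (1 - p)


# 70 is the table value; 60 is a HYPOTHETICAL "improved goalie" value.
for r in (70, 60):
    p = kicker_left_share(r)
    print(r, float(p), float(kicker_success(r)))
```

Under this assumption the kicker shifts toward the Left (from about 38.3% to about 47.1%) and his overall success rate falls (from about 79.6% to 76.5%).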
A Parking meter game (p.164)
If you pay for the parking, it costs you $1.
If you don’t pay for the parking and you are caught
by the enforcer, the penalty is $50.
Should you take the risk of not paying for the
parking?
How often should the enforcer patrol to keep the
car drivers honest (to pay the parking fee)?
Parking meter game
p.164; payoffs in $, listed as car driver / enforcer

                         Driver: Pay    Driver: Not pay
Enforcer: Enforce          -1 / 1          -50 / 50
Enforcer: Not enforce      -1 / 1            0 / 0
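A minimal sketch of the driver's side of the parking meter game, using the $1 fee and $50 penalty stated for this game. It compares the driver's expected payoff from paying and from not paying as a function of the enforcement probability; the names are illustrative assumptions.

```python
from fractions import Fraction

FEE = 1    # cost of paying for parking, in dollars
FINE = 50  # penalty if caught not paying, in dollars


def driver_payoff(pays, q_enforce):
    """Driver's expected payoff when the enforcer patrols with probability q_enforce."""
    if pays:
        return Fraction(-FEE)
    return -FINE * q_enforce


# Enforcement probability at which the driver is indifferent:
# -FEE = -FINE * q  =>  q = FEE / FINE
q_threshold = Fraction(FEE, FINE)

print(q_threshold)                                # indifference point
print(driver_payoff(False, Fraction(1, 100)))     # patrols rarer than the threshold
print(driver_payoff(False, Fraction(1, 10)))      # patrols more frequent than the threshold
```

With these payoffs, patrolling more often than 1 time in 50 makes not paying worse than paying, and patrolling less often makes the gamble worthwhile for the driver.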
Mixed strategies
x = probability to take action R
y = probability to take action S
1-x-y = probability to take action P
No Nash equilibrium for pure strategy
Player 1's expected payoffs against each of player 2's actions
have to be equal if the expected payoff is to be
independent of the action of player 2
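A sketch of the derivation, assuming a payoff of +1 for a win, -1 for a loss and 0 for a tie (the scoring and the symbols E_r, E_p, E_s are assumptions for illustration). Player 1 plays R with probability x, S with probability y and P with probability 1-x-y; E_r, E_p and E_s are player 1's expected payoffs when player 2 plays r, p and s.

```latex
\begin{align*}
E_r &= 0 \cdot x \;-\; y \;+\; (1 - x - y) \;=\; 1 - x - 2y \\
E_p &= -x \;+\; y \;+\; 0 \cdot (1 - x - y) \;=\; y - x \\
E_s &= x \;+\; 0 \cdot y \;-\; (1 - x - y) \;=\; 2x + y - 1 \\[4pt]
E_r = E_p &\;\Rightarrow\; 1 - x - 2y = y - x \;\Rightarrow\; y = \tfrac{1}{3} \\
E_p = E_s &\;\Rightarrow\; y - x = 2x + y - 1 \;\Rightarrow\; x = \tfrac{1}{3}
\end{align*}
```

So x = y = 1 - x - y = 1/3, and each expected payoff equals 0.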
Janken step game (Japanese RSP)
p.171
Two-Person, Zero-Sum Games: Summary
• Represent outcomes as payoffs to row player
• Find any dominating equilibrium
• Evaluate row minima and column maxima
• If maximin = minimax, players adopt pure strategy
corresponding to saddle point; choices are in stable
equilibrium -- secrecy not required
• If maximin ≠ minimax, find optimal mixed strategy;
secrecy essential
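The row-minima and column-maxima check in the summary above can be sketched as follows. The function name is an assumption, and the second matrix is a hypothetical example made up to show a saddle point; the first is the kicker's success table from the penalty-shot game.

```python
def find_saddle(matrix):
    """Return (maximin, minimax, saddle) for payoffs to the row player.

    saddle is a (row, column) index pair if maximin == minimax, else None.
    """
    row_minima = [min(row) for row in matrix]
    col_maxima = [max(col) for col in zip(*matrix)]
    maximin = max(row_minima)   # row player's gain-floor
    minimax = min(col_maxima)   # column player's loss-ceiling
    if maximin != minimax:
        return maximin, minimax, None
    saddle = (row_minima.index(maximin), col_maxima.index(minimax))
    return maximin, minimax, saddle


# Penalty shot: rows = kicker Left/Right, columns = goalie Left/Right.
penalty = [[58, 95],
           [93, 70]]

# HYPOTHETICAL matrix, chosen only to illustrate a saddle point.
example = [[4, 3, 5],
           [1, 2, 0]]

print(find_saddle(penalty))   # no saddle point: a mixed strategy is needed
print(find_saddle(example))   # saddle point: a pure strategy is stable
```

When the third element is None, maximin and minimax differ and the players need a mixed strategy; otherwise the index pair marks the pure-strategy saddle point.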
Summary: Ch. 4
• Look for any equilibrium
• Dominating Equilibrium
• Minimax Equilibrium
• Nash Equilibrium
Assignment 4.1