Artificial Intelligence in Game Design

Board Games and the MinMax Algorithm
Board Games
• Examples:
– Chess, checkers, tic tac toe, backgammon, etc.
• Characteristics:
– Alternating moves:
player → AI → player → AI → …
– Limited (but usually not small) set of possible moves
• Example: legal moves in chess
– Set of winning and losing board configurations
• Examples: checkmate in chess, all pieces gone in checkers
• Goal of AI:
– Choose next move so that it will "likely" lead to victory
Key problem: What exactly does this mean?
Planning Against an Opponent
• Idea:
Create sequence of steps that lead from initial board state to
winning state
[Figure: sequence of tic-tac-toe boards leading from the initial state to a winning state for X]
• Problem:
Unlike normal planning, the AI does not get to choose every step of the path
– Player gets to choose every other step in the plan
– Will choose steps that defeat the AI
[Figure: the same tic-tac-toe plan, with the opponent's moves chosen by the player instead of the AI]
Tree Representation
• Can represent possible moves as tree
– Node = board state
– Branch = possible moves from that state
– Opponent controls every other branching of tree!
• Example: Tic Tac Toe
– Simplified to remove duplicate branches created by reflection and rotation
[Figure: tic-tac-toe game tree. Root node = initial board state; the first level of branches shows the possible X moves; the next level shows the possible O moves from each resulting board.]
Choosing Next Move
• Simple idea:
Choose next move so that a win is guaranteed no matter which moves the
opponent makes
– No real game is this simple!
• Example:
[Figure: game tree. The AI can choose move A or move B. From A, the opponent can choose C (I win) or D (I lose). From B, the opponent can choose F (I win) or E; from E, the AI can then choose G (I win) or H (I lose).]
Choosing Next Move
Reasoning:
• If I choose move A, then the opponent will choose move D
and I lose.
• If I choose move B, I win, since:
– If the opponent chooses move F, I win
– If the opponent chooses move E, I choose move G and win
• Therefore, choose move B
– Guaranteed win regardless of opponent moves
Lookahead Problem
• Main Problem:
No nontrivial game has a tree that can be completely
explored
– m possible moves each turn
– n turns until game ends
– m^n nodes in game tree
• Exponential growth
– Example: Chess
• ~20 moves per turn
• ~50 turns per game
• ~20^50 possible move sequences
– Close to number of particles in universe!
Heuristic Evaluation of Boards
• How many levels can be explored in tree?
• Can process p nodes per second
• Have maximum of s seconds per move
– Can process ps nodes
– m^n nodes when looking ahead n moves
– Can explore n = log_m(ps) moves ahead
• What if game not over at deepest lookahead level?
– Should AI try to force this branch or avoid it?
[Figure: game tree cut off at the lookahead limit; no time to explore moves past this point]
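To make the depth estimate concrete, here is a minimal sketch in Python of the n = log_m(ps) calculation; the numbers used for p, s, and m below are illustrative assumptions, not figures from the slides.

```python
import math

# Rough lookahead-depth estimate: the tree has m**n nodes at depth n,
# and we can process p*s nodes per move, so n = log_m(p*s).
p = 1_000_000      # nodes processed per second (assumed)
s = 2              # seconds allowed per move (assumed)
m = 20             # possible moves per turn, roughly chess (assumed)

n = math.log(p * s, m)
print(f"Can look ahead about {n:.1f} moves")   # ~4.8 with these numbers
```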
Heuristic Evaluation of Boards
• Key idea:
Create heuristic measure of how “good” a board configuration is
• H(board)
– Positive or negative number: Higher = better, lower = worse
– Win = MAXINT, Loss = - MAXINT
• Goal:
Choose current move that guarantees reaching board position with
highest possible heuristic measure
[Figure: heuristic scale from -MAXINT (loss) up to MAXINT (win), with intermediate board values such as -100, 17, and 93]
Heuristic Evaluation of Boards
• TicTacToe Example (with AI playing X):
H(board) = 2 × (# of rows/columns/diagonals where X could win in one move)
+ 1 × (# of rows/columns/diagonals where X could win in two moves)
- 2 × (# of rows/columns/diagonals where O could win in one move)
- 1 × (# of rows/columns/diagonals where O could win in two moves)
• Example:
[Figure: example board with X to move; H = 2 + 1 + 1 - 1 = 3]
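As an illustration, here is one way H(board) could be coded in Python, assuming the board is a 3×3 grid of 'X', 'O', or None. Reading "could win in one move" as a line holding two of a player's marks and none of the opponent's, and "could win in two moves" as a line holding one mark and none of the opponent's, is my interpretation of the slide.

```python
# All eight winning lines of a 3x3 board: rows, columns, diagonals.
LINES = ([[(r, c) for c in range(3)] for r in range(3)] +
         [[(r, c) for r in range(3)] for c in range(3)] +
         [[(i, i) for i in range(3)], [(i, 2 - i) for i in range(3)]])

def heuristic(board):
    """The slide's H(board), scored from X's (the AI's) point of view."""
    score = 0
    for line in LINES:
        cells = [board[r][c] for r, c in line]
        xs, os = cells.count('X'), cells.count('O')
        if os == 0 and xs == 2:      # X could win this line in one move
            score += 2
        elif os == 0 and xs == 1:    # X could win this line in two moves
            score += 1
        if xs == 0 and os == 2:      # O could win this line in one move
            score -= 2
        elif xs == 0 and os == 1:    # O could win this line in two moves
            score -= 1
    return score
```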
MINMAX Algorithm
• Explore possible moves from current board position
– Continue until reach maximum lookahead level
– Forms tree of possible moves and countermoves
• Apply heuristic measure to all boards at the leaves of the tree
• Work from the leaves to the root of the tree
– Measure of each node based on measure of its child nodes
• Choose current move that results in highest heuristic
value
• AI makes move
• Player makes move (may not be the one the AI expected, but it cannot be worse for the AI than the move assumed)
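A minimal sketch of the algorithm in Python, assuming a hypothetical Game object with moves(state), apply(state, move), is_terminal(state), and heuristic(state) scored from the AI's point of view (these names are illustrative, not from the slides):

```python
def minimax(game, state, depth, ai_turn):
    """Minimax value of `state`, looking ahead at most `depth` moves."""
    if depth == 0 or game.is_terminal(state):
        return game.heuristic(state)            # leaf: heuristic measure
    values = [minimax(game, game.apply(state, m), depth - 1, not ai_turn)
              for m in game.moves(state)]
    return max(values) if ai_turn else min(values)   # MAX vs MIN level

def best_move(game, state, depth):
    """AI move whose resulting position has the highest minimax value."""
    return max(game.moves(state),
               key=lambda m: minimax(game, game.apply(state, m),
                                     depth - 1, ai_turn=False))
```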
MINMAX Algorithm
• AI’s turn = MAX level
– Choose move that gives highest heuristic
• AI should choose move that guarantees best result!
– Value of board position at that level = maximum of its children
– Example: TicTacToe with AI playing X
[Figure: from the current board, X's possible moves lead to boards with heuristic values such as 4, 3, and 1. The AI will choose the move with the highest heuristic, so the value of reaching this board = 4 (the best move). The move with value 1 is a bad move, but not relevant.]
MINMAX Algorithm
• Player’s turn = MIN level
– Assume they choose move which is best for them
• Assume good for player = bad for AI
• Assume move that gives lowest heuristic
– Value of board position at that level = minimum of its children
[Figure: TicTacToe with AI playing X. From this board, the player's (O's) possible replies lead to boards with heuristic values such as -2 and -1. The player will choose the move with the lowest heuristic (the worst move for the AI), so the value of reaching this board = -2. The replies with value -1 are bad moves for the player, but we can't assume they will make them.]
Simple MinMax Example
• AI move
• 3 levels of lookahead (AI move, player move, AI move)
• Heuristic measures at the deepest level (no time to explore moves past this point): 8, -2, 3, 5, 4, 12, -5, -1
• Deepest AI-move level, take the MAX: Max(8, -2) = 8, Max(3, 5) = 5, Max(4, 12) = 12, Max(-5, -1) = -1
• Player-move level, take the MIN: Min(8, 5) = 5, Min(-1, 12) = -1
• Top AI-move level, take the MAX: Max(5, -1) = 5
• Best possible outcome = 5, so follow this branch (a worked check appears below)
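As a quick check of the example, the same values can be recomputed on a bare tree of heuristic measures; grouping the leaf pairs under the player-move level as (8, -2), (3, 5) on one branch and (4, 12), (-5, -1) on the other is my reading of the figure.

```python
def minimax_value(node, ai_turn):
    """Minimax over a tree given as nested tuples with numeric leaves."""
    if isinstance(node, (int, float)):           # leaf: heuristic measure
        return node
    values = [minimax_value(child, not ai_turn) for child in node]
    return max(values) if ai_turn else min(values)

tree = (((8, -2), (3, 5)),      # left branch:  Max -> 8, 5;  Min -> 5
        ((4, 12), (-5, -1)))    # right branch: Max -> 12, -1; Min -> -1
print(minimax_value(tree, ai_turn=True))         # 5: best possible outcome
```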
MinMax Example
• AI playing X
• Maximum lookahead of 2 moves
[Figure: initial move choice for the AI. For each possible X move, the player's replies are evaluated with the heuristic (no time to explore moves past this point) and the player is assumed to pick the minimum: Min(1, -1, 0, 1, 0) = -1, Min(1, 2) = 1, and Min(-1, 0, -2, -1, 0) = -2. The AI takes the maximum of these: Max(-1, 1, -2) = 1, so it chooses the move whose worst case is 1, and assumes the player will then make that reply.]
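To tie this example to the earlier sketches, here is a hypothetical TicTacToe class (my own helper, not from the slides) that plugs the H(board) heuristic into the minimax/best_move sketches with a lookahead of 2:

```python
def winner(board):
    """Return 'X' or 'O' if that player has three in a row, else None."""
    for line in LINES:
        cells = {board[r][c] for r, c in line}
        if len(cells) == 1 and None not in cells:
            return cells.pop()
    return None

class TicTacToe:
    def moves(self, board):
        # Whoever has fewer marks moves next; X starts.
        flat = [cell for row in board for cell in row]
        player = 'X' if flat.count('X') <= flat.count('O') else 'O'
        return [(r, c, player) for r in range(3) for c in range(3)
                if board[r][c] is None]

    def apply(self, board, move):
        r, c, player = move
        rows = [list(row) for row in board]
        rows[r][c] = player
        return tuple(tuple(row) for row in rows)

    def is_terminal(self, board):
        return winner(board) is not None or not self.moves(board)

    def heuristic(self, board):
        w = winner(board)
        if w == 'X':
            return 1000           # stand-in for MAXINT (win)
        if w == 'O':
            return -1000          # stand-in for -MAXINT (loss)
        return heuristic(board)   # the module-level H(board) defined earlier

# X to move with a lookahead of 2: each X move is scored by the worst O reply.
board = ((None, 'O', None),
         ('X',  None, None),
         (None, None, None))
print(best_move(TicTacToe(), board, depth=2))    # e.g. (row, col, 'X')
```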
Alpha-Beta Pruning
• Goal: Speed up MinMax algorithm
• Idea:
Faster evaluation → can explore more levels in game tree
→ better decision making
– Can speed up process by factor of 10 – 100
Alpha-Beta Pruning
• Idea: Many branches cannot possibly improve score
– Once this is known, stop exploring them
[Figure: a MAX node has explored two moves with values -1 and 1, so its BestSoFar = max(-1, 1, ?) = 1. The third move leads to a MIN node whose first child has value -1, so there BestSoFar = min(-1, ?, ?, ?, ?):
1) The minimum will never get higher than -1
2) So this node can never be greater than -1, which means it cannot increase BestSoFar
3) Which means that there is no point in exploring any of the remaining branches]
Alpha-Beta Pruning
• Define α as value found at previously explored max branch
– Can do no worse than α at this point
• Define β as a value found below the current branch
– Can do no better than β down this branch
• If α ≥ β, there is no point in exploring this branch any further
[Figure: at the MAX node, BestSoFar = max(α, x) where x ≤ β, so BestSoFar = α if α ≥ β; the MIN node below it computes BestSoFar = min(β, …) ≤ β.]
Alpha-Beta Pruning
Also applies to min branches
• If β ≤ α, no point in exploring this branch any further
• Opponent will not choose this branch, as guaranteed to
be better for AI (and worse for player) than other branch
[Figure: at the MIN node, BestSoFar = min(β, x) where x ≥ α, so BestSoFar = β if β ≤ α; the MAX node below it computes BestSoFar = max(α, …) ≥ α.]
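A minimal alpha-beta sketch in Python, using the same hypothetical Game interface as the minimax sketch above; the two pruning tests mirror the α ≥ β and β ≤ α conditions from these slides.

```python
def alphabeta(game, state, depth, alpha, beta, ai_turn):
    """Minimax value of `state` with alpha-beta pruning."""
    if depth == 0 or game.is_terminal(state):
        return game.heuristic(state)
    if ai_turn:                                    # MAX level
        value = -float('inf')
        for m in game.moves(state):
            value = max(value, alphabeta(game, game.apply(state, m),
                                         depth - 1, alpha, beta, False))
            alpha = max(alpha, value)
            if alpha >= beta:                      # no point exploring further
                break
        return value
    else:                                          # MIN level
        value = float('inf')
        for m in game.moves(state):
            value = min(value, alphabeta(game, game.apply(state, m),
                                         depth - 1, alpha, beta, True))
            beta = min(beta, value)
            if beta <= alpha:                      # opponent won't allow this
                break
        return value
```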
Move Ordering
• Problem: Alpha-Beta pruning only works well if best branches
explored first
– Only know branch is not worth exploring if have already seen a better one
[Figure: three candidate X moves; probably no branches are eliminated if they are explored in this order]
• Goal:
Order branches so moves “most likely” to be good are explored first
– This is very difficult!
Move Ordering
• Idea: Use results of last evaluation to order moves this time
[Figure: tree created in the previous move, showing the AI move chosen, the player move chosen, and the subtree of boards (A–G) with their evaluations below that move. Attempt to "reuse" this part of the tree.]
Move Ordering
• Each board in subtree already has an evaluation
– May no longer be completely accurate
– Should still be somewhat accurate
• Use those evaluations to determine the order in which the next evaluation is done
[Figure: the boards A–G from the reused subtree, with their previous evaluations, are re-explored in order of those evaluations (best first); a sketch of this ordering follows below.]
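One possible sketch of this idea in Python (the cache and its keys are my own assumptions, not from the slides): keep the evaluations computed for the reused subtree in a dictionary keyed by board state (assumed hashable, e.g. tuples), and sort moves by those values before searching them.

```python
def ordered_moves(game, state, cached_values, ai_turn):
    """Sort moves so previously good ones are searched first,
    which lets alpha-beta prune more of the remaining branches."""
    return sorted(game.moves(state),
                  key=lambda m: cached_values.get(game.apply(state, m), 0),
                  reverse=ai_turn)    # MAX level: highest cached value first
```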
Games with Random Component
• Example: Backgammon
– Roll dice (random event)
– Determine which piece or pieces to move that many steps
• One piece 11 steps
• Two pieces 5 and 6 steps
• Which piece?
• Possible actions depend on the random component
Games with Random Component
• Can treat random event like extra level of game tree
[Figure: extra tree level for the dice throw, with one branch for each possible total 2–12; below the branch for a throw of 8, the possible moves based on that dice throw.]
MinMax in Random Games
• Generate tree of possible outcomes up to lookahead limit
– Alternate possible moves and possible outcomes of random event
– Start with possible AI moves after random event
• Point at which AI has to make decision about what to do
[Figure: tree levels from top to bottom: AI dice throw at current board state; possible AI moves from current board for that dice throw; possible player dice throws; possible player moves based on board and their dice throw; possible AI dice throws; possible AI moves from this board for that dice throw.]
MinMax in Random Games
• Use minmax to compute values at decision levels
– AI and player make best possible move based on their dice throw
• Weight each node at random event level based on probability of the
random event
[Figure: below the branch for a dice throw of 8:
1) Value of board for each possible move after dice throw of 8: -5, 8, 7, 18, 11, -1
2) Minmax uses the best of these values (18)
3) Probability of dice throw of 8 = 5/36
4) Weight of this state = 5/36 * 18 = 2.5]
MinMax in Random Games
• Compute value of board before random event based on expected value of boards after random event
[Figure: expected value over all dice throws. Total expected value = 12

Dice throw:        2     3     4     5     6     7     8     9     10    11    12
Probability:      1/36  1/18  1/12  1/9   5/36  1/6   5/36  1/9   1/12  1/18  1/36
Minmax value
from child nodes: -18   -27   -24    0    18    18    18    27    24    36    36
Weighted value:   -0.5  -1.5  -2     0    2.5   3     2.5   3     2     2     1]
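The expected-value step can be checked directly in Python; the dice probabilities are the standard distribution for two six-sided dice, and the minmax child values are copied from the table above.

```python
from fractions import Fraction

# Probability of each dice total 2..12 for two six-sided dice.
prob = {t: Fraction(6 - abs(t - 7), 36) for t in range(2, 13)}

# Minmax value of the best move for each dice total (from the example).
value = {2: -18, 3: -27, 4: -24, 5: 0, 6: 18, 7: 18,
         8: 18, 9: 27, 10: 24, 11: 36, 12: 36}

expected = sum(prob[t] * value[t] for t in range(2, 13))
print(expected)    # 12: value of the board before the random event
```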
MinMax in Random Games
• Random component greatly increases size of tree
• Example:
– 11 possibilities for dice throw
– n moves lookahead
– Increases nodes to explore by factor of 11^n
• Will probably not be able to look ahead more than 2 or 3
moves
• TD-Gammon
– Plays at world championship level
– Also uses learning