Monte-Carlo Tree Search in
Settlers of Catan
István Szita, Guillaume Chaslot, Pieter Spronck
Board Games vs. Video Games
Abstract Board Games | Modern Video Games
Deterministic | Highly non-deterministic
Complete information | Highly incomplete information
2 players | 1-50,000 players
Simple rules | Highly complex rules
AI must be strong | AI must be entertaining
Modern Board Games vs. Others
Abstract Board Games | Modern Board Games | Modern Video Games
Deterministic | Slightly nondeterministic | Highly nondeterministic
Complete information | Some incomplete information | Highly incomplete information
2 players | 2-8 players | 1-50,000 players
Simple rules | Some rule complexity | Highly complex rules
AI must be strong | AI must be strong and entertaining | AI must be entertaining
Settlers of Catan
Klaus Teuber
Spiel des Jahres 1995
Origins Awards 1996
> 11 million copies sold
Rule Changes
Removal of imperfect information
Contents of development cards
Contents of stolen cards
Our agent does not trade with the opponents
But opponents are allowed to trade
JSettlers Architecture
Effect of Starting Position – Random Play
Effect of Starting Position – MCTS 1000
Effect of Starting Position
Random play: advantage for first player
MCTS play: advantage for third and second player
Thus: starting position has an effect
Therefore: for all following experiments, player positions are randomized per game
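The per-game randomization described above can be sketched as follows. This is a minimal illustration, not the authors' experiment harness: `run_series` and the `play_game` callback (which takes the seating order and returns the index of the winning seat) are hypothetical names introduced here.

```python
import random
from collections import Counter

def run_series(agents, play_game, n_games, seed=0):
    """Play n_games, shuffling the seating order before every game.

    `play_game` is a hypothetical callback: it receives the seating order
    (a list of agent names) and returns the index of the winning seat.
    """
    rng = random.Random(seed)
    wins = Counter()
    for _ in range(n_games):
        seating = list(agents)
        rng.shuffle(seating)                 # new random seating per game
        winner_seat = play_game(seating)
        wins[seating[winner_seat]] += 1      # credit the agent, not the seat
    return wins
```

Because wins are credited to the agent rather than to the seat, any advantage tied to a particular starting position is spread across all agents over a long series.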
Neighbour and Kingmaker Effect
There is probably no neighbour effect in Settlers of Catan
and if there is, randomized seating order is effective against it
There might be a kingmaker effect in Settlers of Catan
however, our bots are known to be impartial
still, in future research this issue may play a role
Domain Knowledge in MC Simulation
Basic action weight: +1
End turn: +1
Building a settlement or city: +10000
Building a road: 10/10^R
Where R = #roads / (#settlements + #cities)
Playing a knight (soldier):
If robber is blocking a resource: +100
Otherwise: +1
Playing a development card other than knight: +10
Uniform sampling gives better results!
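The weighting scheme listed above can be sketched as a weighted random choice among the legal actions of a simulation step. This is a minimal sketch under stated assumptions: each listed value is taken as the action's total weight, the road weight is read as 10 / 10^R, and the action labels and function names are hypothetical. The `uniform` flag switches to plain uniform sampling, which the slide reports as giving better results.

```python
import random

def playout_weight(kind, roads=0, settlements=1, cities=0, robber_blocking=False):
    """Sampling weight for one action kind (labels are hypothetical)."""
    if kind in ("settlement", "city"):
        return 10000.0
    if kind == "road":
        # R = #roads / (#settlements + #cities); assumes the road weight
        # on the slide means 10 / 10**R.
        ratio = roads / max(1, settlements + cities)
        return 10.0 / 10.0 ** ratio
    if kind == "knight":
        return 100.0 if robber_blocking else 1.0
    if kind == "dev_card":          # development card other than knight
        return 10.0
    return 1.0                      # basic action, including end turn

def sample_action(kinds, rng, uniform=False, **state):
    """Pick one action kind, either uniformly or by domain-knowledge weight."""
    if uniform:
        return rng.choice(kinds)
    weights = [playout_weight(k, **state) for k in kinds]
    return rng.choices(kinds, weights=weights, k=1)[0]
```

For example, `sample_action(["end_turn", "road", "settlement"], random.Random(0), roads=2)` almost always returns `"settlement"`, because its weight dominates the others.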
Domain Knowledge in Tree Search
Assign “virtual wins” to preferred actions
Building a settlement: 20 virtual wins
Building a city: 10 virtual wins
Virtual wins are not propagated to parents!
Otherwise, a weak action that precedes one with virtual wins might be rated far too high
This improves playing strength considerably
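The virtual-win idea above can be sketched with a small tree node class. This is an illustrative sketch, not the SmartSettlers code: the `Node` class, the action labels, and the choice to count each virtual win as one extra visit are assumptions, and the reward is simplified to a single player's perspective. Virtual wins are written into the new node only, so the parent's statistics are untouched until real simulation results are backpropagated.

```python
import math

# Virtual wins per preferred action, as listed on the slide.
VIRTUAL_WINS = {"settlement": 20, "city": 10}

class Node:
    def __init__(self, action_kind=None, parent=None):
        self.parent = parent
        self.children = []
        bonus = VIRTUAL_WINS.get(action_kind, 0)
        # Assumption: each virtual win counts as one visit that ended in a win.
        # The bonus stays in this node and is never added to the parent.
        self.visits = bonus
        self.wins = float(bonus)

    def add_child(self, action_kind):
        child = Node(action_kind, parent=self)
        self.children.append(child)
        return child

    def backpropagate(self, reward):
        """Propagate a real simulation result from this node up to the root."""
        node = self
        while node is not None:
            node.visits += 1
            node.wins += reward
            node = node.parent

    def uct_score(self, c=1.4):
        """Standard UCT value; unvisited nodes are tried first."""
        if self.visits == 0:
            return math.inf
        parent_visits = max(1, self.parent.visits) if self.parent else 1
        explore = math.sqrt(math.log(parent_visits) / self.visits)
        return self.wins / self.visits + c * explore
```

A freshly created settlement child starts with 20 wins out of 20 visits, so it looks attractive during selection, while the action that led to its parent gains nothing from that bonus.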
Tests
Against three JSettlers AIs
Random AI
MCTS with 1000 simulations per move
MCTS with 10000 simulations per move
Against a strong human player
~30 games
Test 1
JSettlers (top) vs. Random (bottom)
Test 2
JSettlers (top) vs. MCTS 1000 (bottom)
Test 3
JSettlers (top) vs. MCTS 10000 (bottom)
Test Against Humans
SmartSettlers MCTS 10000 plays justifiable moves
Play style is often “human-like”
More challenging than JSettlers
Still, humans tend to play stronger
Human Supremacy
“Wood & clay” strategy
Focus on building settlements and roads
“Ore & wheat” strategy
Focus on building cities and buying developments
SmartSettlers prefers the “ore & wheat” strategy and tends to build fewer settlements than a human would
“Wood & clay” requires that the player postpones
building roads in favour of building settlements
Search depth needed to discover this is pretty high
Possible solutions: more simulations, improved move selection heuristics
Conclusion
MCTS is suitable for implementing AI for nondeterministic, multiplayer games such as Settlers of Catan
Future work
Implementing a version that complies with all rules
Implementing trading
Improving playing strength through domain knowledge, possibly automatically extracted
Contact details
Pieter Spronck
[email protected]
http://www.spronck.net
István Szita
[email protected]