Monte-Carlo Tree Search in Settlers of Catan
István Szita, Guillaume Chaslot, Pieter Spronck
Board Games vs. Video Games
Abstract Board Games  | Modern Video Games
Deterministic         | Highly nondeterministic
Complete information  | Highly incomplete information
2 players             | 1-50,000 players
Simple rules          | Highly complex rules
AI must be strong     | AI must be entertaining
Modern Board Games vs. Others
Abstract Board Games  | Modern Board Games                  | Modern Video Games
Deterministic         | Slightly nondeterministic           | Highly nondeterministic
Complete information  | Some incomplete information         | Highly incomplete information
2 players             | 2-8 players                         | 1-50,000 players
Simple rules          | Some rule complexity                | Highly complex rules
AI must be strong     | AI must be strong and entertaining  | AI must be entertaining
Settlers of Catan
- Klaus Teuber
- Spiel des Jahres 1995
- Origins Awards 1996
- > 11 million copies sold
Rule Changes
- Removal of imperfect information
  - Contents of development cards
  - Contents of stolen cards
- No trading between our agent and the opponents
  - But opponents are allowed to trade
JSettlers Architecture
Effect of Starting Position – Random Play
Effect of Starting Position – MCTS 1000
Effect of Starting Position
- Random play: advantage for first player
- MCTS play: advantage for third and second player
- Thus: starting position has an effect
- Therefore: for all following experiments, player positions are randomized per game
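A minimal sketch of per-game seat randomization, so that wins are credited to agents rather than to seats. The function names, the agent labels, and the `play_game` callback (assumed to return the index of the winning seat) are hypothetical and not taken from the presentation or from JSettlers.

```python
import random


def seat_players(agents, rng):
    """Return a fresh random seating order for one game."""
    seating = list(agents)
    rng.shuffle(seating)
    return seating


def run_experiment(agents, play_game, n_games, seed=0):
    """Play n_games with randomized seating and count wins per agent.

    play_game(seating, rng) is a hypothetical callback that plays one game
    and returns the index (0-based seat) of the winner.
    """
    rng = random.Random(seed)
    wins = {name: 0 for name in agents}
    for _ in range(n_games):
        seating = seat_players(agents, rng)
        winner_seat = play_game(seating, rng)
        wins[seating[winner_seat]] += 1
    return wins
```

With a stand-in game in which the first seat always wins, the win counts spread roughly evenly over the agents, which shows that the seat advantage no longer favours any particular agent.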
Neighbour and Kingmaker Effect
- There is probably no neighbour effect in Settlers of Catan
  - and if there is, randomized seating order is effective against it
- There might be a kingmaker effect in Settlers of Catan
  - however, our bots are known to be impartial
  - still, in future research this issue may play a role
Domain Knowledge in MC Simulation
- Basic action weight: +1
- End turn: +1
- Building a settlement or city: +10000
- Building a road: 10/10^R
  - Where R = #roads / (#settlements + #cities)
- Playing a knight (soldier):
  - If robber is blocking a resource: +100
  - Otherwise: +1
- Playing a development card other than knight: +10
- Uniform sampling gives better results!
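The weighting scheme above can be sketched as weighted random sampling over the legal actions of a simulation step. This is an illustration under stated assumptions, not the authors' implementation: the road weight is read as 10/10^R (the exponent appears to have been flattened in the slide text), each listed value is treated as the action's full weight with 1 as the default, and the `Action` type, its `kind` strings, and the `unblocks_resource` flag are hypothetical.

```python
import random
from dataclasses import dataclass


@dataclass(frozen=True)
class Action:
    kind: str                        # hypothetical labels, e.g. "build_road"
    unblocks_resource: bool = False  # only meaningful for "play_knight"


def action_weight(action, roads, settlements, cities):
    """Heuristic sampling weight of one action (assumed reading of the slide)."""
    if action.kind in ("build_settlement", "build_city"):
        return 10000.0
    if action.kind == "build_road":
        # R = #roads / (#settlements + #cities); guard against division by zero.
        ratio = roads / max(1, settlements + cities)
        return 10.0 / (10.0 ** ratio)
    if action.kind == "play_knight":
        return 100.0 if action.unblocks_resource else 1.0
    if action.kind == "play_development_card":
        return 10.0
    return 1.0  # basic weight, also used for ending the turn


def sample_action(actions, roads, settlements, cities, rng, uniform=False):
    """Pick one action, either uniformly or proportionally to its weight."""
    if uniform:
        return rng.choice(actions)
    weights = [action_weight(a, roads, settlements, cities) for a in actions]
    return rng.choices(actions, weights=weights, k=1)[0]
```

The `uniform` switch mirrors the comparison reported on the slide, where plain uniform sampling gave better results than the weighted variant.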
Domain Knowledge in Tree Search
- Assign “virtual wins” to preferred actions
  - Building a settlement: 20 virtual wins
  - Building a city: 10 virtual wins
- Virtual wins are not propagated to parents!
  - Otherwise, a weak action before one that gives virtual wins might be rated far too high
- This improves playing strength considerably
Tests
- Against three JSettlers AIs
  - Random AI
  - MCTS with 1000 simulations per move
  - MCTS with 10000 simulations per move
- Against a strong human player
  - ~30 games
Test 1
- JSettlers (top) vs. Random (bottom)
Test 2
- JSettlers (top) vs. MCTS 1000 (bottom)
Test 3
- JSettlers (top) vs. MCTS 10000 (bottom)
Test Against Humans
- SmartSettlers MCTS 10000 plays justifiable moves
- Play style is often “human-like”
- More challenging than JSettlers
- Still, humans tend to play stronger
Human Supremacy
- “Wood & clay” strategy
  - Focus on building settlements and roads
- “Ore & wheat” strategy
  - Focus on building cities and buying developments
- SmartSettlers prefers the “ore & wheat” strategy, and tends to build fewer settlements than a human would
- “Wood & clay” requires that the player postpones building roads in favour of building settlements
  - Search depth needed to discover this is pretty high
- Possible solutions: more simulations, improved move selection heuristics
Conclusion
- MCTS is suitable for implementing AI for nondeterministic, multiplayer games such as Settlers of Catan
- Future work
  - Implementing a version that complies with all rules
  - Implementing trading
  - Improving playing strength through domain knowledge, possibly automatically extracted
Contact details
Pieter Spronck
http://www.spronck.net
István Szita