Constraints And Adversarial Search


Constraint Satisfaction Problems (CSP)
A CSP, or Constraint Satisfaction Problem, is a problem whose goal is to find a combination of values for a set of variables that satisfies given rules (constraints).
A state is defined by variables Xi that take values from domains Di.
The goal test is a set of constraints that specifies the allowed combinations of values for the variables.
Scope of CSPs in this course: discrete (deterministic solutions), absolute (a solution is certain to exist within the domain), unary or binary (each constraint involves one or two variables).
CSP Example: Map-Coloring
Variables: WA, NT, Q, NSW, V, SA, T
Domains: Di = {red, green, blue}
Constraints: adjacent regions must have different colors
e.g.: WA ≠ NT, WA ≠ SA, NT ≠ SA, ... (if the language allows this), or
(WA,NT) ∈ { (red, green), (red, blue), (green, red),...}
Solutions are complete and consistent assignments, e.g.
{WA=red, NT=green, Q=red, NSW=green,
V=red, SA=blue, T=green}
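As a rough illustration of what "complete and consistent" means, the sketch below checks the assignment given above. The adjacency table is an assumption taken from the standard map of Australia (the map figure itself is not in this transcript), and the helper names are hypothetical.

```python
# Assumed adjacency of the Australian regions; T (Tasmania) has no neighbors.
NEIGHBORS = {
    "WA": ["NT", "SA"],
    "NT": ["WA", "SA", "Q"],
    "SA": ["WA", "NT", "Q", "NSW", "V"],
    "Q": ["NT", "SA", "NSW"],
    "NSW": ["Q", "SA", "V"],
    "V": ["SA", "NSW"],
    "T": [],
}
COLORS = {"red", "green", "blue"}


def is_complete(assignment):
    """Complete: every variable has a value."""
    return set(assignment) == set(NEIGHBORS)


def is_consistent(assignment):
    """Consistent: values come from the domain and no two adjacent regions match."""
    for region, color in assignment.items():
        if color not in COLORS:
            return False
        for other in NEIGHBORS[region]:
            if assignment.get(other) == color:
                return False
    return True


solution = {"WA": "red", "NT": "green", "Q": "red", "NSW": "green",
            "V": "red", "SA": "blue", "T": "green"}
print(is_complete(solution), is_consistent(solution))
```

A partial assignment can be consistent without being complete, which is the property backtracking search relies on.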
Varieties of Constraints
Unary : Constraints involve a single variable
e.g.: SA ≠ green
Binary : Constraints involve pairs of variables
e.g.: SA ≠ WA
Higher-Order : Constraints involve 3 or more variables
e.g.: cryptarithmetic column constraints
Preferences (soft constraints):
e.g.: blue is better than green.
(Often representable by a cost for each variable assignment)
→ Constrained Optimization Problems
Standard Search Formulation for CSP
States are defined by the values assigned so far
Initial State: the empty assignment { }
Successor Function: assign a value to an unassigned variable; the value must not violate any constraint
Goal Test: the assignment is complete
Notes:
1. This formulation applies to every CSP
2. Because the number of variables is finite, every solution appears at depth n for n variables → use depth-first search
3. Because the path is irrelevant, we can also use local search algorithms
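The formulation above can be sketched as a depth-first backtracking search. This is a minimal sketch for the map-coloring example only; the adjacency table is assumed from the standard map of Australia, and the function name is hypothetical.

```python
# Assumed adjacency of the Australian regions.
NEIGHBORS = {
    "WA": ["NT", "SA"], "NT": ["WA", "SA", "Q"],
    "SA": ["WA", "NT", "Q", "NSW", "V"], "Q": ["NT", "SA", "NSW"],
    "NSW": ["Q", "SA", "V"], "V": ["SA", "NSW"], "T": [],
}
VARIABLES = list(NEIGHBORS)
DOMAINS = {v: ["red", "green", "blue"] for v in VARIABLES}


def backtracking_search(assignment=None):
    """Depth-first search that assigns one variable per level."""
    if assignment is None:
        assignment = {}                      # initial state: { }
    if len(assignment) == len(VARIABLES):    # goal test: assignment complete
        return dict(assignment)
    var = next(v for v in VARIABLES if v not in assignment)
    for value in DOMAINS[var]:
        # Successor function: only values that violate no constraint.
        if all(assignment.get(n) != value for n in NEIGHBORS[var]):
            assignment[var] = value
            result = backtracking_search(assignment)
            if result is not None:
                return result
            del assignment[var]              # undo and try the next value
    return None                              # no value works: backtrack


print(backtracking_search())
```

The variable and value orders here are simply the listing order; the heuristics discussed later replace those two choices.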
Backtracking Example: Map Coloring
Improving backtracking
1. Which variable should be assigned first?
2. In what order should the values be tried?
3. Can we detect failure early?
4. Can we use the structure of the problem to help us?
Forward Checking
The idea:
• Keep track of the remaining valid values for the unassigned variables
• If any variable has no valid value left, the search is stopped
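A minimal sketch of this idea for map coloring follows. The adjacency table is assumed from the standard map of Australia, and `forward_check` is a hypothetical helper name: it returns the pruned domains, or `None` when some unassigned neighbor is left without a valid value.

```python
# Assumed adjacency of the Australian regions.
NEIGHBORS = {
    "WA": ["NT", "SA"], "NT": ["WA", "SA", "Q"],
    "SA": ["WA", "NT", "Q", "NSW", "V"], "Q": ["NT", "SA", "NSW"],
    "NSW": ["Q", "SA", "V"], "V": ["SA", "NSW"], "T": [],
}


def forward_check(var, value, domains, assignment):
    """Assign var=value on a copy of the domains and prune its neighbors."""
    pruned = {v: list(values) for v, values in domains.items()}
    pruned[var] = [value]
    for n in NEIGHBORS[var]:
        if n not in assignment:
            pruned[n] = [x for x in pruned[n] if x != value]
            if not pruned[n]:
                return None          # a variable has no valid value: stop
    return pruned


start = {v: ["red", "green", "blue"] for v in NEIGHBORS}
d1 = forward_check("WA", "red", start, {})
d2 = forward_check("Q", "green", d1, {"WA": "red"})
print(d2["NT"], d2["SA"])            # both reduced to a single color
print(forward_check("V", "blue", d2, {"WA": "red", "Q": "green"}))
```

In the last call, assigning V=blue removes the only remaining value of SA, so the branch is abandoned immediately instead of being explored further.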
Constraint Propagation
Forward checking passes on information from the assigned variables, but it cannot detect every failure early.
NT and SA cannot both be blue!
Constraint propagation repeatedly enforces the constraints locally (on the partial solution).
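One common way to propagate constraints is arc consistency in the style of AC-3. The slides do not name a specific algorithm, so the sketch below is an assumption about how the idea could be realized; the adjacency table is likewise assumed from the standard map of Australia.

```python
from collections import deque

# Assumed adjacency of the Australian regions.
NEIGHBORS = {
    "WA": ["NT", "SA"], "NT": ["WA", "SA", "Q"],
    "SA": ["WA", "NT", "Q", "NSW", "V"], "Q": ["NT", "SA", "NSW"],
    "NSW": ["Q", "SA", "V"], "V": ["SA", "NSW"], "T": [],
}


def revise(domains, x, y):
    """Drop values of x that have no differently colored partner in y."""
    removed = False
    for vx in list(domains[x]):
        if not any(vx != vy for vy in domains[y]):
            domains[x].remove(vx)
            removed = True
    return removed


def ac3(domains):
    """Return False as soon as some domain becomes empty, True otherwise."""
    queue = deque((x, y) for x in NEIGHBORS for y in NEIGHBORS[x])
    while queue:
        x, y = queue.popleft()
        if revise(domains, x, y):
            if not domains[x]:
                return False
            # x changed, so arcs pointing at x must be checked again.
            queue.extend((z, x) for z in NEIGHBORS[x] if z != y)
    return True


# Domains after WA=red and Q=green with forward checking applied.
domains = {"WA": ["red"], "NT": ["blue"], "SA": ["blue"], "Q": ["green"],
           "NSW": ["red", "blue"], "V": ["red", "green", "blue"],
           "T": ["red", "green", "blue"]}
print(ac3(domains))   # the NT/SA conflict is detected here
```

Forward checking alone accepts this state because every domain is still non-empty; checking the arc between NT and SA reveals that the two remaining values clash.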
[Figure: a map divided into seven regions labeled R1 to R7]
Fill the regions (R1..R7) above with the colors red, yellow, green, blue.
Adjacent regions must not have the same color.
1. What variables do you use?
2. What domains are available?
3. How do you evaluate the constraints?
1. Variables to be assigned: R1, .. R7
2. Available domain: colors (red, yellow, green, blue)
3. Constraints:
1. R1 <> R2, …, R7,
2. R2 <> R3,
3. R3 <> R4,
4. R4 <> R5,
5. R5 <> R6,
6. R6 <> R7
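The answer above can be checked mechanically. The sketch below enumerates every coloring and keeps those that satisfy the listed constraints; it assumes that "R1 <> R2, …, R7" means R1 must differ from each of R2 to R7, and the names are hypothetical.

```python
from itertools import product

REGIONS = ["R1", "R2", "R3", "R4", "R5", "R6", "R7"]
COLORS = ["red", "yellow", "green", "blue"]

# Pairs of regions that must differ (assumed reading of the constraint list).
DIFFERENT = [("R1", r) for r in REGIONS[1:]] + [
    ("R2", "R3"), ("R3", "R4"), ("R4", "R5"), ("R5", "R6"), ("R6", "R7"),
]


def satisfies(assignment):
    """True when every listed pair of regions has different colors."""
    return all(assignment[a] != assignment[b] for a, b in DIFFERENT)


def all_solutions():
    """Brute force over all 4^7 colorings; fine for a problem this small."""
    for colors in product(COLORS, repeat=len(REGIONS)):
        assignment = dict(zip(REGIONS, colors))
        if satisfies(assignment):
            yield assignment


first = next(all_solutions())
print(first)
```

Brute force is used here only because the problem is tiny; backtracking avoids generating the colorings that already violate a constraint.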
Backtracking
Most Constrained Variable Heuristic
Choose the variable with the fewest remaining legal values, i.e. the most constrained one. (Look for the variable that is hardest to fill.)
Also known as the Minimum Remaining Values (MRV) heuristic
Least Constraining Value
Given a variable, choose the value that imposes the fewest constraints on the remaining variables (the value that leaves them the most legal options, so they stay easy to fill)
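A possible way to order values by this heuristic is to count, for each candidate value, how many neighboring domains would lose that value. The sketch assumes the map-coloring adjacency of Australia and a hypothetical helper name.

```python
# Assumed adjacency of the Australian regions.
NEIGHBORS = {
    "WA": ["NT", "SA"], "NT": ["WA", "SA", "Q"],
    "SA": ["WA", "NT", "Q", "NSW", "V"], "Q": ["NT", "SA", "NSW"],
    "NSW": ["Q", "SA", "V"], "V": ["SA", "NSW"], "T": [],
}


def order_values_lcv(var, domains, assignment):
    """Sort the values of var so the least constraining one comes first."""
    def ruled_out(value):
        # Number of unassigned neighbors that would lose this value.
        return sum(1 for n in NEIGHBORS[var]
                   if n not in assignment and value in domains[n])
    return sorted(domains[var], key=ruled_out)


# WA=red and NT=green are assigned; Q may still be blue or red.
assignment = {"WA": "red", "NT": "green"}
domains = {"WA": ["red"], "NT": ["green"], "SA": ["blue"],
           "Q": ["blue", "red"], "NSW": ["red", "green", "blue"],
           "V": ["red", "green", "blue"], "T": ["red", "green", "blue"]}
print(order_values_lcv("Q", domains, assignment))
```

Here blue would take the last value away from SA, so red is tried first.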
Degree Heuristic
Choose the variable involved in the largest number of constraints with the other unassigned variables (favor the variables that are hardest to fill)
Used as a tie-breaker among MRV variables
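MRV and the degree heuristic can be combined in a single selection function, with the degree used only to break ties. This is a sketch under the assumed Australia adjacency; `select_variable` is a hypothetical name.

```python
# Assumed adjacency of the Australian regions.
NEIGHBORS = {
    "WA": ["NT", "SA"], "NT": ["WA", "SA", "Q"],
    "SA": ["WA", "NT", "Q", "NSW", "V"], "Q": ["NT", "SA", "NSW"],
    "NSW": ["Q", "SA", "V"], "V": ["SA", "NSW"], "T": [],
}


def select_variable(domains, assignment):
    """Pick by MRV first, then by highest degree among unassigned neighbors."""
    def key(var):
        remaining = len(domains[var])
        degree = sum(1 for n in NEIGHBORS[var] if n not in assignment)
        return (remaining, -degree)   # fewer values first, then more constraints
    unassigned = [v for v in NEIGHBORS if v not in assignment]
    return min(unassigned, key=key)


full = {v: ["red", "green", "blue"] for v in NEIGHBORS}
print(select_variable(full, {}))      # all tie on MRV, degree decides

after_wa = dict(full, WA=["red"], NT=["green", "blue"], SA=["green", "blue"])
print(select_variable(after_wa, {"WA": "red"}))
```

In both calls SA is selected: first because it has the most neighbors, then because it ties with NT on remaining values but constrains more unassigned regions.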

Why game playing?
› It's fun
› Game playing is non-trivial
– Players need 'human-like' intelligence
– Games can vary in complexity
– Decisions must be made in limited time
› Games are:
– Well defined and repeatable
– Limited and accessible
› Games allow a direct comparison of human and computer performance
Checkers:
– 1994: Chinook (U. of A.) beat world champion Marion Tinsley, ending a 40-year reign.
Othello:
– 1997: Logistello (NEC Research) beat the human world champion.
– Today: world champions refuse to play AI programs (because they are too good).
Chess:
– 1997: Deep Blue (IBM) beat world champion Garry Kasparov.
– 2005: a team of computers (Hydra, Deep Junior and Fritz) won 8.5-3.5 against a rather strong human team formed by Veselin Topalov, Ruslan Ponomariov and Sergey Karjakin, who had an average Elo rating of 2681.
– 2006: the undisputed world champion, Vladimir Kramnik, was defeated 4-2 by Deep Fritz.
Backgammon:
– TD-Gammon (IBM) is world champion amongst humans and computers.
Go:
– Human champions refuse to play the top AI player (because it's too weak).
Bridge:
– Still out of reach for AI players. Why?
Others:
– ???
Perfect vs. imperfect information:
– Perfect: the exact state of the game is visible
› e.g. chess, backgammon, checkers, go, othello
– Imperfect: information is hidden
› e.g. scrabble, bridge, most card games
Deterministic vs. stochastic:
– Deterministic: the change in state is fully determined by the player's move
› e.g. chess, othello
– Stochastic: the change in state is partially determined by chance
› e.g. backgammon, monopoly
                                  Deterministic                   Stochastic
admissible, perfect info          Checkers, Chess, Go, Othello    Backgammon, Monopoly
not admissible, imperfect info    ???                             Bridge, Poker, Scrabble
We can model game playing as a search in a state space, as we did with other problems before.
In order to model a game as a search problem we need to decide:
› the states,
› the operators,
› the initial state,
› the goal, and
› the utility function

Consider a two-player board game:
– e.g., chess, checkers, tic-tac-toe
– board configuration: unique arrangement of "pieces"
Representing board games as a search problem:
– states: board configurations
– operators: legal moves
– initial state: current board configuration
– terminal state: winning/terminal board configuration
– utility function: values for terminal states (win: +1, loss: -1, draw: 0)
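To make the five components concrete, here is a minimal sketch for tic-tac-toe. The board encoding (a tuple of nine cells) and all function names are assumptions chosen for illustration, and utility is computed from X's point of view.

```python
# The eight winning lines of a 3x3 board, cells indexed 0..8 row by row.
LINES = [(0, 1, 2), (3, 4, 5), (6, 7, 8), (0, 3, 6),
         (1, 4, 7), (2, 5, 8), (0, 4, 8), (2, 4, 6)]


def initial_state():
    """State: a board configuration; here the empty board."""
    return (" ",) * 9


def player_to_move(board):
    """X moves first, so X is to move when both sides have equal counts."""
    return "X" if board.count("X") == board.count("O") else "O"


def legal_moves(board):
    """Operators: the empty cells the current player may fill."""
    return [i for i, cell in enumerate(board) if cell == " "]


def apply_move(board, i):
    cells = list(board)
    cells[i] = player_to_move(board)
    return tuple(cells)


def winner(board):
    for a, b, c in LINES:
        if board[a] != " " and board[a] == board[b] == board[c]:
            return board[a]
    return None


def is_terminal(board):
    """Terminal state: someone has won or the board is full."""
    return winner(board) is not None or " " not in board


def utility(board):
    """Values for terminal states: win +1, loss -1, draw 0 (for X)."""
    return {"X": 1, "O": -1, None: 0}[winner(board)]


board = apply_move(initial_state(), 4)   # X takes the center
print(board, legal_moves(board))
```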

We want to find a strategy (i.e. a way of picking moves) that wins the game.
Assume the opponent's moves can be predicted given the computer's moves.
How complex would search be in this case?
– Worst case: $O(b^m)$, with branching factor $b$ and max depth $m$
– Tic-Tac-Toe: ~5 legal moves, max of 9 moves: $5^9 = 1{,}953{,}125$ states
– Chess: ~35 legal moves, ~100 moves per game: $b^m \sim 35^{100} \sim 10^{154}$ states, "only" $\sim 10^{40}$ legal states
Common games produce enormous search trees
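The two estimates above can be reproduced with a few lines of arithmetic. This is only a check of the numbers quoted in the slide, using the branching factors and depths given there.

```python
import math

# Tic-tac-toe bound: about 5 legal moves over at most 9 plies.
tic_tac_toe_bound = 5 ** 9

# Chess bound: about 35 legal moves over about 100 plies.
# Working with the base-10 exponent avoids printing a 155-digit number.
chess_exponent = 100 * math.log10(35)

print(tic_tac_toe_bound)
print(f"35^100 is roughly 10^{chess_exponent:.0f}")
```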
The problem is that the opponent will not do exactly as we planned. In fact, the opponent will try to make the move that is best for it, which is the worst outcome for the player.
How do we deal with this?
Search tree?
How do we implement the search tree?

• Expand the complete search tree in DFS manner, until terminal states have been reached and their utilities computed.
• The computer favors high utility values and the opponent favors low utility values. The computer will choose the moves that maximize the utility value.
• Go back up from the leaves towards the current state of the game.
– At each min node: back up the worst value among the children (opponent's move).
– At each max node: back up the best value among the children (computer's move).
1. Generate the complete game tree.
2. Apply the utility function to all the terminal
states.
3. Use the utility of the terminal states to
calculate a utility value for their parents
(either max or min) depending on depth.
4. Continue up to root node.
5. Choose move with highest value from root
node.
The utility function is only applied to terminal nodes.
If max makes move A1 then Min should make move A11.
Thus the result of making move A1 is a value of 3 for the utility
function.
Similarly A2 leads to a value of 2 and A3 a value of 2.
Max wants to maximise the utility function and so A1 is the best
move to make.
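The five steps and the A1/A2/A3 example can be sketched on an explicit tree. The figure with the leaf values is not part of this transcript, so the leaves below are assumed values chosen to reproduce the backed-up results 3, 2 and 2; apart from A11, the move names under each branch are also hypothetical.

```python
# Two-ply game tree: Max picks A1..A3, then Min picks a reply.
# Leaf utilities are assumptions consistent with the values quoted above.
TREE = {
    "A1": {"A11": 3, "A12": 12, "A13": 8},
    "A2": {"A21": 2, "A22": 4, "A23": 6},
    "A3": {"A31": 14, "A32": 5, "A33": 2},
}


def minimax(node, maximizing):
    """Back up utilities from the leaves: max at Max nodes, min at Min nodes."""
    if isinstance(node, (int, float)):      # terminal node: apply utility
        return node
    values = [minimax(child, not maximizing) for child in node.values()]
    return max(values) if maximizing else min(values)


def best_move(tree):
    """Choose the root move with the highest backed-up value."""
    return max(tree, key=lambda move: minimax(tree[move], maximizing=False))


for move, subtree in TREE.items():
    print(move, minimax(subtree, maximizing=False))
print("Max plays", best_move(TREE))
```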
Complete?
Only if the tree is finite.
NB: a finite strategy can exist even in an infinite tree.
Optimality?
Yes, against an optimal opponent. (Otherwise we don't know.)
Time complexity?
$O(b^m)$
Space complexity?
$O(bm)$ (depth-first exploration)
Why not use Minimax to solve chess?
For chess, $b \approx 35$, $m \approx 100$ for "reasonable" games
→ exact solution completely infeasible
But do we need to explore every path?
• Suppose we have 100 seconds to make a move, and we can search $10^4$ nodes per second.
• So we can only search $10^6$ nodes per move. (Or even fewer, if we spend time deciding which nodes to search.)
• Standard approach:
1. Use a cutoff test instead of the terminal test (e.g. based on a depth limit)
2. Use an evaluation function instead of the utility function for the nodes where we cut off the search.
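The standard approach can be sketched as minimax with a depth limit. The toy game below is an assumption made only for illustration (a pile of stones, each player removes 1 to 3, whoever takes the last stone wins), and the evaluation function is a hypothetical hand-written heuristic.

```python
# A state is (stones_left, max_to_move).

def children(state):
    pile, max_to_move = state
    return [(pile - take, not max_to_move) for take in (1, 2, 3) if take <= pile]


def is_terminal(state):
    return state[0] == 0


def utility(state):
    # The player who just moved took the last stone and wins.
    _, max_to_move = state
    return -1 if max_to_move else 1


def evaluate(state):
    # Hypothetical heuristic: the side to move is in trouble when the pile
    # is a multiple of 4. Estimates stay strictly between -1 and +1.
    pile, max_to_move = state
    mover_in_trouble = pile % 4 == 0
    return -0.5 if mover_in_trouble == max_to_move else 0.5


def cutoff_minimax(state, depth):
    if is_terminal(state):
        return utility(state)          # true utility at terminal states
    if depth == 0:
        return evaluate(state)         # cutoff test: fall back on the estimate
    values = [cutoff_minimax(child, depth - 1) for child in children(state)]
    return max(values) if state[1] else min(values)


print(cutoff_minimax((10, True), 2))    # shallow search, heuristic estimate
print(cutoff_minimax((10, True), 10))   # deep enough to reach terminal states
```

With a depth limit of 2 the search returns an estimate; with a limit at least as large as the pile it returns the exact utility, at the cost of visiting many more nodes.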
• An evaluation function v(s) represents the "goodness" of a board state (e.g. the chance of winning from that position).
• If the features of the board can be evaluated independently, use a weighted linear function:
$v(s) = w_1 f_1(s) + w_2 f_2(s) + \dots + w_n f_n(s) = \sum_{i=1}^{n} w_i f_i(s)$
(where s is the board state)
• More important features get more weight.
• This function can be given by an expert or learned from experience.
The evaluation function
w1 = 9; f1(s) = (number of white queens) – (number of black queens)
w2 = 3; f2(s) = (number of white knights) – (number of black knights)
w3 = 1; f3(s) = (number of white pawns) - (number of black pawns)
The quality of play depends directly on the quality of the evaluation
function
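The three weighted features above translate directly into a small function. The piece counts in the usage example are made up for illustration, and representing a position by two dictionaries of counts is an assumption.

```python
# Weights from the example above: queens 9, knights 3, pawns 1.
WEIGHTS = {"queen": 9, "knight": 3, "pawn": 1}


def evaluate(white_counts, black_counts):
    """Weighted sum of material differences, from White's point of view."""
    return sum(
        weight * (white_counts.get(piece, 0) - black_counts.get(piece, 0))
        for piece, weight in WEIGHTS.items()
    )


# Hypothetical position: White is a knight up but two pawns down.
white = {"queen": 1, "knight": 2, "pawn": 6}
black = {"queen": 1, "knight": 1, "pawn": 8}
print(evaluate(white, black))   # 9*0 + 3*1 + 1*(-2)
```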
The evaluation function: precision?
• The evaluation function is only approximate, and is usually better when we are close to the end of the game.
• The move chosen is the same if we apply a monotonic transformation to the evaluation function.
• Only the order of the numbers matters: payoffs in deterministic games act as an ordinal utility function.
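The claim about monotonic transformations can be checked on a small tree. The leaf values below are assumptions for illustration, and the transform is applied to every leaf before the values are backed up.

```python
# Two-ply tree with assumed leaf values.
TREE = {
    "A1": {"A11": 3, "A12": 12, "A13": 8},
    "A2": {"A21": 2, "A22": 4, "A23": 6},
    "A3": {"A31": 14, "A32": 5, "A33": 2},
}


def minimax(node, maximizing, transform):
    if isinstance(node, (int, float)):
        return transform(node)
    values = [minimax(c, not maximizing, transform) for c in node.values()]
    return max(values) if maximizing else min(values)


def best_move(tree, transform=lambda v: v):
    return max(tree, key=lambda m: minimax(tree[m], False, transform))


print(best_move(TREE))                           # original values
print(best_move(TREE, lambda v: v ** 3 + 10))    # increasing transform
print(best_move(TREE, lambda v: -v))             # order-reversing transform
```

The increasing transform leaves the chosen move unchanged because max and min only compare values; reversing the order, by contrast, changes the choice.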

Given the following class scheduling problem: there are 4 classes (C1, …, C4) and 3 rooms (R1, .., R3). The schedule is as follows:

The restrictions are as follows:
› Each class must use one of the three available rooms
› R3 is too small for C3
› R2 and R3 are too small for C4
1. What variables and domains can be defined for this scheduling problem?
2. Show the possible values for each variable according to the constraints above.
3. Express the constraints of the problem formally.
1. Variables: C1, C2, C3, C4; Domain: R1, R2, R3
2. Possible assignments of each variable from the domain
› C1: { R1, R2, R3 }
› C2: { R1, R2, R3 }
› C3: { R1, R2 }
› C4: { R1 }
3. Constraints:
› Classes must not clash
– C1 != C2,
– C1 != C3,
– C2 != C3,
– C2 != C4,
– C3 != C4
› Room capacity restrictions
– C3 != R3,
– C4 != R2,
– C4 != R3

Now give the solution (make use of the constraints graph)

Constraints graph
[Figure: constraints graph over the classes C1, C2, C3, C4, drawn against a time axis and the rooms r1, r2, r3]
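One way to work out the solution is to enumerate the reduced domains and keep the assignments that satisfy the clash constraints listed above. The sketch is a brute-force check, not the constraints-graph procedure itself, and the names are hypothetical.

```python
from itertools import product

# Domains after applying the room capacity restrictions.
DOMAINS = {
    "C1": ["R1", "R2", "R3"],
    "C2": ["R1", "R2", "R3"],
    "C3": ["R1", "R2"],
    "C4": ["R1"],
}

# Pairs of classes that must not share a room.
CLASHES = [("C1", "C2"), ("C1", "C3"), ("C2", "C3"), ("C2", "C4"), ("C3", "C4")]


def solutions():
    """Yield every room assignment that satisfies all clash constraints."""
    names = list(DOMAINS)
    for rooms in product(*(DOMAINS[name] for name in names)):
        assignment = dict(zip(names, rooms))
        if all(assignment[a] != assignment[b] for a, b in CLASHES):
            yield assignment


for s in solutions():
    print(s)
```

The same result follows by hand: C4 can only take R1, which forces C3 to R2, then C2 to R3, and finally C1 to R1, which is allowed because C1 and C4 are not listed as clashing.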

Given a game situation as shown below:

X (the max player) is to move. Give all possible next situations for X.

Choose the correct path according to the minimax algorithm, given the utility function: win for X = +10, loss = -10, and draw = 0.
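The exercise can be approached with a short minimax over tic-tac-toe boards. The position from the original figure is not available in this transcript, so the board below is a hypothetical stand-in; the utilities follow the exercise (+10, -10, 0).

```python
LINES = [(0, 1, 2), (3, 4, 5), (6, 7, 8), (0, 3, 6),
         (1, 4, 7), (2, 5, 8), (0, 4, 8), (2, 4, 6)]


def winner(board):
    for a, b, c in LINES:
        if board[a] != " " and board[a] == board[b] == board[c]:
            return board[a]
    return None


def successors(board, player):
    """All next situations for the player, keyed by the cell that is filled."""
    return {i: board[:i] + player + board[i + 1:]
            for i, cell in enumerate(board) if cell == " "}


def minimax(board, player):
    """Utility for X: win = +10, loss = -10, draw = 0."""
    w = winner(board)
    if w is not None:
        return 10 if w == "X" else -10
    if " " not in board:
        return 0
    other = "O" if player == "X" else "X"
    values = [minimax(b, other) for b in successors(board, player).values()]
    return max(values) if player == "X" else min(values)


# Hypothetical position, cells 0..8 row by row:  X O . / . O . / . . X
board = "XO  O   X"
scores = {move: minimax(b, "O") for move, b in successors(board, "X").items()}
best = max(scores, key=scores.get)
print(scores, "best move:", best)
```

In this stand-in position every move except blocking the middle column loses, and the blocking move leads to a draw under best play by both sides.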