Treasure
Review
Deterministic Search Problems
Intuition
Describe how a computer
can try everything
Modeling DSP
States: what makes a state
Actions(s): possible actions from state s
Succ(s, a): states that could result from taking action a from state s
Reward(s, a): reward for taking action a from state s
startState ∈ States: starting state
isTerminal(s): whether to stop
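The model above can be written out as a minimal Python interface. This is a sketch under assumptions: the concrete problem (counting from 0 to 5 by steps of 1 or 2) and the class name are made up for illustration, not from the lecture.

```python
# A minimal sketch of the deterministic-search-problem model above.
# The counting problem itself is a made-up illustrative example.

class CountingProblem:
    def startState(self):
        return 0

    def isTerminal(self, s):
        # whether to stop
        return s == 5

    def actions(self, s):
        # possible actions from state s
        return [a for a in (1, 2) if s + a <= 5]

    def succ(self, s, a):
        # the state that results from taking action a in state s
        return s + a

    def reward(self, s, a):
        # reward (here: a uniform step penalty) for taking action a from state s
        return -1

problem = CountingProblem()
print(problem.succ(problem.startState(), 2))  # 2
```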
Breadth First Search
fringe = Queue()
fringe.enqueue(startState)
visited = Set([])
while not fringe.isEmpty():
    currState = fringe.dequeue()
    if currState not in visited:
        visited.add(currState)
        if isTerminal(currState): return currState
        for nextState in getNextStates(currState):
            fringe.enqueue(nextState)
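The pseudocode above runs almost unchanged as plain Python with a standard-library deque as the FIFO fringe. The toy state graph and goal test are assumptions for illustration:

```python
from collections import deque

# Toy state graph (an illustrative assumption, not from the lecture).
nextStates = {"A": ["B", "C"], "B": ["D"], "C": ["D", "E"], "D": [], "E": []}

def bfs(startState, isGoal):
    fringe = deque([startState])      # FIFO queue
    visited = set()
    while fringe:
        currState = fringe.popleft()  # dequeue
        if currState not in visited:
            visited.add(currState)
            if isGoal(currState):
                return currState
            for nextState in nextStates[currState]:
                fringe.append(nextState)
    return None  # exhausted the fringe without reaching a goal

print(bfs("A", lambda s: s == "E"))  # E
```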
Depth First Search
fringe = Stack()
fringe.push(startState)
visited = Set([])
while not fringe.isEmpty():
    currState = fringe.pop()
    if currState not in visited:
        visited.add(currState)
        if isTerminal(currState): return currState
        for nextState in getNextStates(currState):
            fringe.push(nextState)
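The only change from breadth first search is the fringe: a LIFO stack instead of a FIFO queue. A runnable sketch, reusing the same made-up toy graph:

```python
# Same search skeleton as BFS; only the fringe data structure changes.
# The toy state graph is an illustrative assumption.
nextStates = {"A": ["B", "C"], "B": ["D"], "C": ["D", "E"], "D": [], "E": []}

def dfs(startState, isGoal):
    fringe = [startState]        # a Python list used as a LIFO stack
    visited = set()
    while fringe:
        currState = fringe.pop() # pop the most recently pushed state
        if currState not in visited:
            visited.add(currState)
            if isGoal(currState):
                return currState
            for nextState in nextStates[currState]:
                fringe.append(nextState)
    return None

print(dfs("A", lambda s: s == "E"))  # E
```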
DNA Alignment
ATTGGGAAATGCCCCATTATTBBC
ATTGGAATCGACATATTATTBBC
DNA Alignment
ATTATCCA
TTGCATGCA
ATT__ATCCA
_TTGCATGCA
DNA Alignment
State:
ATTATCCA
TTGCATGCA
DNA Alignment
Start State:
ATTATCCA
TTGCATGCA
DNA Alignment
ATTATCCA
TTGCATGCA
Actions:
Put a gap in front of the 1st cursor
Put a gap in front of the 2nd cursor
Match the letters at the cursors
DNA Alignment
ATTATCCA
TTGCATGCA
Successors:
Put a gap in front of the 1st cursor
Advance 2nd cursor
Put a gap in front of the 2nd cursor
Advance 1st cursor
Match the letters at the cursors
Move both cursors
DNA Alignment
ATTATCCA
TTGCATGCA
Cost:
Put a gap in front of the 1st cursor: 2
Put a gap in front of the 2nd cursor: 2
Match the letters at the cursors: 0 if match, 1 if mismatch
DNA Alignment
Goal Test:
ATT__ATCCA
_TTGCATGCA
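Under the model above (a state is a pair of cursors, a gap costs 2, a match 0, a mismatch 1), the cheapest alignment can be found by searching cursor-pair states in order of cost so far. This sketch uses a cost-ordered priority queue; the function name `align_cost` is mine, not from the slides:

```python
import heapq

def align_cost(x, y, gap=2, mismatch=1):
    """Minimum alignment cost for the state space in the slides.
    A state is a pair of cursors (i, j); the three actions are
    gap in front of the 1st cursor (advance j), gap in front of
    the 2nd cursor (advance i), and match (advance both)."""
    start, goal = (0, 0), (len(x), len(y))
    fringe = [(0, start)]  # (cost so far, state)
    visited = set()
    while fringe:
        cost, (i, j) = heapq.heappop(fringe)
        if (i, j) in visited:
            continue
        visited.add((i, j))
        if (i, j) == goal:
            return cost
        if i < len(x):   # gap in front of the 2nd cursor: advance 1st
            heapq.heappush(fringe, (cost + gap, (i + 1, j)))
        if j < len(y):   # gap in front of the 1st cursor: advance 2nd
            heapq.heappush(fringe, (cost + gap, (i, j + 1)))
        if i < len(x) and j < len(y):  # match the letters at the cursors
            step = 0 if x[i] == y[j] else mismatch
            heapq.heappush(fringe, (cost + step, (i + 1, j + 1)))

print(align_cost("ATTATCCA", "TTGCATGCA"))
```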
Did We Break BFS?
Starting Line
CS221: More Intelligent Search
Uniform Cost Search
A Star
Markov Decision Problems
My Hobby
Internet
Treasure
Treasure Map
Depth First Search
fringe = Stack()
fringe.push(startState)
visited = Set([])
while not fringe.isEmpty():
    currState = fringe.pop()
    if currState not in visited:
        visited.add(currState)
        if isTerminal(currState): return currState
        for nextState in getNextStates(currState):
            fringe.push(nextState)
Breadth First Search
fringe = Queue()
fringe.enqueue(startState)
visited = Set([])
while not fringe.isEmpty():
    currState = fringe.dequeue()
    if currState not in visited:
        visited.add(currState)
        if isTerminal(currState): return currState
        for nextState in getNextStates(currState):
            fringe.enqueue(nextState)
UCS Search
fringe = PriorityQueue()
fringe.enqueue(startState, 0)
visited = Set([])
while not fringe.isEmpty():
    currState = fringe.dequeue()
    if currState not in visited:
        visited.add(currState)
        if isTerminal(currState): return currState
        for nextState in getNextStates(currState):
            costSoFar = nextState.getCostSoFar()
            fringe.enqueue(nextState, costSoFar)
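The pseudocode above can be run with the standard-library heap as the priority queue. The weighted toy graph is an illustrative assumption:

```python
import heapq

# Toy weighted graph: state -> list of (nextState, edgeCost).
# An illustrative assumption, not from the lecture.
edges = {
    "Start": [("A", 1), ("B", 5)],
    "A": [("B", 1), ("Goal", 10)],
    "B": [("Goal", 3)],
    "Goal": [],
}

def ucs(startState, goalState):
    fringe = [(0, startState)]  # priority queue ordered by cost so far
    visited = set()
    while fringe:
        costSoFar, currState = heapq.heappop(fringe)
        if currState in visited:
            continue
        visited.add(currState)
        if currState == goalState:
            return costSoFar
        for nextState, edgeCost in edges[currState]:
            heapq.heappush(fringe, (costSoFar + edgeCost, nextState))
    return None

print(ucs("Start", "Goal"))  # 5  (Start -> A -> B -> Goal)
```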
UCS Search
Proposition: UCS terminates with an optimal path.
If there are no negative-cost edges, the path costs of expanded nodes monotonically increase.
UCS Search
Proposition: UCS terminates with an optimal path.
[Figure: the Explored region growing outward from Start until it reaches Goal]
Uniform Cost Search
A Star
Markov Decision Problems
Exponential
Insight…
[Figure: UCS expands outward from Start in every direction before it reaches Goal]
Really?
Let's put in an estimate?
[Figure: Start and Goal, with an estimate of the remaining distance to Goal]
Really?
Will that always work?
No.
Let's put in an underestimate?
[Figure: Start and Goal, with an underestimate of the remaining distance to Goal]
Really?
Will that always work?
Terminology:
C(n): true cost from node n to the goal
H(n): heuristic cost from node n to the goal
G(n): actual cost from the start to node n
F(n) = G(n) + H(n)
Will that always work?
Terminology:
[Figure: Start, node n, and Goal, with G(n), H(n), and C(n) labeled along the path]
Will that always work?
Proof by contradiction: suppose A* returns a suboptimal Goal 1, while the optimal path runs through some fringe node n on its way to Goal 2.
F(goal1) = G(goal1), since H(goal) = 0
F(goal2) = G(goal2), since H(goal) = 0
G(goal1) > G(goal2), since goal 1 is suboptimal
H(n) ≤ C(n), since H is an underestimate
F(goal1) > F(goal2), putting the first three lines together
G(n) + H(n) ≤ G(n) + C(n), because of line 4
F(n) ≤ G(n) + C(n), by the definition of F(n)
F(n) ≤ F(goal2), since n is on the optimal path to goal 2, so G(n) + C(n) = G(goal2) = F(goal2)
F(n) < F(goal1), from line 5
But A* dequeued goal 1 before n, which requires F(goal1) ≤ F(n). Which is a contradiction.
Just Relax
UCS Search
fringe = PriorityQueue()
fringe.enqueue(startState, 0)
visited = Set([])
while not fringe.isEmpty():
    currState = fringe.dequeue()
    if currState not in visited:
        visited.add(currState)
        if isTerminal(currState): return currState
        for nextState in getNextStates(currState):
            costSoFar = nextState.getCostSoFar()
            fringe.enqueue(nextState, costSoFar)
A Star
fringe = PriorityQueue()
fringe.enqueue(startState, heuristicCost(startState))
visited = Set([])
while not fringe.isEmpty():
    currState = fringe.dequeue()
    if currState not in visited:
        visited.add(currState)
        if isTerminal(currState): return currState
        for nextState in getNextStates(currState):
            costSoFar = nextState.getCostSoFar()
            h = heuristicCost(nextState)
            fringe.enqueue(nextState, costSoFar + h)
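A runnable version of the A* loop above, on a small grid where Manhattan distance is an admissible heuristic (every move costs 1, so it never overestimates). The grid, walls, and goal are illustrative assumptions:

```python
import heapq

# Small 4x4 grid with two wall cells; an illustrative assumption.
WALLS = {(1, 1), (1, 2)}
GOAL = (3, 3)

def heuristic(s):
    # Manhattan distance: an underestimate of true cost on a unit-cost grid.
    return abs(s[0] - GOAL[0]) + abs(s[1] - GOAL[1])

def astar(start):
    fringe = [(heuristic(start), 0, start)]  # (f = g + h, g, state)
    visited = set()
    while fringe:
        f, g, s = heapq.heappop(fringe)
        if s in visited:
            continue
        visited.add(s)
        if s == GOAL:
            return g  # cost of the optimal path found
        x, y = s
        for nx, ny in ((x + 1, y), (x - 1, y), (x, y + 1), (x, y - 1)):
            if 0 <= nx <= 3 and 0 <= ny <= 3 and (nx, ny) not in WALLS:
                heapq.heappush(fringe, (g + 1 + heuristic((nx, ny)), g + 1, (nx, ny)))
    return None

print(astar((0, 0)))  # 6: the shortest path routes around the walls
```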
Observation
fringe = PriorityQueue()
fringe.enqueue(startState, heuristicCost(startState))
visited = Set([])
while not fringe.isEmpty():
    currState = fringe.dequeue()
    if currState not in visited:
        visited.add(currState)
        if isTerminal(currState): return currState
        for nextState in getNextStates(currState):
            costSoFar = nextState.getCostSoFar()
            h = heuristicCost(nextState)
            fringe.enqueue(nextState, costSoFar + h)

fringe = PriorityQueue()
fringe.enqueue(startState, 0)
visited = Set([])
while not fringe.isEmpty():
    currState = fringe.dequeue()
    if currState not in visited:
        visited.add(currState)
        if isTerminal(currState): return currState
        for nextState in getNextStates(currState):
            costSoFar = nextState.getCostSoFar()
            fringe.enqueue(nextState, costSoFar)
Observation
fringe = PriorityQueue()
fringe.enqueue(startState, 0)
visited = Set([])
while not fringe.isEmpty():
    currState = fringe.dequeue()
    if currState not in visited:
        visited.add(currState)
        if isTerminal(currState): return currState
        for nextState in getNextStates(currState):
            costSoFar = nextState.getCostSoFar()
            h = 0
            fringe.enqueue(nextState, costSoFar + h)

fringe = PriorityQueue()
fringe.enqueue(startState, 0)
visited = Set([])
while not fringe.isEmpty():
    currState = fringe.dequeue()
    if currState not in visited:
        visited.add(currState)
        if isTerminal(currState): return currState
        for nextState in getNextStates(currState):
            costSoFar = nextState.getCostSoFar()
            fringe.enqueue(nextState, costSoFar)
The 8 Puzzle
Solution depth ≈ 20
Branching factor ≈ 2.7
http://mypuzzle.org/sliding
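With those numbers, a blind search is hopeless: a tree of branching factor b searched to depth d touches on the order of b^d states. A quick arithmetic check of the slide's estimate:

```python
# Rough state-count estimate from the slide's numbers:
# branching factor b ~ 2.7, solution depth d ~ 20.
b, d = 2.7, 20
states = b ** d
print(f"{states:.2e}")  # on the order of 10**8 states for a blind search
```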
A Star Wins
Demo
For Deterministic Search Problems
Intermission
https://www.youtube.com/watch?v=qybUFnY7Y8w
Uniform Cost Search
A Star
Markov Decision Problems
Markov Decision Problems
Deterministic State Problems
No Uncertainty
Discretized
Nondeterministic State Problems
Discretized
Markov Decision Problems
Discretized
Expectation
Expectation
ExpectedUtility = Σ over e ∈ events of P(e) · Utility(e)
Event  Probability  Utility  P(e) · Utility(e)
1      1/6          $1       $0.17
2      1/6          $2       $0.33
3      1/6          $3       $0.50
4      1/6          $4       $0.67
5      1/6          $5       $0.83
6      1/6          $6       $1.00
ExpectedUtility = $3.50
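The die-roll expectation above can be checked directly; exact fractions avoid the rounding in the per-row column:

```python
from fractions import Fraction

# Expected utility of the die game: each face e in 1..6 has
# probability 1/6 and pays $e.
p = Fraction(1, 6)
expected_utility = sum(p * e for e in range(1, 7))
print(float(expected_utility))  # 3.5
```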
Volunteer
Markov Decision Problems
Discretized
Snowden MDP
[Figure: MDP diagram with states Hong Kong, Moscow, and US, and actions Fly to Moscow and Fly to US]
Modeling Discrete Search
States: what makes a state
Actions(s): possible actions from state s
Succ(s, a): states that could result from taking action a from state s
Reward(s, a): reward for taking action a from state s
startState ∈ States: starting state
isTerminal(s): whether to stop
Modeling Discrete Search
States: what makes a state
Actions(s): possible actions from state s
Succ(s, a): states that could result from taking action a from state s
Reward(s, a): reward for taking action a from state s
startState ∈ States: starting state
isTerminal(s): whether to stop
TerminalUtility(s): the value of reaching a given stopping point
Modeling Markov Decision
States: what makes a state
Actions(s): possible actions from state s
T(s, a): probability distribution of states that could result from taking action a from state s
Reward(s, a): reward for taking action a from state s
startState ∈ States: starting state
isTerminal(s): whether to stop
TerminalUtility(s): the value of reaching a given stopping point
Algorithm?
[Figure: expectimax tree: getAction at Hong Kong branches on Fly to Moscow and Fly to US; expectedUtility averages over the resulting Moscow and US outcomes]
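The getAction / expectedUtility recursion can be sketched in a few lines. This is a toy stand-in, not the lecture's actual model: the transition probabilities and terminal utilities below are invented for illustration.

```python
# Sketch of the getAction / expectedUtility recursion for an MDP.
# T[(s, a)] lists (probability, nextState) pairs; all numbers are
# invented for illustration.
T = {
    ("HK", "flyMoscow"): [(0.8, "Moscow"), (0.2, "US")],
    ("HK", "flyUS"): [(1.0, "US")],
}
ACTIONS = {"HK": ["flyMoscow", "flyUS"]}
TERMINAL_UTILITY = {"Moscow": 10, "US": -100}

def expectedUtility(s, a):
    # Probability-weighted average of the values of possible outcomes.
    return sum(p * value(s2) for p, s2 in T[(s, a)])

def value(s):
    if s in TERMINAL_UTILITY:          # TerminalUtility(s)
        return TERMINAL_UTILITY[s]
    return max(expectedUtility(s, a) for a in ACTIONS[s])

def getAction(s):
    # Pick the action with the highest expected utility.
    return max(ACTIONS[s], key=lambda a: expectedUtility(s, a))

print(getAction("HK"))  # flyMoscow: 0.8*10 + 0.2*(-100) = -12 beats -100
```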
Slow Expectimax?
Pyramid Solitaire
Uniform Cost Search
Heuristics and A*
Building in Uncertainty
End.
(go do pset 1)