Treasure (review)
Deterministic Search Problems
Intuition: describe how a computer can try everything.
Modeling DSP
  States: what makes a state
  Actions(s): possible actions from state s
  Succ(s, a): states that could result from taking action a from state s
  Reward(s, a): reward for taking action a from state s
  Start ∈ States: starting state
  IsTerminal(s): whether to stop

Breadth First Search

  fringe = Queue()
  fringe.enqueue(startState)
  visited = Set([])
  while not fringe.isEmpty():
      currState = fringe.dequeue()
      if currState not in visited:
          visited.add(currState)
          if isTerminal(currState):
              return currState
          for nextState in getNextStates(currState):
              fringe.enqueue(nextState)

Depth First Search (the fringe must be a LIFO stack, not a queue, for push/pop to explore depth-first)

  fringe = Stack()
  fringe.push(startState)
  visited = Set([])
  while not fringe.isEmpty():
      currState = fringe.pop()
      if currState not in visited:
          visited.add(currState)
          if isTerminal(currState):
              return currState
          for nextState in getNextStates(currState):
              fringe.push(nextState)

DNA Alignment
  ATTGGGAAATGCCCCATTATTBBC
  ATTGGAATCGACATATTATTBBC

DNA Alignment
  ATTATCCA
  TTGCATGCA
  aligns as
  ATT__ATCCA
  _TTGCATGCA

DNA Alignment
  State: the two sequences, with a cursor into each
  ATTATCCA
  TTGCATGCA

DNA Alignment
  Start State: both cursors at the beginning
  ATTATCCA
  TTGCATGCA

DNA Alignment
  Actions:
    Put a gap in front of the 1st cursor
    Put a gap in front of the 2nd cursor
    Match the letters at the cursors

DNA Alignment
  Successors:
    Put a gap in front of the 1st cursor: advance the 2nd cursor
    Put a gap in front of the 2nd cursor: advance the 1st cursor
    Match the letters at the cursors: move both cursors

DNA Alignment
  Cost:
    Put a gap in front of the 1st cursor: 2
    Put a gap in front of the 2nd cursor: 2
    Match the letters at the cursors: 0 if the letters match, 1 if they don't

DNA Alignment
  Goal Test: both cursors reach the end of their sequences, e.g.
  ATT__ATCCA
  _TTGCATGCA

Did We Break BFS?
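The BFS and DFS pseudocode above differ only in which end of the fringe they pop from, so both fit in one runnable sketch. The toy `graph` and the names `search` and `get_next_states` are illustrative assumptions, not the lecture's code.

```python
from collections import deque

def search(start, get_next_states, is_terminal, depth_first=False):
    """Generic search matching the slides' pseudocode: the fringe is
    FIFO for BFS and LIFO for DFS; everything else is identical."""
    fringe = deque([start])
    visited = set()
    while fringe:
        # DFS pops the newest state (stack); BFS pops the oldest (queue).
        curr = fringe.pop() if depth_first else fringe.popleft()
        if curr not in visited:
            visited.add(curr)
            if is_terminal(curr):
                return curr
            for next_state in get_next_states(curr):
                fringe.append(next_state)
    return None  # fringe exhausted: no terminal state is reachable

# Hypothetical toy graph: A -> B, C; B -> D.
graph = {'A': ['B', 'C'], 'B': ['D'], 'C': [], 'D': []}
found = search('A', lambda s: graph[s], lambda s: s == 'D')
# found == 'D'
```

Either strategy finds the terminal state here; what neither does is account for edge costs, which is exactly the gap UCS fills below.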
Starting Line
CS221: More Intelligent Search
  Uniform Cost Search
  A Star
  Markov Decision Problems

My Hobby
(figure slides: Internet, Treasure Map)

Depth First Search (recap; fringe is a stack)

  fringe = Stack()
  fringe.push(startState)
  visited = Set([])
  while not fringe.isEmpty():
      currState = fringe.pop()
      if currState not in visited:
          visited.add(currState)
          if isTerminal(currState):
              return currState
          for nextState in getNextStates(currState):
              fringe.push(nextState)

Breadth First Search (recap)

  fringe = Queue()
  fringe.enqueue(startState)
  visited = Set([])
  while not fringe.isEmpty():
      currState = fringe.dequeue()
      if currState not in visited:
          visited.add(currState)
          if isTerminal(currState):
              return currState
          for nextState in getNextStates(currState):
              fringe.enqueue(nextState)

UCS Search

  fringe = PriorityQueue()
  fringe.enqueue(startState, 0)
  visited = Set([])
  while not fringe.isEmpty():
      currState = fringe.dequeue()
      if currState not in visited:
          visited.add(currState)
          if isTerminal(currState):
              return currState
          for nextState in getNextStates(currState):
              costSoFar = nextState.getCostSoFar()
              fringe.enqueue(nextState, costSoFar)

UCS Search
  Proposition: UCS terminates with an optimal path.
  If there are no negative cost edges, the path costs of expanded nodes monotonically increase.

UCS Search
  Proposition: UCS terminates with an optimal path.
  (figure: the explored region grows outward from Start until it reaches Goal)

Agenda: Uniform Cost Search, A Star, Markov Decision Problems

Exponential
  (the explored region can grow exponentially with search depth)

Insight...
  (figure sequence: UCS expands uniformly in every direction from Start, spending most of its effort in directions that lead away from Goal. Really?)

Let's put in an estimate?
  (figure: Start, Goal, "Really?")
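The UCS pseudocode above keeps the fringe ordered by cost so far; Python's `heapq` gives a minimal runnable sketch. The weighted `graph` and the `(cost, state)` tuple encoding are illustrative assumptions; unlike the slides' `nextState.getCostSoFar()`, the cost here is threaded through the fringe entries.

```python
import heapq

def uniform_cost_search(start, successors, is_terminal):
    """UCS sketch: successors(state) yields (next_state, edge_cost) pairs.
    Returns (cost, state) for the cheapest reachable terminal state."""
    fringe = [(0, start)]  # priority queue of (cost so far, state)
    visited = set()
    while fringe:
        cost, curr = heapq.heappop(fringe)  # cheapest state first
        if curr not in visited:
            visited.add(curr)
            if is_terminal(curr):
                return cost, curr
            for next_state, edge_cost in successors(curr):
                heapq.heappush(fringe, (cost + edge_cost, next_state))
    return None

# Hypothetical weighted graph: the cheapest S-to-G path is S-A-B-G, cost 3.
graph = {'S': [('A', 1), ('B', 4)], 'A': [('B', 1), ('G', 6)],
         'B': [('G', 1)], 'G': []}
result = uniform_cost_search('S', lambda s: graph[s], lambda s: s == 'G')
# result == (3, 'G')
```

Note that UCS correctly skips the direct but expensive A-to-G edge (cost 6) in favor of the longer, cheaper route, which plain BFS would not.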
Will that always work? No.
Let's put in an underestimate?
(figure: Start, Goal, "Really?")

Will that always work?
Terminology:
  C(n): true cost from node n to the goal
  H(n): heuristic estimate of the cost from node n to the goal
  G(n): actual cost from the start to node n
  F(n) = G(n) + H(n)

Will that always work?
Proof by contradiction. Suppose A* returns a suboptimal goal1 while the optimal path leads to goal2, and let n be a node on that optimal path that was still on the fringe when goal1 was dequeued.
  1. F(goal1) = G(goal1)          since H(goal) = 0
  2. F(goal2) = G(goal2)          since H(goal) = 0
  3. G(goal1) > G(goal2)          since goal1 is suboptimal
  4. H(n) ≤ C(n)                  since H is an underestimate
  5. F(goal1) > F(goal2)          putting the first three lines together
  6. G(n) + H(n) ≤ G(n) + C(n)    because of line 4
  7. F(n) ≤ G(n) + C(n)           by the definition of F(n)
  8. F(n) ≤ F(goal2)              since n is on the optimal path, G(n) + C(n) = G(goal2) = F(goal2)
  9. F(n) < F(goal1)              from lines 5 and 8
But A* dequeued goal1 while n was still on the fringe, so F(goal1) ≤ F(n). Which is a contradiction.

Just Relax

UCS Search

  fringe = PriorityQueue()
  fringe.enqueue(startState, 0)
  visited = Set([])
  while not fringe.isEmpty():
      currState = fringe.dequeue()
      if currState not in visited:
          visited.add(currState)
          if isTerminal(currState):
              return currState
          for nextState in getNextStates(currState):
              costSoFar = nextState.getCostSoFar()
              fringe.enqueue(nextState, costSoFar)

A Star

  fringe = PriorityQueue()
  fringe.enqueue(startState, heuristicCost(startState))
  visited = Set([])
  while not fringe.isEmpty():
      currState = fringe.dequeue()
      if currState not in visited:
          visited.add(currState)
          if isTerminal(currState):
              return currState
          for nextState in getNextStates(currState):
              costSoFar = nextState.getCostSoFar()
              h = heuristicCost(nextState)
              fringe.enqueue(nextState, costSoFar + h)

Observation
  Set h = 0 everywhere in the A* code and it enqueues with priority costSoFar + 0, which is exactly the UCS code: UCS is the special case of A* with the zero heuristic.

The 8 Puzzle
  Solution depth ≈ 20
  Branching factor ≈ 2.7
  http://mypuzzle.org/sliding

A Star Wins (demo for deterministic search problems)

Intermission
  https://www.youtube.com/watch?v=qybUFnY7Y8w

Agenda: Uniform Cost Search, A Star, Markov Decision Problems

Markov Decision Problems
  Deterministic state problems: discretized, no uncertainty.
  Nondeterministic state problems: discretized, with uncertainty. These are Markov Decision Problems.

Expectation
  ExpectedUtility = Σ over events e of P(e) · Utility(e)

  Rolling a fair die and winning the face value in dollars:

  Event | Probability | Utility | P · Utility
    1   |    1/6      |   $1    |   $0.17
    2   |    1/6      |   $2    |   $0.33
    3   |    1/6      |   $3    |   $0.50
    4   |    1/6      |   $4    |   $0.67
    5   |    1/6      |   $5    |   $0.83
    6   |    1/6      |   $6    |   $1.00
  Expected utility: $3.50

Volunteer

Snowden MDP
  (diagram: from Hong Kong, the actions "Fly to Moscow" and "Fly to US" each lead uncertainly to Moscow or the US)

Modeling Discrete Search
  States: what makes a state
  Actions(s): possible actions from state s
  Succ(s, a): states that could result from taking action a from state s
  Reward(s, a): reward for taking action a from state s
  Start ∈ States: starting state
  IsTerminal(s): whether to stop

Modeling Discrete Search
  All of the above, plus:
  TerminalUtility(s): the value of reaching a given stopping point

Modeling Markov Decision
  States: what makes a state
  Actions(s): possible actions from state s
  T(s, a): probability distribution over the states that could result from taking action a from state s
  Reward(s, a): reward for taking action a from state s
  Start ∈ States: starting state
  IsTerminal(s): whether to stop
  TerminalUtility(s): the value of reaching a given stopping point

Algorithm?
  (tree: Hong Kong calls getAction; each action, Fly to Moscow and Fly to US, calls expectedUtility; each resulting state, Moscow or US, recurses with expectedUtility)

Slow Expectimax?
  Pyramid Solitaire

Recap:
  Uniform Cost Search
  Heuristics and A*
  Building in Uncertainty

End. (go do pset 1)
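The getAction / expectedUtility recursion from the "Algorithm?" slide can be written out for a tiny MDP. The transition probabilities, terminal utilities, and all names below are illustrative assumptions (rewards are folded into terminal utilities for brevity), not the lecture's numbers.

```python
# Hypothetical Snowden-style MDP: from 'HongKong', each action leads
# probabilistically to a terminal state. All numbers are made up.
T = {
    ('HongKong', 'FlyToMoscow'): {'Moscow': 0.7, 'US': 0.3},
    ('HongKong', 'FlyToUS'):     {'US': 1.0},
}
TERMINAL_UTILITY = {'Moscow': 10.0, 'US': -50.0}

def actions(state):
    return [a for (s, a) in T if s == state]

def is_terminal(state):
    return not actions(state)

def expected_utility(state):
    """Expectimax value: terminal utility at stopping points, otherwise
    the best action's probability-weighted value over successor states."""
    if is_terminal(state):
        return TERMINAL_UTILITY[state]
    return max(sum(p * expected_utility(s2) for s2, p in T[(state, a)].items())
               for a in actions(state))

def get_action(state):
    # Pick the action whose expected utility is highest.
    return max(actions(state),
               key=lambda a: sum(p * expected_utility(s2)
                                 for s2, p in T[(state, a)].items()))

# FlyToMoscow: 0.7 * 10 + 0.3 * (-50) = -8.0; FlyToUS: -50.0.
# So get_action('HongKong') == 'FlyToMoscow'.
```

The "Slow Expectimax?" slide hints at the catch: this recursion revisits states exponentially often, which motivates the value-iteration-style methods that come later in the course.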