Intro to Planning
Or, how to represent the planning
problem in logic
The Planning Problem
Input:
1. An “initial state”
2. A “goal state”
3. A set of actions, each of which can take you
from one state to another one
Output:
A sequence of actions that, when executed in order
starting in the initial state, guarantee reaching the
goal state
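To make this input/output contract concrete, here is a minimal Python sketch of plan validation. The actions mapping and its applicable/apply helpers are illustrative stand-ins, not something the slides define:

def validate_plan(initial_state, goal_test, plan, actions):
    # A plan succeeds iff every action's precondition holds at the moment
    # it is executed, and the final state satisfies the goal test.
    state = initial_state
    for name in plan:
        applicable, apply = actions[name]   # hypothetical helper functions
        if not applicable(state):
            return False                    # precondition violated mid-plan
        state = apply(state)
    return goal_test(state)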
Sound Familiar?
Graph Traversal as a Planning Problem
1. “initial state” is the start node in the graph
2. “goal state” is the goal node in the graph
3. Each “action” is a traversal of one of the edges in
the graph, which takes you from an existing
state (a node in the graph) to another state
(another node in the graph).
The output is a sequence of actions (edges) that
take an agent from the start state to the goal state.
Problems with Graphs as Representations
The algorithms we used for search in graphs work great: they are efficient, and they
find optimal paths.
However, some planning problems are difficult to represent as graphs. For example,
1. Uncertainty: the agent may not be omniscient (all-knowing), so it doesn’t know
the whole graph at each time step. We’ve talked about some sources of
uncertainty before:
1. partial observability (agent doesn’t perceive the world fully/accurately)
2. stochasticity (actions can have multiple outcomes)
3. multi-agent environments (other intelligent agents operate in the environment)
4. dynamism (the world changes over time, without the agent doing anything)
5. computational limits, ignorance, laziness, storage limits, etc.
Consider the problem of planning a traversal of Tuttleman Hall to get from the entrance to room
305. If you didn’t know the building, you’d need to include actions for looking around the hall for
the right room number, or for determining whether there are stairs or elevators, and where they
are. Your subsequent actions would depend on the outcomes of these actions, so you can’t
represent them in the graph at the beginning. (partial observability)
Even if you know the building well, you still can’t plan out your route from the beginning, since you
don’t know if people will be in the way (multi-agent/stochasticity).
Problems with Graphs as Representations
The algorithms we used for search in graphs work great: they are efficient,
and they find optimal paths.
However, some planning problems are difficult to represent as graphs. For
example,
2. Complexity: the complete graph might be enormous (or infinite), so it’s
unrealistic to assume that the whole thing is given as an input.
For example, consider the problem of corralling 100 sheep (s1 through s100)
into 10 pens (p1-p10). All sheep start in an open field (F). The objective is to
get s1-s10 into p1, s11-s20 into p2, etc. The allowed actions are moving one
sheep from one location (F or p1-p10) to another location.
Quiz: If we represent this as a graph, how many total nodes would there be?
How many total edges?
Answer: Problems with Graphs as Representations
The algorithms we used for search in graphs work great: they are efficient, and they find optimal paths.
However, some planning problems are difficult to represent as graphs. For example,
2. Complexity: the complete graph might be enormous (or infinite), so it’s unrealistic to assume that
the whole thing is given as an input.
For example, consider the problem of corralling 100 sheep (s1 through s100) into 10 pens (p1-p10). All sheep start in an open field (F). The objective is to get s1-s10 into p1, s11-s20 into p2, etc.
The allowed actions are moving one sheep from one location (F or p1-p10) to another location.
Quiz: If we represent this as a graph, how many total nodes would there be? How many total edges?
The number of nodes: A node represents a position for all 100 sheep. There are 11 possible places for s1,
11 for s2, 11 for s3, …, and 11 for s100. So there are 11 × 11 × … × 11 (100 times) = 11^100 ≈ 1.4 × 10^104 nodes, or more than a googol (10^100).
The number of edges: For every node, there are 100 possible sheep to move, and 11 − 1 = 10 possible
places to move each one to, so 1000 edges per node. So there are a total of 11^100 × 1000 ≈ 1.4 × 10^107 edges.
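Python’s exact integer arithmetic makes these counts easy to sanity-check:

nodes = 11 ** 100           # 11 possible locations for each of 100 sheep
edges = nodes * (100 * 10)  # per node: 100 sheep x 10 other locations each
print(f"{nodes:.2e}")       # ~1.38e+104 (more than a googol, 1e+100)
print(f"{edges:.2e}")       # ~1.38e+107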
Planning generalizes Graph Search
Planning lets us consider problems with more complexity
and uncertainty than graph search.
The main difference is that the input includes
“states” and “actions” rather than nodes and edges.
In very simple cases, these are the same thing, but not
always.
Note: The main difference is in representation, rather
than inference or learning.
Handling Complexity with Better Representations
We’ll start by talking about representations that
don’t suffer (as much) from combinatorial
explosions.
Later, we’ll talk about handling partial
observability, stochasticity, and other causes of
uncertainty.
Example Planning Problem
Initial state: sheep are in
the field, as is the robot.
Goal: get sheep into the
corral.
Actions: L: fly left, from corral to field. R: fly right, from field to corral. G: grab a sheep.
U: ungrab, or let go of, a sheep.
Quiz: Planning Problem
(Initial state and goal are shown as pictures: the robot and both sheep start in the field; in the goal, the robot and both sheep are in the corral.)
Which of the following is a plan?
And which of the plans actually
achieves the goal, starting from
the initial state?
1. [L, L, L]
2. [U, G, U, G, M, K, Z]
3. [G, R, U]
4. [L, G, R, U, L, G, R, U, L, R]
5. [G, R, U, L, G, R, U, L]
Answers: Planning Problem
Which of the following is a plan?
And which of the plans actually
achieves the goal, starting from
the initial state?
1. [L, L, L]
Plan, unsuccessful
2. [U, G, U, G, M, K, Z]
Not a plan (M, K, Z are not
actions in this planning
problem)
3. [G, R, U]
Plan, unsuccessful
4. [L, G, R, U, L, G, R, U, L, R]
Plan, successful
5. [G, R, U, L, G, R, U, L]
Plan, unsuccessful (robot
ends in the wrong spot)
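These outcomes can be checked mechanically. Below is a toy Python simulator that reproduces the answers. It is a sketch, with two conventions the slides leave implicit spelled out as assumptions: an action whose precondition fails (like flying left when already in the field) is a no-op, and G grabs the lowest-numbered sheep at the robot’s location. A state is (robot, carrying, s1, s2), with locations 'F' (field), 'C' (corral), and 'B' (on board the robot).

ACTIONS = {'L', 'R', 'G', 'U'}

def step(state, action):
    robot, carrying, s1, s2 = state
    if action == 'L' and robot == 'C':      # fly left: corral -> field
        return ('F', carrying, s1, s2)
    if action == 'R' and robot == 'F':      # fly right: field -> corral
        return ('C', carrying, s1, s2)
    if action == 'G' and carrying is None:  # grab a sheep at robot's location
        if s1 == robot:
            return (robot, 1, 'B', s2)
        if s2 == robot:
            return (robot, 2, s1, 'B')
    if action == 'U' and carrying == 1:     # let go of the carried sheep
        return (robot, None, robot, s2)
    if action == 'U' and carrying == 2:
        return (robot, None, s1, robot)
    return state                            # assumption: otherwise a no-op

def execute(plan, state=('F', None, 'F', 'F')):
    if not all(a in ACTIONS for a in plan):
        return None                         # contains non-actions: not a plan
    for a in plan:
        state = step(state, a)
    return state

goal = ('C', None, 'C', 'C')                # robot and both sheep in the corral
for plan in [list('LLL'), list('UGUGMKZ'), list('GRU'),
             list('LGRULGRULR'), list('GRULGRUL')]:
    final = execute(plan)
    if final is None:
        print(plan, '-> not a plan')
    else:
        print(plan, '->', 'successful' if final == goal else 'unsuccessful')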
Quiz: Describe States in Logic
Using the following boolean
variables, come up with PL
formulas to describe the initial
state and goal state:
Robot_has_sheep_1
Robot_has_sheep_2
Robot_in_field
Sheep_1_in_field
Sheep_2_in_field
Answer: Describe States in Logic
Using the following boolean
variables, come up with PL
formulas to describe the initial
state and goal state:
Robot_has_sheep_1
Robot_has_sheep_2
Robot_in_field
Sheep_1_in_field
Sheep_2_in_field
Initial: Robot_in_field ∧ Sheep_1_in_field ∧ Sheep_2_in_field
Goal: ¬Robot_in_field ∧ ¬Sheep_1_in_field ∧ ¬Sheep_2_in_field
Generalizing with PL
Suppose we don’t actually care where the
robot ends up, just that the sheep are in
the corral.
We can describe this goal just by
removing the variable Robot_in_field
from the goal description.
New Goal:
¬Sheep_1_in_field ∧ ¬Sheep_2_in_field
So long as Sheep_1_in_field and
Sheep_2_in_field are both false, any
assignment of T or F to Robot_in_field will
make the goal formula true.
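Because the relaxed goal formula mentions only two of the three variables, we can enumerate all assignments and see that exactly two worlds satisfy it; a quick brute-force check:

from itertools import product

def goal(robot_in_field, sheep_1_in_field, sheep_2_in_field):
    # the relaxed goal: both sheep are out of the field
    return (not sheep_1_in_field) and (not sheep_2_in_field)

worlds = [w for w in product([True, False], repeat=3) if goal(*w)]
print(worlds)   # [(True, False, False), (False, False, False)]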
Quiz: Describe States in FOL
Using the following constants and relations, write FOL
sentences to describe the initial and goal states.
Constants:
B (robot)
S1, S2 (sheep)
F (field)
C (corral)
Relations:
Sheep(x) – true if x is a sheep
Holding(x, y) – true if x is holding y
At(x, y) – true if x is at location y
Answer: Describe States in FOL
Using the following constants and relations, write FOL
sentences to describe the initial and goal states.
Constants:
B (robot)
S1, S2 (sheep)
F (field)
C (corral)
Relations:
Sheep(x) – true if x is a sheep
Holding(x, y) – true if x is holding y
At(x, y) – true if x is at location y
Initial state: At(S1, F) ∧ At(S2, F) ∧ At(B, F)
Goal state: At(S1, C) ∧ At(S2, C)
Quiz: Generalizing with FOL
Like with PL, FOL lets us describe goal states that
include multiple possible worlds.
Unlike PL, it also has convenient ways of generalizing
further.
Suppose there were 100 sheep instead of 2. Write an
FOL statement that describes the goal that all of the
sheep are in the corral.
Answer: Generalizing with FOL
Like with PL, FOL lets us describe goal states that
include multiple possible worlds.
Unlike PL, it also has convenient ways of generalizing
further.
Suppose there were 100 sheep instead of 2. Write an
FOL statement that describes the goal that all of the
sheep are in the corral.
Answer:
∀s. Sheep(s) ⇒ At(s, C)
This formula succinctly captures the goal state,
regardless of how many sheep are involved.
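Here is a sketch of how the quantified goal can be evaluated against a finite world. The dict-based model (sheep names mapped to locations) is an illustrative encoding, not anything fixed by FOL itself:

world = {f"S{i}": "C" for i in range(1, 101)}   # all 100 sheep in the corral

def goal_satisfied(world):
    # forall s. Sheep(s) => At(s, C): every sheep in the model is at C
    return all(loc == "C" for loc in world.values())

print(goal_satisfied(world))    # True
world["S57"] = "F"
print(goal_satisfied(world))    # False once any one sheep leaves the corral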
Describing Actions
We’ve talked a bunch about how to represent
the start and goal states.
What about actions?
Let’s go over two commonly-used approaches.
STRIPS Actions
STRIPS is a language for representing the meaning of actions.
Here are some examples:
Move Left:
Pre: At(B, C)
Eff: At(B, F) ∧ ¬At(B, C)
Ungrab(x, y):
Pre: Holding(B, x) ∧ At(B, y)
Eff: At(x, y) ∧ ¬Holding(B, x)
Each action has a list of arguments, a description of preconditions (what must be true
before the action can take place), and a list of effects (what is true after the action
takes place). Notice that the effects include things that become true, and things that
become false. Preconditions and effects CANNOT use quantifiers (in STRIPS).
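One common way to encode STRIPS in code (a sketch, not the only option): a state is a set of ground atoms, and an action carries its preconditions, an add list (effects that become true), and a delete list (effects that become false). Negative preconditions like ¬Holding(B, S1) don’t fit this positive-atoms encoding directly; they can be compiled away with a trick like the handsFull variable discussed a few slides below.

from dataclasses import dataclass

@dataclass(frozen=True)
class Action:
    name: str
    pre: frozenset       # atoms that must hold before the action
    add: frozenset       # atoms that become true
    delete: frozenset    # atoms that become false

    def applicable(self, state):
        return self.pre <= state            # all preconditions hold

    def apply(self, state):
        return (state - self.delete) | self.add

# Move Left from the slide, as a ground action:
move_left = Action("L",
                   pre=frozenset({"At(B, C)"}),
                   add=frozenset({"At(B, F)"}),
                   delete=frozenset({"At(B, C)"}))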
Quiz: STRIPS Actions
Write STRIPS action descriptions for the Move
Right and Grab actions.
Answer: STRIPS Actions
Write STRIPS action descriptions for the Move Right and Grab actions. Make sure that the robot can’t
grab something if it’s already holding something.
Move Right:
Pre: At(B, F)
Eff: At(B, C) ∧ ¬At(B, F)
Grab(x, y):
Pre: ¬Holding(B, S1) ∧ ¬Holding(B, S2) ∧ At(B, y) ∧ At(x, y)
Eff: ¬At(x, y) ∧ Holding(B, x)
Note: You need to modify At(sheep, location), either in the Grab/Ungrab actions’ effects, or in the Move
right/Move left actions’ effects. My version here modifies them in the Grab/Ungrab actions.
Note 2: If you want to avoid adding a conjunct to the precondition of Grab for each sheep in the world,
you can create a new boolean variable called handsFull. The Precondition for Grab would require this to
be false, and the effects would make it true. The preconditions for Ungrab would require handsFull to be
true, and the effects would make it false. The only other change is that the initial condition would need
to specify handsFull.
Quiz: State Changes with STRIPS
Initial state: At(S1, F) ∧ At(S2, F) ∧ At(B, F)
Given the initial state above, describe the state of the
world after each of the following actions takes place, in
order:
G(S1, F)
R
U(S1, C)
L
U(S1, F)
Answer: State Changes with STRIPS
Initial state: At(S1, F) ∧ At(S2, F) ∧ At(B, F)
Given the initial state above, describe the state of the world
after each of the following actions takes place, in order:
After G(S1, F): At(S2, F) ∧ At(B, F) ∧ Holding(B, S1) (At(S1, F) is deleted)
After R: At(S2, F) ∧ Holding(B, S1) ∧ At(B, C) (At(B, F) is deleted)
After U(S1, C): At(S2, F) ∧ At(B, C) ∧ At(S1, C) (Holding(B, S1) is deleted)
After L: At(S2, F) ∧ At(S1, C) ∧ At(B, F) (At(B, C) is deleted)
After U(S1, F): Preconditions aren’t met (Holding(B, S1) no longer holds), so
this action can’t be taken in the current state.
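The trace above can be replayed with the Action encoding sketched earlier. The ground actions below are written out by hand (move_left is reused from that sketch); the Grab precondition omits the negative ¬Holding conjuncts, which that encoding can’t express directly:

grab_s1_f = Action("G(S1, F)",
                   pre=frozenset({"At(B, F)", "At(S1, F)"}),
                   add=frozenset({"Holding(B, S1)"}),
                   delete=frozenset({"At(S1, F)"}))
move_right = Action("R",
                    pre=frozenset({"At(B, F)"}),
                    add=frozenset({"At(B, C)"}),
                    delete=frozenset({"At(B, F)"}))
ungrab_s1_c = Action("U(S1, C)",
                     pre=frozenset({"Holding(B, S1)", "At(B, C)"}),
                     add=frozenset({"At(S1, C)"}),
                     delete=frozenset({"Holding(B, S1)"}))
ungrab_s1_f = Action("U(S1, F)",
                     pre=frozenset({"Holding(B, S1)", "At(B, F)"}),
                     add=frozenset({"At(S1, F)"}),
                     delete=frozenset({"Holding(B, S1)"}))

state = frozenset({"At(S1, F)", "At(S2, F)", "At(B, F)"})
for a in [grab_s1_f, move_right, ungrab_s1_c, move_left, ungrab_s1_f]:
    if not a.applicable(state):
        print(f"{a.name}: preconditions aren't met")
        break
    state = a.apply(state)
    print(f"After {a.name}: {sorted(state)}")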
Search Strategies for Finding a Plan
Strategy 1: Forward (or progression) search
1. Keep a priority queue of states (each described by FOL or PL).
2. When it’s time to explore a node, apply all actions whose preconditions are met, and add the resulting states to the priority queue.
3. Stop when a state is taken from the queue that matches the goal state.
(Diagram: the initial state expands through actions such as G(S1), R, and G(S2) toward the goal.)
Search Strategies for Finding a Plan
Strategy 1: Forward (or progression) search
Notice: this algorithm is very similar to our graph search algorithms, but it doesn’t require the complete graph as input.
Also notice: I haven’t (yet) specified how to compute the priorities for the priority queue. But you can use cost (e.g., number of actions), or heuristics, or a combination of the two.
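Here is a sketch of forward search over these STRIPS states, using a priority queue keyed by path cost (so, uniform-cost search; a heuristic term could be added to the priority). It reuses the Action objects from the earlier sketches:

import heapq

def forward_search(initial, goal, actions):
    # Returns a list of action names reaching a state where every goal
    # atom holds, or None if the goal is unreachable.
    frontier = [(0, 0, initial, [])]   # (cost, tiebreak, state, plan)
    seen = {initial}
    counter = 1                        # tiebreak so states are never compared
    while frontier:
        cost, _, state, plan = heapq.heappop(frontier)
        if goal <= state:
            return plan
        for a in actions:
            if a.applicable(state):
                nxt = a.apply(state)
                if nxt not in seen:
                    seen.add(nxt)
                    heapq.heappush(frontier,
                                   (cost + 1, counter, nxt, plan + [a.name]))
                    counter += 1
    return None

# For example, with the ground actions defined earlier:
# forward_search(frozenset({"At(S1, F)", "At(S2, F)", "At(B, F)"}),
#                frozenset({"At(S1, C)"}),
#                [grab_s1_f, move_right, ungrab_s1_c, move_left])
# returns ['G(S1, F)', 'R', 'U(S1, C)']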
Search Strategies for Finding a Plan
Strategy 2: Backward (or regression) search
1. Start by adding the goal state to the priority queue, instead of the initial state.
2. At each iteration, find all actions whose effects match the current node, and add the previous states (before the action) to the queue.
3. Stop when you get a node that matches the initial state.
(Diagram: the goal regresses through actions such as U(S2) and R back toward the initial state.)
Search Strategies for Finding a Plan
Strategy 2: Backward (or regression) search
Note: this is basically the same, but there are cases when it’s a lot more efficient than forward search. Consider the case of 1000 sheep, where the goal is to get s457 into the corral. Forward search has 1001 possible actions to consider in the initial state, while backward search only has to consider a small number.
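The core of backward search is one regression step. Here is a sketch, again over the Action encoding used above: an action is relevant if it achieves part of the subgoal without clobbering any of it, and regressing through it replaces what the action adds with what the action requires.

def regress(subgoal, action):
    # Returns the regressed subgoal, or None if the action is irrelevant.
    if not (action.add & subgoal):
        return None                 # achieves nothing in the subgoal
    if action.delete & subgoal:
        return None                 # would undo part of the subgoal
    return (subgoal - action.add) | action.pre

# regress(frozenset({"At(S1, C)"}), ungrab_s1_c)
# -> frozenset({"Holding(B, S1)", "At(B, C)"})

Only actions whose add list touches the subgoal are ever considered, which is why regressing from At(s457, C) looks at a handful of actions instead of 1001.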
Heuristics for Planning
A popular strategy is to automatically generate heuristics for a planning problem,
from the descriptions of the actions.
Here’s the general idea:
1. Create a relaxed planning problem by simplifying all of the actions.
2. For each node, use a depth-first or breadth-first search to solve the relaxed
planning problem.
3. Use the path cost for the plan from the relaxed problem as the heuristic value for
the node in the full planning problem.
To make this work out, we need to make sure that the relaxed planning problem is
much, much easier to solve than the original planning problem, since we need to
solve the relaxed planning problem many times (each time we explore a node).
Heuristics for Planning
Here’s an example of a strategy for generating a relaxed planning problem from
STRIPS action descriptions.
Start with your existing actions, e.g.:
Grab(x, y):
Pre: ¬Holding(B, S1) ∧ ¬Holding(B, S2) ∧ At(B, y) ∧ At(x, y)
Eff: ¬At(x, y) ∧ Holding(B, x)
Start removing preconditions, to get relaxed action descriptions for a strictly easier
planning problem:
Grab(x, y):
Pre: At(B, y) ∧ At(x, y)
Eff: ¬At(x, y) ∧ Holding(B, x)
In this version, the robot can hold as many sheep as it likes.
Heuristics for Planning
Here’s an example of a strategy for generating a relaxed planning problem from
STRIPS action descriptions.
Start with your existing actions, e.g.:
Grab(x, y):
Pre: ¬Holding(B, S1) ∧ ¬Holding(B, S2) ∧ At(B, y) ∧ At(x, y)
Eff: ¬At(x, y) ∧ Holding(B, x)
Alternatively, or in addition, you can remove negative effects, e.g.:
Grab(x):
Pre: (none)
Eff: Holding(B, x)
In this version, the robot can hold as many sheep as it wants, and it doesn’t have to be in the
same square as the sheep.
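With the Action encoding from earlier, the delete relaxation is a one-liner, and the heuristic value is the length of a plan found in the relaxed problem (a sketch; here only negative effects are dropped, and preconditions are kept):

def relax(action):
    # Keep preconditions and positive effects; drop all negative effects.
    return Action(action.name, pre=action.pre,
                  add=action.add, delete=frozenset())

def relaxed_plan_length(state, goal, actions):
    # Heuristic value for state: solve the relaxed problem from it with the
    # forward search sketched earlier. Relaxed states only ever grow, so
    # this search is comparatively cheap and always terminates.
    plan = forward_search(state, goal, [relax(a) for a in actions])
    return len(plan) if plan is not None else float('inf')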