1001 Ways to Skin a Planning Graph
for Heuristic Fun and Profit
Subbarao Kambhampati
Arizona State University
http://rakaposhi.eas.asu.edu
WMD-in-the-toilet
“After the flush, you may find
that there were no bombs to begin with”
(With tons of help from Daniel Bryce, Minh Binh Do, Xuan Long Nguyen
Romeo Sanchez Nigenda, Biplav Srivastava, Terry Zimmerman)
Funding from NSF & NASA
Planning Graph and Projection
• Envelope of Progression Tree (Relaxed Progression)
  – Proposition lists: Union of states at kth level
  – Mutex: Subsets of literals that cannot be part of any legal state
• Lower-bound reachability information
[Figure: relaxed progression tree over states (p, pq, pr, ps, pqr, pqs, pst, ...) generated by actions A1-A4, and the corresponding planning graph levels]
Planning Graphs can be used as the basis for heuristics!
[Blum & Furst, 1995] [ECP, 1997]
And PG Heuristics for all..
– Classical (regression) planning
  – AltAlt (AAAI 2000; AIJ 2002); AltAltp (JAIR 2003)
    • Serial vs. Parallel graphs; Level and Adjusted heuristics; Partial expansion
– Graphplan style search
  – GP-HSP (AIPS 2000); PEGG (IJCAI 2003; AAAI 1999)
    • Variable/Value ordering heuristics based on distances
– Partial order planning
  – RePOP (IJCAI 2001)
    • Mutexes used to detect Indirect Conflicts
– Metric Temporal Planning
  – Sapa (ECP 2001; AIPS 2002; JAIR 2003)
    • Propagation of cost functions; Phased relaxation
– Conformant Planning
  – CAltAlt (ICAPS Uncertainty Wkshp, 2003)
    • Multiple graphs; Labelled graphs
I. PG Heuristics for State-space (Regression) Planners
Problem: Given a set of subgoals (regressed state), estimate how far they are from the initial state.
[AAAI 2000; AIPS 2000; AIJ 2002; JAIR 2003]
Planning Graphs:
Optimistic Projection of Achievability
[Figure: Grid problem (robot initially at (0,0), key at (0,1)) and its planning graph: proposition and action lists at levels 0-2 (At(0,0), At(0,1), At(1,0), At(1,1), Key(0,1), ~Key(0,1), Have_key; Move, Pick_key and noop actions), with mutexes marked]
Serial PG: PG where any pair of non-noop actions are marked mutex.
lev(S): index of the first level where all props in S appear non-mutexed.
– If there is no such level, then:
  • If the graph is grown to level-off: ∞
  • Else: k+1 (k is the current length of the graph)
Cost of a Set of Literals
[Figure: the grid-problem planning graph again, with mutexes, used to illustrate costing a set of literals]
Heuristics for the cost of a set S of literals:
• Sum: h(S) = Σ_{p ∈ S} lev({p})
• Set-Level: h(S) = lev(S)   (admissible)
• Other variants: Partition-k, Adjusted Sum, Combo, Set-Level with memos
where
• lev(p): index of the first level at which p comes into the planning graph
• lev(S): index of the first level where all props in S appear non-mutexed
  – If there is no such level, then: ∞ if the graph is grown to level-off; else k+1 (k is the current length of the graph)
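To make these definitions concrete, here is a minimal Python sketch (not part of the original slides) of lev(S) and the Sum and Set-Level heuristics; the graph representation (`prop_levels` as a list of proposition sets per level, `mutex_levels` as a list of sets of mutex pairs) and the function names are assumptions made for illustration.

```python
import math
from itertools import combinations

def lev_set(prop_levels, mutex_levels, S, leveled_off):
    """lev(S): index of the first level where all props in S appear pairwise non-mutex.
    prop_levels[k]: set of propositions at level k; mutex_levels[k]: set of frozenset
    pairs that are mutex at level k (representation assumed for illustration)."""
    for k, (props, mutexes) in enumerate(zip(prop_levels, mutex_levels)):
        if all(p in props for p in S) and \
           not any(frozenset(pair) in mutexes for pair in combinations(S, 2)):
            return k
    # No such level: infinity if the graph has leveled off, else k+1
    return math.inf if leveled_off else len(prop_levels)

def h_sum(prop_levels, mutex_levels, S, leveled_off):
    """Sum heuristic: h(S) = sum over p in S of lev({p}). Inadmissible but informative."""
    return sum(lev_set(prop_levels, mutex_levels, {p}, leveled_off) for p in S)

def h_set_level(prop_levels, mutex_levels, S, leveled_off):
    """Set-level heuristic: h(S) = lev(S). Admissible."""
    return lev_set(prop_levels, mutex_levels, S, leveled_off)
```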
Adjusting the Sum Heuristic
• Start with the Sum heuristic and adjust it to take subgoal interactions into account
  – Negative interactions in terms of “degree of interaction”
  – Positive interactions in terms of co-achievement links
• Ignore negative interactions when accounting for positive interactions (and vice versa)
Problem       Level        Sum          AdjSum2M
Gripper-25    -            69/0.98      67/1.57
Gripper-30    -            81/1.63      77/2.83
Tower-7       127/1.28     127/0.95     127/1.37
Tower-9       511/47.91    511/16.04    511/48.45
8-Puzzle1     31/6.25      39/0.35      31/0.69
8-Puzzle2     30/0.74      34/0.47      30/0.74
Mystery-6     -            -            16/62.5
Mystery-9     8/0.53       8/0.66       8/0.49
Mprime-3      4/1.87       4/1.88       4/1.67
Mprime-4      8/1.83       8/2.34       10/1.49
Aips-grid1    14/1.07      14/1.12      14/0.88
Aips-grid2    -            -            34/95.98
(entries are solution length / CPU time in seconds)
HAdjSum2M(S) = length(RelaxedPlan(S)) + max_{p,q ∈ S} δ(p,q)
where δ(p,q) = lev({p,q}) - max{lev(p), lev(q)}   /* degree of negative interaction */
[AAAI 2000]
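A hedged sketch of the adjusted heuristic above, building on the `lev_set` helper from the earlier sketch; the relaxed-plan length is assumed to be computed elsewhere, since the extraction procedure is not spelled out on this slide.

```python
from itertools import combinations

def interaction_degree(prop_levels, mutex_levels, p, q, leveled_off):
    """delta(p, q) = lev({p, q}) - max(lev({p}), lev({q})): degree of negative interaction."""
    lev = lambda s: lev_set(prop_levels, mutex_levels, s, leveled_off)
    return lev({p, q}) - max(lev({p}), lev({q}))

def h_adjsum2m(prop_levels, mutex_levels, S, relaxed_plan_length, leveled_off):
    """HAdjSum2M(S) = length(RelaxedPlan(S)) + max over p, q in S of delta(p, q)."""
    if len(S) < 2:
        return relaxed_plan_length
    penalty = max(interaction_degree(prop_levels, mutex_levels, p, q, leveled_off)
                  for p, q in combinations(S, 2))
    return relaxed_plan_length + penalty
```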
Optimizations in Heuristic Computation
• Taming Space/Time costs
• Bi-level Planning Graph representation
• Partial expansion of the PG (stop before level-off)
  – It is FINE to cut corners when using the PG for heuristics (instead of search)!!
Example: goals C, D are present
[Figure: bi-level planning graph over propositions A-E; the levels beyond the partial expansion are discarded]
Heuristic extracted from partial graph vs. levelled graph:
[Plot: time in seconds (log scale) over ~160 problems, comparing Levels-off and Lev(S)]
• Select actions in lev(S) vs. levels-off (a trade-off)
  – Branching factor can still be quite high
  – Use actions appearing in the PG
AltAlt Performance
[Plots: Logistics and Schedule domains (problem sets from IPC 2000 / AIPS-00); time in seconds (log scale) per problem, comparing STAN3.0, HSP2.0, and AltAlt1.0]
Even Parallel Plans aren't safe..
• Serial graph over-estimates: use a “parallel” rather than serial PG as the basis for heuristics
• Projection over sets of actions is too costly: select the branch with the best action and fatten it
• Use “push-up” to make the partial plans more parallel
[AltAltp architecture: Action Templates and the Problem Spec (Init, Goal state) feed a Parallel Planning Graph built by Graphplan's plan-extension phase (based on STAN); heuristics and the actions in the last level are extracted from it and drive node expansion (fattening), node ordering and selection, and the plan compression algorithm (PushUp)]
[Plots: solution plan length (steps) per problem in Logistics (AIPS-00) and ZenoTravel (AIPS-02), comparing AltAltp with STAN, TP4, Blackbox, LPG (2nd), AltAlt, and AltAlt with post-processing]
[JAIR 2003]
II. PG heuristics for Graphplan..
PG Heuristics for Graphplan(!)
• Goal/Action Ordering Heuristics for Backward Search
  – Propositions are ordered for consideration in decreasing value of their levels.
  – Actions supporting a proposition are ordered for consideration in increasing value of their costs.
    • Cost of an action = 1 + cost of its set of preconditions
[AIPS 2000]
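As an illustration of these ordering heuristics, the following sketch (reusing `lev_set` and `h_sum` from the earlier sketches) orders goal propositions by decreasing level and supporting actions by increasing cost; costing the precondition set with the sum heuristic is one plausible choice, not necessarily the exact one GP-HSP uses.

```python
def order_goals_for_backward_search(prop_levels, mutex_levels, goals, leveled_off):
    """Consider propositions in decreasing order of their levels (hardest first)."""
    lev = lambda p: lev_set(prop_levels, mutex_levels, {p}, leveled_off)
    return sorted(goals, key=lev, reverse=True)

def action_cost(prop_levels, mutex_levels, preconditions, leveled_off):
    """Cost of an action = 1 + cost of its precondition set (costed here with h_sum)."""
    return 1 + h_sum(prop_levels, mutex_levels, preconditions, leveled_off)

def order_supporting_actions(prop_levels, mutex_levels, supporters, leveled_off):
    """supporters: dict mapping action name -> its precondition set;
    actions supporting a proposition are tried in increasing order of cost."""
    cost = lambda a: action_cost(prop_levels, mutex_levels, supporters[a], leveled_off)
    return sorted(supporters, key=cost)
```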
• Use of level heuristics improves the performance significantly.
  – The heuristics are surprisingly insensitive to the length of the planning graph:

MOP            +3 levels          +5 levels          +10 levels
Problem        L         T        L         T        L         T
bw-large-a     12/12     0.007    12/12     0.008    12/12     0.01
bw-large-b     18/18     0.21     18/18     0.21     18/18     0.25
bw-large-c     28/28     4.13     28/28     4.18     28/28     7.4
huge-fct       18/18     0.01     18/18     0.02     18/18     0.02
bw-prob04      -         >30      -         >30      -         >30
rocket-a       8/29      0.006    8/29      0.007    8/29      0.009
rocket-b       9/32      0.01     9/32      0.01     9/32      0.01
(L = solution length in steps/actions, T = CPU time in seconds; columns show the planning graph grown 3, 5, and 10 extra levels)
…And then state-space heuristics for Graphplan (PEGG)
1: Capture a state space view of Graphplan's search in a search trace
[Figure: Graphplan's backward search over planning-graph proposition levels 0-6, shown as a trace of regressed ‘states’ and their action assignments, working from the goal back toward the initial state; if no solution is found, the graph is extended]
…And then state-space heuristics for Graphplan
[Figure: a larger search trace of regressed states over proposition levels 0-7]
PEGG now competitive with a heuristic state space planner
[Table: CPU seconds and steps/actions for Graphplan, GP-e, PEGG and PEGG-so (with the adjusum2 and combo heuristics), and AltAlt (Lisp version), on problems from AIPS 1998 and AIPS 2002: bw-large-b/c/d, rocket-ext-a, att-log-a, Gripper-15/20, Tower-9, 8puzzle-1/2, grid-y-1, mprime-1, driverlog-2-3-6]
[IJCAI 2003]
III. PG Heuristics for PO Planners
In the beginning it was all POP. Then it was cruelly UnPOPped. The good times return with Re(vived)POP.
POP Algorithm
1. Plan Selection: Select a plan P from the search queue
2. Flaw Selection: Choose a flaw f (open condition or unsafe link)
3. Flaw Resolution:
   – If f is an open condition, choose an action S that achieves f
   – If f is an unsafe link, choose promotion or demotion
   – Update P; return NULL if no resolution exists
4. If there is no flaw left, return P
1. Initial plan: S0 and Sinf, with open goals g1, g2 at Sinf
2. Plan refinement (flaw selection and resolution):
[Figure: partial plan with steps S1, S2, S3 supporting g1 and g2, open conditions oc1 and oc2, and a threat from an effect ~p]
Choice points:
• Flaw selection (open condition? unsafe link? a non-backtrack choice)
• Flaw resolution / Plan selection (how to select (rank) the partial plan?)
PG Heuristics for Partial Order Planning
• Distance heuristics to estimate the cost of partially ordered plans (and to select flaws)
  – If we ignore negative interactions, then the set of open conditions can be seen as a regression state (a minimal sketch follows the figure below)
[Figure: partial plan with steps S0-S5 and Sinf; open conditions q1, p, q, r, g2 and a threatening effect ~p]
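A minimal sketch of the ranking idea mentioned above: treat the open conditions of a partial plan as a regression state and cost them with a PG heuristic (here the set-level value, reusing `lev_set` from the earlier sketch; the sum or adjusted-sum variants could be plugged in instead).

```python
def h_partial_plan(prop_levels, mutex_levels, open_conditions, leveled_off):
    """Ignoring negative interactions, the open conditions act as a regression state;
    cost that set with a planning-graph heuristic to rank the partial plan."""
    return lev_set(prop_levels, mutex_levels, set(open_conditions), leveled_off)
```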
• Mutexes used to detect indirect conflicts in partial plans
  – A step threatens a causal link if there is a mutex between the link condition and one of the step's effects or preconditions
  – Post disjunctive precedences and use propagation to simplify
For a link Si --p--> Sj and a step Sk with a condition or effect q or r:
  if mutex(p, q) or mutex(p, r), post (Sk ≺ Si) ∨ (Sj ≺ Sk)
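The following sketch shows one way to realize the mutex-based threat test and the disjunctive precedences; the data structures (causal links as (Si, p, Sj) triples, a `mutex` predicate derived from the planning graph) are illustrative assumptions, not RePOP's actual implementation.

```python
def indirect_threats(causal_links, steps, mutex):
    """causal_links: list of (Si, p, Sj) meaning Si --p--> Sj.
    steps: dict mapping a step to the set of its preconditions and effects.
    mutex(p, q): True if p and q can never co-occur (read off the planning graph).
    Returns disjunctive precedence constraints: for each indirect threat,
    either (Sk before Si) or (Sj before Sk) must hold."""
    constraints = []
    for (si, p, sj) in causal_links:
        for sk, conds in steps.items():
            if sk in (si, sj):
                continue
            if any(mutex(p, q) for q in conds):
                constraints.append(((sk, si), (sj, sk)))   # Sk < Si  OR  Sj < Sk
    return constraints
```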
RePOP's Performance
• RePOP implemented on top of UCPOP
  – Dramatically better than any other partial order planner
  – Competitive with Graphplan and AltAlt
  – VHPOP carried the torch at IPC 2002
Problem       UCPOP     RePOP        Graphplan   AltAlt
Gripper-8     -         1.01         66.82       0.43
Gripper-10    -         2.72         47 min      1.15
Gripper-20    -         81.86        -           15.42
Rocket-a      -         8.36         75.12       1.02
Rocket-b      -         8.17         77.48       1.29
Logistics-a   -         3.16         306.12      1.59
Logistics-b   -         2.31         262.64      1.18
Logistics-c   -         22.54        -           4.52
Logistics-d   -         91.53        -           20.62
Bw-large-a    45.78     (5.23) -     14.67       4.12
Bw-large-b    -         (18.86) -    122.56      14.14
Bw-large-c    -         (137.84) -   -           116.34
(times in CPU seconds; “-” means no solution found)
Written in Lisp, runs on Linux, 500MHz, 250MB
You see, pop, it is possible to Re-use all the old POP work!
[IJCAI, 2001]
IV. PG Heuristics for Metric Temporal Planning
[Sapa architecture: from the planning problem, generate the start state and maintain a queue of time-stamped states; repeatedly select the state with the lowest f-value (f can have both cost and makespan components); if it satisfies the goals, partialize the p.c. plan and return the o.c. and p.c. plans, otherwise expand the state by applying actions. Heuristic estimation: build the RTPG, propagate cost functions, extract a relaxed plan, and adjust for mutexes and resources]
[ECP 2001; AIPS 2002; ICAPS 2003; JAIR 2003]
Multi-Objective Nature of MTP
• Plan quality in metric temporal domains is inherently multi-dimensional
  – Temporal quality (e.g. makespan, slack)
  – Plan cost (e.g. cumulative action cost, resource consumption)
• Necessitates multi-objective search
  – Modeling objective functions
  – Tracking different quality metrics and heuristic estimation
  – Challenge: inter-dependencies between different quality metrics (typically cost will go down with higher makespan…)
[Example map: Tempe, Phoenix, L.A.]
SAPA’s approach
• Use a temporal version of the Planning Graph (Smith & Weld) structure to track the time-sensitive cost function:
  – Estimation of the earliest time (makespan) to achieve all goals
  – Estimation of the lowest cost to achieve the goals
  – Estimation of the cost to achieve the goals given a specific makespan value
• Use this information to calculate the heuristic value for the objective function involving both time and cost
  – Challenge: How to propagate cost over planning graphs?
[Example: getting from Tempe to Los Angeles: Drive-car(Tempe,LA); or Shuttle(T,P) / Heli(T,P) to Phoenix followed by Airplane(P,LA); relevant time points t = 0, 0.5, 1, 1.5, 10]
Search through time-stamped states
A time-stamped state is S = (P, M, Π, Q, t), where:
• P: set of pairs <pi, ti> of predicates pi and the time ti < t of their last achievement
• M: set of functions representing resource values
• Π: set of protected persistent conditions (could be binary or resource conditions)
• Q: event queue (contains resource as well as binary fluent events)
• t: time stamp of S
• Goal Satisfaction: S = (P, M, Π, Q, t) ⊨ G if for each <pi, ti> ∈ G either:
  – ∃ <pi, tj> ∈ P with tj < ti and no event in Q deletes pi, or
  – ∃ e ∈ Q that adds pi at time te < ti
• Action Application: Action A is applicable in S if:
  – All instantaneous preconditions of A are satisfied by P and M
  – A’s effects do not interfere with Π and Q
  – No event in Q interferes with persistent preconditions of A
  – A does not lead to concurrent resource change
• When A is applied to S:
  – P is updated according to A’s instantaneous effects
  – Persistent preconditions of A are put in Π
  – Delayed effects of A are put in Q
[Example durative action Flying: condition (in-city ?airplane ?city1), (fuel ?airplane) > 0; effect (in-city ?airplane ?city2); consumes (fuel ?airplane)]
Search: Pick a state S from the queue. If S satisfies the goals, end. Else non-deterministically do one of:
  – Advance the clock (by executing the earliest event in QS)
  – Apply one of the applicable actions to S
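A small sketch of the time-stamped state and the goal-satisfaction test described above; the concrete field types and the event-queue encoding are assumptions made for illustration.

```python
from dataclasses import dataclass

@dataclass
class TimeStampedState:
    """S = (P, M, Pi, Q, t) as used in the forward search (field encodings assumed)."""
    P: dict    # proposition -> time of its last achievement (all < t)
    M: dict    # resource name -> current value
    Pi: set    # protected persistent conditions
    Q: list    # event queue: (event_time, 'add' or 'delete', proposition)
    t: float   # time stamp of S

def satisfies_goals(S, goals):
    """goals: dict proposition -> deadline ti.  S |= G if every goal pi is already
    achieved before ti and never deleted by a queued event, or some queued event
    adds it at a time te < ti."""
    for p, ti in goals.items():
        achieved = (p in S.P and S.P[p] < ti and
                    not any(kind == 'delete' and prop == p for _, kind, prop in S.Q))
        queued = any(kind == 'add' and prop == p and te < ti for te, kind, prop in S.Q)
        if not (achieved or queued):
            return False
    return True
```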
Propagating Cost Functions
[Example: Tempe to L.A. via Phoenix]
Shuttle(Tempe,Phx): Cost: $20; Time: 1.0 hour
Helicopter(Tempe,Phx): Cost: $100; Time: 0.5 hour
Car(Tempe,LA): Cost: $100; Time: 10 hours
Airplane(Phx,LA): Cost: $200; Time: 1.0 hour
[Plot: cost functions over time, e.g. Cost(At(LA)) = $300 at t = 1.5, $220 at t = 2, $100 at t = 10; Cost(At(Phx)) feeds into the cost of Flight(Phx,LA)]
Issues in Cost Propagation
Costing a set of literals:
• Cost(f,t) = min {Cost(A,t) : f ∈ Effect(A)}
• Cost(A,t) = Aggregate(Cost(f,t) : f ∈ Pre(A))
  – Aggregate can be Sum or Max
  – The set-level idea would entail tracking costs of subsets of literals
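A minimal sketch of this propagation, with the time dimension dropped for brevity (the talk propagates full cost functions Cost(f, t) over the temporal graph); folding the action's own execution cost into Cost(A) is an added assumption, and the action encoding is illustrative.

```python
import math

def propagate_costs(actions, init_props, aggregate=sum, max_iters=100):
    """Fixed-point propagation of the two equations above on a relaxed, delete-free graph:
      Cost(f) = min { Cost(A) : f in Effect(A) }   (0 for initially true facts)
      Cost(A) = exec_cost(A) + Aggregate(Cost(f) : f in Pre(A))
    actions: list of (name, preconditions, effects, exec_cost) tuples."""
    cost = {p: 0.0 for p in init_props}
    for _ in range(max_iters):
        changed = False
        for name, pre, eff, c_exec in actions:
            pre_costs = [cost.get(p, math.inf) for p in pre]
            c_a = c_exec + (aggregate(pre_costs) if pre_costs else 0.0)
            for f in eff:
                if c_a < cost.get(f, math.inf):
                    cost[f] = c_a
                    changed = True
        if not changed:
            break   # fix-point: no proposition's cost can be improved any further
    return cost
```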
Termination Criteria
• Deadline Termination: Terminate at time point t if:
  – ∀ goal G: Deadline(G) ≤ t, or
  – ∃ goal G: (Deadline(G) < t) ∧ (Cost(G,t) = ∞)
• Fix-point Termination: Terminate at the time point t where we can no longer improve the cost of any proposition.
• K-lookahead approximation: At the t where Cost(g,t) < ∞, repeat the process of applying (sets of) actions that can improve the cost functions k times.
Heuristics based on cost functions
Direct:
• If we want to minimize makespan: h = t0
• If we want to minimize cost: h = CostAggregate(G, t∞)
• If we want to minimize a function f(time, cost) of cost and makespan: h = min f(t, Cost(G,t)) s.t. t0 ≤ t ≤ t∞
  – E.g. for f(time,cost) = 100·makespan + Cost, h = 100×2 + 220 at t0 ≤ t = 2 ≤ t∞
Using a Relaxed Plan:
• Extract a relaxed plan using h as the bias
  – If the objective function is f(time,cost), then the action A (to be added to RP) is selected such that f(t(RP+A), C(RP+A)) + f(t(Gnew), C(Gnew)) is minimal, where Gnew = (G ∪ Precond(A)) \ Effects(A)
[Plot: Cost(At(LA)) vs. time, marking the time of earliest achievement t0 = 1.5 (cost $300), t = 2 (cost $220), and the time of lowest cost t∞ = 10 (cost $100)]
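A sketch of the "direct" heuristic above: minimize the objective f over [t0, t∞] using the propagated cost function of the goals. Sampling the interval at a fixed step is an illustrative simplification; the real cost functions are step functions, so only their breakpoints need to be examined.

```python
import math

def h_direct(cost_of_goals, f, t0, t_inf, step=0.5):
    """h = min over t in [t0, t_inf] of f(t, Cost(G, t)).
    cost_of_goals(t): aggregated cost of achieving all goals by time t (a step function).
    f(time, cost): the objective, e.g. lambda m, c: 100 * m + c."""
    best = math.inf
    t = t0
    while t <= t_inf + 1e-9:
        best = min(best, f(t, cost_of_goals(t)))
        t += step
    return best

# Travel example from the slides: Cost(At(LA)) is $300 from t=1.5, $220 from t=2, $100 from t=10
cost_at_la = lambda t: 300 if t < 2 else (220 if t < 10 else 100)
h = h_direct(cost_at_la, lambda m, c: 100 * m + c, t0=1.5, t_inf=10)
# h == 420 (= 100*2 + 220), the value quoted on the slide for t = 2
```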
Phased Relaxation
The relaxed plan can be adjusted to take into account constraints that were originally ignored.
Adjusting for Mutexes:
Adjust the makespan estimate of the relaxed plan by marking actions that are mutex (and thus cannot be executed concurrently).
Adjusting for Resource Interactions:
Estimate the number of additional resource-producing actions needed to make up for any resource shortfall in the relaxed plan:
C ← C + Σ_R ⌈(Con(R) - (Init(R) + Pro(R))) / ΔR⌉ × C(A_R)
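A small sketch of the resource adjustment; the field names and the use of a ceiling on the shortfall/ΔR ratio are assumptions about how the formula is applied, not taken verbatim from the slide.

```python
import math

def resource_adjustment(relaxed_plan_cost, resources):
    """Adjust the relaxed-plan cost for resource shortfalls, following
    C <- C + sum over R of ceil((Con(R) - (Init(R) + Pro(R))) / Delta_R) * C(A_R).
    resources: dict mapping a resource name to a dict with keys
      'con'  (amount consumed by the relaxed plan), 'init' (initial level),
      'pro'  (amount produced by the relaxed plan),
      'delta' (amount produced by one extra producing action A_R),
      'cost'  (cost C(A_R) of that producing action)."""
    adjusted = relaxed_plan_cost
    for r in resources.values():
        shortfall = r["con"] - (r["init"] + r["pro"])
        if shortfall > 0:
            adjusted += math.ceil(shortfall / r["delta"]) * r["cost"]
    return adjusted
```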
Handling Cost/Makespan Tradeoffs
[Plots: total cost variation and makespan variation as the weight α ranges over 0.1-1]
Results over 20 randomly generated temporal logistics problems that involve moving 4 packages between different locations in 3 cities:
O = f(time, cost) = α·Makespan + (1 - α)·TotalCost
SAPA at IPC-2002
[Plots: results in the Satellite (complex setting) and Rover (time setting) domains]
[JAIR 2003]
[CAltAlt architecture: problems in IPC PDDL are parsed into clausal states that feed an A* search engine (HSP-r); the search is guided by heuristics extracted from planning graph(s) (IPP) with condition labels (CUDD); plans are validated by a model checker (NuSMV); off-the-shelf components are combined with custom ones]
V. PG Heuristics for Conformant Planning
Conformant Planning as Regression
Actions:
  A1: M, P => K
  A2: M, Q => K
  A3: M, R => L
  A4: K => G
  A5: L => G
Initially: (P V Q V R) & (~P V ~Q) & (~P V ~R) & (~Q V ~R) & M
Goal State: G

Regression from the goal:
  G  --A4-->  (G V K)        [G or K must be true before A4 for G to be true after A4]
     --A5-->  (G V K V L)
     --A1-->  (G V K V L V P) & M
     --A2-->  (G V K V L V P V Q) & M
     --A3-->  (G V K V L V P V Q V R) & M
Each clause is satisfied by a clause in the initial clausal state -- done! (5 actions)
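The termination test used above ("each clause is satisfied by a clause in the initial clausal state") can be written as a simple subset check over clauses represented as sets of literals; the representation is an assumption for illustration.

```python
def regression_done(regressed_clauses, initial_clauses):
    """The regressed clausal state is reached when each of its clauses is satisfied by
    some clause of the initial clausal state: an initial clause C_I entails a regressed
    clause C_G whenever C_I is a subset of C_G (literals written as 'P', '~P', ...)."""
    return all(any(ci <= cg for ci in initial_clauses) for cg in regressed_clauses)

# Slide example: regressed state (G V K V L V P V Q V R) & M vs. the initial clauses
init = [frozenset({'P', 'Q', 'R'}), frozenset({'~P', '~Q'}), frozenset({'~P', '~R'}),
        frozenset({'~Q', '~R'}), frozenset({'M'})]
goal = [frozenset({'G', 'K', 'L', 'P', 'Q', 'R'}), frozenset({'M'})]
assert regression_done(goal, init)   # every regressed clause is covered, so we are done
```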
Using a Single, Unioned Graph
Union the literals from all the initial states into a single conjunctive initial graph level.
[Figure: unioned planning graph over P, Q, R, M with actions A1-A5 reaching K, L and G; heuristic estimate = 2]
• Easy to implement
• Not effective
  – Loses world-specific support information
  – Incorrect mutexes
Using Multiple Graphs
P
A1
M
A1
M
K
P
M
Q
M
P
P
M
A4
K
•Accurate
Mutexes
•Moderate
Implementation
Difficulty
G
Q
A2
M
Q
A2
M
M
K
R
M
Q
A4
K
G
R
R
A3
M
R
A3
M
M
L
L
A5
G
•Memory
Intensive
•Heuristic
Computation
Can be costly
Unioning these
graphs a priori
would give
much savings
…
Using a Single, Labeled Graph
The label of a literal signifies the set of worlds in which it is supported (full support means all initial worlds).
• Action labels: conjunction of the labels of the supporting literals
• Literal labels: disjunction of the labels of the supporting actions
[Figure: labelled planning graph over P, Q, R, M, K, L, G with actions A1-A5; label key: ~P & ~Q, ~P & ~R, ~Q & ~R, (~P & ~R) V (~Q & ~R), (~P & ~R) V (~Q & ~R) V (~P & ~Q), True; heuristic value = 5]
• Memory efficient
• Cheap heuristics
• Scalable, extensible
• Tricky to implement
  – Benefits from BDDs, a model checker, an ATMS
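A sketch of the label-propagation idea, with labels represented extensionally as sets of initial worlds rather than as formulas or BDDs (an illustrative simplification of the labelled graph); it returns the first level at which every goal literal has full support, the quantity the label-based heuristics build on.

```python
def labelled_level(actions, init_worlds, init_labels, goals, max_levels=50):
    """Return the first graph level at which every goal literal is supported in ALL
    initial worlds (full support), or None if that never happens within max_levels.
    init_labels: literal -> set of initial worlds in which it holds.
    actions: list of (preconditions, effects) pairs (delete lists ignored, as in the PG)."""
    full = set(init_worlds)
    labels = {lit: set(ws) for lit, ws in init_labels.items()}
    for level in range(max_levels + 1):
        if all(labels.get(g, set()) == full for g in goals):
            return level
        new = {lit: set(ws) for lit, ws in labels.items()}   # noop/persistence support
        for pre, eff in actions:
            a_label = full.copy()
            for p in pre:
                a_label &= labels.get(p, set())               # conjunction of labels
            for e in eff:
                new[e] = new.get(e, set()) | a_label          # disjunction over supporters
        labels = new
    return None
```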
CAltAlt Performance
• Label-graph based heuristics make CAltAlt competitive with the current best approaches
[Plots: time (ms, log scale) and plan length on Rovers and Logistics domain problems 1-4, comparing the Single Sum, Multi Level, Multi RP Union, Label Level and Label RP heuristics with HSCP, GPT, CGP and KACMBP]
The Damage until now..
– Classical (regression) planning
  – AltAlt (AAAI 2000; AIJ 2002); AltAltp (JAIR 2003)
    • Serial vs. Parallel graphs; Level and Adjusted heuristics; Partial expansion
– Graphplan style search
  – GP-HSP (AIPS 2000); PEGG (IJCAI 2003; AAAI 1999)
    • Variable/Value ordering heuristics based on PG distances
– Partial order planning
  – RePOP (IJCAI 2001)
    • Mutexes used to detect Indirect Conflicts
– Metric Temporal Planning
  – Sapa (ECP 2001; AIPS 2002; JAIR 2003)
    • Propagation of cost functions; Phased relaxation
– Conformant Planning
  – CAltAlt (ICAPS Uncertainty Wkshp, 2003)
    • Multiple graphs; Labelled graphs

Still to come: PG Heuristics for:
• Probabilistic Conformant Planning
• Conditional Planning
• Lifted Planning
• Trans-Atlantic camaraderie
• Post-war reconstruction
• Middle-east peace…
Meanwhile outside Tempe…
• Hoffmann’s FF uses relaxed plans from the PG
• Geffner & Haslum derive DP-versions of PG heuristics
• Gerevini & Serina’s LPG uses PG heuristics to cost the various repairs
• Smith back-propagates (convolves) probability distributions over the PG to decide the contingencies worth focusing on
• Trinquart proposes a PG-clone that directly computes reachability in plan-space…
• …
Why do we love PG Heuristics?
• They work!
• They are “forgiving”
  – You don't like doing mutex? Okay.
  – You don't like growing the graph all the way? Okay.
• Allow propagation of many types of information
  – Level, subgoal interaction, time, cost, world support, …
• Support phased relaxation
  – E.g. ignore mutexes and resources and bring them back later…
• Graph structure supports other synergistic uses
  – e.g. action selection
• Versatility…
  – PG Variations: Serial; Parallel; Temporal; Labelled
  – Propagation Methods: Level; Mutex; Cost; Label
Versatility of PG Heuristics
• Planning Problems
  – Classical
  – Resource/Temporal
  – Conformant
• Planners
  – Regression
  – Progression
  – Partial Order
  – Graphplan-style