Factored Planning: How, When, and When Not


Working with Preferences
Ronen Brafman
Computer Science Department
Ben-Gurion University
Goal of our Work: Provide tools and build systems that can make, or can help users make, good choices.
Such systems must understand the user’s preferences to be useful.
Applications of Interest

optimal product/item selection
Select a flight, a camera, a movie, …
optimal product configuration
Configure a PC, a vacation, a car, …
personalization
Personalize content, interface, …
guiding program choices
Program will make choices based on a preference model provided by the designer
We target lay users and application designers with no background in decision theory
Focus of this Talk: Provide an overview of an approach to preference handling based on the xCP-net models and algorithms, and discuss some applications
[Figure: elicitation and optimization loop. Labels: preference specification, manufacturer constraints, constrained optimization process, optimal automobile, feedback, spec revision, updated choice.]
The Key Ideas

Preference elicitation:
Use natural-language-like statements: make it simple
 Support heterogeneous types of statements
 Make scaling up to quantitative assessment possible


Calculating optimal choices:
Efficient methods for ordering sets
 Algorithms for constrained optimization
 Efficient data-driven incremental elicitation

Elicitation and Modeling
Natural Preference Statements

Value preferences:
Strict: I prefer an aisle seat to a window seat
Conditional: I prefer aisle to window in economy class

Attribute importance:
Strict: Seat assignment is more important than airline
Conditional: Seat assignment is less important than airline in domestic flights

The Ceteris Paribus Semantics


Ceteris paribus = Latin: all else being equal
“I prefer an aisle seat to a window seat” means:
Given two flights that differ only in seat assignment, I prefer the one with an aisle seat.
Otherwise, no preference can be inferred.
“Seat assignment is less important than airline in domestic flights” means:
Given two domestic flights that differ in seat assignment and airline only, I prefer the one with the better airline.
It says nothing about two international flights.
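
The ceteris paribus reading can be sketched in a few lines of code. This is a minimal illustration, not part of the original talk: the function name `ceteris_paribus_compare` and the dictionary encoding of flights are assumptions made for the example.

```python
def ceteris_paribus_compare(x, y, variable, order):
    """Compare outcomes x and y under one unconditional value preference.

    `order` lists the values of `variable` from most to least preferred.
    Returns 1 if x is preferred, -1 if y is preferred, 0 if they agree on
    `variable`, and None when the outcomes differ elsewhere, in which case
    the statement alone licenses no conclusion.
    """
    others_equal = all(x[k] == y[k] for k in x if k != variable)
    if not others_equal:
        return None
    rank_x, rank_y = order.index(x[variable]), order.index(y[variable])
    return (rank_x < rank_y) - (rank_x > rank_y)


# Hypothetical flights, encoded as attribute dictionaries.
f1 = {"seat": "aisle", "airline": "KLM"}
f2 = {"seat": "window", "airline": "KLM"}
f3 = {"seat": "window", "airline": "BA"}

seat_order = ["aisle", "window"]  # "I prefer an aisle seat to a window seat"
print(ceteris_paribus_compare(f1, f2, "seat", seat_order))  # differ only in seat
print(ceteris_paribus_compare(f1, f3, "seat", seat_order))  # airline differs too
```

The second call returns `None` because the two flights also differ in airline, so the seat statement by itself says nothing about them.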
CP-nets (Boutilier, Brafman, Hoos, Poole 99; Boutilier, Brafman, Domshlak, Hoos, Poole 04)
A qualitative, graphical model of preferences that captures and organizes statements of conditional value preference.
Each node represents a domain variable.
Parents(v) are those variables that affect the user’s preference over the values of v
Example: Parents(class) = {airline}
A conditional preference table (CPT) is associated with every node in the CP-net
It provides an ordering over the values of the node for every possible parent context
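
One possible way to hold a node and its CPT in code is sketched below. The class name `CPNode`, its fields, and the concrete orderings are hypothetical; only the relation Parents(class) = {airline} is taken from the slide.

```python
from dataclasses import dataclass


@dataclass
class CPNode:
    name: str
    parents: tuple  # names of the variables this node's preferences depend on
    cpt: dict       # parent context (tuple of values) -> values, best first

    def ordering(self, assignment):
        """Return this node's value ordering in the context set by `assignment`."""
        context = tuple(assignment[p] for p in self.parents)
        return self.cpt[context]


# Hypothetical tables; only Parents(class) = {airline} comes from the slide.
airline = CPNode("airline", (), {(): ["KLM", "BA"]})
flight_class = CPNode(
    "class",
    ("airline",),
    {("KLM",): ["economy", "business"], ("BA",): ["business", "economy"]},
)

print(flight_class.ordering({"airline": "BA"}))
```

Keying the table by the tuple of parent values keeps the lookup for any parent context to a single dictionary access.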
Example of a CP-net
[Figure: example CP-net over binary variables A, B, C, D, E, F. A and B have unconditional tables; C's table is conditioned on A and B; D's and E's tables are conditioned on C; F's table is conditioned on D.]
Semantics and Consistency
Any acyclic CP-net defines a (consistent)
partial order over the outcome space.
[Figure: the CP-net over A, B, C and the partial order it induces over the eight outcomes, from worst to best.]
Importance Relations
[Figure: the network over A, B, C with an importance relation added, and the resulting order over the eight outcomes, from worst to best.]
More Complex Example
[Figure: flight network over Day of the flight, Departure Time, Airline, Stop-overs, and Class, shown three times as the model is built up. Preference symbols were lost in extraction; each row below is assumed to list the more preferred value first.]
Day of the flight: 1d, 2d
Airline: ba, klm
Departure Time: 1d: m, n; 2d: n, m
Stop-overs: m: 1l, 0l; n: 0l, 1l
Class: m: b, e; n: e, b
Conditional importance between Stop-overs (S) and Class (C), last version only (relation symbol lost): m ∧ klm: S, C; n ∧ ba: S, C; m ∧ ba: C, S
[Figure: anatomy of a TCP-net over variables A, B, C, D, E. Labels: nodes correspond to variables; cp-arcs (directed); i-arcs (directed); ci-arcs (undirected), here with selector set S(C, D) = {B, E}; cp-tables; ci-tables.]
Role of Graphical Structure

Users need not be aware of the underlying
graphical structure
User employs statement templates or some other
input interface
 System constructs the network on the fly


Graphical structure important for:
Analysis – query complexity related to structure
 Algorithms – often use topological sort

Preferences and Plausibility
CP = Conditional Plausibility (Jerome Lang)
“p is more plausible than ¬p given q” (ceteris paribus):
Given two worlds satisfying q that are identical except for the value of p, the one satisfying p is more plausible than the one satisfying ¬p.
“p is more important, plausibility-wise, than q” (ceteris paribus):
Worlds in which p has its more plausible value are more plausible than worlds in which q has its more plausible value
UCP-Nets
[Boutilier, Bacchus, & Brafman]
Quantified CPT
Simplified semantics: sum of utility factors
U(abcd) = fA(a) + fB(b) + fC(abc) + fD(cd) = 5 + 4 + .6 + .9 = 10.5
Linear time computation
Linear time comparison

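[Figure: UCP-net over A, B, C, D with the factor tables below. Negation bars were lost in extraction; the second value in each pair is assumed to be the negated one.]
fA: a = 5, ¬a = 2
fB: b = 4, ¬b = 3
fC:   ab    a¬b   ¬ab   ¬a¬b
 c    .6    .2    .3    .9
¬c    .1    .8    .8    .3
fD:   d     ¬d
 c    .9    .8
¬c    .2    .3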
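
The additive semantics can be checked with a short sketch. The factor values are read from the slide's tables; which entries belong to negated values is an assumption (the negation bars did not survive extraction), and the names `utility` and `prefer` are illustrative only.

```python
# Factor tables as read from the slide. Assumption: in each pair the first
# entry belongs to the un-negated value (True) and the second to the negated
# one (False); the negation bars did not survive extraction.
f_A = {True: 5.0, False: 2.0}
f_B = {True: 4.0, False: 3.0}
# f_C is keyed by (a, b, c).
f_C = {
    (True, True, True): 0.6, (True, False, True): 0.2,
    (False, True, True): 0.3, (False, False, True): 0.9,
    (True, True, False): 0.1, (True, False, False): 0.8,
    (False, True, False): 0.8, (False, False, False): 0.3,
}
# f_D is keyed by (c, d).
f_D = {
    (True, True): 0.9, (True, False): 0.8,
    (False, True): 0.2, (False, False): 0.3,
}


def utility(a, b, c, d):
    """Sum the local factors; one table lookup per variable."""
    return f_A[a] + f_B[b] + f_C[(a, b, c)] + f_D[(c, d)]


def prefer(x, y):
    """Compare two complete outcomes by comparing their utilities."""
    ux, uy = utility(*x), utility(*y)
    return (ux > uy) - (ux < uy)


print(round(utility(True, True, True, True), 2))
```

Evaluating an outcome takes one lookup per variable, and comparing two outcomes takes two evaluations, which is where the linear time bounds on the slide come from.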
Compiling Diverse
Information into UCP-nets

Diverse statements can be compiled into a UCP-net:
Independence information: maintained by graph
Value preferences and variable preferences
Two-way comparisons of complete outcomes
Quantitative information
Simple compilation based on linear constraints
Good for incremental refinement:
Start with a qualitative model: no cold start!
Refine with user feedback/additional observations
Using (T)CP-nets Algorithms
1. Selecting the best element
2. Ordering and constrained optimization
1. Preferential Optimization
Finding the preferentially optimal outcome for an acyclic
network is straightforward!
[Figure: forward sweep over the example CP-net. Variables are assigned in the order A, B, C, D, F, E, each taking its most preferred value given its parents, yielding the outcome a, b, c, d, f, e.]
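
The forward sweep for an acyclic network can be sketched as follows. The network structure follows the example (A and B are roots, C depends on A and B, D and E on C, F on D); which value each table row prefers is an assumption, and `best_outcome` is a name chosen for this sketch.

```python
def best_outcome(order, parents, cpt):
    """Forward sweep: assign each variable its best value given its parents.

    `order` must list every variable after all of its parents.
    """
    outcome = {}
    for var in order:
        context = tuple(outcome[p] for p in parents[var])
        outcome[var] = cpt[var][context][0]  # orderings are stored best first
    return outcome


# Structure of the example network; which value is preferred in each row is
# an assumption, since negation bars were lost in the extracted slide.
parents = {"A": (), "B": (), "C": ("A", "B"), "D": ("C",), "E": ("C",), "F": ("D",)}
same = [True, False]    # prefer the un-negated value
flip = [False, True]    # prefer the negated value
cpt = {
    "A": {(): same},
    "B": {(): same},
    "C": {(True, True): same, (False, False): same,
          (True, False): flip, (False, True): flip},
    "D": {(True,): same, (False,): flip},
    "E": {(True,): same, (False,): flip},
    "F": {(True,): same, (False,): flip},
}

print(best_outcome(["A", "B", "C", "D", "F", "E"], parents, cpt))
```

Each variable is visited once and never revisited, so the sweep is linear in the number of variables.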
2. Preference-Based
Constrained Optimization (PCO)

Given:
User preferences
 An implicitly specified set of feasible options



Find one/all/k optimal, feasible element(s)
Applications:
Product configuration (PC, vacation)
 Optimal plan selection
 Content/display adaptation and personalization

Solving PCO:
Ordered Generate & Test

Generate outcomes in non-increasing order of preference
= linearize the partial order over all possible elements


Test for feasibility
Check for optimality:
First feasible outcome is optimal!
 Need more?

Maintain set of optimal solutions
 New solution is optimal if not dominated by any
previously generated solution
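
A minimal sketch of ordered generate and test follows. The feasibility and dominance tests are passed in as functions because the slides do not fix them; the Pareto comparison in the demo is a stand-in preference model chosen for brevity, not a CP-net dominance test, and all names are hypothetical.

```python
def ordered_generate_and_test(outcomes, feasible, dominates, k=None):
    """Collect optimal feasible outcomes from a non-increasing sequence.

    `outcomes` must be a linearization of the preference order (no outcome
    appears before one that dominates it). `feasible` and `dominates` are
    caller-supplied predicates; `k` caps the number of solutions returned.
    """
    optimal = []
    for o in outcomes:
        if not feasible(o):
            continue
        # The first feasible outcome always passes this check vacuously.
        if not any(dominates(s, o) for s in optimal):
            optimal.append(o)
            if k is not None and len(optimal) >= k:
                break
    return optimal


# Stand-in preference model for the demo (an assumption, not a CP-net):
# outcomes are rank pairs, lower is better, compared by Pareto dominance.
def pareto_dominates(s, o):
    return s != o and all(a <= b for a, b in zip(s, o))


candidates = sorted([(1, 1), (0, 1), (1, 0), (0, 0)], key=sum)  # a linearization
print(ordered_generate_and_test(candidates, lambda o: o != (0, 0), pareto_dominates))
```

Checking a new solution only against the solutions already kept is enough: anything that dominates it and is feasible is either kept itself or dominated by something that is.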

Generating a Non-increasing
Sequence of Outcomes


Topologically sort the variables
Build an assignment (search) tree by instantiating variables according to this order
Order variable values based on the CPT
Leaf nodes, ordered left to right (= depth-first traversal of the tree), correspond to a non-increasing sequence of outcomes
[Figure: assignment tree for the network over A, B, C, branching on A, then B, then C, with each variable's values ordered by its CPT. The eight leaves, read left to right, give the non-increasing sequence of outcomes.]
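
The depth-first enumeration can be sketched as a generator. The three-variable network mirrors the A, B, C example; the preferred value in each CPT row is an assumption, and `outcomes_in_order` is a hypothetical name.

```python
def outcomes_in_order(order, parents, cpt, partial=None):
    """Yield complete outcomes, most preferred first, by depth-first search."""
    partial = {} if partial is None else partial
    if len(partial) == len(order):
        yield dict(partial)
        return
    var = order[len(partial)]          # next variable in topological order
    context = tuple(partial[p] for p in parents[var])
    for value in cpt[var][context]:    # values ordered by the CPT, best first
        partial[var] = value
        yield from outcomes_in_order(order, parents, cpt, partial)
        del partial[var]


# Assumed tables: A and B prefer the un-negated value; C prefers c exactly
# when A and B agree.
parents = {"A": (), "B": (), "C": ("A", "B")}
same, flip = [True, False], [False, True]
cpt = {
    "A": {(): same},
    "B": {(): same},
    "C": {(True, True): same, (False, False): same,
          (True, False): flip, (False, True): flip},
}

for outcome in outcomes_in_order(["A", "B", "C"], parents, cpt):
    print(outcome)
```

Because it is a generator, a caller that only needs the first few outcomes never pays for building the whole tree.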
Sorting?



We just saw how we can order all outcomes
This method does not work for sorting a subset
of all possible assignments
Fortunately, there is a method that can be used
to sort n outcomes in time O(n log n)
Ordered Generate and Test → Efficient Constrained Optimization
Tree search + favorite CSP pruning techniques
Equivalent to solving the CSP with meta-level constraints on variable and value ordering
Branch and bound: eliminate a sub-tree when:
We assign a variable to a less preferred value
Current set of constraints is as strong as for some previous value of this variable

[Figure: search tree for the network over A, B, C under a constraint on a and b (connective lost in extraction), with one sub-tree marked as dominated and pruned.]
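
The sketch below combines the preference-ordered tree search with constraint pruning on partial assignments. It covers only the first ingredient (tree search plus a simple CSP-style check) and does not implement the branch-and-bound dominance rule; the constraint "not both a and b" and all names are assumptions made for the example.

```python
def first_feasible(order, parents, cpt, constraints, partial=None):
    """Depth-first search in preference order with constraint pruning.

    Each constraint takes a partial assignment and returns False only when
    that assignment already violates it. Returns the first complete feasible
    outcome reached, or None if there is none.
    """
    partial = {} if partial is None else partial
    if not all(check(partial) for check in constraints):
        return None                    # prune: no extension can be feasible
    if len(partial) == len(order):
        return dict(partial)
    var = order[len(partial)]
    context = tuple(partial[p] for p in parents[var])
    for value in cpt[var][context]:    # most preferred value first
        partial[var] = value
        found = first_feasible(order, parents, cpt, constraints, partial)
        del partial[var]
        if found is not None:
            return found
    return None


# Assumed tables for the A, B, C network (negation bars were lost).
parents = {"A": (), "B": (), "C": ("A", "B")}
same, flip = [True, False], [False, True]
cpt = {
    "A": {(): same},
    "B": {(): same},
    "C": {(True, True): same, (False, False): same,
          (True, False): flip, (False, True): flip},
}


def not_both(p):
    """Hypothetical constraint: a and b cannot both hold."""
    return not (p.get("A") is True and p.get("B") is True)


print(first_feasible(["A", "B", "C"], parents, cpt, [not_both]))
```

Since values are tried in CPT order, the first complete assignment that survives pruning is also the first feasible outcome in the non-increasing sequence.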
Anytime Behavior

First feasible solution is optimal!



No theoretical overhead beyond standard CSP solution
No item withdrawn from set of current solutions
To obtain more than one solution, dominance testing is required
Can lead to considerable computational overhead
Applications
Flight Selection
(Brafman, Domshlak, Kogan UAI’04)


Instance of “selecting optimal element” problem
Approach:
User supplies: constraints (source, destination, etc.) and initial preferences
Preferences compiled into UCP-net
Top k results shown
User provides feedback: which flight is most preferred among the k best, and any new preferences
Top results revised accordingly
Planning for Data Products
(Golden et al.)
NASA collects much raw data about Earth
Requires extensive processing to be useful
Each Earth scientist needs different processing
ImageBot generates data plans for such products
Data plans run scientific models, combine and transform data in order to achieve data goals
Many plans can generate one data product
Each plan has a different value
Using a simple preference language, the ImageBot planning algorithm is biased towards more preferred plans
Adaptive, Personalized
Rich Media Presentations





Presentations with diverse media elements
requiring spatio-temporal synchronization
Target wide audiences
Audience members have different tastes, different
network connections, and use different devices
Goal: provide designers with tools for designing
presentations that adapt to user and user context
Demo
How Does It Work?





Variables denote different presentation elements (video, ads, running text)
Additional variables denote context information
TCP-nets specify preference relation over presentations
Author may specify additional constraints
At download time combine:
TCP-net
Information about user and user context
Resulting constrained optimization problem solved
Output a SMIL presentation
CP-Net for ESPN Promo
[Figure: CP-net for the ESPN promo over the variables Gender, Nationality, Sports Video, Ad1, Ad2, and Text.]
Real-Time Content Selection for
Command and Control

Imagine a decision maker monitoring masses of
data in a real-time command & control center
for all rescue forces in Paris
Video streams
 Sensor data
 Results of relevant queries
 Results of data analysis (e.g., simulations, risk
assessment)


Task: Which data to show at each point in time?
Possible Information Sources






Cameras on firemen's helmets
Fixed surveillance cameras
Heat sensors, smoke detectors, CO2 levels
Area maps, building plans, driving distance, number of residents
Simulation of structure strength, time to contain a fire as function of wind and other weather conditions, etc.
Demo
Our Proposal:
Decision Theoretic Control


Offline: build a preference model capturing the
value of different choices
Online: compute best choices
In our case: which data to show
 In general: whatever choices the system must make



Not new, but not too practical, so far
Main obstacle: preference handling
What Are the Challenges

ESPN solution (using CP-nets) is static:
We know all the variables and their possible values at specification time
Our context is dynamic:
Different cities will have different relevant information streams
Same city will have different relevant information streams at each time
A single specification should handle LA, Paris, NY, etc.
Our solution: relational preference rules



A relational model that generalizes UCP-nets
Assumes concepts/classes are fixed (fireman, fire, fire truck)
Does not assume anything about specific instances
Modeling a Fire Department

Rules:
Fireman(x) ∧ fire(y) ∧ x.location = y.location → x.camera.display [“on” 4, “off” 0]
Fireman(x) ∧ x.co2-level = high → x.camera.display [“on” 8, “off” 0]
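
How such rules might be evaluated over concrete instances is sketched below. The slides do not spell out the evaluation procedure, so this assumes that the values of all matching rule instances are added up, in the spirit of the additive UCP-net semantics. The classes, field names, and sample instances are hypothetical.

```python
from dataclasses import dataclass


@dataclass
class Fireman:
    name: str
    location: str
    co2_level: str


@dataclass
class Fire:
    location: str


def camera_display_values(fireman, fires):
    """Score the "on"/"off" choices for one fireman's camera.

    Assumption: values of all matching rule instances are added up,
    mirroring the additive semantics of UCP-nets.
    """
    value = {"on": 0, "off": 0}
    # Rule 1: Fireman(x) and fire(y) at the same location -> "on" 4, "off" 0
    for fire in fires:
        if fireman.location == fire.location:
            value["on"] += 4
    # Rule 2: Fireman(x) with a high CO2 level -> "on" 8, "off" 0
    if fireman.co2_level == "high":
        value["on"] += 8
    return value


def best_display(fireman, fires):
    """Pick the higher-valued choice; ties default to "off" (an assumption)."""
    value = camera_display_values(fireman, fires)
    return "on" if value["on"] > value["off"] else "off"


# Hypothetical instances: the rules never mention them by name.
fires = [Fire("Louvre")]
crew = [Fireman("x1", "Louvre", "high"), Fireman("x2", "Montmartre", "low")]
for member in crew:
    print(member.name, camera_display_values(member, fires), best_display(member, fires))
```

The rules refer only to classes and attributes, so the same two rules apply unchanged whatever firemen and fires exist at run time.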
Collaborators





Carmel Domshlak – Technion
Craig Boutilier – University of Toronto
Holger Hoos, David Poole – UBC
Doron Friedman – University College London
Solomon Shimony – Ben-Gurion University
Summary




Much recent interest in work on preference
representation, elicitation, and reasoning
xCP-nets offer a simple preference language and
convenient graphical and algorithmic tools
UCP-nets provide good target for knowledge
compilation
Many applications in e-commerce and user
interfaces
[Figure: flowchart fragment. Labels: Compare; If dominated …; Otherwise …]



