Transcript Slides

Probabilistic Abduction using
Markov Logic Networks
Rohit J. Kate
Raymond J. Mooney
Department of Computer Science
The University of Texas at Austin
Abduction
• Abduction is inference to the best explanation for a given
set of evidence
• Applications include tasks in which observations need to
be explained by the best hypothesis
– Plan recognition
– Intent recognition
– Medical diagnosis
– Fault diagnosis
– …
• Most previous work falls under two frameworks for
abduction
– First-order logic based Abduction
– Probabilistic abduction using Bayesian networks
2
Logical Abduction
Given:
• Background knowledge, B, in the form of a set of (Horn)
clauses in first-order logic
• Observations, O, in the form of atomic facts in first-order
logic
Find:
• A hypothesis, H, a set of assumptions (logical formulae)
that logically entail the observations given the theory
BH  O
• Typically, best explanation is the one with the fewest
assumptions, e.g. minimizes |H|
3
Sample Logical Abduction Problem
• Background Knowledge:
x y (Mosquito(x)  Infected(x,Malaria)  Bite(x,y) →
Infected(y,Malaria))
x y (Infected(x,Malaria)  Transfuse(Blood,x,y) →
Infected(y,Malaria))
• Observations:
Infected(John, Malaria)
Transfuse(Blood, Mary, John)
• Explanation:
Infected(Mary, Malaria)
4
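The backward-chaining search behind this example can be sketched in a few lines. This is a hypothetical propositional sketch, not the implementation of any system mentioned in the talk: the real abductive systems operate on first-order clauses with unification, but the control flow is the same — chain backward from the observation, treat known facts as closed, and collect underivable leaves as assumptions.

```python
def abduce(goal, rules, known, assumptions):
    """Explain `goal` by backward chaining over `rules` (list of
    (body_literals, head) pairs), collecting underivable leaf
    literals into `assumptions`. Greedy illustrative sketch only."""
    if goal in known:
        return                       # already observed, nothing to assume
    for body, head in rules:
        if head == goal:
            for sub in body:         # recurse into the rule's antecedents
                abduce(sub, rules, known, assumptions)
            return
    assumptions.add(goal)            # no rule derives it: assume it

# The malaria example from the slide above.
rules = [(("Infected(Mary,Malaria)", "Transfuse(Blood,Mary,John)"),
          "Infected(John,Malaria)")]
known = {"Transfuse(Blood,Mary,John)"}
assumed = set()
abduce("Infected(John,Malaria)", rules, known, assumed)
# assumed == {"Infected(Mary,Malaria)"}
```

With only one applicable rule the sketch commits to it greedily; a full abductive reasoner would search over alternative rules and score competing explanations, e.g. by |H|.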
Previous Work in Logical Abduction
• Several first-order logic based approaches [Poole et al. 1987;
Stickel 1988; Ng & Mooney 1991; Kakas et al. 1993]
• Perform first-order “backward” logical reasoning to
determine the set of assumptions sufficient to deduce
observations
• Unable to reason under uncertainty to find the most
probable explanation
Background Knowledge:
x y (Mosquito(x)  Infected(x,Malaria)  Bite(x,y) → Infected(y,Malaria))
Holds 80% of the times
x y (Infected(x,Malaria)  Transfuse(Blood,x,y) → Infected(y,Malaria))
Holds 40% of the times
Observation:
Infected(John, Malaria)
99% sure
Transfuse(Blood, Mary, John)
60% sure
5
Previous Work in Probabilistic Abduction
• An alternate framework is based on Bayesian
networks [Pearl 1988]
• Uncertainties are encoded in a directed graph
• Given a set of observations, probabilistic inference
over the graph computes the posterior probabilities
of explanations
• Unable to handle structured representations
because essentially based on propositional logic
6
Probabilistic Abduction using MLNs
• We present a new approach for probabilistic
abduction that combines first-order logic and
probabilistic graphical models
• Uses Markov Logic Networks (MLNs) [Richardson and
Domingos 2006], a theoretically sound framework for
combining first-order logic and probabilistic
graphical models
Rest of the talk:
– MLNs
– Our approach using MLNs
– Experiments
– Future Work and Conclusions
7
Markov Logic Networks (MLNs)
[Richardson and Domingos 2006]
• A logical knowledge base is a set of hard constraints
on the set of possible worlds
• An MLN is a set of soft constraints:
When a world violates a clause, it becomes less
probable, not impossible
• Give each clause a weight
(Higher weight ⇒ Stronger constraint)
P(world) ∝ exp( Σ weights of clauses it satisfies )
8
Sample MLN Clauses
x y (Mosquito(x)  Infected(x,Malaria)  Bite(x,y) →
Infected(y,Malaria)) 20
x y (Infected(x,Malaria)  Transfuse(Blood,x,y) →
Infected(y,Malaria)) 5
9
MLN Probabilistic Model
• MLN is a template for constructing a Markov network
– Ground literals correspond to nodes
– Ground clauses correspond to cliques connecting the ground
literals in the clause
• Probability of a world (truth assignment) x:

P(x) = (1/Z) exp( Σ_i w_i n_i(x) )

where w_i is the weight of clause i and n_i(x) is the
number of true groundings of clause i in x
10
Sample MLN Probabilistic Model
• Clauses with weights:
x y (Mosquito(x)  Infected(x,Malaria)  Bite(x,y) → Infected(y,Malaria)) 20
x y (Infected(x,Malaria)  Transfuse(Blood,x,y) → Infected(y,Malaria)) 5
• Constants:
John, Mary, M
• Ground literals:
Mosquito(M)
Infected(M,Malaria)
Bite(M,John)
Bite(M,Mary)
Infected(John,Malaria)
Infected(Mary,Malaria)
Transfuse(Blood,John,Mary)
Transfuse(Blood,Mary,John)
11
Sample MLN Probabilistic Model
• Clauses with weights:
x y (Mosquito(x)  Infected(x,Malaria)  Bite(x,y) → Infected(y,Malaria)) 20
x y (Infected(x,Malaria)  Transfuse(Blood,x,y) → Infected(y,Malaria)) 5
• Constants:
John, Mary, M
• Ground literals:
Mosquito(M)
true
Infected(M,Malaria)
true
Bite(M,John)
false
Bite(M,Mary)
false
Infected(John,Malaria)
true
Infected(Mary,Malaria)
true
Transfuse(Blood,John,Mary) true
Transfuse(Blood,Mary,John) false
12
Sample MLN Probabilistic Model
• Clauses with weights:
∀x ∀y (Mosquito(x) ∧ Infected(x,Malaria) ∧ Bite(x,y) → Infected(y,Malaria))   20
∀x ∀y (Infected(x,Malaria) ∧ Transfuse(Blood,x,y) → Infected(y,Malaria))   5
• Constants:
John, Mary, M
• Ground literals with truth assignments:
Mosquito(M)                 true
Infected(M,Malaria)         true
Bite(M,John)                false
Bite(M,Mary)                false
Infected(John,Malaria)      true
Infected(Mary,Malaria)      true
Transfuse(Blood,John,Mary)  true
Transfuse(Blood,Mary,John)  false

P(world) = (1/Z) exp( 20·2 + 5·2 )
17
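The world-probability computation on this slide is easy to check numerically. A minimal sketch, using the slide's clause weights (20 and 5) and the two true groundings each clause has in the displayed world:

```python
import math

# Clause weights and counts of true groundings n_i(x) in this world.
weights = [20.0, 5.0]
true_groundings = [2, 2]

# Unnormalized probability: exp( sum_i w_i * n_i(x) ).
# Dividing by the partition function Z (a sum of this quantity over
# every possible world) would yield the actual probability P(x).
score = math.exp(sum(w * n for w, n in zip(weights, true_groundings)))
```

Note that comparing two worlds only needs the ratio of their scores, so Z cancels; this is why MAP-style inference can avoid computing the partition function.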
MLNs Inference and Learning
• Using probabilistic inference techniques, one can
determine the most probable truth assignment, the
probability that a clause holds, etc.
• Given a database of training examples, appropriate
weights of the formulae can be learned to
maximize the probability of the training data
• An open-source software package for MLNs
called Alchemy is available
18
Abduction using MLNs
• Given:
Infected(Mary,Malaria) ∧ Transfuse(Blood,Mary,John) →
Infected(John,Malaria)
Transfuse(Blood, Mary, John)
Infected(John, Malaria)
• The clause is satisfied whether Infected(Mary,
Malaria) is true or false
• Given the observations, a world has the same probability
under the MLN whether the explanation is true or false,
so explanations cannot be inferred
• The MLN inference mechanism is inherently
deductive and not abductive
19
Adapting MLNs for Abduction
• Explicitly include the reverse implications
x y (Infected(x,Malaria)  Transfuse(Blood,x,y) →
Infected(y,Malaria))
y (Infected(y,Malaria) →
x (Transfuse(Blood,x,y)  Infected(x,Malaria)))
• Existentially quantify the universally quantified variables
which appear on the LHS but not on the RHS in the
original clause
• Now, given Transfuse(Blood, Mary, John) and
Infected(John, Malaria), the probability of the world in
which Infected(Mary,Malaria) is true will be higher
20
Adapting MLNs for Abduction
• However, there could be multiple explanations for the
same observations:
x y (Infected(x,Malaria)  Transfuse(Blood,x,y) → Infected(y,Malaria))
y (Infected(y,Malaria) →
x (Transfuse(Blood,x,y)  Infected(x,Malaria)))
x y (Mosquito(x)  Infected(x,Malaria)  Bite(x,y) → Infected(y,Malaria))
y (Infected(y,Malaria) →
x (Mosquito(x)  Infected(x,Malaria)  Bite(x,y)))
• An observation should be explained by one explanation
and not multiple explanations
• The system should support “explaining away”
[Pearl 1988]
21
Adapting MLNs for Abduction
• Add the disjunction clause and the mutual exclusivity
clause for the same RHS term
x y (Mosquito(x)  Infected(x,Malaria)  Bite(x,y) → Infected(y,Malaria))
x y (Infected(x,Malaria)  Transfuse(Blood,x,y) → Infected(y,Malaria))
y (Infected(y,Malaria) →
x (Transfuse(Blood,x,y)  Infected(x,Malaria))) v
x (Mosquito(x)  Infected(x,Malaria)  Bite(x,y)))
y (Infected(y,Malaria) →
( x (Transfuse(Blood,x,y)  Infected(x,Malaria))) v
(x (Mosquito(x)  Infected(x,Malaria)  Bite(x,y))))
• Since MLN clauses are soft constraints, both explanations
can still be true
22
Adapting MLNs for Abduction
• In general, for the Horn clauses P1 → Q, P2 → Q, …,
Pn → Q in the background knowledge base, add:
– A reverse implication disjunction clause
Q → P1 ∨ P2 ∨ … ∨ Pn
– A mutual exclusivity clause for every pair of
explanations
Q → ¬P1 ∨ ¬P2
Q → ¬P1 ∨ ¬Pn
…
Q → ¬P2 ∨ ¬Pn
• Weights can be learned from training examples or can be
set heuristically
23
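The construction on this slide mechanizes naturally. A string-based sketch (the names `abductive_clauses`, `P1`, `Q` etc. are illustrative, not from the paper); real use requires the unification and variable renaming the paper's formal algorithm handles:

```python
from itertools import combinations

def abductive_clauses(rules):
    """Given Horn clauses as (antecedent, consequent) string pairs that
    share a consequent Q, build the reverse-implication disjunction
    clause and the pairwise mutual-exclusivity clauses."""
    antecedents = [p for p, _ in rules]
    q = rules[0][1]
    # Q -> P1 v P2 v ... v Pn
    reverse = f"{q} -> " + " v ".join(antecedents)
    # Q -> !Pi v !Pj for every pair of explanations
    mutex = [f"{q} -> !({p1}) v !({p2})"
             for p1, p2 in combinations(antecedents, 2)]
    return reverse, mutex

rev, mux = abductive_clauses([("P1", "Q"), ("P2", "Q"), ("P3", "Q")])
# rev == "Q -> P1 v P2 v P3"; mux holds C(3,2) = 3 exclusivity clauses
```

Because the added clauses get finite weights rather than being hard constraints, the exclusivity clauses discourage, but do not forbid, multiple simultaneous explanations — exactly the soft "explaining away" behavior the slides describe.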
Adapting MLNs for Abduction
• There could be constants or variables on the RHS predicate
x y (Mosquito(x)  Infected(x,Malaria)  Bite(x,y) → Infected(y,Malaria))
x (Infected(x,Malaria)  Transfuse(Blood,x,John) → Infected(John,Malaria))
24
Adapting MLNs for Abduction
• There could be constants or variables on the RHS predicate
x y (Mosquito(x)  Infected(x,Malaria)  Bite(x,y) → Infected(y,Malaria))
x (Infected(x,Malaria)  Transfuse(Blood,x,John) → Infected(John,Malaria))
Infected(John,Malaria) →
x (Transfuse(Blood,x,John)  Infected(x,Malaria))) v
x (Mosquito(x)  Infected(x,Malaria)  Bite(x,John))
Infected(John,Malaria) →
( x (Transfuse(Blood,x,John)  Infected(x,Malaria))) v
(x (Mosquito(x)  Infected(x,Malaria)  Bite(x,John)))
y (Infected(y,Malaria) →
x (Mosquito(x)  Infected(x,Malaria)  Bite(x,y)))
• Formal algorithm is described in the paper, requires appropriate
27
unifications and variable re-namings
Experiments: Dataset
• Plan recognition dataset used to evaluate
abductive systems [Ng & Mooney 1991; Charniak & Goldman
1991]
• Characters’ higher-level plans must be inferred to
explain their observed actions in a narrative text
– “Fred went to the supermarket. He pointed a gun at the
owner. He packed his bag.” => robbing
– “Jack went to the supermarket. He found some milk on
the shelf. He paid for it.” => shopping
• Dataset contains 25 development [Goldman 1990] and
25 test examples [Ng & Mooney 1992]
28
Experiments: Dataset contd.
• A background knowledge base was constructed for the
ACCEL system [Ng and Mooney 1991] to work with the 25
development examples; it contains 107 rules such as:
instance_shopping(s) ∧ go_step(s,g) → instance_going(g)
instance_shopping(s) ∧ go_step(s,g) ∧ shopper(s,p) → goer(g,p)
• Narrative text is represented in first-order logic;
12.6 literals per example on average
– “Bill went to the store. He paid for some milk.”
instance_going(Go1) goer(Go1,Bill) destination_go(Store1)
instance_paying(Pay1) payer(Pay1,Bill) thing_paid(Pay1,Milk1)
• Assumptions explaining the above actions
instance_shopping(S1) shopper(S1,Bill) go_step(S1,Go1)
pay_step(S1,Pay1) thing_shopped_for(S1,Milk)
29
Experiments: Methodology
• Our algorithm automatically adds clauses to the
knowledge-base for performing abduction using
MLNs
• The 25 development examples were too few to learn
MLN weights, so we set the weights heuristically
– Small negative weights on unit clauses so that they are
not assumed for no reason
– Medium weights on reverse implication clauses
– Large weights on mutual exclusivity clauses
• Given a set of observations, we use Alchemy’s
probabilistic inference to determine the most
likely truth assignment for the remaining literals
30
Experiments: Methodology contd.
• We compare with the ACCEL system [Ng & Mooney
1992], a purely logic-based system for abduction
• Selects the best explanation using a metric
– Simplicity metric: selects the explanation of smallest
size
– Coherence metric: selects the explanation that
maximally connects the observations (specifically
geared towards this task)
• “John took the bus. He bought milk.” => John took the bus to
the store where he bought the milk.
31
Experiments: Methodology contd.
• Besides finding the assumptions, a deductive
system like MLN also finds other facts that can be
deduced from the assumptions
• We deductively expand ACCEL’s output and gold-standard
answers for a fair comparison
• We measure
– Precision: what fraction of the predicted ground literals
are in the gold-standard answers
– Recall: what fraction of the ground literals in the
gold-standard answers were predicted
– F-measure: harmonic mean of precision and recall
32
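The three metrics above can be computed directly over sets of ground literals. A minimal sketch; the example literals below are hypothetical, chosen only to illustrate the arithmetic:

```python
def prf(predicted, gold):
    """Precision, recall, and F-measure over sets of ground literals."""
    tp = len(predicted & gold)                       # correctly predicted
    precision = tp / len(predicted) if predicted else 0.0
    recall = tp / len(gold) if gold else 0.0
    f = (2 * precision * recall / (precision + recall)
         if precision + recall else 0.0)             # harmonic mean
    return precision, recall, f

pred = {"instance_shopping(S1)", "shopper(S1,Bill)", "go_step(S1,Go1)"}
gold = {"instance_shopping(S1)", "shopper(S1,Bill)", "pay_step(S1,Pay1)"}
p, r, f = prf(pred, gold)  # 2 of 3 predictions correct, 2 of 3 gold found
```

Here both sets have three literals and share two, so precision, recall, and F-measure all come out to 2/3.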
Experiments: Results
Development Set
[Bar chart: Precision, Recall, and F-measure (70–100 scale) on the
development set for MLN, ACCEL-Simplicity, and ACCEL-Coherence]
33
Experiments: Results contd.
Test Set
[Bar chart: Precision, Recall, and F-measure (70–100 scale) on the
test set for MLN, ACCEL-Simplicity, and ACCEL-Coherence]
34
Experiments: Results contd.
• MLN performs better than ACCEL-simplicity
particularly on the development set
• ACCEL-coherence performs the best, but was
specifically tailored for the narrative understanding
task
• The dataset used does not require a full probabilistic
treatment because there is little uncertainty in the
knowledge base or observations
• MLNs did not need any heuristic metric but
simply found the most probable explanation
35
Future Work
• Evaluate probabilistic abduction using MLNs on a
task in which uncertainty plays a bigger role
• Evaluate on a larger dataset on which the weights
could be learned to automatically adapt to a
particular domain
– Previous abductive systems like ACCEL have no
learning mechanism
• Perform probabilistic abduction using other
frameworks for combining first-order logic and
graphical models [Getoor & Taskar 2007], for example,
Bayesian Logic Programming [Kersting & De Raedt 2001]
and compare with the presented approach
36
Conclusions
• A general method for probabilistic first-order
logical abduction using MLNs
• Existing off-the-shelf deductive inference system
of MLNs is employed to do abduction by suitably
reversing the implications
• Handles uncertainty using probabilities and an unbounded
number of related entities using first-order logic, and is
capable of learning
• Experiments on a small plan recognition dataset
demonstrated that it compares favorably with
special-purpose logic-based abductive systems
37
Thanks!
Questions?
38