Transcript Slides
Slide 1: Probabilistic Abduction using Markov Logic Networks
Rohit J. Kate, Raymond J. Mooney
Department of Computer Science, The University of Texas at Austin

Slide 2: Abduction
• Abduction is inference to the best explanation for a given set of evidence
• Applications include tasks in which observations need to be explained by the best hypothesis:
– Plan recognition
– Intent recognition
– Medical diagnosis
– Fault diagnosis
– …
• Most previous work falls under two frameworks for abduction:
– First-order logic based abduction
– Probabilistic abduction using Bayesian networks

Slide 3: Logical Abduction
Given:
• Background knowledge, B, in the form of a set of (Horn) clauses in first-order logic
• Observations, O, in the form of atomic facts in first-order logic
Find:
• A hypothesis, H, a set of assumptions (logical formulae) that logically entail the observations given the theory: B ∪ H ⊨ O
• Typically, the best explanation is the one with the fewest assumptions, e.g. the one that minimizes |H|

Slide 4: Sample Logical Abduction Problem
• Background knowledge:
∀x ∀y (Mosquito(x) ∧ Infected(x,Malaria) ∧ Bite(x,y) → Infected(y,Malaria))
∀x ∀y (Infected(x,Malaria) ∧ Transfuse(Blood,x,y) → Infected(y,Malaria))
• Observations:
Infected(John, Malaria)
Transfuse(Blood, Mary, John)
• Explanation:
Infected(Mary, Malaria)

Slide 5: Previous Work in Logical Abduction
• Several first-order logic based approaches [Poole et al. 1987; Stickel 1988; Ng & Mooney 1991; Kakas et al.
1993]
• Perform first-order "backward" logical reasoning to determine the set of assumptions sufficient to deduce the observations
• Unable to reason under uncertainty to find the most probable explanation:
Background knowledge:
∀x ∀y (Mosquito(x) ∧ Infected(x,Malaria) ∧ Bite(x,y) → Infected(y,Malaria))   (holds 80% of the time)
∀x ∀y (Infected(x,Malaria) ∧ Transfuse(Blood,x,y) → Infected(y,Malaria))   (holds 40% of the time)
Observations:
Infected(John, Malaria)   (99% sure)
Transfuse(Blood, Mary, John)   (60% sure)

Slide 6: Previous Work in Probabilistic Abduction
• An alternate framework is based on Bayesian networks [Pearl 1988]
• Uncertainties are encoded in a directed graph
• Given a set of observations, probabilistic inference over the graph computes the posterior probabilities of explanations
• Unable to handle structured representations because it is essentially based on propositional logic

Slide 7: Probabilistic Abduction using MLNs
• We present a new approach to probabilistic abduction that combines first-order logic and probabilistic graphical models
• Uses Markov Logic Networks (MLNs) [Richardson and Domingos 2006], a theoretically sound framework for combining first-order logic and probabilistic graphical models
Rest of the talk:
– MLNs
– Our approach using MLNs
– Experiments
– Future work and conclusions

Slide 8: Markov Logic Networks (MLNs) [Richardson and Domingos 2006]
• A logical knowledge base is a set of hard constraints on the set of possible worlds
• An MLN is a set of soft constraints: when a world violates a clause, it becomes less probable, not impossible
• Give each clause a weight (higher weight, stronger constraint)
P(world) ∝ exp(Σ weights of clauses it satisfies)

Slide 9: Sample MLN Clauses
∀x ∀y (Mosquito(x) ∧ Infected(x,Malaria) ∧ Bite(x,y) → Infected(y,Malaria))   (weight 20)
∀x ∀y (Infected(x,Malaria) ∧ Transfuse(Blood,x,y) → Infected(y,Malaria))   (weight 5)

Slide 10: MLN Probabilistic Model
• An MLN is a template for constructing a Markov network:
– Ground literals correspond to nodes
– Ground clauses correspond to cliques connecting the
ground literals in the clause
• Probability of a world (truth assignment) x:
P(x) = (1/Z) exp(Σ_i w_i n_i(x))
where w_i is the weight of clause i and n_i(x) is the number of true groundings of clause i in x

Slide 11: Sample MLN Probabilistic Model
• Clauses with weights:
∀x ∀y (Mosquito(x) ∧ Infected(x,Malaria) ∧ Bite(x,y) → Infected(y,Malaria))   (weight 20)
∀x ∀y (Infected(x,Malaria) ∧ Transfuse(Blood,x,y) → Infected(y,Malaria))   (weight 5)
• Constants: John, Mary, M
• Ground literals:
Mosquito(M), Infected(M,Malaria), Bite(M,John), Bite(M,Mary), Infected(John,Malaria), Infected(Mary,Malaria), Transfuse(Blood,John,Mary), Transfuse(Blood,Mary,John)

Slides 12-14: Sample MLN Probabilistic Model
• Same clauses and constants, with a sample truth assignment (repeated on slides 12-14):
Mosquito(M) true, Infected(M,Malaria) true, Bite(M,John) false, Bite(M,Mary) false, Infected(John,Malaria) true, Infected(Mary,Malaria) true, Transfuse(Blood,John,Mary) true, Transfuse(Blood,Mary,John) false

Slide 15: Sample MLN Probabilistic Model
• Clauses with weights:
∀x ∀y (Mosquito(x) ∧
Infected(x,Malaria) ∧ Bite(x,y) → Infected(y,Malaria))   (weight 20)
∀x ∀y (Infected(x,Malaria) ∧ Transfuse(Blood,x,y) → Infected(y,Malaria))   (weight 5)
• Constants: John, Mary, M
• Ground literals and truth assignment as on slides 12-14 (slide 16 repeats them)

Slide 17: Sample MLN Probabilistic Model
• For the same clauses, constants, and truth assignment:
P(world) = (1/Z) exp(20·2 + 5·2)

Slide 18: MLNs: Inference and Learning
• Using probabilistic inference techniques, one can determine the most probable truth assignment, the probability that a clause holds, etc.
• Given a database of training examples, appropriate weights for the formulae can be learned to maximize the probability of the training data
• An open-source software package for MLNs, called Alchemy, is available

Slide 19: Abduction using MLNs
• Given:
Infected(Mary,Malaria) ∧ Transfuse(Blood,Mary,John) → Infected(John,Malaria)
Transfuse(Blood, Mary, John)
Infected(John, Malaria)
• The clause is satisfied whether Infected(Mary, Malaria) is true or false
• Given the observations, a world has the same probability in the MLN whether the explanation is true or false, so explanations cannot be inferred
• The MLN inference mechanism is inherently deductive, not abductive

Slide 20: Adapting MLNs for Abduction
• Explicitly include the reverse implication:
∀x ∀y (Infected(x,Malaria) ∧ Transfuse(Blood,x,y) → Infected(y,Malaria))
∀y (Infected(y,Malaria) → ∃x (Transfuse(Blood,x,y) ∧ Infected(x,Malaria)))
• Existentially quantify the universally quantified variables that appear on the LHS but not on the RHS of the original clause
• Now, given Transfuse(Blood, Mary, John) and Infected(John, Malaria), the probability of the world in which Infected(Mary,Malaria) is true will be higher

Slide 21: Adapting MLNs for Abduction
• However, there could be multiple explanations for the same observations:
∀x ∀y (Infected(x,Malaria) ∧ Transfuse(Blood,x,y) → Infected(y,Malaria))
∀y (Infected(y,Malaria) → ∃x (Transfuse(Blood,x,y) ∧ Infected(x,Malaria)))
∀x ∀y (Mosquito(x) ∧ Infected(x,Malaria) ∧ Bite(x,y) → Infected(y,Malaria))
∀y (Infected(y,Malaria) → ∃x (Mosquito(x) ∧ Infected(x,Malaria) ∧ Bite(x,y)))
• An observation should be explained by one explanation, not by multiple explanations
• The system should support "explaining away" [Pearl 1988]

Slide 22: Adapting MLNs for Abduction
• Add the disjunction clause and the mutual exclusivity clause for the same RHS term:
∀x ∀y (Mosquito(x) ∧ Infected(x,Malaria) ∧ Bite(x,y) → Infected(y,Malaria))
∀x ∀y (Infected(x,Malaria) ∧ Transfuse(Blood,x,y) → Infected(y,Malaria))
∀y (Infected(y,Malaria) → (∃x (Transfuse(Blood,x,y) ∧
Infected(x,Malaria))) ∨ (∃x (Mosquito(x) ∧ Infected(x,Malaria) ∧ Bite(x,y))))
∀y (Infected(y,Malaria) → (¬∃x (Transfuse(Blood,x,y) ∧ Infected(x,Malaria))) ∨ (¬∃x (Mosquito(x) ∧ Infected(x,Malaria) ∧ Bite(x,y))))
• Since MLN clauses are soft constraints, both explanations can still be true

Slide 23: Adapting MLNs for Abduction
• In general, for the Horn clauses P1 → Q, P2 → Q, …, Pn → Q in the background knowledge base, add:
– A reverse implication disjunction clause: Q → P1 ∨ P2 ∨ … ∨ Pn
– A mutual exclusivity clause for every pair of explanations: Q → ¬P1 ∨ ¬P2, Q → ¬P1 ∨ ¬Pn, …, Q → ¬P2 ∨ ¬Pn
• Weights can be learned from training examples or can be set heuristically

Slides 24-26: Adapting MLNs for Abduction (built up incrementally)
• There could be constants or variables in the RHS predicate:
∀x ∀y (Mosquito(x) ∧ Infected(x,Malaria) ∧ Bite(x,y) → Infected(y,Malaria))
∀x (Infected(x,Malaria) ∧ Transfuse(Blood,x,John) → Infected(John,Malaria))
• The added clauses:
Infected(John,Malaria) → (∃x (Transfuse(Blood,x,John) ∧ Infected(x,Malaria))) ∨ (∃x (Mosquito(x) ∧ Infected(x,Malaria) ∧ Bite(x,John)))
Infected(John,Malaria) → (¬∃x (Transfuse(Blood,x,John) ∧ Infected(x,Malaria))) ∨ (¬∃x (Mosquito(x) ∧ Infected(x,Malaria) ∧ Bite(x,John)))
∀y (Infected(y,Malaria) → ∃x (Mosquito(x) ∧ Infected(x,Malaria) ∧ Bite(x,y)))

Slide 27: Adapting MLNs for Abduction
• There could be constants or variables
in the RHS predicate (clauses as on slides 24-26)
• The formal algorithm is described in the paper; it requires appropriate unifications and variable renamings

Slide 28: Experiments: Dataset
• Plan recognition dataset used to evaluate abductive systems [Ng & Mooney 1991; Charniak & Goldman 1991]
• A character's higher-level plans must be inferred to explain their observed actions in a narrative text:
– "Fred went to the supermarket. He pointed a gun at the owner. He packed his bag." => robbing
– "Jack went to the supermarket. He found some milk on the shelf. He paid for it." => shopping
• Dataset contains 25 development examples [Goldman 1990] and 25 test examples [Ng & Mooney 1992]

Slide 29: Experiments: Dataset contd.
• A background knowledge base of 107 rules was constructed for the ACCEL system [Ng and Mooney 1991] to work with the 25 development examples:
instance_shopping(s) ^ go_step(s,g) → instance_going(g)
instance_shopping(s) ^ go_step(s,g) ^ shopper(s,p) → goer(g,p)
• Narrative text is represented in first-order logic; on average 12.6 literals per example:
– "Bill went to the store.
He paid for some milk."
instance_going(Go1), goer(Go1,Bill), destination_go(Store1), instance_paying(Pay1), payer(Pay1,Bill), thing_paid(Pay1,Milk1)
• Assumptions explaining the above actions:
instance_shopping(S1), shopper(S1,Bill), go_step(S1,Go1), pay_step(S1,Pay1), thing_shopped_for(S1,Milk)

Slide 30: Experiments: Methodology
• Our algorithm automatically adds clauses to the knowledge base for performing abduction using MLNs
• We found that 25 development examples were too few to learn weights for MLNs, so we set the weights heuristically:
– Small negative weights on unit clauses, so that they are not assumed for no reason
– Medium weights on reverse implication clauses
– Large weights on mutual exclusivity clauses
• Given a set of observations, we use Alchemy's probabilistic inference to determine the most likely truth assignment for the remaining literals

Slide 31: Experiments: Methodology contd.
• We compare with the ACCEL system [Ng & Mooney 1992], a purely logic-based system for abduction
• It selects the best explanation using a metric:
– Simplicity metric: selects the explanation of smallest size
– Coherence metric: selects the explanation that maximally connects the observations (specifically geared towards this task)
• "John took the bus. He bought milk." => John took the bus to the store where he bought the milk.

Slide 32: Experiments: Methodology contd.
• Besides finding the assumptions, a deductive system like an MLN also finds other facts that can be deduced from the assumptions
• We deductively expand ACCEL's output and the gold-standard answers for a fair comparison
• We measure:
– Precision: what fraction of the predicted ground literals are in the gold-standard answers
– Recall: what fraction of the ground literals in the gold-standard answers were predicted
– F-measure: harmonic mean of precision and recall

Slide 33: Experiments: Results
[Bar chart: F-measure, recall, and precision of MLN, ACCEL-Simplicity, and ACCEL-Coherence on the development set; y-axis from 70 to 100]

Slide 34: Experiments: Results contd.
[Bar chart: F-measure, recall, and precision of MLN, ACCEL-Simplicity, and ACCEL-Coherence on the test set; y-axis from 70 to 100]

Slide 35: Experiments: Results contd.
• MLN performs better than ACCEL-Simplicity, particularly on the development set
• ACCEL-Coherence performs best, but it was specifically tailored for the narrative understanding task
• The dataset used does not require a full probabilistic treatment because there is little uncertainty in the knowledge base or observations
• MLNs did not need any heuristic metric but simply found the most probable explanation

Slide 36: Future Work
• Evaluate probabilistic abduction using MLNs on a task in which uncertainty plays a bigger role
• Evaluate on a larger dataset, on which the weights could be learned to automatically adapt to a particular domain
– Previous abductive systems like ACCEL have no learning mechanism
• Perform probabilistic abduction using other frameworks that combine first-order logic and graphical models [Getoor & Taskar 2007], for example Bayesian Logic Programming [Kersting & De Raedt 2001], and compare with the presented approach

Slide 37: Conclusions
• A general method for probabilistic first-order logical abduction using MLNs
• The existing off-the-shelf deductive inference system of MLNs is employed to do abduction by suitably reversing the implications
• Handles uncertainty using probabilities and an unbounded number of related entities using first-order logic; capable of learning
• Experiments on a small plan recognition dataset demonstrated that it compares favorably with special-purpose logic-based abductive systems

Slide 38: Thanks! Questions?
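The MLN world-probability formula from the slides, P(x) = (1/Z) exp(Σ_i w_i n_i(x)), can be illustrated with a tiny propositional sketch. The atoms, the single ground clause, and its weight below are hypothetical, chosen only for illustration; in this propositional setting each clause has a single grounding, so n_i(x) is just 1 when the clause is satisfied and 0 otherwise.

```python
from itertools import product
from math import exp

# Hypothetical ground atoms (illustration only, not the paper's setup).
atoms = ["Infected(Mary)", "Transfuse(Mary,John)", "Infected(John)"]

# Each clause is (weight, predicate over a world). This one encodes
# Infected(Mary) ^ Transfuse(Mary,John) -> Infected(John) with weight 5.
clauses = [
    (5.0, lambda w: not (w["Infected(Mary)"] and w["Transfuse(Mary,John)"])
                    or w["Infected(John)"]),
]

def unnormalized(world):
    # exp of the weighted count of satisfied clauses in this world
    return exp(sum(wt for wt, sat in clauses if sat(world)))

# Enumerate all 2^3 truth assignments (worlds) and normalize by Z.
worlds = [dict(zip(atoms, vals))
          for vals in product([False, True], repeat=len(atoms))]
Z = sum(unnormalized(w) for w in worlds)
probs = {tuple(w.values()): unnormalized(w) / Z for w in worlds}
```

Enumerating all 2^n worlds is only feasible for toy examples; packages like Alchemy rely on approximate probabilistic inference instead. Note how the only world that violates the clause (Infected(Mary) and the transfusion true, Infected(John) false) becomes less probable, not impossible.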
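The general construction on the "Adapting MLNs for Abduction" slides, given Horn clauses P1 → Q, …, Pn → Q, can be sketched at the level of clause strings. This is a deliberate simplification: the predicate names and the ASCII connectives (`v`, `!`) are placeholders, and the paper's full algorithm additionally handles quantifiers, unification, and variable renaming.

```python
from itertools import combinations

def abductive_clauses(head, bodies):
    """For Horn clauses body_1 -> head, ..., body_n -> head, emit the
    reverse-implication disjunction clause plus one mutual-exclusivity
    clause per pair of explanations (string-level sketch only)."""
    reverse = f"{head} -> " + " v ".join(bodies)
    mutex = [f"{head} -> !{a} v !{b}" for a, b in combinations(bodies, 2)]
    return [reverse] + mutex

added = abductive_clauses("Q", ["P1", "P2", "P3"])
```

For n explanations this yields one disjunction clause and n·(n-1)/2 mutual exclusivity clauses, matching the pairwise enumeration on slide 23; because all added clauses are soft, multiple explanations can still co-occur when the evidence supports them.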
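The precision, recall, and F-measure over ground literals described on the methodology slides can be sketched as below; the literal strings in the usage line are hypothetical stand-ins for predicted and gold-standard ground literals.

```python
def prf(predicted, gold):
    """Precision: fraction of predicted ground literals in the gold answers.
    Recall: fraction of gold ground literals that were predicted.
    F-measure: harmonic mean of precision and recall."""
    predicted, gold = set(predicted), set(gold)
    tp = len(predicted & gold)
    precision = tp / len(predicted) if predicted else 0.0
    recall = tp / len(gold) if gold else 0.0
    f = (2 * precision * recall / (precision + recall)) if precision + recall else 0.0
    return precision, recall, f

# Hypothetical example: 2 of 4 predictions are correct, 2 of 3 gold literals found.
p, r, f = prf(["go_step(S1,Go1)", "shopper(S1,Bill)", "x1", "x2"],
              ["go_step(S1,Go1)", "shopper(S1,Bill)", "pay_step(S1,Pay1)"])
```

Both predicted and gold sets here would be the deductively expanded ones, as the slides note, so that a deductive system's extra inferred facts do not unfairly depress precision.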