Probability theory - University of Mary Hardin–Baylor


Conditional Probability, Bayes’
Theorem, and Belief Networks
CISC 2315 Discrete Structures, Spring 2010
Professor William G. Tanner, Jr.
Conditional probability

P(A|B) = the conditional probability of A given that all we know is B.
P(A|B) = P(A & B) / P(B),  where P(B) ≠ 0
Once we receive some evidence concerning a proposition, prior probabilities
are no longer applicable. We need to assess the conditional probability of that
proposition given that all we know is the available evidence.
e.g., in the picture, P(A) = 0.25; P(B) = 0.5; P(A & B) = 0.25; so P(A|B) = 0.25/0.5 = 0.5.
[Figure: sample space divided into regions B and ~B, with region A lying inside B]
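The numbers in the picture can be checked directly from the definition. A minimal Python sketch (the helper name `conditional` is mine, not from the slides):

```python
# Conditional probability from the definition: P(A|B) = P(A & B) / P(B).
def conditional(p_a_and_b: float, p_b: float) -> float:
    """Return P(A|B); undefined when P(B) = 0."""
    if p_b == 0:
        raise ValueError("P(B) must be nonzero")
    return p_a_and_b / p_b

# Values from the picture: P(A & B) = 0.25, P(B) = 0.5.
print(conditional(0.25, 0.5))  # 0.5
```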
Examples of conditional probabilities

Example: P(Cavity=true|Toothache=true) = 0.8 is
a conditional probability statement.
We could also have more evidence, e.g., P(Cavity|Toothache, Earthquake). This evidence could be irrelevant, e.g., P(Cavity|Toothache, Earthquake) = P(Cavity|Toothache) = 0.8.
Also, P(Cavity|Toothache, Cavity) = 1.
Bayes’ Rule
P(B|A) = P(A|B) P(B) / P(A)
Here is the derivation of the rule:
1. P(B|A) = P(A & B) / P(A), defn of conditional probability
2. P(A|B) = P(A & B) / P(B), defn of conditional probability
3. P(A & B) = P(A|B) P(B), from 2.
4. P(B|A) = P(A|B) P(B) / P(A), substituting 3. into 1.
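The derivation can be verified numerically using the earlier picture example (P(A) = 0.25, P(B) = 0.5, P(A & B) = 0.25). A short Python check:

```python
# Check that Bayes' Rule (step 4) agrees with the direct definition (step 1).
p_a, p_b, p_a_and_b = 0.25, 0.5, 0.25

p_b_given_a = p_a_and_b / p_a   # step 1: P(B|A) = P(A & B) / P(A)
p_a_given_b = p_a_and_b / p_b   # step 2: P(A|B) = P(A & B) / P(B)

# step 4: P(B|A) = P(A|B) P(B) / P(A)
assert p_b_given_a == p_a_given_b * p_b / p_a
print(p_b_given_a)  # 1.0
```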
The significance of Bayes’ Rule


Bayes’ Rule underlies many probabilistic reasoning
systems in artificial intelligence (AI).
It is useful because, in practice, we often know the
probabilities on the right hand side of Bayes’ Rule and
wish to estimate the probability on the left.
Example of the use of Bayes' Rule

Bayes’ Rule is particularly useful for assessing disease
hypotheses from symptoms:


P(Cause|Effect) = P(Effect|Cause) P(Cause) / P(Effect)
From knowledge of conditional probabilities on causal
relationships in medicine, we can derive probabilities of
diagnoses. Let S be the proposition that a patient has a stiff
neck, and M the proposition that the patient has meningitis.
Suppose we want to know P(M|S).




P(S|M) = 0.5
P(M) = 1/50000
P(S) = 1/20
Using Bayes' Rule, P(M|S) = P(S|M) P(M) / P(S) = 0.0002
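The same computation in Python (variable names are mine):

```python
# Bayes' Rule for the meningitis example: P(M|S) = P(S|M) P(M) / P(S).
p_s_given_m = 0.5   # P(S|M): probability of a stiff neck given meningitis
p_m = 1 / 50000     # P(M): prior probability of meningitis
p_s = 1 / 20        # P(S): prior probability of a stiff neck

p_m_given_s = p_s_given_m * p_m / p_s   # ≈ 0.0002
```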
Belief networks


Belief networks represent dependence between variables.
A belief network B = (V, E) is a directed acyclic graph with nodes V and directed edges E, where:
- Each node in V corresponds to a random variable.
- There is a directed edge from node X to node Y if variable X has a direct influence on variable Y.
- Each node in V has a conditional probability table (CPT) associated with it. The CPT specifies the conditional distribution of the node given its parents, i.e., P(Xi|Parents(Xi)). The parents of a node are all those nodes that have arrows pointing to it.
A simple belief network
[Figure: Burglary → Alarm ← Earthquake; Alarm → JohnCalls; Alarm → MaryCalls]
A simple belief net with CPTs
[Figure: the same network with CPTs attached to each node]
Earthquake: P(E) = 0.002
Burglary: P(B) = 0.001
Alarm: P(A|B,E) = 0.95; P(A|B,not E) = 0.94; P(A|not B,E) = 0.29; P(A|not B,not E) = 0.001
JohnCalls: P(J|A) = 0.9; P(J|not A) = 0.05
MaryCalls: P(M|A) = 0.7; P(M|not A) = 0.01
An Example

We can compute P(Alarm|Burglary) using probabilistic inference (taught in the AI class).
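As a sketch of that computation: because Earthquake has no edge to or from Burglary, P(Alarm|Burglary=true) is obtained by summing Alarm's CPT over the two values of Earthquake, weighted by P(E). A minimal Python version (variable names are mine):

```python
# P(Alarm | Burglary=true) = sum over e of P(A | B=true, E=e) * P(E=e),
# using the CPTs from the burglary network above.
p_e = 0.002
p_a_given_b_e = {
    (True, True): 0.95,     # P(A | B, E)
    (True, False): 0.94,    # P(A | B, not E)
    (False, True): 0.29,    # P(A | not B, E)
    (False, False): 0.001,  # P(A | not B, not E)
}

p_alarm_given_burglary = sum(
    p_a_given_b_e[(True, e)] * (p_e if e else 1 - p_e)
    for e in (True, False)
)
print(round(p_alarm_given_burglary, 5))  # 0.94002
```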
A Successful Belief Net Application





PATHFINDER is a diagnostic expert system for lymph-node diseases, built by the Stanford Medical Computer Science program in the 1980s.
The system deals with over 60 diseases.
Four versions have been built, and PATHFINDER IV uses a belief network.
PATHFINDER IV was tested on 53 actual cases of patients referred to a lymph-node specialist, and it scored highly.
A recent comparison between medical experts and PATHFINDER IV shows the system outperforming the experts, some of whom are among the world's leading pathologists, and some of whom were consulted to build the system in the first place!