Talk transcript

Analysis of uncertain data:
Evaluation of Given Hypotheses
Selection of probes for information gathering
Anatole Gershman, Eugene Fink,
Bin Fu, and Jaime G. Carbonell
Example
The analyst has to distinguish between two hypotheses:
Retires (prior 0.4) and Joins Vikings (prior 0.6).
Example
Observations:
Without the tearful public ceremony that accompanied his
retirement announcement from the Green Bay Packers just
11 months ago, quarterback Brett Favre has told the New
York Jets he is retiring.
Minnesota coach Brad Childress was jilted at the altar Tuesday
afternoon by Brett Favre, who told him he wasn’t going to play
for the Vikings in 2009.
According to many rumors, quarterback Brett Favre has
closed on the purchase of a home in Eden Prairie, MN,
where the Minnesota Vikings' team facility is located.
Example
Observation distributions:
For the first observation (Favre tells the New York Jets
he is retiring):
P(says retire | Retires) = 0.9
P(says retire | Joins Vikings) = 0.6
Bayesian induction:
Post(Retires) = P(Retires) ∙ P(says retire | Retires)
/ (P(Retires) ∙ P(says retire | Retires)
+ P(Joins Vikings) ∙ P(says retire | Joins Vikings))
= 0.4 ∙ 0.9 / (0.4 ∙ 0.9 + 0.6 ∙ 0.6) = 0.5
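As a minimal sketch, this update can be computed directly; the priors and conditional probabilities below are the ones from the slides.

```python
# Bayesian update for the two-hypothesis Favre example.
priors = {"Retires": 0.4, "Joins Vikings": 0.6}
p_says_retire = {"Retires": 0.9, "Joins Vikings": 0.6}

# P(says retire) = sum over hypotheses of P(H) * P(says retire | H)
evidence = sum(priors[h] * p_says_retire[h] for h in priors)

posterior = {h: priors[h] * p_says_retire[h] / evidence for h in priors}
print(posterior)  # {'Retires': 0.5, 'Joins Vikings': 0.5}
```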
General problem
We have to distinguish among n mutually
exclusive hypotheses, denoted H1, H2, …, Hn
(in the example, Retires and Joins Vikings).
For every hypothesis, we know its prior;
thus, we have an array of n priors,
P(H1), P(H2), …, P(Hn)
(in the example, 0.4 and 0.6).
General problem
We base the analysis on observable features,
denoted OBS1, OBS2, …, OBSm. Each observation
is a discrete variable that takes one of several
possible values.
For every observation OBSa, we know the number
of its possible values; thus, we have an array
num[1..m] with the numbers of possible values.
For every hypothesis and observation, we know the
related probability distribution; that is, P(obs a,j | H i)
represents the probability that OBSa takes its j-th
value given hypothesis Hi.
Example: OBS1 is Favre's statement, with num[1] = 2
possible values:
“I will retire!”: P = 0.9 given Retires, 0.4 given Joins Vikings.
“I won’t retire!”: P = 0.1 given Retires, 0.6 given Joins Vikings.
We know a specific value of each observation, val[1..m].
General problem
We have to evaluate the posterior
probabilities of the n given hypotheses,
denoted Post(H1), Post(H2), …, Post(Hn)
(in the example, Post(Retires) = Post(Joins Vikings) = 0.5).
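A minimal sketch of this general setup, assuming the posterior is computed from a single observed feature (the talk later argues against combining observations via an independence assumption); the function name and data layout are illustrative, not from the talk.

```python
def posteriors(prior, dist, a, val):
    """prior[i]      -- prior P(H_i) for the n hypotheses
       dist[i][a][j] -- P(obs_{a,j} | H_i), j ranging over num[a] values
       a, val        -- index of the observation used and its observed value
       Returns Post(H_i) for every hypothesis."""
    likelihoods = [prior[i] * dist[i][a][val] for i in range(len(prior))]
    total = sum(likelihoods)
    return [lk / total for lk in likelihoods]

# The Favre example: H_1 = Retires, H_2 = Joins Vikings;
# OBS_1 observed as "I will retire!" (value index 0).
print(posteriors([0.4, 0.6], [[[0.9, 0.1]], [[0.6, 0.4]]], 0, 0))
# -> [0.5, 0.5]
```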
Extension #1
The given hypotheses may not cover all possibilities;
we add a catch-all hypothesis H0, “something else”
(a “surprise”). Example priors: 0.6, 0.35, and P(H0) = 0.05.
Extension #1
After observing val, the posterior probability of H0 is
Post(H0)
= P(H0) ∙ P(val | H0) / P(val)
= P(H0) ∙ P(val | H0)
/ (P(H0) ∙ P(val | H0) + likelihood(val)),
where likelihood(val) = P(H1) ∙ P(val | H1) + … + P(Hn) ∙ P(val | Hn)
is the total probability of val under the given hypotheses.
Bad news: We do not know P(val | H0).
Good news: Post(H0) monotonically depends
on P(val | H0); thus, if we obtain lower and
upper bounds for P(val | H0), we also get
bounds for Post(H0).
Plausibility principle
Unlikely events normally do not happen; thus,
if we have observed val, then its likelihood
must not be too small.
Plausibility threshold: We use a global constant
plaus, which must be between 0.0 and 1.0. If we
have observed val, we assume that
P(val) ≥ plaus / num.
We use it to obtain bounds for P(val | H0):
Lower: (plaus / num − likelihood(val)) / P(H0).
Upper: 1.0.
We substitute these bounds into the dependency
of Post(H0) on P(val | H0), thus obtaining
bounds for Post(H0):
Lower: 1.0 − likelihood(val) ∙ num / plaus.
Upper: P(H0) / (P(H0) + likelihood(val)).
We have derived bounds for the probability
that none of the given hypotheses is correct.
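A sketch of these bounds in Python, using the formulas above; the numbers in the demo follow the example prior P(H0) = 0.05, while likelihood(val), num, and plaus are assumed values chosen only for illustration.

```python
def post_h0_bounds(p_h0, likelihood_val, num, plaus):
    """Lower and upper bounds for Post(H0), the posterior probability
       that none of the given hypotheses is correct."""
    # Lower: 1.0 - likelihood(val) * num / plaus (clipped at zero).
    lower = max(0.0, 1.0 - likelihood_val * num / plaus)
    # Upper: substitute P(val | H0) = 1.0 into the posterior formula.
    upper = p_h0 / (p_h0 + likelihood_val)
    return lower, upper

print(post_h0_bounds(p_h0=0.05, likelihood_val=0.02, num=2, plaus=0.1))
# -> (0.6, 0.7142857142857143)
```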
Extension #2
Multiple observations: which one(s) to use?
Full Bayesian analysis would use their joint
distribution, which is difficult to obtain.
The independence assumption usually does not hold.
Instead, we identify the highest-utility observation
and do not use the other observations to corroborate it.
Extension #2
Utility function: given candidate posterior distributions
(0.5, 0.5), (0.4, 0.6), and (0.35, 0.65),
which one is “better”?
Candidate utility functions:
 Shannon’s entropy (negated)
 KL-divergence
 Self-defined function
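As a sketch, the first two utilities can be computed as below; both prefer the more decided distribution (0.35, 0.65) over (0.5, 0.5). Measuring the KL-divergence against the prior is an assumption, since the slide does not say what the reference distribution is.

```python
import math

def neg_entropy(post):
    # Negated Shannon entropy: higher for more decided distributions.
    return sum(p * math.log2(p) for p in post if p > 0.0)

def kl_divergence(post, prior):
    # KL-divergence of the posterior from the prior (assumed reference).
    return sum(p * math.log2(p / q) for p, q in zip(post, prior) if p > 0.0)

prior = [0.5, 0.5]
for post in ([0.5, 0.5], [0.4, 0.6], [0.35, 0.65]):
    print(post, neg_entropy(post), kl_divergence(post, prior))
```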
Part 2: Selection of probes for information gathering
Example
Probe: execute an external action and observe
its response, to gather more information.
Ask Favre directly; he says “I will retire!”
The probe shifts the probabilities
from (0.4, 0.6) to (0.5, 0.5).
Example
A probe’s expected value depends on three factors:
the gain (utility function), the probe cost,
and the observation probability.
Probe Selection
single-obs-gain(probe j, OBS a)
= visible[i, a, j] ∙ (likelihood(1) ∙ probe-gain(1)
+ … + likelihood(num[a]) ∙ probe-gain(num[a]))
+ (1.0 − visible[i, a, j]) ∙ cost[j],
where visible[i, a, j] is the observation probability,
probe-gain comes from the utility function,
and cost[j] is the probe cost.
gain(probe j)
= max (single-obs-gain(probe j, OBS 1), …,
single-obs-gain(probe j, OBS m)).
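A sketch of this scoring in Python; the flattening of visible[i, a, j] to visible[a][j], the callable likelihood and probe-gain arguments, and all numbers in the demo are assumptions for illustration.

```python
def single_obs_gain(j, a, visible, likelihood, probe_gain, cost, num):
    # Expected gain if probe j reveals observation OBS_a,
    # weighted by the chance that it reveals nothing (pure cost).
    expected = sum(likelihood(a, v) * probe_gain(a, v) for v in range(num[a]))
    return visible[a][j] * expected + (1.0 - visible[a][j]) * cost[j]

def gain(j, m, visible, likelihood, probe_gain, cost, num):
    # The probe's overall gain is its best single-observation gain.
    return max(single_obs_gain(j, a, visible, likelihood, probe_gain, cost, num)
               for a in range(m))

# Toy usage with one probe (j = 0) and one observation (num[0] = 2).
visible = [[0.8]]   # probe 0 reveals OBS_1 with probability 0.8
cost = [-0.1]       # running the probe costs 0.1 units of utility
num = [2]
likelihood = lambda a, v: [0.72, 0.28][v]
probe_gain = lambda a, v: [0.3, 0.5][v]
print(gain(0, 1, visible, likelihood, probe_gain, cost, num))  # 0.2648
```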
Experiments
Task: evaluating four hypotheses (H1, H2, H3, H4).
 No probes: accuracy of distinguishing H1 from the other hypotheses.
 Probe selection to distinguish H1 from the other hypotheses.
 Probe selection to distinguish all four hypotheses.
Summary
Use Bayesian inference to distinguish among
mutually exclusive hypotheses.
 H0 (“surprise”) hypothesis
 Multiple observations
Use probes to gather more information for
better analysis.
 Cost, utility function, observation probability, ...
Thank you