Metody i algorytmy sieci bayesowskich w systemach (Methods and Algorithms of Bayesian Networks in Systems)

Impact of Structuring on
Bayesian Network Learning
and Reasoning
Mieczysław A. Kłopotek
Institute of Computer Science,
Polish Academy of Sciences,
Warsaw, Poland,
First Warsaw International Seminar on Soft
Computing, Warsaw, September 8th, 2003
Agenda
• Definitions
• Approximate Reasoning
• Bayesian networks
• Reasoning in Bayesian networks
• Learning Bayesian networks from data
• Structured Bayesian networks (SBN)
• Reasoning in SBN
• Learning SBN from data
• Concluding remarks
Approximate Reasoning
• One possible method of expressing uncertainty: the Joint Probability Distribution
• Variables: causes, effects, observables
• Reasoning: how probable is it that a variable takes a given value if we know the values of some other variables?
• Given: P(X,Y,...,Z); find: P(X=x | T=t,...,W=w)
• Difficult if more than 40 variables have to be taken into account (hard to represent, hard to reason with, hard to collect data for)
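The query above can be made concrete with a minimal brute-force sketch over a made-up three-variable joint table (the weights are illustrative, not from the talk). The dictionary over all assignments is exactly what blows up: it grows as 2^n, which is why this approach is infeasible beyond a few dozen variables.

```python
from itertools import product

# A made-up joint distribution P(X, Y, Z) over three binary variables,
# stored as a dict mapping full assignments to probabilities.
joint = {}
for x, y, z in product([0, 1], repeat=3):
    joint[(x, y, z)] = 1 + x + 2 * y * z   # arbitrary positive weights
total = sum(joint.values())
for key in joint:
    joint[key] /= total                     # normalize to a distribution

def conditional(joint, query_index, query_value, evidence):
    """P(variable[query_index] = query_value | evidence),
    where evidence maps variable indices to observed values."""
    num = den = 0.0
    for assignment, p in joint.items():
        if all(assignment[i] == v for i, v in evidence.items()):
            den += p
            if assignment[query_index] == query_value:
                num += p
    return num / den

# P(X = 1 | Z = 0)
p_x1_given_z0 = conditional(joint, 0, 1, {2: 0})
```

Every query scans the whole table, so both storage and reasoning cost are exponential in the number of variables.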
Bayesian Network
The method of choice for representing uncertainty in AI. Many efficient reasoning and learning methods exist. An explicit representation of structure is utilized to:
• provide a natural and compact representation of large probability distributions,
• allow for efficient methods for answering a wide range of queries.
Bayesian Network
• Efficient and effective representation of a probability distribution
• Directed acyclic graph
• Nodes: random variables of interest
• Edges: direct (causal) influence
• Nodes are statistically independent of their non-descendants given the state of their parents
A Bayesian network
[Figure: a Bayesian network over the nodes S, T, R, Z, Y, X]

Pr(r,s,x,z,y) = Pr(z) · Pr(s|z) · Pr(y|z) · Pr(x|y) · Pr(r|y,s)
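The factorization on this slide can be sketched directly in code; the variables are binary and the CPT numbers below are invented for illustration.

```python
from itertools import product

# Made-up conditional probability tables for the factorization
#   Pr(r,s,x,z,y) = Pr(z)·Pr(s|z)·Pr(y|z)·Pr(x|y)·Pr(r|y,s)
P_z = {0: 0.6, 1: 0.4}
P_s = {0: {0: 0.7, 1: 0.3}, 1: {0: 0.2, 1: 0.8}}        # P(s|z)
P_y = {0: {0: 0.5, 1: 0.5}, 1: {0: 0.9, 1: 0.1}}        # P(y|z)
P_x = {0: {0: 0.8, 1: 0.2}, 1: {0: 0.3, 1: 0.7}}        # P(x|y)
P_r = {(0, 0): {0: 0.6, 1: 0.4}, (0, 1): {0: 0.1, 1: 0.9},
       (1, 0): {0: 0.5, 1: 0.5}, (1, 1): {0: 0.25, 1: 0.75}}  # P(r|y,s)

def joint(r, s, x, z, y):
    """Joint probability as the product of the local CPTs."""
    return P_z[z] * P_s[z][s] * P_y[z][y] * P_x[y][x] * P_r[(y, s)][r]

# Because every CPT row sums to one, the product sums to one as well.
total = sum(joint(r, s, x, z, y)
            for r, s, x, z, y in product([0, 1], repeat=5))
```

Note the compactness: five small local tables replace a 32-entry joint table, which is the point of the structured representation.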
Applications of Bayesian
networks
• Genetic optimization algorithms with a probabilistic mutation/crossing mechanism
• Classification, including text classification
• Medical diagnosis (PathFinder, QMR) and other decision-making tasks under uncertainty
• Hardware diagnosis (Microsoft troubleshooter, NASA/Rockwell Vista project)
• Information retrieval (Ricoh helpdesk)
• Recommender systems
• others
Reasoning – the problem with
a Bayesian network
• Pearl's fusion algorithm was elaborated for tree-like networks only
• For other types of networks, transformations to trees are needed:
  • transformation to a Markov tree (MT) (Shafer/Shenoy, Spiegelhalter/Lauritzen) – NP-hard except for trees and polytrees
  • cutset reasoning (Pearl) – finding cutsets is difficult, and the reasoning complexity grows exponentially with the needed cutset size
  • evidence absorption reasoning by edge reversal (Shachter) – not always possible in a simple way
Towards MT – moral graph
[Figure: the moral graph of the example network over S, T, R, Z, Y, X]

Parents of each node in the BN are connected, and edges are not oriented.
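The moralization step can be sketched as follows. The parent structure is assumed from the running example's factorization (T's parent is a guess, since the slide gives only a figure).

```python
from itertools import combinations

# Assumed parent lists for the running example (T -> Z is a guess).
parents = {"Z": [], "T": ["Z"], "S": ["Z"], "Y": ["Z"],
           "X": ["Y"], "R": ["Y", "S"]}

def moralize(parents):
    """Connect the parents of every node pairwise ('marry' them),
    then drop edge orientations; edges are returned as frozensets."""
    edges = set()
    for child, ps in parents.items():
        for p in ps:
            edges.add(frozenset((child, p)))   # keep each edge, unoriented
        for a, b in combinations(ps, 2):
            edges.add(frozenset((a, b)))       # marry co-parents
    return edges

moral = moralize(parents)
```

Here R has two parents, Y and S, so the moral graph gains the undirected edge Y–S in addition to the six original (now unoriented) edges.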
Towards MT – triangulated graph
[Figure: a triangulated version of the moral graph]

All cycles with more than 3 nodes have at least one link between non-neighboring nodes of the cycle.
Towards MT – Hypertree
[Figure: the hypertree obtained from the triangulated graph]

Hypertree = acyclic hypergraph
The Markov tree
[Figure: the Markov tree with nodes {Z,T,Y}, {T,Y,S}, {Y,S,R}, {Y,X}]

The hypernodes of the hypertree are the nodes of the Markov tree.
Junction tree – alternative
representation of MT
[Figure: a junction tree with cliques {Z,T,S}, {Z,Y,S}, {Y,S,R}, {Y,X} and separators {Z,S}, {Y,S}, {Y}]

Common BN nodes are assigned to the edges joining MT nodes.
Efficient reasoning in Markov
trees, but ....
[Figure: messages msg(Z,S), msg(Y,S), msg(Y) passed along the Markov tree {Z,T,S} – {Z,Y,S} – {Y,S,R} – {Y,X}]

MT node contents, projected onto the common variables, are passed to the neighbors.
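The projection-and-pass step can be sketched as follows; potentials are stored as dicts over binary assignments, and the potential values are made up.

```python
from itertools import product

def send_message(potential, variables, separator):
    """Marginalize `potential` (a dict keyed by assignments to the
    ordered `variables`) onto the `separator` variables shared
    with the receiving neighbor."""
    idx = [variables.index(v) for v in separator]
    message = {}
    for assignment, value in potential.items():
        key = tuple(assignment[i] for i in idx)
        message[key] = message.get(key, 0.0) + value
    return message

# Potential of the MT node {Z,Y,S}; its neighbor shares {Y,S}.
phi = {a: 1.0 for a in product([0, 1], repeat=3)}
msg = send_message(phi, ["Z", "Y", "S"], ["Y", "S"])
```

Each local operation only touches one clique potential, which is why reasoning in Markov trees is efficient once the tree (with small cliques) exists.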
Triangulability test – triangulation not always possible

[Figure: in the elimination test, all neighbors of the eliminated node need to be connected]
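The test sketched on this slide can be implemented as greedy simplicial elimination: a graph is triangulated (chordal) iff its nodes can be removed one by one, each having all of its current neighbors pairwise connected. This is a generic sketch, not the talk's own procedure.

```python
from itertools import combinations

def is_triangulated(adj):
    """Greedy simplicial-elimination test for chordality.
    adj: dict mapping each node to the set of its neighbors."""
    adj = {v: set(ns) for v, ns in adj.items()}   # work on a copy
    while adj:
        simplicial = next(
            (v for v, ns in adj.items()
             if all(b in adj[a] for a, b in combinations(ns, 2))),
            None)
        if simplicial is None:
            return False                          # a chordless cycle remains
        for n in adj[simplicial]:
            adj[n].discard(simplicial)
        del adj[simplicial]
    return True

square = {"A": {"B", "D"}, "B": {"A", "C"},
          "C": {"B", "D"}, "D": {"C", "A"}}       # 4-cycle, no chord
with_chord = {"A": {"B", "D", "C"}, "B": {"A", "C"},
              "C": {"B", "D", "A"}, "D": {"C", "A"}}
```

The chordless 4-cycle fails (no node has fully connected neighbors), while adding the chord A–C makes every elimination step succeed.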
Evidence absorption
reasoning
[Figure: evidence absorption and edge reversal applied to the example network over S, T, R, Z, Y, X]

Efficient only for a lucky selection of conditioning variables.
Cutset reasoning – fixing
values of some nodes creates
a (poly)tree
[Figure: one node's value is fixed, hence an edge becomes ignorable and the remaining network is a (poly)tree]
How to overcome the difficulty
when reasoning with BN
• Learn a triangulated graph or Markov tree directly from data (Cercone N., Wong S.K.M., Xiang Y.)
  • hard and inefficient for long dependence chains; danger of large hypernodes
• Learn only tree-structured/polytree-structured BNs (e.g. in Goldberg's Bayesian genetic algorithms, TAN text classifiers, etc.)
  • oversimplification; long dependence chains are lost
• Our approach: propose a more general class of Bayesian networks that is still efficient for reasoning
What is a structured Bayesian
network
• An analogue of well-structured programs
• Graphical structure: nested sequences and alternatives
• By collapsing sequences and alternatives to single nodes, a single node is obtainable
• Efficient reasoning is possible
Structured Bayesian Network
(SBN), an example
[Figure: an example SBN and, for comparison, a tree-structured BN]
SBN collapsing
[Figure: collapsing an SBN step by step]
SBN construction steps
[Figure: SBN construction steps; a marked edge means 0, 1 or 2 arrows]
Reasoning in SBN
• Reasoning proceeds either directly in the structure, or after an easy transformation to a Markov tree
• Direct reasoning consists of:
  • a forward step (leaf-node/root-node valuation calculation)
  • a backward step (intermediate-node valuation calculation)
Reasoning in SBN forward
step
[Figure: the forward step on example fragments, computing the valuations P(B|A) and P(B|C,E); a marked edge means 0, 1 or 2 arrows]
Reasoning in SBN backward
step: local context
[Figure: four local-context cases (a)–(d) over nodes A, B, C, D; the joint distribution of A,B is known, and the joint of C,D (or of C alone) is sought]
Reasoning in SBN – backward
step: local reasoning
[Figure: local reasoning – messages Msg1(A,B) and Msg2(A,B) = P(A)·P(B|A,D) are passed between the MT nodes {A,B,...} and {A,B,C,D}; one of the messages is not needed]
SBN – towards a MT

[Figures: three intermediate steps in transforming an SBN toward a Markov tree]
Towards a Markov tree – an example

[Figures: two stages of the transformation on an example SBN over the nodes A through S]
Markov tree from SBN
[Figure: the resulting Markov tree with nodes {K,L,R}, {A,B,I}, {B,C,D,I}, {L,M,N,R}, {F,G,I}, {C,D,E,I}, {G,H,I}, {D,E,I}, {I,H,E,R}, {E,H,R,J}, {M,N,O,R}, {N,O,R}, {O,P,R}, {H,R,J}, {R,J,P}, {P,J,S}]
Structured Bayesian network – a Hierarchical
(Object-Oriented) Bayesian network
[Figure: the example SBN viewed as a hierarchical (object-oriented) Bayesian network]
Learning SBN from Data
• Define the DEP() measure as follows: DEP(Y,X) = P(x|y) − P(x|¬y)
• Define DEP²(Y,X) = (DEP(Y,X))²
• Construct a tree according to the Chow/Liu algorithm using DEP²(Y,X), with Y belonging to the tree and X not.
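The construction can be sketched as follows, assuming DEP(Y,X) = P(x|y) − P(x|¬y) squared as the edge weight and a simple Prim-style greedy tree in place of the full Chow/Liu procedure; the joint table and variable names are invented.

```python
from itertools import product

def dep2(joint, variables, y, x):
    """(P(X=1|Y=1) - P(X=1|Y=0))^2 computed from a joint table."""
    iy, ix = variables.index(y), variables.index(x)
    def p_x1_given(yval):
        num = sum(p for a, p in joint.items() if a[iy] == yval and a[ix] == 1)
        den = sum(p for a, p in joint.items() if a[iy] == yval)
        return num / den
    return (p_x1_given(1) - p_x1_given(0)) ** 2

def grow_tree(joint, variables):
    """Greedily attach the outside node X with the strongest DEP^2
    link to a node Y already in the tree (Chow/Liu-style)."""
    in_tree = [variables[0]]
    edges = []
    while len(in_tree) < len(variables):
        y, x = max(((a, b) for a in in_tree
                    for b in variables if b not in in_tree),
                   key=lambda e: dep2(joint, variables, e[0], e[1]))
        edges.append((y, x))
        in_tree.append(x)
    return edges

# A is strongly coupled to B; C is independent of both.
variables = ["A", "B", "C"]
joint = {(a, b, c): {(0, 0): 0.45, (0, 1): 0.05,
                     (1, 0): 0.05, (1, 1): 0.45}[(a, b)] * 0.5
         for a, b, c in product([0, 1], repeat=3)}
edges = grow_tree(joint, variables)
```

The strong A–B dependence is picked first; C is then attached by a zero-weight edge, which the loop-edge refinement described on the following slides would revisit.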
Continued ....
• Let us call all the edges obtained by the previous algorithm "free edges".
• During the construction process, the following additional edge types may appear: "node X loop unoriented edge", "node X loop oriented edge", and "node X loop transient edge".
• Repeat in a loop (until the termination condition below is satisfied):
  • For each pair of properly connected non-neighboring nodes, identify the unique connecting path between them.
Continued ....
• Two nodes are properly connected if the path between them consists either of free edges, or of oriented and unoriented (but not suspended) edges of the same loop, with no pair of oriented or transient-oriented edges pointing in different directions, and no transient edge pointing to one of the two connected points.
• Note that in this sense there is at most one path properly connecting two nodes.
Continued ....
• Connect by an edge the pair of non-neighboring nodes X, Y that maximizes DEP²(X,Y), taken as the minimum of the unconditional DEP and the conditional DEP given a direct successor of X on the path to Y.
• Identify the loop that has emerged from this operation.
Continued ....
• One of the following cases can occur:
  • (1) the loop consists entirely of free edges;
  • (2) it contains some unoriented loop edges, but no oriented edge;
  • (3) it contains at least one oriented edge.
• Depending on the case, give a proper status to the edges contained in the loop: "node X loop unoriented edge", "node X loop oriented edge", or "node X loop transient edge" (details in the written presentation).
Places of edge insertion
[Figure: possible places of edge insertion between nodes X and Y, illustrated on several example graphs over B, C, D, E, G, H]
Concluding Remarks
• a new class of Bayesian networks has been defined
• a completely new method of reasoning in Bayesian networks has been outlined:
  • local computation – at most 4 nodes involved
  • applicable to a more general class of networks than known reasoning methods
• the new class of Bayesian networks is easily transformed to Markov trees
• the new class of Bayesian networks is a kind of hierarchical or object-oriented Bayesian network
• it can be learned from data
THANK YOU