Part II - Department of Computer Science and Engineering
Download
Report
Transcript Part II - Department of Computer Science and Engineering
AAAI 2014 Tutorial
Latent Tree Models
Part II: Definition and Properties
Nevin L. Zhang
Dept. of Computer Science & Engineering
The Hong Kong Univ. of Sci. & Tech.
http://www.cse.ust.hk/~lzhang
Part II: Concept and Properties
Latent Tree Models
Definition
Relationship with finite mixture models
Relationship with phylogenetic trees
Basic Properties
AAAI 2014 Tutorial Nevin L. Zhang HKUST
2
Basic Latent Tree Models (LTM)
Bayesian network
All variables are discrete
Structure is a rooted tree
Leaf nodes are observed (manifest
variables)
Internal nodes are not observed
(latent variables)
Parameters:
Also known as Hierarchical
latent class (HLC) models, HLC
models (Zhang. JMLR 2004)
P(Y1), P(Y2|Y1),P(X1|Y2), P(X2|Y2), …
Semantics:
AAAI 2014 Tutorial Nevin L. Zhang HKUST
3
Joint Distribution over Observed Variables
Marginalizing out the latent variables in
get a joint distribution over the observed variables
, we
.
In comparison with Bayesian network without latent variables, LTM:
Is computationally very simple to work with.
Represent complex relationships among manifest variables.
What does the structure look like without the latent variables?
AAAI 2014 Tutorial Nevin L. Zhang HKUST
4
Pouch Latent Tree Models (PLTM)
An extension of basic LTM
(Poon et al. ICML 2010)
Rooted tree
Internal nodes represent discrete latent variables
Each leaf node consists of one or more continuous observed variable,
called a pouch.
AAAI 2014 Tutorial Nevin L. Zhang HKUST
5
More General Latent Variable Tree Models
Some internal nodes can be observed
Internal nodes can be continuous
Forest
Primary focus of this tutorial: the basic LTM
(Choi et al. JMLR 2011)
AAAI 2014 Tutorial Nevin L. Zhang HKUST
6
Part II: Concept and Properties
Latent Tree Models
Definition
Relationship with finite mixture models
Relationship with phylogenetic trees
Basic Properties
AAAI 2014 Tutorial Nevin L. Zhang HKUST
7
Finite Mixture Models (FMM)
Gaussian Mixture Models (GMM): Continuous attributes
Graphical model
AAAI 2014 Tutorial Nevin L. Zhang HKUST
8
Finite Mixture Models (FMM)
GMM with independence assumption
Block diagonal co-variable matrix
Graphical Model
AAAI 2014 Tutorial Nevin L. Zhang HKUST
9
Finite Mixture Models
Latent class models (LCM): Discrete attributes
Graphical Model
Distribution for cluster k:
Product multinomial distribution:
All FMMs
One latent variable
Yielding one partition of data
AAAI 2014 Tutorial Nevin L. Zhang HKUST
10
From FMMs to LTMs
Start with several GMMs,
Each based on a distinct subset of attributes
Each partitions data from a certain
perspective.
Different partitions are independent of each
other
Link them up to form a tree model
Get Pouch LTM
Consider different perspectives in a single
model
Multiple partitions of data that are correlated.
AAAI 2014 Tutorial Nevin L. Zhang HKUST
11
From FMMs to LTMs
Start with several LCMs,
Each based on a distinct subset of attributes
Each partitions data from a certain perspective.
Different partitions are independent of each other
Link them up to form a tree model
Get LTM
Consider different perspectives in a single model
Multiple partitions of data that are correlated.
Summary: An LTM can be viewed as a collections of FMMs, with their
latent variables linked up to form a tree structure.
AAAI 2014 Tutorial Nevin L. Zhang HKUST
12
Part II: Concept and Properties
Latent Tree Models
Definition
Relationship with finite mixture models
Relationship with phylogenetic trees
Basic Properties
AAAI 2014 Tutorial Nevin L. Zhang HKUST
13
Phylogenetic trees
TAXA (sequences) identify species
Edge lengths represent evolution time
Usually, bifurcating tree topology
Durbin, et al. (1998). Biological Sequence Analysis: Probabilistic Models of Proteins and
Nucleic Acids. Cambridge University Press.
AAAI 2014 Tutorial Nevin L. Zhang HKUST
14
Probabilistic Models of Evolution
Two assumptions
There are only substitutions, no
insertions/deletions (aligned)
Each site evolves independently and
identically
P(x|y, t) = Pi=1 to m P(x(i) | y(i), t)
One-to-one correspondence between sites in
different sequences
m is sequence length
P(x(i)|y(i), t)
Jukes-Cantor (Character Evolution) Model [1969]
Rate of substitution a
AAAI 2014 Tutorial Nevin L. Zhang HKUST
15
Phylogenetic Trees are Special LTMs
When focus on one site, phylogenetic trees are special latent tree
models
The structure is a binary tree
The variables share the same state space.
Each conditional distribution is characterized by only one parameters, i.e., the
length of the corresponding edge
AAAI 2014 Tutorial Nevin L. Zhang HKUST
16
Hidden Markov Models
Hidden Markov models are also special latent tree models
All latent variables share the same state space.
All observed variables share the same state space.
P(yt |st ) and P(st+1 | st )
are the same for different t ’s.
AAAI 2014 Tutorial Nevin L. Zhang HKUST
17
Part II: Concept and Basic Properties
Latent Tree Models
Definition
Relationship with finite mixture models
Relationship with phylogenetic trees
Basic Properties
AAAI 2014 Tutorial Nevin L. Zhang HKUST
18
Two Concepts of Models
So far, a model consists of
Observed and latent variables
Connections among the variables
Probability values
For the rest of Part II, a model consists of
Observed and latent variables
Connections among the variables
Probability parameters
AAAI 2014 Tutorial Nevin L. Zhang HKUST
19
Model Inclusion
AAAI 2014 Tutorial Nevin L. Zhang HKUST
20
Model Equivalence
If m includes m’ and vice versa, then they are marginally
equivalent.
If they also have the same number of free parameters, then
they are equivalent.
It is not possible to distinguish between equivalent models
based on data.
AAAI 2014 Tutorial Nevin L. Zhang HKUST
21
Root Walking
AAAI 2014 Tutorial Nevin L. Zhang HKUST
22
Root Walking Example
Root walks to X2;
Root walks to X3
AAAI 2014 Tutorial Nevin L. Zhang HKUST
23
Root Walking
Theorem: Root walking leads to equivalent latent tree models.
(Zhang, JMLR 2004)
Special case of covered arc reversal in general Bayesian network,
Chickering, D. M. (1995). A transformational characterization of equivalent
Bayesian network structures. UAI.
AAAI 2014 Tutorial Nevin L. Zhang HKUST
24
Implication
Edge orientations in latent tree models are not identifiable.
Technically, better to start with alternative definition of LTM:
A latent tree model (LTM) is
a Markov random field over an undirected tree, or tree-structured Markov
network
where variables at leaf nodes are observed and variables at internal nodes
are hidden.
AAAI 2014 Tutorial Nevin L. Zhang HKUST
25
Implication
For technical convenience, we often root an LTM at one of its latent
nodes and regard it as a directed graphical model.
Rooting the model at different latent nodes lead to equivalent
directed models.
This is why we introduced LTM as directed models.
AAAI 2014 Tutorial Nevin L. Zhang HKUST
26
Regularity
|X|: Cardinality of variable X, i.e., the number of states.
AAAI 2014 Tutorial Nevin L. Zhang HKUST
27
Regularity
Can focus on regular models only
Irregular models can be made regular
Regularized models better than irregular models
(Zhang, JMLR 2004)
Theorem: The set of all regular models for a given set of observed
variables is finite.
AAAI 2014 Tutorial Nevin L. Zhang HKUST
28