Dependency networks
Sushmita Roy
BMI/CS 576
[email protected]
Nov 25th, 2014
RECAP
• Probabilistic graphical models provide a natural way
to represent biological networks
• So far we have seen Bayesian networks:
– Sparse candidates
– Module networks
• Today we will focus on dependency networks
What you should know
• What are dependency networks?
• How do they differ from Bayesian networks?
• GENIE3 algorithm for learning a dependency network
from expression data
• Different ways to represent conditional distributions
• Evaluation of various network inference methods
Graphical models for representing regulatory networks
• Bayesian networks
• Dependency networks
Random variables
encode expression levels
Edges correspond to some form of statistical dependencies
Dependency network
• A type of probabilistic graphical model
• As in Bayesian networks, it has
– A graph component describing the dependency structure
between random variables
– Each variable Xj is associated with a prediction function fj
to predict Xj from the state of its neighbors
• Unlike a Bayesian network
– Can have cyclic dependencies
Dependency Networks for Inference, Collaborative Filtering and Data visualization
Heckerman, Chickering, Meek, Rounthwaite, Kadie 2000
Xi: ith random variable
X={X1,.., Xp}: set of p random variables
xik: An assignment of Xi in the kth sample
x-ik: Set of assignments to all variables other than Xi
in the kth sample
Learning dependency networks
• fj can be of different types.
• Learning requires estimation of each of
the fj functions
• In all cases learning requires us to
minimize an error of predicting Xj from
its neighborhood:
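Assuming squared error as the loss (a common choice when Xj is continuous; the exact form on the original slide is not recoverable here), the per-variable objective can be written in the notation above, with N samples, as:

```latex
\hat{f}_j = \arg\min_{f_j} \sum_{k=1}^{N} \left( x_{jk} - f_j\left(x_{-jk}\right) \right)^2
```

Each of the p functions is fit independently, which is why the resulting graph may contain cycles.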
Different representations of the fj function
• If Xj is continuous
– fj can be a linear function
– fj can be a regression tree
– fj can be a random forest
• An ensemble of trees
• If Xj is discrete
– fj can be a conditional probability table
– fj can be a conditional probability tree
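For the discrete case, a conditional probability table can be estimated by counting. The sketch below is a minimal illustration using maximum-likelihood counts without smoothing; the function name `estimate_cpt` and the toy binary data are assumptions made for the example, not part of the lecture.

```python
from collections import Counter, defaultdict

def estimate_cpt(samples, target, parents):
    """Estimate P(X_target | X_parents) from discrete samples by counting."""
    counts = defaultdict(Counter)
    for row in samples:
        key = tuple(row[p] for p in parents)  # joint state of the neighbors
        counts[key][row[target]] += 1
    cpt = {}
    for key, ctr in counts.items():
        total = sum(ctr.values())
        cpt[key] = {value: n / total for value, n in ctr.items()}
    return cpt

# Toy data (hypothetical): each row is one sample (X0, X1, X2), all binary.
samples = [
    (0, 0, 1), (0, 0, 0), (0, 1, 1), (0, 0, 1),
    (1, 1, 0), (1, 1, 1), (1, 1, 0), (1, 0, 0),
]
cpt_x1 = estimate_cpt(samples, target=1, parents=[0])
```

Here `cpt_x1[(0,)]` holds the distribution of X1 when its single neighbor X0 is 0. In a dependency network one such table would be estimated per variable.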
GENIE3: GEne Network Inference with
Ensemble of trees
• Solves a set of regression problems
– One per random variable
• Uses an Ensemble of regression trees to represent fj
– Models non-linear dependencies
• Outputs a directed, cyclic graph with a confidence for
each edge
• Focuses on generating a ranking over edges rather than
a graph structure and parameters
Inferring Regulatory Networks from Expression Data Using Tree-Based Methods
Van Anh Huynh-Thu, Alexandre Irrthum, Louis Wehenkel, Pierre Geurts, PLoS ONE 2010
Recall our very simple regression tree example
[Figure: regression tree with split tests X2 > e1 and X2 > e2]
An Ensemble of trees
• A single tree is prone to “overfitting”
• Instead of learning a single tree, Ensemble models
make use of a collection of trees
A Random forest: An Ensemble of Trees
[Figure: a forest of trees t1, ..., tT, each made of split nodes and leaf nodes]
– Prediction is the average of the predictions of the individual trees
Taken from ICCV09 tutorial by Kim, Shotton and Stenger:
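For regression, the ensemble prediction of a forest with T trees can be written as a plain average. The symbols f_t (prediction of tree t) and \hat{y} are notation introduced here for illustration:

```latex
\hat{y}(x) = \frac{1}{T} \sum_{t=1}^{T} f_t(x)
```

Averaging over many trees grown on perturbed versions of the data reduces the variance of the prediction relative to a single tree.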
GENIE3 algorithm sketch
• For each Xj, generate learning samples of
input/output pairs
– LSj={(x-jk,xjk), k=1..N}
– On each LSj learn fj to predict the value of Xj
– fj is either a Random forest or Extra trees
– Estimate wij for all genes i ≠ j
• wij quantifies the confidence of the edge between Xi and Xj
• Generate a global ranking of edges based on each wij
Note that depending on the interpretation of the weights wi,j,
their aggregation to get a global ranking of regulatory links is not
trivial. We will see in the context of tree-based methods that it
requires normalizing each expression vector appropriately.
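Each interior node of a tree tests one input
variable (selected in x-j), trying to reduce as much as possible the
variance of the output variable (xj) in the resulting subsets of
samples. Candidate splits for numerical variables
compare the input variable values with a threshold
determined during the tree growing.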
GENIE3 algorithm sketch
Figure 1. GENIE3 procedure. For each gene j = 1, ..., p, a learning sample LSj is generated with expression levels of j
as output values and expression levels of all other genes as input values. A function fj is learned from LSj and a local
ranking of all genes except j is computed. The p local rankings are then aggregated to get a global ranking of all regulatory links.
Figure from Huynh-Thu et al.
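The procedure can be sketched end to end in a few dozen lines. This is a simplified illustration, not the authors' implementation: it uses a small bagged ensemble of depth-limited trees, an exhaustive threshold search, and K = p-1 candidate variables per split. All function names, the default parameters, and the synthetic three-gene data are assumptions made for the example.

```python
import random

def variance(ys):
    """Population variance of a list of numbers (0.0 for an empty list)."""
    if not ys:
        return 0.0
    mean = sum(ys) / len(ys)
    return sum((y - mean) ** 2 for y in ys) / len(ys)

def best_split(rows, ys, candidates):
    """Return (gain, variable, threshold) maximizing the variance reduction."""
    best = (0.0, None, None)
    total = len(ys) * variance(ys)
    for var in candidates:
        values = sorted(set(r[var] for r in rows))
        for lo, hi in zip(values, values[1:]):
            thr = (lo + hi) / 2
            left = [y for r, y in zip(rows, ys) if r[var] <= thr]
            right = [y for r, y in zip(rows, ys) if r[var] > thr]
            gain = total - len(left) * variance(left) - len(right) * variance(right)
            if gain > best[0]:
                best = (gain, var, thr)
    return best

def grow(rows, ys, inputs, k, depth, importance, rng):
    """Grow one tree recursively, adding each split's gain to importance[var]."""
    if depth == 0 or len(ys) < 4:
        return
    gain, var, thr = best_split(rows, ys, rng.sample(inputs, k))
    if var is None:
        return
    importance[var] += gain
    for keep in (lambda v: v <= thr, lambda v: v > thr):
        sub = [(r, y) for r, y in zip(rows, ys) if keep(r[var])]
        grow([r for r, _ in sub], [y for _, y in sub],
             inputs, k, depth - 1, importance, rng)

def genie3_sketch(data, n_trees=20, depth=3, seed=0):
    """Return w[i][j]: importance of gene i for predicting gene j."""
    rng = random.Random(seed)
    n, p = len(data), len(data[0])
    w = [[0.0] * p for _ in range(p)]
    for j in range(p):
        col = [row[j] for row in data]
        sd = variance(col) ** 0.5 or 1.0      # guard against constant genes
        ys = [y / sd for y in col]            # unit-variance output
        inputs = [i for i in range(p) if i != j]
        importance = {i: 0.0 for i in inputs}
        for _ in range(n_trees):
            idx = [rng.randrange(n) for _ in range(n)]  # bootstrap sample
            grow([data[i] for i in idx], [ys[i] for i in idx],
                 inputs, len(inputs), depth, importance, rng)
        for i in inputs:
            w[i][j] = importance[i] / n_trees  # average over the ensemble
    return w

# Synthetic data (hypothetical): gene 1 follows gene 0, gene 2 is independent.
rng = random.Random(1)
data = []
for _ in range(60):
    x0, x2 = rng.random(), rng.random()
    data.append((x0, x0 + 0.01 * rng.random(), x2))
w = genie3_sketch(data)
edges = sorted(((w[i][j], i, j) for i in range(3) for j in range(3) if i != j),
               reverse=True)
```

The sorted `edges` list plays the role of the global ranking: on this toy data the top-ranked edge connects genes 0 and 1, while edges involving the independent gene 2 receive much lower weights.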
Learning fj in GENIE3
• Random forest or Extra Trees to represent the fj
• Learning the Random forest
– Generate M=1000 bootstrap samples
– At each node to be split, search for best split among K randomly
selected variables
– K was set to p-1 or (p-1)^(1/2), where p is the number of genes
• Learning the Extra-Trees
– Learn 1000 trees
– Each tree is built from the original learning sample
– At each test node, the best split is determined among K random
splits, each determined by randomly selecting one input
(without replacement) and a threshold
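The two ensembles differ mainly in how a split is chosen at each node. The sketch below contrasts the two rules on a single node, under the assumption that data is stored column-wise; the function names and the toy data are illustrative only, and bootstrapping (used by the Random forest but not by Extra-Trees) is left out.

```python
import random

def variance_gain(xs, ys, thr):
    """#S*Var(S) - #St*Var(St) - #Sf*Var(Sf) for the test x <= thr."""
    def nvar(values):  # size times variance, i.e. sum of squared deviations
        if not values:
            return 0.0
        mean = sum(values) / len(values)
        return sum((y - mean) ** 2 for y in values)
    left = [y for x, y in zip(xs, ys) if x <= thr]
    right = [y for x, y in zip(xs, ys) if x > thr]
    return nvar(ys) - nvar(left) - nvar(right)

def random_forest_split(columns, ys, k, rng):
    """Best (gain, variable, threshold) over all thresholds of k random variables."""
    best = (0.0, None, None)
    for var in rng.sample(range(len(columns)), k):
        values = sorted(set(columns[var]))
        for lo, hi in zip(values, values[1:]):
            thr = (lo + hi) / 2
            cand = (variance_gain(columns[var], ys, thr), var, thr)
            best = max(best, cand, key=lambda b: b[0])
    return best

def extra_trees_split(columns, ys, k, rng):
    """Best of k random splits: one random threshold per randomly chosen variable."""
    best = (0.0, None, None)
    for var in rng.sample(range(len(columns)), k):  # without replacement
        thr = rng.uniform(min(columns[var]), max(columns[var]))
        cand = (variance_gain(columns[var], ys, thr), var, thr)
        best = max(best, cand, key=lambda b: b[0])
    return best

# Toy node (hypothetical): variable 0 separates the outputs perfectly.
columns = [[1, 2, 3, 4, 5, 6], [3, 1, 6, 2, 5, 4]]
ys = [0, 0, 0, 1, 1, 1]
rf = random_forest_split(columns, ys, k=2, rng=random.Random(0))
et = extra_trees_split(columns, ys, k=2, rng=random.Random(0))
```

The exhaustive search always finds the perfect split on variable 0 here, whereas the randomized rule only finds it if the drawn threshold happens to fall between 3 and 4. Extra-Trees trades split quality for speed and extra randomization.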
Computing the importance weight of a predictor
• Importance is computed at each interior node
• Remember there can be multiple interior nodes per predictor
• For an interior node, importance is given by the
reduction in variance if we make a split on that node
I(N) = #S Var(S) - #St Var(St) - #Sf Var(Sf)
N: Interior node
S: Set of data samples that reach this node
#S: Size of the set S
Var(S): variance of the output variable in set S
St: subset of S when a test at N is true
Sf: subset of S when a test at N is false
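The variance reduction at an interior node can be computed directly from the three sample sets defined above. A minimal sketch follows; the function names and the toy output values are assumptions made for illustration.

```python
def variance(values):
    """Population variance of a non-empty list of numbers."""
    mean = sum(values) / len(values)
    return sum((v - mean) ** 2 for v in values) / len(values)

def node_importance(s, s_true, s_false):
    """Variance reduction: #S*Var(S) - #St*Var(St) - #Sf*Var(Sf)."""
    return (len(s) * variance(s)
            - len(s_true) * variance(s_true)
            - len(s_false) * variance(s_false))

# Output values of the samples reaching the node, split by the node's test.
s_true = [1.0, 1.0, 3.0]
s_false = [7.0, 9.0]
s = s_true + s_false
importance = node_importance(s, s_true, s_false)
```

A split that separates low from high output values, as here, leaves little variance in each child and therefore earns a large importance.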
Computing the importance weight of a predictor
• For a single tree the overall importance is then the sum
over all points in the tree where this variable is used to
split
• For an ensemble the importance is averaged over all
trees
• To avoid bias towards highly variable genes,
normalize the gene expression values to all have unit
variance
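Putting the two steps together, the weight of the edge from Xi to Xj can be written as below. The notation is introduced here for illustration: T is the number of trees learned for fj, I(N) is the variance reduction at interior node N, and v(N) denotes the variable tested at N.

```latex
w_{ij} = \frac{1}{T} \sum_{t=1}^{T} \; \sum_{N \in t \,:\, v(N) = X_i} I(N)
```

Because I(N) scales with the variance of the output Xj, rescaling every gene to unit variance keeps the weights comparable across the p separate regression problems.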
Computational complexity of GENIE3
• Complexity per variable
O(TKN log N)
T is the number of trees
K is the number of random attributes selected per split
N is the learning sample size
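Since one ensemble is learned per variable, the total cost over all p variables follows by multiplying the per-variable cost by p. A short worked form, assuming the two settings of K used for the Random forest:

```latex
O(p \, T K N \log N)
\quad\Rightarrow\quad
\begin{cases}
O(p^{2} \, T N \log N) & \text{if } K = p - 1 \\
O(p^{3/2} \, T N \log N) & \text{if } K = \sqrt{p - 1}
\end{cases}
```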
Evaluation of network inference methods
• Assume we know what the “right” network is
• One can use Precision-Recall curves to evaluate the
predicted network
• Area under the PR curve (AUPR) quantifies
performance
Precision = # of correct edges / # of predicted edges
Recall = # of correct edges / # of true edges
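Given a confidence-ranked edge list and a known reference network, precision and recall can be computed after each prediction and the area under the resulting curve summed step-wise. The sketch below is one simple way to do this (an average-precision style sum rather than trapezoidal interpolation); the function names and the four-edge toy ranking are assumptions made for the example.

```python
def precision_recall_curve(ranked_edges, true_edges):
    """Recall and precision after each prefix of a confidence-ranked edge list."""
    true_edges = set(true_edges)
    correct = 0
    points = []
    for n_predicted, edge in enumerate(ranked_edges, start=1):
        if edge in true_edges:
            correct += 1
        recall = correct / len(true_edges)   # correct edges / true edges
        precision = correct / n_predicted    # correct edges / predicted edges
        points.append((recall, precision))
    return points

def aupr(points):
    """Area under the PR curve as a step-wise sum over recall increments."""
    area, prev_recall = 0.0, 0.0
    for recall, precision in points:
        area += (recall - prev_recall) * precision
        prev_recall = recall
    return area

# Toy example (hypothetical): edges ranked by decreasing confidence.
ranked = [("A", "B"), ("A", "C"), ("B", "C"), ("C", "D")]
truth = [("A", "B"), ("B", "C")]
points = precision_recall_curve(ranked, truth)
area = aupr(points)
```

Because the evaluation only needs the ranking, it applies directly to methods such as GENIE3 that output edge confidences rather than a single graph.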
Finally, although we exploited tree-based ensemble methods,
our framework is general and other feature selection techniques
could have been used as well.
AUPR-based performance comparison
Some comments about expression-based
network inference methods
• We have seen two types of algorithms to learn these
networks
– Per-gene methods
• Sparse candidate: learn regulators for individual genes
– Per-module methods
• Module networks: learn regulators for sets of genes/modules
– Other implementations of module networks exist
• LIRNET: Learning a Prior on Regulatory Potential from eQTL Data
– Su-In Lee et al., PLoS Genetics 2009
• LeMoNe: Learning Module Networks
– Michoel et al. 2007
Many implementations of per-gene methods
• Mutual Information
– Context Likelihood of relatedness (CLR)
• Probabilistic methods
– Bayesian network: Sparse Candidates
• Regression
DREAM: Dialogue for Reverse Engineering
Assessments and Methods
Community effort to assess regulatory network inference
DREAM 5 challenge
Previous challenges: 2006, 2007, 2008, 2009, 2010
Marbach et al. 2012, Nature Methods
Marbach et al., 2010
Where do different methods rank?
Comparing module (LeMoNe) and per-gene
(CLR) methods