1. Introduction to Molecular Biology

6. Gene Regulatory Networks
EECS 600: Systems Biology & Bioinformatics
Instructor: Mehmet Koyuturk
Regulation of Gene Expression
Transcriptional Regulation of telomerase protein component gene hTERT
Genetic Regulation & Cellular Signaling
Organization of Genetic Regulation
[Figure: genetic network that controls flowering time in A. thaliana (Blazquez et al., EMBO Reports, 2001). Recoverable labels: gene, up-regulation, down-regulation, negative ligand-independent repression at chromatin level.]
Gene Regulatory Networks
Transcriptional Regulatory Networks
  Nodes with outgoing edges are limited to transcription factors
  Can be reconstructed by identifying regulatory motifs (through clustering of gene expression & sequence analysis) and finding transcription factors that bind to the corresponding promoters (through structural/sequence analysis)
Gene Regulatory Networks
Gene expression networks
  General model of genetic regulation
  Identify the regulatory effects of genes on each other, independent of the underlying regulatory mechanism
  Can be inferred from correlations in gene expression data, time-series gene expression data, and/or gene knock-out experiments
[Figure labels: Observation, Inference]
Boolean Network Model
Binary model: a gene has only two states
  ON (1): The gene is expressed
  OFF (0): The gene is not expressed
Each gene’s next state is determined by a Boolean function of the current states of a subset of other genes
A Boolean network $G(V, F)$ is specified by two sets
  Set of nodes (genes): $V = \{x_1, \ldots, x_n\}$
    State of a gene: $x_i \in \{0, 1\}$
  Collection of Boolean functions: $F = \{f_1, \ldots, f_n\}$
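The synchronous update rule of the Boolean network model can be sketched in a few lines of Python. The three-gene network and its rules below are hypothetical, chosen only for illustration; they are not taken from the lecture.

```python
# Hypothetical 3-gene Boolean network (illustration only):
#   x0(t+1) = x1(t) AND x2(t)
#   x1(t+1) = NOT x0(t)
#   x2(t+1) = x0(t) OR x1(t)
FUNCTIONS = [
    lambda s: s[1] & s[2],
    lambda s: 1 - s[0],
    lambda s: s[0] | s[1],
]


def step(state):
    """Synchronous update: every gene reads the same current state."""
    return tuple(f(state) for f in FUNCTIONS)


def trajectory(state, n_steps):
    """Return the list of states visited, starting from `state`."""
    states = [state]
    for _ in range(n_steps):
        state = step(state)
        states.append(state)
    return states


if __name__ == "__main__":
    for s in trajectory((0, 0, 0), 6):
        print(s)
```

Each state is a tuple of 0/1 values, one per gene, so a state is also a gene activity profile of the toy network.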
Logic Diagram

Cell cycle regulation
  Retinoblastoma (Rb) inhibits DNA synthesis
  Cyclin-dependent kinase 2 (cdk2) & cyclin E inactivate Rb to release the cell into S phase
    Up-regulated by the CAK complex and down-regulated by p21/WAF1
[Figure: logic diagram; recoverable label: p53]
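One possible Boolean reading of the cell cycle bullets is sketched below. The diagram itself is not recoverable from the transcript, so the exact gates are assumptions: cdk2 is taken to be active when CAK is present and p21/WAF1 is absent, and Rb is taken to be active unless cdk2 and cyclin E are both active. The function name `cell_cycle_logic` is hypothetical.

```python
def cell_cycle_logic(cak, p21, cyclin_e):
    """Evaluate an assumed logic circuit for the Rb / cdk2 / cyclin E bullets."""
    cdk2 = cak and not p21            # up-regulated by CAK, down-regulated by p21/WAF1
    rb = not (cdk2 and cyclin_e)      # cdk2 & cyclin E inactivate Rb
    dna_synthesis = not rb            # Rb inhibits DNA synthesis
    return {"cdk2": cdk2, "Rb": rb, "DNA synthesis": dna_synthesis}


if __name__ == "__main__":
    print(cell_cycle_logic(cak=True, p21=False, cyclin_e=True))
    print(cell_cycle_logic(cak=True, p21=True, cyclin_e=True))
```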
Wiring Diagram
Dynamics of Boolean Networks

Gene activity profile (GAP)
  Collection of the states of individual genes in the genome (network)
  The number of possible GAPs is $2^n$
The system ultimately transitions into attractor states
  Steady state (point) attractors
  Dynamic attractors: state cycle
  Each transient state is associated with an attractor (basins of attraction)
  In practice, only a small number of GAPs correspond to attractors
  What is the biological meaning of an attractor?
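For a small network, the attractors and their basins can be found by brute force over all 2^n gene activity profiles. The sketch below uses a hypothetical three-gene network (x0' = x1, x1' = x0, x2' = x0 AND x1) that happens to have two point attractors and one state cycle; exhaustive enumeration is only feasible for small n.

```python
from itertools import product


def step(s):
    # Hypothetical rules: x0' = x1, x1' = x0, x2' = x0 AND x1
    return (s[1], s[0], s[0] & s[1])


def find_attractor(state):
    """Follow the trajectory until a state repeats; return the cycle it enters."""
    seen = []
    while state not in seen:
        seen.append(state)
        state = step(state)
    return frozenset(seen[seen.index(state):])


def basins(n_genes):
    """Map each attractor to the list of GAPs that flow into it."""
    result = {}
    for gap in product((0, 1), repeat=n_genes):
        result.setdefault(find_attractor(gap), []).append(gap)
    return result


if __name__ == "__main__":
    for attractor, basin in basins(3).items():
        kind = "point" if len(attractor) == 1 else "cycle"
        print(kind, sorted(attractor), "basin size:", len(basin))
```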
State Space of Boolean Networks


Equate cellular states with attractors
Attractor states are stable under small perturbations
  Most perturbations cause the network to flow back to the attractor
  Some genes are more important and changing their activation can cause the system to transition to a different attractor
This slide is taken from the presentation by I. Shmulevich
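The effect of a single-gene perturbation can be checked by flipping one gene in an attractor state and following the trajectory. The sketch below reuses a hypothetical three-gene network (x0' = x1, x1' = x0, x2' = x0 AND x1). In such a tiny toy network most flips are not absorbed, so it illustrates the mechanics rather than the robustness claimed for real networks.

```python
def step(s):
    # Hypothetical rules: x0' = x1, x1' = x0, x2' = x0 AND x1
    return (s[1], s[0], s[0] & s[1])


def attractor_of(state):
    """Return the attractor (set of states) that `state` eventually reaches."""
    seen = []
    while state not in seen:
        seen.append(state)
        state = step(state)
    return frozenset(seen[seen.index(state):])


def flip(state, i):
    """Perturb gene i by toggling its state."""
    return state[:i] + (1 - state[i],) + state[i + 1:]


def single_flip_outcomes(attractor_state):
    """For each gene, report whether the network flows back after flipping it."""
    home = attractor_of(attractor_state)
    return {i: attractor_of(flip(attractor_state, i)) == home
            for i in range(len(attractor_state))}


if __name__ == "__main__":
    print(single_flip_outcomes((1, 1, 1)))
```

In this toy network, flipping x2 is absorbed, while flipping x0 or x1 moves the system into the basin of the state cycle.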
Identification of Boolean Networks

We have the “truth table” available
  Binarize time-series gene expression data
REVEAL
  Use mutual information to derive logical rules that determine each variable
    If the mutual information between a set of variables and the target variable is equal to the entropy of that variable, then that set of variables completely determines the target variable
  For each variable, consider functions consisting of 1 variable, then 2, then 3, …, then i, …, until one is found
    Once the minimum set of variables that determine a variable is found, we can infer the function from the truth table
    In general, the indegrees of genes in the network are small
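The search described above can be sketched as follows. This is a minimal illustration of the mutual-information test, not the published REVEAL implementation; the function name `reveal_inputs`, the `max_k` cap, and the input format (a list of `(current_state, next_state)` pairs of binarized tuples) are assumptions.

```python
from collections import Counter
from itertools import combinations
from math import isclose, log2


def entropy(samples):
    """Empirical Shannon entropy (in bits) of a list of hashable values."""
    n = len(samples)
    return -sum(c / n * log2(c / n) for c in Counter(samples).values())


def mutual_information(xs, ys):
    """M(X, Y) = H(X) + H(Y) - H(X, Y)."""
    return entropy(xs) + entropy(ys) - entropy(list(zip(xs, ys)))


def reveal_inputs(transitions, target, max_k=3):
    """Find the smallest input set whose mutual information with the target's
    next state equals the entropy of that next state; return (genes, table)."""
    n_genes = len(transitions[0][0])
    nxt = [after[target] for _, after in transitions]
    h = entropy(nxt)
    for k in range(1, max_k + 1):           # 1 variable, then 2, then 3, ...
        for genes in combinations(range(n_genes), k):
            inputs = [tuple(before[g] for g in genes) for before, _ in transitions]
            if isclose(mutual_information(inputs, nxt), h, abs_tol=1e-9):
                return genes, dict(zip(inputs, nxt))  # read rule off the truth table
    return None
```

A target whose next state never changes has zero entropy, so any single gene passes the test; a practical version would need to handle that case, along with noisy data.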
REVEAL
Limitations of Boolean Networks



The effect of intermediate gene expression levels is ignored
It is assumed that the transitions between states are synchronous
A model incorporates only a partial description of a physical system
  Noise
  Effects of other factors
One may wish to model an open system
  A particular external condition may alter the parameters of the system
Boolean networks are inherently deterministic
Probabilistic Models

Stochasticity can account for
  Noise
  Variability in the biological system
  Aspects of the system that are not captured by the model
Random variables include
  Observed attributes
    Expression level of a particular gene in a particular sample
  Hidden attributes
    The Boolean function assigned to a gene?
Probabilistic Boolean Networks

Each gene is associated with multiple Boolean functions
  Each function is associated with a probability
Can characterize the stochastic behavior of the system
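A probabilistic Boolean network can be sketched by attaching a list of (probability, function) pairs to each gene. The network below is hypothetical, and the sketch assumes that each gene's function is selected independently of the others at every step.

```python
import random

# Hypothetical PBN: gene index -> list of (selection probability, Boolean function).
PBN = {
    0: [(0.7, lambda s: s[1] & s[2]), (0.3, lambda s: s[1])],
    1: [(1.0, lambda s: 1 - s[0])],
    2: [(0.5, lambda s: s[0] | s[1]), (0.5, lambda s: s[0])],
}


def pbn_step(state, rng):
    """Sample one synchronous transition: pick a function per gene, then apply it."""
    nxt = []
    for gene in sorted(PBN):
        probs, funcs = zip(*PBN[gene])
        f = rng.choices(funcs, weights=probs, k=1)[0]
        nxt.append(f(state))
    return tuple(nxt)


def transition_probability(state, target):
    """Exact P(target | state) under independent function selection."""
    p = 1.0
    for gene in sorted(PBN):
        p *= sum(prob for prob, f in PBN[gene] if f(state) == target[gene])
    return p


if __name__ == "__main__":
    rng = random.Random(0)
    state = (0, 1, 0)
    for _ in range(5):
        state = pbn_step(state, rng)
        print(state)
```

The transition probabilities define a Markov chain over the gene activity profiles, which is one way to characterize the stochastic behavior of the system.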
Bayesian Networks


A Bayesian network is a representation of a joint probability distribution
A Bayesian network $B = (G, \Theta)$ is specified by two components
  A directed acyclic graph $G$, in which directed edges represent the conditional dependence between expression levels of genes (represented by nodes of the graph)
    Gene A is gene B’s parent if there is a directed edge from A to B
  A function $\Theta$ that specifies the conditional distribution of the expression level of each gene, given the expression levels of its parents
    $P(B \mid \mathrm{Pa}(B)) = \Theta(B, \mathrm{Pa}(B))$
Conditional Independence

In a Bayesian network, if there is no direct edge between two genes, then these genes are said to be conditionally independent
The probability of observing a cellular state (configuration of expression levels) can be decomposed into product form
  $P(x_1, \ldots, x_n) = \prod_{i=1}^{n} P(x_i \mid \mathrm{Pa}(x_i))$
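The product form means the probability of a cellular state is the product of each gene's conditional probability given its parents. A minimal sketch for binary genes follows; the network A -> B, A -> C and all probability values are hypothetical.

```python
# Hypothetical network A -> B, A -> C with binary genes (0 = OFF, 1 = ON).
PARENTS = {"A": (), "B": ("A",), "C": ("A",)}

# CPT[gene][parent values] = P(gene = 1 | parents); values are made up.
CPT = {
    "A": {(): 0.3},
    "B": {(0,): 0.1, (1,): 0.8},
    "C": {(0,): 0.5, (1,): 0.9},
}


def joint_probability(state):
    """P(state) as the product of P(gene | parents) over all genes."""
    p = 1.0
    for gene, parents in PARENTS.items():
        p_on = CPT[gene][tuple(state[q] for q in parents)]
        p *= p_on if state[gene] == 1 else 1.0 - p_on
    return p


if __name__ == "__main__":
    print(joint_probability({"A": 1, "B": 1, "C": 0}))
```

Three genes would need 7 free parameters for an unrestricted joint distribution; this factorization needs only 5.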
Variables in Bayesian Network

Discrete variables
  Again, genes’ expression levels are modeled as ON and OFF (or more discrete levels)
  If a gene has k parents in the network, then the conditional distribution is characterized by on the order of $r^k$ parameters, one distribution per joint configuration of the parents (r is the number of discrete levels)
Continuous variables
  Real valued expression levels
  We have to specify multivariate continuous distribution functions
  Linear Gaussian distribution: $P(X \mid u_1, \ldots, u_k) \sim N\left(a_0 + \sum_i a_i u_i,\ \sigma^2\right)$
Hybrid networks
Equivalence Classes of Bayesian Nets

Observe that each network structure implies a set of independence assumptions
  Given its parents, each variable is independent of its non-descendants
More than one graph can imply exactly the same set of independencies (e.g., X->Y and Y->X)
  Such graphs are said to be equivalent
By looking at observations of a distribution, we cannot distinguish between equivalent graphs
  An equivalence class can be uniquely represented by a partially directed graph (some edges are undirected)
Learning Bayesian Networks

Given a training set $D = \{x_1, x_2, \ldots, x_m\}$ of m independent instances of the n random variables, find an equivalence class of networks $B = (G, \Theta)$ that best matches D
  The x’s are the gene expression profiles
Based on Bayes’ formula, the posterior probability of a network given the data can be evaluated as
  $S(G : D) = \log P(G \mid D) = \log P(D \mid G) + \log P(G) + C$
where C is a constant (independent of G) and
  $P(D \mid G) = \int P(D \mid G, \Theta)\, P(\Theta \mid G)\, d\Theta$
is the marginal likelihood that averages the probability of data over all possible parameter assignments to G
Learning Algorithms


The Bayes score S(G : D) depends on the particular choice of priors $P(G)$ and $P(\Theta \mid G)$
The priors can be chosen to be
  structure equivalent, so that equivalent networks will have the same score
  decomposable, so that the score can be represented as the superposition of contributions of each gene
The problem becomes finding the optimal structure (G)
  We can estimate the gain associated with addition, removal, and reversal of an edge
  Then, we can use greedy-like heuristics (e.g., hill climbing)
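A greedy structure search over edge additions, removals, and reversals can be sketched as below for binary data. Note the substitution: instead of the Bayesian score with explicit priors, this sketch uses a BIC-style score (log-likelihood minus a complexity penalty), which is also decomposable into per-gene terms. All function names are hypothetical, and for brevity the total score is recomputed for each candidate rather than updating only the affected per-gene terms.

```python
from collections import Counter
from itertools import permutations
from math import log


def local_score(data, gene, parents):
    """BIC-style contribution of one gene given its parent list (binary data)."""
    joint = Counter((tuple(row[p] for p in parents), row[gene]) for row in data)
    marg = Counter(tuple(row[p] for p in parents) for row in data)
    loglik = sum(c * log(c / marg[cfg]) for (cfg, _), c in joint.items())
    return loglik - 0.5 * log(len(data)) * (2 ** len(parents))


def total_score(data, parents):
    """Decomposable score: sum of per-gene contributions."""
    return sum(local_score(data, g, sorted(ps)) for g, ps in parents.items())


def is_acyclic(parents):
    """Repeatedly strip genes with no remaining parents; fail if none exist."""
    remaining = {g: set(ps) for g, ps in parents.items()}
    while remaining:
        roots = [g for g, ps in remaining.items() if not ps]
        if not roots:
            return False
        for g in roots:
            del remaining[g]
        for ps in remaining.values():
            ps.difference_update(roots)
    return True


def neighbors(parents):
    """Yield graphs that differ by one edge addition, removal, or reversal."""
    for a, b in permutations(parents, 2):
        new = {g: set(ps) for g, ps in parents.items()}
        if a in parents[b]:
            new[b].discard(a)                      # remove a -> b
            yield new
            rev = {g: set(ps) for g, ps in new.items()}
            rev[a].add(b)                          # reverse to b -> a
            yield rev
        elif b not in parents[a]:
            new[b].add(a)                          # add a -> b
            yield new


def hill_climb(data, n_genes):
    """First-improvement hill climbing from the empty graph."""
    parents = {g: set() for g in range(n_genes)}
    best = total_score(data, parents)
    improved = True
    while improved:
        improved = False
        for cand in neighbors(parents):
            if is_acyclic(cand) and total_score(data, cand) > best + 1e-9:
                parents, best, improved = cand, total_score(data, cand), True
                break
    return parents, best
```

Because the score is the same for equivalent graphs, the direction of a recovered edge such as 0 -> 1 versus 1 -> 0 depends only on the search order, not on the data.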
Causal Patterns

Bayesian networks model dependencies between multiple measurements
  How about the mechanism that generated these measurements?
Causal network model: Flow of causality
  Model not only the distribution of observations, but also the effects of interventions
  If gene X codes for a transcription factor of gene Y, manipulating X will affect Y, but not vice versa
  But in Bayesian networks, X->Y and Y->X are equivalent
  Intervention experiments (as compared to passive observation): Knock X out, then measure Y
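The difference between the two equivalent structures under a knock-out can be made concrete with a two-gene example. The probabilities below are hypothetical; both structures are assumed to encode the same observational joint distribution, and the knock-out is modeled as forcing X = 0 while cutting any edge into X.

```python
def p_y_after_knockout(structure, p_x=0.5, p_y_given_x=(0.1, 0.9)):
    """P(Y = 1) after forcing X = 0, under two observationally equivalent graphs.

    p_x is P(X = 1); p_y_given_x[v] is P(Y = 1 | X = v) in the observed data.
    """
    if structure == "X->Y":
        # Y still depends on X, which is now fixed at 0.
        return p_y_given_x[0]
    if structure == "Y->X":
        # X no longer listens to Y, and Y keeps its observational marginal.
        return (1 - p_x) * p_y_given_x[0] + p_x * p_y_given_x[1]
    raise ValueError(f"unknown structure: {structure}")


if __name__ == "__main__":
    for g in ("X->Y", "Y->X"):
        print(g, p_y_after_knockout(g))
```

Passive observation cannot separate the two graphs, but measuring Y after knocking X out can, because the predicted values differ.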
Dynamic Bayesian Networks

Dependencies do not uncover temporal relationships
  Gene expression varies over time
Dynamic Bayesian Networks model the dependency between a gene’s expression level at time t and expression levels of parent genes at time t-1
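For binarized time-series data, the conditional distribution of a gene at time t given its parents at time t-1 can be estimated by counting transitions. A minimal sketch, assuming the parent set is already known; the function name `estimate_transition_cpt` is hypothetical.

```python
from collections import defaultdict


def estimate_transition_cpt(series, gene, parents):
    """Estimate P(gene(t) = 1 | parents(t-1)) from a time-ordered list of
    binarized expression tuples. Unobserved parent configurations are omitted."""
    counts = defaultdict(lambda: [0, 0])          # cfg -> [n_off, n_on]
    for prev, cur in zip(series, series[1:]):
        counts[tuple(prev[p] for p in parents)][cur[gene]] += 1
    return {cfg: on / (off + on) for cfg, (off, on) in counts.items()}


if __name__ == "__main__":
    series = [(0, 0), (1, 0), (1, 1), (0, 1), (0, 0), (1, 0)]
    print(estimate_transition_cpt(series, gene=1, parents=(0,)))
```

In this made-up series, gene 1 at time t simply copies gene 0 at time t-1, so the estimated table is deterministic.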
Linear Additive Regulation Model

The expression level of a gene at a certain time point can be calculated by the weighted sum of the expression levels of all genes in the network at a previous time point
  $e_i(t) = \sum_j w_{ij}\, e_j(t-1) + \sum_k n_{ik}\, u_k(t-1) + b_i$
  $e_i$: expression level of gene i
  $w_{ij}$: effect of gene j on gene i
  $u_k$: kth external variable
  $n_{ik}$: effect of kth external variable on gene i
  $b_i$: gene-specific bias
Can be fitted using linear regression
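The regression fit can be sketched with ordinary least squares in NumPy, assuming the update takes the form e(t) = W e(t-1) + N u(t-1) + b. The function name `fit_linear_additive` and the array layout (rows are time points) are assumptions made for this illustration.

```python
import numpy as np


def fit_linear_additive(E, U):
    """Least-squares fit of e(t) = W e(t-1) + N u(t-1) + b.

    E: (T, n) expression levels over T time points.
    U: (T, K) external variables over the same time points.
    Returns W (n, n), N (n, K), b (n,).
    """
    T, n = E.shape
    X = np.hstack([E[:-1], U[:-1], np.ones((T - 1, 1))])  # regressors at t-1
    Y = E[1:]                                             # targets at t
    coef, *_ = np.linalg.lstsq(X, Y, rcond=None)          # shape (n + K + 1, n)
    W = coef[:n].T
    N = coef[n:-1].T
    b = coef[-1]
    return W, N, b
```

With n genes and K external variables there are n + K + 1 unknowns per gene, so the fit is only determined when the number of time points exceeds that; with short time series, some form of regularization or sparsity constraint would be needed.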