Systems biology / Reconstruction and
modeling of large biological networks
Richard Notebaart
Seminar
• What is systems biology?
• How to reconstruct large biological networks/systems
• Methods to analyze large biological networks/systems
• Applying systems biology approaches to answer biological
questions
What is systems biology:
• fashionable catchword?
• a real new (philosophical) concept?
• new discipline in biology?
• just biology?
• ...
Systems concept
• A system represents a set of components together with the relations connecting them to form a unity
• Defining a system divides reality into the system itself and its environment
• The number of interconnections within a system is larger than the number of connections with the environment
• Systems can include other systems as part of their construction
• This allows complex systems to be put together from known simple ones (system of systems)
-> concept of modularity!
Systems levels
Ecosystem
Multicellular organisms
Organs
Tissues
Cells
Pathways
Proteins/genes
Systems theory
• The behavior of a system depends on:
• (Properties of the) components of the system
• The interactions between the components
THUS:
You cannot understand a system via pure reductionism (studying
the components in isolation)
Systems biology
• New? NO and YES
  • Systems theory and theoretical biology are old (publications of von Bertalanffy, 1933-1970)
  • Experimental and computational possibilities are new
Omics-revolution shifts paradigm to
large systems
- Integrative bioinformatics
- (Network) modeling
Reconstruction of networks from ~omics
for systems analysis
• Gene expression networks: based on microarray data and
clustering of genes with similar expression values over
different conditions (i.e. correlations).
• Protein-protein interaction networks: based on yeast two-hybrid approaches.
• Metabolic networks: network of interacting metabolites
through biochemical reactions.
How to reconstruct metabolic networks?
• Genome annotation allows for reconstruction:
• If an annotated gene codes for an enzyme, it can (in most
cases) be associated with a reaction
[Figure: genome-scale network linking genome, transcriptome, proteome and metabolome]
Reconstructed genome-scale networks
Species                     #Reactions   #Genes   Reference
Escherichia coli            2077         1260     Feist AM. et al. (2007), Mol. Syst. Biol.
Saccharomyces cerevisiae    1175         708      Förster J. et al. (2003), Genome Res.
Bacillus subtilis           1020         844      Oh YK. et al. (2007), J. Biol. Chem.
Lactobacillus plantarum     643          721      Teusink B. et al. (2006), J. Biol. Chem.
Human                       3673         1865     Duarte NC. et al. (2007), PNAS
… >30
Data visualization via Gene-Protein-Reaction
relations (formalized knowledge)
From network to model
The Modeling Ideal - A complete kinetic description
• Flux* through Rxn1 = f(pH, temp, concentration, regulators, …)
• Can model fluxes and concentrations over time
• Drawbacks
  • Lots of parameters
  • Measured in vitro (valid in vivo?)
  • Can be complex, ‘nasty’ equations
  • Nearly impossible to get all parameters at genome scale
*measure of turnover rate of substrates through a reaction (mmol $\mathrm{h^{-1}}$ $\mathrm{gDW^{-1}}$)
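To make "flux as a function of concentration" concrete, here is a minimal sketch of a single kinetic rate law. The Michaelis-Menten form is an assumption chosen for illustration (the slides do not name a specific rate law), and the values of `VMAX` and `KM` are hypothetical.

```python
def michaelis_menten(substrate, vmax, km):
    """Flux through one reaction as a function of substrate concentration."""
    return vmax * substrate / (km + substrate)

# Hypothetical parameters: vmax in mmol h^-1 gDW^-1, km and substrate in mM.
VMAX, KM = 10.0, 0.5

# At [S] = Km the flux is half of vmax.
half_max_flux = michaelis_menten(KM, VMAX, KM)

# At high substrate concentration the flux approaches vmax.
near_saturation = michaelis_menten(100 * KM, VMAX, KM)
```

Even this simplest case needs two measured parameters per reaction, which illustrates why a complete kinetic description is hard to obtain at genome scale.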
Theory vs. Genome-scale modeling
For genome-scale networks there is no detailed kinetic
description -> too many reactions involved!
Theory:
• Complete knowledge
• Solution is a single point
Genome-scale:
• Incomplete knowledge
• Solution is a space
[Figure: reactions A, B and C, with the solution drawn in a space spanned by Flux A, Flux B and Flux C for both cases]
Genome-scale modeling
• How to model genome-scale networks?
• We need:
• A metabolic reaction network
• Exchange reactions: link between environment and
reaction network (systems boundary)
• Constraints that limit network function:
• Mass balancing (conservation) of metabolites in
the system
• Exchange fluxes with environment
• ……
• Goal: prediction of growth and reaction fluxes
From network to constraint-based model
Mass balancing
• A system represents a set of components together with the relations connecting them to form a whole unity
• Defining a system divides reality into the system itself and its environment
Constraint-based modeling - Data structure
• Stoichiometric matrix S (Mass balancing):
1: metabolite produced in reaction
-1: metabolite consumed by reaction
0: metabolite not involved in reaction
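The data structure above can be sketched in a few lines. The four-reaction network used here (`EX_A`, `R1`, `R2`, `BIO`) is a hypothetical toy example, not one from the lecture, and `build_stoichiometric_matrix` is an illustrative helper name.

```python
import numpy as np

# Hypothetical toy network; negative = consumed, positive = produced.
reactions = {
    "EX_A": {"A": -1},            # exchange with the environment: A -> (outside)
    "R1":   {"A": -1, "B": 1},    # A -> B
    "R2":   {"A": -1, "B": 1},    # A -> B (alternative route)
    "BIO":  {"B": -1},            # B -> biomass (leaves the system)
}
metabolites = ["A", "B"]

def build_stoichiometric_matrix(reactions, metabolites):
    """Return S with one row per metabolite and one column per reaction."""
    S = np.zeros((len(metabolites), len(reactions)))
    for col, stoich in enumerate(reactions.values()):
        for met, coeff in stoich.items():
            S[metabolites.index(met), col] = coeff
    return S

S = build_stoichiometric_matrix(reactions, metabolites)
```

Real reactions can have coefficients other than 1 and -1 (for example 2 for a metabolite produced twice per reaction event); the same construction applies.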
Principles of Constraint-Based Analysis
• Steady-state assumption: for each metabolite in the network, write a
balance equation
[Figure: metabolite Xi produced by flux V1 and consumed by fluxes V2 and V3]
Flux balance on component Xi:
$V_1 = V_2 + V_3 \;\Rightarrow\; V_1 - V_2 - V_3 = 0$
• Result is a system of m equations (number of metabolites) and n
unknowns (fluxes)
Matrix notation: $S \cdot v = 0$
S = Stoichiometric matrix (m x n)
v = Metabolic reaction fluxes (n)
Normally, n > m, so the system is underdetermined
No unique solution!
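A small numerical check of the steady-state condition may help here. The matrix below describes a hypothetical toy network (two metabolites, four reactions), and the two flux vectors are made-up examples; the point is only that more than one flux vector satisfies the balance equations.

```python
import numpy as np

# Hypothetical network: rows = metabolites A, B; columns = EX_A, R1, R2, BIO.
S = np.array([[-1, -1, -1,  0],
              [ 0,  1,  1, -1]], dtype=float)

m, n = S.shape                      # m equations (metabolites), n unknowns (fluxes)
rank = np.linalg.matrix_rank(S)
degrees_of_freedom = n - rank       # dimension of the space of steady-state solutions

# Two different flux vectors that both satisfy the steady-state condition.
v_a = np.array([-10.0, 10.0, 0.0, 10.0])
v_b = np.array([-10.0, 4.0, 6.0, 10.0])
both_balanced = np.allclose(S @ v_a, 0) and np.allclose(S @ v_b, 0)
```

With n = 4 fluxes and only m = 2 independent balances, two degrees of freedom remain, so the balances alone cannot pin down a unique flux distribution.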
What is underdetermined?
• Determined System (2 equations, 2 unknowns):
X+Y=2
2X-Y=1
• Solution X=1, Y=1
• Underdetermined System (1 equation, 2 unknowns)
X+Y=2
• Infinite Solutions!
In metabolism there are more fluxes (unknowns) than metabolites
(equations)
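The two systems above can be checked directly. This is a minimal sketch using NumPy; the parameter `t` used to generate solutions of the underdetermined system is an illustrative choice.

```python
import numpy as np

# Determined system: X + Y = 2 and 2X - Y = 1.
A = np.array([[1.0, 1.0],
              [2.0, -1.0]])
b = np.array([2.0, 1.0])
unique_solution = np.linalg.solve(A, b)

# Underdetermined system: only X + Y = 2.
# Every value of t gives a valid solution (t, 2 - t).
family = [(t, 2.0 - t) for t in (-1.0, 0.0, 1.0, 3.5)]
all_valid = all(abs(x + y - 2.0) < 1e-12 for x, y in family)
```

The first system returns X = 1, Y = 1; for the second, any point on the line X + Y = 2 is a solution, which is the one-dimensional analogue of a flux solution space.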
Impose constraints
Constraints
(i) Stoichiometry (mass conservation)
(ii) Exchange fluxes (capacity)
(iii) …
[Figure: the constraints reduce the unbounded solution space to a bounded convex subset; axes Flux A, Flux B and Flux C]
Exchange reactions allow nutrients to be taken up from the environment
with a certain maximum flux, e.g. $-2 \le v_{exchange} \le 0$
Interpretation of the convex cone
Convex cone = flux cone = solution space
Each point in the cone: one allowable functional state (flux
distribution) of the network given the constraints
[Figure: flux cone with axes A, B and C]
Flux balance analysis (FBA)
Constraints set bounds on the solution space,
but where in this space does the “real”
solution lie?
FBA: find the flux distribution that
maximizes an objective function (e.g. biomass
flux), subject to $S \cdot v = 0$ and $\alpha_j \le v_j \le \beta_j$
Thus, it is assumed that organisms have evolved
for maximal growth -> efficiency!
[Figure: solution space with axes A, B and C]
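FBA is a linear program, so a toy version can be written with a general-purpose LP solver. The sketch below uses `scipy.optimize.linprog` on a hypothetical four-reaction network; the network, the bound values and the choice of `BIO` as objective are assumptions for illustration, not the setup of any published model.

```python
import numpy as np
from scipy.optimize import linprog

# Hypothetical network: rows = metabolites A, B; columns = EX_A, R1, R2, BIO.
S = np.array([[-1, -1, -1,  0],
              [ 0,  1,  1, -1]], dtype=float)
names = ["EX_A", "R1", "R2", "BIO"]

# Bounds alpha_j <= v_j <= beta_j; uptake through EX_A is negative by convention.
bounds = [(-10, 0), (0, 1000), (0, 1000), (0, 1000)]

# linprog minimizes, so maximize the BIO flux by minimizing its negative.
c = np.zeros(len(names))
c[names.index("BIO")] = -1.0

# Steady state S.v = 0 enters as equality constraints.
result = linprog(c, A_eq=S, b_eq=np.zeros(S.shape[0]), bounds=bounds)
max_growth = -result.fun
fluxes = dict(zip(names, result.x))
```

In this toy case the maximal `BIO` flux equals the maximal uptake of 10, while the split between `R1` and `R2` is not determined by the optimum; the solver returns just one of the equivalent flux distributions.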
Prediction of microbial evolution by flux
balance analysis (in E. coli)
Prediction of growth fails with flux
balance analysis (in L. plantarum)
Teusink B. et al., 2006, J. Biol. Chem.
[Figure: glucose -> pyruvate -> lactate (2 ATP/Glc) or acetate + formate + ethanol (2.5 ATP/Glc); D = 0.32 $\mathrm{h^{-1}}$]
Metabolite      simulation   experiment
lactate         0            13.8
pyruvate        0            0.1
formate         8.0          0.9
acetate         15.2         1.9
acetoin         0            0
EtOH            3.4          0.45
citric acid     -0.7         -0.7
succinate       0            0.9
Glucose         -8.4         -8.4
FBA predicts mixed acid fermentation with biomass formation that is
40% too high -> thus L. plantarum is not
efficient!
Some other constraint-based methods
Robustness analysis: examining the effect of changing the flux through a
reaction on the objective function (i.e. growth)
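A minimal sketch of robustness analysis, assuming the same kind of hypothetical toy network as a stand-in for a real model: the uptake flux through `EX_A` is fixed at a series of values and the maximal growth (`BIO` flux) is recomputed each time with `scipy.optimize.linprog`.

```python
import numpy as np
from scipy.optimize import linprog

# Hypothetical network: rows = metabolites A, B; columns = EX_A, R1, R2, BIO.
S = np.array([[-1, -1, -1,  0],
              [ 0,  1,  1, -1]], dtype=float)

def max_growth(uptake):
    """Maximal BIO flux when the uptake flux through EX_A is fixed."""
    # Uptake is a negative exchange flux, so EX_A is pinned to -uptake.
    bounds = [(-uptake, -uptake), (0, 1000), (0, 1000), (0, 1000)]
    c = np.array([0.0, 0.0, 0.0, -1.0])   # maximize BIO
    res = linprog(c, A_eq=S, b_eq=np.zeros(S.shape[0]), bounds=bounds)
    return -res.fun

uptake_values = [0.0, 2.5, 5.0, 7.5, 10.0]
growth_curve = [max_growth(u) for u in uptake_values]
```

Plotting `growth_curve` against `uptake_values` gives the robustness curve; here it is a straight line because growth in this toy network is limited only by uptake.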
Some other constraint-based methods
Flux variability analysis: compute minimum and maximum flux values
through each reaction without changing the optimal solution (i.e. maximum
growth / phenotype)
FBA is first performed to determine the optimal objective value, which is then imposed as a constraint.
Example of application: if one wants to change the optimal solution, it is
relevant to know which reactions have wide and which have narrow flux ranges
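The two-step procedure can be sketched as follows, again on a hypothetical toy network and with `scipy.optimize.linprog` as solver. The small tolerance used when fixing growth at its optimum is an implementation choice to avoid numerical infeasibility, not part of the method as stated on the slide.

```python
import numpy as np
from scipy.optimize import linprog

# Hypothetical network: rows = metabolites A, B; columns = EX_A, R1, R2, BIO.
S = np.array([[-1, -1, -1,  0],
              [ 0,  1,  1, -1]], dtype=float)
names = ["EX_A", "R1", "R2", "BIO"]
bounds = [(-10, 0), (0, 1000), (0, 1000), (0, 1000)]
b = np.zeros(S.shape[0])

# Step 1: FBA gives the optimal growth.
c = np.array([0.0, 0.0, 0.0, -1.0])
optimum = -linprog(c, A_eq=S, b_eq=b, bounds=bounds).fun

# Step 2: require (near-)optimal growth, then minimize and maximize every flux.
fva_bounds = list(bounds)
fva_bounds[names.index("BIO")] = (optimum - 1e-6, 1000)

ranges = {}
for j, name in enumerate(names):
    objective = np.zeros(len(names))
    objective[j] = 1.0
    low = linprog(objective, A_eq=S, b_eq=b, bounds=fva_bounds).fun
    high = -linprog(-objective, A_eq=S, b_eq=b, bounds=fva_bounds).fun
    ranges[name] = (low, high)
```

In this example `R1` and `R2` each range from 0 to 10 (wide), whereas `EX_A` and `BIO` are fixed at the optimum (narrow), because the two parallel routes can substitute for each other.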
Available software – COBRA toolbox
Designed for MATLAB and freely available!
Flux coupling / correlations
• Genome-scale analysis to determine whether two fluxes (v1 and v2) are:
  • Fully coupled: a non-zero flux of v1 implies a non-zero fixed flux for v2 (and vice versa)
  • Directionally coupled: a non-zero flux v1 implies a non-zero flux for v2, but not necessarily the reverse
  • Uncoupled: a non-zero flux v1 does not imply a non-zero flux for v2
(and vice versa)
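The three definitions can be tested numerically on a small example. The sketch below is a simplified illustration, not the published flux coupling algorithm: it assumes a hypothetical network with irreversible reactions only, fixes one flux to 1, and asks an LP solver (`scipy.optimize.linprog`) whether the other flux can still be zero. The label "coupled, ratio not fixed" is an extra category added here for completeness.

```python
import numpy as np
from scipy.optimize import linprog

# Hypothetical network with irreversible reactions only (all v >= 0).
# Rows = metabolites A, B; columns = UP (-> A), R1 (A -> B), R2 (A -> B), OUT (B ->).
S = np.array([[1, -1, -1,  0],
              [0,  1,  1, -1]], dtype=float)
names = ["UP", "R1", "R2", "OUT"]
TOL = 1e-6

def extreme_flux(i, j, sign):
    """Min (sign=+1) or max (sign=-1) of v_j at steady state with v_i fixed to 1."""
    n = S.shape[1]
    bounds = [(0, None)] * n
    bounds[i] = (1, 1)
    c = np.zeros(n)
    c[j] = sign
    res = linprog(c, A_eq=S, b_eq=np.zeros(S.shape[0]), bounds=bounds)
    return res.x[j]

def coupling(a, b):
    """Classify the coupling between reactions a and b."""
    i, j = names.index(a), names.index(b)
    a_implies_b = extreme_flux(i, j, +1) > TOL   # v_b cannot be zero when v_a = 1
    b_implies_a = extreme_flux(j, i, +1) > TOL   # v_a cannot be zero when v_b = 1
    if a_implies_b and b_implies_a:
        # The maximum is only computed here, where it is guaranteed to be finite.
        fixed_ratio = abs(extreme_flux(i, j, -1) - extreme_flux(i, j, +1)) < TOL
        return "fully coupled" if fixed_ratio else "coupled, ratio not fixed"
    if a_implies_b or b_implies_a:
        return "directionally coupled"
    return "uncoupled"

pairs = [("UP", "OUT"), ("R1", "OUT"), ("R1", "R2")]
results = {pair: coupling(*pair) for pair in pairs}
```

In this toy network uptake and output are fully coupled, `R1` is directionally coupled to `OUT` (flux through `R1` forces output, but output can also be fed by `R2`), and the parallel routes `R1` and `R2` are uncoupled.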
Flux coupling / correlations
[Figure: example network with reactions A, B, C and D]
• A and B: directionally coupled
• B and C: fully coupled
• C and D: uncoupled
Measured vs. in silico flux correlations
Emmerling M. et al. J Bacteriol. 2002
Segre D. et al. PNAS, 2002
(p < $10^{-14}$)
In silico and measured flux correlations are in agreement
Notebaart RA. et al. (2007), PLoS Comput Biol (in press)
Flux coupling for data analysis
• Does flux coupling relate to transcriptional co-regulation of genes?
Notebaart RA. et al. (2007), PLoS Comput Biol (in press)
Flux coupling for data analysis
Pal C. et al. (2005), Nature Genetics
Coupling type           Event      #Events   OR* (95% c.i.)
Fully coupled           Transfer   59        64.6 (24.2–168.8)
Fully coupled           Loss       1,624     50.0 (41.8–59.6)
Directionally coupled   Transfer   78        60.3 (24.3–147.2)
Directionally coupled   Loss       2,833     9.6 (8.3–11.1)
Flux-coupled genes in E. coli metabolism are more likely to be lost or
gained together during evolution
*odds ratio (OR): how much larger the odds of an event X are relative to the odds of an event Y
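For readers unfamiliar with odds ratios, here is a minimal sketch of how an OR and an approximate 95% confidence interval can be computed from a 2x2 table. The interval uses the standard log-scale normal approximation, which is an assumption here (the slide does not say how the intervals were obtained), and the counts are hypothetical, not taken from Pal et al. (2005).

```python
import math

def odds_ratio(a, b, c, d):
    """Odds ratio for a 2x2 table [[a, b], [c, d]] with a 95% confidence interval.

    a, b: coupled gene pairs with / without the event (e.g. joint loss)
    c, d: uncoupled gene pairs with / without the event
    """
    or_value = (a / b) / (c / d)
    se_log = math.sqrt(1 / a + 1 / b + 1 / c + 1 / d)   # standard error of ln(OR)
    low = math.exp(math.log(or_value) - 1.96 * se_log)
    high = math.exp(math.log(or_value) + 1.96 * se_log)
    return or_value, (low, high)

# Hypothetical counts, NOT taken from Pal et al. (2005).
or_value, (ci_low, ci_high) = odds_ratio(20, 80, 5, 95)
```

An OR above 1 whose confidence interval excludes 1, as in this made-up example, indicates that the event is more common among coupled pairs than among uncoupled ones.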
Gene dispensability in metabolism of yeast
• Studies have shown that many metabolic genes are dispensable
(80% of yeast genes appear not to be essential for growth)
• Main question: why are most genes dispensable?
• ‘Forces’ that explain dispensability:
• The impact of gene deletions may depend on the
environment (plasticity)
• The presence of mutational robustness (compensatory
mechanisms, e.g. alternative pathways)
• Or both…
• Objective: explore the interaction between the two forces.
Harrison R and Papp B. et al. (2007), Proc Natl Acad Sci USA
Gene dispensability in metabolism
• A ‘model’ of mutational robustness and environment:
i) simulate metabolism in different environments and
ii) identify genes in alternative pathways by synthetic lethality
Harrison R and Papp B. et al. (2007), Proc Natl Acad Sci USA
Gene dispensability – single gene deletion
Gene is essential when a deletion is lethal (i.e. no growth):
Delete the gene and apply FBA -> optimum equals zero -> gene is essential!
Harrison R and Papp B. et al. (2007), Proc Natl Acad Sci USA
Effect of environment and alternative pathways
BUT, single gene deletion does not supply direct information on alternative
pathways and their role in gene dispensability
Method: Identify synthetic lethality between gene A and B:
i) Delete only gene A and apply FBA -> optimum unequal to zero -> gene is not essential
ii) Delete only gene B and apply FBA -> optimum unequal to zero -> gene is not essential
iii) Delete both gene A and B and apply FBA -> optimum equals zero -> either A or B
must be present -> thus an alternative pathway, which explains gene dispensability!
Harrison R and Papp B. et al. (2007), Proc Natl Acad Sci USA
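Steps i) to iii) can be sketched on a toy network. Everything in this example is hypothetical: the five-reaction network, the gene names (`geneA`, `geneB`, `geneE`) and the one-gene-per-reaction mapping. It is meant only to illustrate the logic, not to reproduce the yeast model of the study; `scipy.optimize.linprog` serves as the FBA solver.

```python
from itertools import combinations

import numpy as np
from scipy.optimize import linprog

# Hypothetical network: rows = metabolites A, B, C;
# columns = EX_A (A ->), R1 (A -> B), R2 (A -> B), R3 (B -> C), BIO (C ->).
S = np.array([[-1, -1, -1,  0,  0],
              [ 0,  1,  1, -1,  0],
              [ 0,  0,  0,  1, -1]], dtype=float)
names = ["EX_A", "R1", "R2", "R3", "BIO"]
gene_to_reaction = {"geneA": "R1", "geneB": "R2", "geneE": "R3"}  # hypothetical genes
TOL = 1e-6

def growth_after_deletion(deleted_genes):
    """Maximal BIO flux (FBA) after blocking the reactions of the deleted genes."""
    bounds = [(-10, 0), (0, 1000), (0, 1000), (0, 1000), (0, 1000)]
    for gene in deleted_genes:
        bounds[names.index(gene_to_reaction[gene])] = (0, 0)
    c = np.zeros(len(names))
    c[names.index("BIO")] = -1.0
    res = linprog(c, A_eq=S, b_eq=np.zeros(S.shape[0]), bounds=bounds)
    return -res.fun

# Single deletions: a gene is essential if growth drops to zero.
essential = [g for g in gene_to_reaction if growth_after_deletion([g]) < TOL]
dispensable = [g for g in gene_to_reaction if g not in essential]

# Double deletions of dispensable genes: zero growth means synthetic lethality.
synthetic_lethal = [pair for pair in combinations(dispensable, 2)
                    if growth_after_deletion(pair) < TOL]
```

Here `geneE` is essential, while `geneA` and `geneB` are individually dispensable but synthetic lethal, because `R1` and `R2` are alternative routes from A to B. Repeating the same test under different exchange bounds would correspond to simulating different environments.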
Effect of environment and alternative pathways
Alternative paths in all environments: 14.3%
Alternative paths (SL) in 1 or 2 environments: 50%
50% of genes in alternative pathways provide mutational robustness in only 1 or 2
environments -> thus the environment plays an important role in gene
dispensability!
Harrison R and Papp B. et al. (2007), Proc Natl Acad Sci USA
Summary / conclusions
• Systems biology: studying living cells/tissues/etc by
exploring their components and their interactions
• Even without detailed knowledge of kinetics, genome-scale
modeling is still possible
• Genome-scale modeling has been shown to be relevant for studying
evolution and for interpreting ~omics data
• Major challenge is to integrate knowledge of kinetics and
genome-scale networks
Assignment
• Read the following article: Pal C., Papp B., Lercher MJ., Csermely
P., Oliver SG. and Hurst LD. (2006), Chance and necessity in the
evolution of minimal metabolic networks, Nature
• Write a report of 2-3 pages and include/consider at least the
following points:
• What is the main hypothesis and scientific question?
• What do you think about the hypothesis? Will it have important
implications?
• Do the authors ask other scientific (sub)questions (related to the
main question) and if so, what are they and was it necessary to
address them?
• Which methods were used? Explain them (in your own
words!).
• What are the major findings/results?
• Summarize the conclusions and describe whether you agree with them
based on the described results.