Transcript PROM - National e
Nathan Price
Department of Chemical & Biomolecular Engineering; Center for Biophysics & Computational Biology; Institute for Genomic Biology; University of Illinois, Urbana-Champaign
Metabolic Pathways Workshop, Edinburgh, Scotland, April 7, 2011
INSTITUTE for GENOMIC BIOLOGY
Interactions between metabolic and regulatory networks
Milne, Eddy, Kim, Price, Biotechnology Journal, 2009
Biochemical Reaction Networks
[Figure: from data sources to integrated network data to a mathematical model.]
Data sources: phylogenetic data, physiological data, literature, genome annotation, interactomics, transcriptomics, proteomics, metabolomics
More detail (biochemistry, etc.): reaction stoichiometry, then application of constraints, giving a constraint-based model (S · v = 0, v ≤ v_max)
Less detail (statistical inference networks): interaction networks (protein-metabolite, protein-protein, DNA-protein, DNA-DNA), then network inference, giving a statistical inference network, e.g. C = f(A, B, D), with activation, inhibition, and indirect edges
Reaction stoichiometry example:
            Rxn. 1   Rxn. 2   Rxn. 3
Enz           -1        0       +1
A             -1        0        0
Enz-A         +1       -1        0
B              0       -1        0
Enz-A-B        0       +1       -1
C              0        0       +1
D              0        0       +1
Eddy and Price, Encyclopedia of Complexity and Systems Science (2009)
Need for automated reconstruction methods
[Figure: number of genomes and GEMs by year, 1995 to 2009; vertical axis from 1 to 1000.]
C Milne, JA Eddy, PJ Kim, ND Price, Biotechnology Journal, 2009
Automated reconstruction of metabolic networks
Automated reconstruction of computable metabolic network models
Demonstrated on 130 genomes
Provide advanced starting point for virtually any organism
Accuracy from genomics: 65%
With Biolog and optimization: 87%
Henry, C, DeJongh, M, Best, AA, Frybarger, PM, and Stevens, RL, Nature Biotechnology, 2010
Integrated automated reconstructions
Integration of automatically learned statistics-based regulatory networks and biochemistry-based metabolic networks
Sriram Chandrasekaran, Bozena Sawicka, Amit Ghosh
Example of Current State-of-the-Art: rFBA
Motivated by data limitations
Regulatory network represented by Boolean rules
Rules taken from literature curation
Only subset of network available under different environmental conditions
Metabolic flux analysis performed with available reactions
Covert, MW et al., Nature, 2004
PROM models integrating TRN and metabolic network
Automated
Comprehensive
Probabilistic Boolean vs. Boolean
Higher accuracy
Chandrasekaran and Price, Proc. Natl. Acad. Sci. USA, 2010
PROM MODEL - PROBABILITIES
PROM's novelty lies in the introduction of probabilities to represent gene states and gene-transcription factor (TF) interactions.
Boolean rule: IF (B) THEN A
Probabilistic rule: P(A|B) = 0.95
P(A = 1|B = 0): the probability of gene A being ON when its transcription factor B is OFF
P(A = 1|B = 1): the probability of A being ON when B is ON
Chandrasekaran and Price, Proc. Natl. Acad. Sci. USA, 2010
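One way such conditional probabilities could be estimated is by counting ON/OFF states across expression samples. The sketch below is illustrative only: the fixed binarization threshold, the function name, and the toy expression values are assumptions, not the procedure or data used in the talk.

```python
def conditional_on_probabilities(target_expr, tf_expr, threshold):
    """Estimate P(A = 1 | B = 0) and P(A = 1 | B = 1) from paired samples.

    target_expr, tf_expr: expression of gene A and its TF B in the same samples.
    threshold: hypothetical cutoff above which a gene is called ON.
    Assumes both TF states (ON and OFF) occur at least once in the samples.
    """
    target_on = [x > threshold for x in target_expr]
    tf_on = [x > threshold for x in tf_expr]
    a_when_b_off = [a for a, b in zip(target_on, tf_on) if not b]
    a_when_b_on = [a for a, b in zip(target_on, tf_on) if b]
    p_on_given_off = sum(a_when_b_off) / len(a_when_b_off)
    p_on_given_on = sum(a_when_b_on) / len(a_when_b_on)
    return p_on_given_off, p_on_given_on


# Toy data (made up for illustration): 10 samples, TF OFF in the first 4.
tf_b = [0.2, 0.3, 0.1, 0.4, 2.1, 2.5, 1.9, 2.2, 2.8, 2.0]
gene_a = [0.1, 0.2, 0.3, 1.8, 2.0, 2.2, 1.7, 2.4, 2.1, 0.3]
p_off, p_on = conditional_on_probabilities(gene_a, tf_b, threshold=1.0)
print(p_off, p_on)
```

In this toy data, gene A is ON in 1 of the 4 samples where B is OFF and in 5 of the 6 samples where B is ON, so the two estimates are 1/4 and 5/6.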
CONSTRAINING FLUXES USING PROBABILITIES
TF → p(mRNA|TF) → flux bound p × Vmax → optimal flux state
Chandrasekaran and Price, Proc. Natl. Acad. Sci. USA, 2010
PROM: Basis is a constraint-based metabolic model
Constraint-based analysis involves solving the linear optimization problem:
$$\max_{v} \; w^{T} v \quad \text{subject to} \quad S \cdot v = 0, \quad lb \le v \le ub$$
where $S$ is the stoichiometric matrix, $v$ is a flux vector representing a particular flux configuration, $w^{T} v$ is the linear objective function, and $lb$, $ub$ are vectors containing the minimum and maximum fluxes through each reaction.
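A minimal sketch of this linear program on a toy network, using SciPy's `linprog`. The four-reaction network, its bounds, and the choice of v3 as the objective are hypothetical and chosen only to keep the example small; they are not taken from any model in the talk.

```python
import numpy as np
from scipy.optimize import linprog

# Hypothetical toy network (not from the slides):
#   v1: -> A (uptake)   v2: A -> B   v3: B -> (biomass)   v4: A -> (secretion)
S = np.array([
    [1.0, -1.0,  0.0, -1.0],   # mass balance for metabolite A
    [0.0,  1.0, -1.0,  0.0],   # mass balance for metabolite B
])
w = np.array([0.0, 0.0, 1.0, 0.0])              # objective weights: maximize v3
lb = np.zeros(4)                                 # all reactions irreversible
ub = np.array([10.0, 1000.0, 1000.0, 1000.0])    # uptake limited to 10

# linprog minimizes, so w^T v is maximized by minimizing -w^T v.
res = linprog(-w, A_eq=S, b_eq=np.zeros(S.shape[0]),
              bounds=list(zip(lb, ub)), method="highs")
v_opt = res.x
growth = -res.fun
print("optimal objective:", growth)
print("flux vector:", v_opt)
```

Here the uptake bound on v1 is the only limiting constraint, so the maximal objective equals that bound.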
PROM Approach
PROM finds a flux distribution that satisfies the same constraints as FBA plus additional constraints due to transcriptional regulation:
$$\min_{\alpha,\beta} \; (\kappa \cdot \alpha + \kappa \cdot \beta) \quad \text{subject to} \quad lb' - \alpha \le v \le ub' + \beta, \quad \alpha, \beta \ge 0$$
where $lb'$, $ub'$ are constraints based on transcriptional regulation (the flux bound cues), $\alpha$, $\beta$ are non-negative variables that represent deviation from those constraints, and $\kappa$ is the penalty for such deviations.
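The soft-constraint idea can be sketched as a single linear program over the stacked variables [v, α, β]. This is an illustrative sketch, not the published implementation: combining the growth objective and the deviation penalty into one objective, the toy network, the probability 0.3, and the function name `prom_sketch` are all assumptions. The regulated bound follows the p × Vmax rule from the earlier slide.

```python
import numpy as np
from scipy.optimize import linprog

def prom_sketch(S, w, lb, ub, lb_soft, ub_soft, kappa):
    """Maximize w^T v - kappa * (sum(alpha) + sum(beta)) subject to
    S v = 0, lb <= v <= ub, lb_soft - alpha <= v <= ub_soft + beta."""
    m, n = S.shape
    # Decision vector x = [v, alpha, beta], each block of length n.
    cost = np.concatenate([-w, kappa * np.ones(n), kappa * np.ones(n)])
    A_eq = np.hstack([S, np.zeros((m, 2 * n))])   # slacks do not enter S v = 0
    I, Z = np.eye(n), np.zeros((n, n))
    # lb' - alpha <= v  becomes  -v - alpha <= -lb'
    # v <= ub' + beta   becomes   v - beta  <=  ub'
    A_ub = np.vstack([np.hstack([-I, -I, Z]), np.hstack([I, Z, -I])])
    b_ub = np.concatenate([-lb_soft, ub_soft])
    bounds = list(zip(lb, ub)) + [(0.0, None)] * (2 * n)
    res = linprog(cost, A_ub=A_ub, b_ub=b_ub, A_eq=A_eq, b_eq=np.zeros(m),
                  bounds=bounds, method="highs")
    return res.x[:n], res.x[n:2 * n], res.x[2 * n:]

# Hypothetical toy network: v1: -> A, v2: A -> B, v3: B -> (biomass), v4: A ->
S = np.array([[1.0, -1.0, 0.0, -1.0],
              [0.0, 1.0, -1.0, 0.0]])
w = np.array([0.0, 0.0, 1.0, 0.0])
lb = np.zeros(4)
ub = np.array([10.0, 1000.0, 1000.0, 1000.0])

# Assumed regulation: after a TF knockout, the gene for v2 is ON with
# probability 0.3 and the maximal v2 flux is 10, so ub' = 0.3 * 10 = 3.
ub_soft = ub.copy()
ub_soft[1] = 0.3 * 10.0

v, alpha, beta = prom_sketch(S, w, lb, ub, lb, ub_soft, kappa=2.0)
growth = float(w @ v)
print("growth with kappa = 2:", growth)
```

In this toy setup the penalty weight decides whether the regulatory bound behaves as effectively hard (κ larger than the objective gain per unit of flux) or is overridden (κ smaller), which is the trade-off the slack variables are meant to express.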
Data used for the E. coli PROM model
Organism: E. coli
Metabolic model: iAF1260
Metabolic reactions: 2382
Regulatory data: RegulonDB
Regulatory interactions: 1773
Microarrays: 907
Total genes in the model: 1400
Validation data set: 1875 growth phenotypes
Feist A et al, Molecular Systems Biology, 2007; Chandrasekaran, S., and Price, N.D., PNAS, 2010
Automated PROM model has similar accuracy to RFBA
[Figure: heat map comparing PROM and RFBA growth-phenotype predictions for TF knockouts across growth substrates.]
Comparison with RFBA, legend: non-lethal, both PROM and RFBA right; lethal, both PROM and RFBA right; PROM wrong, RFBA right; PROM right, RFBA wrong; lethal, both wrong; non-lethal, both wrong
PROM: 85%, RFBA: 81% (automated PROM vs. manual RFBA)
Covert MW et al, Nature, 2004; Chandrasekaran S, and Price ND, PNAS, 2010
Increased comprehensiveness to previous RFBA model
[Figure: bar chart comparing the PROM E. coli model with E. coli iMC1010 by interactions, regulated metabolic genes, and transcription factors; axis from 0 to 2000.]
Automated learning from high-throughput data improves comprehensiveness
Covert MW, Nature, 2004; Chandrasekaran, S, and Price, ND, In review, 2010
Results: Quantitative Growth Prediction
Growth rate prediction by PROM
Culture              Actual   PROM
WT + O2              0.71     0.7382
WT - O2              0.49     0.385
ΔarcA + O2           0.69     0.7651
ΔarcA - O2           0.38     0.3224
Δfnr + O2            0.63     0.5635
Δfnr - O2            0.41     0.2181
Δfnr/ΔarcA + O2      0.65     0.6596
Δfnr/ΔarcA - O2      0.3      0.204
ΔappY + O2           0.64     0.7152
ΔappY - O2           0.48     0.3287
ΔoxyR + O2           0.64     0.7876
ΔoxyR - O2           0.48     0.3287
ΔsoxS + O2           0.72     0.7687
ΔsoxS - O2           0.46     0.379
Experimental data taken from MW Covert et al, Nature, 2004; Chandrasekaran, S., and Price, N.D., PNAS, 2010
[Figure: scatter plot against experimental growth rate.]
Overall correlation with experimental data: R = 0.95
Function of both oxygen switch (dominant) and regulation
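As a rough check on the reported correlation, the Actual and PROM growth rates listed above can be correlated directly. This is a minimal sketch that assumes the two columns pair up in the listed culture order and that R denotes the Pearson correlation coefficient; the helper name `pearson_r` is introduced here for illustration.

```python
import math

# Values copied from the slide, assumed to pair up in culture order.
actual = [0.71, 0.49, 0.69, 0.38, 0.63, 0.41, 0.65,
          0.3, 0.64, 0.48, 0.64, 0.48, 0.72, 0.46]
prom = [0.7382, 0.385, 0.7651, 0.3224, 0.5635, 0.2181, 0.6596,
        0.204, 0.7152, 0.3287, 0.7876, 0.3287, 0.7687, 0.379]

def pearson_r(x, y):
    """Pearson correlation coefficient of two equal-length sequences."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    var_x = sum((a - mean_x) ** 2 for a in x)
    var_y = sum((b - mean_y) ** 2 for b in y)
    return cov / math.sqrt(var_x * var_y)

r = pearson_r(actual, prom)
print(f"Pearson R = {r:.2f}")
```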
PROM Model Inputs for M. tuberculosis
Organism: M. tuberculosis
Metabolic model: iNJ661
Metabolic reactions: 1028
Regulatory data: Balazsi et al
Regulatory interactions: 218
Microarrays: 437
Total genes in the model: 691
Validation data set: 30 TF knockout
Jamshidi NJ, and Palsson, BO, BMC Systems Biology, 2007; Balazsi G et al, Molecular Systems Biology, 2008; Boshoff HI et al, JBC, 2004
Accuracy in predicting essentiality of TF for optimal growth
Accuracy: 95%
Sensitivity: 83%
Specificity: 100%
TF          Predicted growth rate
dnaA        0.03
Rv0485      0.042
crp         0.03
sigD        0.05
kdpE        0.052
ideR        0.038
Rv1395      0.028
argR        0.047
sigC        0.024
sigH        0.05
lrpA        0.032
Rv3575c     0.026
oxyS        0.052
nadR        0.052
hspR        0.052
regX3       0.052
Rv0586      0.052
narL        0.052
sigE        0.052
furA        0.052
Rv1931c     0.052
furB        0.052
lexA        0.052
pknK        0.052
dosR        0.052
birA        0.052
sigF        0.052
kstR        0.052
cyp143      0.052
embR        0.052
Legend
Correct prediction: essential gene, non-essential gene, candidate essential
Incorrect prediction: non-essential gene, essential gene
Chandrasekaran and Price, Proc. Natl. Acad. Sci. USA, 2010
PROM Model Inputs for S. cerevisiae
Organism: S. cerevisiae
Metabolic model: iMM904
Metabolic reactions: 1577
Regulatory data: YEASTRACT
Regulatory interactions: 4200
Microarrays: 904, M3D
Total genes in the model: 904
Validation data set: 136 TF knockout
Duarte NC et al, BMC Genomics, 2004; Steinmetz LM et al., Nature Genetics, 2002; Ghosh, Chandrasekaran, Zhao, and Price, 2010 (in preparation)
Increased comprehensiveness to previous RFBA model
Model                    Transcription factors   Regulated metabolic genes   Interactions
RFBA model iMH805/775    55                      348                         775
PROM model               136                     904                         4200
[Figure: bar chart of the same three counts for the PROM model and the RFBA model iMH805/775; axis from 0 to 5000.]
Ghosh, Chandrasekaran, Zhao, and Price, 2010 (in preparation); Herrgard et al., Genome Res, 2006
Accuracy in predicting essentiality of TF for optimal growth
Predicts correctly 135/136 of lethal/non-lethal calls
Identifies 8 lethal TF KOs, with only 1 false positive
Lone miss (Gcn4) is a very slow grower (multiple days)
Ghosh, Chandrasekaran, Zhao, and Price, 2010 (in preparation)
Validation: Quantitative Growth Prediction
Growth rate prediction by PROM
Culture conditions: Glucose (SUR = 6.3, OUR = 2.5); Galactose (SUR = 2.1, OUR = 3.9); Fructose (SUR = 2.6, OUR = 6.2)
Strain   Glucose Actual   Glucose PROM   Galactose Actual   Galactose PROM   Fructose Actual   Fructose PROM
WT       0.21             0.22           0.13               0.15             0.2               0.23
adr1     0.21             0.22           0.13               0.15             0.2               0.23
cat8     0.21             0.22           0.13               0.15             0.2               0.23
mig2     0.21             0.22           0.13               0.15             0.2               0.23
sip4     0.21             0.22           0.13               0.12             0.2               0.19
gal4     0.21             0.22           0.03               0.01             0.2               0.23
rtg1     0.21             0.22           0.08               0.07             0.2               0.23
mth1     0.21             0.21           0.11               0.15             0.2               0.23
nrg1     0.21             0.20           0.13               0.14             0.2               0.22
mig1     0.21             0.21           0.13               0.15             0.2               0.21
gcr2     0.17             0.15           0.13               0.15             0.16              0.17
[Figure: scatter plot against experimental growth rate; axes from 0 to 0.25.]
Overall correlation with experimental data: R = 0.96
Driven by both substrate (dominant) and regulation
Experimental data taken from MJ Herrgard et al, Genome Res, 2006; Ghosh, Chandrasekaran, Zhao, and Price, 2010 (in preparation)
Quantitative Growth Prediction for 77 TF knockout Phenotypes with Galactose
[Figure: scatter plot against experimental growth rate; axes from 0 to 0.25.]
Overall correlation with experimental data: R = 0.90 (based only on regulation; metabolic model alone would be a flat line)
Experimental data taken from SM Fendt et al, Molecular Systems Biology, 2010; Ghosh, Chandrasekaran, Zhao, and Price, 2010 (in preparation)
Prediction of Metabolic flux for ∆Gcn4 mutant strain
Reaction                        WT (expt)   GCN4 (expt)   WT (model)   GCN4 (model)
G6P <=> F6P (net)               492.89      362.99        500          500
PEP ->                          1280.46     1290.54       17.72        18.94
P5P <=> EC2 + G3P (net)         127.82      213.63        0.263        0.217
F6P <=> EC2 + E4P (net)         -60.99      -103.85       -0.182       -0.19
S7P <=> EC3 + E4P (net)         66.82       109.79        8.661        9.19
PYR -> ACA + CO2                1162.09     1035.51       17.78        15.94
ETH -> ETHOUT                   662.32      666.92        15.82        17.72
ACE -> ACCOA                    504.22      371.79        0.147        0.067
OAAMIT + ACCOAMIT -> CITMIT     515.08      496.43        0.529        0.811
OAAMIT <=> OAA (net)            -243.97     -156.85       -498.08      -464.36
CITMIT <=> CIT (net)            51.48       72.45         0.035        0.076
SER -> CYS                      2.84        2.07          0.230        0.0019
SER <=> GLY + METTHF (net)      8.92        14.02         0.09709      0.0448
OAA -> ASP                      27.41       21.02         500          474.91
PYR ->                          4.32        3.32          0.67622      0.1471
AKG -> GLU                      27.48       25.89         0.0285       0.0131
GLU -> ORN                      5.81        3.72          0.04626      0.0258
CHOR -> PPHN                    5.83        5.94          0.03111      0.0679
[Figure: flux (model) vs. flux (expt).]
Experimental data taken from SM Fendt et al; Moxley et al, PNAS, 2009; Ghosh, Chandrasekaran, Zhao, and Price, 2010 (in preparation)
PROM Highlights
PROM is a new approach for integrating the transcriptional network with metabolism
Automated and comprehensive
We compared it with state-of-the-art metabolic-regulatory models of E. coli
Comparable accuracy
More comprehensive (automated from HT data)
We constructed the first genome-scale integrated regulatory-metabolic model for M. tuberculosis
We compared it with state-of-the-art metabolic-regulatory models of S. cerevisiae
Much more accurate
Much more comprehensive (automated from HT data)
PROM can accurately predict the effect of perturbations to transcriptional regulators and subsequently be used to predict microbial growth phenotypes quantitatively
Chandrasekaran and Price, Proc. Natl. Acad. Sci. USA, 2010
Constraint-based Reconstruction and Analysis Conference Confirmed Speakers
Eivind Almaas, Ronan Fleming, Vassily Hatzimanikatis, Christopher Henry, Hermann Georg Holzhütter, Costas Maranas, Jens Nielsen, Bernhard Palsson, Jason Papin, Balázs Papp, Nathan Price, Eytan Ruppin, Uwe Sauer, Stefan Schuster, Daniel Segre, Ines Thiele
Key Dates
April 7, 2011: Abstract deadline for oral & poster presentations (WILL EXTEND)
June 24-26, 2011: COBRA conference
Acknowledgments
Nathan D. Price Lab @ the University of Illinois, Urbana-Champaign
Postdocs
Nick Chia, Cory Funk, Amit Ghosh, Pan-Jun Kim, Charu Gupta Kumar, Younhee Ko, Vineet Sangar
Graduate Students
Daniel Baker, Matthew Benedict, Sriram Chandrasekaran, John Earls, Piyush Labhsetwar, Shuyi Ma, Bozena Sawicka, Jaeyun Sung, James Eddy, Andrew Magis, Chunjing Wang
Funding Sources
NIH / National Cancer Institute Howard Temin Pathway to Independence Award
NSF CAREER
Department of Defense – TATRC
Department of Energy
Energy Biosciences Institute (BP)
Luxembourg-ISB Systems Medicine Program
Roy J. Carver Charitable Trust Young Investigator Award
Matthew Gonnerman, Seyfullah Kotil, Caroline Milne, Matthew Richards, Yuliang Wang