PROM - National e


Nathan Price

Department of Chemical & Biomolecular Engineering
Center for Biophysics & Computational Biology
Institute for Genomic Biology
University of Illinois, Urbana-Champaign

Metabolic Pathways Workshop
Edinburgh, Scotland
April 7, 2011

INSTITUTE for GENOMIC BIOLOGY

Interactions between metabolic and regulatory networks

Milne, Eddy, Kim, Price, Biotechnology Journal, 2009

Biochemical Reaction Networks

Statistical Inference Networks

Figure: from data sources to a mathematical model, at two levels of detail.

Data sources: phylogenetic data, physiological data, literature, genome annotation, interactomics; integrated network data (• transcriptomics • proteomics • metabolomics).

More detail (biochemistry, etc.): reaction stoichiometry, then application of constraints, giving a constraint-based model (S · v = 0, v ≤ v_max).

            Rxn. 1   Rxn. 2   Rxn. 3
Enz           -1        0       +1
A             -1        0        0
Enz-A         +1       -1        0
B              0       -1        0
Enz-A-B        0       +1       -1
C              0        0       +1
D              0        0       +1

Less detail: interaction networks (protein-metabolite, protein-protein, DNA-protein, DNA-DNA), then network inference, giving a statistical inference network, e.g. C = f(A, B, D), with activation, inhibition and indirect interactions.

Eddy and Price, Encyclopedia of Complexity and Systems Science (2009)

Need for automated reconstruction methods

Figure: number of genomes and of GEMs by year, 1995 to 2009 (logarithmic axis, 1 to 1000).

C Milne, JA Eddy, PJ Kim, ND Price, Biotechnology Journal, 2009

Automated reconstruction of metabolic networks

• Automated reconstruction of computable metabolic network models
• Demonstrated on 130 genomes
• Provide advanced starting point for virtually any organism
• Accuracy from genomics: 65%
• With Biolog and optimization: 87%

Henry, C, DeJongh, M, Best, AA, Frybarger, PM, and Stevens, RL, Nature Biotechnology, 2010

Integrated automated reconstructions

Integration of automatically learned statistics-based regulatory networks and biochemistry-based metabolic networks

Sriram Chandrasekaran, Bozena Sawicka, Amit Ghosh

Example of Current State-of-the-Art: rFBA

• Motivated by data limitations
• Regulatory network represented by Boolean rules
• Rules taken from literature curation
• Only subset of network available under different environmental conditions
• Metabolic flux analysis performed with available reactions

Covert, MW et al., Nature, 2004
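As a rough illustration of the rFBA idea (Boolean rules decide which reactions are available before flux analysis is run), here is a minimal Python sketch. The rules, gene names, reaction names and bounds are all hypothetical and are not taken from Covert et al. or from the slides; only the gating step is shown, not the flux analysis itself.

```python
from typing import Callable, Dict, Optional, Tuple

State = Dict[str, bool]

# Hypothetical Boolean rules, listed so that a regulator is evaluated
# before the genes that depend on it (dicts keep insertion order).
rules: Dict[str, Callable[[State], bool]] = {
    "tfB": lambda s: s["oxygen"],      # TF is active only when oxygen is present
    "geneA": lambda s: s["tfB"],       # IF (B) THEN A
    "geneC": lambda s: not s["tfB"],   # gene repressed by the TF
}

# Hypothetical gene-to-reaction map and default (lb, ub) flux bounds.
gene_of_reaction: Dict[str, Optional[str]] = {"R1": "geneA", "R2": "geneC", "R3": None}
default_bounds: Dict[str, Tuple[float, float]] = {
    "R1": (0.0, 10.0), "R2": (0.0, 10.0), "R3": (0.0, 10.0),
}


def regulated_bounds(environment: State) -> Dict[str, Tuple[float, float]]:
    """Return flux bounds with reactions of OFF genes closed (0, 0)."""
    state = dict(environment)
    for name, rule in rules.items():
        state[name] = rule(state)
    bounds = {}
    for rxn, (lb, ub) in default_bounds.items():
        gene = gene_of_reaction[rxn]
        is_on = True if gene is None else state[gene]
        bounds[rxn] = (lb, ub) if is_on else (0.0, 0.0)
    return bounds


aerobic = regulated_bounds({"oxygen": True})
anaerobic = regulated_bounds({"oxygen": False})
```

The resulting bounds would then be handed to an ordinary flux balance calculation, so each environment sees only its own subset of the network.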

PROM models integrating TRN and metabolic network

• Automated
• Comprehensive
• Probabilistic Boolean vs Boolean
• Higher accuracy

Chandrasekaran and Price, Proc. Natl. Acad. Sci. USA, 2010

PROM MODEL - PROBABILITIES

PROM's novelty lies in the introduction of probabilities to represent gene states and gene-transcription factor (TF) interactions.

IF (B) THEN A
P(A|B) = 0.95

P(A = 1|B = 0) - the probability of gene A being ON when its transcription factor B is OFF

P(A = 1|B = 1) - the probability of A being ON when B is ON

Chandrasekaran and Price, Proc. Natl. Acad. Sci. USA, 2010
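One simple way to obtain such conditional probabilities is to count ON/OFF calls across a set of expression arrays. The sketch below assumes the expression data have already been binarized into ON/OFF calls; the slide does not say how that is done, and the sample values are made up for illustration.

```python
from typing import Sequence


def conditional_on_probability(target_on: Sequence[bool],
                               tf_on: Sequence[bool],
                               tf_state: bool) -> float:
    """Fraction of samples in which the target gene is ON, restricted to
    samples where the TF is in `tf_state` (True = ON, False = OFF)."""
    selected = [a for a, b in zip(target_on, tf_on) if b == tf_state]
    if not selected:
        raise ValueError("no samples with the TF in the requested state")
    return sum(1 for a in selected if a) / len(selected)


# Made-up binarized calls for TF B and target gene A across 10 arrays.
tf_b = [True, True, True, True, True, True, False, False, False, False]
gene_a = [True, True, True, True, True, False, True, False, False, False]

p_a_on_given_b_off = conditional_on_probability(gene_a, tf_b, False)  # P(A = 1|B = 0)
p_a_on_given_b_on = conditional_on_probability(gene_a, tf_b, True)    # P(A = 1|B = 1)
```

In this toy data set A is ON in 1 of the 4 arrays where B is OFF and in 5 of the 6 arrays where B is ON.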

CONSTRAINING FLUXES USING PROBABILITIES

Figure: TF → p(mRNA|TF) → Flux Bound (p*Vmax) → Optimal Flux State

Chandrasekaran and Price, Proc. Natl. Acad. Sci. USA, 2010
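The flux bound p*Vmax can be illustrated with a few lines of Python. This is a sketch under stated assumptions: the reaction names, Vmax values and probabilities are hypothetical, and treating unregulated reactions as having p = 1 is a choice made here for illustration.

```python
from typing import Dict


def regulated_flux_bounds(v_max: Dict[str, float],
                          p_on: Dict[str, float]) -> Dict[str, float]:
    """Scale each reaction's Vmax by the probability that its gene is ON."""
    bounds = {}
    for rxn, vmax in v_max.items():
        p = p_on.get(rxn, 1.0)  # assumption: unregulated reactions keep full Vmax
        if not 0.0 <= p <= 1.0:
            raise ValueError(f"probability for {rxn} must lie in [0, 1]")
        bounds[rxn] = p * vmax
    return bounds


# Hypothetical values: R2 and R3 are controlled by a TF that is knocked out.
v_max = {"R1": 10.0, "R2": 10.0, "R3": 8.0}
p_on = {"R2": 0.25, "R3": 0.5}

flux_bound = regulated_flux_bounds(v_max, p_on)
```

Here R2 would be limited to 2.5 and R3 to 4.0, while R1 keeps its Vmax of 10.0.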

PROM: Basis is a constraint-based metabolic model

Constraint-based analysis involves solving the linear optimization problem:

\[
\max_{v} \; w^{T} v \quad \text{subject to} \quad S v = 0, \quad lb \le v \le ub
\]

where S is the stoichiometric matrix, v is a flux vector representing a particular flux configuration, w^T v is the linear objective function, and lb, ub are vectors containing the minimum and maximum fluxes through each reaction.
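The optimization problem above can be tried on a toy network with a general-purpose LP solver. The sketch below uses `scipy.optimize.linprog`; the three-reaction network (uptake into A, conversion of A to B, export of B), its bounds and the choice of the export reaction as objective are all made up for illustration. `linprog` minimizes, so the objective vector is negated.

```python
import numpy as np
from scipy.optimize import linprog

# Toy stoichiometric matrix: rows are metabolites A and B,
# columns are reactions R1 (-> A), R2 (A -> B), R3 (B ->).
S = np.array([[1.0, -1.0, 0.0],
              [0.0, 1.0, -1.0]])
w = np.array([0.0, 0.0, 1.0])         # objective: maximize flux through R3
lb = np.array([0.0, 0.0, 0.0])        # minimum fluxes
ub = np.array([10.0, 1000.0, 1000.0])  # maximum fluxes (uptake limited to 10)


def run_fba(S, w, lb, ub):
    """Solve max w^T v subject to S v = 0 and lb <= v <= ub."""
    result = linprog(c=-w,
                     A_eq=S, b_eq=np.zeros(S.shape[0]),
                     bounds=[(float(l), float(u)) for l, u in zip(lb, ub)],
                     method="highs")
    if not result.success:
        raise RuntimeError(result.message)
    return result.x


v_opt = run_fba(S, w, lb, ub)
```

In this toy case the uptake limit is the only binding constraint, so all three fluxes settle at 10.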

PROM Approach

PROM finds a flux distribution that satisfies the same constraints as FBA plus additional constraints due to the transcriptional regulation:

\[
\min_{\alpha, \beta} \; (\kappa \cdot \alpha + \kappa \cdot \beta) \quad \text{subject to} \quad lb' - \alpha \le v \le ub' + \beta, \quad \alpha, \beta \ge 0
\]

where lb', ub' are constraints based on transcriptional regulation (the flux bound cues), α, β are nonnegative slack variables which represent deviation from those constraints, and κ represents the penalty for such deviations.
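A minimal sketch of such a soft-constrained problem, again with `scipy.optimize.linprog`, is given below. It makes one assumption that the slide does not spell out: the growth objective and the penalty are combined into a single LP, maximizing w^T v - κ·Σ(α + β). The toy network and all numbers are hypothetical.

```python
import numpy as np
from scipy.optimize import linprog


def prom_like_lp(S, w, lb, ub, lb_soft, ub_soft, kappa):
    """Maximize w^T v - kappa * sum(alpha + beta) subject to
    S v = 0, lb <= v <= ub, lb_soft - alpha <= v <= ub_soft + beta, alpha, beta >= 0.
    Decision vector is x = [v, alpha, beta]."""
    m, n = S.shape
    eye, zero = np.eye(n), np.zeros((n, n))
    c = np.concatenate([-w, kappa * np.ones(n), kappa * np.ones(n)])
    A_eq = np.hstack([S, np.zeros((m, 2 * n))])
    A_ub = np.vstack([np.hstack([-eye, -eye, zero]),   # -v - alpha <= -lb_soft
                      np.hstack([eye, zero, -eye])])   #  v - beta  <=  ub_soft
    b_ub = np.concatenate([-lb_soft, ub_soft])
    bounds = [(float(l), float(u)) for l, u in zip(lb, ub)] + [(0.0, None)] * (2 * n)
    result = linprog(c, A_ub=A_ub, b_ub=b_ub, A_eq=A_eq, b_eq=np.zeros(m),
                     bounds=bounds, method="highs")
    if not result.success:
        raise RuntimeError(result.message)
    x = result.x
    return x[:n], x[n:2 * n], x[2 * n:]


# Toy pathway R1 (-> A), R2 (A -> B), R3 (B ->); R2 has a regulatory
# soft bound of 2.5 (for example p * Vmax = 0.25 * 10).
S = np.array([[1.0, -1.0, 0.0], [0.0, 1.0, -1.0]])
w = np.array([0.0, 0.0, 1.0])
lb, ub = np.zeros(3), np.array([10.0, 1000.0, 1000.0])
lb_soft, ub_soft = np.zeros(3), np.array([10.0, 2.5, 1000.0])

v_strict, _, beta_strict = prom_like_lp(S, w, lb, ub, lb_soft, ub_soft, kappa=2.0)
v_loose, _, beta_loose = prom_like_lp(S, w, lb, ub, lb_soft, ub_soft, kappa=0.5)
```

With a penalty larger than the gain per unit of flux (κ = 2) the solution respects the regulatory bound; with a small penalty (κ = 0.5) the solver prefers to violate it and pay for the slack.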

Data used for the E. coli PROM model

Organism:                  E. coli
Metabolic Model:           iAF1260
Metabolic Reactions:       2382
Regulatory data:           RegulonDB
Regulatory Interactions:   1773
Microarrays:               907
Total Genes in the model:  1400
Validation Data set:       1875 growth phenotypes

Feist A et al, Molecular Systems Biology, 2007
Chandrasekaran, S., and Price, N.D., PNAS, 2010

Automated PROM model has similar accuracy to RFBA

Figure: PROM and RFBA growth predictions (lethal / non-lethal) for knockout strains across growth substrates (1,2-propanediol through xanthosine).

Comparison with RFBA (legend):
• Non-lethal, both PROM and RFBA are right
• Lethal, both PROM and RFBA are right
• PROM wrong, RFBA right
• PROM right, RFBA wrong
• Lethal, both wrong
• Non-lethal, both wrong

Accuracy: PROM 85%, RFBA 81% (automated PROM vs. manual RFBA)

Covert MW et al, Nature, 2004
Chandrasekaran S, and Price ND, PNAS, 2010

Increased comprehensiveness to previous RFBA model

Figure: bar chart comparing the PROM E. coli model with E. coli iMC1010 by number of interactions, regulated metabolic genes and transcription factors (axis 0 to 2000).

Automated learning from high-throughput data improves comprehensiveness

Covert MW, Nature, 2004
Chandrasekaran, S, and Price, ND, In review, 2010

Results: Quantitative Growth Prediction

Growth rate prediction by PROM

Culture              Actual   PROM
WT + O2              0.71     0.7382
WT - O2              0.49     0.385
ΔarcA + O2           0.69     0.7651
ΔarcA - O2           0.38     0.3224
Δfnr + O2            0.63     0.5635
Δfnr - O2            0.41     0.2181
Δfnr/ΔarcA + O2      0.65     0.6596
Δfnr/ΔarcA - O2      0.3      0.204
ΔappY + O2           0.64     0.7152
ΔappY - O2           0.48     0.3287
ΔoxyR + O2           0.64     0.7876
ΔoxyR - O2           0.48     0.3287
ΔsoxS + O2           0.72     0.7687
ΔsoxS - O2           0.46     0.379

Experimental data taken from MW Covert et al, Nature, 2004
Chandrasekaran, S., and Price, N.D., PNAS, 2010

Figure: scatter plot against experimental growth rate.

Overall correlation with experimental data: R = 0.95

Function of both oxygen switch (dominant) and regulation
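The reported correlation can be checked against the tabulated E. coli growth rates. The sketch below assumes that R denotes the Pearson correlation coefficient, which the slide does not state explicitly; the two lists are the Actual and PROM columns in the order WT, ΔarcA, Δfnr, Δfnr/ΔarcA, ΔappY, ΔoxyR, ΔsoxS, each with and then without O2.

```python
import math
from typing import Sequence


def pearson_r(x: Sequence[float], y: Sequence[float]) -> float:
    """Pearson correlation coefficient of two equally long sequences."""
    if len(x) != len(y) or len(x) < 2:
        raise ValueError("need two sequences of equal length >= 2")
    mean_x, mean_y = sum(x) / len(x), sum(y) / len(y)
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sxx = sum((a - mean_x) ** 2 for a in x)
    syy = sum((b - mean_y) ** 2 for b in y)
    return sxy / math.sqrt(sxx * syy)


# Actual and PROM growth rates as listed on the slide.
actual = [0.71, 0.49, 0.69, 0.38, 0.63, 0.41, 0.65,
          0.30, 0.64, 0.48, 0.64, 0.48, 0.72, 0.46]
prom = [0.7382, 0.385, 0.7651, 0.3224, 0.5635, 0.2181, 0.6596,
        0.204, 0.7152, 0.3287, 0.7876, 0.3287, 0.7687, 0.379]

r = pearson_r(actual, prom)
print(f"R = {r:.2f}")
```

Under that assumption the result should land close to the R = 0.95 quoted on the slide.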

PROM Model Inputs for M. tuberculosis

Organism:                  M. tuberculosis
Metabolic Model:           iNJ661
Metabolic Reactions:       1028
Regulatory data:           Balazsi et al
Regulatory Interactions:   218
Microarrays:               437
Total Genes in the model:  691
Validation Data set:       30 TF knockout

Jamshidi NJ, and Palsson, BO, BMC Systems Biology, 2007
Balazsi G et al, Molecular Systems Biology, 2008
Boshoff HI et al, JBC, 2004

Accuracy in predicting essentiality of TF for optimal growth

Accuracy:     95%
Sensitivity:  83%
Specificity:  100%

TF         Predicted growth rate
dnaA       0.03
Rv0485     0.042
crp        0.03
sigD       0.05
kdpE       0.052
ideR       0.038
Rv1395     0.028
argR       0.047
sigC       0.024
sigH       0.05
lrpA       0.032
Rv3575c    0.026
oxyS       0.052
nadR       0.052
hspR       0.052
regX3      0.052
Rv0586     0.052
narL       0.052
sigE       0.052
furA       0.052
Rv1931c    0.052
furB       0.052
lexA       0.052
pknK       0.052
dosR       0.052
birA       0.052
sigF       0.052
kstR       0.052
cyp143     0.052
embR       0.052

Legend
Correct prediction: essential gene, non-essential gene, candidate essential
Incorrect prediction: non-essential gene, essential gene

Chandrasekaran and Price, Proc. Natl. Acad. Sci. USA, 2010
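For reference, accuracy, sensitivity and specificity follow from the counts of a confusion matrix. The sketch below uses illustrative counts picked only because they reproduce the three percentages quoted above; the actual confusion matrix is not given on the slide, and "positive" is assumed here to mean "essential".

```python
from typing import NamedTuple


class Metrics(NamedTuple):
    accuracy: float
    sensitivity: float
    specificity: float


def essentiality_metrics(tp: int, fp: int, tn: int, fn: int) -> Metrics:
    """tp/fn: essential TFs predicted essential / non-essential;
    tn/fp: non-essential TFs predicted non-essential / essential."""
    total = tp + fp + tn + fn
    if total == 0 or tp + fn == 0 or tn + fp == 0:
        raise ValueError("need at least one essential and one non-essential case")
    return Metrics(accuracy=(tp + tn) / total,
                   sensitivity=tp / (tp + fn),
                   specificity=tn / (tn + fp))


# Illustrative counts only (not taken from the slide).
m = essentiality_metrics(tp=5, fp=0, tn=14, fn=1)
print(f"accuracy={m.accuracy:.0%} sensitivity={m.sensitivity:.0%} "
      f"specificity={m.specificity:.0%}")
```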

PROM Model Inputs for S. cerevisiae

Organism:                  S. cerevisiae
Metabolic Model:           iMM904
Metabolic Reactions:       1577
Regulatory data:           YEASTRACT
Regulatory Interactions:   4200
Microarrays:               904, M3D
Total Genes in the model:  904
Validation Data set:       136 TF knockout

Duarte NC et al, BMC Genomics, 2004
Steinmetz LM et al., Nature Genetics, 2002
Ghosh, Chandrasekaran, Zhao, and Price, 2010 (in preparation)

Increased comprehensiveness to previous RFBA model

                             RFBA model iMH805/775   PROM model
Transcription Factors        55                      136
Regulated Metabolic Genes    348                     904
Interactions                 775                     4200

Figure: bar chart of the same three counts for the PROM model and the RFBA model iMH805/775 (axis 0 to 5000).

Ghosh, Chandrasekaran, Zhao, and Price, 2010 (in preparation)
Herrgard et al., Genome Res, 2006

Accuracy in predicting essentiality of TF for optimal growth

• Predicts correctly 135/136 of lethal/non-lethal calls
• Identifies 8 lethal TF KOs, with only 1 false positive
• Lone miss (Gcn4) is a very slow grower (multiple days)

Ghosh, Chandrasekaran, Zhao, and Price, 2010 (in preparation)

Validation: Quantitative Growth Prediction

Growth rate prediction by PROM

           Glucose                 Galactose               Fructose
           SUR = 6.3, OUR = 2.5    SUR = 2.1, OUR = 3.9    SUR = 2.6, OUR = 6.2
Culture    Actual    PROM          Actual    PROM          Actual    PROM
WT         0.21      0.22          0.13      0.15          0.2       0.23
adr1       0.21      0.22          0.13      0.15          0.2       0.23
cat8       0.21      0.22          0.13      0.15          0.2       0.23
mig2       0.21      0.22          0.13      0.15          0.2       0.23
sip4       0.21      0.22          0.13      0.12          0.2       0.19
gal4       0.21      0.22          0.03      0.01          0.2       0.23
rtg1       0.21      0.22          0.08      0.07          0.2       0.23
mth1       0.21      0.21          0.11      0.15          0.2       0.23
nrg1       0.21      0.20          0.13      0.14          0.2       0.22
mig1       0.21      0.21          0.13      0.15          0.2       0.21
gcr2       0.17      0.15          0.13      0.15          0.16      0.17

Figure: scatter plot against experimental growth rate (axes 0 to 0.25).

Overall correlation with experimental data: R = 0.96

Driven by both substrate (dominant) and regulation

Experimental data taken from MJ Herrgard et al, Genome Res, 2006
Ghosh, Chandrasekaran, Zhao, and Price, 2010 (in preparation)

Quantitative Growth Prediction for 77 TF knockout Phenotypes with Galactose

Figure: scatter plot against experimental growth rate (axes 0 to 0.25).

Overall correlation with experimental data: R = 0.90 (based only on regulation – metabolic model alone would be flat line)

Experimental data taken from SM Fendt et al, Molecular Systems Biology, 2010
Ghosh, Chandrasekaran, Zhao, and Price, 2010 (in preparation)

Prediction of Metabolic flux for ∆Gcn4 mutant strain

Reaction                        WT (expt)   GCN4 (expt)   WT (model)   GCN4 (model)
G6P <=> F6P (net)               492.89      362.99        500          500
PEP ->                          1280.46     1290.54       17.72        18.94
P5P <=> EC2 + G3P (net)         127.82      213.63        0.263        0.217
F6P <=> EC2 + E4P (net)         -60.99      -103.85       -0.182       -0.19
S7P <=> EC3 + E4P (net)         66.82       109.79        8.661        9.19
PYR -> ACA + CO2                1162.09     1035.51       17.78        15.94
ETH -> ETHOUT                   662.32      666.92        15.82        17.72
ACE -> ACCOA                    504.22      371.79        0.147        0.067
OAAMIT + ACCOAMIT -> CITMIT     515.08      496.43        0.529        0.811
OAAMIT <=> OAA (net)            -243.97     -156.85       -498.08      -464.36
CITMIT <=> CIT (net)            51.48       72.45         0.035        0.076
SER -> CYS                      2.84        2.07          0.230        0.0019
SER <=> GLY + METTHF (net)      8.92        14.02         0.09709      0.0448
OAA -> ASP                      27.41       21.02         500          474.91
PYR ->                          4.32        3.32          0.67622      0.1471
AKG -> GLU                      27.48       25.89         0.0285       0.0131
GLU -> ORN                      5.81        3.72          0.04626      0.0258
CHOR -> PPHN                    5.83        5.94          0.03111      0.0679

Figure: flux (model) plotted against flux (expt).

Experimental data taken from SM Fendt et al, Moxley et al, PNAS, 2009
Ghosh, Chandrasekaran, Zhao, and Price, 2010 (in preparation)

PROM Highlights

• PROM is a new approach for integrating the transcriptional network with metabolism
  • Automated and comprehensive
• We compared it with state-of-the-art metabolic-regulatory models of E. coli
  • Comparable accuracy
  • More comprehensive (automated from HT data)
• We constructed the first genome-scale integrated regulatory-metabolic model for M. tuberculosis
• We compared it with state-of-the-art metabolic-regulatory models of S. cerevisiae
  • Much more accurate
  • Much more comprehensive (automated from HT data)
• PROM can accurately predict the effect of perturbations to transcriptional regulators and subsequently be used to predict microbial growth phenotypes quantitatively

Chandrasekaran and Price, Proc. Natl. Acad. Sci. USA, 2010

Constraint-based Reconstruction and Analysis Conference

Confirmed Speakers

Eivind Almaas, Ronan Fleming, Vassily Hatzimanikatis, Christopher Henry, Hermann Georg Holzhütter, Costas Maranas, Jens Nielsen, Bernhard Palsson, Jason Papin, Balázs Papp, Nathan Price, Eytan Ruppin, Uwe Sauer, Stefan Schuster, Daniel Segre, Ines Thiele

Key Dates

April 7, 2011 - Abstract Deadline for oral & poster presentations (WILL EXTEND)
June 24-26, 2011 - COBRA conference

Acknowledgments

Nathan D. Price Lab @ the University of Illinois, Urbana-Champaign

Postdocs

Nick Chia, Cory Funk, Amit Ghosh, Pan-Jun Kim, Charu Gupta Kumar, Younhee Ko, Vineet Sangar

Graduate Students

Daniel Baker, Matthew Benedict, Sriram Chandrasekaran, John Earls, Piyush Labhsetwar, Shuyi Ma, Bozena Sawicka, Jaeyun Sung, James Eddy, Andrew Magis, Chunjing Wang, Matthew Gonnerman, Seyfullah Kotil, Caroline Milne, Matthew Richards, Yuliang Wang

Funding Sources

• NIH / National Cancer Institute Howard Temin Pathway to Independence Award
• NSF CAREER
• Department of Defense – TATRC
• Department of Energy
• Energy Biosciences Institute (BP)
• Luxembourg-ISB Systems Medicine Program
• Roy J. Carver Charitable Trust Young Investigator Award