Computational Methods in Systems Biology and Synthetic Biology
François Fages, Constraint Programming Group, INRIA Paris-Rocquencourt
http://contraintes.inria.fr/
06/11/2015, C2-19 MPRI
Overview of the Lectures
1. Formal molecules and reaction models in BIOCHAM
2. Kinetics
3. Qualitative properties formalized in temporal logic CTL
4. Quantitative properties formalized in LTL(R) and PLTL(R)
5. …

References
A wonderful textbook: Molecular Cell Biology. 5th Edition, 1100 pages + CD, Freeman Publ. Lodish, Berk, Zipursky, Matsudaira, Baltimore, Darnell. Nov. 2003.
The Biochemical Abstract Machine BIOCHAM. http://contraintes.inria.fr/BIOCHAM
Formal Cell Biology in BIOCHAM (tutorial). François Fages and Sylvain Soliman. 8th International School on Computational Systems Biology. Springer-Verlag, LNCS 5016. Mar. 2008.
Modeling dynamic phenomena in molecular and cellular biology. Segel. Cambridge Univ. Press. 1987.

Cell Molecules
• Small molecules: covalent bonds, 50-200 kcal/mol
– 70% water
– 1% ions
– 6% amino acids (20), nucleotides (5),
– fats, sugars, ATP, ADP, …
• Macromolecules: hydrogen bonds, ionic, hydrophobic, van der Waals, 1-5 kcal/mol
Stability and bindings determined by the number of weak bonds: 3D shape
– 20% proteins (50-10^4 amino acids)
– RNA (10^2-10^4 nucleotides AGCU)
– DNA (10^2-10^6 nucleotides AGCT)

Proteins
1) Primary structure: word of n amino acid residues (20^n possibilities) linked with C-N bonds
Example: MPRI = Methionine-Proline-Arginine-Isoleucine
2) Secondary structure: word of m α-helices, β-strands, random coils, … (3^m-10^m possibilities) stabilized by hydrogen bonds H---O
3) Tertiary 3D structure: spatial folding stabilized by hydrophobic interactions; explains the protein interaction capabilities
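The primary-structure view above (a protein as a word over 20 amino acids) can be illustrated with a small sketch. The function names are hypothetical and only the four residues of the MPRI example are listed; the one-letter codes themselves are the standard ones.

```python
# One-letter codes for the four residues of the "MPRI" example.
# Only these four are listed; a full table would have 20 entries.
AMINO_ACIDS = {
    "M": "Methionine",
    "P": "Proline",
    "R": "Arginine",
    "I": "Isoleucine",
}


def spell_out(sequence: str) -> str:
    """Expand a one-letter primary structure into residue names."""
    return "-".join(AMINO_ACIDS[letter] for letter in sequence)


def count_primary_structures(n: int, alphabet_size: int = 20) -> int:
    """Number of distinct words of n residues: alphabet_size ** n."""
    return alphabet_size ** n


if __name__ == "__main__":
    print(spell_out("MPRI"))
    print(count_primary_structures(4))
```

Even for a short word of 4 residues the count is 20^4 = 160,000, which gives a feel for how quickly the space of primary structures grows with n.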

Formal Proteins
• Cyclin-dependent kinase 1: Cdk1 (free, inactive)
• Complex Cdk1-Cyclin B: Cdk1-CycB (low activity)
• Phosphorylated form at site threonine 161: Cdk1~{thr161}-CycB (high activity)
“Mitosis-Promoting Factor”: phosphorylates actin in microtubules
[Figure: nuclear division]

Complexation and Phosphorylation Rules
• Complexation: Cdk1 + CycB => Cdk1-CycB
• Decomplexation: Cdk1-CycB => Cdk1 + CycB
• Phosphorylation: Cdk1-CycB =[Myt1]=> Cdk1~{thr161}-CycB
• Dephosphorylation: Cdk1~{thr14,tyr15}-CycB =[Cdc25~{Nterm}]=> Cdk1-CycB

DNA: Deoxyribonucleic Acid
1) Primary structure: word over 4 nucleotides: Adenine, Guanine, Cytosine, Thymine
2) Secondary structure: double helix of pairs A--T and C---G stabilized by hydrogen bonds
Size of DNA = number of pairs
Genes are parts of DNA
Nobel Prizes: Watson and Crick

Genome Size
Species                 Genome size   Chromosomes    Coding DNA
E. coli (bacteria)      4 Mb          1 (circular)   100 %
S. cerevisiae (yeast)   12 Mb         16             70 %
Mouse, Human            3 Gb          20, 23         15 %
Onion                   15 Gb         8              1 %
Lungfish                140 Gb                       0.7 %
Human genome: 3,200,000,000 pairs of nucleotides; single nucleotide polymorphism: 1 / 2 kb

DNA Replication
1. Separation of the two helices
2. Production of one complementary strand for each copy (from one or several starting points of replication)

Transcription and Translation
1. Activation (inhibition): transcription factors bind to the regulatory region of the gene (Nobel Prize: Jacob and Monod)
2. Transcription: RNA polymerase copies the DNA from start to stop positions into a single-stranded premature messenger RNA (pRNA)
3. (Alternative) splicing: non-coding regions of pRNA are removed, giving mature messenger RNA (mRNA)
4. Translation: mRNA moves to the cytoplasm and binds to a ribosome to assemble a protein

Formal Genes
• Part of DNA, unique: #E2
• Activation, binding of a promotion factor: #E2-E2f13-DP12
• Repression (inhibition), binding of another molecule: #E2-Rep

Transcription and Translation Rules
Activation: #E2 + E2f13-DP12 => #E2-E2f13-DP12
Repression: #E2 + Rep => #E2-Rep
Transcription: _ =[#E2-E2f13-DP12]=> pRNAcycA
(Alternative) splicing: pRNAcycA => mRNAcycA
                        pRNAcycA => mRNAcycA2
Translation: mRNAcycA => mRNAcycA::cyt
             mRNAcycA::cyt + ribosome::cyt => cycA::cyt + ribosome::cyt
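The base pairing (A-T, C-G), replication and transcription steps described in the DNA slides above can be sketched on strings. This is a deliberately simplified illustration: strand orientation (5'/3'), splicing and start/stop positions are ignored, and all function names are hypothetical.

```python
# Watson-Crick pairing from the DNA slide: A pairs with T, C pairs with G.
PAIRING = {"A": "T", "T": "A", "C": "G", "G": "C"}


def complementary_strand(strand: str) -> str:
    """Build the strand that pairs position by position with `strand`.

    Simplification: strand direction is ignored.
    """
    return "".join(PAIRING[base] for base in strand)


def replicate(strand: str):
    """Replication sketch: separate the two strands, then complete each one.

    Returns two double helices, each as a (old strand, new strand) pair.
    """
    partner = complementary_strand(strand)
    return [
        (strand, complementary_strand(strand)),
        (partner, complementary_strand(partner)),
    ]


def transcribe(coding_strand: str) -> str:
    """Transcription sketch: the RNA copy uses U where DNA has T."""
    return coding_strand.replace("T", "U")


if __name__ == "__main__":
    print(complementary_strand("ATGC"))
    print(replicate("ATGC"))
    print(transcribe("ATGC"))
```

Both helices returned by `replicate` contain the same pair of strands, which is the point of step 2 of the replication slide: each copy keeps one original strand and receives one newly produced complement.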

BIOCHAM Syntax of Objects
E ::= compound | E-E | E~{p1,…,pn}
• compound: molecule, #gene binding site
• - : binding operator for protein complexes, gene binding sites, … Associative and commutative.
• ~{…}: modification operator for phosphorylated sites, … Set of modified sites (associative, commutative, idempotent).
O ::= E | E::location
• location: symbolic compartment (nucleus, cytoplasm, membrane, …)

BIOCHAM Syntax of Rules
S ::= _ | O+S
+ : solution operator (associative, commutative, neutral element _)
R ::= S => S | kinetic-expression for R
Abbreviations:
A =[C]=> B stands for A+C => B+C
A <=> B stands for A=>B and B=>A
Compatible with the Systems Biology Markup Language (SBML) exchange format for reaction models

Elementary Rule Schemas
• Complexation: A + B => A-B. Decomplexation: A-B => A + B.
  Cdk1 + CycB => Cdk1-CycB
• Phosphorylation: A =[C]=> A~{p}. Dephosphorylation: A~{p} =[C]=> A.
  Cdk1-CycB =[Myt1]=> Cdk1~{thr161}-CycB
  Cdk1~{thr14,tyr15}-CycB =[Cdc25~{Nterm}]=> Cdk1-CycB
• Synthesis: _ =[C]=> A.
  _ =[#E2-E2f13-DP12]=> cycA
• Degradation: A =[C]=> _.
  cycE =[@UbiPro]=> _ (not for cycE-cdk2, which is stable)
• Transport: A::L1 => A::L2.
  Cdk1~{p}-CycB::cytoplasm => Cdk1~{p}-CycB::nucleus

From Syntax to Semantics
1. Boolean semantics: presence-absence of molecules
– Concurrent transition system (asynchronous, non-deterministic)
2. Continuous semantics: concentrations
– Ordinary differential equations
– Hybrid system (deterministic)
3. Stochastic semantics: numbers of molecules
– Continuous-time Markov chain

Two-stroke Engine with ATP Fuel
Myosin + ATP => Myosin-ATP
Myosin-ATP => Myosin + ADP
• Actin-myosin microtubule contraction controlled by MPF, which phosphorylates myosin
• Actin-myosin muscle contraction controlled by ion Ca2+
http://www.sci.sdsu.edu/movies
http://www-rocq.inria.fr/sosso/icema2

Cell Signaling
• Signals:
– hormones: insulin, adrenaline, steroids, EGF, …
– neighboring cell membrane proteins: Delta
– nutrients, light, pressure, …
• Receptors:
– Tyrosine kinases
– G protein-coupled
– TGFβ
– Notch
– …

Receptor Tyrosine Kinase (RTK)
L + R <=> L-R
L-R + L-R => L-R-L-R
RAS-GDP =[L-R-L-R]=> RAS-GTP

Five MAP Kinase Pathways in Budding Yeast (Saccharomyces cerevisiae)
[Figure]

MAPK Signaling Pathways
• Input: RAS activated by the receptor activates RAF
  RAS-GTP + RAF-P14-3-3 => RAS-GDP + RAF + P14-3-3
• Output: active MAPK moves to the nucleus and phosphorylates a transcription factor, which stimulates gene expression
  RAF + … => …
  … => MAPK~{T183,Y185}

Three Levels MAPK Cascade
[Figure: three levels of the cascade]
RAF, RAF~{p1}
MEK, MEK~{p1}, MEK~{p1,p2}
MAPK, MAPK~{p1}, MAPK~{p1,p2}
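To make the continuous semantics concrete, the two-stroke engine rules above (Myosin + ATP => Myosin-ATP, Myosin-ATP => Myosin + ADP) can be read as ordinary differential equations. The sketch below assumes mass-action kinetics, invented rate constants and initial concentrations, and a plain explicit Euler scheme; none of these choices come from the slides.

```python
def simulate_two_stroke(k_bind=1.0, k_release=0.5, dt=0.001, t_end=10.0):
    """Integrate the two-stroke engine with explicit Euler steps.

    Assumptions (not from the slides): mass-action rates, the rate
    constants k_bind and k_release, and the initial concentrations below.
    """
    state = {"Myosin": 1.0, "ATP": 5.0, "Myosin-ATP": 0.0, "ADP": 0.0}
    steps = int(round(t_end / dt))
    for _ in range(steps):
        # Myosin + ATP => Myosin-ATP, rate k_bind * [Myosin] * [ATP]
        v_bind = k_bind * state["Myosin"] * state["ATP"]
        # Myosin-ATP => Myosin + ADP, rate k_release * [Myosin-ATP]
        v_release = k_release * state["Myosin-ATP"]
        state["Myosin"] += dt * (v_release - v_bind)
        state["ATP"] += dt * (-v_bind)
        state["Myosin-ATP"] += dt * (v_bind - v_release)
        state["ADP"] += dt * v_release
    return state


if __name__ == "__main__":
    final = simulate_two_stroke()
    for name, value in final.items():
        print(f"{name}: {value:.4f}")
```

Two quantities are conserved by construction and make a useful sanity check: total myosin (Myosin + Myosin-ATP) and total nucleotide (ATP + Myosin-ATP + ADP). Myosin acts as a catalyst that is recycled while ATP is irreversibly turned into ADP.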

MAPK Signaling Cascade in BIOCHAM
RAF + RAFK <=> RAF-RAFK.
RAF-RAFK => RAFK + RAF~{p1}.
RAF~{p1} + RAFPH <=> RAF~{p1}-RAFPH.
RAF~{p1}-RAFPH => RAF + RAFPH.
MEK~$P + RAF~{p1} <=> MEK~$P-RAF~{p1} where p2 not in $P.
MEK~{p1}-RAF~{p1} => MEK~{p1,p2} + RAF~{p1}.
MEK-RAF~{p1} => MEK~{p1} + RAF~{p1}.
MEKPH + MEK~{p1}~$P <=> MEK~{p1}~$P-MEKPH.
MEK~{p1}-MEKPH => MEK + MEKPH.
MEK~{p1,p2}-MEKPH => MEK~{p1} + MEKPH.
MAPK~$P + MEK~{p1,p2} <=> MAPK~$P-MEK~{p1,p2} where p2 not in $P.
MAPKPH + MAPK~{p1}~$P <=> MAPK~{p1}~$P-MAPKPH.
MAPK~{p1}-MAPKPH => MAPK + MAPKPH.
MAPK~{p1,p2}-MAPKPH => MAPK~{p1} + MAPKPH.
MAPK-MEK~{p1,p2} => MAPK~{p1} + MEK~{p1,p2}.
MAPK~{p1}-MEK~{p1,p2} => MAPK~{p1,p2} + MEK~{p1,p2}.
Pattern variables $P for
- phosphorylation sites
- molecules
with symbolic constraints.
BIOCHAM rules with patterns are expanded into BIOCHAM rules without patterns.

Bipartite Proteins-Reactions Graph of MAPK
[Figure drawn with GraphViz, http://www.research.att.co/sw/tools/graphviz]

Influence Graph
Influence graph inferred from the syntactical reaction model of the MAPK “cascade”
Negative feedback loops [Fages Soliman CMSB 06]
Possibility of oscillations [Qiao et al. PLOS 07]

Reaction Model of the MAPK Cascade [Levchenko et al. PNAS 2000]
(MA(1),MA(0.4)) for RAF + RAFK <=> RAF-RAFK.
(MA(0.5),MA(0.5)) for RAF~{p1} + RAFPH <=> RAF~{p1}-RAFPH.
(MA(3.3),MA(0.42)) for MEK~$P + RAF~{p1} <=> MEK~$P-RAF~{p1} where p2 not in $P.
(MA(10),MA(0.8)) for MEKPH + MEK~{p1}~$P <=> MEK~{p1}~$P-MEKPH.
(MA(20),MA(0.7)) for MAPK~$P + MEK~{p1,p2} <=> MAPK~$P-MEK~{p1,p2} where p2 not in $P.
(MA(5),MA(0.4)) for MAPKPH + MAPK~{p1}~$P <=> MAPK~{p1}~$P-MAPKPH.
MA(0.1) for RAF-RAFK => RAFK + RAF~{p1}.
MA(0.1) for RAF~{p1}-RAFPH => RAF + RAFPH.
MA(0.1) for MEK~{p1}-RAF~{p1} => MEK~{p1,p2} + RAF~{p1}.
MA(0.1) for MEK-RAF~{p1} => MEK~{p1} + RAF~{p1}.
MA(0.1) for MEK~{p1}-MEKPH => MEK + MEKPH.
MA(0.1) for MEK~{p1,p2}-MEKPH => MEK~{p1} + MEKPH.
MA(0.1) for MAPK-MEK~{p1,p2} => MAPK~{p1} + MEK~{p1,p2}.
MA(0.1) for MAPK~{p1}-MEK~{p1,p2} => MAPK~{p1,p2} + MEK~{p1,p2}.
MA(0.1) for MAPK~{p1}-MAPKPH => MAPK + MAPKPH.
MA(0.1) for MAPK~{p1,p2}-MAPKPH => MAPK~{p1} + MAPKPH.

Differential Simulation
[Figure]

Stochastic Simulation
[Figure]

Boolean Simulation
[Figure]

Automatic Generation of CTL Properties
• reachable(MAPK~{p1})
• reachable(!(MAPK~{p1}))
• oscil(MAPK~{p1})
• …
• reachable(MAPKPH-MAPK~{p1})
• reachable(!(MAPKPH-MAPK~{p1}))
• oscil(MAPKPH-MAPK~{p1})
• AG(!(MAPKPH-MAPK~{p1})->checkpoint(MAPKPH,MAPKPH-MAPK~{p1}))
• AG(!(MAPKPH-MAPK~{p1})->checkpoint(MAPK~{p1},MAPKPH-MAPK~{p1}))
• …
• reachable(MAPK~{p1,p2})
• reachable(!(MAPK~{p1,p2}))
• oscil(MAPK~{p1,p2})
• …

A Logical Paradigm for Systems Biology
Biological process model = Concurrent Transition System
Biological property = Temporal Logic Formula
Biological validation = Model-checking
[Lincoln et al. PSB’02] [Chabrier Fages CMSB’03] [Bernot et al. TCS’04] …
• Model: BIOCHAM (SBML)
- Boolean
- Differential
- Stochastic
• Tools:
- simulation
- query evaluation
- rule learning
- parameter search
• Types: static analyses
• Biological properties:
- Temporal logic CTL
- LTL with constraints
- PCTL with constraints

The Biochemical Abstract Machine BIOCHAM
Modeling environment based on two formal languages:
1. Rule Language for Modeling Biochemical Systems
– Syntax of molecules, compartments and reactions
– Semantics at 3 abstraction levels: Boolean, Concentrations, Populations
2. Temporal Logic for Formalizing Biological Properties
– CTL for Boolean semantics
– Constraint LTL for concentration semantics
– PCTL for stochastic semantics
Learning from Temporal Properties
– Learning reaction rules from CTL specification
– Learning kinetic parameter values from Constraint-LTL specification

Language Approach to Cell Systems Biology
Qualitative models: from diagrammatic notation to
• Boolean networks [Thomas 73]
• Petri nets [Reddy 93]
• Milner’s π-calculus [Regev-Silverman-Shapiro 99-01, Nagasaki et al. 00]
• Bio-ambients [Regev-Panina-Silverman-Cardelli-Shapiro 03]
• Pathway logic [Eker-Knapp-Laderoute-Lincoln-Meseguer-Sonmez 02]
• Transition systems [Chabrier-Chiaverini-Danos-Fages-Schachter 04]: BIOCHAM-1, Kappa
Quantitative models: from differential equation systems to
• Hybrid Petri nets [Hofestadt-Thelen 98, Matsuno et al. 00]
• Hybrid automata [Alur et al. 01, Ghosh-Tomlin 01]
• Hybrid concurrent constraint languages [Bockmayr-Courtois 01]
• Stochastic π-calculus [Priami 03, Cardelli 04]
• Rules with continuous/stochastic dynamics: BIOCHAM-2, BioNetGen
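
The "model = concurrent transition system, validation = model-checking" view above can be illustrated by a toy check of a reachable(...) query under a Boolean reading of the rules. This is only a sketch and not BIOCHAM's implementation: it assumes that a rule can fire when all its reactants are present, that its products then become present, and that each reactant may or may not be consumed (the source of non-determinism). Only the first level of the MAPK cascade is encoded, and the function names are hypothetical.

```python
from collections import deque
from itertools import chain, combinations

# First level of the MAPK cascade, with <=> written as two rules.
# Each rule is (reactants, products).
RULES = [
    ({"RAF", "RAFK"}, {"RAF-RAFK"}),
    ({"RAF-RAFK"}, {"RAF", "RAFK"}),
    ({"RAF-RAFK"}, {"RAFK", "RAF~{p1}"}),
]


def subsets(items):
    """All subsets of `items`, as tuples."""
    items = list(items)
    return chain.from_iterable(
        combinations(items, size) for size in range(len(items) + 1)
    )


def successors(state):
    """Boolean states reachable in one rule firing (assumed semantics)."""
    for reactants, products in RULES:
        if reactants <= state:
            # Assumption: any subset of the reactants may be consumed.
            for consumed in subsets(reactants - products):
                yield frozenset((state - set(consumed)) | products)


def reachable(initial, molecule):
    """Breadth-first search for a state where `molecule` is present."""
    start = frozenset(initial)
    seen = {start}
    queue = deque([start])
    while queue:
        state = queue.popleft()
        if molecule in state:
            return True
        for nxt in successors(state):
            if nxt not in seen:
                seen.add(nxt)
                queue.append(nxt)
    return False


if __name__ == "__main__":
    print(reachable({"RAF", "RAFK"}, "RAF~{p1}"))
    print(reachable({"RAF"}, "RAF~{p1}"))
```

Explicit breadth-first search is enough for a handful of molecules; the state space grows exponentially with the number of molecules, which is why symbolic model-checkers are used on real models.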