Computational Methods in Systems Biology and Synthetic Biology François Fages, Constraint Programming Group INRIA Paris-Rocquencourt mailto:[email protected] http://contraintes.inria.fr/ 06/11/2015 François Fages C2-19 MPRI.


Computational Methods in
Systems Biology and Synthetic Biology
François Fages,
Constraint Programming Group
INRIA Paris-Rocquencourt
mailto:[email protected]
http://contraintes.inria.fr/
06/11/2015
François Fages C2-19 MPRI
1
Overview of the Lectures
1. Formal molecules and reaction models in BIOCHAM
2. Kinetics
3. Qualitative properties formalized in temporal logic CTL
4. Quantitative properties formalized in LTL(R) and PLTL(R)
5. …
References
A wonderful textbook:
Molecular Cell Biology. 5th Edition, 1100 pages+CD, Freeman Publ.
Lodish, Berk, Zipursky, Matsudaira, Baltimore, Darnell. Nov. 2003.
The Biochemical Abstract Machine BIOCHAM. http://contraintes.inria.fr/BIOCHAM
Formal Cell Biology in BIOCHAM (tutorial). François Fages and Sylvain Soliman.
8th International School on Computational Systems Biology.
Springer-Verlag, LNCS 5016. Mar. 2008.
Modeling dynamic phenomena in molecular and cellular biology.
Segel. Cambridge Univ. Press. 1987.
Cell Molecules
• Small molecules: covalent bonds 50-200 kcal/mol
– 70% water
– 1% ions
– 6% amino acids (20), nucleotides (5),
– fats, sugars, ATP, ADP, …
• Macromolecules: hydrogen bonds, ionic, hydrophobic, van der Waals 1-5 kcal/mol
Stability and bindings determined by the number of weak bonds: 3D shape
– 20% proteins (50-10^4 amino acids)
– RNA (10^2-10^4 nucleotides AGCU)
– DNA (10^2-10^6 nucleotides AGCT)
Proteins
1) Primary structure: word of n amino acid residues (20^n possibilities)
linked with C-N bonds
Example: MPRI
Methionine-Proline-Arginine-Isoleucine
2) Secondary: word of m α-helices, β-strands, random coils, … (3^m-10^m)
stabilized by hydrogen bonds H---O
3) Tertiary 3D structure: spatial folding
stabilized by hydrophobic interactions
explains the protein interaction capabilities
Formal Proteins
• Cyclin dependent kinase 1: Cdk1 (free, inactive)
• Complex Cdk1-Cyclin B: Cdk1-CycB (low activity)
• Phosphorylated form at site threonine 161: Cdk1~{thr161}-CycB (high activity)
“Mitosis-Promoting Factor”, phosphorylates actin in microtubules
Nuclear division
Complexation and Phosphorylation Rules
• Complexation
Cdk1 + CycB => Cdk1-CycB
• Decomplexation
Cdk1-CycB => Cdk1 + CycB
• Phosphorylation
Cdk1-CycB =[Myt1]=> Cdk1~{thr161}-CycB
• Dephosphorylation
Cdk1~{thr14,tyr15}-CycB =[Cdc25~{Nterm}]=> Cdk1-CycB
DNA Deoxyribonucleic Acid
1) Primary structure: word over 4 nucleotides
Adenine, Guanine, Cytosine, Thymine
2) Secondary structure:
double helix of pairs A--T and C---G
stabilized by hydrogen bonds
Size of DNA = number of pairs
Genes are parts of DNA
Nobel Prize: Watson and Crick
DNA: Genome Size
Species                  Genome size   Chromosomes    Coding DNA
E. coli (bacteria)       4 Mb          1 (circular)   100 %
S. cerevisiae (yeast)    12 Mb         16             70 %
Mouse, Human             3 Gb          20, 23         15 %
Onion                    15 Gb         8              1 %
Lungfish                 140 Gb        (not given)    0.7 %
Human: 3,200,000,000 pairs of nucleotides
single nucleotide polymorphism: 1 per 2 kb
DNA Replication
1. Separation of the two strands of the double helix
2. Production of one complementary strand for each copy
(from one or several starting points of replication)
Transcription and Translation
1. Activation (Inhibition):
transcription factors bind to the regulatory region of the gene
(Nobel Prize: Jacob and Monod)
2. Transcription:
RNA polymerase copies the DNA from start to stop positions
into a single-stranded premature messenger RNA (pRNA)
3. (Alternative) splicing:
non-coding regions of pRNA are removed, giving mature messenger RNA (mRNA)
4. Translation:
mRNA moves to the cytoplasm and binds to a ribosome to assemble a protein
Formal Genes
• Part of DNA, unique: #E2
• Activation, binding of promotion factor: #E2-E2f13-DP12
• Repression (inhibition), binding of another molecule: #E2-Rep
Transcription and Translation Rules
Activation
#E2 + E2f13-DP12 => #E2-E2f13-DP12
Repression
#E2 + Rep => #E2-Rep
Transcription
_ =[#E2-E2f13-DP12]=> pRNAcycA
(Alternative) Splicing
pRNAcycA => mRNAcycA
pRNAcycA => mRNAcycA2
Translation
mRNAcycA => mRNAcycA::cyt
mRNAcycA::cyt + ribosome::cyt => cycA::cyt + ribosome::cyt
BIOCHAM Syntax of Objects
E ::= compound | E-E | E~{p1,…,pn}
• compound: molecule, #gene binding site
• - : binding operator for protein complexes, gene binding sites, …
Associative and commutative.
• ~{…}: modification operator for phosphorylated sites, …
Set of modified sites (Associative, Commutative, Idempotent).
O ::= E | E::location
• location: symbolic compartment (nucleus, cytoplasm, membrane, …)
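As a rough illustration of this object syntax, the Python sketch below encodes objects so that the stated algebraic laws hold by construction. The names Molecule, make_object and show are hypothetical and not part of BIOCHAM. As a simplifying assumption, modifications are attached to individual compounds rather than to whole complexes.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Molecule:
    """A compound with its set of modified sites (hypothetical encoding)."""
    name: str
    sites: frozenset = frozenset()

    def __str__(self):
        if not self.sites:
            return self.name
        return self.name + "~{" + ",".join(sorted(self.sites)) + "}"


def make_object(*parts, location=None):
    """Build an object as (sorted tuple of molecules, location).

    Sorting makes the binding operator '-' associative and commutative:
    A-B and B-A get the same representation.
    """
    molecules = [Molecule(p) if isinstance(p, str) else p for p in parts]
    key = tuple(sorted(molecules, key=str))
    return (key, location)


def show(obj):
    """Render an object back into BIOCHAM-like concrete syntax."""
    key, location = obj
    text = "-".join(str(m) for m in key)
    return text if location is None else text + "::" + location


# Sites form a set, so order and repetition do not matter (idempotence).
cdk1_p = Molecule("Cdk1", frozenset(["thr14", "tyr15", "thr14"]))
complex_1 = make_object(cdk1_p, "CycB", location="nucleus")
complex_2 = make_object("CycB", cdk1_p, location="nucleus")
print(show(complex_1), complex_1 == complex_2)
```

Representing a complex as a sorted tuple rather than a set keeps repeated components, which matters for objects such as L-R-L-R.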
BIOCHAM Syntax of Rules
S ::= _ | O+S
+ : solution operator (Associative, Commutative, Neutral _)
R ::= S => S | kinetic-expression for R
Abbreviations
A =[C]=> B stands for A+C => B+C
A <=> B stands for A=>B and B=>A
Compatible with the Systems Biology Markup Language SBML
exchange format for reaction models
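The two abbreviations can be unfolded mechanically. The sketch below is only an illustration: the function expand_rule and the list-of-names encoding of solutions are assumptions made here, not BIOCHAM's implementation. The empty list plays the role of the neutral solution _.

```python
def expand_rule(left, arrow, right, catalyst=None):
    """Expand one abbreviated rule into plain reactions (reactants, products).

    Solutions are lists of object names; the empty list stands for '_'.
    """
    reactants, products = list(left), list(right)
    if catalyst is not None:
        # A =[C]=> B stands for A + C => B + C
        reactants.append(catalyst)
        products.append(catalyst)
    # Sorting reflects that '+' is associative and commutative.
    reactions = [(sorted(reactants), sorted(products))]
    if arrow == "<=>":
        # A <=> B stands for A => B and B => A
        reactions.append((sorted(products), sorted(reactants)))
    return reactions


catalyzed = expand_rule(["RAS-GDP"], "=>", ["RAS-GTP"], catalyst="L-R-L-R")
reversible = expand_rule(["L", "R"], "<=>", ["L-R"])
synthesis = expand_rule([], "=>", ["pRNAcycA"], catalyst="#E2-E2f13-DP12")
for reaction in catalyzed + reversible + synthesis:
    print(reaction)
```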
Elementary Rule Schemas
• Complexation: A + B => A-B. Decomplexation: A-B => A + B.
Cdk1 + CycB => Cdk1-CycB
• Phosphorylation: A =[C]=> A~{p}. Dephosphorylation: A~{p} =[C]=> A.
Cdk1-CycB =[Myt1]=> Cdk1~{thr161}-CycB
Cdk1~{thr14,tyr15}-CycB =[Cdc25~{Nterm}]=> Cdk1-CycB
• Synthesis: _ =[C]=> A. Degradation: A =[C]=> _.
_ =[#E2-E2f13-DP12]=> cycA
cycE =[@UbiPro]=> _ (not for cycE-cdk2 which is stable)
• Transport: A::L1 => A::L2.
Cdk1~{p}-CycB::cytoplasm => Cdk1~{p}-CycB::nucleus
From Syntax to Semantics
1. Boolean Semantics: presence-absence of molecules
– Concurrent Transition System (asynchronous, non-deterministic)
2. Continuous Semantics: concentrations
– Ordinary Differential Equations
– Hybrid system (deterministic)
3. Stochastic Semantics: numbers of molecules
– Continuous time Markov chain
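The Boolean semantics can be sketched as a successor function on sets of present molecules. The code below is a minimal illustration, assuming the convention that a fired reaction makes its products present while each reactant may or may not be completely consumed, which is one source of non-determinism. The name successors and the encoding of rules as pairs of tuples are hypothetical.

```python
from itertools import combinations


def successors(state, rules):
    """All states reachable in one asynchronous step (one rule fires at a time).

    state: frozenset of molecule names that are present.
    rules: list of (reactants, products) tuples of molecule names.
    """
    result = set()
    for reactants, products in rules:
        if not set(reactants) <= state:
            continue  # rule not enabled: some reactant is absent
        # Reactants that are not also products may disappear, or remain.
        consumable = sorted(set(reactants) - set(products))
        for k in range(len(consumable) + 1):
            for gone in combinations(consumable, k):
                result.add(frozenset((state | set(products)) - set(gone)))
    return result


rules = [(("Cdk1", "CycB"), ("Cdk1-CycB",))]
initial = frozenset({"Cdk1", "CycB"})
for nxt in sorted(successors(initial, rules), key=sorted):
    print(sorted(nxt))
```

With the single complexation rule, the initial state has four successors, one for each subset of {Cdk1, CycB} that is consumed.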
Two-stroke Engine with ATP fuel
Myosin + ATP => Myosin-ATP
Myosin-ATP => Myosin + ADP
Actin-Myosin microtubule contraction controlled by MPF that phosphorylates myosin
Actin-Myosin muscle contraction controlled by ion Ca2+
http://www.sci.sdsu.edu/movies
http://www-rocq.inria.fr/sosso/icema2
Cell Signaling
• Signals:
– hormones: insulin, adrenaline, steroids, EGF, …
– neighboring cell membrane proteins: Delta
– nutrients, light, pressure, …
• Receptors:
– Tyrosine kinases,
– G protein-coupled,
– TGFβ,
– Notch,
– …
Receptor Tyrosine Kinase RTK
L + R <=> L-R
L-R + L-R => L-R-L-R
RAS-GDP =[L-R-L-R]=> RAS-GTP
Five MAP Kinase Pathways in Budding Yeast (Saccharomyces cerevisiae)
MAPK Signaling Pathways
• Input: RAS activated by the receptor activates RAF
RAS-GTP + RAF-P14-3-3 => RAS-GDP + RAF + P14-3-3
RAF + … => …
• Output: active MAPK moves to the nucleus and phosphorylates a transcription factor which stimulates gene expression
… => MAPK~{T183,Y185}
Three Levels MAPK Cascade
RAF → RAF~{p1}
MEK → MEK~{p1} → MEK~{p1,p2}
MAPK → MAPK~{p1} → MAPK~{p1,p2}
MAPK Signaling Cascade in BIOCHAM
RAF + RAFK <=> RAF-RAFK.
RAF-RAFK => RAFK + RAF~{p1}.
RAF~{p1} + RAFPH <=> RAF~{p1}-RAFPH.
RAF~{p1}-RAFPH => RAF + RAFPH.
MEK~$P + RAF~{p1} <=> MEK~$P-RAF~{p1}
where p2 not in $P.
MEK~{p1}-RAF~{p1} => MEK~{p1,p2} + RAF~{p1}.
MEK-RAF~{p1} => MEK~{p1} + RAF~{p1}.
MEKPH + MEK~{p1}~$P <=> MEK~{p1}~$P-MEKPH.
MEK~{p1}-MEKPH => MEK + MEKPH.
MEK~{p1,p2}-MEKPH => MEK~{p1} + MEKPH.
MAPK~$P + MEK~{p1,p2} <=> MAPK~$P-MEK~{p1,p2}
where p2 not in $P.
MAPKPH + MAPK~{p1}~$P <=> MAPK~{p1}~$P-MAPKPH.
MAPK~{p1}-MAPKPH => MAPK + MAPKPH.
MAPK~{p1,p2}-MAPKPH => MAPK~{p1} + MAPKPH.
MAPK-MEK~{p1,p2} => MAPK~{p1} + MEK~{p1,p2}.
MAPK~{p1}-MEK~{p1,p2} => MAPK~{p1,p2} + MEK~{p1,p2}.
Pattern variables $P for
- Phosphorylation sites
- Molecules
with symbolic constraints.
BIOCHAM rules with patterns are expanded into BIOCHAM rules without patterns.
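The expansion of a pattern rule such as MEK~$P + RAF~{p1} <=> MEK~$P-RAF~{p1} where p2 not in $P can be sketched as an enumeration of the admissible site sets. This is only an illustration under the assumption that the universe of sites of the substrate is {p1, p2}. The helper names are hypothetical and do not reflect how BIOCHAM performs the expansion internally.

```python
from itertools import combinations


def subsets(sites):
    """Enumerate all subsets of the given sites, smallest first."""
    sites = sorted(sites)
    for k in range(len(sites) + 1):
        for chosen in combinations(sites, k):
            yield chosen


def modified(name, sites):
    """Render name~{s1,s2}, or just name when no site is modified."""
    return name + ("~{" + ",".join(sites) + "}" if sites else "")


def expand_binding(substrate, enzyme, sites, forbidden):
    """Instantiate  substrate~$P + enzyme <=> substrate~$P-enzyme
    for every set $P of sites that does not contain `forbidden`."""
    rules = []
    for p in subsets(sites):
        if forbidden in p:
            continue  # symbolic constraint: forbidden not in $P
        s = modified(substrate, p)
        rules.append(f"{s} + {enzyme} <=> {s}-{enzyme}.")
    return rules


mek_rules = expand_binding("MEK", "RAF~{p1}", ["p1", "p2"], "p2")
mapk_rules = expand_binding("MAPK", "MEK~{p1,p2}", ["p1", "p2"], "p2")
for rule in mek_rules + mapk_rules:
    print(rule)
```

Each pattern rule yields two pattern-free rules here, one for the unphosphorylated substrate and one for the substrate phosphorylated at p1, which matches the complexes consumed by the catalytic rules of the cascade.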
Bipartite Proteins-Reactions Graph of MAPK
GraphViz
http://www.research.att.co/sw/tools/graphviz
Influence Graph
Influence Graph inferred from the
syntactical reaction model of the MAPK
“cascade”
Negative feedback loops
[Fages Soliman CMSB 06]
Possibility of oscillations
[Qiao et al. PLOS 07]
Reaction Model of the MAPK Cascade
[Levchenko et al. PNAS 2000]
(MA(1), MA(0.4)) for RAF + RAFK <=> RAF-RAFK.
(MA(0.5),MA(0.5)) for RAF~{p1} + RAFPH <=> RAF~{p1}-RAFPH.
(MA(3.3),MA(0.42)) for MEK~$P + RAF~{p1} <=> MEK~$P-RAF~{p1}
where p2 not in $P.
(MA(10),MA(0.8)) for MEKPH + MEK~{p1}~$P <=> MEK~{p1}~$P-MEKPH.
(MA(20),MA(0.7)) for MAPK~$P + MEK~{p1,p2} <=> MAPK~$P-MEK~{p1,p2}
where p2 not in $P.
(MA(5),MA(0.4)) for MAPKPH + MAPK~{p1}~$P <=> MAPK~{p1}~$P-MAPKPH.
MA(0.1) for RAF-RAFK => RAFK + RAF~{p1}.
MA(0.1) for RAF~{p1}-RAFPH => RAF + RAFPH.
MA(0.1) for MEK~{p1}-RAF~{p1} => MEK~{p1,p2} + RAF~{p1}.
MA(0.1) for MEK-RAF~{p1} => MEK~{p1} + RAF~{p1}.
MA(0.1) for MEK~{p1}-MEKPH => MEK + MEKPH.
MA(0.1) for MEK~{p1,p2}-MEKPH => MEK~{p1} + MEKPH.
MA(0.1) for MAPK-MEK~{p1,p2} => MAPK~{p1} + MEK~{p1,p2}.
MA(0.1) for MAPK~{p1}-MEK~{p1,p2} => MAPK~{p1,p2} + MEK~{p1,p2}.
MA(0.1) for MAPK~{p1}-MAPKPH => MAPK + MAPKPH.
MA(0.1) for MAPK~{p1,p2}-MAPKPH => MAPK~{p1} + MAPKPH.
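To see how the mass action expressions MA(k) give the continuous semantics, the sketch below integrates the first level of the cascade (the RAF rules, with the constants listed above) using a plain explicit Euler scheme. The initial concentrations, the step size and the function names are assumptions made for illustration. They are not taken from the model, and BIOCHAM's own numerical integrator is not reproduced here.

```python
def mass_action_step(conc, reactions, dt):
    """One explicit Euler step of the mass-action ODE system."""
    delta = {species: 0.0 for species in conc}
    for k, reactants, products in reactions:
        rate = k
        for species in reactants:  # MA(k): k times the product of reactants
            rate *= conc[species]
        for species in reactants:
            delta[species] -= rate
        for species in products:
            delta[species] += rate
    return {species: conc[species] + dt * delta[species] for species in conc}


def simulate(conc, reactions, t_end, dt=0.01):
    """Integrate from time 0 to t_end and return the final concentrations."""
    for _ in range(int(round(t_end / dt))):
        conc = mass_action_step(conc, reactions, dt)
    return conc


# First level of the cascade; reversible rules are split into two reactions.
reactions = [
    (1.0, ["RAF", "RAFK"], ["RAF-RAFK"]),
    (0.4, ["RAF-RAFK"], ["RAF", "RAFK"]),
    (0.1, ["RAF-RAFK"], ["RAFK", "RAF~{p1}"]),
    (0.5, ["RAF~{p1}", "RAFPH"], ["RAF~{p1}-RAFPH"]),
    (0.5, ["RAF~{p1}-RAFPH"], ["RAF~{p1}", "RAFPH"]),
    (0.1, ["RAF~{p1}-RAFPH"], ["RAF", "RAFPH"]),
]
# Hypothetical initial concentrations (not given in the slides).
initial = {"RAF": 1.0, "RAFK": 0.5, "RAFPH": 0.5,
           "RAF-RAFK": 0.0, "RAF~{p1}": 0.0, "RAF~{p1}-RAFPH": 0.0}
final = simulate(initial, reactions, t_end=50.0)
print({species: round(value, 3) for species, value in final.items()})
```

A useful sanity check on such a simulation is conservation: the total amount of RAF, summed over its free, phosphorylated and complexed forms, should stay constant.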
Differential Simulation
Stochastic Simulation
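A stochastic run over numbers of molecules can be sketched with Gillespie's direct method, used here as a standard stand-in and not necessarily the algorithm behind the slide's plot. The molecule counts, the end time and the direct use of the mass action constants as stochastic rate constants are assumptions. The propensity formula below is also only valid when the reactants of a reaction are distinct species.

```python
import random


def gillespie(counts, reactions, t_end, rng):
    """Direct-method stochastic simulation; returns the final molecule counts."""
    counts = dict(counts)
    t = 0.0
    while True:
        propensities = []
        for k, reactants, _ in reactions:
            a = k
            for species in reactants:
                a *= counts[species]
            propensities.append(a)
        total = sum(propensities)
        if total == 0.0:
            return counts  # no reaction can fire any more
        t += rng.expovariate(total)  # exponential waiting time
        if t > t_end:
            return counts
        pick = rng.uniform(0.0, total)  # choose a reaction by its propensity
        acc = 0.0
        for a, (_, reactants, products) in zip(propensities, reactions):
            acc += a
            if a > 0.0 and pick <= acc:
                for species in reactants:
                    counts[species] -= 1
                for species in products:
                    counts[species] += 1
                break


reactions = [
    (1.0, ["RAF", "RAFK"], ["RAF-RAFK"]),
    (0.4, ["RAF-RAFK"], ["RAF", "RAFK"]),
    (0.1, ["RAF-RAFK"], ["RAFK", "RAF~{p1}"]),
    (0.5, ["RAF~{p1}", "RAFPH"], ["RAF~{p1}-RAFPH"]),
    (0.5, ["RAF~{p1}-RAFPH"], ["RAF~{p1}", "RAFPH"]),
    (0.1, ["RAF~{p1}-RAFPH"], ["RAF", "RAFPH"]),
]
# Hypothetical initial numbers of molecules.
initial = {"RAF": 100, "RAFK": 20, "RAFPH": 20,
           "RAF-RAFK": 0, "RAF~{p1}": 0, "RAF~{p1}-RAFPH": 0}
final = gillespie(initial, reactions, t_end=5.0, rng=random.Random(0))
print(final)
```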
Boolean Simulation
Automatic Generation of CTL Properties
• reachable(MAPK~{p1})
• reachable(!(MAPK~{p1}))
• oscil(MAPK~{p1})
• …
• reachable(MAPKPH-MAPK~{p1})
• reachable(!(MAPKPH-MAPK~{p1}))
• oscil(MAPKPH-MAPK~{p1})
• AG(!(MAPKPH-MAPK~{p1})->checkpoint(MAPKPH,MAPKPH-MAPK~{p1}))
• AG(!(MAPKPH-MAPK~{p1})->checkpoint(MAPK~{p1},MAPKPH-MAPK~{p1}))
• …
• reachable(MAPK~{p1,p2})
• reachable(!(MAPK~{p1,p2}))
• oscil(MAPK~{p1,p2})
• …
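A reachable(P) property can be read as the CTL formula EF(P) and checked by exploring the Boolean transition system. The sketch below does this by breadth-first search on a two-rule fragment of the cascade. It assumes the Boolean convention that reactants may or may not be consumed when a rule fires. The function names are hypothetical, and this explicit search is only an illustration, not BIOCHAM's model checker.

```python
from collections import deque
from itertools import combinations


def successors(state, rules):
    """One-step successors under the asynchronous Boolean semantics."""
    result = set()
    for reactants, products in rules:
        if set(reactants) <= state:
            consumable = sorted(set(reactants) - set(products))
            for k in range(len(consumable) + 1):
                for gone in combinations(consumable, k):
                    result.add(frozenset((state | set(products)) - set(gone)))
    return result


def reachable(initial, rules, predicate):
    """EF predicate: does some path from `initial` reach a state satisfying it?"""
    seen = {initial}
    queue = deque([initial])
    while queue:
        state = queue.popleft()
        if predicate(state):
            return True
        for nxt in successors(state, rules):
            if nxt not in seen:
                seen.add(nxt)
                queue.append(nxt)
    return False


# Fragment of the cascade: binding and first phosphorylation of MAPK.
rules = [
    (("MAPK", "MEK~{p1,p2}"), ("MAPK-MEK~{p1,p2}",)),
    (("MAPK-MEK~{p1,p2}",), ("MAPK~{p1}", "MEK~{p1,p2}")),
]
initial = frozenset({"MAPK", "MEK~{p1,p2}"})
print(reachable(initial, rules, lambda s: "MAPK~{p1}" in s))
print(reachable(initial, rules, lambda s: "MAPK~{p1,p2}" in s))
```

In this fragment MAPK~{p1} is reachable, while MAPK~{p1,p2} is not, because the second phosphorylation rule was left out.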
A Logical Paradigm for Systems Biology
Biological process model = Concurrent Transition System
Biological property = Temporal Logic Formula
Biological validation = Model-checking
[Lincoln et al. PSB’02] [Chabrier Fages CMSB’03] [Bernot et al. TCS’04] …
Model:
- Boolean
- Differential
- Stochastic
(SBML)
BIOCHAM:
- simulation
- query evaluation
- rule learning
- parameter search
Biological Properties:
- Temporal logic CTL
- LTL with constraints
- PCTL with constraints
Types: static analyses
The Biochemical Abstract Machine
BIOCHAM
Modeling environment based on two formal languages:
1. Rule Language for Modeling Biochemical Systems
– Syntax of molecules, compartments and reactions
– Semantics at 3 abstraction levels: Boolean, Concentrations, Populations
2. Temporal Logic for Formalizing Biological Properties
– CTL for Boolean semantics
– Constraint LTL for concentration semantics,
– PCTL for stochastic semantics
Learning from Temporal Properties
– Learning reaction rules from CTL specification
– Learning kinetic parameter values from Constraint-LTL specification
Language Approach to Cell Systems Biology
Qualitative models: from diagrammatic notation to
• Boolean networks [Thomas 73]
• Petri Nets [Reddy 93]
• Milner’s π–calculus [Regev-Silverman-Shapiro 99-01, Nagasali et al. 00]
• Bio-ambients [Regev-Panina-Silverman-Cardelli-Shapiro 03]
• Pathway logic [Eker-Knapp-Laderoute-Lincoln-Meseguer-Sonmez 02]
• Transition systems [Chabrier-Chiaverini-Danos-Fages-Schachter 04] BIOCHAM-1, Kappa
Quantitative models: from differential equation systems to
• Hybrid Petri nets [Hofestadt-Thelen 98, Matsuno et al. 00]
• Hybrid automata [Alur et al. 01, Ghosh-Tomlin 01]
• Hybrid concurrent constraint languages [Bockmayr-Courtois 01]
• Stochastic π–calculus [Priami 03, Cardelli 04]
• Rules with continuous/stochastic dynamics BIOCHAM-2, BioNetGen