Languages for
Systems Biology
Luca Cardelli
Microsoft Research
Cambridge UK
http://www.luca.demon.co.uk/BioComputing.htm
http://research.microsoft.com/bioinfo
Structural Architecture
Eukaryotic Cell (10~100 trillion in human body)
Membranes everywhere
[Figure labels: Nuclear membrane, Mitochondria, Golgi, Vesicles, E.R., Plasma membrane (<10% of all membranes)]
11/7/2015
Functional Architecture
Abstract Machines of Molecular Biology
• Gene Machine (Nucleotides)
– Gene Regulatory Networks
– Regulation
• Protein Machine (Amino acids)
– Biochemical Networks
– Metabolism, Propulsion, Signal Processing, Molecular Transport
• Membrane Machine (Phospholipids)
– Transport Networks
– Confinement, Storage, Bulk Transport
• Interactions between machines
– Membrane Machine to Protein Machine: holds receptors, actuators; hosts reactions
– Protein Machine to Membrane Machine: implements fusion, fission
• Model Integration: different time and space scales
1. The Protein Machine
cf. BioCalculus [Kitano&Nagasaki], κ-calculus [Danos&Laneve]
Pretty close to the atoms.
[Figure labels: Protein, On/Off switches, Binding Sites, Inaccessible]
Each protein has a structure of binary switches and binding sites, but not all of them are accessible at all times.
• Switching of accessible switches.
– May cause other switches and binding sites to become (in)accessible.
– May be triggered or inhibited by nearby specific proteins in specific states.
• Binding on accessible sites.
– May cause other switches and binding sites to become (in)accessible.
– May be triggered or inhibited by nearby specific proteins in specific states.
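The switch-and-site picture above can be sketched as a small data structure. This is only an illustrative sketch, not a notation from the talk: the class `Protein`, its fields, and the `exposes` table (which says which names become accessible while a given switch is on) are all hypothetical, and triggering or inhibition by neighbouring proteins is left out.

```python
from dataclasses import dataclass, field
from typing import Dict, Optional, Set


@dataclass
class Protein:
    """Hypothetical model: binary switches, binding sites, and accessibility."""
    name: str
    switches: Dict[str, bool] = field(default_factory=dict)        # switch -> on/off
    sites: Dict[str, Optional[str]] = field(default_factory=dict)  # site -> bound partner
    accessible: Set[str] = field(default_factory=set)              # accessible names
    exposes: Dict[str, Set[str]] = field(default_factory=dict)     # switch -> names it gates

    def flip(self, switch: str) -> bool:
        """Flip an accessible switch and update what it gates."""
        if switch not in self.switches or switch not in self.accessible:
            return False
        self.switches[switch] = not self.switches[switch]
        gated = self.exposes.get(switch, set())
        if self.switches[switch]:
            self.accessible |= gated   # switching on exposes the gated names
        else:
            self.accessible -= gated   # switching off hides them again
        return True

    def bind(self, site: str, partner: str) -> bool:
        """Bind a partner to a free, accessible site."""
        if site not in self.sites or site not in self.accessible:
            return False
        if self.sites[site] is not None:
            return False
        self.sites[site] = partner
        return True


# Assumed example: a "dock" site that is only reachable once "phospho" is on.
receptor = Protein(
    name="receptor",
    switches={"phospho": False},
    sites={"dock": None},
    accessible={"phospho"},
    exposes={"phospho": {"dock"}},
)
blocked = receptor.bind("dock", "adaptor")  # fails: site is inaccessible
receptor.flip("phospho")                    # exposes the site
allowed = receptor.bind("dock", "adaptor")  # now succeeds
```

Keeping accessibility as explicit state is one possible design choice; a rule-based formalism would express the same dependency as rewrite rules over protein states.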
Molecular Interaction Maps
JDesigner: http://www.cds.caltech.edu/~hsauro/index.htm
[Figure: The p53-Mdm2 and DNA Repair Regulatory Network. Taken from Kurt W. Kohn.]
2. The Gene Machine
cf. Hybrid Petri Nets [Matsuno, Doi, Nagasaki, Miyano]
Pretty far from the atoms.
[Figure labels: Gene (stretch of DNA) with Regulatory region and Coding region; Input: Positive Regulation, Negative Regulation; Output: Transcription]
Regulation of a gene (positive and negative) influences transcription. The regulatory region has precise DNA sequences, but they are not meant for coding proteins: they are meant for binding regulators.
Transcription produces molecules (RNA or, through RNA, proteins) that bind to the regulatory regions of other genes (or that are end products).
[Figure: “External Choice” gate with Input, Output1, Output2. The phage lambda switch.]
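A gene with positive and negative inputs and a transcribed output can be sketched as a Boolean gate. The sketch below is an assumption-laden illustration, not a model from the talk and not a model of the phage lambda switch: the names `Gene` and `step` are hypothetical, regulation is reduced to presence or absence of a regulator, repression is assumed to override activation, and products are assumed to last exactly one step.

```python
from dataclasses import dataclass
from typing import FrozenSet, Iterable, Set


@dataclass(frozen=True)
class Gene:
    """Hypothetical gene gate: regulators in, one product out."""
    product: str
    activators: FrozenSet[str] = frozenset()  # positive regulation
    repressors: FrozenSet[str] = frozenset()  # negative regulation

    def transcribes(self, present: Set[str]) -> bool:
        # Assumption: any bound repressor blocks transcription.
        if self.repressors & present:
            return False
        # Assumption: a gene without activators is constitutively on.
        return not self.activators or bool(self.activators & present)


def step(genes: Iterable[Gene], present: Set[str]) -> Set[str]:
    """One synchronous update: the products transcribed given `present`."""
    return {g.product for g in genes if g.transcribes(present)}


# Assumed two-gene network: a is repressed by b, b is activated by a.
genes = [
    Gene("a", repressors=frozenset({"b"})),
    Gene("b", activators=frozenset({"a"})),
]
s1 = step(genes, set())  # only a is transcribed
s2 = step(genes, s1)     # a activates b
s3 = step(genes, s2)     # b represses a
```

Each product feeds the regulatory region of another gene, which is how single gates compose into a gene regulatory network.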
Human (and mammalian) Genome Size
• Total: 3Gbp (Giga base pairs), 750MB @ 4bp/Byte (CD)
• Non-repetitive: 1Gbp, 250MB
• In genes: 320Mbp, 80MB
• Coding: 160Mbp, 40MB
• Protein-coding genes: 30,000-40,000
Other genomes
• M.Genitalium (smallest true organism): 580,073bp, 145KB (eBook)
• E.Coli (bacteria): 4Mbp, 1MB (floppy)
• Yeast (eukarya): 12Mbp, 3MB (MP3 song)
• Wheat: 17Gbp, 4.25GB (DVD)
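The storage figures above follow from packing 4 base pairs into each byte (2 bits per base). A minimal sketch of that arithmetic, assuming decimal units (1MB = 10^6 bytes) and rounding to three significant digits; the function names are made up for illustration:

```python
def genome_bytes(base_pairs: float, bp_per_byte: int = 4) -> float:
    """Bytes needed to store a genome at the given packing density."""
    return base_pairs / bp_per_byte


def human_readable(n_bytes: float) -> str:
    """Format a byte count with decimal units, 3 significant digits."""
    for unit, scale in (("GB", 1e9), ("MB", 1e6), ("KB", 1e3)):
        if n_bytes >= scale:
            return f"{n_bytes / scale:.3g}{unit}"
    return f"{n_bytes:.3g}B"


# Genome lengths in base pairs, as listed on the slide.
genomes = {
    "Human": 3_000_000_000,
    "M.Genitalium": 580_073,
    "E.Coli": 4_000_000,
    "Yeast": 12_000_000,
    "Wheat": 17_000_000_000,
}
sizes = {name: human_readable(genome_bytes(bp)) for name, bp in genomes.items()}
```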
Gene Regulatory Networks
NetBuilder: http://strc.herts.ac.uk/bio/maria/NetBuilder/
[Figure: gene regulatory network. Taken from Eric H Davidson. Notation labels: And, Or, Sum, Amplify, Gate, DNA, Begin coding region]
3. The Membrane Machine
Very far from the atoms.
[Figure: membrane operations on arbitrary subsystems P, Q, R]
• Mate (fusion) / Mito (fission)
– Mito special cases: Drip (zero case), Bud (one case)
• Exo (fusion) / Endo (fission)
– Endo special cases: Pino (zero case), Phago (one case)
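The fusion and fission operations named above can be sketched as rewrites on a tree of nested membranes. This is a deliberately simplified sketch under stated assumptions: membranes are reduced to labelled containers, the proteins embedded in them are ignored, and the names `Membrane`, `mate`, `mito`, `exo`, and `endo` as Python functions are hypothetical rather than the talk's formal definitions.

```python
from dataclasses import dataclass, field
from typing import List, Tuple


@dataclass
class Membrane:
    """Hypothetical model: a labelled membrane enclosing sub-membranes."""
    label: str
    contents: List["Membrane"] = field(default_factory=list)


def mate(p: Membrane, q: Membrane) -> Membrane:
    """Fusion of two sibling membranes into one holding both contents."""
    return Membrane(f"{p.label}+{q.label}", p.contents + q.contents)


def mito(m: Membrane, moved: List[Membrane]) -> Tuple[Membrane, Membrane]:
    """Fission into two siblings; `moved` goes to the new membrane.

    Drip is the zero case (moved is empty), Bud the one case.
    """
    kept = [c for c in m.contents if all(c is not x for x in moved)]
    return Membrane(m.label, kept), Membrane(f"{m.label}'", list(moved))


def exo(outer: Membrane, inner: Membrane) -> Tuple[Membrane, List[Membrane]]:
    """Fusion of an inner membrane with its parent, releasing its contents."""
    rest = [c for c in outer.contents if c is not inner]
    return Membrane(outer.label, rest), list(inner.contents)


def endo(outer: Membrane, engulfed: List[Membrane]) -> Membrane:
    """Inward fission: wrap outside material in a new internal vesicle.

    Pino is the zero case (nothing engulfed), Phago the one case.
    """
    vesicle = Membrane("vesicle", list(engulfed))
    return Membrane(outer.label, outer.contents + [vesicle])


# Phago followed by Exo: R is taken into P, then released again.
p = Membrane("P", [Membrane("Q")])
r = Membrane("R")
after_phago = endo(p, [r])
after_exo, released = exo(after_phago, after_phago.contents[1])
parent, drop = mito(p, [])  # Drip: an empty membrane splits off
```

Each function returns new membranes instead of mutating its arguments, so a sequence of transport steps can be replayed or compared state by state.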
Membrane Transport Algorithms
[Figure panels: Protein Production and Secretion; LDL-Cholesterol Degradation; Viral Replication. Taken from MCB p.730]
Equations => Notations => Languages
• How to model a system
– Mathematical modeling:
• Formal (e.g. differential equations).
• Dynamic (but increasingly difficult to analyze).
• Non scalable. Non “visual”.
– => Alternative notations in biology:
• Too informal. Too static. Non scalable.
• Exceeding capabilities of traditional mathematical modeling.
– => “Programming” languages for biology:
• Formal, Dynamic
• Scalable, Analyzable
• Visual (with some effort).
Road Ahead
• Identifying the architecture
– Physics, Chemistry, Biology, Informatics: Principles of Operation
[Figure label: Model Integration]
• Modeling the system
– Scalable, compositional, integrated descriptions
– A common framework (stochastic process calculi)
• Analyzing the model
– Exploiting techniques unique to computing
• Perturbing, predicting, engineering
“The data are accumulating and the computers are humming, what we are lacking are the words, the grammar and the syntax of a new language…” D. Bray (TIBS 22(9):325-326, 1997)
“Although the road ahead is long and winding, it leads to a future where biology and medicine are transformed into precision engineering.” Hiroaki Kitano.