Introduction to Synthetic Biology
423
2013
Herbert Sauro
[email protected]
www.sys-bio.org
Genes and Genomes
Smallest Genome (as of 1999)
Single Gene
One of the smallest Genomes: Mycoplasma genitalium (Small parasitic bacterium)
Smallest Genome
Total genes: 521
Protein coding genes: 482
tRNA and rRNA: 39
This genome is of interest to synthetic biology because Craig Venter wants to use
this organism as the basis for a minimal organism for genetic engineering.
Venter’s group has removed roughly 101 genes and the organism is still viable.
The idea is then to patent the minimal set of genes required for life.
PNAS (2006) 103, 425--430
Gene Function
The complexity of simplicity
Scott N Peterson and Claire M Fraser
Genome Biol. 2001;2(2):COMMENT 2002. Epub 2001 Feb 8.
But the real prize goes to….
The 160-Kilobase Genome of the Bacterial Endosymbiont Carsonella
Atsushi Nakabachi, Atsushi Yamashita, Hidehiro Toh, Hajime Ishikawa, Helen E. Dunbar, Nancy A. Moran, and Masahira Hattori
Science 314 (5797), 267 (13 October 2006).
Endosymbiont: an organism that lives inside the cells of another organism.
160-Kilobase Genome of the Bacterial Endosymbiont Carsonella
Symbiont of sap-sucking psyllids (‘jumping plant lice’); ~182 genes
Prokaryotic Cells: E. coli
1. Bacteria lack membrane-bound nuclei
2. DNA is circular
3. No complex internal organelles
2-3 um
http://www.ucmp.berkeley.edu/bacteria/bacteriamm.html
Prokaryotic Cells: E. coli
http://atlas.arabslab.com
Comparison to Eukaryotic Cells
http://www.cod.edu/people/faculty/fancher/ProkEuk.htm
E. coli Cytoplasm
Average spacing between proteins:
7 nm/molecule
Diameter of a protein: 5 nm
David S. Goodsell (Scripps)
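The two numbers on this slide can be turned into a rough protein count and a crowding estimate. The sketch below is only a back-of-envelope check: it assumes each protein sits in its own cube whose side equals the average spacing, and that proteins are spheres of the stated diameter. Neither assumption comes from the slide, and the function names are illustrative.

```python
import math

SPACING_NM = 7.0           # average spacing between proteins (from the slide)
PROTEIN_DIAMETER_NM = 5.0  # diameter of a protein (from the slide)


def proteins_per_cubic_micron(spacing_nm):
    """Number density, assuming one protein per cube of side spacing_nm."""
    nm3_per_um3 = 1e9  # (1000 nm)^3
    return nm3_per_um3 / spacing_nm ** 3


def volume_fraction(spacing_nm, diameter_nm):
    """Fraction of space filled by protein, assuming spherical proteins."""
    sphere_volume = math.pi / 6.0 * diameter_nm ** 3
    return sphere_volume / spacing_nm ** 3


density = proteins_per_cubic_micron(SPACING_NM)
crowding = volume_fraction(SPACING_NM, PROTEIN_DIAMETER_NM)

print(f"{density:.2e} proteins per cubic micron")
print(f"{crowding:.0%} of the volume occupied by protein")
```

Under these assumptions a cubic micron of cytoplasm holds a few million proteins, and protein fills roughly a fifth of the volume.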
E. coli Statistics
Length: 2 to 3 um
Diameter: 1 um
Generation time: 20 to 30 mins
Translation rate: 40 aa/sec
Transcription rate: 70 nt/sec
Number of ribosomes per cell : 18,000
Small Molecules/Ions per cell:
Alanine: 350,000
Pyruvate: 370,000
ATP: 2,000,000
Ca ions: 2,300,000
Fe ions: 7,000,000
Data from: http://bionumbers.hms.harvard.edu
http://redpoll.pharmacy.ualberta.ca/CCDB/cgi-bin/STAT_NEW.cgi
David S. Goodsell (Scripps)
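Molecule counts per cell are easier to compare with textbook values once they are converted to concentrations. The sketch below does this under the assumption that the cell is a plain cylinder 2 um long and 1 um in diameter (the lower end of the range given above). It also estimates how long one protein takes to make at the stated rates. The 300-amino-acid protein is a hypothetical example, and all function names are illustrative.

```python
import math

AVOGADRO = 6.02214076e23  # molecules per mole


def cell_volume_litres(length_um=2.0, diameter_um=1.0):
    """Volume of a cylindrical cell (cylinder shape is an assumption)."""
    radius_um = diameter_um / 2.0
    volume_um3 = math.pi * radius_um ** 2 * length_um
    return volume_um3 * 1e-15  # 1 cubic micron = 1e-15 litres


def molar_concentration(molecules, volume_l):
    """Convert a molecule count in a given volume to mol/L."""
    return molecules / AVOGADRO / volume_l


def synthesis_time_s(length, rate_per_s):
    """Time to polymerise a chain of the given length at a constant rate."""
    return length / rate_per_s


volume = cell_volume_litres()
atp_molar = molar_concentration(2_000_000, volume)  # ATP count from the slide
one_molecule_molar = molar_concentration(1, volume)

# Hypothetical 300-amino-acid protein and its 900-nucleotide coding region.
translation_s = synthesis_time_s(300, 40)    # 40 aa/sec from the slide
transcription_s = synthesis_time_s(900, 70)  # 70 nt/sec from the slide

print(f"ATP: {atp_molar * 1e3:.1f} mM")
print(f"One molecule per cell: {one_molecule_molar * 1e9:.1f} nM")
print(f"Translation: {translation_s:.1f} s, transcription: {transcription_s:.1f} s")
```

With this cell volume, a single molecule per cell corresponds to roughly 1 nM, which is a handy conversion for the abundance figures that follow.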
E. coli Statistics
E. coli has approximately 4300 protein-coding genes.
Protein abundance per cell:
ATP-dependent helicase: 104
LacI repressor: 10 to 50 molecules
LacZ (galactosidase): 5,000
CheA kinase (chemotaxis): 4,500
CheB (Feedback): 240
CheY (Motor signal): 8,200
Chemoreceptors: 15,000
Glycolysis
Phosphofructokinase: 1,550
Pyruvate Kinase: 11,000
Enolase: 55,800
Phosphoglycerate kinase: 124,000
Krebs Cycle
Malate Dehydrogenase: 3,390
Citrate Synthase: 1,360
Aconitase: 1,630
Source: Protein abundance profiling of the Escherichia coli cytosol.
BMC Genomics 2008, 9:102. Ishihama et al.
E. coli Statistics
E. coli has approximately 4300 protein-coding genes.
Molecule Numbers in Prokaryotes:
1. Ions: Millions
2. Small Molecules: 10,000 – 100,000
3. Metabolic Enzymes: 1000 – 10,000s
4. Signaling Proteins: 100 – 1000s
5. Transcription Factors: 10s to 100s
6. DNA: 1 – 10s
Source: Protein abundance profiling of the Escherichia coli cytosol.
BMC Genomics 2008, 9:102. Ishihama et al.
Circular Chromosome in E. coli
Most prokaryotic DNA is circular. Genes are
located on both strands of the DNA. Genes
on the outside are transcribed clockwise
and those on the inside anticlockwise.
E. coli’s genome is 4,639,221 base pairs,
coding for 4472 genes, of which 4316
code for proteins.
Proteins: 4316
tRNAs: 89
rRNAs: 22
Other RNAs: 64
Circular Chromosome in E. coli
88% of the E. coli genome codes for
proteins; the rest includes RNA-coding regions,
promoters, terminators, etc.
In contrast, the human genome has
3,000,000,000 base pairs and
about 25,000 genes.
Only 2% of the human genome codes
for proteins. The rest is… an RNA regulatory
network? Human genes are also segmented
into exons and introns, with alternative splicing
significantly increasing the actual number
of proteins.
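The contrast between the two genomes is easier to see as base pairs per gene. The sketch below uses only the figures quoted on these slides. It treats the roughly 25,000 human genes as if they were all protein-coding, which is a simplifying assumption, and the function and key names are illustrative.

```python
def coding_summary(genome_bp, protein_genes, coding_fraction):
    """Summarise how densely a genome is packed with protein-coding sequence."""
    coding_bp = genome_bp * coding_fraction
    return {
        "bp_per_gene": genome_bp / protein_genes,        # genome length per gene
        "coding_bp": coding_bp,                          # total coding sequence
        "coding_bp_per_gene": coding_bp / protein_genes  # average coding length
    }


# Figures taken from the slides.
ecoli = coding_summary(4_639_221, 4316, 0.88)
# Assumption: all ~25,000 human genes counted as protein-coding.
human = coding_summary(3_000_000_000, 25_000, 0.02)

for name, summary in (("E. coli", ecoli), ("Human", human)):
    print(
        f"{name}: {summary['bp_per_gene']:,.0f} bp of genome per gene, "
        f"{summary['coding_bp_per_gene']:,.0f} coding bp per gene"
    )
```

On these numbers E. coli spends about a kilobase of genome per gene, almost all of it coding, while the human genome spends on the order of a hundred kilobases per gene with only a small fraction coding.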
EcoCyc: http://ecocyc.org/
E. coli Gene Structure
Stop codon (TAG, TAA, TGA)
Start codon
Page 134
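The start and stop codons define where a coding region begins and ends, which can be sketched as a simple scan. The example below assumes ATG as the start codon; the slide does not list it, and E. coli also uses other start codons that are ignored here. The function name and the test sequence are made up for illustration.

```python
STOP_CODONS = {"TAG", "TAA", "TGA"}  # from the slide
START_CODON = "ATG"                  # assumption: most common start codon


def find_first_orf(dna):
    """Return the first start-to-stop open reading frame, or None.

    Scans for a start codon, then reads in-frame codons (steps of three)
    until a stop codon is reached. If no in-frame stop is found, the
    search resumes from the next start codon.
    """
    dna = dna.upper()
    start = dna.find(START_CODON)
    while start != -1:
        for i in range(start + 3, len(dna) - 2, 3):
            if dna[i:i + 3] in STOP_CODONS:
                return dna[start:i + 3]
        start = dna.find(START_CODON, start + 1)
    return None


# Hypothetical sequence: two leading bases, then ATG AAA CCC TAG, then two more.
print(find_first_orf("GGATGAAACCCTAGTT"))
```

Reading in steps of three is the key design choice: a stop codon that is out of frame with the start codon does not end the gene.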
RNA Polymerase Binds to Promoters
mRNA
Changes in the promoter sequence
can change the efficiency of RNA
polymerase binding to the DNA.
The promoter is therefore a site
which can be engineered.
http://mgl.scripps.edu/people/goodsell/pdb/pdb40/pdb40_1.html
Strong and Weak Promoters
The strength of a promoter is one of the factors which
determines the rate of transcription.
Strong Promoter. The recA promoter is a strong promoter.
recA: TTGATA -- 16 -- TATAAT
Most common promoter (consensus sequence): TTGACA -- 17 -- TATAAT
It differs from the consensus (averaged) promoter sequence by one nucleotide and by one base
pair in the spacer region.
Weak Promoter. The araBAD promoter is a weak promoter.
araBAD: CTGACG -- 18 -- TACTGT
Consensus: TTGACA -- 17 -- TATAAT
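The comparison with the consensus sequence can be made explicit by counting mismatches. The sketch below uses the sequences exactly as given on the slide. The labels "box35" and "box10" follow the usual -35/-10 naming for the two hexamers, which the slide itself does not use. Mismatch count is only a crude proxy for promoter strength, not a quantitative model.

```python
# (first hexamer, spacer length in bp, second hexamer), as on the slide
CONSENSUS = ("TTGACA", 17, "TATAAT")

PROMOTERS = {
    "recA": ("TTGATA", 16, "TATAAT"),    # strong promoter
    "araBAD": ("CTGACG", 18, "TACTGT"),  # weak promoter
}


def mismatches(a, b):
    """Count positions at which two equal-length sequences differ."""
    if len(a) != len(b):
        raise ValueError("sequences must have the same length")
    return sum(x != y for x, y in zip(a, b))


def deviation_from_consensus(promoter, consensus=CONSENSUS):
    """Compare a promoter with the consensus, element by element."""
    box35, spacer, box10 = promoter
    cons35, cons_spacer, cons10 = consensus
    return {
        "box35": mismatches(box35, cons35),
        "spacer": abs(spacer - cons_spacer),
        "box10": mismatches(box10, cons10),
    }


for name, promoter in PROMOTERS.items():
    print(name, deviation_from_consensus(promoter))
```

The recA promoter differs at a single nucleotide plus one base pair of spacer, while araBAD differs at five nucleotides plus one base pair of spacer, which is in line with the strong/weak labels above.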
RNA Polymerase Stops at a Terminator
Changes in the terminator sequence
can change the efficiency of RNA
polymerase stopping. If the gene is
part of an operon, terminators can
modulate relative expression levels of
the different genes in the operon.
The terminator is therefore a site
which can be engineered.
Operon Structure
Figure labels: Promoter, Gene A, Gene B, Terminator, Gene C
Relative expression: Gene A 100%, Gene B 60%, Gene C 30%
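One way to read the 100%, 60% and 30% figures is as the result of partial read-through at sites between consecutive genes. The sketch below assumes exactly that: a leaky terminator-like element sits between each pair of genes, and a fixed fraction of polymerases continues past it. The slide does not state where the terminators are, so the positions and the function names are assumptions.

```python
def expression_levels(readthrough):
    """Relative expression of each gene, given read-through fractions.

    readthrough[i] is the fraction of polymerases that continue from
    gene i to gene i + 1. The first gene is defined as 1.0 (100%).
    """
    levels = [1.0]
    for fraction in readthrough:
        levels.append(levels[-1] * fraction)
    return levels


def readthrough_from_levels(levels):
    """Inverse calculation: read-through fractions implied by the levels."""
    return [after / before for before, after in zip(levels, levels[1:])]


# Relative expression of genes A, B, C from the slide.
fractions = readthrough_from_levels([1.0, 0.6, 0.3])
levels = expression_levels(fractions)

print("Read-through A->B, B->C:", fractions)
print("Reconstructed levels:", levels)
```

Under this reading, about 60% of polymerases pass from gene A to gene B and about half of those continue to gene C, so tuning a terminator changes the ratio between genes without touching the promoter.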
Operators – Regulating Expression
Gene Regulation
lac Operon
Metabolic Enzyme (output)
Promoter
Promoter
Operator
lacZ codes for β-galactosidase.
lacY codes for β-galactoside permease.
Sugar in Medium: Relative β-galactosidase
Glucose: 1
Glucose + lactose: 50
Lactose: 2500
Gene Regulation
lac Operon
Lac repressor
Metabolic Enzyme (output)
Promoter
Promoter
Operator
Gene Regulation
lac Operon
LacI Repressor
LacI is a tetramer (x4)
LacI binding to Promoter
Ribosome Binding Sites
In summary:
Stop Codon
Start Codon
Promoter
RBS
Gene
Operators
5’-UTR
Terminator
This course is about networks:
The Science and Engineering of Biological Networks
The world is full of networks
Electronic
WWW
Road
Social
Biological Networks
Metabolic Networks
Metabolic
About 1000-1400 genes code for metabolic enzymes in E. coli
(out of a total of about 4300 genes).
Protein-Protein Networks
Protein Signaling Network
Protein-Protein Networks
Protein Signaling Network: CellDesigner
Kohn MIMS
20% of the human protein-coding
genes encode components of
signaling pathways, including
transmembrane proteins, guanine-nucleotide-binding proteins (G
proteins), kinases, phosphatases
and proteases.
Protein-Protein Networks
Genetic Networks
Gene Regulatory Networks: BioTapestry
Genetic Networks
Gene Regulatory Networks: BioTapestry : Ventral Neural Tube in Vertebrate Embryo
Genetic Units
William Longabaugh and Hamid Bolouri. Understanding the Dynamic Behavior of Genetic Regulatory Networks by Functional Decomposition.
Curr Genomics. 2006 November; 7(6): 333–341.
Hybrid Network: Cell Cycle Control in Bacteria
Two Kinds of Representations
1. Non-stoichiometric, or ball-and-stick, networks
No stoichiometry, kinetics or mass conservation
Cytoscape: Ball and Stick
2. Stoichiometric: reaction maps
?? - Stuff that people make up; who knows
what they really mean
Stoichiometric
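The difference between the two representations shows up clearly when the same network is written both ways. The sketch below uses a made-up two-reaction network (2 A -> B, then B -> C) that is not from the slides. It builds the stoichiometry matrix, derives the ball-and-stick edge list from it, and shows that only the matrix supports a mass-balance calculation. All names are illustrative.

```python
# Hypothetical network, not from the slides:
#   v1: 2 A -> B
#   v2: B -> C
# Negative numbers are consumed, positive numbers are produced.
REACTIONS = {
    "v1": {"A": -2, "B": 1},
    "v2": {"B": -1, "C": 1},
}


def species_list(reactions):
    """All species names, sorted so that matrix rows have a fixed order."""
    return sorted({name for stoich in reactions.values() for name in stoich})


def stoichiometry_matrix(reactions):
    """Rows are species, columns are reactions, entries are stoichiometries."""
    return [
        [stoich.get(species, 0) for stoich in reactions.values()]
        for species in species_list(reactions)
    ]


def ball_and_stick_edges(reactions):
    """Substrate-to-product edges only; the stoichiometry is discarded."""
    edges = set()
    for stoich in reactions.values():
        substrates = [s for s, n in stoich.items() if n < 0]
        products = [s for s, n in stoich.items() if n > 0]
        edges.update((s, p) for s in substrates for p in products)
    return sorted(edges)


def rates_of_change(matrix, fluxes):
    """Mass balance: rate of change of each species = N times the flux vector."""
    return [sum(n * v for n, v in zip(row, fluxes)) for row in matrix]


N = stoichiometry_matrix(REACTIONS)
EDGES = ball_and_stick_edges(REACTIONS)
DXDT = rates_of_change(N, [1.0, 0.5])  # illustrative fluxes for v1 and v2

print("Stoichiometry matrix:", N)
print("Ball-and-stick edges:", EDGES)
print("Rates of change (A, B, C):", DXDT)
```

The edge list records that A is connected to B, but not that two molecules of A are consumed per B formed, so it cannot be used to compute rates of change or check mass conservation.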
Network Classification
Networks
  Stoichiometric: Elementary, Non-Elementary
  Non-Stoichiometric: Probabilistic, Ball and Stick (Data dependent)
Systems and Synthetic Biology
Systematic
Biology
Synthetic
Network
Physiology Biology
Top Down
Bottom Up
Systems Biology
Synthetic Biology
Top Down and Bottom Up
Top Down “-omics”
System: Whole cell
Model: Statistical Correlations
Data: High-throughput
Yeast Protein-Protein Interaction Map
Top Down and Bottom Up
Top Down “-omics”
System: Whole cell
Model: Statistical Correlations
Data: High-throughput
Bottom Up “mechanistic”
System: Networks/Pathways
Model: Mechanistic, biophysical
Data: Quantitative, single-cell