Information Storage and Processing in Biological Systems:

A seminar course for the Natural Sciences
Sept. 17: Biological Information, DNA, Gene regulation
Sept. 24: Proteins, Enzymes, Biochemistry
Oct. 1: Biochemical and Genetic Networks:
Chemotaxis/Motility in E. coli and Dictyostelium
Reading List for Sept. 17, 2001
Chapters 1-3 “The Thread of Life” S. Aldridge Cambridge University Press.
1996.
From molecular to modular cell biology. (1999) L. H. Hartwell, J. J.
Hopfield, S. Leibler and A. W. Murray. Nature 402 (SUPP): C47-C52.
The challenges of in silico biology. (2000) B. Palsson. Nature
Biotechnology 18: 1147-1150.
Genetic Switch: Phage Lambda and Higher Organisms by Mark Ptashne.
2nd edition (1992) Blackwell Science.
It’s a noisy business! Genetic regulation at the nanomolar scale. H. H. McAdams
and A. Arkin. Trends in Genetics, February 1999, Volume 15, No. 2.
Simulation of Prokaryotic Genetic Circuits. H. H. McAdams and A. Arkin.
Annu. Rev. Biophys. Biomol. Struct. 1998. 27:199–224
Student Presentation Topic:
Bioinformatics: Overview and future challenges
Melissa Dobson
What is “biological information”
and how is it “Stored” and “Processed”?
M.C. Escher Spirals
What is “biological information”?
Genetic
(DNA and RNA)
Epigenetic
(DNA modification)
Non-Genetic Inheritance
(template dependent replication)
Physiological-Cellular Level
(Structural/Metabolism/Signal Transduction)
Simplified Connectivity of Map of Metabolism
Each node represents a chemical in the cell (E. coli)
Each connection represents an enzymatic step or steps
What is “biological information”?
Genetic
(DNA and RNA)
Epigenetic
(DNA modification)
Non-Genetic Inheritance
(template dependent replication)
Physiological-Cellular Level
(Structural/Metabolism/Signal Transduction)
Physiological-Organism Level
(Structural/Metabolism/Signal Transduction,
Development, Immune System)
Populations
(Population dynamics, Evolution)
Ecosystem
(Interacting Populations,
environment ↔ populations)
The “Central Dogma”
The central dogma relates to the flow of ‘genetic’
information in biological systems.
DNA → RNA → Protein
DNA → (transcription) → mRNA → (translation) → Protein
Overview of Biological Systems
Organization of the Tree of Life
Three evolutionary branches of life:
Eubacteria, Archaebacteria, Eukaryotes
The macroscopic world represents a small portion of the tree.
The Eubacteria (bacteria), Archaebacteria (archaea), and Eukaryotes represent
three fundamentally different ways of organizing the cell.
Major Similarities:
Genetic code
Basic machinery for interpreting the code
Major Differences:
Organization of genes
Organization of the cell
sub-cellular organelles in Eukaryotes *
cytoskeletal structure in Eukaryotes **
No true multicellular organization in bacteria and archaea (whereas there are
many single-celled eukaryotes).
* compartmentalization of function
** morphologically distinct cell structure
Bacteria
Morphologically “simple” - shape defined by cell surface structure.
Transcription (reading the genetic message) and Translation (converting the
genetic message into protein) are coupled: they take place within the same
compartment (cytoplasm).
Compartmentalization of Function in eukaryotic cells
Transcription (reading the genetic message) and Translation (converting the
genetic message into protein) occur in different compartments in the
eukaryotic cell.
Examples of single-celled eukaryotic organisms
Morphological diversity (cytoskeleton as well as cell surface structures)
There are many distinct morphological cell types
within a multicellular organism.
Morphological diversity arises from cytoskeletal networks and architectural proteins.
Some ‘Model’ Experimental Eukaryotic Organisms
Saccharomyces cerevisiae
Caenorhabditis elegans (round worm)
Drosophila melanogaster (fruit fly)
mouse
Zebrafish
Arabidopsis thaliana
Antirrhinum majus (snapdragon)
Bacteriophage (Phage) and Viruses
1) genetic material / nucleic acid
2) protective coat protein
They carry the information for their own replication and the means to “target” the
correct cell/host, but no interpretive machinery of their own.
Constraints in Biological Systems
Chemical/Physical constraints
• stability of biological material
• reaction rates and diffusion rates
- the properties of biochemical (enzyme-catalyzed) reactions differ from those of
ordinary chemical reactions
• time dependency of many steps - time scales span many orders of magnitude for
different steps
- receptor-ligand binding: msec
- biochemical response: sec
- genetic response: minutes, hours, days
• statistical properties of “small-scale” chemistry, i.e. where the concentration of
reacting molecules is low.
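To give a feel for what “low concentration” means inside a cell, the sketch below converts a molar concentration into a mean number of molecules per cell. The cell volume of 1 femtolitre is an assumed round figure for a bacterium, not a value from these slides, and the 1/sqrt(N) estimate assumes Poisson statistics. The function names are hypothetical.

```python
import math

AVOGADRO = 6.02214076e23  # molecules per mole


def molecules_per_cell(concentration_molar, cell_volume_liters=1e-15):
    """Mean number of molecules in one cell at a given molar concentration.

    The default volume (1 femtolitre) is an assumed round figure for a
    bacterial cell, not a number taken from the lecture.
    """
    return concentration_molar * cell_volume_liters * AVOGADRO


def relative_fluctuation(mean_count):
    """Relative size of number fluctuations, 1/sqrt(N), assuming Poisson statistics."""
    return 1.0 / math.sqrt(mean_count)


if __name__ == "__main__":
    for conc in (1e-9, 1e-6, 1e-3):
        n = molecules_per_cell(conc)
        print(f"{conc:g} M -> {n:.3g} molecules per cell, "
              f"relative fluctuation ~ {relative_fluctuation(n):.2g}")
```

Under these assumptions a nanomolar species corresponds to roughly one molecule per cell, which is why average concentrations stop being a good description at that scale.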
Evolutionary constraints
• a biological system is constrained by its own evolutionary history (and also
its ‘biological’ history)
“Alarm clock” from the movie Brazil
Evolution of new functions is rarely de novo invention but is typically
due to the modification of pre-existing functions/structures.
Modularity
• Is the cell/organism designed in a modular fashion?
• Can we approximate cell behavior into modules?
• Can interactions of cells, individuals, organisms be treated in a similar
way?
Coarse graining
• At what level of detail do we need to study/model a system to extract
information about the underlying mechanisms?
• What level of detail is required to define the “state” of the cell, the
individual, the population and ecosystem…?
• Can we define the “state” of the cell or only “states” of modules?
Stochastic variations and Individuality
• What is the source of stochastic variation (independent of genetic variation)?
• In genetically identical populations, does this play a role in adaptation?
• What role do stochastic processes play in development?
Robustness
• Despite stochastic variations, many cellular processes are extremely robust
(genetic networks, biochemical networks, cell divisions, development,…)
• How does the cell overcome the limitations imposed by stochastic variations?
• Where does robustness arise? Is it a network property?
DNA Basics
Four bases
A - adenine
T - thymine
C - cytosine
G - guanine
Anti-parallel double-stranded structure with specific bonding between
the two strands:
A-T base pairing
C-G base pairing
DNA Structure
• DNA is composed of two strands
ACGATGGGT
TGCTACCCA
• Each strand is composed of a sugar phosphate backbone
with one of four bases attached to each sugar
• The arrangement of bases along a strand is aperiodic
• The two strands are arranged anti-parallel
• There is base-specific pairing between the strands such
that A pairs with T and C pairs with G; consequently, knowing
the sequence of one strand gives us the sequence of the
opposite strand.
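The pairing rule can be written as a lookup table. The minimal sketch below derives the opposite strand from one strand, using the nine-base example strand from this slide; the function names are hypothetical.

```python
# Watson-Crick pairing rule: A pairs with T, C pairs with G.
PAIRS = {"A": "T", "T": "A", "C": "G", "G": "C"}


def complement(strand):
    """Base paired opposite each position, in the same left-to-right order."""
    return "".join(PAIRS[base] for base in strand)


def reverse_complement(strand):
    """Opposite strand written in its own reading direction (anti-parallel)."""
    return complement(strand)[::-1]


if __name__ == "__main__":
    top = "ACGATGGGT"
    print(complement(top))          # bases facing the top strand
    print(reverse_complement(top))  # same strand, read in the opposite direction
```

Because the strands are anti-parallel, the opposite strand read in its own direction is the reverse of the position-by-position complement.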
Chemical Structure of DNA
The Double Helix
DNA Replication
• Template copying
• Semi-conservative
Figure: the parental duplex (ACGATGGGT paired with TGCTACCCA) separates; each
strand is copied as a template, giving two daughter duplexes with the same
sequence as the parent.
The Genetic Code – Triplet Code
- directional (always read 5’→3’)
- each triplet of bases (codon) codes for one amino acid
- degenerate (many amino acids have more than one codon)
For a given sequence there are three possible reading frames on each strand.
DNA contains information about the start and end of the
gene, as well as when, or whether, to transcribe the
information.
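The three reading frames of one strand can be listed by shifting the starting position by 0, 1 or 2 bases. The sketch below does this for the first twelve bases of the line numbered 5051 in the annotated sequence later in these slides; the function name is hypothetical.

```python
def reading_frames(sequence):
    """Split a sequence into complete triplets for frame offsets 0, 1 and 2."""
    frames = []
    for offset in range(3):
        # Stop early enough that every slice is a full three-base codon.
        codons = [sequence[i:i + 3] for i in range(offset, len(sequence) - 2, 3)]
        frames.append(codons)
    return frames


if __name__ == "__main__":
    for offset, codons in enumerate(reading_frames("ATATGCAATTCC")):
        print(offset, " ".join(codons))
```

Only the frame with offset 2 begins with the ATG (Met) codon in this example, which illustrates why the start position matters for interpreting the same string of bases.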
DNA as an information molecule
• DNA sequence itself
• DNA sequence as a code for protein
(sequence/properties of the protein)
• DNA sequence as controlling elements and recognition sites for cellular
machinery
• DNA secondary structure and chemical modifications (e.g. methylation)
• genetic networks built from multiple controlling elements and recognition sites,
with multiple genes and feedback and/or feedforward systems
Annotated sequence (both strands; +1 frame translated above, -2 frame below):

5001  CATAAACCGG GGTTAATTTA AATACTGGAA CCGCTTACCA ATAAGACTAA
      GTATTTGGCC CCAATTAAAT TTATGACCTT GGCGAATGGT TATTCTGATT
  -2  ***I   (end of luxS)

      ? gene start
  +1  MetGlnPhe LeuGlnPhe PhePheArgGln ArgGlnLeu PheIleAla
5051  ATATGCAATT CCTGCAGTTT TTCTTTCGGC AGCGCCAGCT CTTTATTGCT
      TATACGTTAA GGACGTCAAA AAGAAAGCCG TCGCGGTCGA GAAATAACGA
  -2  leHisLeuGlu GlnLeuLys GluLysProLeu AlaLeuGlu LysAsnSer

  +1  hrProAspArg ArgArgLeu HisProGlyMet IleAspCys GluAlaIle
5501  CCCCGGACCG CCGGCGCTTG CATCCGGGTA TGATCGACTG CGAAGCTATC
      GGGGCCTGGC GGCCGCGAAC GTAGGCCCAT ACTAGCTGAC GCTTCGATAG
  -2  lyArgValAla ProAlaGln MetArgThrHis AspValAla PheSerAsp

  +1  ***   (end of ? gene)
5551  TAATAATGGC ATTTAGTCAC CTCCGATAAT TTTTTAAAAA TAAACTGAAC
      ATTATTACCG TAAATCAGTG GAGGCTATTA AAAAATTTTT ATTTGACTTG
  -2  LeuLeuProMet   (luxS start)
Two ways of thinking about “information” in DNA
1) DNA has sequence information which is TRANSCRIBED into RNA (i.e. it
is a template) and TRANSLATED from RNA into protein (Genetic Code).
DNA
5’---CTCAGCGTTACCAT---3’
3’---GAGTCGCAATGGTA---5’
Transcription
RNA
5’---CUCAGCGUUACCAU---3’
Translation
PROTEIN
N---Leu-Ser-Val-Thr---C
• In RNA T’s are replaced by U’s
• Some gene products are RNA, i.e. they are not translated (e.g. tRNA, rRNA)
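The DNA → RNA → protein example above can be reproduced with a few lines of code. This is a minimal sketch that uses the standard genetic code and assumes the top (5’→3’) strand shown on the slide is the coding strand. It translates from the first base and ignores start-codon selection, which is a simplification. The function names are hypothetical.

```python
# Standard genetic code in compact form: codons ordered with bases T, C, A, G.
BASES = "TCAG"
AMINO_ACIDS = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {
    a + b + c: AMINO_ACIDS[16 * i + 4 * j + k]
    for i, a in enumerate(BASES)
    for j, b in enumerate(BASES)
    for k, c in enumerate(BASES)
}
THREE_LETTER = {
    "A": "Ala", "R": "Arg", "N": "Asn", "D": "Asp", "C": "Cys",
    "Q": "Gln", "E": "Glu", "G": "Gly", "H": "His", "I": "Ile",
    "L": "Leu", "K": "Lys", "M": "Met", "F": "Phe", "P": "Pro",
    "S": "Ser", "T": "Thr", "W": "Trp", "Y": "Tyr", "V": "Val",
}


def transcribe(coding_strand):
    """RNA copy of the coding strand: same sequence with T replaced by U."""
    return coding_strand.replace("T", "U")


def translate(mrna):
    """Translate from the first base, stopping at a stop codon or the last full codon."""
    dna = mrna.replace("U", "T")
    residues = []
    for i in range(0, len(dna) - 2, 3):
        amino_acid = CODON_TABLE[dna[i:i + 3]]
        if amino_acid == "*":  # stop codon
            break
        residues.append(THREE_LETTER[amino_acid])
    return "-".join(residues)


if __name__ == "__main__":
    rna = transcribe("CTCAGCGTTACCAT")
    print(rna)
    print(translate(rna))
```

The last two bases of the example (AU) do not form a complete codon, so they are left untranslated, as on the slide.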
Two ways of thinking about “information” in DNA
2) DNA has sequence information at a structural level. This form of
information directs the ‘interpretative machinery’ in the cell (protein
complexes), in most instances through binding sites for proteins. This type of
‘information’ is important, for example, in determining where (along a
sequence of DNA) and when a gene may be turned on, in the initiation of DNA
replication, in the packaging of DNA, etc.
i.e., regulation
The Basic Transcription Components (Bacterial)
Transcription machinery: RNA polymerase holoenzyme (α2ββ′ core plus σ factor)
DNA: -35 box, -10 box, start
Promoter - binding site for RNA polymerase, defines where the process
will begin.
Promoter Binding (-35, -10) → Open Complex Formation → Promoter Clearance →
Messenger RNA (mRNA)
Regulation of Gene Expression: The Basics
Transcriptional Regulators are proteins that act to modulate gene
expression.
Proteins that negatively regulate expression (i.e., decrease transcription) are
called Repressors and those that act positively (i.e., increase transcription of
a gene) are called Activators.
These proteins act by binding at specific DNA sites and modulating RNA
polymerase function. These binding sites are called operators.
Diagram: promoter (-35, -10, start) with an operator; a bound Repressor blocks
(X) transcription.
Repression can be viewed as a competition for binding between the
polymerase and the repressor (an oversimplification).
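The competition picture can be made quantitative with a simple equilibrium occupancy formula. The sketch below assumes that polymerase and repressor bind mutually exclusively to one site, each with its own dissociation constant. This is a textbook-style simplification introduced here for illustration, not a model given in the slides, and all names and parameter values are hypothetical.

```python
def polymerase_occupancy(polymerase, repressor, k_polymerase=1.0, k_repressor=1.0):
    """Fraction of time the promoter is occupied by polymerase.

    Assumes mutually exclusive equilibrium binding: the site is either
    empty (weight 1), polymerase-bound (weight P/Kp) or repressor-bound
    (weight R/Kr). Concentrations and constants share arbitrary units.
    """
    p = polymerase / k_polymerase
    r = repressor / k_repressor
    return p / (1.0 + p + r)


if __name__ == "__main__":
    for repressor in (0.0, 1.0, 10.0, 100.0):
        occupancy = polymerase_occupancy(1.0, repressor)
        print(f"repressor = {repressor:6.1f} -> polymerase occupancy = {occupancy:.3f}")
```

In this picture, raising the repressor concentration or its affinity for the operator lowers the chance that polymerase finds the promoter free.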
Diagram: Activator bound at its operator; promoter (-35, -10, start).
An Activator promotes RNA polymerase binding activity through
direct protein-protein interactions (an oversimplification).
• Any DNA-binding protein with an appropriately placed binding site can
act as a repressor. Activation requires a specific protein-protein interaction
between the activator and RNA polymerase.
• Typically bacterial promoters are regulated by a few proteins at most
and the control regions tend to be quite small.
• Eukaryotic gene regulatory regions can be very large and involve many
transcriptional regulators.
• Activation and repression depend on positioning of operator sites.
• Multiple inputs can be integrated at the level of gene expression.
Consensus Binding Sites
The interaction of a DNA-binding protein (such as RNA polymerase or a
transcriptional regulator) with its site depends on the ‘affinity’ of the protein for
the binding site. Binding will vary under different physiological conditions, as
the concentration of the protein changes, and will also depend on the binding site
itself.
The optimal binding site is usually close to the consensus sequence for that site,
obtained by aligning all the known binding sites. One can thus have a range of
‘activity’ at different promoters/operators by having differences in DNA binding
sites.
E. coli Promoters
            -35 box         -10 box
Consensus:  TTGACA - N17 - TATAAT
Examples:   TTGATA - N16 - TATAAT
            TTCCAA - N17 - TATACT
            TGTACA - N19 - CATAAT
            TTGATC - N17 - TACTAT
            TTGACA - N17 - TAGCTT
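The idea of a consensus can be illustrated with the five example promoters above. The sketch below takes the most common base in each column of the aligned -10 boxes and counts how many positions of each -35 box differ from the consensus TTGACA. The function names are hypothetical, and the mismatch count is only a rough stand-in for binding strength.

```python
from collections import Counter

MINUS_35 = ["TTGATA", "TTCCAA", "TGTACA", "TTGATC", "TTGACA"]
MINUS_10 = ["TATAAT", "TATACT", "CATAAT", "TACTAT", "TAGCTT"]


def consensus(sites):
    """Most common base in each column of equally long, aligned sites."""
    return "".join(Counter(column).most_common(1)[0][0] for column in zip(*sites))


def mismatches(site, reference):
    """Number of positions at which a site differs from a reference sequence."""
    return sum(a != b for a, b in zip(site, reference))


if __name__ == "__main__":
    print("consensus of -10 boxes:", consensus(MINUS_10))
    for site in MINUS_35:
        print(site, "mismatches vs TTGACA:", mismatches(site, "TTGACA"))
```

With only five sequences a column can be tied (position 5 of these -35 boxes has two T and two C), so a reliable consensus needs a larger alignment. That is why the sketch derives a consensus only for the -10 boxes.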
“Activity” of Transcriptional
Regulators in Response to ‘Signals’
Case 1. Affinity of the protein for DNA may be modified by binding a ‘ligand’
(Allosteric mechanism).
Case 2. Affinity of the protein may be affected by covalent modification such as
phosphorylation.
R + DNA ↔ R-DNA
R + x ↔ Rx
Rx + DNA ↔ Rx-DNA
Both of these mechanisms (ligand binding and post-translational
modification) are common themes in the regulation of proteins, not just in
transcription control.
Regulation of Gene Expression
DNA
RNA polymerase binding
Open Complex Formation
Transcription
mRNA
mRNA stability
Translation
Protein
Polypeptide folding
Protein stability
Both positive and negative regulation can occur at any step in this process.
The lac operon:
A simple example of regulation of gene expression
lacI encodes LacI (Repressor)
Constitutively expressed, i.e. not regulated
The lacZYA operon:
lacZ: LacZ - β-galactosidase (enzyme degrading lactose)
lacY: LacY - permease (lets lactose into cells)
lacA: LacA - transacetylase
LacI (Repressor) binds the DNA and blocks (X) expression of lacZ, lacY and lacA.
+ lactose (allolactose): Inducer
The LacI-Inducer complex cannot bind DNA, so lacZ, lacY and lacA are expressed.
In the absence of lactose in the environment, the lacZYA operon is
essentially OFF with a low probability of expression.
In the presence of lactose in the environment, the lacZYA operon is ON
with a high probability of expression.
COMPLICATIONS!
1) There are actually three operator sites (O1, O2 and O3) that LacI can bind.
Each has a different affinity.
O2 is within the lacZ gene.
2) There is a transcriptional activator (CAP) that is required for expression
of the lacZYA operon.
ON State: CAP binding facilitates RNA polymerase binding and activates
transcription.
LacI binding to two sites (O1 and O2) blocks transcription by RNA polymerase
(at promoter clearance), giving repression.
LacI binding to two sites (O1 and O3) excludes RNA polymerase from the
promoter, giving repression.
Two regulators (CAP and LacI) = Signal Integration
Two LacI operators required = Cooperativity
(repression does not show a linear dependence on ‘active’ LacI)
LacI acts as a dimer – tetramer = more potential Cooperativity!
O2 and O3 are redundant: deleting one has no obvious effect.
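The non-linear dependence on ‘active’ LacI can be illustrated with a Hill-type repression function. This is a generic sketch and not a fitted model of the lac operon: the functional form, the half-point and the Hill coefficients are assumptions chosen only to contrast non-cooperative (coefficient 1) with cooperative (coefficient 2) repression, and the function name is hypothetical.

```python
def relative_expression(active_laci, half_point=1.0, hill_coefficient=1.0):
    """Expression relative to the unrepressed level for a Hill-type repressor.

    half_point is the LacI level giving half-maximal expression, and
    hill_coefficient > 1 stands in for cooperative binding. Both are
    illustrative parameters in arbitrary units.
    """
    return 1.0 / (1.0 + (active_laci / half_point) ** hill_coefficient)


if __name__ == "__main__":
    for laci in (0.1, 1.0, 10.0):
        simple = relative_expression(laci, hill_coefficient=1.0)
        cooperative = relative_expression(laci, hill_coefficient=2.0)
        print(f"LacI = {laci:5.1f}: n=1 -> {simple:.3f}, n=2 -> {cooperative:.3f}")
```

With cooperativity, the same change in repressor level switches expression between high and low more sharply than in the non-cooperative case.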
Hysteresis in the lac operon expression
Plot: lacZYA expression vs. lactose concentration
The expression of the lacZYA operon increases in a culture of bacteria as the
concentration of lactose in the environment goes up.
The expression of the lacZYA operon decreases in a culture of bacteria as the
concentration of lactose in the environment goes down, but it does not follow
the same dependence on lactose.
Why? Consider the two situations at some concentration of lactose [x].
The response of the cell to lactose is dependent on the state of the cell
(i.e. the cell’s history).
Uninduced cell: No induction.
Pre-induced cell: LacY permease pumps lactose into the cell, accumulating it
above the threshold concentration. Continued induction.
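The history dependence described above can be reproduced with a toy positive-feedback model. Everything in this sketch is an assumption made for illustration: a single variable stands for the level of the lacZYA products, the intracellular inducer is taken to be external lactose times that level (permease-dependent uptake), induction follows a Hill function with coefficient 2, and all values are in arbitrary units. It is not the lecture’s model or a quantitative model of the lac operon, and the function names are hypothetical.

```python
def steady_expression(lactose, start, basal=0.05, steps=4000, dt=0.05):
    """Integrate dy/dt = basal + u^2 / (1 + u^2) - y with u = lactose * y (Euler steps).

    y is the operon expression level; u is the intracellular inducer,
    assumed proportional to external lactose times permease level.
    """
    y = start
    for _ in range(steps):
        inducer = lactose * y
        production = basal + inducer ** 2 / (1.0 + inducer ** 2)
        y += dt * (production - y)
    return y


def sweep(lactose_levels, start=0.0):
    """Step through lactose levels in order, carrying the cell state along."""
    y = start
    results = []
    for lactose in lactose_levels:
        y = steady_expression(lactose, y)
        results.append(y)
    return results


if __name__ == "__main__":
    levels = [0.5 * i for i in range(9)]  # 0.0, 0.5, ..., 4.0
    rising = sweep(levels)  # uninduced cells, lactose going up
    falling = sweep(levels[::-1], start=rising[-1])[::-1]  # pre-induced, going down
    for lactose, up, down in zip(levels, rising, falling):
        print(f"lactose = {lactose:.1f}: rising = {up:.2f}, falling = {down:.2f}")
```

At intermediate lactose levels the two sweeps settle at different expression levels: the pre-induced culture stays on while the uninduced one stays off, which is the qualitative signature of hysteresis.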