Genomes
Definition
Complete set of instructions for making an organism
• master blueprints for all enzymes, cellular structures &
activities
An organism's complete set of DNA
All the DNA contained in the cell of an organism
The collection of DNA that comprises an organism.
Total genetic information carried by a single set of
chromosomes in a haploid nucleus
Genome sequencing chronology
Year  Organism                   Significance                Genome size (bp)  Number of genes
1977  Bacteriophage φX174        First genome                5,386             11
1981  Human mitochondria         First organelle             16,500            37
1995  Haemophilus influenzae Rd  First free-living organism  1,830,137         ~3,500
1996  Saccharomyces cerevisiae   First eukaryote             12,086,000        ~6,000
Genome sequencing chronology
Year  Organism                 Significance                  Genome size (bp)  Number of genes
1998  Caenorhabditis elegans   First multicellular organism  97,000,000        ~19,000
1999  Human chromosome 22      First human chromosome        49,000,000        673
2000  Drosophila melanogaster  First insect                  150,000,000       ~14,000
2000  Arabidopsis thaliana     First plant genome            150,000,000       ~25,000
Genome size
Virus
A subcellular parasite with
genes of DNA or RNA
and which replicates inside
the host cell upon which it
relies for energy and
protein synthesis.
In addition, it has an extracellular form in which the
virus genes are contained
inside a protective coat
The Baltimore classification system
Based on the genetic contents and replication strategies of viruses.
According to the Baltimore classification, viruses are divided into the
following seven classes:
1. dsDNA viruses
2. ssDNA viruses
3. dsRNA viruses
4. (+) sense ssRNA viruses (codes directly for protein)
5. (-) sense ssRNA viruses
6. RNA reverse transcribing viruses
7. DNA reverse transcribing viruses
"ds" represents "double strand" and "ss" denotes "single strand".
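The seven classes above amount to a small decision tree over nucleic acid type, strandedness, sense, and reverse transcription. A minimal sketch, assuming a hypothetical helper name and argument encoding (neither comes from the slides):

```python
# Hypothetical helper: the function name and argument encoding are assumptions
# made for this sketch; only the seven class definitions come from the slide.
BALTIMORE_CLASSES = {
    1: "dsDNA viruses",
    2: "ssDNA viruses",
    3: "dsRNA viruses",
    4: "(+) sense ssRNA viruses",
    5: "(-) sense ssRNA viruses",
    6: "RNA reverse transcribing viruses",
    7: "DNA reverse transcribing viruses",
}


def baltimore_class(nucleic_acid, strands, sense=None, reverse_transcribing=False):
    """Return the Baltimore class number (1-7) for a described genome.

    nucleic_acid: "DNA" or "RNA"; strands: "ds" or "ss";
    sense: "+" or "-" (only needed for ssRNA genomes).
    """
    if nucleic_acid not in ("DNA", "RNA") or strands not in ("ds", "ss"):
        raise ValueError("expected nucleic_acid DNA/RNA and strands ds/ss")
    if reverse_transcribing:
        return 6 if nucleic_acid == "RNA" else 7
    if nucleic_acid == "DNA":
        return 1 if strands == "ds" else 2
    if strands == "ds":
        return 3
    if sense == "+":
        return 4
    if sense == "-":
        return 5
    raise ValueError("ssRNA genomes need sense '+' or '-'")


if __name__ == "__main__":
    number = baltimore_class("RNA", "ss", sense="+")
    print(number, BALTIMORE_CLASSES[number])
```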
Plant Viruses
Plant DNA viruses are rare
Cauliflower mosaic virus
Spherical; kills cauliflower and Brussels sprouts
Most plant viruses are small and composed of ssRNA
Rod shaped, attacks tomato, pepper, beets, turnips, tobacco
2,130 identical proteins surround the ssRNA
~10,000bp, ~10 genes
Plant Viroids
Highly complementary circular ssRNA
No protein coat
Smaller than viruses (few hundreds of bases)
Smallest known virus is 3.2 kbp in size
RNA does not code for any known protein
Some even lack the AUG initiation codon
Replication mechanism is unknown
Viroids cannot recognize and actively infect host cells
They rely on cells being weak or injured
Proposed that viroids are "escaped introns"
Viroids are usually transmitted by seed or pollen
Infected plants can show distorted growth
The first viroid to be identified was the Potato spindle tuber viroid (PSTVd)
Some 33 species have been identified
Procaryotic genomes
Generally 1 circular chromosome (dsDNA)
Usually without introns
Relatively high gene density (~2500 genes per mm of
E. coli DNA)
Often indigenous plasmids are present
1. Escherichia coli
2. Agrobacterium tumefaciens
Escherichia coli
It is a free-living, Gram-negative bacterium
It is a normal resident of the large
intestine in healthy people
It grows best with incubation at 37°C in a
culture medium that approximates the
nutrient available in the human digestive
tract
It is a type of probiotic organism because
it crowds out disease causing bacteria.
It also makes vitamin K which humans
require to be healthy.
Some strains make people sick. The toxic
strains are responsible for about half of
all cases of traveler's diarrhea.
Escherichia coli
It replicates once every 22
minutes, giving rise to 30
generations and more than 1
billion cells in 11 hours
Its growth falls into several
distinct phases (lag, logarithmic,
stationary and death)
The individual cells are invisible to
the naked eye. After plating onto
solid medium, each cell divides to
form a visible colony of identical
daughter cells in 12-24 hours
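The growth figures on this slide follow from simple doubling arithmetic. A short sketch, assuming ideal, uninterrupted exponential growth from a single cell (the function names are illustrative):

```python
def generations(elapsed_minutes, doubling_minutes):
    """Number of doublings that fit into the elapsed time."""
    return elapsed_minutes / doubling_minutes


def cells_after(elapsed_minutes, doubling_minutes, start_cells=1):
    """Cell count after ideal exponential growth (no lag or stationary phase)."""
    return start_cells * 2 ** generations(elapsed_minutes, doubling_minutes)


if __name__ == "__main__":
    minutes = 11 * 60                    # 11 hours
    print(generations(minutes, 22))      # 30.0 generations
    print(cells_after(minutes, 22))      # 2**30, just over one billion cells
```

In a real culture the lag and stationary phases mean the count falls short of this ideal figure.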
Escherichia coli
It provides a relatively simple and well
understood genetic environment in
which to isolate foreign DNA
Its primary genetic complement is
contained on a single chromosome,
on which the locations and sequences of a
large number of its genes are known
The genetic code is nearly universal
Even under the best circumstances, the
uptake of a specific foreign gene is a
relatively rare occurrence and is thus
most easily accomplished in large
populations that are reproducing rapidly
Escherichia coli genome
Single chromosome of
approximately 5 million base pairs
(5 Mbp)
4288 protein coding genes:
• Average ORF 317 amino acids
• Average gene size 1000 bp
• Very compact: average
distance between genes 118bp
Contour length of genome: 1.7
mm
It can accept foreign DNA
derived from any organism
Some genes are carried on
plasmids
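The numbers on this slide can be cross-checked against each other. The sketch below assumes B-form DNA with a rise of 0.34 nm per base pair (a standard textbook value, not stated on the slide) and uses the slide's rounded figures of 5 Mbp, 4288 genes and ~1000 bp per gene:

```python
RISE_PER_BP_NM = 0.34  # assumed rise per base pair of B-form DNA, in nanometres


def contour_length_mm(n_bp):
    """Contour length of a DNA molecule in millimetres (1 mm = 1e6 nm)."""
    return n_bp * RISE_PER_BP_NM * 1e-6


def genes_per_mm(n_genes, n_bp):
    """Gene density along the DNA contour."""
    return n_genes / contour_length_mm(n_bp)


def coding_fraction(n_genes, mean_gene_bp, n_bp):
    """Rough share of the genome taken up by genes."""
    return n_genes * mean_gene_bp / n_bp


if __name__ == "__main__":
    genome_bp, n_genes = 5_000_000, 4288
    print(f"contour length: {contour_length_mm(genome_bp):.2f} mm")
    print(f"gene density:   {genes_per_mm(n_genes, genome_bp):.0f} genes/mm")
    print(f"coding share:   {coding_fraction(n_genes, 1000, genome_bp):.0%}")
    print(f"mean ORF:       {317 * 3} bp for 317 codons")
```

Under these assumptions the contour length comes out at about 1.7 mm and the density at roughly 2,500 genes per mm, in line with the values quoted for E. coli in the slides.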
Agrobacterium tumefaciens
Agrobacterium tumefaciens is a Gram-negative, non-sporing, motile, rod-shaped
soil bacterium and phytopathogen,
closely related to Rhizobium which
forms nitrogen-fixing nodules on
clover and other leguminous plants.
Agrobacterium affects most
dicotyledonous plants in nature,
resulting in crown gall tumors at the
soil-air junction upon tissue wounding
Agrobacterium has the broadest host
range of any plant pathogenic
bacterium
Agrobacterium tumefaciens
Most of the genes involved in crown gall
disease are not borne on the
chromosome of A. tumefaciens but on a
large plasmid, termed the Ti (tumour-inducing) plasmid.
It is important to note that only a small
part of the plasmid (the T-DNA) enters
the plant; the rest of the plasmid remains
in the bacterium to serve further roles.
When integrated into the plant genome,
the genes on the T-DNA code for:
production of cytokinins
production of indoleacetic acid
synthesis and release of novel plant
metabolites - the opines and
agrocinopines.
Agrobacteria that cause
neoplastic diseases in plants
Agrobacterium rhizogenes (hairy root disease).
Agrobacterium rubi (cane gall disease)
Agrobacterium tumefaciens (crown gall disease)
Agrobacterium vitis (crown gall of grape)
What will Agrobacterium
tumefaciens affect in plants?
Crown gall disease is not
generally fatal, but it will
reduce plant vigor and crop
yield, and crown galls will
attract other phytopathogens
or pests.
In some cases, necrosis or
apoptosis is observed after
Agrobacterium infection.
The discovery of Agrobacterium
In 1897, Fridiano Cavara identified a flagellate,
bacilloid bacterium as the causal agent of crown gall of
grape.
This organism is Agrobacterium vitis, causing the growth
of neoplastic tumors on the stem and crown of
grapevines and inducing necrotic lesions on grape
roots.
Historical discoveries about
Agrobacterium
Turn of the 20th century – found to cause crown gall disease
1940s – crown gall tissue cultured, owing to its hormone
autotrophy
1970s – pathogenicity transferred between bacteria via
conjugation – evidence of plasmid involvement
1980s – T-DNA was first engineered to carry useful
genes into plants using methods that 'hijacked' the
natural process
Evidence for plasmid involvement in the
virulence of agrobacterium
1. Relationship between virulence and specific plasmids in different
agrobacterium strains
2. Loss of virulence with loss of plasmids when grown at high temperature (plus
restoration of virulence when the same plasmids were replaced)
3. Virulence transferred when plasmids were transferred between virulent and
non-virulent strains
4. Stable nature of hormone autotrophy in infected host plant tissues
indicated that this was genetically determined and could result from
genetic transfers between agrobacterium and its host
5. Fragments of agrobacterium plasmids (T-DNA) were found in the DNA
of diseased tissues
6. Plants regenerated from diseased tissues were bred to produce offspring
which inherited the T-DNA in a Mendelian manner.
7. This indicated that the T-DNA was integrated into nuclear DNA
Autoradiogram of a Southern
blot of DNA extracted from
cured crown gall cells probed
with T-DNA showing the
presence of T-DNA within
the plant genome.
Lanes 1 & 2: T-DNA extracted
from agrobacterium Ti plasmid
Lanes 3, 5 & 6: DNA extracted
from gall cells
Lane 4: DNA from non-infected
plant tissue
Agrobacterium lives in intercellular spaces of the plant
Steps of Agrobacterium-plant
cell interaction
1. Cell-cell recognition
2. Signal transduction and transcriptional activation of
vir genes
3. Conjugal DNA metabolism
4. Intercellular transport
5. Nuclear import
6. T-DNA integration
Agrobacterium tumefaciens genome
• Genome size (chromosome) is about 6 Mb
• A large (~250 kbp) plasmid called the tumor-inducing (Ti) plasmid
• The plasmid contains genes responsible for the disease
• A portion of the Ti plasmid, the T-DNA (transfer DNA), is transferred between
bacterial cells and plant cells
• T-DNA integrates stably into the plant genome
• The single-stranded T-DNA fragment is converted to a dsDNA
fragment by the plant cell, then integrated into the plant genome
• 2 x 23bp direct repeats play an important role in the excision
and integration process
Agrobacterium tumefaciens
What is naturally encoded in T-DNA?
• Enzymes for auxin and cytokinin synthesis
Cause a hormone imbalance, leading to tumor
formation/undifferentiated callus
Mutants in enzymes have been characterized
• Opine synthesis genes (e.g. octopine or nopaline)
Carbon and nitrogen source for A. tumefaciens growth
Insertion genes
• Virulence (vir) genes
• Allow excision and integration into plant genome
Plasmids
Naturally occurring extrachromosomal circular DNAs
They exist separate from the main chromosome
They replicate within the host cells
Their size varies from ~1,000 to 250,000 base pairs
They can be divided into two broad groups according to how tightly
their replication is regulated:
1. Stringent plasmids (low copy number plasmids: 1-2 plasmids/cell)
only replicate along with the main bacterial chromosome
and so exist as a single copy, or at most several copies, within the cell
2. Relaxed plasmids (multi copy number plasmids)
replicate autonomously of the main chromosome and have copy numbers
of 10 - 500 per cell
pBR322
The plasmid pBR322 is one of the most commonly used E.coli cloning vectors. pBR322 is 4361 bp in length and
contains: (1) the replicon rep responsible for the replication of plasmid (source – plasmid pMB1); (2) rop gene coding
for the Rop protein, which promotes conversion of the unstable RNA I – RNA II complex to a stable complex and
serves to decrease copy number (source – plasmid pMB1); (3) bla gene, coding for beta-lactamase that confers
resistance to ampicillin (source – transposon Tn3); (4) tet gene, encoding tetracycline resistance protein (source –
plasmid pSC101).
Genetic structure of the Ti plasmid
[Figure: map of the Ti plasmid T-DNA region. Labels: Left Border and Right Border, TL, TR, oncogenes (Aux, Cyt), opines, transfer]
Ti (tumor-inducing) plasmid of A. tumefaciens
1. Auxin, cytokinin,
opine synthetic genes
transferred to plant
2. Plant makes all 3
compounds
3. Auxins and cytokinins
cause gall formation
4. Opines provide unique
carbon/nitrogen
source only A.
tumefaciens can use!
Saccharomyces cerevisiae
Nonpathogenic
Rapid growth (generation time ca.
80 min)
Dispersed cells
Ease of replica plating and mutant
isolation
Can be grown on defined media
giving the investigator complete
control over environmental
parameters
Well-defined genetic system
Highly versatile DNA
transformation system
Saccharomyces cerevisiae
Strains have both a stable haploid
and diploid state
Viable with a large number of
markers
Recessive mutations are
conveniently manifested in haploid
strains and complementation tests
can be carried out with diploid
strains
The ease of gene disruptions and
single step gene replacements
offers an outstanding advantage for
experimentation
Saccharomyces cerevisiae
Yeast genes can be functionally expressed
when fused to the green fluorescent protein
(GFP), allowing gene products to be localized
in the living cell by fluorescence microscopy
The yeast system has also proven an
invaluable tool for cloning and maintaining large
segments of foreign DNA in yeast artificial
chromosomes (YACs), which are extremely useful
for other genome projects, and for searching for
protein-protein interactions using the two-hybrid approach
Transformation can be carried out directly
with short single-stranded synthetic
oligonucleotides, permitting the convenient
production of numerous altered forms of
proteins
Yeast genome
Genome of a diploid Saccharomyces cerevisiae cell
Characteristic  Relative amount (%)  Number of copies  Size (kbp)
Chromosomes     85                   2 x 16            14,000
Plasmid         5                    60-100            6.318
Mitochondrial   10                   ~50 (8-130)       70-76
Yeast plasmid
The yeast genome
S. cerevisiae contains a haploid set of 16 well-characterized chromosomes,
ranging in size from 200 to 2,200 kb
Total sequence of chromosomal DNA is 12.8 Mb
6,183 ORFs over 100 amino acids long
First completely sequenced eukaryote genome
Very compact genome:
• Short intergenic regions
• Scarcity of introns
• Lack of repetitive sequences
Strong evidence of duplication:
• Chromosome segments
• Single genes
Redundancy: non-essential genes provide selective advantage
Eucaryotic genomes
Located on several chromosomes
Relatively low gene density (50 genes per mm of DNA in
humans)
Carry organellar genome
Plant genomes
Plants contain three genomes (nuclear, plastid and mitochondrial)
Nuclear genetic information is divided among the chromosomes.
The size of the genome is species dependent
The difference in genome size is mainly due to different
numbers of identical sequences of various sizes arranged in succession
The genes for ribosomal RNAs occur as repetitive sequences and,
together with the genes for some transfer RNAs, are present in several thousand
copies
Structural genes are present in only a few copies, sometimes just a
single copy. Structural genes encoding structurally and
functionally related proteins often form a gene family
The DNA in the genome is replicated during interphase, before
mitosis
Arabidopsis thaliana
A weed growing at the roadside of central
Europe
It has only 2 x 5 chromosomes
It is just 70 Mbp
It has a life cycle of only 6 weeks
It contains 25,498 structural genes from
11,000 families
The structural genes are present in only a few
copies, sometimes just one
Structural genes encoding for structurally
and functionally related proteins often form
a gene family
Peculiarities of plant genomes
Huge genomes reaching tens of billions of base pairs
Numerous polyploid forms
Abundant (up to 99%) non coding DNA which seriously hinders
sequencing, gene mapping and design of gene
Poor morphological, genetic, and physical mapping of
chromosomes
A large number of "small-chromosome" species in which the
chromosome length does not exceed 3 μm
The number of chromosomes and DNA content in many species
is still unknown
Size of the genome in plants and human (million base pairs)
Genome         Arabidopsis thaliana  Zea mays  Vicia faba  Human
Nucleus        70                    3900      14500       2800
Plastid        0.156                 0.136     0.120       -
Mitochondrion  0.370                 0.570     0.290       0.017
Organisation of the genome
into chromosomes
The nuclear genome is organized into chromosomes
Chromosomes consist of essentially one long DNA helix
wound around nucleosomes
At metaphase, when the genome is relatively inactive, the
chromosomes are most condensed and therefore most easily
observed cytologically, counted or separated
Chromosomes provide the means by which the plant genome
constituents are replicated and segregated regularly in mitosis
and meiosis
Large genome segments are defined by their conserved order
of constituent genes
Genome composition
1. Heterochromatin
Darkly staining portions of chromosomes,
believed due to high degree of coiling
Non-genic DNA
a. Centromere
~ “middle” of Chromosomes
spindle attachment sites
b. Telomeres
1. ends of chromosome
2. important for the stability of
chromosomes tips.
2. Euchromatin
Lightly staining portion of chromosomes
It represents most of the genome
It contains most of genes.
Ploidy and chromosome number
Organism     Ploidy           Chromosome number
Corn         Diploid (2X)     20
Tomato       Diploid (2X)     24
Arabidopsis  Diploid (2X)     10
Potato       Tetraploid (4X)  48
Wheat        Hexaploid (6X)   42
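Dividing the chromosome number by the ploidy level gives the base chromosome number (x), the size of one basic chromosome set. A small sketch using only the values from the table above (the dictionary layout and function name are illustrative):

```python
# (ploidy level, somatic chromosome number), taken from the table above
PLANTS = {
    "Corn": (2, 20),
    "Tomato": (2, 24),
    "Arabidopsis": (2, 10),
    "Potato": (4, 48),
    "Wheat": (6, 42),
}


def base_chromosome_number(ploidy, chromosome_number):
    """Chromosomes per basic set: somatic chromosome number / ploidy level."""
    if chromosome_number % ploidy != 0:
        raise ValueError("chromosome number is not a multiple of the ploidy level")
    return chromosome_number // ploidy


if __name__ == "__main__":
    for name, (ploidy, total) in PLANTS.items():
        print(f"{name}: {ploidy}X, {total} chromosomes, x = "
              f"{base_chromosome_number(ploidy, total)}")
```

This shows, for example, that tetraploid potato and diploid tomato share the same base number of 12 despite their different chromosome totals.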
Organization of Plant Genome
Protein coding gene
Portion of genome which encodes for most of the transcribed genes
(Protein coding genes)
Non-coding sequences
1. Intron
2. Regulatory elements of genes
3. Multiple copies of genes, including pseudogenes
4. Intergenic sequences
5. Interspersed repeats
Organization of Plant Genome
Most plants contain quantities of DNA that greatly exceed their needs
for coding and regulatory functions
Very small percentage of the genome may encode for genes involved
in protein production
Based on reassociation kinetics:
Low-copy-number DNA
DNA sequences that encode most of the transcribed genes (protein coding
genes)
Medium-copy-number DNA
DNA sequences that encode ribosomal RNA (tandemly repeated expressed
DNA)
High-copy-number DNA
It is composed of highly repetitive sequences (repetitious DNA)
Gene classification
[Diagram: a simplified chromosome divided into coding genes, non-coding genes and intergenic regions. Coding genes give messenger RNA, which gives proteins (structural proteins, enzymes). Non-coding genes give structural RNA (transfer RNA, ribosomal RNA, other RNA).]
Protein Coding Genes
Segment of DNA which can be transcribed and translated into an amino acid sequence
Protein Coding Genes
Transcribed region ≈ Open Reading Frame (ORF)
• long (usually >100 aa)
• “known” proteins likely
Basal signals
• Transcription, translation
Regulatory signals
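The ">100 aa" rule of thumb for open reading frames can be illustrated with a naive scanner. This is only a sketch under simplifying assumptions: forward strand only, standard stop codons, ATG as the sole start codon, and no handling of introns, so it is closer to a prokaryotic ORF search than to real plant gene prediction. The function name is hypothetical.

```python
STOP_CODONS = {"TAA", "TAG", "TGA"}


def find_orfs(dna, min_aa=100):
    """Return (start, end, frame) for each ATG...stop ORF on the forward strand.

    start is the index of the A in ATG, end is one past the stop codon, and
    only ORFs encoding at least min_aa amino acids are reported.
    """
    dna = dna.upper()
    orfs = []
    for frame in range(3):
        start = None
        for i in range(frame, len(dna) - 2, 3):
            codon = dna[i:i + 3]
            if start is None and codon == "ATG":
                start = i                      # first in-frame start codon
            elif start is not None and codon in STOP_CODONS:
                n_aa = (i - start) // 3        # codons before the stop
                if n_aa >= min_aa:
                    orfs.append((start, i + 3, frame))
                start = None                   # look for the next ORF
    return orfs


if __name__ == "__main__":
    toy = "CC" + "ATG" + "GCT" * 4 + "TAA" + "G"   # 5-codon toy ORF
    print(find_orfs(toy, min_aa=5))
```

Lowering `min_aa` finds more candidate ORFs but also more chance matches, which is why a length cutoff of this kind is commonly applied.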
Protein Coding Genes
Plants contain about 10 000 – 30 000 structural genes
They are present in only a few copies, sometimes just one (single copy
gene)
They often form a gene family
The transcription of most structural genes is subject to very complex and
specific regulation
The genes for enzymes of metabolism or protein biosynthesis, processes which
proceed in all cells, are transcribed more often
Most genes are switched off and are activated only in certain
organs, and then often only in certain cells
Many genes are only switched on at specific times
Housekeeping genes:
The genes which every cell needs for basic functions, independent of its
specialization
What do the genes encode?
Microbes – highly specialized
Yeast – simplest eukaryote
Worm – programmed development
Fly – complex development
Arabidopsis – plant life cycle
Basic functions: genes for basic cellular functions such as translation,
transcription, replication and repair share similarity
among all organisms
Gene families expand to
meet biological needs.
Pseudogenes
Nonfunctional copies of genes
Formed by duplication of ancestral gene, or reverse
transcription (and integration)
Not expressed due to mutations that produce a stop
codon (nonsense or frame-shift) or prevent mRNA
processing, or due to lack of regulatory sequences
Tandemly Repeated DNA
A large number of identical repeated DNA sequences
They are spread over the entire chromosome
There is variation within species in the number of copies in
allelic arrays
Variations in the lengths of tandemly repeated units have been
used as a source of molecular markers
It is divided into:
1. Tandemly repeated expressed DNA
2. Tandemly repeated non expressed DNA (Repetitious DNA)
Tandemly Repeated Expressed
Genes
Genes which are duplicated and clustered at many
locations in the genome
Ribosomal 18S, 5.8S, 25S and 5S RNA genes are highly
reiterated in clusters; rRNA gene clusters form sites called nucleolus
organizers (NOR)
Tandem repeats are also observed for tRNA genes and histone genes
Tandemly Repeated non-expressed
DNA
Repetitive sequences which cannot be
expressed but are found in huge amounts in the
genome
Simple-sequence DNA
Moderately repeated DNA (mobile DNA)
Simple Sequence DNA
Very short sequences repeated many times in tandem in large
clusters
It is also called satellite DNA
It often lies in heterochromatin especially in centromeres and
telomeres
It is divided into 2 groups:
Mini satellite : Variable number tandem repeat (VNTR)
Micro satellite : Simple sequence repeat (SSR)
It is used in DNA fingerprinting to identify individuals
Tandemly repeated DNA
Microsatellite (SSR: Simple sequence repeat)
• Unit size: at most 5 bp
• ATATATATATATATATATATATAT
Minisatellite
• Unit size: up to 25 bp
• ATTGCTGTATTGCTGTATTGCTGT
Mobile DNA (Jumping gene)
Units of DNA which are predisposed to move to another
location, sometimes involving replication of the unit, with the
help of products of genes on the elements or on related
element
Move within genomes
They make up most of the moderately repeated DNA sequences found throughout
higher eukaryotic genomes
Some encode enzymes that catalyze movement
2 types:
a. Transposon
b. Retrotransposon
Transposon DNA
Involves copying of mobile DNA element and
insertion into new site in genome
Molecular parasite: “selfish DNA”
They probably have significant effect on evolution by
facilitating gene duplication, which provides the fuel for
evolution, and exon shuffling
Retrotransposon (retroelement)
Transposon like segment of DNA
Retroviruses lacking the sequence encoding the structural
envelope protein
Major component of plant genome
Size ranges from 1 to 13 kb in length
Widely distributed over the chromosomes of many plant
species
Retrovirus
A virus of higher organisms whose genome is RNA, but which can insert
a DNA copy of its genome into the host chromosome
Mobile elements
50-80% of plant genomes consists of transposable elements
Plant genome sizes
Predicted Gene numbers
Small difference in gene number, although rice genome is
3x the size
Eukaryotic cells
Peter J. Russell, iGenetics: Copyright © Pearson Education, Inc., publishing as Benjamin Cummings.
Mitochondrial genome (mtDNA)
Number of mitochondria in plants can be between 50-2000
One mitochondrion contains 1 – 100 genome copies (multiple identical
circular chromosomes).
The genome can occur as one large circle and several smaller ones
Size ~ 200 kb to 2,500 kb in plants
mtDNA is replicated before or during mitosis
Transcription of mtDNA yields an mRNA which does not yet contain the
correct information for the protein to be synthesized.
RNA editing occurs in plant mitochondria
Over 95% of mitochondrial proteins are encoded in the nuclear
genome.
Often A+T rich genomes
Chloroplast genome (ctDNA)
Multiple circular molecules, similar to procaryotic cyanobacteria,
although much smaller (0.001-0.1%of the size of nuclear genomes)
Cells contain many copies of plastids and each plastid contains many
genome copies
Size ranges from 120 kb to 160 kb
Plastid genome has changed very little during evolution. Though two
plants are very distantly related, their genomes are rather similar in
gene composition and arrangement
Some of plastid genomes contain introns
Many chloroplast proteins are encoded in the nucleus (separate signal
sequence)
DNA for chloroplast proteins can come
from the nucleus or the chloroplast genome
Buchanan et al. Fig. 4.4