Genomics
Chapter 18
Mapping Genomes
Maps of genomes can be divided into 2 types:
-Genetic maps
-Abstract maps that place the relative location of genes on chromosomes based on recombination frequency.
-Physical maps
-Use landmarks within DNA sequences, ranging from
restriction sites to the actual DNA sequence.
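A genetic map distance can be estimated from recombination frequency with simple arithmetic. The sketch below assumes the common convention that 1% recombination corresponds to roughly 1 centimorgan (cM), which holds only for closely linked genes; the function names and the offspring counts are hypothetical.

```python
def recombination_frequency(recombinant_offspring: int, total_offspring: int) -> float:
    """Return the fraction of offspring that are recombinant (0 to 1)."""
    if total_offspring <= 0:
        raise ValueError("total_offspring must be positive")
    if not 0 <= recombinant_offspring <= total_offspring:
        raise ValueError("recombinant_offspring must be between 0 and total_offspring")
    return recombinant_offspring / total_offspring


def map_distance_cm(recombinant_offspring: int, total_offspring: int) -> float:
    """Approximate map distance in centimorgans (assumes 1% recombination ~ 1 cM)."""
    return 100.0 * recombination_frequency(recombinant_offspring, total_offspring)


# Hypothetical cross: 120 recombinant offspring out of 1000 scored.
distance = map_distance_cm(120, 1000)
print(f"{distance:.1f} cM")
```

The result is a relative distance between two genes, not a count of base-pairs, which is why genetic maps are described as abstract.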
2
Physical Maps
Distances between “landmarks” are measured in base-pairs.
-1000 basepairs (bp) = 1 kilobase (kb)
Knowledge of DNA sequence is not necessary.
There are three main types of physical maps:
-Restriction maps (constructed using restriction enzymes)
-Cytological maps (chromosome-banding pattern)
-Radiation hybrid maps (using radiation to fragment
chromosomes)
3
Restriction maps
-The first physical maps;
-Based on distances between restriction sites;
-Overlap between smaller segments can be used to assemble them into a contig (a continuous segment of the genome).
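The idea of measuring distances between restriction sites can be sketched in a few lines. This example assumes the EcoRI recognition sequence GAATTC and a made-up toy sequence; the helper names are hypothetical, and the sketch only locates recognition sites on one strand rather than modeling where the enzyme cuts.

```python
def restriction_sites(sequence: str, site: str) -> list[int]:
    """Return the 0-based start position of every occurrence of `site`."""
    positions = []
    start = sequence.find(site)
    while start != -1:
        positions.append(start)
        start = sequence.find(site, start + 1)
    return positions


def distances_between_sites(positions: list[int]) -> list[int]:
    """Return the distance in base-pairs between each pair of neighboring sites."""
    return [later - earlier for earlier, later in zip(positions, positions[1:])]


ECORI = "GAATTC"  # EcoRI recognition sequence

# Hypothetical toy sequence containing three EcoRI sites.
toy = "AA" + ECORI + "TTTTGG" + ECORI + "CCCCCCCCC" + ECORI + "AA"

sites = restriction_sites(toy, ECORI)
print(sites)
print(distances_between_sites(sites))
```

On real DNA the distances come from the sizes of the fragments produced by digestion, so the sequence itself does not need to be known; the string search here is only a stand-in for that measurement.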
4
Physical Maps
Cytological maps
-Employ stains that generate reproducible patterns of bands on the chromosomes
-Divide chromosomes into subregions
-Provide a map of the whole genome, but at low resolution
-Cloned DNA is correlated with the map using fluorescent in situ hybridization (FISH)
5
Physical Maps
Radiation hybrid maps
-Use radiation to fragment chromosomes randomly;
-Fragments are then recovered by fusing the irradiated cell to
another cell
-Usually a rodent cell
-Fragments can be identified based on banding patterns
or FISH.
6
Genetic Maps
The most common markers are short repeat sequences called
short tandem repeats, or STR loci:
-Differ in repeat length between individuals;
-13 form the basis of modern DNA fingerprinting developed
by the FBI;
-Cataloged in the CODIS database to identify criminal offenders.
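Because STR alleles differ in repeat length, comparing two individuals at one locus comes down to counting repeat copies. The following is a minimal sketch, assuming a hypothetical four-base repeat unit (GATA) and invented flanking sequences; it is not how forensic laboratories actually genotype STR loci.

```python
import re


def longest_tandem_run(sequence: str, unit: str) -> int:
    """Return the largest number of back-to-back copies of `unit` in `sequence`."""
    if not unit:
        raise ValueError("unit must be a non-empty string")
    pattern = re.compile(f"(?:{re.escape(unit)})+")
    runs = [len(match.group(0)) // len(unit) for match in pattern.finditer(sequence)]
    return max(runs, default=0)


# Hypothetical alleles at one STR locus: same flanks, different repeat counts.
allele_a = "CCTA" + "GATA" * 7 + "TTGC"
allele_b = "CCTA" + "GATA" * 11 + "TTGC"

print(longest_tandem_run(allele_a, "GATA"), longest_tandem_run(allele_b, "GATA"))
```

A profile built from several such loci is what makes the combination of repeat counts useful for identification.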
7
Genetic Maps
Genetic and physical maps can be correlated:
-Any cloned gene can be placed within the genome and can
also be mapped genetically.
8
Genetic Maps
All of these different kinds of maps are stored in databases:
-The National Center for Biotechnology Information (NCBI)
serves as the US repository for these data and more;
-Similar databases exist in Europe and Japan
9
Whole Genome Sequencing
The ultimate physical map is the base-pair sequence of the
entire genome.
-Requires the use of high-throughput automated sequencing and computer analysis.
10
Whole Genome Sequencing
Sequencers provide accurate sequences for DNA segments
up to 800 bp long
-To reduce errors, 5-10 copies of a genome are
sequenced and compared
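The read length and the 5-10 copies quoted above give a rough feel for how many reads a project needs. This is an idealized back-of-the-envelope sketch: it assumes every read is a full 800 bp, and the 4,000,000 bp genome size is a hypothetical figure chosen for illustration.

```python
import math


def reads_needed(genome_size_bp: int, read_length_bp: int, coverage: float) -> int:
    """Return the minimum number of reads whose total length equals coverage x genome size."""
    if genome_size_bp <= 0 or read_length_bp <= 0 or coverage <= 0:
        raise ValueError("all arguments must be positive")
    return math.ceil(genome_size_bp * coverage / read_length_bp)


GENOME_SIZE_BP = 4_000_000  # hypothetical genome size
READ_LENGTH_BP = 800        # upper read length quoted on the slide

for copies in (5, 10):
    print(copies, reads_needed(GENOME_SIZE_BP, READ_LENGTH_BP, copies))
```

Real projects need more reads than this lower bound, since reads vary in length and do not cover the genome evenly.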
Vectors used to clone large pieces of DNA:
-Yeast artificial chromosomes (YACs)
-Bacterial artificial chromosomes (BACs)
-Human artificial chromosomes (HACs)
-Are circular, at present
11
Whole Genome Sequencing
Clone-by-clone sequencing
-Overlapping regions between BAC clones are identified
by restriction mapping or sequence-tagged site (STS) analysis.
Shotgun sequencing
-DNA is randomly cut into smaller fragments, cloned and
then sequenced;
-Computers put together the overlaps.
-Sequence is not tied to other information.
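How a computer "puts together the overlaps" can be illustrated with a toy greedy merge. This is only a sketch under strong assumptions: error-free fragments, a single strand, no repeats, and no fragment contained inside another. The function names and the three fragments are hypothetical, and real assemblers use far more robust methods.

```python
def overlap_length(left: str, right: str, min_overlap: int = 3) -> int:
    """Length of the longest suffix of `left` that equals a prefix of `right`."""
    for size in range(min(len(left), len(right)), min_overlap - 1, -1):
        if left.endswith(right[:size]):
            return size
    return 0


def greedy_assemble(fragments: list[str], min_overlap: int = 3) -> list[str]:
    """Repeatedly merge the pair of fragments that share the largest overlap."""
    contigs = list(dict.fromkeys(fragments))  # drop exact duplicates, keep order
    while True:
        best_size, best_i, best_j = 0, None, None
        for i, left in enumerate(contigs):
            for j, right in enumerate(contigs):
                if i == j:
                    continue
                size = overlap_length(left, right, min_overlap)
                if size > best_size:
                    best_size, best_i, best_j = size, i, j
        if best_size == 0:
            return contigs  # nothing left to merge
        merged = contigs[best_i] + contigs[best_j][best_size:]
        contigs = [c for k, c in enumerate(contigs) if k not in (best_i, best_j)]
        contigs.append(merged)


# Hypothetical fragments, given in shuffled order.
fragments = ["GTTAGC", "ATGCGTAC", "GTACGTTA"]
print(greedy_assemble(fragments))
```

The fragments carry no positional information of their own, which mirrors the point that shotgun sequence is not tied to other information: order is recovered from the overlaps alone.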
12
13
The Human Genome Project
Begun in 1990 by the International Human Genome
Sequencing Consortium;
Craig Venter formed a private company and entered the
“race” in May 1998;
In 2001, both groups published a draft sequence.
-Contained numerous gaps
14
The Human Genome Project
In 2004, the “finished” sequence was published as the
reference sequence (REF-SEQ) in databases:
-3.2 gigabasepairs
-1 Gb = 1 billion basepairs;
-Contains a 400-fold reduction in gaps compared with the draft;
-99% of euchromatic sequence;
-Error rate = 1 per 100,000 bases
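The quoted error rate can be turned into an expected number of wrong bases. This sketch simply applies the stated rate to the full 3.2 Gb figure, which is an assumption made for illustration; the function name is hypothetical.

```python
def expected_errors(genome_size_bp: int, bases_per_error: int) -> int:
    """Return the expected number of erroneous bases at one error per `bases_per_error`."""
    if bases_per_error <= 0:
        raise ValueError("bases_per_error must be positive")
    return genome_size_bp // bases_per_error


GENOME_SIZE_BP = 3_200_000_000  # 3.2 Gb, with 1 Gb = 1 billion base-pairs
BASES_PER_ERROR = 100_000       # error rate of 1 per 100,000 bases

print(expected_errors(GENOME_SIZE_BP, BASES_PER_ERROR))
```

Even a "finished" sequence at this accuracy would therefore still be expected to contain tens of thousands of incorrect bases.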
15
Characterizing Genomes
The Human Genome Project found fewer genes than
expected:
-Initial estimate was 100,000 genes;
-Number now appears to be about 25,000!
In general, eukaryotic genomes are larger and have more
genes than those of prokaryotes:
-However, the complexity of an organism is not necessarily
related to its gene number.
16
Finding Genes
Genes are identified by open reading frames:
-An ORF begins with a start codon and contains no
stop codon for a distance long enough to encode a
protein.
Sequence annotation:
-The addition of information, such as ORFs, to the
basic sequence information.
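The ORF definition above translates directly into a scan for a start codon followed by an uninterrupted run of codons. The sketch below is a simplified illustration: it reads only the given strand, uses ATG as the start codon, and its `min_codons` threshold is an arbitrary stand-in for "long enough to encode a protein". The function name and the toy sequence are hypothetical.

```python
START_CODON = "ATG"
STOP_CODONS = {"TAA", "TAG", "TGA"}


def find_orfs(sequence: str, min_codons: int = 3) -> list[tuple[int, int]]:
    """Return (start, end) index pairs of ORFs on the given strand.

    Each ORF begins at ATG and runs to the first in-frame stop codon;
    `end` is the index just past that stop codon.
    """
    orfs = []
    for frame in range(3):
        position = frame
        while position + 3 <= len(sequence):
            if sequence[position:position + 3] == START_CODON:
                # Walk codon by codon until a stop codon or the end of the sequence.
                for stop in range(position + 3, len(sequence) - 2, 3):
                    if sequence[stop:stop + 3] in STOP_CODONS:
                        codons = (stop - position) // 3  # codons before the stop
                        if codons >= min_codons:
                            orfs.append((position, stop + 3))
                        position = stop  # resume scanning after this ORF
                        break
            position += 3
    return orfs


# Hypothetical toy sequence: ATG, three more codons, then the stop codon TAA.
toy = "CCATGGCTGAAAAATAAGG"
for start, end in find_orfs(toy):
    print(start, end, toy[start:end])
```

Recording the resulting coordinates alongside the raw sequence is a small-scale version of sequence annotation.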
17
Finding Genes
BLAST
-A search algorithm used to search NCBI databases for
homologous sequences;
-Permits researchers to infer functions for isolated molecular
clones
Bioinformatics
-Use of computer programs to search for genes, and to
assemble and compare genomes.
18
Genome Organization
Genomes consist of two main regions
-Coding DNA
-Contains genes that encode proteins
-Noncoding DNA
-Regions that do not encode proteins
19
Coding DNA in Eukaryotes
Four different classes are found:
-Single-copy genes: Includes most genes.
-Segmental duplications: Blocks of genes copied from one
chromosome to another.
-Multigene families: Groups of related but distinctly
different genes.
-Tandem clusters: Identical copies of genes occurring
together in clusters.
20
Noncoding DNA in Eukaryotes
Each cell in our bodies has about 6 feet of DNA stuffed into
it.
-However, less than one inch is devoted to genes!
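The length comparison can be converted into a fraction of the genome. This is a rough worked calculation that treats "less than one inch" as exactly one inch, so the result is an upper bound; the function name is hypothetical.

```python
INCHES_PER_FOOT = 12


def coding_fraction(total_dna_feet: float, gene_dna_inches: float) -> float:
    """Return the fraction of the total DNA length that is devoted to genes."""
    total_inches = total_dna_feet * INCHES_PER_FOOT
    return gene_dna_inches / total_inches


# About 6 feet of DNA per cell, at most 1 inch of it devoted to genes.
fraction = coding_fraction(6, 1)
print(f"{fraction:.1%}")
```

One inch out of 72 is about 1.4%, which agrees with the figure of less than 1.5% given for protein-encoding exons.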
Six major types of noncoding human DNA have been
described.
21
Noncoding DNA in Eukaryotes
Noncoding DNA within genes:
-Protein-encoding exons (less than 1.5%) are embedded
within much larger noncoding introns (about 24%).
Structural DNA:
-Called constitutive heterochromatin;
-Localized to centromeres and telomeres.
Simple sequence repeats (SSRs):
-One- to six-nucleotide sequences repeated thousands of
times (about 3%); SSRs can arise from DNA replication errors.
22
Noncoding DNA in Eukaryotes
Segmental duplications:
-Consist of 10,000 to 300,000 bp that have duplicated and
moved either within a chromosome or to a nonhomologous
chromosome.
Pseudogenes:
-Inactive genes that may have lost function because of
mutation.
23
Noncoding DNA in Eukaryotes
Transposable elements (transposons)
-Mobile genetic elements
- Able to move from one location on a chromosome to
another.
-Four types:
-Long interspersed elements (LINEs) (21%)
-Short interspersed elements (SINEs) (13%)
-Long terminal repeats (LTRs) (8%)
-Dead transposons (3%)
-Total: 45% of the genome!
24
Genomics
Comparative genomics, the study of whole genome maps of
organisms, has revealed similarities among them:
-Over half of Drosophila genes have human counterparts;
-Humans and mice: only 300 genes have no counterpart in the
other genome.
Synteny refers to the conserved arrangements of DNA
segments in related genomes;
-Allows comparisons of unsequenced genomes.
25
Copyright © The McGraw-Hill Companies, Inc. Permission required for reproduction or display.
[Figure: genomic alignment (segment rearrangement) of rice, sugarcane, corn, and wheat]
26
Genomics
Functional genomics is the study of the function of genes
and their products;
DNA microarrays (“gene chips”) enable the analysis of
gene expression at the whole-genome level;
-DNA fragments are deposited on a slide:
-Probed with labeled mRNA from different sources;
-Active/inactive genes are identified.
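One common way to decide which genes are more or less active is to compare the signal for each spot between two samples. The sketch below assumes a two-sample comparison scored by a log2 ratio with a cutoff of 1 (a two-fold change); the gene names, intensities, cutoff, and function name are all hypothetical.

```python
import math


def classify_expression(sample_signal: float, reference_signal: float,
                        threshold: float = 1.0) -> str:
    """Label a gene by the log2 ratio of its signal in two samples."""
    if sample_signal <= 0 or reference_signal <= 0:
        raise ValueError("signals must be positive")
    log_ratio = math.log2(sample_signal / reference_signal)
    if log_ratio >= threshold:
        return "more active"
    if log_ratio <= -threshold:
        return "less active"
    return "unchanged"


# Hypothetical spot intensities in arbitrary units: gene -> (sample, reference).
spots = {
    "geneA": (800.0, 200.0),
    "geneB": (150.0, 600.0),
    "geneC": (310.0, 300.0),
}

calls = {gene: classify_expression(s, r) for gene, (s, r) in spots.items()}
print(calls)
```

Because every spot on the slide is scored the same way, the comparison covers expression at the whole-genome level in a single experiment.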
27
Proteomics
Proteomics is the study of the proteome:
-All the proteins encoded by the genome.
- A single gene can code for multiple proteins using
alternative splicing.
Although all the DNA in a genome can be isolated from a
single cell, only a portion of the proteome is expressed
in a single cell or tissue.
The transcriptome consists of all the RNA that is present
in a cell or tissue.
28
Proteomics
Proteins are much more difficult to study than DNA
because of:
-Post-translational modifications
-Alternative splicing.
However, databases containing the known protein
structures exist:
-These can be searched to predict the structure and
function of gene sequences.
29
Applications of Genomics
The genomics revolution will have a lasting effect on how we
think about living systems;
The immediate impact of genomics is being seen in
diagnostics:
-Identifying genetic abnormalities;
-Identifying victims by their remains;
-Distinguishing between naturally occurring and
intentional outbreaks of infections.
30
Applications of Genomics
Genomics has also helped in agriculture.
-Improvement in the yield and nutritional quality of rice.
-Doubling of world grain production in the last 50 years, with only
a 1% cropland increase.
32
Applications of Genomics
Genome science is also a source of ethical challenges and
dilemmas:
-Gene patents
-Should the sequence/use of genes be freely
available or can it be patented?
-Privacy concerns
-Could one be discriminated against because their SNP profile indicates susceptibility to a disease?
33