PowerPoint Presentation - The Human Genome Project: The

Comparative Analysis of Human
Chromosome 22q11.1-q12.3 with
Syntenic Regions in the Chimpanzee,
Baboon, Bovine, Mouse, Pufferfish and
Zebrafish Genomes
Dr. Bruce A. Roe
George Lynn Cross Research Professor
Advanced Center for Genome Technology
Department of Chemistry and Biochemistry
University of Oklahoma
www.genome.ou.edu
C T
A G
LXVIII CSHL Symposium
“The Genome of Homo Sapiens”
May 28 - June 3, 2003
“The joy of science is the people
you meet along the way and how
they influence your life”
Jochanan Stenesh and Lilian Myers at Western Michigan University
and Bernie Dudock at SUNY Stony Brook
Bart Barrell and Alan Coulson
originally at the MRC-Hills
Road Cambridge and Ian
Dunham both now at the
Sanger Institute
Bev Emanuel at Children's
Hospital of Philadelphia
Watson and Crick
Fred Sanger
Sanger, Keio, Wash U, OU
Human Chromosome 22
Sequence Features
• 39 % of the sequence is occupied by genes including
their introns, 5’ and 3’ non-translated regions.
• 3 % of the complete sequence encodes the protein
products of these genes.
• 42 % of the sequence is composed of repetitive
sequences, compared to 46 % for the entire genome.
• Only slightly over half of the genes predicted for
human chromosome 22 can be experimentally
validated.*
* Shoemaker DD., et al. Experimental annotation of the human
genome using microarray technology. Nature. 409, 922-7 (2001).
An Individual’s Genome
Differs from the DNA of:
• Siblings by 1 to 2 million bases, ~99.98% identical, with
coding regions 99.99999% identical
• Unrelated humans by 6 million bases, ~99.8% identical
overall, with coding regions 99.9999% identical
• Chimpanzees by about 100 million base pairs, ~98%
identical
• Baboons by about 300 million base pairs, ~92% identical
• Mice by about 2.8 billion bases, but coding regions are
~90% identical
• Leaf spinach by about 2.9 billion bases, but coding
regions are ~40% identical
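The percentages in this list follow from dividing the number of differing bases by the genome size. A minimal sketch of that arithmetic, assuming a haploid human genome of roughly 3 billion bases (a figure the slide does not state; `GENOME_SIZE_BP` and `percent_identity` are illustrative names):

```python
# Assumed haploid human genome size; the slide does not state it.
GENOME_SIZE_BP = 3_000_000_000


def percent_identity(differing_bases, genome_size=GENOME_SIZE_BP):
    """Percent of positions that are identical, given a count of differing bases."""
    return 100.0 * (1.0 - differing_bases / genome_size)


# Unrelated humans: about 6 million differing bases.
unrelated = percent_identity(6_000_000)
print(f"{unrelated:.1f}% identical")
```

With 6 million differences this gives about 99.8%, in line with the figure for unrelated humans. The other entries are rounded estimates and need not agree exactly with this simple model.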
Differences between individuals
AGCCACACAGTGTCCACCGGATGGTTGATTTTGAAGCAGAGTT
AGCTTGTCACCTGCCTCCCTTTCCCGGGACAACAGAAGCTGAC
CTCTTTGNTCTCTTGCGCAGATGATGAGTCTCCGGGGCTCTAT
GGGTTTCTGAATGTCATCGTCCACTCAGCCACTGGATTTAAGC
AGAGTTCAAGTAAGTACTGGTTTGGGGAGNAGGGTTGCAGCGG
CNGAGCCAGGGTCTCCACCCAGGAAGGACTNATCGGGCAGGGT
GTGGGGAAACAGGGAGGTTGTTCAGATGACCACGGGACACCTT
TGACCCTGGCCGCTGTGGAGTGTTTGTGCTGGTTGATGCCTTC
TGGGTGTGGAATTGTTTTTCCCGGAGTGGCCTCTGCCCTCTCC
CCTAGCCTGTCTCAGATCCTGGGAGCTGGTGAGCTGCCCCCTG
CAGGTGGATCGAGTAATTGCAGGGGTTTGGCAAGGACTTTGAC
AGACATCCCCAGGGGTGCCCGGGAGTGTGGGGTCCNAGCCAG
The yellow underlined sequence is the first exon of
the BCR gene involved in leukemia. Only 5 bases
(N) differ in non-gene regions.
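The count of variable positions can be checked directly from the sequence as printed. A small sketch, assuming each N marks a base that differs between individuals, as the caption indicates (the variable names are illustrative):

```python
# Sequence copied from the slide; N marks a position that differs between individuals.
SEQUENCE_LINES = [
    "AGCCACACAGTGTCCACCGGATGGTTGATTTTGAAGCAGAGTT",
    "AGCTTGTCACCTGCCTCCCTTTCCCGGGACAACAGAAGCTGAC",
    "CTCTTTGNTCTCTTGCGCAGATGATGAGTCTCCGGGGCTCTAT",
    "GGGTTTCTGAATGTCATCGTCCACTCAGCCACTGGATTTAAGC",
    "AGAGTTCAAGTAAGTACTGGTTTGGGGAGNAGGGTTGCAGCGG",
    "CNGAGCCAGGGTCTCCACCCAGGAAGGACTNATCGGGCAGGGT",
    "GTGGGGAAACAGGGAGGTTGTTCAGATGACCACGGGACACCTT",
    "TGACCCTGGCCGCTGTGGAGTGTTTGTGCTGGTTGATGCCTTC",
    "TGGGTGTGGAATTGTTTTTCCCGGAGTGGCCTCTGCCCTCTCC",
    "CCTAGCCTGTCTCAGATCCTGGGAGCTGGTGAGCTGCCCCCTG",
    "CAGGTGGATCGAGTAATTGCAGGGGTTTGGCAAGGACTTTGAC",
    "AGACATCCCCAGGGGTGCCCGGGAGTGTGGGGTCCNAGCCAG",
]
sequence = "".join(SEQUENCE_LINES)

# 0-based offsets of every N in the concatenated sequence.
variable_positions = [i for i, base in enumerate(sequence) if base == "N"]

print(len(variable_positions), "variable positions out of", len(sequence), "bases")
```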
Human Chromosome 22
Single Nucleotide Polymorphisms*
Number of overlaps: 335
Size of overlaps: 13,203,147 bp
Number of SNPs: 11,116 (~1/1000 bp)
Number of substitutions: 9,123 (82%)
Number of ins/del: 1,193 (18%)
Only 48 of the 11,116 SNPs were in coding
regions, ~10-fold lower than in non-coding regions.
* E. Dawson, et al. A SNP Resource For Human Chromosome 22: Extracting Dense
Clusters of SNPs from the Genomic Sequence. Genome Research, 11, 170-178 (2001).
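The SNP spacing and the coding versus non-coding comparison can be sketched from the numbers on this slide. The coding fraction used here is an assumption: it borrows the chromosome-wide figure of about 3% protein-coding sequence, which may not hold for the overlap regions themselves.

```python
overlap_bp = 13_203_147   # total size of the overlaps (from the slide)
total_snps = 11_116       # SNPs found in the overlaps
coding_snps = 48          # SNPs that fall in coding regions

# Average spacing between SNPs.
bp_per_snp = overlap_bp / total_snps

# Assumption: ~3% of the overlap sequence is coding, borrowed from the
# chromosome-wide figure; the overlaps themselves may differ.
CODING_FRACTION = 0.03
coding_density = coding_snps / (CODING_FRACTION * overlap_bp)
noncoding_density = (total_snps - coding_snps) / ((1 - CODING_FRACTION) * overlap_bp)
fold_lower = noncoding_density / coding_density

print(f"about one SNP per {bp_per_snp:.0f} bp")
print(f"coding SNP density is about {fold_lower:.1f}-fold lower")
```

Under that assumption the coding density comes out several-fold lower than the non-coding density, the same order of magnitude as the ~10-fold estimate on the slide.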
“We each are like a different symphony orchestra”
“All playing the same instruments slightly differently”
Good news and Bad news
• Good news <40,000 genes (counting dark space?)
• Bad news
• 2-4 times as many proteins as other
species due to extensive alternative
splicing in humans.
• We only know the function of about
half the predicted genes.
• Likely > 1 million different gene
products based on alternative splicing
and post-translational modifications.
Where we stand now
• We essentially have the ‘dictionary’ with all
the words (genes) spelled correctly, but only
slightly more than half of the words (genes)
have definitions.
• Slightly over half of the genes predicted for
human chromosome 22 have been
experimentally validated.
• Through comparative genomic sequencing
we can annotate the human genome based
on evolutionarily conserved gene sequences
and use model systems to study gene
expression.
Chimpanzee and Baboon
Genomic Sequencing
• Medically important model eukaryotic organisms
• The chimpanzee is our nearest evolutionary
relative with a genome that has ~98 %
sequence identity with the human genome
• The baboon genome has ~92 % sequence
identity with the human genome
PIP plot of a region of human chr22 compared to syntenic regions of baboon and mouse
Human-specific repeat regions
Questionable gene present in primates but not in rodents
Variations in the regions syntenic to the
human chr 22 immunoglobulin light chain
region from chimp, baboon, rat and mouse
34 Kbp deletion in baboon
Exons in one copy of a zebrafish duplicated gene with 75% homology to human but greatly diverged, <50% homology, in the other copy
Instance of a rare Alu deletion in chimp and a gene having very low homology in fish
Conclusions from the analysis of
vertebrate genomic sequences
• Approximately 40% of the genome is expressed into
hnRNA which is processed to 10-fold smaller mature
mRNA with extensive alternative splicing (1 gene -->
multiple proteins).
• Approximately 40% repeat sequence density.
• Conserved coding sequences, promoters and enhancers
and exon spacing approximately proportional to
evolutionary distance from a common ancestor.
• Additional endogenous retroviral and Alu sequences in the
human genome and some regions not present in different
vertebrates.
• Sequence drift in duplicated gene families.
• About half of the predicted genes have yet to be assigned
any known function.
“Zebrafish are small people that swim
in the water and breathe through gills”
Han Wang, Dept. Zoology and Director of the
University of Oklahoma Zebrafish Facility
How much of the ~1.7 Gbp genome has been sequenced so far?
The whole genome shotgun project comprises roughly 11.6 million traces by
now. With an average quality-clipped trace length of 517 bp this adds up to 6 Gbp in
total, so the genome is covered 3.5 times.
The new assembly Zv2 is built on 11.7 million traces with an average trace
length of 651 bp, adding up to 7.64 Gbp (4.5x coverage).
The current Sanger Institute in-house statistics for the clone sequencing are:
* 322,712,747 bp unfinished
* 112,494,895 bp finished
* 435,207,642 bp total
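The coverage figures quoted above are the total sequenced bases divided by the genome size. A minimal sketch of that calculation, taking the ~1.7 Gbp genome size and the trace counts and lengths from the text (the function and variable names are illustrative):

```python
GENOME_SIZE_BP = 1.7e9  # approximate zebrafish genome size quoted above


def coverage(num_traces, mean_trace_bp, genome_size=GENOME_SIZE_BP):
    """Return (total sequenced bases, fold coverage) for a shotgun read set."""
    total_bp = num_traces * mean_trace_bp
    return total_bp, total_bp / genome_size


wgs_total, wgs_cov = coverage(11.6e6, 517)   # whole genome shotgun traces
zv2_total, zv2_cov = coverage(11.7e6, 651)   # traces behind the Zv2 assembly

print(f"WGS: {wgs_total / 1e9:.2f} Gbp, {wgs_cov:.1f}x")
print(f"Zv2: {zv2_total / 1e9:.2f} Gbp, {zv2_cov:.1f}x")

# Clone sequencing: unfinished plus finished should equal the reported total.
clone_total = 322_712_747 + 112_494_895
```

Both results round to the stated 3.5x and 4.5x. The Zv2 total computed this way is slightly below the quoted 7.64 Gbp, presumably because the trace count and mean length are themselves rounded.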
Zebrafish Developmental Stages (hours post fertilization, hpf)
Zygote Period (0-3/4 h): The newly fertilized egg is in the zygote period until the first cleavage occurs.
Cleavage Period (0.7-2.2 h): After the first cleavage, blastomeres divide at approximately 15 minute intervals.
Blastula Period (2 1/4 - 5 1/4 h): Begins at the 128-cell stage or 8th zygotic cell cycle. The embryo enters the midblastula transition (MBT), the onset of zygotic transcription. The period ends at the onset of gastrulation.
Gastrula Period (5 1/4 - 10 1/3 h): Morphogenetic cell movements of involution, convergence, and extension occur, producing the primary germ layers and the embryonic axis.
Segmentation Period (10 1/3 - 24 h): Somites develop, the rudiments of the primary organs become visible, the tail bud becomes more prominent and the embryo elongates. The first cells differentiate morphologically, and the first body movements appear.
Pharyngula Period (24-48 h): The embryo develops to the phylotypic stage, when it possesses the classic vertebrate bauplan. Migration of the posterior lateral line primordium. Rapid organogenesis continues.
Hatching Period (48-72 h): Individuals within a single developing clutch hatch sporadically during the whole period.
Kimmel CB, et al. Stages of embryonic development of the zebrafish. Dev Dyn 203, 253-310 (1995).
Gene Expression in Zebrafish
• Created and sequenced 10,000 clones from a zebrafish brain
and eye cDNA library.
• After a BLAST search against human chromosome 22, obtained the set of
zebrafish cDNA clones corresponding to several predicted
human chromosome 22 genes.
• Picked an EST whose expression profile matched a hypothetical
protein with an EST from a human fetal brain library.
Gene Expression in Zebrafish (cont)
• An antisense RNA hybridization probe was generated by in vitro
transcription in the presence of dig-UTP after cloning into an
expression vector.
• Whole mount in situ hybridization was performed on 24, 48, and 72 hours post-fertilization zebrafish embryos.
• Hybridization was detected by anti-dig antibody.
Probe1 b6 at 24 hpf, 48 hpf and 72 hpf
Probe1 b6 shows hybridization in the brain from 24 hours onward and in the eye
from 48 hours onward.
1b6: AP000557.1.mRNA chr22 position:18495442-18504448 KIAA1020 hypothetical protein matches EST b6n20zf
Exon-specific gene expression in zebrafish
embryos during development that is
amenable to automation
Incorporated mouse in situ methods for zebrafish that:
• shorten the length of probes from 1000 bp to 100 bp, giving
exon-specific probes,
• run hybridizations in a 96 well multiplex microtiter plate format,
• use digoxigenin labeled ssDNA probes generated by
asymmetric, single primer amplification off a PCR product (eliminating
sub-cloning of each PCR product into T3/T7 expression
vectors), and
• eliminate the spurious labeling of the eye by introducing
glycine as the reagent of choice to rapidly inhibit the
proteinase K used to increase permeability of the embryos.
Whole mount in situ hybridization with
ssDNA-digoxigenin labeled probe
made from a PCR product. Brain-specific expression of this mRNA
during embryonic development
The importance of a “no probe” antibody staining
control to determine if any probe-independent
antibody staining occurs in the lens
Typically only the anti-sense probe hybridizes,
and is therefore stained by anti-dig antibody, with
some probe-independent staining in the eye.
Anti-sense probe
Sense probe
No probe
72 hour post fertilization embryo
A probe to the unique 3’ UTR if
there are multiple paralogs
One last experiment with a surprise ending
Hybridization probe a8h24 unique to 3’ UTR of zebrafish gene 2 based on our zebrafish EST sequence
One too many controls sometimes
results in a surprise observation
Both the anti-sense and sense probes hybridized
to 72 hour post fertilization embryonic brain.
Anti-sense probe
Sense probe
No probe
Indicating RNA transcribed from
the opposite, non-coding strand?
What’s next for our Genome Center?
• Participate in sequencing the mouse, chimp, baboon,
lemur, bovine, dog, cat, chicken and zebrafish
genomes concentrating on:
• Regions of high biological interest and
• Regions orthologous to human chromosome 22
• Sequence the Medicago truncatula (barrel medic, a close relative of alfalfa) genome
using a mapped BAC-based approach concentrating
on coding regions
• Continued sequencing of selected pathogenic bacteria
• Investigate the function of the predicted genes with
unknown function in the zebrafish system first by
whole mount in situ and then expression knock down
experiments with morpholino oligos.
Laboratory Organization
Bruce Roe, PI
Support Teams
Informatics
Production
DNA Synthesis
Jim White
Steve Kenton
Hongshing Lai
Sean Qian***
Phoebe Loh*
Rose Morales-Diaz*
Sulan Qi
Mounir Elharam*
Bart Ford*
Steve Shaull**
Doug White
Work-study Undergraduate students**
Reagents &
Equip. Maint.
Administration
Mounir Elharam*
Doug White
Clayton Powell**
KayLynn Hale
Dixie Wishnuck
Tami Womack
Mary Catherine Williams
Research Teams
Doris Kupfer
Julia Kim*
Sun So
Graham Wiley**
Limei Yang
Angie Prescott*
Audra Wendt**
Mandi Aycock**
Fu Ying
Liping Zhou
Ruihua Shi****
Junjie Wu****
Trang Do
Anh Do
Lily Fu
Yang Ye**
Tessa Manning**
Ziyun Yao***
Steve Shaull*
Youngju Yoon****
Stephan Deschamps***
Shelly Oommen****
Christopher Lau****
ShaoPing Lin***
Honggui Jia
Hongming Wu
Baifang Qin
Peng Zhang
Axin Hua***
Weihong Xu****
Yanhong Li
Fares Najar***
Chunmei Qu
Keqin Wang
Shuling Li
Lin Song****
Ying Ni
Huarong Jiang
Funding from the NHGRI, Noble Foundation, DOE, NSF (pending)
Collaborators at Sanger, CWRU, CHOP, Keio, UIUC and Riken
Jami Milam****
Sara Downard**
Ging Sobhraksha**
Phoebe Loh*
Sulan Qi
Bart Ford*
* Previous undergraduate res. student
** Present undergraduate res. student
*** Previous graduate student
**** Present graduate student
The ACGT Team
Peggy and Charles Stephenson Center