Nosce te ipsum: the human genome

Nosce te ipsum:
the human genome
3.3 billion base pairs
MCB140, 12-1-06
Declaration of Independence
WHEN in the Course of human Events, it becomes necessary for one People to dissolve the Political Bands which have connected them with another, and to assume among the
Powers of the Earth, the separate and equal Station to which the Laws of Nature and of Nature's God entitle them, a decent Respect to the Opinions of Mankind requires that
they should declare the causes which impel them to the Separation. WE hold these Truths to be self-evident, that all Men are created equal, that they are endowed by their
Creator with certain unalienable Rights, that among these are Life, Liberty and the Pursuit of Happiness -- That to secure these Rights, Governments are instituted among Men,
deriving their just Powers from the Consent of the Governed, that whenever any Form of Government becomes destructive of these Ends, it is the Right of the People to alter or
to abolish it, and to institute new Government, laying its Foundation on such Principles, and organizing its Powers in such Form, as to them shall seem most likely to effect their
Safety and Happiness. Prudence, indeed, will dictate that Governments long established should not be changed for light and transient Causes; and accordingly all Experience
hath shewn, that Mankind are more disposed to suffer, while Evils are sufferable, than to right themselves by abolishing the Forms to which they are accustomed. But when a
long Train of Abuses and Usurpations, pursuing invariably the same Object, evinces a Design to reduce them under absolute Despotism, it is their Right, it is their Duty, to throw
off such Government, and to provide new Guards for their future Security. Such has been the patient Sufferance of these Colonies; and such is now the Necessity which
constrains them to alter their former Systems of Government. The History of the present King of Great- Britain is a History of repeated Injuries and Usurpations, all having in
direct Object the Establishment of an absolute Tyranny over these States. To prove this, let Facts be submitted to a candid World. HE has refused his Assent to Laws, the most
wholesome and necessary for the public Good. HE has forbidden his Governors to pass Laws of immediate and pressing Importance, unless suspended in their Operation till his
Assent should be obtained; and when so suspended, he has utterly neglected to attend to them. HE has refused to pass other Laws for the Accommodation of large Districts of
People, unless those People would relinquish the Right of Representation in the Legislature, a Right inestimable to them, and formidable to Tyrants only. HE has called together
Legislative Bodies at Places unusual, uncomfortable, and distant from the Depository of their public Records, for the sole Purpose of fatiguing them into Compliance with his
Measures. HE has dissolved Representative Houses repeatedly, for opposing with manly Firmness his Invasions on the Rights of the People. HE has refused for a long Time,
after such Dissolutions, to cause others to be elected; whereby the Legislative Powers, incapable of the Annihilation, have returned to the People at large for their exercise; the
State remaining in the mean time exposed to all the Dangers of Invasion from without, and the Convulsions within. HE has endeavoured to prevent the Population of these
States; for that Purpose obstructing the Laws for Naturalization of Foreigners; refusing to pass others to encourage their Migrations hither, and raising the Conditions of new
Appropriations of Lands. HE has obstructed the Administration of Justice, by refusing his Assent to Laws for establishing Judiciary Powers. HE has made Judges dependent on
his Will alone, for the Tenure of their Offices, and the Amount and Payment of their Salaries. HE has erected a Multitude of new Offices, and sent hither Swarms of Officers to
harrass our People, and eat out their Substance. HE has kept among us, in Times of Peace, Standing Armies, without the consent of our Legislatures. HE has affected to
render the Military independent of and superior to the Civil Power. HE has combined with others to subject us to a Jurisdiction foreign to our Constitution, and unacknowledged
by our Laws; giving his Assent to their Acts of pretended Legislation: FOR quartering large Bodies of Armed Troops among us; FOR protecting them, by a mock Trial, from
Punishment for any Murders which they should commit on the Inhabitants of these States: FOR cutting off our Trade with all Parts of the World: FOR imposing Taxes on us
without our Consent: FOR depriving us, in many Cases, of the Benefits of Trial by Jury: FOR transporting us beyond Seas to be tried for pretended Offences: FOR abolishing
the free System of English Laws in a neighbouring Province, establishing therein an arbitrary Government, and enlarging its Boundaries, so as to render it at once an Example
and fit Instrument for introducing the same absolute Rules into these Colonies: FOR taking away our Charters, abolishing our most valuable Laws, and altering fundamentally
the Forms of our Governments: FOR suspending our own Legislatures, and declaring themselves invested with Power to legislate for us in all Cases whatsoever. HE has
abdicated Government here, by declaring us out of his Protection and waging War against us. HE has plundered our Seas, ravaged our Coasts, burnt our Towns, and destroyed
the Lives of our People. HE is, at this Time, transporting large Armies of foreign Mercenaries to compleat the Works of Death, Desolation, and Tyranny, already begun with
circumstances of Cruelty and Perfidy, scarcely paralleled in the most barbarous Ages, and totally unworthy the Head of a civilized Nation. HE has constrained our fellow Citizens
taken Captive on the high Seas to bear Arms against their Country, to become the Executioners of their Friends and Brethren, or to fall themselves by their Hands. HE has
excited domestic Insurrections amongst us, and has endeavoured to bring on the Inhabitants of our Frontiers, the merciless Indian Savages, whose known Rule of Warfare, is an
undistinguished Destruction, of all Ages, Sexes and Conditions. IN every stage of these Oppressions we have Petitioned for Redress in the most humble Terms: Our repeated
Petitions have been answered only by repeated Injury. A Prince, whose Character is thus marked by every act which may define a Tyrant, is unfit to be the Ruler of a free
People. NOR have we been wanting in Attentions to our British Brethren. We have warned them from Time to Time of Attempts by their Legislature to extend an unwarrantable
Jurisdiction over us. We have reminded them of the Circumstances of our Emigration and Settlement here. We have appealed to their native Justice and Magnanimity, and we
have conjured them by the Ties of our common Kindred to disavow these Usurpations, which, would inevitably interrupt our Connections and Correspondence. They too have
been deaf to the Voice of Justice and of Consanguinity. We must, therefore, acquiesce in the Necessity, which denounces our Separation, and hold them, as we hold the rest of
Mankind, Enemies in War, in Peace, Friends. WE, therefore, the Representatives of the UNITED STATES OF AMERICA, in GENERAL CONGRESS, Assembled, appealing to
the Supreme Judge of the World for the Rectitude of our Intentions, do, in the Name, and by Authority of the good People of these Colonies, solemnly Publish and Declare, That
these United Colonies are, and of Right ought to be, FREE AND INDEPENDENT STATES; that they are absolved from all Allegiance to the British Crown, and that all political
Connection between them and the State of Great-Britain, is and ought to be totally dissolved; and that as FREE AND INDEPENDENT STATES, they have full Power to levy War,
conclude Peace, contract Alliances, establish Commerce, and to do all other Acts and Things which INDEPENDENT STATES may of right do. And for the support of this
Declaration, with a firm Reliance on the Protection of divine Providence, we mutually pledge to each other our Lives, our Fortunes, and our sacred Honor.
D. o. I.: 6,810 characters
1 human genome: 484,581 DoI units
Hartwell et al.: 900 pages
1 human genome: 538 Hartwell units
1 Hartwell = 1.25 inches
1 human genome printed at DoI density and
bound into Hartwell units will rise to …
672 inches = 56 feet
or 9.5 Fyodor units
(1 Fyodor unit = 5 feet 11 inches)
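The unit conversions on this slide can be checked with a few lines of arithmetic. This is only a sketch: it assumes one Declaration-sized block of text per page (which is what dividing DoI units by 900 pages implies), and the variable names are made up for the illustration.

```python
# Inputs taken from the slides; names are illustrative only.
GENOME_BP = 3_300_000_000        # 3.3 billion base pairs
DOI_CHARS = 6_810                # characters in the Declaration of Independence
HARTWELL_PAGES = 900             # pages in Hartwell et al.
HARTWELL_INCHES = 1.25           # thickness of one Hartwell
FYODOR_INCHES = 5 * 12 + 11      # 5 feet 11 inches

# Assumption: one DoI-sized block of text fits on one page.
doi_units = GENOME_BP / DOI_CHARS
hartwell_units = doi_units / HARTWELL_PAGES
stack_inches = hartwell_units * HARTWELL_INCHES
stack_feet = stack_inches / 12
fyodor_units = stack_inches / FYODOR_INCHES

print(f"DoI units:      {doi_units:,.0f}")
print(f"Hartwell units: {hartwell_units:,.0f}")
print(f"Stack height:   {stack_inches:,.0f} in = {stack_feet:.0f} ft")
print(f"Fyodor units:   {fyodor_units:.1f}")
```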
0.3% of the genome
U. Laemmli
C-value paradox
Amount of DNA = f (organism complexity)
1. human (3.3109)> fly > yeast > bacteria
2. Amphibia > > > human
3. Tulip = 10x human [sic!]
4. Amoeba dubia = 200x human
5. Broad bean = 4x kidney bean
6. Lily = 100x Arabidopsis
→ unicellular organisms are under selective
pressure to have small genomes
How to measure the “complexity”
and composition of a genome
1. Shear the DNA to a size of about 400 bp.
2. Denature the DNA by heating to 100 °C.
3. Slowly cool and take samples at different time
intervals.
4. Determine the % single-stranded DNA at each
time point.
The shape of a "C0t" curve for a given species is a
function of two factors:
1. the size or complexity of the genome
2. the amount of repetitive DNA within the genome
http://www.ndsu.nodak.edu/instruct/mcclean/plsc431/eukarychrom/eukaryo3.htm
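The two factors above can be made concrete with the textbook idealization of reassociation as a second-order reaction, where the fraction still single-stranded is 1 / (1 + k · C0t). The sketch below is not taken from the slides: the rate constants and the 50/50 split between repeats and single-copy DNA are hypothetical values chosen only to show the two-step shape of the curve.

```python
def fraction_single_stranded(cot, k):
    """Ideal second-order reassociation: C/C0 = 1 / (1 + k * C0t)."""
    return 1.0 / (1.0 + k * cot)


def mixed_curve(cot, components):
    """Sum over genome components given as (fraction_of_genome, rate_constant)."""
    return sum(f / (1.0 + k * cot) for f, k in components)


# Hypothetical genome: half fast-reannealing repeats, half slow single-copy DNA.
HYPOTHETICAL_GENOME = [(0.5, 10.0), (0.5, 0.001)]

for exponent in range(-3, 5):
    cot = 10.0 ** exponent
    value = mixed_curve(cot, HYPOTHETICAL_GENOME)
    print(f"C0t = {cot:>8g}   fraction single-stranded = {value:.2f}")
```

In this idealization a larger, more complex single-copy component has a smaller k and reanneals later, while a larger repetitive fraction deepens the early drop.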
C0t curve
human
http://www.ndsu.nodak.edu/instruct/mcclean/plsc431/eukarychrom/eukaryo3.htm
“Why sequence junk?!”
~50% of genome is repetitive DNA
~5% of genome is genes
Genome sequencing costs $1 a base.
= 3.3 billion dollars to sequence the genome
S. Brenner – Fugu (pufferfish) – compact
genome!
Say again?
Li et al. Nature 409: 847 (2001).
Repeats: ~45% of genome
Lander et al. (2001) Nature 409: 860.
Repetitive DNA in the HGO gene (mutation causes alkaptonuria – A. Garrod, 1902)
L1 (LINE) – non-LTR
retrotransposon
“L1s account directly or indirectly for about
one-third of the human genome…”
Kazazian and Goodier Cell 110: 277 (2002).
Alu (SINE) – 7SL RNA gene
• 7SL – component of the SRP
• Most Alu elements are inactive
• A few Alu elements can still retropose, are
mutagenic, and cause disease:
Wallace et al. (Collins) (1991)
A de novo Alu insertion results in neurofibromatosis type 1
Nature. 1991 Oct 31;353(6347):864-6.
The cost of sexual reproduction
• Sexual reproduction favors Tn propagation
because the fitness of a transposon is
twice that of its host
• Positive correlation between dependence
on sex for reproduction and Tn
aggressiveness in germline
• Vertebrates, of course, are obligate sexual
outcrossers
T.H. Bestor (2003) Trends Genet. 19: 185
Simple point
Must map genome before sequencing it:
individual sequence read < 1,000 bp
“genetic”
RFLP
VNTR
STR
SNP
STS
EST
Gross chromosome structure:
G-banding (use Giemsa stain)
= split karyotype into 300 bands
Dr. Thomas Ried, NCI/NIH:
SKY (spectral karyotyping)
Marker:
a DNA sequence that occurs somewhere in the human
genome in a known location relative to other markers.
For a marker to be useful, we need a way to detect it.
Restriction fragment length polymorphism (RFLP)
11.6
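A minimal sketch of why a polymorphism in a restriction site shows up as a fragment-length difference on a blot. The region length, cut positions, and probe coordinates are hypothetical; the only idea taken from the slide is that alleles differing at a restriction site give different fragment lengths when detected with the same probe.

```python
def probed_fragment_lengths(region_length, cut_positions, probe):
    """Lengths of the restriction fragments that overlap the probe interval."""
    edges = [0] + sorted(cut_positions) + [region_length]
    probe_start, probe_end = probe
    return [
        right - left
        for left, right in zip(edges, edges[1:])
        if left < probe_end and right > probe_start
    ]


REGION = 10_000            # hypothetical 10 kb region
PROBE = (5_500, 6_000)     # hypothetical single-copy probe

# Allele 1 carries a restriction site at 5,000; allele 2 has lost it.
allele_1 = probed_fragment_lengths(REGION, [2_000, 5_000, 9_000], PROBE)
allele_2 = probed_fragment_lengths(REGION, [2_000, 9_000], PROBE)

print("allele 1 band(s):", allele_1)
print("allele 2 band(s):", allele_2)
```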
11.7
The first map
Botstein D, White RL, Skolnick M, Davis RW. Am J Hum Genet 1980; 32(3).
Construction of a genetic linkage map in man using restriction
fragment length polymorphisms.
We describe a new basis for the construction of a genetic linkage map
of the human genome. The basic principle of the mapping scheme is to
develop, by recombinant DNA techniques, random single-copy DNA
probes capable of detecting DNA sequence polymorphisms, when
hybridized to restriction digests of an individual's DNA. Each of these
probes will define a locus. Loci can be expanded or contracted to
include more or less polymorphism by further application of
recombinant DNA technology. Suitably polymorphic loci can be tested
for linkage relationships in human pedigrees by established methods;
and loci can be arranged into linkage groups to form a true genetic map
of "DNA marker loci." Pedigrees in which inherited traits are known to
be segregating can then be analyzed, making possible the mapping of
the gene(s) responsible for the trait with respect to the DNA marker loci,
without requiring direct access to a specified gene's DNA. For inherited
diseases mapped in this way, linked DNA marker loci can be used
predictively for genetic counseling.
Construction of a high-resolution
genetic map
PCR
Kary B. Mullis
VNTRs: variable number tandem repeats
(STR: short tandem repeat -- same, but shorter)
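A tandem-repeat marker is scored by how many copies of the repeat unit an allele carries, which PCR reads out as product length. The sketch below counts repeat units in two made-up alleles; the flanking sequences and the CA repeat unit are illustrative assumptions, not data from the lecture.

```python
import re


def longest_tandem_run(sequence, unit):
    """Number of repeat units in the longest uninterrupted run of `unit`."""
    runs = re.findall(f"(?:{re.escape(unit)})+", sequence)
    return max((len(run) // len(unit) for run in runs), default=0)


# Hypothetical alleles: identical flanks, different numbers of CA repeats.
LEFT_FLANK, RIGHT_FLANK = "GGTTAGC", "TTGACCG"
allele_a = LEFT_FLANK + "CA" * 12 + RIGHT_FLANK
allele_b = LEFT_FLANK + "CA" * 15 + RIGHT_FLANK

for name, allele in (("a", allele_a), ("b", allele_b)):
    repeats = longest_tandem_run(allele, "CA")
    print(f"allele {name}: {repeats} repeats, product length {len(allele)} bp")
```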
11.12
11.12
Centre d’Etude du Polymorphisme Humain
517 individuals
40 three-generation families
5,264 SSLPs (specifically, STRs)
Genotyped everyone for each one (gasp).
230/225
227/223
230 / 223
5.12
By ~2001, there were 8,031 STRs.
2-3 markers at ~ every centimorgan (1,000,000 bp!!)
NOT GOOD ENOUGH
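A quick check of the marker density, as a sketch. It uses the slides' own numbers (3.3 billion bp, 8,031 STRs, and the rule of thumb that 1 centimorgan is about 1,000,000 bp) and compares the resulting spacing with the sub-1,000 bp length of a single sequence read.

```python
GENOME_BP = 3_300_000_000
BP_PER_CENTIMORGAN = 1_000_000   # rule of thumb from the slide
STR_MARKERS = 8_031
MAX_READ_BP = 1_000              # "individual sequence read < 1,000 bp"

genome_cm = GENOME_BP / BP_PER_CENTIMORGAN
markers_per_cm = STR_MARKERS / genome_cm
mean_spacing_bp = GENOME_BP / STR_MARKERS
reads_between_markers = mean_spacing_bp / MAX_READ_BP

print(f"markers per cM:        {markers_per_cm:.1f}")
print(f"mean marker spacing:   {mean_spacing_bp:,.0f} bp")
print(f"reads to span one gap: at least {reads_between_markers:,.0f}")
```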
STS:
sequence-tagged site:
a unique sequence in the human genome.
Two maps

Genetic map
Two genes (loci) on the same
chromosome will become
separated if a recombination
event occurs between them.
Recombination is governed by
rules of meiosis: the genetic
distance between two loci is a
complex function of the actual
distance between two loci.

Physical map
Nature, shmature.
Shear the DNA randomly (by X-rays):
two genes (loci) become
separated if a break occurs
between them.
Shearing is governed by rules of
physics (ahem, the Poisson
distribution = $100) – loci that are
further apart will tend to become
separated more frequently –
physical distance between two
loci measured this way is more
accurate.
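The Poisson argument can be sketched numerically. Assuming breaks fall uniformly at random along the DNA with a given mean spacing (an idealization, and the 1 Mb value below is a hypothetical choice), the chance that at least one break separates two loci is 1 - exp(-distance / mean spacing).

```python
import math


def prob_separated(distance_bp, mean_break_spacing_bp):
    """P(at least one break between two loci) under a uniform Poisson model."""
    return 1.0 - math.exp(-distance_bp / mean_break_spacing_bp)


MEAN_SPACING_BP = 1_000_000  # hypothetical mean distance between breaks

for distance in (10_000, 100_000, 1_000_000, 5_000_000):
    p = prob_separated(distance, MEAN_SPACING_BP)
    print(f"{distance:>9,} bp apart: separated with probability {p:.3f}")
```

Because the probability rises smoothly with distance, the observed breakage frequency between two markers can be inverted into an estimate of how far apart they are.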
Radiation hybrid mapping
(construction of a high-resolution
physical map)
The best part
Control levels of radiation → control
fragment size
CONTROL RESOLUTION OF MAP
Name: radiation → avg. size → resolution
GeneBridge 4: 3,000 rad → 25,000,000 bp → 1 Mb
Stanford G3: 10,000 rad → 2,400,000 bp → 0.25 Mb
Stanford TNG: 50,000 rad → ? → < 100 kb
RH mapping
Science. 2001 Feb 16;291(5507):1298-302.
Our strategy involved an initial electronic analysis of genomic DNA
sequence to eliminate repetitive DNA sequences, followed by an
automated selection of oligonucleotide primers to generate PCR
products 90 to 350 bp in length under a single set of reaction
conditions, as described (9).
PCR products were assayed by ethidium bromide staining after
agarose gel electrophoresis. An STS was judged successful when the
primers produced a distinct PCR product of the expected size from total
human DNA and failed to produce a product of this size from either
hamster or mouse genomic DNA.
We generated a total of 41,234 human STSs that met these criteria. Of
these STSs, 14,953 were scored on rodent-human hybrid somatic cell
mapping panels to determine their chromosomal location (10, 11).
A total of 14,041 of these 14,953 STSs (94%) could be assigned to a
unique human chromosome.
These 14,041 chromosome-specific STSs, as well as the remaining
26,281 STSs not scored on the chromosomal mapping panel, were
used to construct a high-resolution RH map of the human genome as
described below.
Generate 41,234 STSs
Assign them all to chromosomes
Make large number of human-hamster RH lines
Genotype each one for each STS
Map them as if this were a cross
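One way to picture "map them as if this were a cross": each STS gets a vector of PCR results across the hybrid lines, and two STSs that are close together tend to be retained or lost together. The sketch below scores that with a simple discordance fraction. The retention data are invented, and real RH mapping uses a proper statistical model of breakage and retention rather than this raw score.

```python
def discordance(vec_a, vec_b):
    """Fraction of hybrid lines in which exactly one of two markers is retained."""
    pairs = list(zip(vec_a, vec_b))
    return sum(a != b for a, b in pairs) / len(pairs)


# Hypothetical PCR scores (1 = product present) across eight hybrid lines.
retention = {
    "STS_A": [1, 0, 1, 1, 0, 0, 1, 0],
    "STS_B": [1, 0, 1, 0, 0, 0, 1, 0],
    "STS_C": [0, 0, 1, 0, 1, 0, 1, 1],
}

for first, second in (("STS_A", "STS_B"), ("STS_B", "STS_C"), ("STS_A", "STS_C")):
    score = discordance(retention[first], retention[second])
    print(f"{first} vs {second}: discordance {score:.3f}")
```

With these made-up scores the pairwise discordances are consistent with the order A, B, C.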
Result
Can integrate genetic and physical map!!!
BAC
Bacterial artificial chromosome – large piece
of some other genome that is sustainable
in bacteria.
CHORI:
Split entire genome into BACs and order
BACs by STSs.
Fingerprinting
Run 242-well gels.
On each one, 50 marker lanes and 192
BACs, each digested with the same
restriction enzyme – pattern of bands
unique for each BAC.
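Fingerprints become useful when they are compared: two BACs that share many bands probably overlap. A minimal sketch of that comparison follows, with invented band sizes and an assumed sizing tolerance of 10 bp; it is not the scoring scheme actually used at CHORI.

```python
def shared_bands(bands_a, bands_b, tolerance=10):
    """Count bands in A that match a not-yet-used band in B within `tolerance` bp."""
    remaining = sorted(bands_b)
    shared = 0
    for band in sorted(bands_a):
        for index, other in enumerate(remaining):
            if abs(band - other) <= tolerance:
                shared += 1
                del remaining[index]  # each band in B may be matched only once
                break
    return shared


# Hypothetical restriction-fragment sizes (bp) for three BACs.
bac_1 = [1200, 2500, 3100, 4800, 7600]
bac_2 = [650, 2505, 3100, 4795, 9000]
bac_3 = [500, 1800, 6000]

print("BAC 1 vs BAC 2:", shared_bands(bac_1, bac_2), "shared bands")
print("BAC 1 vs BAC 3:", shared_bands(bac_1, bac_3), "shared bands")
```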
10.10
Fingerprint @ CHORI
And?
Do 20,000 fingerprints a week.
Now, genotype each BAC for known STSs!!
STS
STS
10.7
10.11
Finally (well, not exactly)…
Once the physical and genetic maps of the
genome have been integrated, the
genome is “broken” down into “small
pieces” and each can be sequenced.
Reassembly of the completed sequence
from the pieces becomes possible
because each piece contains known
markers whose relationship to markers in
other pieces is known.
6 years…
Lander et al. (2001) Nature 409: 860.
Human Whole-Genome Shotgun Sequencing
“…The crux of our plan involves high-quality,
semiautomated sequencing from both ends of very large
numbers of randomly selected human genomic DNA
fragments. DNA of high molecular weight purified from at
least a few different human donors would be sheared, size-selected, and cloned into E. coli. Insert sizes would fall into
two classes. Long inserts would be 5-20 kb in size and
would be cloned into plasmid, phage, or possibly cosmid
vectors. Short inserts would be 0.4-1.2 kb in size and would
be cloned into plasmid vectors. Read lengths would be of
sufficient magnitude so that the two sequence reads from
the ends of the short inserts overlap. … Standard, gel-based methods would be utilized to generate at least
30 billion nucleotides of raw sequence (10-fold coverage of
the genome).”
Weber and Myers (1997) Genome Res. 7: 401.
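The "10-fold coverage" figure is total raw sequence divided by genome size. The sketch below reproduces it and adds the standard Lander-Waterman idealization, in which the chance that a given base is hit by no read is about exp(-coverage). Both the 3 billion bp genome size (implied by the quote) and the assumption of perfectly random, repeat-free sampling are simplifications.

```python
import math


def coverage(total_bases_sequenced, genome_size_bp):
    """Average number of times each base is sequenced."""
    return total_bases_sequenced / genome_size_bp


def expected_uncovered_fraction(c):
    """Idealized chance that a base is missed by every read at coverage c."""
    return math.exp(-c)


GENOME_BP = 3_000_000_000  # assumption implied by "30 billion = 10-fold"
c = coverage(30_000_000_000, GENOME_BP)
missed_bp = expected_uncovered_fraction(c) * GENOME_BP

print(f"coverage: {c:.0f}x")
print(f"expected bases never sequenced: about {missed_bp:,.0f}")
```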
The key idea – paired-end reads
“Sequencing from both ends of relatively long
insert subclones is an essential feature of the plan.
… Sequence information from both ends of
relatively long inserts dramatically improves the
efficiency of sequence assembly. In contrast to
single sequence reads from one end of shotgun
subclones, the pairs of sequence reads from both
ends have known spacing and orientation. Use of
relatively long insert subclones also aids in the
assembly of sequences containing interspersed
repetitive elements.”
Weber and Myers (1997) Genome Res. 7: 401.
twasbrilligandtheslithytovesdidgyreandgimble

Break into small bits and sequence:
wasb
yto
dgim
yrea
(and so on)
How to assemble into complete sequence?
What is the linkage relationship of each bit to every other
one?
What is their relative orientation and distance?
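When reads overlap, the overlaps themselves answer the linkage question. Below is a toy greedy assembler run on the same sentence. The reads are hypothetical 12-letter windows overlapping by 4 letters (the slide's own bits are too short to overlap), and the sketch works here only because no overlap falls inside the repeated "and"; a real genome full of repeats breaks this simple approach.

```python
TEXT = "twasbrilligandtheslithytovesdidgyreandgimble"


def overlap(a, b, min_len=3):
    """Length of the longest suffix of a that is also a prefix of b."""
    for n in range(min(len(a), len(b)), min_len - 1, -1):
        if a.endswith(b[:n]):
            return n
    return 0


def greedy_assemble(reads, min_len=3):
    """Repeatedly merge the pair of contigs with the largest overlap."""
    contigs = list(reads)
    while len(contigs) > 1:
        best_len, best_i, best_j = 0, None, None
        for i, a in enumerate(contigs):
            for j, b in enumerate(contigs):
                if i != j:
                    n = overlap(a, b, min_len)
                    if n > best_len:
                        best_len, best_i, best_j = n, i, j
        if best_len == 0:
            break  # nothing left to join
        merged = contigs[best_i] + contigs[best_j][best_len:]
        contigs = [c for k, c in enumerate(contigs) if k not in (best_i, best_j)]
        contigs.append(merged)
    return contigs


# Hypothetical reads: 12-letter windows every 8 letters, given out of order.
reads = [TEXT[start:start + 12] for start in (24, 0, 32, 16, 8)]
print(greedy_assemble(reads))
```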
twasbrilligandtheslithytovesdidgyreandgimble

Break into large bits:
sbrilligandthe
hytovesdidgyreandgi
wasbrilligandtheslithyto

Read just the end of each large bit:
sb??????????he
hy???????????????gi
wa????????????????????to
Now you know that wa and to are on the same piece of
DNA, and you know their orientation! Apply this info to
the shotgun sequence data.
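A sketch of how end reads constrain an assembly, using the three large bits from this slide. Each pair of end reads must appear in the assembly in the right order and exactly one insert length apart. The "collapsed" assembly is a hypothetical mistake of the kind a repeat can cause (the two copies of "and" merged into one); the function name and the two-letter read length are choices made for this illustration.

```python
TEXT = "twasbrilligandtheslithytovesdidgyreandgimble"
LARGE_BITS = ["sbrilligandthe", "hytovesdidgyreandgi", "wasbrilligandtheslithyto"]

# Read only two letters from each end, but remember the insert length.
mate_pairs = [(bit[:2], bit[-2:], len(bit)) for bit in LARGE_BITS]


def pair_is_satisfied(assembly, left, right, insert_len):
    """True if `left` starts and `right` ends an interval of length insert_len."""
    start = assembly.find(left)
    while start != -1:
        end = start + insert_len
        if assembly[end - len(right):end] == right:
            return True
        start = assembly.find(left, start + 1)
    return False


def check(assembly):
    return [pair_is_satisfied(assembly, *pair) for pair in mate_pairs]


collapsed = "twasbrilligandgimble"  # hypothetical misassembly across the repeat

print("correct assembly:  ", check(TEXT))
print("collapsed assembly:", check(collapsed))
```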
10.13
The Celera genome
Nine months. Twenty seven million sequence
reads (5x coverage).
“For our assembly operations, the total compute
infrastructure consists of 10 four-processor SMPs
with 4 gigabytes of memory per cluster (Compaq's
ES40, Regatta) and a 16-processor NUMA
machine with 64 gigabytes of memory (Compaq's
GS160, Wildfire). The total compute for a run of
the assembler was roughly 20,000 CPU hours.”
Important
Celera took the unfinished public genome
sequence data, shredded those data, and
used those in the assembly.
The key point of dispute
PGP: Celera did not achieve a true “whole
genome shotgun assembly” of the human
genome because they relied too much on
(publicly available) data from the publicly
funded human genome project.
Celera: that is not what happened.
PNAS 99: 3712 (2002).
PNAS 99: 4145 (2002).
There is, unfortunately, a lot of bad blood in
the human genome sequencing “community”
Richard Preston, The New Yorker 6/12/00:
“… “Craig Venter is an *******. He’s an idiot. He is
a thorn in people’s sides and an egomaniac,” a
senior scientist in the Human Genome Project said
to me recently.”
For the record: in my personal opinion, Dr. Venter
is a very gifted scientist and the work by Celera on
the human genome is a major contribution to
biology. = FDU
Reading
James Shreeve
“The Genome War”
John Sulston
“The Common Thread”