Transcript Slide 1

Hybridization capture, high-throughput sequencing and its implications for ancient DNA research
Michael Hofreiter
Is science becoming infantilized?
Our young people are undisciplined and sleazy. They do not listen
to their parents anymore. The end of the world is near.
Ur, Chaldea, 2,000 BC
Is ancient DNA research infantile?
It’s a zebra
Higuchi et al. 1984, Nature
However...........
Watson and Crick 1953 was also a short Nature paper
Simple stories are not always bad
There are lies for children and lies for adults
Terry Pratchett
Some more reflections
Not all investigations deserve equal respect.
Observations alone do not always make sense.
What do we really learn from genomic data?
How to win a Nobel prize?
I don’t know.
From no data to drowning in data
The latest fancy piece of kit
~ 200 Gb total sequence
~ 1 billion individual reads
The latest throughput
Increase in sequencing
Mammoth
Palaeo-Eskimos
Neanderthals
Data first?
So what have we learned from ancient genomes
Mammoth genome draft:
Hm............
Saqqaq genome:
Migrated from Arctic north-east Asia 5,500 B.P.
Neanderthal genome draft:
Diverged from modern humans ~ 0.4 mya
Maybe gene flow into modern human gene pool
Genetic regions were selected on human lineage
And what did they cost?
Mammoth genome draft:
~ $ 800,000
Saqqaq genome:
$ 500,000
Neanderthal genome draft:
$ 6.4 million
The disadvantages
Shotgun sequencing
Made for a maximum of 8 samples
Costs - $ 20,000 per run
Another problem
Percentage endogenous DNA: Neandertal 4.0%
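The two problems above can be put into numbers. The following is a minimal sketch, assuming that the $20,000 run is shared equally by the 8 samples and that the 4.0% endogenous figure applies to such a run; the function name and the equal-share assumption are illustrative and not taken from the talk.

```python
def shotgun_cost_breakdown(run_cost_usd, samples_per_run, endogenous_fraction):
    """Split the cost of one shotgun run per sample and per endogenous share."""
    # Assumption: every sample gets an equal share of the run.
    cost_per_sample = run_cost_usd / samples_per_run
    # Part of the per-sample cost that buys reads from the target organism.
    endogenous_spend = cost_per_sample * endogenous_fraction
    # Amount of sequencing needed per unit of endogenous sequence.
    fold_overhead = 1.0 / endogenous_fraction
    return cost_per_sample, endogenous_spend, fold_overhead


per_sample, useful, overhead = shotgun_cost_breakdown(20_000, 8, 0.04)
print(f"cost per sample: ${per_sample:,.0f}")
print(f"spent on endogenous reads: ${useful:,.0f}")
print(f"sequencing overhead: {overhead:.0f}x")
```

Under these assumptions most of the money for a sample with 4.0% endogenous DNA goes into sequencing non-target molecules, which is the motivation for enriching the target before sequencing.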
Are more data better data?
[Figure: NJ tree, 123 sequences, 250 bp control region. Clade labels: “spelaeus”, “eremus”, “ladinicus”, “rossicus”, “ingressus”, “kudarensis”, outgroups (brown bear, U. americanus). Support values shown at nodes.]
[Figure: Condensed NJ tree, 50% bootstrap cutoff, 123 sequences, 250 bp D-loop. Clade labels: “spelaeus”, “eremus”, “ladinicus”, “rossicus”, “ingressus”, “kudarensis”, outgroups. Only support values of 50 or above remain at nodes; one is flagged “51 !”.]
Different PCR types
[Figure: Combined NJ, ML and Bayesian tree based on 9,632 bp of 2 published and 31 additional cave bear specimens. Clades labelled Ursus spelaeus, Ursus ingressus and Ursus kudarensis; node support given as triplets such as 100-100-1.0.]
Results of DMPS
Between 13.0 and 16.5 kb of replicated sequence for each of the 31 individuals
~1.0 Mb of targeted aDNA sequence data
Requirements for PCR
PCR
Primer F: min. 20 bp
Target: min. 30 bp
Primer R: min. 20 bp
Min. molecule length: 70 bp
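The minimum molecule length is simply the sum of the two primer sites and the target between them. A minimal sketch of that arithmetic follows; the function names and the example fragment lengths are hypothetical and only serve as illustration.

```python
def min_template_length(primer_f_bp=20, target_bp=30, primer_r_bp=20):
    """Shortest molecule that still spans both primer sites and the target."""
    return primer_f_bp + target_bp + primer_r_bp


def amplifiable(fragment_lengths_bp, min_len_bp):
    """Keep only fragments long enough to serve as a PCR template."""
    return [n for n in fragment_lengths_bp if n >= min_len_bp]


min_len = min_template_length()
# Hypothetical fragment lengths in bp, not data from the talk.
example_fragments = [28, 35, 41, 52, 66, 70, 85, 120]
usable = amplifiable(example_fragments, min_len)
print(min_len, usable)
```

With the slide's values the threshold is 70 bp, so every shorter fragment in an extract is invisible to PCR.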
Fragment length in ancient DNA
½ fragment size = 2 to 100x the number of molecules
[Plot: frequency against fragment length in bp; axis marks at 30, 50 and 70 bp]
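One way to see where a range as wide as 2 to 100x can come from is to assume that the number of molecules falls off exponentially with fragment length. That model is an assumption made here for illustration and is not stated on the slide; the decay constants below are derived from the slide's range and are not measured values.

```python
import math


def surviving_fraction(min_len_bp, decay_per_bp):
    """Fraction of molecules at least min_len_bp long (exponential model)."""
    return math.exp(-decay_per_bp * min_len_bp)


def fold_gain_from_halving(min_len_bp, decay_per_bp):
    """How many times more molecules are usable if the length limit is halved."""
    return (surviving_fraction(min_len_bp / 2, decay_per_bp)
            / surviving_fraction(min_len_bp, decay_per_bp))


def decay_for_fold_gain(min_len_bp, fold_gain):
    """Decay constant that would produce the given gain when halving the limit."""
    return 2.0 * math.log(fold_gain) / min_len_bp


for fold in (2, 100):
    decay = decay_for_fold_gain(70, fold)
    check = fold_gain_from_halving(70, decay)
    print(f"{fold}x gain from 70 bp to 35 bp needs decay = {decay:.4f} per bp "
          f"(check: {check:.1f}x)")
```

Under this model the gain from halving a 70 bp limit depends only on how steeply the length distribution drops, so samples with different degrees of fragmentation can plausibly sit at either end of the range.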
DNA hybridization capture
• ~5 Mb targeted per array
• 7 arrays, whole exome
• ~98% of exons retrieved
• 300,000 primer pairs for aDNA
• 6,000 LR-PCRs for modern DNA
[Diagram labels: probes, glass slide]
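The PCR equivalents in the list can be checked with a short calculation. This is a sketch under the assumption that both figures refer to the full target of 7 arrays at about 5 Mb each; the slide does not say so explicitly, and the function name is made up for the example.

```python
def bp_per_reaction(total_target_bp, n_reactions):
    """Average stretch of target each PCR would have to cover."""
    return total_target_bp / n_reactions


arrays = 7
target_per_array_bp = 5_000_000  # "~5 Mb targeted per array"
total_target_bp = arrays * target_per_array_bp

adna_amplicon_bp = bp_per_reaction(total_target_bp, 300_000)
lr_pcr_amplicon_bp = bp_per_reaction(total_target_bp, 6_000)
print(f"total target: {total_target_bp / 1e6:.0f} Mb")
print(f"per aDNA primer pair: {adna_amplicon_bp:.0f} bp")
print(f"per long-range PCR: {lr_pcr_amplicon_bp:.0f} bp")
```

The resulting amplicon sizes, roughly a hundred base pairs for ancient DNA and several kilobases for long-range PCR on modern DNA, are in line with what each method can typically amplify.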
Ancient DNA capture
Science 2010
[Figure: NJ tree, 123 sequences, 250 bp control region, with clade labels “spelaeus”, “eremus”, “ladinicus”, “rossicus”, “ingressus”, “kudarensis” and outgroups.]
The costs
Capture array           up to 1 million features   £350 each
SureSelect 10 rxns      200 kb – 6.6 Mb            £6,638
SureSelect 100 rxns     200 kb                     £30,777
SureSelect 1,000 rxns   200 kb                     £107,719
=> Home-made solutions
=> Multiplexing
Barcoding
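The kit prices translate into very different costs per reaction, and barcoded multiplexing lowers the cost per sample further. A minimal sketch follows, using the SureSelect prices listed above; the pool size of 8 samples per reaction is a hypothetical value, and the calculation ignores the cost of barcoding and library preparation.

```python
# Kit size (reactions) mapped to kit price in GBP, as listed on the slide.
SURESELECT_PRICES_GBP = {10: 6_638, 100: 30_777, 1_000: 107_719}


def cost_per_reaction(kit_price_gbp, reactions):
    """Price of a single capture reaction within a kit."""
    return kit_price_gbp / reactions


def cost_per_sample(kit_price_gbp, reactions, samples_per_reaction=1):
    """Capture cost per sample when barcoded samples share one reaction."""
    return cost_per_reaction(kit_price_gbp, reactions) / samples_per_reaction


for reactions, price in SURESELECT_PRICES_GBP.items():
    single = cost_per_sample(price, reactions)
    pooled = cost_per_sample(price, reactions, samples_per_reaction=8)
    print(f"{reactions:>5} rxns: £{single:,.2f} per reaction, "
          f"£{pooled:,.2f} per sample with 8 pooled samples")
```

The per-reaction price only becomes low for the largest kit, which is why home-made probes and multiplexing are attractive for smaller projects.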
So.....................
How does it work?
Sometimes well
Long range versus capture
And sometimes not so well
Jumping artefacts
Clade 1
Clade 2
Clade 3
Possible capture methodologies
Methodology            Results             Problems
SureSelect             no experience yet   high costs
Array capture          mammoth mtDNA       jumping artefacts
PEC                    mammoth nuDNA       limited sensitivity, high costs
454, biotin adaptors   Castor mtDNA        length limited
454, biotin UTP        Castor mtDNA        length limited
Illumina, biotin UTP   Castor mtDNA        length limited, jumping artefacts
Dynalbeads
In solution
Capture advantages
High sequence yield per sample aliquot
Time and work efficient
Higher sensitivity than PCR
Capture disadvantages
High costs
Sometimes low on-target ratio
Problems with multiplexing
Generally jumping artefacts
Summary for capture
In the long term, little alternative if large amounts of data are required
Also, some methods have better sensitivity than PCR
Multiplexing problems, especially for low-complexity data, need resolving
Currently not suitable for routine applications
Methodological development required
Some final thoughts
How should blank controls be done? And how many?
What does contamination mean when you have 20 million sequence reads?
How shall we replicate the data?
Is independent replication possible? And is it necessary?
Thanks
Molecular Ecology
• Many people
• Adrian Briggs, Harvard Medical School
• Kevin Campbell, University of Manitoba
• Research Group Molecular Ecology
• Sequencing group in Leipzig
• MPG, DFG and Volkswagen Foundation for money
• University of York
• For your attention