Transcript Slide 1
Hybridization capture, high-throughput sequencing and its implications for ancient DNA research
Michael Hofreiter

Is science becoming infantilized?
"Our young people are undisciplined and sleazy. They do not listen to their parents anymore. The end of the world is near."
Ur, Chaldäa, 2,000 BC

Is ancient DNA research infantile?
It’s a zebra
Higuchi et al. 1984, Nature

However...
Watson and Crick 1953 was also a short Nature paper
Simple stories are not always bad

"There are lies for children and lies for adults"
Terry Pratchett

Some more reflections
Not all investigations deserve equal respect.
Observations alone do not always make sense.
What do we really learn from genomic data?
How to win a Nobel prize? I don’t know.

From no data to drowning in data

The latest fancy piece of kit
~ 200 Gb total sequence
~ 1 billion individual reads

The latest throughput
Increase in sequencing

Mammoth
Palaeo-Eskimos
Neanderthals

Data first?

So what have we learned from ancient genomes?
Mammoth genome draft: Hm...
Saqqaq genome: Migrated from Arctic north-east Asia 5,500 B.P.
Neanderthal genome draft:
Diverged from modern humans ~ 0.4 mya
Maybe gene flow into modern human gene pool
Genetic regions were selected on human lineage

And what did they cost?
Mammoth genome draft: ~ $ 800,000
Saqqaq genome: $ 500,000
Neanderthal genome draft: $ 6.4 million

The disadvantages
Shotgun sequencing
Made for a maximum of 8 samples
Costs - $ 20,000 per run

Another problem
Neandertal 4.0%
Percentage endogenous DNA

Are more data better data?
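As a rough illustration of why the endogenous fraction matters for shotgun sequencing, the figures quoted on the slides can be combined in a few lines of Python. This is only a sketch: pairing the ~200 Gb yield with the $20,000 run cost is an assumption made here for illustration (the slides quote the two numbers separately), and the function names are hypothetical.

```python
# Figures quoted on the slides. Treating them as describing the same
# sequencing run is an assumption made for this illustration.
RUN_YIELD_GB = 200.0        # "~ 200 Gb total sequence"
RUN_COST_USD = 20_000.0     # "$ 20,000 per run"
ENDOGENOUS_FRACTION = 0.04  # "Neandertal 4.0%" endogenous DNA


def endogenous_yield_gb(total_gb: float, endogenous_fraction: float) -> float:
    """Sequence expected to come from the target organism, in Gb."""
    if not 0.0 < endogenous_fraction <= 1.0:
        raise ValueError("endogenous_fraction must be in (0, 1]")
    return total_gb * endogenous_fraction


def cost_per_endogenous_gb(run_cost: float, total_gb: float,
                           endogenous_fraction: float) -> float:
    """Run cost divided by the endogenous (useful) yield, per Gb."""
    return run_cost / endogenous_yield_gb(total_gb, endogenous_fraction)


if __name__ == "__main__":
    useful = endogenous_yield_gb(RUN_YIELD_GB, ENDOGENOUS_FRACTION)
    cost = cost_per_endogenous_gb(RUN_COST_USD, RUN_YIELD_GB, ENDOGENOUS_FRACTION)
    print(f"Endogenous yield: {useful:.1f} Gb of {RUN_YIELD_GB:.0f} Gb")
    print(f"Cost per endogenous Gb: ${cost:,.0f}")
```

Under these assumptions only about 8 Gb of a 200 Gb run would be endogenous, so each useful Gb costs roughly 25 times the raw per-Gb price.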
[Figure: NJ tree of 123 sequences, 250 bp control region. Clade labels: “spelaeus”, “eremus”, “ladinicus”, “rossicus”, “ingressus”, “kudarensis”, outgroups]

[Figure: condensed NJ tree, 50% bootstrap cutoff, 123 sequences, 250 bp D-loop. Clade labels: “spelaeus”, “eremus”, “ladinicus”, “rossicus”, “ingressus”, “kudarensis”, outgroups]

Different PCR types

[Figure: combined NJ, ML and Bayesian tree based on 9,632 bp of 2 published and 31 additional cave bear specimens. Clades: Ursus spelaeus, Ursus ingressus, Ursus kudarensis]

Results of DMPS
Between 13.0 and 16.5 kb replicated sequence for each of the 31 individuals
~1.0 Mb of targeted aDNA sequence data

Requirements for PCR
Primer F: min 20 bp
Target: min 30 bp
Primer R: min 20 bp
Min molecule length: 70 bp

Fragment length in ancient DNA
½ fragment size = 2-100x number of molecules
[Figure: frequency against fragment length in bp; axis ticks at 30, 50, 70]

DNA hybridization capture
[Figure labels: Probes, Glass slide]
• ~5 Mb targeted per array
• 7 arrays, whole exome
• ~98% of exons retrieved
• 300,000 primer pairs for aDNA
• 6,000 LR-PCRs for modern DNA

Ancient DNA capture
Science 2010

[Figure: NJ tree of 123 sequences, 250 bp control region, shown again]
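The PCR requirements and the array capture figures on the slides above reduce to simple arithmetic, sketched here in Python. The function names are hypothetical, and reading the 300,000 primer pairs and 6,000 LR-PCRs as covering the same 7 x ~5 Mb exome target is an assumption made for illustration, not something the slides state outright.

```python
def min_template_length(primer_f_bp: int, target_bp: int, primer_r_bp: int) -> int:
    """Shortest intact molecule that a PCR with these parts can amplify."""
    return primer_f_bp + target_bp + primer_r_bp


def mean_target_per_reaction(region_bp: int, n_reactions: int) -> float:
    """Average stretch of the region each reaction must cover, in bp."""
    if n_reactions <= 0:
        raise ValueError("n_reactions must be positive")
    return region_bp / n_reactions


# Values from the slides: 20 bp primers, 30 bp target, 7 arrays of ~5 Mb.
EXOME_BP = 7 * 5_000_000  # assumed total target size, ~35 Mb

if __name__ == "__main__":
    print("Minimum molecule length:", min_template_length(20, 30, 20), "bp")
    print("Per aDNA primer pair: %.0f bp"
          % mean_target_per_reaction(EXOME_BP, 300_000))
    print("Per LR-PCR: %.0f bp"
          % mean_target_per_reaction(EXOME_BP, 6_000))
```

The first call reproduces the 70 bp minimum molecule length from the slide. Under the stated assumption, each ancient DNA primer pair would cover a little over 100 bp of target, against several kb per long-range PCR on modern DNA.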
The costs
Capture array, up to 1 million features: £ 350 each
SureSelect, 10 rxns, 200 kb – 6.6 Mb: £ 6,638
SureSelect, 100 rxns, 200 kb: £ 30,777
SureSelect, 1,000 rxns, 200 kb: £ 107,719
=> Home-made solutions
=> Multiplexing
Barcoding

So... How does it work?
Sometimes well
Long range versus capture
And sometimes not so well
Jumping artefacts
Clade 1, Clade 2, Clade 3

Possible capture methodologies
Methodology             Results             Problems
SureSelect              no experience yet   high costs
Array capture           mammoth mtDNA       jumping artefacts
PEC                     mammoth nuDNA       limited sensitivity, high costs
454, biotin adaptors    Castor mtDNA        length limited
454, biotin UTP         Castor mtDNA        length limited
Illumina, biotin UTP    Castor mtDNA        length limited, jumping artefacts
Dynalbeads
In solution

Capture advantages
High sequence yield per sample aliquot
Time and work efficient
Higher sensitivity than PCR

Capture disadvantages
High costs
Sometimes low on-target ratio
Problems with multiplexing
Generally jumping artefacts

Summary for capture
Long term little alternative - if large amounts of data required
Also some methods have better sensitivity than PCR
Multiplex problems especially for low-complexity data need resolving
Currently not suitable for routine applications
Methodological development required

Some final thoughts
How should blank controls be done? And how many?
What does contamination mean when you have 20 million sequence reads?
How shall we replicate the data?
Is independent replication possible? And is it necessary?

Thanks
Molecular Ecology
• Many people
• Adrian Briggs, Harvard Medical School
• Kevin Campbell, University of Manitoba
• Research Group Molecular Ecology
• Sequencing group in Leipzig
• MPG, DFG and Volkswagen Foundation for money
• University of York
• For your attention