Potentials and limits of haplotype trees in exploring population structure and pathogenicity of mutations
Hans-Jürgen Bandelt (Hamburg)
17th Annual Meeting of the German Society of Human Genetics (17. Jahrestagung der Deutschen Gesellschaft für Humangenetik)
Heidelberg, 8–11 March 2006

[Figure: Human mtDNA HVS-I, alias HVR1, from MITOMAP]

The perception of evolution as seen through the lenses of laboratories is an overlay of two different processes:

Perceived evolution = Natural evolution (of the genome) + Artificial evolution (in the lab)

mtDNA and evolution α: Natural evolution
Migrational processes (prehistory)

ML tree of basal African mtDNA haplogroups
[Figure: maximum-likelihood tree covering haplogroups L0 to L7 together with M, N and R, with the rCRS as reference; time axis from 200,000 years to 0; coding-region variation displayed; Ethiopian samples marked. Torroni et al. (TIG, June 2006)]
Composite node labels used in the tree: L1’5 = L1’2’3’4’5’6’7; L2’5 = L2’3’4’5’6’7; L2’6 = L2’3’4’6’7; L4’6 = L3’4’6’7; L3’7 = L3’4’7; L3bd = L3bcd; L3ex = L3eix.

One of the first views of the East Asian mtDNA phylogeny (Ozawa, Herz 1994)
[Figure: tree with labels CRS, R and M. Annotations: incorrect rooting; all mutations that distinguish haplogroups M and R (part of N)]

Star-burst of autochthonous mtDNA lineages in Eurasia (haplogroup N and its subhaplogroup R)
[Figure: star-like phylogeny of N and R with the lineages R5, U, preHV, JT, W, R2, N1, R1, X, R6, N5, R7, R8, R30, A, R31, N9, R9, R11, B, P, O and S, grouped by region: West Eurasia, South Asia, East Asia, Oceania. Palanichamy et al. (Amer J Hum Genet, 2004)]

... and a massive burst in haplogroup M, as seen e.g. in India: Sun et al. (Mol Biol Evol, March 2006)

An Out-of-Africa model based on mtDNA analysis
Kivisild et al. (Springer-Verlag, April 2006)

Sketch of the phylogeny of basal European mtDNA haplogroups
[Figure: tree of haplogroups N, R, JT, U, X, W, R0, R0a, HV, HV0, HV0a, H, H1, H3, V, N1, N1a’I, N1a, N1b, I and N2. Torroni et al. (TIG, June 2006)]
Renamed clades: R0 = pre-HV; R0a = (pre-HV)1; HV0 = pre-V.

Spatial frequency distributions of haplogroups H1, H3, V, and U5b reveal the signature of post-LGM expansions
Torroni et al. (TIG, June 2006)

mtDNA and evolution β: Artificial evolution
Laboratory-specific processes (error and fraud)

Major sources of error in mtDNA sequence data
Artificial recombination: through contamination or sample mix-up (or targeting nuclear inserts of mtDNA)
Phantom mutations: sequencing errors at electrophoresis
Documentation errors: incurred by casual reading or writing

Impurifying selection is the driving force in artificial evolution, inasmuch as incorrect data are more flexible to interpret and can support sexy stories, seemingly told by DNA, which are then disseminated by high-impact-factor journals (e.g. Science and Nature).
Worst case: mtDNA in cancer research (Salas et al., PLoS Medicine 2005)
Case of mtDNA sample mix-up, misinterpreted as somatic mutations; data generated with MitoChip by Maitra et al. (Genome Res, 2004)
Data re-analysis by Bandelt et al. (J Med Genet, 2005)

A case of cross-over in the 672 human complete mtDNA sequences from Tanaka et al. (2004)
[Figure: tree relating the rCRS to samples NDsq0167, NDsq0178, NDsq0015 and NDsq0168 through haplogroups R, R9, F, F1, F1a’c, F1a, F1a1, F1a1b on one side and L3, M, N, M7, M7a on the other; below it, a map of the mtDNA genome (positions 1 to 16569, ticks at 3000, 6000, 9000, 12000 and 15000) with fragments labelled A to F, in which segments of NDsq0167 and NDsq0168 are assigned to either F1a1b or M7a]

Prime example of a phantom mutation (Brandstätter et al., Electrophoresis 2005)
[Figure: electropherogram from Nasidze and Stoneking (2001), generated 1997/1998 and presented for the first time in Stoneking and Nasidze (Ann Hum Genet, 2006), shown against the rCRS]

Phantom mutations can be found in excess in the HVS-I Caucasus data of Nasidze and Stoneking (2001). In view of additional problems, this may be regarded as the worst data set ever published in the realm of molecular anthropology; see Bandelt and Kivisild (Ann Hum Genet 2006) for data re-analysis.

Sequences with phantom transitions at 16280-16281 in those Caucasus data

Code     Mutation (16000+)               Haplogroup
AR31     067 279G 280 281 355            HV1
AR483    069 126 145 280 281 367C        J
AZ2      280 281                         ?
AZ342    280 281 298                     pre-V
AZ6      154 168A 280 281 356 384        ?
CH444    111 214G 249 280 281 327 388    U1b
CH451    280 281 292                     ?
DAR23    129 223 278 280 281             ?
DAR36    258 280 281 384                 ?
KAB408   224 280 281 311                 K

This mutation pair has never been observed in >40,000 HVS-I sequences!
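The argument behind the table can be sketched in a few lines of Python: a variant pair that recurs across sequences from unrelated haplogroups is more plausibly a sequencing artefact than a shared inheritance. The sketch below is an illustration only, not the procedure used in the cited re-analysis, which relied on comparison with a large external database. The rows are the ten sequences as read off the slide's table; the names `CAUCASUS`, `parse_motif` and `shared_pairs`, and the threshold `min_haplogroups=3`, are assumptions made for this example.

```python
from itertools import combinations

# (sample code, HVS-I motif written as position minus 16000, haplogroup on the slide)
CAUCASUS = [
    ("AR31",   "067 279G 280 281 355",         "HV1"),
    ("AR483",  "069 126 145 280 281 367C",     "J"),
    ("AZ2",    "280 281",                      "?"),
    ("AZ342",  "280 281 298",                  "pre-V"),
    ("AZ6",    "154 168A 280 281 356 384",     "?"),
    ("CH444",  "111 214G 249 280 281 327 388", "U1b"),
    ("CH451",  "280 281 292",                  "?"),
    ("DAR23",  "129 223 278 280 281",          "?"),
    ("DAR36",  "258 280 281 384",              "?"),
    ("KAB408", "224 280 281 311",              "K"),
]


def parse_motif(motif):
    """Turn '067 279G' into full variant names such as {'16067', '16279G'}."""
    variants = set()
    for token in motif.split():
        digits = "".join(ch for ch in token if ch.isdigit())
        suffix = token[len(digits):]  # e.g. 'G' for a transversion, '' for a transition
        variants.add(str(16000 + int(digits)) + suffix)
    return variants


def shared_pairs(samples, min_haplogroups=3):
    """Return variant pairs carried by samples from several distinct, known haplogroups.

    min_haplogroups is an arbitrary illustrative threshold, not a published criterion.
    """
    carriers = {}
    for code, motif, haplogroup in samples:
        for pair in combinations(sorted(parse_motif(motif)), 2):
            carriers.setdefault(pair, []).append((code, haplogroup))
    flagged = {}
    for pair, hits in carriers.items():
        known = {hg for _, hg in hits if hg != "?"}  # ignore unassigned sequences
        if len(known) >= min_haplogroups:
            flagged[pair] = hits
    return flagged


if __name__ == "__main__":
    for pair, hits in shared_pairs(CAUCASUS).items():
        groups = sorted({hg for _, hg in hits if hg != "?"})
        print(pair, "carried by", len(hits), "samples; haplogroups:", groups)
```

A within-data-set screen like this can only raise suspicion; the decisive evidence on the slide is the absence of the pair from an independent collection of more than 40,000 HVS-I sequences.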
Electropherogram presented by Stoneking and Nasidze (Ann Hum Genet, 2006)
[Figure: electropherogram shown against the rCRS]

Phantom mutations in the HVS-I data of Plaza et al. (Ann Hum Genet, 2003) (267 samples)

Sample      Mutation (16000+)                               Haplogroup
Algeria     279N 285N                                       ?
Andalusia   129 182C 183C 189 223 249 311 359 371           M1
Andalusia   129 281                                         ?
Andalusia   281                                             ?
Catalonia   093 192 270 281 290A 304 311                    U5b
Catalonia   224 281 311                                     K
Morocco     093 224 242 311 371                             K
Morocco     124 223 284C 285T 300 319 374T                  L2d
Morocco     126 187 189 223 264 270 278 293 311 371 374     L1b
Morocco     126 284C 292 294                                T2
Morocco     183C 189 223 278 382G                           X
Morocco     189 192 270 369T                                U5b
Saharawi    093 172 185 223 327 382G                        L3e1
Saharawi    172 281 311                                     U6?
Saharawi    189 382G                                        ?

Comparison with 1624 complete sequences stored in the mtDB database
Variation in 16279-16285: only 20 transitional variants at 16284
Variation in 16369-16389: only 1+1+6 transitional variants at 16371, 16380, and 16381

Re-evaluation of the mtDNA data from the lab of Min-Xin Guan
[Figure: phylogenetic tree leading from L3 through M, N and R to the rCRS, placing the lab's complete mtDNA sequences (among them WH6967, WH6980, WZ4, WZ5, WZ6, BJ101 to BJ106, QJ383, LN7710, GD7817, SD10324 and CZ) in haplogroups including A, B5a2, C, D4a1, D4b2b, F1, F3b, H2, M8, M10a, M11a, M11b and N9a1; source labels in the figure include Zhao 2004, Li 2004, Qu 2005, Yuan 2005 and Li 2005; missing mutations are marked and misscored mutations are shown in red. Yao et al. (Hum Genet, 2006)]

Strategies of authors to deal with errors
1st: Publishing a corrigendum [rare event]
2nd: No correction, but avoiding similar errors in future work [common practice]
3rd: No action, and committing the same errors as before [e.g. as Min-Xin Guan and colleagues do]
4th: Fraudulent action: performing fake analyses and giving false statements [as done by Mark Stoneking and Ivane Nasidze in the Ann Hum Genet]

... only L strand, no H strand information shown!
Stoneking and Nasidze (2006)

Human Mitochondrial DNA and the Evolution of Homo sapiens
Series: Nucleic Acids and Molecular Biology, Vol. 18
Volume package: Human Mitochondrial DNA
Bandelt, Hans-Jürgen; Richards, Martin; Macaulay, Vincent (Eds.)
2006, approx. 250 p., 31 illus., 2 in colour, hardcover
ISBN: 3-540-31788-0
Springer-Verlag
Due: April 2006