Transcript Document
Phylogenetic reconstruction using secondary structures and sequence motifs of ITS2 rDNA of Paragonimus westermani (Kerbert, 1878) Braun, 1899 (Digenea: Paragonimidae) and related species
P. K. Prasad1, V. Tandon1, D. K. Biswal2, L. M. Goswami1 and A. Chatterjee3
1Departments of Zoology, 3Biotechnology & Bioinformatics and 2Bioinformatics Centre, North-Eastern Hill University, Shillong, 793022
Email: [email protected]
8th International Conference on Bioinformatics, 7 – 11 September 2009, Biopolis, Singapore

Paragonimus
• Zoonotic lung fluke with diverse effects on its final host.
• Over 40 species infect the lungs of different mammalian hosts.
• ~15 species known to infect humans.
Stages shown:
• Egg: 80-110 × 48-60 µm
• Encysted metacercaria: ~300-400 µm
• Pre-adult
• Adult: 7.5-12 × 4-6 × 1-3 mm (l:w = 2:1)
P. westermani (Kerbert, 1878) - best-known species; human parasite that can undergo development in >16 different snail species and 50 crustacean species.
Status of prevalence and host range in India not well documented.

Distribution of human paragonimiasis
Species of Paragonimus are encountered in Asia, the Americas, and Africa.

Distribution in India

Paragonimus species: India
1. P. westermani (Kerbert, 1878) - Bengal tiger, Amsterdam Zoo collection from India, Indonesia, China [Syn. P. edwardsi Gulati, 1926 - civet]
2. P. compactus (Cobbold, 1859) - Herpestes edwardsii; India (Vevers, 1923; Ravikumar et al., 1979); Sri Lanka (Dissanaike and Paramananthan, 1962)
3. P. heterotremus Chen and Hsia, 1964 - China, Thailand, Laos, Vietnam
   • Arunachal Pradesh (Narain et al., 2003)
   • Manipur (Singh, 1996)
4. P. hueitungensis Chung et al., 1975 - China
   • Manipur (Singh, 2002)
5. P. mungoi, P.
pantheri – Orissa (Mishra et al., 1976) - nomen nudum

Life Cycle
1. Infective stage: metacercaria
2. Infective mode: eating raw freshwater crabs and crayfish with metacercariae
3. Infective route: by mouth
4. Site of inhabitation: lungs
5. Intermediate hosts: 1st - snail; 2nd - crab, crayfish
6. Reservoir hosts: carnivores (tiger, lion, wolf, fox, dog, leopard, cat etc.)
7. Life span: 5-6 years

Paragonimus westermani is the best-known species.
Diploid & triploid forms in N.E. Asia: only diploids elsewhere.
Map legend: Pleurocerid snail hosts; Thiarid snail hosts; Diploid; Triploid

Many other species described.
Nearly 50 species, mostly in Asia: over half of these in China alone. This prompted a molecular taxonomy approach.
Map labels: Chengdu, Beijing, Shanghai, Guangzhou (P. westermani not shown here)
Species listed: Euparagonimus cenocopiosus, E. hongzesiensis, P. asymmetricus, P. bangkokensis, P. cheni, P. divergens, P. fukienensis, P. harinasutai, P. heterorchis, P. heterotremus, P. hokuoensis, P. hueitungensis, P. iloktsuenensis, P. jiangsuensis, P. macrorchis, P. menglaensis, P. microrchis, P. minqinensis, P. ohirai, P. paishuihoensis, P. proliferus, P. szechuanensis, P. tuanshanensis, P. veocularis, P. xiangshanensis, P.
yunnanensis

Molecular Taxonomy
• Molecular tools allow quick and accurate identification of genetically distinct but morphologically similar species.
• Genetic markers in nuclear ribosomal DNA (rDNA):
  - to resolve taxonomic issues related to various animal groups
  - to find phenotypic variants, geographical isolates
• ITS rDNA: widely used region, to explore species boundaries in at least 19 digenean (helminth parasite) families.

The ribosomal DNA gene cluster
• Ribosomal genes and their associated spacers are among the most versatile sequences for phylogenetic analysis.
• Useful for diagnostic purposes at the level of species.

Bioinformatic Tools
• Similarity search - BLAST http://www.ncbi.nlm.nih.gov/blast
• Phylogenetic prediction - ClustalW/X http://www.ebi.ac.uk/clustalw
• Phylogenetic tree construction - MEGA (Molecular Evolutionary Genetics Analysis) format
  - Distance methods (Neighbour-Joining, Minimum Evolution, UPGMA)
  - Character-state method (Maximum Parsimony)
• Bayesian analysis - MrBayes 3.1

• ITS sequence motifs: useful for development of DNA barcoding; short DNA sequences from a standardized region of the genome serve as a diagnostic “biomarker” for species identification (http://www.barcoding.si.edu/)
• Molecular morphometrics: combines traditional morphological and molecular sequence comparisons by measuring structural parameters.
  - Geometrical features, bond energies, base composition etc.
of secondary structure to study phylogenetic relationships of species (http://www.bioinfo.rpi.edu/applications/mfold)

Materials & Methods
Parasite:
• Collected from mountain stream crabs of suspected focal area (Changlang Dist. in Arunachal Pradesh).
• Metacercariae isolated from muscles by digestion technique.
DNA isolation:
• DNA extracted on FTA cards (Whatman) by punching sample discs.
• Sample discs washed with FTA Purification Reagent and TE buffer; used for PCR.

DNA amplification and sequencing
• ITS2 primers used:
  3S: 5’-GGTACCGGTGGATCACTCGGCTCGTG-3’ (forward)
  A28: 5’-GGGATCCTGGTTAGTTTCTTTTCCTCCGC-3’ (reverse)
  [Designed based on conserved sequences of the 5.8S and 28S genes of Schistosoma spp.]
• Marker: Phi X 174 DNA/Hae III digest in agarose gel
• PCR products: purified by Genei Quick PCR Purification Kit; sequenced in both directions using primer set 3S-A28

Molecular Phylogenetic Analysis
• Only unique sequences were used in tree construction.
• ITS sequences entered in MEGA for phylogenetic tree construction.
• Tree-building methods: Maximum Parsimony, Neighbor-Joining, UPGMA, Minimum Evolution.
• Branch support given using 1000 bootstrap replicates in MEGA.

Bayesian Phylogenetic Analysis
• Sequences aligned using Clustal W 2.0.7 and converted to a NEXUS file.
• Analysis carried out with combined datasets using MrBayes 3.1.
• Cladogram and phylogram with mean branch lengths generated, and read by TreeView V1.6.6.

Motif identification, testing and validation
• Sequence motifs identified from aligned sequences of the data set using PRATT software.
• Motifs were expressed using the DNA alphabet (A, T, C, G) in PROSITE language.
• Validation of motifs was performed for each species using a ‘PATTERN MATCHING’ web application.

Evaluation through BLAST analysis
• Sequence motifs subjected to BLAST algorithms against the non-redundant GenBank database of NCBI (nr).
• BLAST outputs analyzed to find perfect pair-wise matches in terms of percent identity and E-values for each species.

Predicted ITS2 RNA secondary structures and analyses
• Secondary structures of ITS2 sequences of various paragonimid species were reconstructed by aligning their sequences using BioEdit (Hall, 1999).
• The acquired structures, with restrictions and constraints, were submitted to MFOLD (Zuker, 2003).
• From the different output files, the RNA structure with the highest negative free energy (i.e. the most negative value) among the similar structures obtained was chosen.

Results
PCR amplification and analysis
Gel lanes: M 1 2 3 4 5 6
PCR products of Paragonimus metacercaria DNA using primer set 3S-A28 for ITS2
Product size: ~500 bp
Final reaction volume = 50 μl
1.6% agarose gel electrophoresis
Marker = ΦX174 DNA/HaeIII digest
Amplification conditions:
• Initial denaturation at 94ºC = 1 min
• Denaturation at 94ºC = 30 sec
• Annealing at 55ºC = 38 sec
• Extension at 72ºC = 42 sec
• Final extension at 72ºC = 10 min
• 26 cycles (denaturation, annealing, extension)

Sequence motif in PROSITE format (from 5’ to 3’ ends)

Panels: NJ tree; MP tree
Phylogenetic trees of ITS2 sequences of paragonimid species.
(*) = Query sequence

BLAST outputs of Paragonimus ITS sequence motifs against NCBI GenBank database
Species motif patterns (5’-3’ ends) | Length (bp) | No. of best hits | Identity (%) | E-value
>Pattern1 G-G-C-C-A-C-G-G-G-T-T-A-G-C-C-T-G-T-G-G-C-C-A-C-G-C-C-T-G-T-C-C-G-A-G-G-G-T-C-G-G-C-T-T-A-T-A-A-A-C | 50 | 100 | 100 | 8e-19
>Pattern2 C-G-G-C-C-A-C-G-G-G-T-T-A-G-C-C-T-G-T-G-G-C-C-A-C-G-C-C-T-G-T-C-C-G-A-G-G-G-T-C-G-G-C-T-T-A-T-A-A-A | 50 | 100 | 100 | 8e-19
>Pattern3 G-C-G-G-C-C-A-C-G-G-G-T-T-A-G-C-C-T-G-T-G-G-C-C-A-C-G-C-C-T-G-T-C-C-G-A-G-G-G-T-C-G-G-C-T-T-A-T-A-A | 50 | 100 | 100 | 8e-19
>Pattern4 T-G-C-G-G-C-C-A-C-G-G-G-T-T-A-G-C-C-T-G-T-G-G-C-C-A-C-G-C-C-T-G-T-C-C-G-A-G-G-G-T-C-G-G-C-T-T-A-T-A | 50 | 100 | 100 | 8e-19
>Pattern5 T-T-G-C-G-G-C-C-A-C-G-G-G-T-T-A-G-C-C-T-G-T-G-G-C-C-A-C-G-C-C-T-G-T-C-C-G-A-G-G-G-T-C-G-G-C-T-T-A-T | 50 | 100 | 100 | 8e-19
>Pattern6 A-T-T-G-C-G-G-C-C-A-C-G-G-G-T-T-A-G-C-C-T-G-T-G-G-C-C-A-C-G-C-C-T-G-T-C-C-G-A-G-G-G-T-C-G-G-C-T-T-A | 50 | 100 | 100 | 8e-19
>Pattern7 T-A-T-T-G-C-G-G-C-C-A-C-G-G-G-T-T-A-G-C-C-T-G-T-G-G-C-C-A-C-G-C-C-T-G-T-C-C-G-A-G-G-G-T-C-G-G-C-T-T | 50 | 100 | 100 | 8e-19
>Pattern8 A-T-A-T-T-G-C-G-G-C-C-A-C-G-G-G-T-T-A-G-C-C-T-G-T-G-G-C-C-A-C-G-C-C-T-G-T-C-C-G-A-G-G-G-T-C-G-G-C-T | 50 | 100 | 100 | 8e-19
>Pattern9 C-A-T-A-T-T-G-C-G-G-C-C-A-C-G-G-G-T-T-A-G-C-C-T-G-T-G-G-C-C-A-C-G-C-C-T-G-T-C-C-G-A-G-G-G-T-C-G-G-C | 50 | 100 | 100 | 8e-19
>Pattern10 G-C-A-T-A-T-T-G-C-G-G-C-C-A-C-G-G-G-T-T-A-G-C-C-T-G-T-G-G-C-C-A-C-G-C-C-T-G-T-C-C-G-A-G-G-G-T-C-G-G | 50 | 100 | 100 | 8e-19

Predicted ITS2 RNA secondary structures with enthalpies: Indian isolates

Predicted ITS2 RNA secondary structures: Neighbouring countries isolates

Family Paragonimidae: Hypothetical
Bayesian analysis phylogeny based upon secondary structure alignment data of ITS2

Discussion
• ITS sequences: high species-specific homogeneity.
• Primary sequence analysis showed a close relationship between the query sequence (P. westermani from India) and isolates of related species from neighbouring countries.
• Secondary structure analysis provided additional information for correct identification of the species and confirmed the results from primary sequence analysis.
• ITS sequence motifs: all the sequence motifs were present in all the Paragonimus sequences of the different geographical isolates under study. Validation of motifs showed high percent identity and low E-value scores.

Conclusion
• The Paragonimus species prevalent in the region is in fact Paragonimus westermani, the most common lung fluke throughout the globe.
• ITS2 sequences:
  - reliable tool to identify species and phylogenetic relationships
  - potential as species markers.
• Different geographical isolates of Paragonimus spp. need further study with additional molecular markers and barcoding to ascertain intra-specific strain variations, if any.

Acknowledgements
DST, DBT, CSIR (GOI) - for Travel Fellowship for InCoB2009.
DIT Project to VT.
AICOPTAX programme (MoE&F, GOI) to VT.
DBT Project to VT & AC.
DSA programme (UGC-SAP) in Zoology; UPE (Biosciences) programme in School of Life Sciences, NEHU, Shillong.
Co-ordinator, Bioinformatics Centre, NEHU.
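
Supplementary sketch: motif validation
The motif validation step described under "Motif identification, testing and validation" (PROSITE-style patterns checked against each sequence) can be illustrated with a minimal Python sketch. This is not the PRATT software or the 'PATTERN MATCHING' web application used in the study; it is a stand-in built on Python's standard re module. The function names (prosite_to_regex, find_motif) and the toy sequence are hypothetical. The toy sequence is simply the 59-base region obtained by merging the overlapping Pattern1 to Pattern10 from the BLAST table, not a real GenBank record.

```python
import re

# One PROSITE element: a base, wildcard x, class [AG] or exclusion {T},
# optionally followed by a repeat count such as (2) or (2,4).
_ELEMENT = re.compile(r"(\[[ACGT]+\]|\{[ACGT]+\}|[ACGTx])(?:\((\d+)(?:,(\d+))?\))?")


def prosite_to_regex(pattern: str) -> str:
    """Translate a simple PROSITE-style DNA pattern into a Python regex."""
    parts = []
    for element in pattern.strip().rstrip(".").split("-"):
        m = _ELEMENT.fullmatch(element)
        if m is None:
            raise ValueError(f"unsupported pattern element: {element!r}")
        core, lo, hi = m.groups()
        if core == "x":                      # wildcard: any base
            core = "[ACGT]"
        elif core.startswith("{"):           # exclusion: any base except these
            core = "[^" + core[1:-1] + "]"
        if lo is not None:                   # repeat count
            core += "{" + lo + ("," + hi if hi else "") + "}"
        parts.append(core)
    return "".join(parts)


def find_motif(pattern: str, sequence: str) -> list[int]:
    """Return 0-based start positions of non-overlapping motif matches."""
    regex = re.compile(prosite_to_regex(pattern))
    return [m.start() for m in regex.finditer(sequence.upper())]


# Pattern1 from the BLAST table, written out as a plain 50-base string.
PATTERN1_BASES = "GGCCACGGGTTAGCCTGTGGCCACGCCTGTCCGAGGGTCGGCTTATAAAC"
PATTERN1 = "-".join(PATTERN1_BASES)

# Toy sequence (assumption): the region spanned by Pattern10 through Pattern1.
TOY_SEQUENCE = "GCATATTGC" + PATTERN1_BASES
PATTERN10 = "-".join(TOY_SEQUENCE[:50])

if __name__ == "__main__":
    print("Pattern1 found at:", find_motif(PATTERN1, TOY_SEQUENCE))
    print("Pattern10 found at:", find_motif(PATTERN10, TOY_SEQUENCE))
```

In practice one would loop find_motif over the ITS2 sequence of each isolate and record which motifs are present; the ten reported patterns are consecutive 50-base windows shifted by one base, so a sequence containing the full 59-base region matches all of them.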