SNP Haplotypes as Diagnostic Markers
Shrish Tiwari
CCMB, Hyderabad

Introduction
- Most genomic variation (about 90%) occurs as single-nucleotide differences, two-thirds of which are transitions
- Non-uniformly distributed (on average, 1 difference per 1000 bp)
- Useful as genetic markers:
  - Unambiguous assay techniques
  - Low mutation rate
  - Amenable to high-throughput genotyping

Single Nucleotide Polymorphism
- Single base-pair differences occurring in a population with a frequency of >1%

  ...C C A T T G A C...      ...C C G T T G A C...
  ...G G T A A C T G...      ...G G C A A C T G...

Types of SNPs
- SNPs can occur in protein-coding or non-coding DNA
- SNPs within genes can lie in exons (cSNPs), introns or regulatory regions
- cSNPs may be silent (synonymous) or change the encoded amino acid (non-synonymous)

SNP databases
- dbSNP (~3 million SNPs): http://www.ncbi.nih.gov/SNP/
- JSNP (~195,000 SNPs): http://snp.ims.u-tokyo.ac.jp/
- The SNP Consortium (TSC): http://snp.cshl.org/
- Human Genome Variation Database: http://hgvbase.cgb.ki.se/

SNP analysis
- SNP discovery: by sequencing, by EST sequence comparison
- SNP validation: SNaPshot, DHPLC, Sequenom
- SNP allele frequency
- SNP association

SNP discovery: Experimental approach
- Obtain the DNA sequence around the SNP
- PCR-amplify the segment of interest
- Identify the SNP by sequencing the segment from different individuals
- Map the marker onto the chromosome
- Compute the allele frequency in the population

SNP discovery: in silico
- Align ESTs to one another or to cDNAs
- Find variations in ESTs that align with 95% identity over >100 bp
- Distinguish true SNPs from sequencing errors by checking base quality with the Phred/Phrap set of programs

The PolyBayes Program
- Use genomic sequence as reference
- Use Bayesian statistics to
  - cluster and align all available EST sequences
  - remove repeats/paralogs
  - distinguish polymorphic sites from artifacts
  - estimate likelihood
- Marth, GT, Korf, I, Yandell, MD, Yeh, RT, Gu, Z, Zakeri, H, Stitziel, NO, Hillier, L, Kwok, P-Y, Gish, WR: A general approach to single-nucleotide polymorphism discovery. Nature Genet. 1999; 23:452-456.

Some other SNP prediction and SNP finding software
- SEAN: search for localized SNPs and predict SNPs (http://zebrafish.doc.ic.ac.uk/Sean/)
- SNPpipeline: mine SNPs stored in gene databases
- SNP Finder: for analyzing user-submitted trace data (http://gai.nci.nih.gov/)

SNP Applications
- Disease diagnosis
- Identification of factors for increased susceptibility to disease
- SNPs may be responsible for the different responses of individuals to drugs
- Will help in developing personalised medicines

SNP Haplotypes
- Sets of SNPs in linkage disequilibrium
- SNP genotype data (4 loci): 00 01 01 11
- Possible SNP haplotypes: 0001/0111 or 0011/0101

SNP-Haplotype Association

  No.  DNA sequence       Phenotype
  1    GATATTCGTACGGA-T   BLACK EYE
  2    GATGTTCGTACTGAAT   BROWN EYE
  3    GATATTCGTACGGA-T   BLACK EYE
  4    GATATTCGTACGGAAT   BLUE EYE
  5    GATGTTCGTACTGAAT   BROWN EYE
  6    GATGTTCGTACTGAAT   BROWN EYE

  Variable sites: positions 4, 12 and 15

Haplotypes
- AG-: 2/6 (BLACK EYE)
- GTA: 3/6 (BROWN EYE)
- AGA: 1/6 (BLUE EYE)

Haplotype database
- HapMap: http://www.hapmap.org/
- The International HapMap Consortium, The International HapMap Project, Nature 426, 789-796 (2003)

Why SNP Haplotypes?
- More reliable than single SNPs as markers
- Usually fewer in number than the SNPs in the set
- Haplotype-tagging SNPs further reduce genotyping costs

Association study
- Association of allele and phenotype
- Association can occur if
  - the allele is in linkage disequilibrium with the mutation responsible for the phenotype
  - the allele is itself responsible for the phenotype
- An association study requires both a case and a control population

SNP haplotypes as diagnostic markers
- Identify SNPs for the gene associated with the disease
- Find the different sets of SNPs in a control population
- Find the different sets of SNPs in the affected population
- Look for sets of SNPs unique to the affected population

References
- A.J. Brookes, The essence of SNPs, Gene 234, 177-186 (1999).
- Pui-Yan Kwok and Zhijie Gu, Single nucleotide polymorphism libraries: why and how are we building them?, Mol. Med. Today 5, 538-543 (1999).
- SNPs Science Primer: http://www.ncbi.nlm.nih.gov/About/primer/snps.html
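
Worked example: haplotypes from unphased genotypes

The SNP Haplotypes slide lists the genotype 00 01 01 11 and two possible haplotype pairs. A minimal Python sketch of how such pairs could be enumerated is given here. The function name `haplotype_pairs` and the two-character genotype encoding are assumptions made for illustration; they are not taken from any tool named in the slides. Each heterozygous locus can be phased two ways, so k heterozygous loci give 2^(k-1) distinct unordered pairs.

```python
from itertools import product


def haplotype_pairs(genotypes):
    """Enumerate unordered haplotype pairs consistent with unphased genotypes.

    genotypes: list of 2-character strings, one per locus, e.g. ["00", "01"].
    (Hypothetical encoding chosen to match the slide's notation.)
    """
    # For each locus, list the ways its two alleles can be split
    # between haplotype A and haplotype B.
    per_locus = []
    for g in genotypes:
        a, b = g[0], g[1]
        per_locus.append([(a, b)] if a == b else [(a, b), (b, a)])

    pairs = set()
    for combo in product(*per_locus):
        hap_a = "".join(x for x, _ in combo)
        hap_b = "".join(y for _, y in combo)
        # Sort so that (A, B) and (B, A) count as the same pair.
        pairs.add(tuple(sorted((hap_a, hap_b))))
    return sorted(pairs)


# The genotype from the slide: two heterozygous loci, so two possible pairs.
print(haplotype_pairs(["00", "01", "01", "11"]))
# [('0001', '0111'), ('0011', '0101')]
```

Genotype data alone cannot say which of the two pairs an individual actually carries; resolving that needs phase information such as family data or population haplotype frequencies.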
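
Worked example: haplotype counts per phenotype

The SNP-Haplotype Association slide and the final "diagnostic markers" slide both come down to tabulating haplotypes against phenotype and looking for haplotypes seen in only one group. The sketch below does this for the six aligned sequences from the slide. The helper names (`variable_sites`, `haplotypes_by_phenotype`, `unique_to`) are hypothetical, and the approach assumes pre-aligned sequences of equal length; a real study would also need much larger case and control samples and a statistical test.

```python
from collections import Counter, defaultdict

# The six aligned sequences and phenotypes shown on the slide.
records = [
    ("GATATTCGTACGGA-T", "black"),
    ("GATGTTCGTACTGAAT", "brown"),
    ("GATATTCGTACGGA-T", "black"),
    ("GATATTCGTACGGAAT", "blue"),
    ("GATGTTCGTACTGAAT", "brown"),
    ("GATGTTCGTACTGAAT", "brown"),
]


def variable_sites(seqs):
    """Return 0-based column indices at which the aligned sequences differ."""
    return [i for i, column in enumerate(zip(*seqs)) if len(set(column)) > 1]


def haplotypes_by_phenotype(records):
    """Count each haplotype (alleles at the variable sites) per phenotype."""
    sites = variable_sites([seq for seq, _ in records])
    table = defaultdict(Counter)
    for seq, phenotype in records:
        hap = "".join(seq[i] for i in sites)
        table[hap][phenotype] += 1
    return sites, table


def unique_to(table, phenotype):
    """Haplotypes observed only in individuals with the given phenotype."""
    return sorted(h for h, counts in table.items() if set(counts) == {phenotype})


sites, table = haplotypes_by_phenotype(records)
print(sites)  # 0-based columns; add 1 for the positions on the slide
for hap, counts in sorted(table.items()):
    print(hap, dict(counts))
print(unique_to(table, "brown"))
```

With this data the variable columns are the 4th, 12th and 15th positions, and the haplotypes AG-, GTA and AGA occur 2, 3 and 1 times, matching the counts on the slide.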