Transcript Slides
Presentation: Kevin Charles Paruchuri Padmavathi
Department of Computer Science, UTSA
11/1/2010

Introduction
GASSST: global alignment short sequence search tool
A Gibbs sampling strategy applied to the mapping of ambiguous short-sequence tags

GASSST: GLOBAL ALIGNMENT SHORT SEQUENCE SEARCH TOOL

Current Sequence Aligners
Next-generation sequencing machines are able to produce huge amounts of data.
Common techniques often restrict indels in the alignment to improve speed.
Flexible aligners are too slow for large-scale applications.

GASSST
The goal of GASSST is thus twofold: achieving high performance with no restrictions on the number of indels, with a design that is still effective on long reads.
The method is comparable to BLAST, but adds a new, efficient filter step that discards most alignments coming from the seed phase.
A carefully designed series of filters of increasing complexity and efficiency quickly eliminates most candidate alignments.
The algorithm uses a small pre-computed 64 KB table that easily fits into cache memory.

The last step, extend, receives the alignments that passed the filter step.
It is computed using a traditional banded Needleman-Wunsch (NW) algorithm.
Significant alignments are then printed with their full description.
Provides a lower bound only

Tiled Algorithm

A Gibbs sampling strategy applied to the mapping of ambiguous short-sequence tags

Gibbs Sampling for Ambiguous Seq
Maps ambiguous tags to individual genomic sites.
Two steps: mapping of ambiguous tags, and calculating the likelihood ratio (LR) for each site.
For each map site, the number of co-located tags is counted. This count is used to calculate the likelihood ratio.
A higher likelihood ratio means higher confidence; the LR increases nonlinearly with tag counts.
$LR_j = P_s(k_j) / P_n(k_j)$
Calculating the LR means calculating conditional probabilities.
The two steps are circular (each depends on the result of the other), which led to adopting Gibbs sampling.
For some set of ambiguous tags (σ), it reaches relative entropy between $P_s$ and $P_n$.

Comparison
Compared against the MAQ software method, which randomly selects a site for each ambiguous tag.
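The two assignment strategies named on these slides can be contrasted in a minimal sketch. This is not the authors' implementation: it assumes, purely for illustration, that P_s and P_n are Poisson distributions with hypothetical rates `signal_rate` and `noise_rate`, and that a Gibbs sweep resamples each ambiguous tag's site with probability proportional to the site's LR. All function and parameter names are hypothetical.

```python
import math
import random


def likelihood_ratio(k, signal_rate=5.0, noise_rate=1.0):
    """LR = P_s(k) / P_n(k), with both modelled as Poisson (an assumption)."""
    # The k! terms cancel in the ratio of two Poisson probabilities.
    return (signal_rate / noise_rate) ** k * math.exp(noise_rate - signal_rate)


def random_assignment(candidates, rng):
    """Baseline: pick one candidate site uniformly at random for each tag."""
    return {tag: rng.choice(sites) for tag, sites in candidates.items()}


def gibbs_sweep(assignment, candidates, unique_counts, rng):
    """One sweep: resample each ambiguous tag's site, weighted by the site's LR."""
    for tag, sites in candidates.items():
        # Count co-located tags per site, leaving out the tag being resampled.
        counts = dict(unique_counts)
        for other, site in assignment.items():
            if other != tag:
                counts[site] = counts.get(site, 0) + 1
        # LR of each candidate site if this tag were placed there.
        weights = [likelihood_ratio(counts.get(s, 0) + 1) for s in sites]
        assignment[tag] = rng.choices(sites, weights=weights, k=1)[0]
    return assignment


rng = random.Random(0)
candidates = {"t1": ["A", "B"], "t2": ["A", "B"], "t3": ["A", "B"]}
unique_counts = {"A": 20, "B": 0}  # uniquely mapped tags already at each site

assignment = random_assignment(candidates, rng)  # random starting point
for _ in range(3):
    assignment = gibbs_sweep(assignment, candidates, unique_counts, rng)
print(assignment)
```

Under the Poisson assumption the LR grows by a constant factor (`signal_rate / noise_rate`) with every additional co-located tag, so the sweep pulls ambiguous tags toward sites that already hold many tags, whereas the random baseline ignores the counts entirely.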
Comparison on the eight sequence-tag libraries (20 bp and 35 bp tags) shows that Gibbs sampling correctly maps 49% to 71% of ambiguous tags, versus 8% to 23% for the MAQ method.

Thank you for listening.

Results
We found that GASSST achieves high sensitivity in a wide range of configurations and faster overall execution time than other state-of-the-art aligners.
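The extend step described earlier relies on a banded Needleman-Wunsch computation. The sketch below shows the general technique only, not GASSST's code: it assumes unit costs for mismatches and indels (plain edit distance), and the name `banded_nw_distance` and the `band` parameter are hypothetical.

```python
def banded_nw_distance(read, ref, band):
    """Global edit distance between read and ref, using only cells with |i - j| <= band.

    Returns None when the length difference alone exceeds the band.
    """
    n, m = len(read), len(ref)
    if abs(n - m) > band:
        return None
    inf = float("inf")
    # Row 0: aligning an empty read prefix against ref[:j] costs j insertions.
    prev = [j if j <= band else inf for j in range(m + 1)]
    for i in range(1, n + 1):
        cur = [inf] * (m + 1)
        if i <= band:
            cur[0] = i  # read[:i] against an empty reference prefix
        # Only columns inside the band are filled; the rest stay at infinity.
        for j in range(max(1, i - band), min(m, i + band) + 1):
            substitute = prev[j - 1] + (read[i - 1] != ref[j - 1])
            delete = prev[j] + 1
            insert = cur[j - 1] + 1
            cur[j] = min(substitute, delete, insert)
        prev = cur
    return prev[m]


print(banded_nw_distance("ACGT", "AGT", band=1))
```

Restricting the computation to a diagonal band reduces the work from O(n·m) cells to roughly O(n·band). The result equals the true edit distance whenever that distance does not exceed `band`; otherwise it is only an upper bound.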