Transcript Slides

Presented by:
Kevin Charles
Paruchuri Padmavathi
Department of Computer Science
UTSA
11/1/2010
Introduction


GASSST: global alignment short sequence
search tool
A Gibbs sampling strategy applied to the
mapping of ambiguous short-sequence
tags.
GASSST: GLOBAL
ALIGNMENT SHORT
SEQUENCE SEARCH TOOL
Current Sequence Aligners



Next-generation sequencing machines are
able to produce huge amounts of data
Common techniques often restrict indels in
the alignment to improve speed
Flexible aligners are too slow for large-scale applications
GASSST

The goal of GASSST is thus two-fold: achieving high performance
with no restrictions on the number of indels, with a design
that is still effective on long reads.

The method follows a seed-and-extend approach like BLAST, with a new efficient
filtering step that discards most alignments coming from
the seed phase

Carefully designed series of filters of increasing
complexity and efficiency to quickly eliminate most
candidate alignments

The algorithm uses a small precomputed table of
64 KB, which easily fits into cache memory
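The slides do not say what the 64 KB table contains. As a minimal sketch of how a table of exactly that size could arise, assume (hypothetically) that it stores a one-byte alignment cost for every pair of 4-base tiles: 4^4 x 4^4 = 65,536 entries. The tile length, the unit-cost edit distance, and the names build_table and lookup are all assumptions made for illustration.

```python
from itertools import product

BASES = "ACGT"
TILE = 4  # assumed tile length: 4**4 * 4**4 = 65,536 one-byte entries = 64 KB


def edit_distance(a, b):
    """Unit-cost edit distance between two short strings (dynamic programming)."""
    prev = list(range(len(b) + 1))
    for i, ca in enumerate(a, start=1):
        cur = [i]
        for j, cb in enumerate(b, start=1):
            cur.append(min(prev[j] + 1,                 # gap in b
                           cur[j - 1] + 1,              # gap in a
                           prev[j - 1] + (ca != cb)))   # match or mismatch
        prev = cur
    return prev[-1]


def encode(kmer):
    """Pack a tile into an integer, two bits per base."""
    code = 0
    for base in kmer:
        code = code * 4 + BASES.index(base)
    return code


def build_table():
    """Precompute the cost of every tile pair, one byte per entry."""
    kmers = ["".join(p) for p in product(BASES, repeat=TILE)]
    n = len(kmers)  # 256 tiles; list index equals encode(kmer)
    table = bytearray(n * n)
    for i, x in enumerate(kmers):
        for j, y in enumerate(kmers):
            table[i * n + j] = edit_distance(x, y)
    return bytes(table)


def lookup(table, x, y):
    """Constant-time cost lookup for two tiles."""
    return table[encode(x) * 4 ** TILE + encode(y)]
```

Once built, a filter could score tile pairs with a single array access instead of a dynamic-programming pass, which is where a cache-resident table would pay off.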
The last step, extend, receives the alignments that passed the
filter step.
It is computed using a traditional banded Needleman-Wunsch (NW) algorithm.
Significant alignments are then printed with their full
description.
Provides a lower bound only
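The banded NW computation used in the extend step can be sketched as follows. This is an illustration under assumptions, not the GASSST implementation: it uses unit costs for mismatches and gaps (so it minimizes edit cost rather than maximizing a score), and the function name banded_nw and its band parameter are hypothetical.

```python
INF = float("inf")


def banded_nw(a, b, band):
    """Global alignment cost of a and b, evaluating only cells with |i - j| <= band.

    Returns None when the length difference alone exceeds the band.
    For clarity each row is allocated at full width; only band cells are filled.
    """
    n, m = len(a), len(b)
    if abs(n - m) > band:
        return None
    # Row 0: aligning the empty prefix of a against prefixes of b.
    prev = [j if j <= band else INF for j in range(m + 1)]
    for i in range(1, n + 1):
        cur = [INF] * (m + 1)
        lo, hi = max(0, i - band), min(m, i + band)
        for j in range(lo, hi + 1):
            if j == 0:
                cur[0] = i  # prefix of a aligned entirely against gaps
                continue
            cur[j] = min(prev[j - 1] + (a[i - 1] != b[j - 1]),  # match or mismatch
                         prev[j] + 1,                            # gap in b
                         cur[j - 1] + 1)                         # gap in a
        prev = cur
    return prev[m]
```

A narrow band can overestimate the cost: banded_nw("ACGT", "CGTA", 0) is forced onto the diagonal and returns 4, while a band of 1 finds the two-indel alignment with cost 2.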
Tiled Algorithm
A Gibbs sampling strategy applied to
the mapping of ambiguous short-sequence tags.
Gibbs Sampling for Ambiguous Seq



Maps ambiguous tags to individual genomic sites in two steps:
Mapping of ambiguous tags
Calculating the likelihood ratio (LR) for each site

For each mapped site, the number of co-located tags is
counted. This count is used to calculate the likelihood
ratio (LR).

A higher likelihood ratio means higher confidence; the ratio increases nonlinearly with tag counts.
LR_j = P_s(k_j) / P_n(k_j), where k_j is the number of tags co-located at site j



The LR is calculated from conditional probabilities.
These two steps are circular (each depends on the result of the other), which led to the adoption of Gibbs sampling.
For some set of ambiguous tags (σ), it reaches
relative entropy between Ps and Pn.
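The circular dependency between mapping tags and computing the LR can be illustrated with a small Gibbs sampler. This is a sketch under stated assumptions rather than the published method: the slides do not define Ps and Pn, so both are modeled here as Poisson distributions with hypothetical means, "co-located" is simplified to "at exactly the same site", and the names likelihood_ratio and gibbs_map are invented for this example.

```python
import math
import random
from collections import Counter


def likelihood_ratio(k, lam_signal=5.0, lam_noise=0.5):
    """LR = Ps(k) / Pn(k), with both modeled as Poisson (an assumption)."""
    return math.exp(lam_noise - lam_signal) * (lam_signal / lam_noise) ** k


def gibbs_map(unique_sites, ambiguous, sweeps=200, seed=0):
    """Assign each ambiguous tag to one of its candidate sites.

    unique_sites: sites of uniquely mapped tags (fixed throughout).
    ambiguous: one list of candidate sites per ambiguous tag.
    Returns the most frequently sampled site for each ambiguous tag.
    """
    rng = random.Random(seed)
    counts = Counter(unique_sites)
    # Start from a random assignment, as a random-selection baseline would.
    assign = [rng.choice(cands) for cands in ambiguous]
    for site in assign:
        counts[site] += 1
    tallies = [Counter() for _ in ambiguous]
    for _ in range(sweeps):
        for t, cands in enumerate(ambiguous):
            counts[assign[t]] -= 1  # take tag t out before scoring its sites
            weights = [likelihood_ratio(counts[c]) for c in cands]
            assign[t] = rng.choices(cands, weights=weights)[0]
            counts[assign[t]] += 1
            tallies[t][assign[t]] += 1
    return [tally.most_common(1)[0][0] for tally in tallies]
```

Each tag is resampled conditional on the current placement of all other tags, so sites that already hold many tags attract ambiguous tags with rapidly growing probability.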
Comparison


Compared against the method used by the MAQ software, which
randomly selects a site for each ambiguous tag.
A comparison on eight sequence tag libraries (20 bp
and 35 bp tags) shows that Gibbs sampling
correctly maps 49% to 71% of ambiguous tags, versus 8%
to 23% for the MAQ method.
Thank you for listening.
Results
We found that GASSST achieves high sensitivity in a wide range of
configurations
and faster overall execution time than other state-of-the-art aligners.