Presentation:
Kevin Charles
Paruchuri Padmavathi
Department of Computer Science
UTSA
11/1/2010
Introduction
GASSST: Global Alignment Short Sequence Search Tool
A Gibbs sampling strategy applied to the mapping of ambiguous short-sequence tags
2
GASSST: GLOBAL ALIGNMENT SHORT SEQUENCE SEARCH TOOL
3
Current Sequence Aligners
Next-generation sequencing machines are able to produce huge amounts of data
Common techniques often restrict indels in the alignment to improve speed
Flexible aligners are too slow for large-scale applications
4
GASSST
The goal of GASSST is thus two-fold: achieving high performance with no restriction on the number of indels, with a design that is still effective on long reads.
The method is comparable to BLAST, with a new, efficient filtering step that discards most of the alignments coming from the seed phase.
A carefully designed series of filters of increasing complexity and efficiency quickly eliminates most candidate alignments.
The algorithm uses a small precomputed table of 64 KB, which easily fits into cache memory.
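To make the 64 KB figure concrete, here is a minimal Python sketch of one way such a table could be laid out. The slides only state the table size; the layout below is an assumption for illustration: one byte per pair of 4-mers (256 x 256 = 65,536 entries), each holding the edit distance between the two 4-mers. The names encode, build_table and tile_score are hypothetical and are not taken from GASSST.

```python
from itertools import product

BASES = "ACGT"
K = 4  # assumed tile length: 4**4 * 4**4 = 65,536 table entries


def encode(kmer: str) -> int:
    """Pack a 4-mer into an 8-bit integer, 2 bits per base."""
    code = 0
    for base in kmer:
        code = (code << 2) | BASES.index(base)
    return code


def edit_distance(a: str, b: str) -> int:
    """Plain dynamic-programming edit distance (substitutions and indels)."""
    prev = list(range(len(b) + 1))
    for i, ca in enumerate(a, start=1):
        cur = [i]
        for j, cb in enumerate(b, start=1):
            cur.append(min(prev[j] + 1,                 # deletion
                           cur[j - 1] + 1,              # insertion
                           prev[j - 1] + (ca != cb)))   # match or mismatch
        prev = cur
    return prev[-1]


def build_table() -> bytearray:
    """Precompute one byte per (4-mer, 4-mer) pair: 65,536 bytes = 64 KB."""
    kmers = ["".join(p) for p in product(BASES, repeat=K)]
    table = bytearray(len(kmers) ** 2)
    for a in kmers:
        for b in kmers:
            table[(encode(a) << 8) | encode(b)] = edit_distance(a, b)
    return table


TABLE = build_table()  # assumed layout, built once at start-up


def tile_score(a: str, b: str) -> int:
    """Look up the precomputed distance between two 4-mers."""
    return TABLE[(encode(a) << 8) | encode(b)]
```

With this layout, scoring a pair of 4-mers during filtering is a single array lookup instead of a dynamic-programming computation, which is the kind of operation that benefits from a table small enough to stay in cache.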
5
The last step, extend, receives the alignments that passed the filter step.
Each alignment is computed using a traditional banded Needleman-Wunsch (NW) algorithm.
Significant alignments are then printed with their full description.
Provides a lower bound only
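The banded NW computation used in the extend step can be sketched as follows. This is a minimal illustration, not the GASSST implementation: it assumes unit costs (1 per mismatch, insertion or deletion) and a hypothetical band parameter giving the maximum allowed offset between read and reference positions. Only cells with |i - j| <= band are filled in.

```python
from typing import Optional


def banded_nw(read: str, ref: str, band: int) -> Optional[int]:
    """Global edit distance restricted to cells with |i - j| <= band.

    Returns None when the end cell lies outside the band, i.e. when the
    length difference alone exceeds the allowed number of indels.
    """
    n, m = len(read), len(ref)
    if abs(n - m) > band:
        return None
    inf = float("inf")
    # Row 0: aligning an empty read prefix against j reference bases.
    prev = {j: j for j in range(min(m, band) + 1)}
    for i in range(1, n + 1):
        cur = {}
        for j in range(max(0, i - band), min(m, i + band) + 1):
            if j == 0:
                cur[j] = i  # i read bases against an empty reference prefix
                continue
            diag = prev.get(j - 1, inf) + (read[i - 1] != ref[j - 1])
            up = prev.get(j, inf) + 1        # gap in the reference
            left = cur.get(j - 1, inf) + 1   # gap in the read
            cur[j] = min(diag, up, left)
        prev = cur
    return prev[m]
```

For example, banded_nw("ACGT", "AGT", band=1) aligns the two strings with a single deletion, while band=0 forbids indels entirely and only counts mismatches along the main diagonal. The work per read grows with the band width rather than with the full reference length, which is why banding keeps the extend step cheap.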
6
Tiled Algorithm
7
A Gibbs sampling strategy applied to the mapping of ambiguous short-sequence tags.
8
Gibbs Sampling for Ambiguous Seq
Maps ambiguous tags to individual genomic sites.
Mapping of ambiguous tags
Calculating the likelihood ratio (LR) for each site
For each mapped site, the number of co-located tags is counted. This count is used to calculate the likelihood ratio.
A higher likelihood ratio means higher confidence; the ratio increases nonlinearly with the tag count.
LR_j = P_s(k_j) / P_n(k_j), where k_j is the number of tags co-located at site j
The LR is computed from conditional probabilities.
The two steps are circular (each depends on the result of the other), which led to the adoption of Gibbs sampling.
For some set of ambiguous tags (σ), it reaches
relative entropy between Ps and Pn.
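The circular dependency between mapping and LR calculation can be illustrated with a small Gibbs-style sampler. This is a sketch under stated assumptions, not the published algorithm: P_s and P_n are modeled here as Poisson distributions with made-up rates (lam_signal, lam_noise), and each ambiguous tag is reassigned to one of its candidate sites with probability proportional to LR_j = P_s(k_j) / P_n(k_j), where k_j counts the tags at site j including the tag being placed. All function and parameter names are hypothetical.

```python
import math
import random
from collections import Counter


def poisson_pmf(k: int, lam: float) -> float:
    """Probability of observing k tags under a Poisson model with rate lam."""
    return math.exp(-lam) * lam ** k / math.factorial(k)


def likelihood_ratio(k: int, lam_signal: float = 8.0, lam_noise: float = 1.0) -> float:
    """LR = P_s(k) / P_n(k); the Poisson rates are illustrative assumptions."""
    return poisson_pmf(k, lam_signal) / poisson_pmf(k, lam_noise)


def gibbs_map(unique_counts, ambiguous, n_iter=200, seed=0):
    """Assign each ambiguous tag to one of its candidate sites.

    unique_counts: dict mapping site -> number of uniquely mapped tags
    ambiguous: list of candidate-site lists, one list per ambiguous tag
    """
    rng = random.Random(seed)
    counts = Counter(unique_counts)
    assignment = []
    # Start from a random assignment (the baseline strategy).
    for candidates in ambiguous:
        site = rng.choice(candidates)
        assignment.append(site)
        counts[site] += 1
    for _ in range(n_iter):
        for t, candidates in enumerate(ambiguous):
            counts[assignment[t]] -= 1  # take tag t out of its current site
            # Weight each candidate by the LR of its count with tag t added.
            weights = [likelihood_ratio(counts[s] + 1) for s in candidates]
            site = rng.choices(candidates, weights=weights, k=1)[0]
            assignment[t] = site
            counts[site] += 1
    return assignment, counts
```

A call such as gibbs_map({"siteA": 5, "siteB": 0}, [["siteA", "siteB"]] * 4) tends to place the ambiguous tags on siteA, because the LR grows quickly with the number of co-located tags. Each reassignment changes the counts, which in turn change the LRs used for the next tag, and that feedback is what the sampler iterates over.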
9
10
Comparison
Compared against the method used by the MAQ software, which randomly selects a site for each ambiguous tag.
A comparison on eight sequence tag libraries (20 bp and 35 bp tags) shows that Gibbs sampling correctly maps 49% to 71% of ambiguous tags, versus 8% to 23% for the MAQ method.
11
12
Thank you for listening.
13
Results
We found that GASSST achieves high sensitivity in a wide range of configurations and faster overall execution time than other state-of-the-art aligners.
14