Bioperl modules - Ohio State University


Parsing BLAST output

Output of a local BLAST search

The “less” program is given the full path to the BLAST output file.

Annotated parts of the output: the BLAST program used for the search; the reference; information on the query sequence; information on the database; a one-line summary of the search results; and detailed information for the first 2 HSPs of the first hit (accession number, description, organism, score, E value, identities, positives, and alignment).

Sample BLAST output (continued): HSP information from the first hit.

Press “q” to quit the “less” viewing mode

The size of a BLAST output file is limited only by the free disk space on your computer. A very large text file is impractical to open in an ordinary editor, let alone to read through line by line.
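A large report can still be sized up from the command line without opening it. The following is a minimal sketch, assuming standard NCBI text output in which each query's report begins with a `Query=` line; the file name `blast_output.txt` is hypothetical, and the first lines only create a tiny placeholder file so the example is self-contained.

```shell
# Work in a scratch directory; the file name is hypothetical.
workdir=$(mktemp -d)
cd "$workdir" || exit 1

# Tiny stand-in for a BLAST report; placeholder content, not real output.
printf 'BLASTP\n\nQuery= seq1\n\nQuery= seq2\n' > blast_output.txt

# Size on disk and number of lines, without loading the file into an editor.
ls -lh blast_output.txt
line_count=$(wc -l < blast_output.txt | tr -d ' ')

# Assumption: each query's report starts with a "Query=" line.
query_count=$(grep -c '^Query=' blast_output.txt)

# Look at only the first few lines.
head -n 3 blast_output.txt
echo "lines: $line_count, queries: $query_count"
```

On a real report, only the `ls`, `wc`, `grep`, and `head` lines are needed, pointed at your own file.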

The purpose of parsing BLAST output is to extract user-defined information from the BLAST output file for clear visualization and summarization.

Search result parsing

The Bio::SearchIO system was designed for parsing the output of sequence database searches (BLAST, sim4, waba, FASTA, HMMER, exonerate, etc.).

One-line summary of the search results.

Annotated parts of the script: load the Bio::SearchIO module; usage information (it appears if the program is invoked without arguments); define the class; print out the header information; process each result; process each hit; process each HSP; control the number of hits to be extracted; an indicator showing that the work is done.

Change directory (cd) to where the Perl script and the BLAST output file are stored, and confirm that both are in place.

Oops… an error message. It is caused by a Windows/Unix incompatibility in the file format of the script.

Find the file in Windows and open it with Notepad++.

Select “convert to UNIX format” in the “Format” drop-down menu. After the conversion, save the file and exit Notepad++.
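For readers working directly on the Unix side, a similar conversion can be sketched from the command line. This is not the method shown in the slides, only a minimal sketch that assumes the problem is Windows line endings (CR+LF). The file names `parse_blast.pl` and `parse_blast_unix.pl` are hypothetical, and the first `printf` only creates a stand-in file so the example is self-contained.

```shell
# Work in a scratch directory; all file names here are hypothetical.
workdir=$(mktemp -d)
cd "$workdir" || exit 1

# Stand-in for a script saved on Windows: every line ends in CR+LF.
printf '#!/usr/local/bin/perl\r\nprint "hello\\n";\r\n' > parse_blast.pl

# Count lines that contain a carriage return (non-zero suggests Windows format).
cr=$(printf '\r')
before=$(grep -c "$cr" parse_blast.pl || true)

# Strip the carriage returns, writing a Unix-format copy.
tr -d '\r' < parse_blast.pl > parse_blast_unix.pl

# Count again on the converted copy.
after=$(grep -c "$cr" parse_blast_unix.pl || true)
echo "lines with CR before: $before, after: $after"
```

Writing to a new file instead of overwriting keeps the original available in case the conversion is not what was needed.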

Another error message. This is because the Perl interpreter is installed in one location (/usr/bin/), while the script looks for it in another (/usr/local/bin).

Solution: create a symbolic link to /usr/bin/perl in /usr/local/bin. Command: ln -s /usr/bin/perl /usr/local/bin/perl. Now it’s working!

Congratulations! You’ve just parsed a BLAST output!
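The link creation can be rehearsed safely before touching /usr/local/bin. The following is a minimal sketch, assuming a scratch directory stands in for /usr/local/bin so that no administrator rights are needed; the variable names are illustrative only.

```shell
# Scratch directory standing in for /usr/local/bin (an assumption for this demo).
scratch=$(mktemp -d)

# Locate the installed interpreter; fall back to the path named in the slides.
perl_path=$(command -v perl || echo /usr/bin/perl)

# Same form as the real command: ln -s <existing file> <new link name>.
ln -s "$perl_path" "$scratch/perl"

# Confirm where the link points.
link_target=$(readlink "$scratch/perl")
echo "$scratch/perl -> $link_target"
```

On the real system, creating the link in /usr/local/bin usually requires administrator rights.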

This is the file you’ve just generated.

Let’s see what the file looks like, using “less”.

Here is what it looks like.

The parsed output is tab-delimited and can be imported into Excel for better visualization.
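Before importing, it can help to confirm that the file really is a regular tab-delimited table. The following is a minimal sketch: the file name `parsed.tsv`, the column names, and the rows are all placeholders invented for illustration, not the output of the script in the slides.

```shell
# Work in a scratch directory; the file name is hypothetical.
workdir=$(mktemp -d)
cd "$workdir" || exit 1

# Placeholder rows only; real column names and values come from your own script.
printf 'query\thit\tevalue\n' > parsed.tsv
printf 'Q1\tH1\t1e-50\n' >> parsed.tsv
printf 'Q1\tH2\t2e-10\n' >> parsed.tsv

# Distinct numbers of tab-separated fields per line; one value means a regular table.
field_counts=$(awk -F'\t' '{ print NF }' parsed.tsv | sort -u)
echo "fields per line: $field_counts"

# Pull out the first and third columns only.
cut -f1,3 parsed.tsv
```

If more than one field count is reported, some line has a stray or missing tab and the columns may shift when the file is imported.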

Locate the file in Windows.

Annotated parts of the table: the header row; the query sequence; the accession numbers of the top 3 hits; the E values of the top 3 hits; the descriptions of the top 3 hits; and information on each HSP of the top 3 hits.