Customizing output of BLAST?
I know this is a very specific question relating to BLAST and Bioinformatics but here goes:

I am attempting to use standalone BLAST (I have already downloaded it and tested it on the command line) to perform a DNA sequence alignment (blastn). I need to be able to provide both my own query file (FASTA format) and my own database file (also FASTA format).

The key is that I want to have the program only output 2 fields rather than the detailed reports that it usually outputs. I only want the highest score and the e-value for the alignment to be output. The idea is that once I have this working, I can wrap this in my own control program and automatically run it many times with different query sequences and log the scores and e-values.

I know this is a long shot, but does anybody have an idea on how I can go about doing this? The two hurdles for me are using my own database file and customizing the output.

Best answer

In fact it's simple: blastall has several command-line options that will help you:

  • to output only the single strongest hit for each query: -v 1 -b 1
  • to output in table format: -m 8

So you'll be running something like this:

blastall -p blastn -i queries.fasta -d database -v 1 -b 1 -m 8 > resultTable.txt

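The table output has several columns, however. In the standard -m 8 layout there are twelve tab-separated columns: query ID, subject ID, percent identity, alignment length, mismatches, gap openings, query start, query end, subject start, subject end, e-value and bit score. You can use the cut tool to select only your columns of interest; tab is cut's default delimiter, so no -d option is needed. For example, the following command would select only columns 1, 7 and 8 (query ID, query start and query end) from the BLAST output. For the e-value and bit score you would use -f 11,12 instead.

cut -f 1,7,8 < resultTable.txt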
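If you would rather not rely on -v 1 -b 1 alone, here is a rough sketch of picking out the best-scoring row per query yourself with awk. It assumes the standard 12-column -m 8 layout (column 11 = e-value, column 12 = bit score); best_hits is just a name I made up for the helper, not part of BLAST.

```shell
# best_hits: read -m 8 tabular BLAST output on stdin and print
# "query_id<TAB>e-value<TAB>bit_score" for the highest-scoring hit of each query.
# Assumption: 12 tab-separated columns, e-value in column 11, bit score in column 12.
best_hits() {
  awk -F '\t' '
    NF < 12 { next }                              # skip blank or malformed lines
    !($1 in best) || $12 + 0 > best[$1] + 0 {     # first hit, or a better bit score
      if (!($1 in best)) order[++n] = $1          # remember first-seen query order
      best[$1] = $12
      evalue[$1] = $11
    }
    END {
      for (i = 1; i <= n; i++)
        printf "%s\t%s\t%s\n", order[i], evalue[order[i]], best[order[i]]
    }
  '
}

# Example use (requires a formatted database):
#   blastall -p blastn -i queries.fasta -d database -m 8 | best_hits > scores.txt
```

The helper only reads text, so it can be tried on a saved resultTable.txt before wiring it to blastall.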

yannick

Answers

Yannick's answer covers how to get the specific output you need from blastall. The second thing you're concerned about is using your own database file, and standalone BLAST provides the tools you need for this too.

Along with blastall, you should also have a copy of a program called formatdb. You can give it your FASTA sequence database, and it will format it correctly for BLAST. For a nucleotide database, run the following:

formatdb -i input_database.fa -p F

This will produce a number of files in your working directory (input_database.fa.nhr, input_database.fa.nin, input_database.fa.nsq), which you can use in your blastall command by passing the original name of your database to -d (i.e. leave off the .n* suffix).

HTH

PS formatdb -h will give you a full list of options for formatdb
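For the "wrap it in a control program" part of the question, something along these lines might be a starting point. It is only a sketch: log_blast_scores is a made-up name, it assumes each query file holds a single sequence, that the database has already been formatted with formatdb, and that the first row of the -m 8 table is the strongest hit.

```shell
# log_blast_scores DB QUERY_FILE...
# Runs blastn for each query file against DB and prints one line per file:
# "query_file<TAB>e-value<TAB>bit_score" for its top hit.
# Assumptions: one sequence per query file; DB was prepared with formatdb -p F;
# the first table row is the best hit (columns 11 and 12 of -m 8 output).
log_blast_scores() {
  db=$1
  shift
  for query in "$@"; do
    blastall -p blastn -i "$query" -d "$db" -v 1 -b 1 -m 8 |
      awk -F '\t' -v q="$query" 'NR == 1 { printf "%s\t%s\t%s\n", q, $11, $12 }'
  done
}

# Example use:
#   formatdb -i input_database.fa -p F
#   log_blast_scores input_database.fa query1.fasta query2.fasta > scores.log
```

A query with no hits produces no line at all in this version, so you may want to log those separately.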




