How to Blast sequences against a genome

1. Open a DOS window (e.g. with the Run command).

2. Type the following command to run Blast:

        blastp -db databaseName -query contigFile -out filename -evalue e-value

    For example:

        blastp -db octdata -query maydata.fna -out myResults.txt -evalue .00001
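
If you would rather launch the run from a script than type it each time, here is a minimal Python sketch. It assumes Python is installed and that blastp can be found from the directory you run it in; neither is required by the steps here. The function names build_blast_command and run_blast are made up for this illustration.

```python
import subprocess


def build_blast_command(db, query, out, evalue):
    """Return the blastp argument list, in the same order as the command in step 2."""
    return [
        "blastp",
        "-db", db,
        "-query", query,
        "-out", out,
        "-evalue", str(evalue),
    ]


def run_blast(db, query, out, evalue):
    """Run blastp and wait until it finishes.

    Assumes blastp is on the PATH or in the current directory.
    Raises subprocess.CalledProcessError if blastp exits with an error.
    """
    subprocess.run(build_blast_command(db, query, out, evalue), check=True)
```

Calling run_blast("octdata", "maydata.fna", "myResults.txt", ".00001") would then be the equivalent of the example command. Like the command itself, it shows no progress while it waits.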

3. Be prepared to wait. With only a few contigs, you shouldn't have to wait more than a few tens of seconds, but with the number of sequences we are using, the output may take hours to arrive. The program gives no indication of its progress; it simply returns you to the DOS prompt (>) when it's done.

4. The output may be modest when comparing two small sequences, but with many sequences you can fill your disk drive with a large amount of output (dozens of megabytes).
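
To keep an eye on disk usage, something like the following Python sketch can report how big an output file is and how much free space is left. It assumes Python is installed, which the steps here do not require; file_megabytes and free_megabytes are names invented for this example.

```python
import os
import shutil

BYTES_PER_MB = 1024 * 1024


def file_megabytes(path):
    """Return the size of the file at `path` in megabytes."""
    return os.path.getsize(path) / BYTES_PER_MB


def free_megabytes(directory="."):
    """Return the free space, in megabytes, on the drive that holds `directory`."""
    return shutil.disk_usage(directory).free / BYTES_PER_MB
```

For example, file_megabytes("myResults.txt") gives the size of the results file from step 2, and free_megabytes() gives the room left on the current drive.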

5. How do you know whether the program worked? If you have a large output file (i.e. dozens of megabytes), don't try to open it in something like Word (you risk choking it). I don't think that Microsoft has any solution for us, but there is an ancient freeware program from the pre-Windows era that will do the job: DR (short for DiRectory). Download DR and put it in the Blast directory.

6. To run DR, type DR at a DOS prompt. This gives you a list of the files in \Blast. Press the F10 key to sort the files by date of creation, then press the End key to go to the end of the list, where you should see the file you just made. Press the Enter key to see the contents of the file (you can scroll through the file using the usual keys).
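
If DR is not at hand, another way to peek at a huge file without loading all of it is to read only its first few lines. The Python sketch below does that. It assumes Python is installed, and the function name head is just an illustrative choice.

```python
from itertools import islice


def head(path, n=20):
    """Return the first n lines of a text file without reading the rest of it."""
    # errors="replace" keeps odd bytes in the file from stopping the read.
    with open(path, "r", errors="replace") as handle:
        return [line.rstrip("\n") for line in islice(handle, n)]
```

For example, print("\n".join(head("myResults.txt"))) shows the top of the results file from step 2, however large the file is.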

7. Whichever way you view the output file, its beginning should look something like this:

        BLASTP 2.2.9 [May-01-2004]


        Reference: Altschul, Stephen F., Thomas L. Madden, Alejandro A.
        Schaffer, Jinghui Zhang, Zheng Zhang, Webb Miller, and David J.
        Lipman (1997),
        "Gapped BLAST and PSI-BLAST: a new generation of protein database search programs", Nucleic Acids Res. 25:3389-3402.

        Query= Contig240-R (500 letters)

        Database: octdata.fna
                   1 sequences; 2,160,837 total letters

If so, you win!
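
The same check can be roughed out in Python. The sketch below looks only for the three landmarks visible in the sample: a banner line starting with BLAST, a Query= line, and a Database: line. It is a rough sanity test under those assumptions, not a real parser of Blast output, and the function name looks_like_blast_output is made up for this example.

```python
def looks_like_blast_output(path, max_lines=50):
    """Return True if the first max_lines lines hold a BLAST banner, a Query= line and a Database: line."""
    found_banner = False
    found_query = False
    found_database = False
    with open(path, "r", errors="replace") as handle:
        for index, line in enumerate(handle):
            if index >= max_lines:
                break  # only the top of the file is inspected
            text = line.strip()
            if text.startswith("BLAST"):
                found_banner = True
            elif text.startswith("Query="):
                found_query = True
            elif text.startswith("Database:"):
                found_database = True
    return found_banner and found_query and found_database
```

A result of True for looks_like_blast_output("myResults.txt") means only that the top of the file has the expected shape; it says nothing about the quality of the hits further down.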