CrocoBLAST:Terminology

From WebChemistry Wiki
Revision as of 23:34, 23 July 2016

There are a few basic terms you need to keep in mind when running BLAST within CrocoBLAST.

Input file and Database

In essence, BLAST takes an unknown nucleotide or protein sequence, tries to align it against a set of reference sequences, and reports the score of each alignment to help you identify the unknown sequence. In practice, this means taking an input file with many query sequences and aligning each query sequence against a database of known sequences. Such databases are typically stored in public repositories such as NCBI, or may be produced in-house.

Therefore, in order to run BLAST, you will need to specify an input file containing the query sequences, and a database file containing the reference sequences. CrocoBLAST accepts input files in FASTA and FASTQ format. BLAST requires the database to be in its own specific database format. You may supply the database file either already in that format, or in FASTA or FASTQ format, in which case it will be converted to the database format before BLAST is run. Within CrocoBLAST, you may also download databases directly from the NCBI server.
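
As an illustration of the two accepted input formats, the following minimal Python sketch guesses whether a piece of sequence text is FASTA or FASTQ. It relies only on the fact that FASTA records begin with ">" and FASTQ records begin with "@". The function name detect_sequence_format is hypothetical and is not part of CrocoBLAST, which may well perform this check differently.

```python
def detect_sequence_format(text: str) -> str:
    """Guess 'fasta' or 'fastq' from the first non-blank line of the text.

    Hypothetical helper for illustration only; not a CrocoBLAST function.
    """
    for line in text.splitlines():
        stripped = line.strip()
        if not stripped:
            continue  # skip leading blank lines
        if stripped.startswith(">"):
            return "fasta"  # FASTA header lines start with '>'
        if stripped.startswith("@"):
            return "fastq"  # FASTQ header lines start with '@'
        break  # first non-blank line is neither kind of header
    raise ValueError("unrecognized sequence format")


if __name__ == "__main__":
    print(detect_sequence_format(">query_1\nACGTACGT\n"))
    print(detect_sequence_format("@read_1\nACGT\n+\nIIII\n"))
```

A check like this only inspects the first record, so it is a quick sanity test rather than a full validation of the file.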

BLAST program

Depending on the nature of the query and reference sequences, there are several BLAST programs you may use within CrocoBLAST:

  • blastp - compares an amino acid query sequence against a protein sequence database
  • blastn - compares a nucleotide query sequence against a nucleotide sequence database
  • blastx - compares a nucleotide query sequence translated in all reading frames against a protein sequence database
  • tblastn - compares a protein query sequence against a nucleotide sequence database dynamically translated in all reading frames
  • tblastx - compares the six-frame translations of a nucleotide query sequence against the six-frame translations of a nucleotide sequence database

Therefore, in order to run BLAST, you will need to indicate which BLAST program you intend to use.
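
The choice among the five programs listed above follows from the type of the query and the type of the database. The Python sketch below encodes that mapping. The function choose_blast_program and its translate_both parameter are hypothetical names introduced here for illustration; they are not part of CrocoBLAST or BLAST.

```python
# (query type, database type) -> BLAST program, as listed above
_PROGRAMS = {
    ("protein", "protein"): "blastp",
    ("nucleotide", "nucleotide"): "blastn",
    ("nucleotide", "protein"): "blastx",
    ("protein", "nucleotide"): "tblastn",
}


def choose_blast_program(query_type: str, database_type: str,
                         translate_both: bool = False) -> str:
    """Return the BLAST program name for a query/database combination.

    translate_both selects tblastx, which compares six-frame translations
    of a nucleotide query against those of a nucleotide database.
    """
    key = (query_type, database_type)
    if translate_both:
        if key != ("nucleotide", "nucleotide"):
            raise ValueError("tblastx needs a nucleotide query and database")
        return "tblastx"
    if key not in _PROGRAMS:
        raise ValueError(f"unsupported combination: {key}")
    return _PROGRAMS[key]


if __name__ == "__main__":
    print(choose_blast_program("nucleotide", "protein"))
```

Note that a nucleotide query against a nucleotide database has two possible programs (blastn and tblastx), which is why the sketch needs the extra flag.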

Job

Within CrocoBLAST, a job is defined by the BLAST program, the database, the input file, and the output location (folder). When created, each job receives a unique job ID that can be referenced whenever you wish to perform an operation on that job.
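
To make the definition concrete, a job can be pictured as a small record holding these four items plus an ID. The Python sketch below is only a mental model: the Job class, its field names, and the use of a random UUID as the job ID are assumptions made for illustration, and CrocoBLAST's actual ID scheme and internal representation may differ.

```python
import uuid
from dataclasses import dataclass, field
from pathlib import Path


@dataclass(frozen=True)
class Job:
    """Illustrative model of a job; not CrocoBLAST's real data structure."""
    program: str          # e.g. "blastn"
    database: Path        # reference sequences
    input_file: Path      # query sequences (FASTA or FASTQ)
    output_folder: Path   # where results are written
    # Assumption: a random UUID stands in for the unique job ID.
    job_id: str = field(default_factory=lambda: uuid.uuid4().hex)


if __name__ == "__main__":
    job = Job("blastn", Path("refs_db"), Path("queries.fasta"), Path("results"))
    jobs = {job.job_id: job}         # look jobs up by their ID
    print(jobs[job.job_id].program)  # operate on a job via its ID
```

Keying the collection by job_id mirrors the idea that any later operation on a job only needs its ID.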

Queue