Jump to content

CrocoBLAST:Terminology

From WebChemistry Wiki
Revision as of 23:32, 23 July 2016 by Crina (talk | contribs) (Created page with "There are a few basic terms you need to keep in mind when running BLAST within CrocoBLAST. =Input file and Database= It its essence, BLAST takes an unknown nucleotide or prot...")
(diff) ← Older revision | Latest revision (diff) | Newer revision → (diff)

There are a few basic terms you need to keep in mind when running BLAST within CrocoBLAST.

Input file and Database

It its essence, BLAST takes an unknown nucleotide or protein sequence, tries to align it against a set of reference sequences, and then reports the score of each alignment, in an effort to help you identify the unknown sequence. In practice, this translates into taking an input file with many query sequences, and aligning each of the query sequences against a database of known sequences. Such databases are typically stored in suitable repositories such as NCBI, or may be obtained in-house.

Therefore, in order to run BLAST, you will need to specify an input file containing the query sequences, and a database file containing the reference sequences. CrocoBLAST accepts input files in FASTA and FASTQ format. BLAST uses a specific database format for database file. You may indicate the database file either in database format or in FASTA or FASTQ format, which will be converted to database format before BLAST is run. Within CrocoBLAST you may directly download databases from the NCBI server.

BLAST program

Depending on the nature of the query and reference sequences, there are several BLAST programs you may use within CrocoBLAST:

  • blastp - compares an amino acid query sequence against a protein sequence database
  • blastn - compares a nucleotide query sequence against a nucleotide sequence database
  • blastx - compares a nucleotide query sequence translated in all reading frames against a protein sequence database
  • tblastn - compares a protein query sequence against a nucleotide sequence database dynamically translated in all reading frames
  • tblastx - compares the six-frame translations of a nucleotide query sequence against the six-frame translations of a nucleotide sequence database

Therefore, in order to run BLAST, you will need to indicate which BLAST program you intend to use.

Job

Queue