Step 4: Confirming Your Frame! (Part II)

OK, so you think you have an ATP synthase... or you should by now, anyway. Now we can use the most common tool for identifying a sequence: BLAST. A BLAST search takes your sequence and compares it against all the sequences in the GenBank repository, one of the largest such databases in the world. The search returns the database sequences most similar to the one you submitted, with the most similar hits at the top of the list.

Each hit is reported with a Score, an E-value and a % Identity. With Score, the larger the number, the closer your sequences are. With E-value, the lower the number, the less likely it is that the match arose by chance, and so the more similar your sequences are. With identity, the higher the percentage, the more similar your sequences are.
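To make these three numbers a little more concrete, here is a minimal Python sketch, not BLAST's actual code. It assumes identity is counted as identical columns divided by alignment length, and it uses the standard approximation E = m x n x 2^(-bit score), where m and n are the query and database lengths. The function names and the example sequences are made up for illustration, and real BLAST applies length corrections that are ignored here.

```python
def percent_identity(aligned_a: str, aligned_b: str) -> float:
    """Percent of alignment columns where both sequences have the same residue.

    Both inputs are rows of one pairwise alignment, with '-' marking gaps.
    """
    if len(aligned_a) != len(aligned_b):
        raise ValueError("aligned sequences must have the same length")
    if not aligned_a:
        raise ValueError("alignment is empty")
    # A gap never counts as a match, even if both rows show '-'.
    matches = sum(1 for a, b in zip(aligned_a, aligned_b) if a == b and a != "-")
    return 100.0 * matches / len(aligned_a)


def expected_hits(bit_score: float, query_len: int, db_len: int) -> float:
    """Approximate E-value: hits this good expected by chance alone.

    Simplified form E = m * n * 2**(-bit_score); the effective-length
    corrections used by real BLAST are left out (assumption).
    """
    return query_len * db_len * 2.0 ** (-bit_score)


if __name__ == "__main__":
    # Hypothetical six-column alignment: four identical columns out of six.
    print(f"{percent_identity('MKT-LL', 'MKTALV'):.1f}% identity")
    # Raising the bit score by 1 halves the E-value.
    print(expected_hits(50, 100, 1_000_000))
    print(expected_hits(51, 100, 1_000_000))
```

Notice that the E-value depends on the size of the database as well as on the score, which is why the same alignment can report a different E-value when searched against a smaller set of sequences.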
There are multiple kinds of BLAST.
BLAST searches can be limited to specific categories using the "Limit by entrez query" field. For instance, if one is specifically interested in reductases, one can search only for entries with "reductase" in them. One can BLAST against only a specific organism or taxon by specifying "name[Organism]", e.g. "escherichia coli[Organism]". Multiple-word terms require quotes, and Boolean logic is accepted by the engine when making limits (this AND "that OR this" NOT the other thing).
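If you find yourself typing these limits often, a small helper can assemble the string for you. The Python sketch below is only an illustration: the function names are hypothetical, it handles just AND and NOT, and it assumes that a term already ending in a field tag such as [Organism] should be passed through unquoted, as in the example above. Paste the result into the "Limit by entrez query" field.

```python
def quote_if_needed(term: str) -> str:
    """Wrap a multiple-word term in quotes; leave field-tagged terms alone."""
    term = term.strip()
    if " " in term and not term.endswith("]"):
        return f'"{term}"'
    return term


def build_limit(include, exclude=()):
    """Join required terms with AND, then append each excluded term with NOT."""
    if not include:
        raise ValueError("at least one required term is needed")
    query = " AND ".join(quote_if_needed(t) for t in include)
    for term in exclude:
        query += " NOT " + quote_if_needed(term)
    return query


if __name__ == "__main__":
    # Reductases from E. coli, skipping entries labeled as hypothetical proteins.
    print(build_limit(["reductase", "escherichia coli[Organism]"],
                      exclude=["hypothetical protein"]))
```

The example prints: reductase AND escherichia coli[Organism] NOT "hypothetical protein"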
When you BLAST your protein sequence, your search will return hits that look like the ones below. Notice how the most prominent hits are all the expected type of protein, ATP synthases. Also note how many of the hits return with an E-value of zero; this means the match is so strong that the number of equally good hits expected by chance rounds to zero, so BLAST sees essentially no chance that these proteins are unrelated. ATP synthase is an essential and strongly conserved enzyme, so it shouldn't be a big shocker to see a bunch of highly similar hits. Please do not close this window, as you will need it to select sequences for the next part.

