Research Simulation Scenario
Identifying the extent of genes
Something is very wrong -- the start codons?
You didn't expect such a ragged collection of sequences from this highly conserved protein, and you don't quite believe it. Gene callers sometimes miscall the beginning of a gene, but it's difficult to believe that they have missed it in the majority of the instances of this protein.
Nonetheless, you decide to take a look at the sequences upstream from the annotated start sites, to see whether part of the true protein sequence might lie in these regions. So, for each gene, you obtain the amino acid sequence extended backwards from the annotated start site to the first upstream in-frame stop codon.
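The backwards extension can be sketched in a few lines of Python. This is only an illustration, not the tool used in the scenario: the function name `upstream_extension`, the 0-based `start` coordinate, and the toy sequence are all assumptions, and the sketch handles forward-strand genes only (a reverse-strand gene would need to be reverse-complemented first). It uses the standard genetic code, with stop codons TAA, TAG and TGA.

```python
from itertools import product

# Standard genetic code, codons enumerated in T, C, A, G order.
BASES = "TCAG"
AMINO_ACIDS = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {
    "".join(codon): aa
    for codon, aa in zip(product(BASES, repeat=3), AMINO_ACIDS)
}


def upstream_extension(dna, start):
    """Translate the region upstream of an annotated start site.

    dna   -- forward-strand nucleotide sequence (assumption: the gene
             lies on this strand)
    start -- 0-based index of the first base of the annotated start codon

    Walks backwards one codon at a time, staying in frame with the
    annotated start, and stops at the first stop codon or at the
    beginning of the sequence. Returns the extra amino acids in
    N-terminal to C-terminal order.
    """
    dna = dna.upper()
    residues = []
    pos = start - 3
    while pos >= 0:
        # Codons containing ambiguous bases are reported as X.
        aa = CODON_TABLE.get(dna[pos:pos + 3], "X")
        if aa == "*":  # first upstream in-frame stop: extension ends here
            break
        residues.append(aa)
        pos -= 3
    # Codons were collected from nearest to farthest, so reverse them.
    return "".join(reversed(residues))


# Toy example (made-up sequence): the annotated ATG starts at index 11.
toy = "CCTAAGCTAAAATGGGTTGA"
print(upstream_extension(toy, 11))  # prints AK
```

Prepending the returned residues to each annotated protein gives the extended sequences to align. If the walk reaches the beginning of the sequence without meeting a stop codon, the extension is simply cut off there, so a very long extension at the edge of a contig should be read with caution.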
You get this resulting alignment.