Identifying the extent of genes
So many unknown proteins! (and so little time)
Many of the proteins you saw on the list are expected, but a hefty fraction are annotated as "hypothetical", which means we have no idea what they do. These hypothetical proteins are clearly important, as they have been conserved over a billion years of evolution. Evidently there is a great deal we don't know about being a photosynthetic organism.
You decide to take the bull by the horns and try to make what sense you can of some of these hypothetical proteins. The first step is to see how the proteins vary over all the photosynthetic bacteria in your set.
One of the hypotheticals in your list is ssr1600, a short protein. When you align the sequence of this protein with its orthologs in the other five bacteria, you get a surprise. The alignment shows that the sequence is highly conserved -- where there is sequence -- but the proteins show remarkable variation in their size.
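One way to put numbers on that observation is to count the non-gap residues in each aligned sequence and to measure identity only over the columns where both sequences are present. The sketch below is a minimal illustration, not the method used in this scenario: the organism labels and the aligned sequences are invented toy data (they are not real ssr1600 orthologs), and the function names are hypothetical.

```python
# Toy aligned sequences, invented for illustration only.
# These are NOT real ssr1600 orthologs; substitute your own alignment.
aligned = {
    "org_A": "MKTLLEDAGHWQRSTV",
    "org_B": "MKTLLEDAGHW-----",
    "org_C": "-----EDAGHWQRSTV",
    "org_D": "MKTLIEDAGHWQR---",
}

GAP_CHARS = "-."


def ungapped_length(seq, gap_chars=GAP_CHARS):
    """Number of residues in an aligned sequence, ignoring gap characters."""
    return sum(1 for c in seq if c not in gap_chars)


def length_report(alignment):
    """Map each sequence name to its ungapped (true protein) length."""
    return {name: ungapped_length(seq) for name, seq in alignment.items()}


def identity_where_shared(a, b, gap_chars=GAP_CHARS):
    """Fraction of identical residues over columns where neither sequence has a gap."""
    pairs = [(x, y) for x, y in zip(a, b)
             if x not in gap_chars and y not in gap_chars]
    if not pairs:
        return 0.0
    return sum(x == y for x, y in pairs) / len(pairs)


if __name__ == "__main__":
    # Size variation: one length per ortholog.
    for name, length in length_report(aligned).items():
        print(name, length)
    # Conservation "where there is sequence": compare each ortholog to org_A.
    for name, seq in aligned.items():
        if name != "org_A":
            print("org_A vs", name, round(identity_where_shared(aligned["org_A"], seq), 3))
```

A wide spread in the lengths combined with high identity over the shared columns is the pattern described above: conserved sequence, but proteins of very different apparent size.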