Maintaining Continuity in a Genome Project
Assembly of the genomic sequence
You've worked arduously at the task, and as of this spring, 23,254,518 bases of nucleotide sequence have been obtained from the 2-million-nucleotide chromosome of S. sanguis, read in segments of about 500 nucleotides. On average, then, each nucleotide has been read about 11 times. Even so, after overlapping the reads as best you can, there still remain about 200 fragments, called contigs, unconnected to each other.
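The coverage figure quoted above is simple division, and it can be checked with a short sketch. The variable and function names here are illustrative only, and the genome length is the rounded 2-million figure from the text, so the result is approximate.

```python
# Figures quoted in the text; the genome length is the rounded value given there.
total_bases_read = 23_254_518   # nucleotides of sequence obtained so far
genome_length = 2_000_000       # approximate size of the chromosome
read_length = 500               # approximate length of a single read


def coverage(total_bases: int, genome_len: int) -> float:
    """Average number of times each nucleotide in the genome has been read."""
    return total_bases / genome_len


def approx_read_count(total_bases: int, read_len: int) -> int:
    """Rough number of individual reads, assuming every read has the same length."""
    return round(total_bases / read_len)


fold = coverage(total_bases_read, genome_length)
reads = approx_read_count(total_bases_read, read_length)

print(f"Average coverage: about {fold:.1f}-fold")
print(f"Approximate number of reads: {reads:,}")
```

With these rounded inputs the coverage works out to between 11 and 12, in line with the "about 11 times" stated above. Under the same assumptions, the sequence obtained so far corresponds to tens of thousands of individual reads.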
Although the project won't be completed until all the pieces of the puzzle are connected, even now there is an incredible amount of information at hand. Almost all the genes of S. sanguis, and therefore almost all its secrets, are contained in this imperfect assembly. So far, however, we've been operating at the level of G's, A's, T's, and C's. We need to extract higher-order entities, such as genes.