VCU Bioinformatics and Bioengineering Summer Institute
Virginia Commonwealth University

Research Simulation Scenario
Maintaining Continuity in a Genome Project

Assembly of the genomic sequence
You've worked arduously at the task, and as of this spring, 23,254,518 bases of nucleotide sequence have been obtained from the 2-million-nucleotide chromosome of S. sanguis, read in segments of about 500 nucleotides. On average, then, each nucleotide has been read about 11 times. Even so, after overlapping the reads as best you can, about 200 fragments, called contigs, still remain unconnected to each other.
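The "about 11 times" figure is simple arithmetic: total bases read divided by the length of the chromosome. The short Python sketch below works through it using the rounded numbers quoted above. The function names are made up for illustration, and both the 2,000,000-nucleotide chromosome length and the 500-nucleotide read length are the approximate values from the text, not exact measurements.

```python
def coverage_depth(total_bases_read: int, genome_length: int) -> float:
    """Average number of times each nucleotide has been read."""
    if genome_length <= 0:
        raise ValueError("genome_length must be positive")
    return total_bases_read / genome_length


def estimated_read_count(total_bases_read: int, mean_read_length: int) -> int:
    """Rough number of reads, assuming every read has the mean length."""
    if mean_read_length <= 0:
        raise ValueError("mean_read_length must be positive")
    return round(total_bases_read / mean_read_length)


# Figures quoted in the scenario (the last two are rounded approximations).
TOTAL_BASES_READ = 23_254_518
GENOME_LENGTH = 2_000_000      # "2-million-nucleotide chromosome"
MEAN_READ_LENGTH = 500         # "segments of about 500 nucleotides"

depth = coverage_depth(TOTAL_BASES_READ, GENOME_LENGTH)
reads = estimated_read_count(TOTAL_BASES_READ, MEAN_READ_LENGTH)

print(f"Average coverage: {depth:.1f}x")
print(f"Approximate number of reads: {reads:,}")
```

With these rounded inputs the ratio falls between 11 and 12, in line with the figure quoted above, and the total corresponds to roughly 46,500 individual reads. Coverage is only an average: some regions are read many more times than others and some hardly at all, which is one reason gaps between contigs can persist even at this depth.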

Although the project won't be complete until all the pieces of the puzzle are connected, even now there is an incredible amount of information at hand. Almost all the genes of S. sanguis, and therefore almost all its secrets, are contained in this imperfect assembly. So far, however, we've been operating at the level of G's, A's, T's, and C's. We need to extract higher-order entities, such as genes.
