Introduction:
P2-related prophages have been found in a large variety of enteric bacteria. Comparative studies of the genomes of these phages have revealed numerous so-called "lysogenic conversion genes", which introduce new phenotypic properties to the bacterial host. Some of these genes are known virulence factors. The long-term goal of this project is to analyze pathogenic E. coli strains for the presence of P2-related prophages and to determine whether genes carried on these prophages contribute to the virulence of these bacteria. These bacteria are responsible for serious illness (diarrhea) and even death in several developing nations. Developing a better understanding of these strains and a possible connection to P2-related phages will help to control these virulent strains.

In this study we plan to screen a collection of phylogenetically classified enteroaggregative E. coli strains (Czeczulin, J.R. et al. 1999. Infect Immun. 67:2692-2699) for the presence of P2-related prophages and then determine which lysogenic conversion genes are carried by these prophages.

In the first part of this project (summer 2003, see reference 3), PCR primers were designed to target the most highly conserved region in the P2-related phages: the capsid gene cluster. Using NCBI’s tBlastn and Blastn, the genomes of P2-related phages and prophages in Enterobacteriaceae were compared in order to identify sequences suitable for PCR primers. Two stretches of twenty-three nucleotides from the P gene (in the capsid gene cluster) were chosen. The two sequences are located about 180 base pairs from each other. For the second sequence, the reverse complement was used as the primer. Here are the alignments (the primer is the middle section):

P2-----------------aaaaa taaaacgtcgcgccaatctggcg ggatttcaggat
L-413C-------------aaaaa taaaacgtcgcgccaatctggcg ggatttcaggat
186----------------aaaaa tagaacgtcgcgccaatctggcg ggatttgagaat
Coliphage186-------aaaaa tagaacgtcgcgccaatctggcg ggatttgagaat
Salmonella---------caaaa taaaacgtggcgccaatctggcg cgacttgagcag
Salmonella---------caaaa taaaacgtggcgccaatctggcg cgacttgagcag
Salmonella---------caaaa taaaacgtggcaccaatctggcg cgacttgagcag
Salmonella---------caaag taaaaggtcgcgccgatctggcg tgacttcagcag
Salmonella---------caaag taaaaggtcgcgccgatctggcg tgacttcagcag
E. coli------------caaag taaaaggtcgcgccgatctggcg tgacttcagcag

P2---------------------g cctttgttacggttagcgacgtt cggatta
L-413C-----------------g cctttgttgcggttagcgacgtt cggatta
186--------------------a cccttgttgcggttggcgacgtt ggggtta
Coliphage186-----------a cccttgttgcggttggcgacgtt ggggtta
Salmonella-------------a cctttgttacggttggctacttt cgggttt
Salmonella-------------a cctttgttacggttggctacttt cgggttt
Salmonella-------------g cctttgttgcggttggcgacgtt agggttt
Salmonella-------------g cctttgttgcggttggcgacgtt agggttt
Salmonella-------------g cctttgttgcggttggcgacgtt agggttt
E. coli----------------g cccttgttgcggttggcgacgtt aggattt

RMM1 5’ TAA AAC GTC GCG CCA ATC TGG CG 3’
RRM2 5’ AAC GTC GCY AAC CGY AAC AAA GG 3’
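The relationship between the second alignment and the RRM2 primer can be checked with a short script. The following is only a minimal sketch: the function names are illustrative, the target sequences are copied from the middle sections of the alignments above, and Y is read as the standard IUPAC code for C or T. It takes the reverse complement of a target and counts the positions where the primer cannot pair with it.

```python
# Sketch: compare a (possibly degenerate) primer with aligned target sequences.
# Function names are illustrative; sequences are copied from the alignments above.

# IUPAC codes needed here: the four bases plus Y (pyrimidine, C or T).
IUPAC = {"A": "A", "C": "C", "G": "G", "T": "T", "Y": "CT"}
COMPLEMENT = str.maketrans("ACGT", "TGCA")


def reverse_complement(seq):
    """Return the reverse complement of a DNA sequence (A/C/G/T only)."""
    return seq.upper().translate(COMPLEMENT)[::-1]


def count_mismatches(primer, target):
    """Count positions where the target base is not allowed by the primer code."""
    primer = primer.replace(" ", "").upper()
    target = target.upper()
    if len(primer) != len(target):
        raise ValueError("primer and target must have the same length")
    return sum(base not in IUPAC[code] for code, base in zip(primer, target))


RMM1 = "TAA AAC GTC GCG CCA ATC TGG CG"
RRM2 = "AAC GTC GCY AAC CGY AAC AAA GG"

# Middle sections of the first alignment (forward primer site).
first_site = {
    "P2": "taaaacgtcgcgccaatctggcg",
    "186": "tagaacgtcgcgccaatctggcg",
}

# Middle sections of the second alignment (reverse primer site, top strand).
second_site = {
    "P2": "cctttgttacggttagcgacgtt",
    "L-413C": "cctttgttgcggttagcgacgtt",
    "186": "cccttgttgcggttggcgacgtt",
}

for name, seq in first_site.items():
    print("RMM1 vs", name, "mismatches:", count_mismatches(RMM1, seq))

for name, seq in second_site.items():
    # The reverse primer is compared with the reverse complement of the top strand.
    print("RRM2 vs", name, "mismatches:", count_mismatches(RRM2, reverse_complement(seq)))
```

A check like this only counts sequence mismatches; it says nothing about annealing temperature or amplification efficiency, which would still have to be judged at the bench.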
These primers were ordered from Integrated DNA Technologies, Inc., and used to screen a collection of twenty-six phylogenetically classified enteroaggregative E. coli strains (Czeczulin et al. 1999). The PCR products were then run on a 1.4% agarose gel. Fifteen of the strains yielded a PCR product of about 180 bp. Each gel contained a 100 bp ladder, a positive control (C-1920, containing Wphi), and a negative control (PCR water in place of the DNA template). This is the gel from the second set of PCR products.



The following diagram is a modified version of a diagram found in the Czeczulin paper. The green spots mark strains that were screened and did not have a positive PCR product. The red spots signify strains that had positive PCR products.


Academic Year:
Now that I have (presumably) found fifteen strains that contain P2-related prophages, I can begin to identify what genes these prophages contain. Additionally, I can determine whether any of these genes are known lysogenic conversion genes. There appears to be a “hot spot” for the acquisition and insertion of lysogenic conversion genes in the P2-related phages (see insert.ppt). This step replaces the original plan’s (summer 2003 proposal) use of Southern blots to identify genes in the prophages.

I have already begun to look for candidate PCR primers that flank the insertion site in P2 (between genes G and FI). These genes are not as highly conserved as the capsid genes, so this search has been slightly more difficult. I will either look further into the G and FI genes or design two sets of primers: one for phages more closely related to P2 and another for phages more closely related to 186. I will then run PCR on the fifteen strains that had positive PCR products with the P gene primers. Any candidates that appear to have foreign DNA inserted between genes G and FI will be further characterized by sequence analysis of this region. This process will then be repeated at the other insertion point, between genes A and Q.

At this point, the project will look very similar to the part conducted during the summer of 2003. PCR will be run using these primers on each of the strains. Then the PCR product will be run on an agarose gel. Any sample that contains a PCR product that is longer than the minimum expected will be sequenced. The resulting sequences will be examined for known genes, especially virulence factors and lysogenic conversion genes.
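The selection rule described above (sequence any product longer than the minimum expected) can be written down as a small helper. This is only a sketch under stated assumptions: the function name, the strain labels, the band sizes, and the tolerance for reading sizes off a gel are all hypothetical, and the minimum expected length depends on primers that have not yet been designed.

```python
# Sketch: pick PCR products that are longer than the minimum expected size.
# All names and numbers here are hypothetical placeholders, not project data.

def flag_for_sequencing(product_sizes_bp, min_expected_bp, tolerance_bp=50):
    """Return the samples whose product exceeds the minimum expected size.

    product_sizes_bp maps a sample name to its estimated band size in bp,
    or to None when no product was seen. tolerance_bp is an assumed margin
    for the imprecision of estimating sizes against a ladder.
    """
    threshold = min_expected_bp + tolerance_bp
    return sorted(
        name
        for name, size in product_sizes_bp.items()
        if size is not None and size > threshold
    )


# Hypothetical gel readings for three strains.
example_sizes = {"strain_A": 600, "strain_B": 1900, "strain_C": None}
print(flag_for_sequencing(example_sizes, min_expected_bp=600))
```

Samples with no product are left out rather than flagged, since a missing band may simply mean that the primers did not anneal in that strain.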

Additionally, I plan to take several courses at Haverford College to increase my base of knowledge in bioinformatics and microbiology. I will be auditing Advanced Genetic Analysis (BiolH 301d) because scheduling conflicts would not allow me to enroll in the course. I will take Computational Genomics (BiolH 354e), Structure and Function of Macromolecules (BiolH 303h), and Molecular Microbiology (BiolH 310g). For more information see the Haverford College Biology Department course descriptions: http://www.haverford.edu/catalog/Biology.html#A%20CORE%20PROGRAM%20OF%20COURSES.

Budget (approximate estimates, subject to change and revision):
Oligonucleotides $100
Taq DNA polymerase $150
Other PCR supplies $100
TA-clone kits x4 $400
Sequencing x30 $600
TOTAL $1350



References:

Czeczulin, J.R. et al. 1999. Infect Immun. 67:2692-2699

Mullowney, Robert; Christie, Gail; summer 2003 summary of research presentation (unpublished).

Mullowney, Robert; summer 2003 BBSI research proposal (available at http://ramsites.net/~rmullowney, see summer 2003 research).