Sequencing the Genome of Staphylococcal Phage 80

 

 

Abstract



The complete genome of Staphylococcus aureus phage 80 was sequenced. The circularly permuted phage genome is 42,160 bp with an overall GC content of 35.6%. Sixty-one potential open reading frames (ORFs) encoding proteins of 50 aa or more were identified, and a predicted function was assigned to 26 of these. Thirty other ORFs are homologous to hypothetical staphylococcal phage proteins or putative prophage genes of unknown function. One protein is homologous to a protein from a methicillin-resistant strain of S. aureus, MRSA252, that has not previously been attributed to prophage DNA. Four ORFs appear to be unique to phage 80, having no homologues in the GenBank database. In a genome comparison with other sequenced staphylococcal phages, phage 80 was most closely related to φETA.

 

 

 

Introduction

 

 

Staphylococcus aureus carries a family of related pathogenicity islands (PIs) that are mobilized and replicated by specific helper phages. These PIs are 15-20 kb in length and are characterized by carriage of virulence genes, flanking direct repeats, and phage-like mobility genes encoding integrase, helicase, and terminase. SaPI1 and SaPI2 encode TSST-1, the toxic shock syndrome toxin, and each island occupies a specific site in the S. aureus genome: SaPI1 is located at the trp locus, while SaPI2 is located at the tyrB locus. Both are excised and replicated by certain staphylococcal phages and then packaged into phage-like particles. These particles can infect another S. aureus cell, where the SaPI DNA integrates back into its specific site in the genome. The infectious phage-like particles have smaller heads than the helper phages, consistent with the smaller size of the PI genome. In the absence of helper phages, the islands are stable and show no detectable mobility.

 

SaPI1 is 15.2 kb in length and has direct terminal repeats of 17 nucleotides. It contains a reading frame whose product belongs to the integrase family, but it does not by itself code for excision. In the presence of phage 80α, SaPI1 can excise, replicate, and become encapsidated. 80α is not a naturally occurring staphylococcal phage. It arose during an attempt to select a host range variant of another staphylococcal phage, 80, and is most likely a recombinant of 80 with two other temperate staphylococcal transducing phages, φ11 and φ13. Like 80α, φ13 can cause excision of SaPI1, but not replication; φ11 causes neither excision nor replication. It is not known whether 80 itself can induce SaPI1, because it does not grow in the strains carrying this pathogenicity island. However, phage 80 does induce excision and replication of a second TSST-1 PI, SaPI2. The sequence of SaPI2 is not yet known, but it differs from SaPI1 in several respects. Phage 80 induces excision and replication of this island in much the same way that 80α induces SaPI1. Identifying mobility-specific genes requires comparative genome analysis. In this study, we determined the DNA sequence of phage 80. Phage 80α and SaPI2 are currently being sequenced.

 

 

 

Materials and Methods

 

 

Phage 80 was grown in liquid culture by infecting S. aureus strain RN322 in CY broth with 0.06 M glycerol phosphate at a multiplicity of infection (MOI) of 0.1. Once the cells had lysed, the debris was spun down and the phage was precipitated from the lysate with 10% (w/v) polyethylene glycol and 0.5 M NaCl. The phage was further purified by CsCl density gradient centrifugation, and the DNA was then extracted. Phage DNA was physically sheared into 2-4 kb fragments using a HydroShear™ instrument, and the fragment ends were repaired to convert overhangs into blunt cloning sites using the DNA-Terminator® End Repair Kit. Fragments approximately 1.5-3 kb in size were purified by agarose gel electrophoresis and ligated into a pSMART™-HCAmp cloning vector to create a genomic library. Plasmids were purified using the QIAprep® Spin Miniprep Kit. Individual plasmids from four 96-well plates were sequenced at NCI using the pSMART sequencing primers SL1 and SL2 and fluorescent dideoxynucleotides (such as the ABI PRISM® BigDye™ Terminators Cycle Sequencing Kit), and the reactions were analyzed on an automated sequencer.
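As a rough check on library size, the expected depth of coverage can be estimated from the number of clones sequenced. The sketch below applies the Lander-Waterman approximation. The 500-base read length is an assumption made purely for illustration (read lengths are not reported here), as is the premise that every clone yields two usable reads; the genome length of 42,140 bp is the figure given in the Results.

```python
import math


def expected_coverage(n_clones, reads_per_clone, read_length_bp, genome_length_bp):
    """Mean depth of coverage c = N * L / G for N reads of length L."""
    n_reads = n_clones * reads_per_clone
    return n_reads * read_length_bp / genome_length_bp


def expected_fraction_uncovered(coverage):
    """Lander-Waterman approximation: a given base is missed with probability e^-c."""
    return math.exp(-coverage)


# Four 96-well plates, each clone read from both vector primers (SL1 and SL2).
n_clones = 4 * 96
# ASSUMPTION: 500 usable bases per read; the value is hypothetical.
depth = expected_coverage(n_clones, 2, 500, 42_140)
uncovered = expected_fraction_uncovered(depth)

print(f"{n_clones * 2} reads, about {depth:.1f}x mean coverage")
print(f"expected uncovered fraction: {uncovered:.1e}")
```

Random shotgun coverage is uneven in practice, which is consistent with the need for directed primer walking described in the next paragraph.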

 

The sequence was assembled using SeqMan II and phredPhrap/Consed software. The initial assembly left several regions of the genome covered on only one strand, so primers were designed for PCR and sequencing reactions to increase coverage. Primers were also designed to sequence regions containing ambiguous bases in the assembly. ORFs longer than 50 aa were identified using GeneMark™ and TIGR’s Glimmer. Discrepancies between the two programs were resolved by checking the sequence for candidate ribosomal binding sites. The ORFs were then compared with the GenBank database and the sequence was annotated. The genome was compared with those of other sequenced staphylococcal phages using the pairwise alignment algorithm of Wilbur and Lipman, and the results were displayed as a homology tree using the UPGMA method of Sneath and Sokal, as implemented in the DNAMAN software package.
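The ORF-calling step can be illustrated with a minimal sketch. It is not GeneMark or Glimmer, both of which rely on statistical models of coding sequence. It is a plain six-frame scan that reports ORFs of at least 50 codons and flags whether a Shine-Dalgarno-like motif lies within 20 bases upstream of the start codon. The motif pattern, the 20-base window, and the function names are assumptions chosen for illustration, and the scan treats the sequence as linear, so an ORF spanning the arbitrary ends of a circular assembly would be missed.

```python
import re

STOPS = {"TAA", "TAG", "TGA"}
STARTS = {"ATG", "GTG", "TTG"}  # start codons that appear in Table 1
# ASSUMPTION: any 4-base window of the Shine-Dalgarno consensus AGGAGG counts as an RBS.
SD_CORE = re.compile(r"AGGA|GGAG|GAGG")
COMPLEMENT = str.maketrans("ACGT", "TGCA")


def revcomp(seq):
    """Reverse complement of an upper-case DNA string."""
    return seq.translate(COMPLEMENT)[::-1]


def find_orfs(seq, min_aa=50):
    """Yield (strand, start, end, has_rbs) for ORFs of at least min_aa codons.

    Coordinates are 0-based on the scanned strand and `end` is the index of
    the stop codon. For each stop, only the most upstream start is reported.
    """
    seq = seq.upper()
    for strand, s in (("+", seq), ("-", revcomp(seq))):
        for frame in range(3):
            start = None
            for i in range(frame, len(s) - 2, 3):
                codon = s[i:i + 3]
                if start is None and codon in STARTS:
                    start = i
                elif codon in STOPS:
                    if start is not None and (i - start) // 3 >= min_aa:
                        upstream = s[max(0, start - 20):start]
                        yield strand, start, i, bool(SD_CORE.search(upstream))
                    start = None


# Toy sequence: an RBS-like leader, ATG, 60 alanine codons, and a stop codon.
demo = "AAAGGAGGTTAACAATG" + "GCT" * 60 + "TAA"
demo_orfs = list(find_orfs(demo))
print(demo_orfs)
```

A candidate start with no upstream motif is not necessarily wrong, but where two gene finders disagree, the presence of a plausible ribosomal binding site is a reasonable tie-breaker.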

 

 

 

Results

 

 

The circularly permuted genome of phage 80 is 42,140 bp in length with a 35.6% GC content.
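For reference, GC content is the percentage of G and C bases among all unambiguous bases. The helper below is a minimal sketch with a hypothetical function name; it ignores ambiguity codes such as N rather than counting them in the denominator, which is one common convention among several.

```python
def gc_content(seq):
    """Percent G+C among unambiguous bases (A, C, G, T); case-insensitive."""
    seq = seq.upper()
    gc = seq.count("G") + seq.count("C")
    at = seq.count("A") + seq.count("T")
    if gc + at == 0:
        raise ValueError("sequence contains no unambiguous bases")
    return 100.0 * gc / (gc + at)


# Toy input, not phage 80 sequence.
print(gc_content("ggccaattat"))
```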

There were 61 predicted ORFs larger than 50 aa (Fig. 1). Since the exact attachment site is not yet known, we have defined the genome start with ORF1, the int gene, which is predicted to be adjacent to attL. This facilitates alignment of the sequence with those of other staphylococcal phages in the database, most of which have been entered as prophage sequences. The predicted gene products were analyzed to determine molecular weight, ribosomal binding site, closest homologue, and predicted gene function (Table 1). A homology tree created using DNAMAN shows that phage 80 is most closely related to φETA (Fig. 2).
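The tree-building step can be sketched as follows. This is a generic UPGMA implementation, not DNAMAN's code, and the distances in the example are invented placeholders rather than values from this study. In practice each distance would be derived from a pairwise alignment score, for example 100 minus percent identity.

```python
def upgma(names, dist):
    """Cluster by UPGMA; return (tree, merges).

    names: list of labels. dist: symmetric matrix (list of lists) of distances.
    tree is a nested tuple; merges lists (subtree, branch height) in merge order.
    """
    n = len(names)
    clusters = {i: (names[i], 1) for i in range(n)}  # id -> (subtree, size)
    d = {(i, j): dist[i][j] for i in range(n) for j in range(i + 1, n)}
    next_id = n
    merges = []
    while len(clusters) > 1:
        # Join the closest pair of clusters.
        (a, b), dab = min(d.items(), key=lambda kv: kv[1])
        (ta, na), (tb, nb) = clusters.pop(a), clusters.pop(b)
        new = {}
        for c in clusters:
            dac = d[(min(a, c), max(a, c))]
            dbc = d[(min(b, c), max(b, c))]
            # Size-weighted average equals the mean over all member pairs.
            new[(c, next_id)] = (na * dac + nb * dbc) / (na + nb)
        d = {k: v for k, v in d.items() if a not in k and b not in k}
        d.update(new)
        clusters[next_id] = ((ta, tb), na + nb)
        merges.append(((ta, tb), dab / 2))
        next_id += 1
    tree, _ = next(iter(clusters.values()))
    return tree, merges


# HYPOTHETICAL distances, for illustration only.
phages = ["80", "phiETA", "phi11", "phi13"]
matrix = [
    [0, 20, 40, 50],
    [20, 0, 42, 52],
    [40, 42, 0, 48],
    [50, 52, 48, 0],
]
tree, merges = upgma(phages, matrix)
print(tree)
```

UPGMA assumes a constant rate of divergence, so the result is best read as a similarity dendrogram rather than a phylogeny, which matters for phages whose genomes are mosaics of exchanged modules.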

 

Of the 61 predicted ORFs of phage 80, functions have been proposed for 26, while 30 have homology to hypothetical staphylococcal phage proteins or putative prophage genes. One ORF has homology to an MRSA252 protein that has not previously been associated with phage, and four have no matches to any protein in the GenBank database. Phage 80 is most homologous to φETA, sharing similarity in the replication and morphogenesis genes. The lysis genes are closest to those of φ11, while the lysogeny genes resemble those of many different staphylococcal phages. The integrase gene is most similar to the integrases of phages 77 and 54a.

 

 

 

 

 

Fig. 1. Predicted ORFs larger than 50 aa. ORF1 begins the genome with the int gene, predicted to be adjacent to attL. Arrows indicate orientation.

 

 


 

 

| ORF | # aa | Putative RBS | Mr (kDa) | pI | Closest homologue | E value | Accession | Function; notes |
|---|---|---|---|---|---|---|---|---|
| 1 | 401 | AAGGAGGaGAaataaaATG | 47.1 | 10.53 | 77 orf 7 | 0.0 | NP_958623.1 | cd01199, INT_Tn1545_C, Tn1545-related integrases |
| 2 | 166 | AtAAAGGaGtataaaacATG | 18.6 | 9.48 | | | | |
| 3 | 163 | AAAGGAGtcgtataaaagATG | 19 | 8.95 | | | | |
| 4 | 152 | AAAGGgGtTtggctcATG | 18.3 | 5.8 | MRSA252 hyp. phage protein SAR1557 | 1e-69 | YP_040959.1 | pfam06114, DUF955, Domain of unknown function; similar to phiSLT p05 |
| 5 | 92 | AAAGGAGaaaattATG | 10.7 | 5.32 | hyp protein Mu50 SAV0851 | 2e-21 | NP_371375 | cI-like repressor; cd00093, HTH_XRE; helix-turn-helix family |
| 6 | 69 | AAGAATGATacgaATG | 8.1 | 9.8 | hyp protein Mu50 SAV0852 | 6e-8 | NP_371376.1 | cro-like repressor; cd00099, HTH_XRE, helix-turn-helix family |
| 7 | 241 | AAAGGAGGgaactgaaATG | 28.1 | 9.52 | phiPV83 antirepressor | 6e-48 | NP_061599.1 | COG3617, prophage antirepressor |
| 8 | 69 | AGGAGAGgttgaacATG | 8.2 | 6.66 | | | | |
| 9 | 53 | AGGAGGaGtTatcaaATG | 6.2 | 6.4 | phiETA orf14 | 3e-09 | NP_510908.1 | similar to phiPVL orf38 |
| 10 | 93 | AAcGGAGGaagtcaaccATG | 10.4 | 4.37 | MRSA252 hyp. phage protein SAR2089 | 6e-35 | YP_041455.1 | similar to phiSLT p14 |
| 11 | 86 | AAGGAGGactaaaacaATG | 10.2 | 4.4 | phiSLT p15 | 2e-35 | NP_075478.1 | similar to phiPVL orf39; 77 orf 43 |
| 12 | 178 | AGGAGGTatgaaaaGTG | 20.6 | 4.83 | phiSLT orf16 | 5e-70 | NP_075479.1 | |
| 13 | 259 | AAAGGtGGgagaatagATG | 29.4 | 6.53 | phiETA orf17 | e-129 | NP_803265.1 | similar to phi11, phi13 orf12 |
| 14 | 184 | AAAGGAcGgtATaaaaattATG | 21.4 | 4.91 | phi11 ssb | 5e-87 | NP_803266.1 | Ssb |
| 15 | 223 | AGGAgGTGatttaaATG | 26.1 | 6.83 | 77 orf18 | e-120 | NP_958641.1 | pfam06147, DUF968, protein of unknown function |
| 16 | 285 | GGgGGTGAataattATG | 32.6 | 9.72 | 77 orf9 | e-153 | NP_958643.1 | |
| 17 | 256 | AGAAAGGAGataacgaaATG | 29.7 | 9.58 | phi13 orf 15 | 4e-89 | NP_803370.1 | similar to phiETA orf22; putative DnaA analogue [no similar protein in phi11] |
| 18 | 259 | AGGgGGatattATG | 30 | 10.12 | phiSLT orf22 | e-138 | NP_075485.1 | AAA ATPase domain; COG1484 DnaC |
| 19 | 53 | AgGGAGcgagatgcATG | 6.3 | 7.3 | 77 orf104 | 6e-05 | NP_958646.1 | |
| 20 | 73 | AAGGAGtgttaaaaATG | 8.5 | 4.76 | phiETA orf24 | 2e-27 | NP_510918.1 | similar to phiSLT orf23; phi13 orf17; phi11 orf 17; phi77 orf 59; PV83 orf22 |
| 21 | 141 | GAGGTGgcacATG | 16.5 | 6.76 | Shewanella prophage LambdaSo dam | 9e-16 | NP_718572.1 | pfam05869, Dam, DNA N-6-adenine-methyltransferase |
| 22 | 134 | AtaGAGGTGcacaATG | 16.2 | 10.19 | Mu50 hyp protein SAV0872 | 2e-60 | NP_371396.1 | pfam06356, DUF1064, protein of unknown function; similar to 77 orf28; ETA orf25; PV83 orf22 |
| 23 | 61 | GAAAtGAaGTGATCtaATG | 7.2 | 4.21 | phiETA orf26 | 5e-16 | NP_510920.1 | similar to 77 orf80 |
| 24 | 123 | GAGGTGgaataaATG | 15.1 | 10.51 | Mu hyp protein SAV0873 | 6e-50 | NP_371397.1 | similar to phiPVL orf50; phi13 orf19; phiSLT orf50; phi11; phi12 |
| 25 | 82 | GgcAGGAaGTataaATG | 9.8 | 6.52 | phi11 orf22 | 4e-33 | NP_803275.1 | similar to phiPVL orf51; phi12 orf17; phi13 orf20 |
| 26 | 121 | AgAGGAGGTtATgaaaGTG | 14.8 | 5.5 | Mu50 hyp protein SAV1977 | 8e-51 | NP_372501.1 | COG1196, Smc, Chromosome segregation ATPases; similar to phi13 orf22; phiPV83 orf27; phi12 orf20 |
| 27 | 82 | AAatGGAGGaagacacaaATG | 9.2 | 3.79 | phiPVL orf52 | 1e-26 | NP_058490.1 | similar to phiSLT orf28; PV83 orf29 |
| 28 | 175 | AGGAGGaGcaggaaaATG | 19.3 | 4.99 | phiSLT orf29 | 1e-92 | NP_075491.1 | pfam00692, dUTPase; similar to phiPVL orf53 |
| 29 | 78 | AAGGAGGTttTggggaaGTG | 9 | 9.78 | phiPV83 orf 31 | 3e-19 | NP_597920.1 | similar to phiSLT orf31 |
| 30 | 78 | AAaGAGGgGAgataataATG | 9 | 4.47 | phiPV83 orf 30 | 8e-32 | NP_061620.1 | identical to phiSLT orf32 |
| 31 | 57 | AGGAGaTGacaatgATG | 6.5 | 4.12 | phiETA orf 37 | 1e-17 | NP_510931.1 | similar to phi11 RinB integrase activator |
| 32 | 133 | GGAGGTGtcagagtagATG | 15.4 | 9.92 | phiETA orf38 | 3e-65 | NP_510932.1 | similar to phiPVL orf61 |
| 33 | 164 | GTGATaCAGtgaaaacaaTTG | 18.6 | 9.45 | phiETA orf39 | 3e-42 | NP_510933.1 | pfam03592, Terminase_2, Terminase small subunit |
| 34 | 402 | AGttAGcgGGTGtTaataATG | 46.2 | 9.25 | phiETA orf40 | 0.0 | NP_510934.1 | COG1783, XtmB, Phage terminase large subunit |
| 35 | 472 | AAAGGAGGTaATattTTG | 56.8 | 4.65 | phiETA orf41 | 0.0 | NP_510935.1 | pfam05133, Phage_prot_Gp6, Phage portal protein |
| 36 | 320 | GGAGGTGcTgacaGTG | 37.8 | 9.8 | phiETA orf42 | e-166 | NP_510936.1 | pfam04233, Phage_Mu_F, head morphogenesis |
| 37 | 174 | AaAAAGGAGtagtttaaATG | 19.6 | 4.23 | phiETA orf43 | 1e-53 | NP_510937.1 | pfam06810, Phage_GP20, Phage minor structural protein; putative minor capsid/scaffolding protein |
| 38 | 319 | AGGAGtgTAtacATG | 35 | 6.54 | Listeria innocua lin2390 | .21 | NP_471721.1 | similar to main capsid protein Gp34 - Lactobacillusphage mv4 |
| 39 | 108 | AGGAGGTagTgacgtATG | 12.5 | 4.65 | phiETA orf45 | 4e-31 | NP_510939.1 | similar to phage A118 gp7 |
| 40 | 104 | AGtgGGTGtTaagtaATG | 11.8 | 5.21 | phiETA orf46 | 3e-48 | NP_510940.1 | similar to Spp1 gp15 head completion protein |
| 41 | 111 | GgGGTaagcgatATG | 12.9 | 9.05 | phiETA orf 47 | 6e-51 | NP_510941.1 | similar to Spp1 gp16 head completion protein |
| 42 | 137 | AttGagGGTGcgacctatTTG | 15.5 | 10.2 | phiETA orf48 | 2e-58 | NP_510942.1 | similar to Spp1 gp16.1 structural protein |
| 43 | 145 | AtGAGGTGgTtaAGatATG | 16.9 | 8.41 | phiETA orf49 | 2e-69 | NP_510943.1 | similar to Spp1 gp17 tail protein |
| 44 | 186 | AAAGGAGtgtaacgaATG | 20.9 | 4.72 | phiETA orf50 | 2e-92 | NP_510944.1 | similar to Spp1 gp17.1 tail protein |
| 45 | 164 | AAAcGAGGTatttaatATG | 18.7 | 4.69 | phiETA orf51 | 4e-70 | NP_510945.1 | similar to Spp1 gp17.5 tail protein |
| 46 | 113 | GAAAtaAGGcagATG | 13.4 | 10.72 | phiETA orf52 | 4e-48 | NP_510946.1 | |
| 47 | 1047 | AAAGGAGGTtAggcATG | 113.5 | 10.2 | phiETA orf53 | 0.0 | NP_510947.1 | COG5412, Phage-related protein [Function unknown]; similar to phi11 tape measure protein |
| 48 | 311 | AAgGGAGGTttgtttaTTG | 36.4 | 5.37 | phiETA orf54 | e-167 | NP_510948.1 | similar to phi11 orf43 |
| 49 | 628 | AAGGAGtaGcatATG | 71 | 7.41 | phiETA orf55 | 0.0 | NP_510949.1 | pfam06605, DUF1142, Protein of unknown function |
| 50 | 632 | AAAGGAGGcaaccaATG | 72.8 | 7.59 | phiETA orf56 | 0.0 | NP_510950.1 | similar to Mu50 hyp. protein SAV0904; phi11 orf55 |
| 51 | 607 | GAAAGGtGGTtgaataATG | 66.8 | 4.63 | phiETA orf57 | 0.0 | NP_510951.1 | similar to phi53 ORF1; Mu50 hyp. protein SAV0905 |
| 52 | 125 | GgGGTGgaaataATG | 14.1 | 4.31 | phi11 orf46 | 3e-44 | NP_803299.1 | similar to phiETA orf58; phi53 ORF2; Mu50 hyp. protein SAV0906 |
| 53 | 60 | AgAGGAGGacgtttaaATG | 7.3 | 5.16 | phi53 ORF3 | 7e-19 | AAM49607.1 | similar to phiETA orf59; Mu50 hyp. protein SAV0907 |
| 54 | 99 | AAAGtgGGTGgTgtaATG | 12.1 | 7.53 | phiETA orf60 | 5e-33 | NP_510954.1 | identical to 5 additional Staph phage/prophage proteins |
| 55 | 624 | GAAAtGAGGTGcatacATG | 70.9 | 10.32 | phiETA orf61 | 0.0 | NP_510955.1 | pfam05257, CHAP domain; COG4193, LytD, Beta-N-acetylglucosaminidase; peptidoglycan hydrolase |
| 56 | 412 | AAaGAGGTGtgtaaATG | 46.2 | 5.64 | Mu50 hyp protein SAV0910 | 0.0 | NP_371434.1 | pfam01391, Collagen triple helix repeat; similar to phiETA orf62; phi11 tail fiber |
| 57 | 131 | AaGgGGTGATtttATG | 14.4 | 9.62 | phi11 orf51 | 1e-58 | NP_803304.1 | similar to phiETA orf63; Mu50 hyp. protein SAV0911 |
| 58 | 145 | AAAGGAGcaaacaaATG | 15.7 | 4.72 | Mu50 hyp protein SAV0912 | 1e-68 | NP_371436.1 | similar to phi11, 80 alpha holin |
| 59 | 481 | AcGGAGGTGgcgacaATG | 54.1 | 8.88 | phi11 amidase | 0.0 | NP_803306.1 | smart00644, Ami_2 domain; pfam05257, CHAP domain. Amidase. |
| 60 | 76 | GGTGAaaatattaacagATG | 8.9 | 5.23 | | | | |
| 61 | 245 | AgtGAGGTGAtgaaaaGTG | 29.1 | 7.75 | MRSA252 hyp protein SAR1306 | 6e-17 | YP_040715.1 | |


 

Table 1. Analysis of predicted gene products: length (aa), putative ribosomal binding site, molecular weight, pI, closest homologue, and predicted function.

 

 

 



 

Fig. 2. Phage 80 shows most sequence homology to φETA.

 

 

 

 

Discussion

 

 

Assembly of the sequenced genome fragments yielded a circular sequence. A circular assembly of this kind is characteristic of phage genomes packaged by a headful cleavage mechanism, which produces a collection of virion DNA molecules with terminally redundant, circularly permuted ends. Such packaging has been demonstrated for some of the related phages and appears to be the likely mechanism for phage 80 as well.
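A toy simulation shows why headful packaging leads to a circular assembly. The sketch below cuts successive headfuls from a concatemer; the 3% terminal redundancy, the 100-base toy genome, and the function name are arbitrary assumptions for illustration, not measurements for phage 80.

```python
def headful_packaging(genome, headful_fraction=1.03, n_heads=5):
    """Cut successive headfuls from a concatemer of `genome`.

    Returns a list of (start_offset, packaged_dna). Each molecule is longer
    than one genome, so its two ends repeat (terminal redundancy), and each
    start is shifted relative to the previous one (circular permutation).
    """
    g = len(genome)
    headful = round(g * headful_fraction)
    # Build a concatemer long enough for every requested head.
    concatemer = genome * (n_heads * headful // g + 2)
    heads = []
    pos = 0  # packaging starts at a fixed site, taken here as position 0
    for _ in range(n_heads):
        heads.append((pos % g, concatemer[pos:pos + headful]))
        pos += headful
    return heads


# ASSUMPTION: a made-up 100-base genome and a headful of 103% of genome length.
toy_genome = "ACGTTGCAAGCTTAGGCATC" * 5
heads = headful_packaging(toy_genome)
for offset, dna in heads:
    print(offset, len(dna), dna[:3], dna[-3:])
```

Because each packaged molecule begins at a different genome position, shotgun reads from a population of virions overlap across every possible end, and the assembler closes the sequence into a circle even though each individual molecule is linear.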

 

Phage 80 has an overall genetic organization that is highly similar to that of other staphylococcal phages and other bacteriophages of gram-positive hosts. This organization is consistent with the modular theory of phage evolution, in which modules, functional units comprising multiple genes, act as interchangeable genetic elements. Although the genomes as a whole differ, the individual modules of phage 80 are homologous to those of many different phages, which is indicative of genetic transfer.

 

The major capsid protein shows no homology to that of any other staphylococcal phage, while all of the remaining genes in the putative head gene cluster have strong homology to those of phage φETA. This is highly unusual, since the same head scaffolding and maturation proteins would have to interact with two apparently unrelated major capsid subunit precursors during assembly of phage 80 or φETA. We postulate that the capsid protein is the target for SaPI2-directed head size determination. It will be extremely interesting to see whether a similar phenomenon occurs in the capsid gene cluster of phage 80α once that sequence has been determined. Subsequent experiments in the lab will be directed towards identifying genes involved in capsid assembly and DNA packaging of the helper phages and SaPI elements.

 

 

 

 

 

References

 

 

Delcher AL, Harmon D, Kasif S, White O, and Salzberg SL. 1999. Improved microbial gene identification with GLIMMER. Nucleic Acids Research 27(23):4636-4641.

 

Hacker J, Blum-Oehler G, Muhldorfer I, and Tschape H. 1997. Pathogenicity islands of virulent bacteria: structure, function and impact on microbial evolution. Molecular Microbiology 23(6):1089-1097.

 

Iandolo J, Worrell V, Groicher K, Qian Y, Tian R, Kenton S, Dorman A, Ji H, Lin S, Loh P, Qi S, Zhu H, Roe BA. 2002. Comparative analysis of the genomes of the temperate bacteriophages φ11, φ12, and φ13 of Staphylococcus aureus 8325. Gene 289(1-2):109-18.

Lindsay JA, Ruzin A, Ross HF, Kurepina N, Novick RP. 1998. The gene for toxic shock toxin is carried by a family of mobile pathogenicity islands in Staphylococcus aureus. Molecular Microbiology 29(2):527-543.

Novick RP. 2002. Mobile genetic elements and bacterial toxinoses: the superantigen-encoding pathogenicity islands of Staphylococcus aureus. Plasmid 49:93-105.

Phred-Phrap Version: 4.0 Copyright (C) 2002-2006 by Deborah A. Nickerson, Scott Taylor, Natali Kolker and Jim Sloan University of Washington

Ruzin A, Lindsay J, Novick P. Molecular genetics of SaPI1 - a mobile pathogenicity island in Staphylococcus aureus. Molecular Microbiology 41(2):365-377.

Sneath PH and Sokal RR  1973. Numerical Taxonomy.  Freeman, San Francisco, USA.

Wilbur WJ and Lipman DJ. 1983. Rapid similarity searches of nucleic acid and protein data banks. Proc. Natl Acad Sci USA 80:726-30.