MAST - Motif Alignment and Search Tool
MAST version 3.5.4 (Release date: 3.5.4)
For further information on how to interpret these results or to get
a copy of the MAST software please access http://meme.nbcr.net.
REFERENCE
If you use this program in your research, please cite:
Timothy L. Bailey and Michael Gribskov,
"Combining evidence using p-values: application to sequence homology
searches", Bioinformatics, 14(1):48-54, 1998.
DATABASE AND MOTIFS
DATABASE cparvum_hsp.fa (nucleotide)
Last updated on Tue Jun 12 20:14:08 2007
Database contains 12 sequences, 9267 residues
Scores for positive and reverse complement strands are combined.
MOTIFS (nucleotide)
MOTIF WIDTH BEST POSSIBLE MATCH
----- ----- -------------------
1 6 GGGGGG
2 15 GGCGGTTTGGCGCCA
3 15 AAAAAAGAGGAAAAA
PAIRWISE MOTIF CORRELATIONS:
MOTIF 1 2
----- ----- -----
2 0.44
3 0.69 0.09
Correlations above 0.60 may cause some combined p-values and
E-values to be underestimates.
Removing motif 3 from the query may be advisable.
Random model letter frequencies (from non-redundant database):
A 0.281 C 0.222 G 0.229 T 0.267
SECTION I: HIGH-SCORING SEQUENCES
- Each of the following 12 sequences has E-value less than 10.
- The E-value of a sequence is the expected number of sequences
in a random database of the same size that would match the motifs as
well as the sequence does and is equal to the combined p-value of the
sequence times the number of sequences in the database.
- The combined p-value of a sequence measures the strength of the
match of the sequence to all the motifs and is calculated by
- finding the score of the single best match of each motif
to the sequence (best matches may overlap),
- calculating the sequence p-value of each score,
- forming the product of the p-values,
- taking the p-value of the product.
- The sequence p-value of a score is defined as the
probability of a random sequence of the same length containing
some match with as good or better a score.
- The score for the match of a position in a sequence to a motif
is computed by summing the appropriate entry from each column of
the position-dependent scoring matrix that represents the motif.
- Sequences shorter than one or more of the motifs are skipped.
- The table is sorted by increasing E-value.
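The "product of the p-values" and "p-value of the product" steps above can be sketched as follows. This is a minimal illustration, not MAST's own code: it assumes the sequence p-values are independent and uniformly distributed under the null model, in which case the probability that a product of n such values is at most x equals x times the sum of (-ln x)^i / i! for i from 0 to n-1. The function name and the input values are hypothetical.

```python
import math

def pvalue_of_product(pvalues):
    """P-value of the product of n p-values.

    Assumes the p-values are independent and uniform on (0, 1) under the
    null model, so that
        P(product <= x) = x * sum_{i=0}^{n-1} (-ln x)^i / i!
    """
    n = len(pvalues)
    product = math.prod(pvalues)
    if product <= 0.0:
        return 0.0
    neg_log = -math.log(product)
    term = 1.0   # current value of (-ln x)^i / i!, starting at i = 0
    total = 1.0  # running sum of the terms
    for i in range(1, n):
        term *= neg_log / i
        total += term
    return product * total

# Hypothetical sequence p-values for three motifs (not taken from this report).
example = [0.02, 0.5, 0.001]
print(pvalue_of_product(example))
```

With a single motif the function returns the p-value unchanged; with more motifs the result is always larger than the raw product, which is why the product itself cannot be read as a p-value.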
Sequence Name | Description | E-value | Length
---|---|---|---
cgd2_20 | heat shock 70 (HSP70) pro... | 6.8e-05 | 1755
cgd6_4970 | Hsp60; GroEL-like chapero... | 0.0019 | 150
cgd3_3440 | heat shock protein HSP70,... | 0.002 | 661
cgd2_3230 | heat shock protein DnaJ P... | 0.0082 | 1010
cgd6_2650 | heat shock protein, putat... | 0.021 | 140
cgd3_3770 | Hsp90 | 0.023 | 1098
cgd2_1800 | heat shock 40 kDa protein... | 0.058 | 360
cgd6_1090 | DnaJ(hsp40)'DnaJ(hsp40)' | 0.11 | 629
cgd4_3270 | heat shock 105kD; heat sh... | 0.45 | 1175
cgd7_360 | heat shock protein, Hsp70 | 0.46 | 1253
cgd7_3670 | heat shock protein 90 (Hs... | 1.5 | 217
cgd2_3330 | APG-1 like HSP70 domain c... | 3.7 | 819
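The E-values listed for the high-scoring sequences can be checked against the definition given in Section I (combined p-value times the number of sequences in the database). The sketch below uses the combined p-values printed in Section III and the database size of 12 sequences; the helper names are illustrative only, and the comparison is made at two significant digits because that is the precision at which the E-values are printed.

```python
N_SEQUENCES = 12  # "Database contains 12 sequences"

# (sequence name, combined p-value from Section III, E-value as printed)
REPORTED = [
    ("cgd2_20",   5.68e-06, 6.8e-05),
    ("cgd6_4970", 1.58e-04, 0.0019),
    ("cgd3_3440", 1.69e-04, 0.002),
    ("cgd2_3230", 6.81e-04, 0.0082),
    ("cgd6_2650", 1.78e-03, 0.021),
    ("cgd3_3770", 1.95e-03, 0.023),
    ("cgd2_1800", 4.87e-03, 0.058),
    ("cgd6_1090", 9.38e-03, 0.11),
    ("cgd4_3270", 3.73e-02, 0.45),
    ("cgd7_360",  3.85e-02, 0.46),
    ("cgd7_3670", 1.28e-01, 1.5),
    ("cgd2_3330", 3.12e-01, 3.7),
]

def e_value(combined_p, n_sequences=N_SEQUENCES):
    """E-value = combined p-value * number of sequences in the database."""
    return combined_p * n_sequences

def agrees(computed, printed):
    """Compare two values after rounding both to two significant digits."""
    return float(f"{computed:.2g}") == float(f"{printed:.2g}")

for name, combined_p, printed in REPORTED:
    computed = e_value(combined_p)
    print(f"{name:10s} {computed:.3g} {printed:g} {agrees(computed, printed)}")
```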
SECTION II: MOTIF DIAGRAMS
- The ordering and spacing of all non-overlapping motif occurrences
are shown for each high-scoring sequence listed in Section I.
- A motif occurrence is defined as a position in the sequence whose
match to the motif has POSITION p-value less than 0.0001.
- The POSITION p-value of a match is the probability of
a single random subsequence of the length of the motif
scoring at least as well as the observed match.
- For each sequence, all motif occurrences are shown unless there
are overlaps. In that case, a motif occurrence is shown only if its
p-value is less than the product of the p-values of the other
(lower-numbered) motif occurrences that it overlaps.
- The table also shows the E-value of each sequence.
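- Spacers and motif occurrences are indicated by
- -d-: a spacer of `d' residues before, between, or after motif occurrences,
- [sn]: occurrence of motif `n' with p-value less than 0.0001, where `s' is the strand sign.
A minus sign indicates that the occurrence is on the
reverse complement strand.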
- Sequences longer than 1000 are not shown to scale and are indicated by thicker lines.
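For occurrences on the reverse complement strand, the annotated sequences in Section III show the reverse complement of the motif's best possible match rather than the match itself. A small sketch of that relationship, using the best possible matches listed under MOTIFS (the function name is illustrative):

```python
# Translation table for complementing unambiguous DNA letters.
_COMPLEMENT = str.maketrans("ACGT", "TGCA")

def reverse_complement(seq):
    """Return the reverse complement of a DNA string over A, C, G, T."""
    return seq.translate(_COMPLEMENT)[::-1]

# Best possible matches as listed in the MOTIFS table of this report.
BEST_MATCH = {1: "GGGGGG", 2: "GGCGGTTTGGCGCCA", 3: "AAAAAAGAGGAAAAA"}

for number, match in BEST_MATCH.items():
    print(f"[+{number}] {match}   [-{number}] {reverse_complement(match)}")
```

The strings produced for motifs 2 and 3 are the ones printed above the [-2] and [-3] occurrences in Section III.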
Name | Expect | Motifs
---|---|---
cgd2_20 | 6.8e-05 | 126-[-3]-126-[+3]-67-[+2]-96-[-3]-240-[-3]-270-[+2]-82-[+3]-20-[+3]-11-[+3]-197-[+2]-63-[+2]-292
cgd6_4970 | 0.0019 | 72-[+3]-9-[-3]-39
cgd3_3440 | 0.002 | 425-[-3]-39-[-2]-167
cgd2_3230 | 0.0082 | 192-[-3]-45-[-3]-455-[+3]-152-[+3]-106
cgd6_2650 | 0.021 | 9-[-3]-116
cgd3_3770 | 0.023 | 66-[-3]-161-[+3]-729-[+3]-97
cgd2_1800 | 0.058 | 102-[+3]-243
cgd6_1090 | 0.11 | 194-[-3]-28-[+3]-2-[+3]-113-[-3]-232
cgd4_3270 | 0.45 | 687-[+3]-429-[+3]-29
cgd7_360 | 0.46 | 299-[-3]-50-[-3]-874
cgd7_3670 | 1.5 | 60-[-3]-142
cgd2_3330 | 3.7 | 819
SCALE: sequence positions 1 to 1075, with tick marks every 25 residues.
SECTION III: ANNOTATED SEQUENCES
- The positions and p-values of the non-overlapping motif occurrences
are shown above the actual sequence for each of the high-scoring
sequences from Section I.
- A motif occurrence is defined as a position in the sequence whose
match to the motif has POSITION p-value less than 0.0001 as
defined in Section II.
- For each sequence, the first line specifies the name of the sequence.
- The second (and possibly more) lines give a description of the
sequence.
- Following the description line(s) is a line giving the length,
combined p-value, and E-value of the sequence as defined in Section I.
- The next line reproduces the motif diagram from Section II.
- The entire sequence is printed on the following lines.
- Motif occurrences are indicated directly above their positions in the
sequence on lines showing
- the motif number of the occurrence (a minus sign indicates that
the occurrence is on the reverse complement strand),
- the position p-value of the occurrence,
- the best possible match to the motif (or its reverse complement), and
- columns whose match to the motif has a positive score (indicated
by a plus sign).
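The DIAGRAM line of each entry below can be read mechanically: bare numbers are spacer lengths and bracketed items are motif occurrences with a strand sign. The following sketch parses such a string and checks that spacers plus motif widths add up to the reported LENGTH. The motif widths come from the MOTIFS table; the function names and the regular expression are illustrative and are not part of MAST.

```python
import re

# Motif widths from the MOTIFS table of this report.
MOTIF_WIDTHS = {1: 6, 2: 15, 3: 15}

# Matches either a motif occurrence such as [+3] or [-2], or a spacer length.
_TOKEN = re.compile(r"\[([+-])(\d+)\]|(\d+)")

def parse_diagram(diagram):
    """Return a list of ('spacer', length) and ('motif', strand, number) items."""
    items = []
    for strand, motif, spacer in _TOKEN.findall(diagram):
        if spacer:
            items.append(("spacer", int(spacer)))
        else:
            items.append(("motif", strand, int(motif)))
    return items

def diagram_length(diagram, widths=MOTIF_WIDTHS):
    """Total sequence length implied by a diagram string."""
    total = 0
    for item in parse_diagram(diagram):
        if item[0] == "spacer":
            total += item[1]
        else:
            total += widths[item[2]]
    return total

def occurrence_starts(diagram, widths=MOTIF_WIDTHS):
    """1-based start position of each motif occurrence, in diagram order."""
    starts, position = [], 1
    for item in parse_diagram(diagram):
        if item[0] == "spacer":
            position += item[1]
        else:
            starts.append((item[1] + str(item[2]), position))
            position += widths[item[2]]
    return starts

print(parse_diagram("72-[+3]-9-[-3]-39"))
print(diagram_length("72-[+3]-9-[-3]-39"))
print(occurrence_starts("72-[+3]-9-[-3]-39"))
```

For cgd6_4970 the first occurrence starts at position 73, which is why its annotation is split across the sequence lines beginning at 1 and 76.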
cgd2_20
heat shock 70 (HSP70) protein
LENGTH = 1755 COMBINED P-VALUE = 5.68e-06 E-VALUE = 6.8e-05
DIAGRAM: 126-[-3]-126-[+3]-67-[+2]-96-[-3]-240-[-3]-270-[+2]-82-[+3]-20-[+3]-11-[+3]-197-[+2]-63-[+2]-292
[-3]
1.3e-06
TTTTTCCTCTTTTTT
++++++ ++++ +++
76 ATATATTAATATTATAAAGATGAGGAATTCATTCAAATAAATATAGTTGGATTTTTTTTTTTGTTTTATTAATTT
[+3]
3.9e-07
AAAAAAGAGGAAAAA
+++++++++++++++
226 ATAGGAAATTGCCTTTTATTTCTTTTAATTACTTTCTCATTTAAAAGAAAGAAAGAAAGTTGTTTTAGCTTTAAC
[+2]
2.1e-07
GGCGGTTTGGCGCCA
++ ++++++++++++
301 AATATATTTCTTTATATTGGAAATTTTTTTCTTGTCATGTGAAATTCTCGCACTTTTGGCGCCAATCTTGGATAT
[-3]
9.9e-08
TTTTTCCTCTTTTTT
++++++ ++++++++
451 TTTGAGTATATTTTTCTTCTTTTTTCATTTAAATTTATGTTTGTAAATTGTCTTAATAGTGAATTCTCATTCTAT
[-3]
1.2e-05
TTTTTCCTCTTTTTT
+ ++++ +++++ ++
676 ACTTTATTTTATTAGAATTTTTATTAAATAAAAGTTGTATTATTTTTTTTTTATTGTAATTATTAAAAATAATGG
[+2]
3.4e-07
GGCGGTTTGGCGCCA
++++++++++ ++++
976 AATAACGAAGCATGAACAACAACATGGCGGTTAGCTGCTAAAGTCAAATATTTACATTAATTATTATTATAAGAA
[+3]
3.1e-06
AAAAAAGAGGAAAAA
+++++++ ++ ++++
1051 AACGAGGAGTTGATTTATTCGGAAAGTAAAGTGATAAAAGTAAATGGAAAAAAGGGAGAAAATGAGGAATAAGAG
[+3] [+3]
2.6e-06 2.4e-05
AAAAAAGAGGAAAAA AAAAAAGAGGAAAAA
+++ +++++++++++ ++++++ ++++++
1126 GGGGAAGAAATGAAAGAAAGAAAAATATAAGAGAAAGAATGGGAAGAGTAGTAGTAGGAAGAAGGAAACAATGTA
[+2]
7.2e-08
GGCGGTTTGGCGCCA
++++++++++++++
1351 TCCATGTTATAGAAAATTGAAGGGTTTAGGCGCCAAATCGAGAGTTACTACTTTGTATAAAATAATTTATATATT
[+2]
4.3e-08
GGCGGTTTGGCGCCA
+++++++++++++++
1426 AATTGCGCATTAAATAAAAATTAGGGGGTTTGGCGGTAATTCTGAGACGCAATAATATTTAAAATAATAATAAAT
cgd6_4970
Hsp60; GroEL-like chaperone (ATpase), predicted mitochondrial
LENGTH = 150 COMBINED P-VALUE = 1.58e-04 E-VALUE = 0.0019
DIAGRAM: 72-[+3]-9-[-3]-39
[+3
1.0
AAA
+++
1 AATGACATTTATTTTGGCGGGAAAAGTTCTTGACAGAAAAAACATTTTTTTTTCATCACGAAGCCTGTAATTAAA
] [-3]
e-06 9.1e-05
AAAGAGGAAAAA TTTTTCCTCTTTTTT
++++++++ +++ ++++++++ ++ ++
76 GAAAAGGAGGAACTTTGGTGTTTTTTTCTGATTATTAGAAGGAATTGAGATATCATAGAGAAATAGCTTGAGAAG
cgd3_3440
heat shock protein HSP70, mitochondrial
LENGTH = 661 COMBINED P-VALUE = 1.69e-04 E-VALUE = 0.002
DIAGRAM: 425-[-3]-39-[-2]-167
[-3]
2.9e-05
TTTTTCCTCTTTTTT
++++++ ++++ +
376 ATCTACTTATTATTTAAAACTCTACTTGTAAATATTGAACATTATTGCATTTTTTTGTTTTAACTAATCTTTTTA
[-2]
3.5e-08
TGGCGCCAAACCGCC
+++++++++++ +++
451 TCTATTTTTGACAATCAGAAGAACACCTATGGCGCCAAAAAGCCCCAGTATAGCAAAATTTATTTACGGAATTCG
cgd2_3230
heat shock protein DnaJ Pfj2, putative
LENGTH = 1010 COMBINED P-VALUE = 6.81e-04 E-VALUE = 0.0082
DIAGRAM: 192-[-3]-45-[-3]-455-[+3]-152-[+3]-106
[-3]
1.6e-07
TTTTTCCTCTTTTTT
++++++++++++ ++
151 TATTTGGGAGCCTTTTTATTGCTAAAGGTTCACTAATGTTTATTTTTCCTTTTTCTTTGTTGATTATGATTCACT
[-3]
7.2e-07
TTTTTCCTCTTTTTT
+++++ +++++ +++
226 AGATCACATCTAATCCTTCTATCTCCTTTTTTACTCTTGTTTTTCTATTTTAATATTATCATATTCCACCTTTTT
[+3]
1.7e-08
AAAAAAGAGGAAAAA
++ ++++++++++++
676 CCTCCCTATATAAATATATTTATATATTGAGGGGAGTTTCAGAATTAAATAAAGAGGAAAAAGGTGATAATAAGG
[+3]
1.3e-06
AAAAAAGAGGA
++ +++ ++++
826 ATCACTTAAAGAATCTAATTTGGAATAGGAATAATAAATTCCTATAAGGTTCAAGCAAAGAATAAATAAACAGGA
AAAA
+++
901 AAAGTGGTATCATAATTATTAAGAATCTTAAATACTAAATTGTGAGGAGTGGTTTTGACACAGTTTTCAGAGTAG
cgd6_2650
heat shock protein, putative
LENGTH = 140 COMBINED P-VALUE = 1.78e-03 E-VALUE = 0.021
DIAGRAM: 9-[-3]-116
[-3]
7.7e-07
TTTTTCCTCTTTTTT
++++++++ ++++++
1 TTAGGCAAATTTTTCCTATTCTTTTTTATTGCCGGTATGGCAAGGGGGAAAAATTTAGCGCCTTATTTTTTAGTA
cgd3_3770
Hsp90
LENGTH = 1098 COMBINED P-VALUE = 1.95e-03 E-VALUE = 0.023
DIAGRAM: 66-[-3]-161-[+3]-729-[+3]-97
[-3]
5.1e-08
TTTTTCCTC
+++++++++
1 TAAAAAATTCAAAAATTTTTCGAGTAAAATTTCATTTAGTAAATATAAACAAAGGAAGACATTAATTTTTTTCTT
TTTTTT
++++++
76 TTTTTTTAATACATGTCTATTATCTCTGCAATAACGGTATTTCATAAATTGTTTATCCACATGAGTGCATGCATG
[+3]
7.7e-07
AAAAAAGAGGAAAAA
+++++++++++++ +
226 AATTGAGATTTGGACATAAAAAAAAGGAAATAAAGGTAAATTTGATGCTGAAAGAATGGGATAAGAAATTTGGGA
[+3]
5.9e-05
AAAAAAGAGGAAAAA
+ +++ ++ +++
976 TTTAATAGATAAGTCAAGGGGGGGAAGTTTAAATTAAAATTGAATAAAATAGGGGCAGGTGAGAGAAGAGATAAA
cgd2_1800
heat shock 40 kDa protein, putative
LENGTH = 360 COMBINED P-VALUE = 4.87e-03 E-VALUE = 0.058
DIAGRAM: 102-[+3]-243
[+3]
3.1e-07
AAAAAAGAGGAAAAA
++++++++ ++++++
76 AATATGTTAAATTTAAAGATAAATTACAAAAAAGAAAAAAAATGTTGAGAAGTGGTTATTTTAAAATTGACTTTT
cgd6_1090
DnaJ(hsp40)'DnaJ(hsp40)'
LENGTH = 629 COMBINED P-VALUE = 9.38e-03 E-VALUE = 0.11
DIAGRAM: 194-[-3]-28-[+3]-2-[+3]-113-[-3]-232
[-3]
2.6e-06
TTTTTCCTCTTTTTT
++++++ ++++++ +
151 TATTACAAGCGAGCGAATTTAATATCACAGTTATATTAGCATGGTTTTTCGTCTTTTCTACAACCCCTCATCGCT
[+3] [+3]
5.1e-07 1.2e-06
AAAAAAGAGGAAAAA AAAAAAGAGGAAAAA
++++++++ ++++++ ++ +++++ ++++++
226 CAACTTTAAATTAAAAAAAAAAAAAAACTAATAAAAAAAAAAAAATTAGAGAATTTTCACTTTAAATGTGAATTA
[-3]
3.5e-06
TTTTTCCTCTTTTTT
++++++ + +++ ++
376 ATTTCTTTTTTTTTTGTTCATTGTGTAGATATTACACACACCCAAATTACATGTACGTGCTCACACGTCTAAATT
cgd4_3270
heat shock 105kD; heat shock 105kD alpha; heat shock 105kD beta; heat shock 105kDa protein 1
LENGTH = 1175 COMBINED P-VALUE = 3.73e-02 E-VALUE = 0.45
DIAGRAM: 687-[+3]-429-[+3]-29
[+3]
1.1e-05
AAAAAAGAGGAAAAA
++ +++ +++ +++
676 CCAACAGGTTATAATAAACAGGGGGAACCCTCTAAAAGAGAGAAGGGTCAAGAAATAATCAACCGATTTGGTTGC
[+3]
2.0e-06
AAAAAAGAGGAAAAA
++ ++++++ +++++
1126 AAATTTAATAGAAAGTAAAAAAAGAACACAGAAACGTTAAATTTCGAGAA
cgd7_360
heat shock protein, Hsp70
LENGTH = 1253 COMBINED P-VALUE = 3.85e-02 E-VALUE = 0.46
DIAGRAM: 299-[-3]-50-[-3]-874
[
1
T
+
226 TACTAGAGAGAGTCAAAAAAAGGGATCAGCCAAGTATCTCTGACCTTCTAGAGCTTTACGGATCGTGTAAGTAAT
-3] [-3]
.3e-06 1.3e-05
TTTTCCTCTTTTTT TTTTTCCTCTT
++++++++++ ++ +++++ ++++
301 TTTTTCTTTTAATTTTTAAGCCTTTTAATAACTTACTTCCTTCTTGATTTGGCGTAAATTACACTTTTTATTCTC
TTTT
+++
376 ATTTTTTTTTCTCCTCAAATAGATACTGATCTAAGGAAAAGGTTCATCTCAGCTCAACCCTGCAACTCATCAACT
cgd7_3670
heat shock protein 90 (Hsp90), signal peptide plus ER retention motif
LENGTH = 217 COMBINED P-VALUE = 1.28e-01 E-VALUE = 1.5
DIAGRAM: 60-[-3]-142
[-3]
2.9e-05
TTTTTCCTCTTTTTT
+ ++++ ++++ ++
1 GAAAGAATTTAGAATAGCATATTAGCTACATTAATATAATTATCTCAAGCGTGTGCAGATTATTTTGTTTTGATT
cgd2_3330
APG-1 like HSP70 domain containing protein, signal peptide plus likely ER retention motif
LENGTH = 819 COMBINED P-VALUE = 3.12e-01 E-VALUE = 3.7
DIAGRAM: 819
Debugging Information
CPU: compute-0-3.local
Time 0.010000 secs.
mast /home/meme/meme354/LOGS/meme.11940.results /home/meme/meme354/LOGS/meme.11940.data -mf -df cparvum_hsp.fa -stdout