! Simulated_Site_Search
! Special purpose program to search for GTA N8 TAC N20-24 TANNNT
!    in random sequences
! (This program does not represent a very good way to do this --
!  for one thing it's very slow -- but it may correspond to the
!  method by which a human might think about doing the task)

! **************** LIBRARIES and FUNCTIONS ************

DECLARE FUNCTION Random_Integer

! ********************** VARIABLES ********************

LET times_site_found = 0     ! Count of number of times site is found in sequences
LET sequence$ = ""           ! Stores random sequence

! ********************** CONSTANTS ********************

LET true = 1
LET false = 0
LET total_number_of_trials = 1000
LET size_of_sequence = 1000
LET bases$ = "ACGT"
LET site_size_min = 3 + 8 + 3 + 20 + 6
                   ! GTA NNNNNNNN TAC NNN...NNN TANNNT

! ******************** MAIN PROGRAM *******************
! Strategy: The program goes through however many trials you specify.
!           Each trial, a random sequence is made up and the
!           sequence is examined to see if it contains an NtcA binding site.
!           If it does, the program adds one to the count of successes.
!           Every 100th trial, the program prints the trial number, to give
!           you something to look at on the screen while the program thinks.

FOR trial = 1 TO total_number_of_trials
    IF Mod(trial,100) = 0 THEN PRINT trial;
    CALL Make_up_random_sequence
    CALL Examine_sequence_for_site
    IF site_found = true THEN LET times_site_found = times_site_found + 1
NEXT trial

CALL Print_results

! ************* SUBROUTINES and FUNCTIONS *************

SUB Make_up_random_sequence
    ! Initializes sequence$ to a string of all blanks (length = size_of_sequence)
    ! Then replaces blanks with random bases
    ! Assumes that all four bases are equally likely to occur

    LET sequence$ = Repeat$(" ", size_of_sequence)
    FOR position = 1 TO size_of_sequence
        LET base = Random_Integer(1,4)   ! base = 1 for A, 2 for C, 3 for G, 4 for T
        LET sequence$[position:position] = bases$[base:base]
    NEXT position
END SUB

SUB Examine_sequence_for_site
    ! Sets site_found as true if site found in random sequence
    ! Sets site_found as false if site not found in random sequence
    ! Strategy: Ask whether first part of site (GTANNNNNNNNTAC)
    !           corresponds to the 14 bases in the current window
    !           under consideration
    !           If so, then look ahead 20-24 bases to see if second
    !           part of site is also present
    !           Do this considering the entire length of sequence
    !           Stop looking just before end, when finding a site
    !           is no longer possible.

    LET first_base = 1
    DO
        LET window_start = first_base
        LET window_end = window_start + 13
        CALL Check_for_first_part(sequence$[window_start:window_end])
        IF site_found = true THEN
            FOR spacer = 20 TO 24
                LET window_start = first_base + 14 + spacer
                LET window_end = window_start + 5
                CALL Check_for_second_part(sequence$[window_start:window_end])
                IF site_found = true THEN EXIT SUB
            NEXT spacer
        END IF
        LET first_base = first_base + 1
        ! The last start position at which the shortest site still fits is
        ! size_of_sequence - site_size_min + 1, so that position is examined too
    LOOP UNTIL first_base > (size_of_sequence - site_size_min + 1)
END SUB

SUB Check_for_first_part(subsequence$)
    ! Sets site_found as true if GTANNNNNNNNTAC found
    ! Otherwise sets site_found as false
    ! Strategy: Checks first three bases. If they're "GTA" then
    !           checks last three bases to see if they're "TAC"

    IF subsequence$[1:3] = "GTA" THEN
        IF subsequence$[12:14] = "TAC" THEN
            LET site_found = true
        ELSE
            LET site_found = false
        END IF
    ELSE
        LET site_found = false
    END IF
END SUB

SUB Check_for_second_part(subsequence$)
    ! Sets site_found as true if TANNNT found
    ! Otherwise sets site_found as false
    ! Strategy: Checks first two bases. If they're "TA" then
    !           checks last base to see if it's "T"

    IF subsequence$[1:2] = "TA" THEN
        IF subsequence$[6:6] = "T" THEN
            LET site_found = true
        ELSE
            LET site_found = false
        END IF
    ELSE
        LET site_found = false
    END IF
END SUB

SUB Print_results
    PRINT
    PRINT
    PRINT times_site_found/total_number_of_trials;
    PRINT "= fraction of random sequences of length";
    PRINT size_of_sequence;
    PRINT "with matches to GTA NNNNNNNN TAC NNN (20-24) NNN TANNNT"
END SUB

FUNCTION Random_Integer(low_integer, high_integer)
    ! Returns an integer between low_integer and high_integer, inclusive

    LET range = high_integer - low_integer + 1
    LET Random_Integer = Int(low_integer + range*rnd)
END FUNCTION

END
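
! ************* ANALYTIC CROSS-CHECK (sketch) *************
! A rough way to sanity-check the simulated fraction, offered as a sketch
! under stated assumptions rather than as part of the search itself.
! Assumes, as Make_up_random_sequence does, that all four bases are
! equally likely and independent.
! The site fixes 9 bases (GTA, TAC, TA and T), so any one combination of
! start position and spacer matches with probability (1/4)^9.
! Summing over every start position and spacer that fits in the sequence
! gives the expected number of matches per sequence. That number is an
! upper bound on the fraction of sequences containing a site, and should
! be close to it when the value is small.
! The name Expected_matches and its parameters are hypothetical.
! Because it follows END, this is an external function; a program that
! calls it would presumably need DECLARE FUNCTION Expected_matches.
! For a sequence of length 1000 and spacers 20-24 the sum works out to
! 4795/262144, or about 0.018.

FUNCTION Expected_matches(sequence_length, spacer_min, spacer_max)
    LET total = 0
    FOR spacer = spacer_min TO spacer_max
        LET site_length = 14 + spacer + 6     ! GTA N8 TAC, spacer, TANNNT
        LET positions = sequence_length - site_length + 1
        IF positions > 0 THEN LET total = total + positions * (1/4)^9
    NEXT spacer
    LET Expected_matches = total
END FUNCTION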