Introduction to Microarrays
Why microarrays?

Consider some of the most fundamental questions in biology:

  • How does one cell or tissue type differ from another?

      - What makes a cell tumorigenic?
      - How is smooth muscle different molecularly from skeletal muscle?

  • How does an organism respond to external events?

      - What are brain cells doing when they are creating a new memory?
      - How do liver cells change when challenged by a drug?

  • What is the basis for changes in an organism over time?

      - What is the process through which an embryo develops into a mature organism?
      - What marks the process of apoptosis (programmed cell death)?

  • What is the mechanism by which key regulatory proteins exert their effects?

      - How does the loss or mutation of FMR1 lead to the array of observable symptoms
        collected under the umbrella of fragile X syndrome?

These questions can be addressed by comparing the state of the cell, tissue, or organism under two or more conditions: at different developmental states (e.g. normal vs. tumorigenic), in different external environments (e.g. with or without a drug), at different moments in time (e.g. throughout the time course of development), or in different genetic backgrounds (e.g. with or without a specific mutation).

But what is “state”? Included in that term is the collection of all (small) molecules within a cell, tissue, or organism, studied through the tools of metabolomics. We’d also like to know the collection of all proteins, the molecules that most directly affect cell behavior, studied through the tools of proteomics. The set of proteins is determined in part by, and can be monitored indirectly through, the collection of messenger RNA, studied through the tools of transcriptomics. The set of genes and noncoding DNA can be studied through the tools of genomics, but the questions listed above cannot be addressed in this way, as almost all cells within an organism possess the same DNA.

At present, the best-developed set of tools is that of transcriptomics. It’s far easier to monitor RNA than protein, and the chief tool for doing so is the microarray.

What are microarrays? (consider skipping if you already know all about microarrays)

To see what a microarray is and how it can be used, see any/some/all/none of the web sites below for possibly useful animations:

Animation of procedure to use microarrays (short animation)
     Malcolm Campbell, Davidson College

Animation of procedure to make and use microarrays (long animation)
     Genetic Science Learning Center, University of Utah

Anatomy of a Comparative Gene Expression Study (non-animation)
     Jeremy Buhler, Washington University in St. Louis

There are two general kinds of microarrays: those that give absolute measures of RNA abundance (most of them marketed by Affymetrix) and those that give relative measures, comparing two conditions. The two animations illustrate the latter, generally less expensive kind of microarray, and we’ll focus henceforth on that kind, termed spotted arrays. The process of spotting is intrinsically variable: the spots may contain different amounts of DNA, the shapes of the spots may vary, and the amount of DNA lying outside of the spotted area may differ from one region of the microarray to another. So absolute levels of RNA measured by a spotted array are not as meaningful as you might hope. Useful analysis of the results comes from comparing the binding of RNA isolated under one condition to the binding of RNA isolated under the other condition -- both to the same microarray. The two RNAs are distinguished by the color of the fluorescent tag, as illustrated in the animations.
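
To see why a ratio is more robust than an absolute reading, consider a minimal Python sketch. It assumes, purely for illustration, that the signal from each dye is proportional to the amount of DNA in the spot multiplied by the abundance of the corresponding RNA. The function name `spot_signal` and all numbers are hypothetical and are not taken from any real array.

```python
def spot_signal(dna_amount, rna_abundance):
    """Toy model (an assumption): signal is proportional to the DNA
    spotted times the abundance of the labeled RNA."""
    return dna_amount * rna_abundance


# Hypothetical abundances of one gene's RNA under the two conditions.
rna_condition_1 = 200.0  # labeled with one dye
rna_condition_2 = 600.0  # labeled with the other dye

# The same gene spotted twice, with different, uncontrolled amounts of DNA.
ratios = []
for dna_amount in (0.5, 2.0):
    signal_1 = spot_signal(dna_amount, rna_condition_1)
    signal_2 = spot_signal(dna_amount, rna_condition_2)
    ratios.append(signal_2 / signal_1)
    print(f"DNA={dna_amount}: signal_1={signal_1:.0f}, "
          f"signal_2={signal_2:.0f}, ratio={signal_2 / signal_1:.1f}")
```

Under this toy model the absolute signals differ fourfold between the two spots, while the ratio between the two conditions is the same for both, because the amount of DNA cancels out.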

Closer look at microarrays through the lens of a scientific problem

Enough generalities. Let's look at a real problem addressed by real microarrays. 

To plants and photosynthetic bacteria, light is life. No light, no food. But too much light... death. That's because if more light is absorbed by photosynthetic pigments than can be harnessed or effectively dissipated, the excess energy can find its way to oxygen, creating reactive species (superoxide, peroxides, and free radicals) that can damage the cell. Under conditions of high light, photosynthetic cells have to radically retool to prevent this fate. 

But how does the cell alter its metabolism to cope with the stress imposed by the new environment? What regulatory proteins sense the condition of high light and how do they alter gene expression? Microarrays can help us answer these questions by providing information regarding the following:
 

Which genes are up-regulated or down-regulated in response to high light?

To find some microarrays to address this problem:

  • Go to the Stanford MicroArray Database 

  • Click on Public Login (on the left)

  • Under Category, highlight stress. Click Display data.

  • Too many experiments! Go back to the menus, and under SubCategory, highlight light-treatment. Click Display data.

  • The list of experiments is certainly difficult to understand. The names were chosen by the experimenters for their own use, not for ours. The first four experiments (39587 - 39590) ask the question of how gene expression is affected when the gene encoding the regulatory protein DspA is disrupted. Skip over those for now.

  • The next four experiments (39583 - 39586) examine the response of wild-type (wt) cells to high light (HL) intensity for periods of 0.5, 1.0, 3.0, and 6.0 hours. That sounds pertinent! Consider Experiment 39584, examining the effect of the second condition: 1 hr of high light intensity. In this experiment the RNA level in cells exposed to 1 hr of high light intensity is compared to the RNA level in cells maintained continuously under low light intensity.

  • Look at the Options column on the row of that experiment. You'll see seven icons. The sixth icon is a multicolored box (representing the microarray image). Click on that (and be prepared to wait a while). You should see a microarray in front of you.
SQ1. How many spots are there on this microarray?

SQ2. How would you describe the range of characteristics of the spots?

SQ3. What is the significance of each kind of spot? E.g. What does red mean?
          What does green mean? (Not sure? Stay tuned!)

Find a particularly green spot and click on it. After a while a new screen will appear, showing a blow-up of the spot plus information about the gene whose DNA is present in the spot and information about the fluorescence emanating from the spot. There's plenty -- too much -- information here. For now focus on just the following:
Biological Information (left column)
  • Systematic name (this will enable us to refer to the same gene in BioBIKE)
  • Subcategory (this tells you the general class to which the protein belongs)
  • Product (this tells you the specific function of the protein, if known)
Spot Information (right column)
  • Channel 1 and 2 intensities (mean)
  • Channel 1 and 2 backgrounds (median)
  • Channel 1 and 2 nets (mean)
  • Spot Flag
  • Number of spot pixels
  • G/R mean
  • R/G mean
Note the quantities above (copy them down someplace) and repeat the process on another spot, this time red. Do this with a few other spots of your choice.
    SQ4. What is the significance of Channel 1 and 2 intensities?
    SQ5. What do you think is the significance of the backgrounds? (More on this in a moment)
    SQ6. Where does the net come from?
    SQ7. How many pixels are there per spot? How does this number relate to the size of the white box at the top of the left column?
    SQ8. Where do G/R mean and R/G mean come from?
Now for some strangeness. Go back to the picture of the microarray and identify the 19th spot on the top row (it will be SLR0311). 
    SQ9. From your knowledge of how microarrays are made, how do you explain its peculiar form? Notice that there are many other spots with the same form.
Go back to the menu and select the microarray image for experiment 39588. Scroll about 75% down the image. There you'll see something awful. Click on a dot obscured by the noise. Surely there's no useful information in some of these spots. But the computer can't see the image, just numbers. 
    SQ10. What in the Spot Information for this spot could alert you to the problem?
    SQ11. What problems do you see, or can you imagine, in interpreting spots from a single microarray?
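
Once you have copied down the Spot Information for a few spots, a short script can help you test your own ideas about how the quantities relate. The Python sketch below encodes one plausible reading, stated here as an assumption and not as the database's documented method: the net for each channel is the mean intensity minus the median background, and a ratio is formed from the two nets. The `Spot` class, its field names, and all numbers are hypothetical. Compare what it produces with the nets and ratios reported on the spot pages.

```python
from dataclasses import dataclass
from typing import Optional, Tuple


@dataclass
class Spot:
    """Quantities copied from the Spot Information column (values made up)."""
    label: str
    ch1_mean: float        # Channel 1 intensity (mean)
    ch1_background: float  # Channel 1 background (median)
    ch2_mean: float        # Channel 2 intensity (mean)
    ch2_background: float  # Channel 2 background (median)

    def nets(self) -> Tuple[float, float]:
        # Assumption: net = mean intensity minus median background.
        return (self.ch1_mean - self.ch1_background,
                self.ch2_mean - self.ch2_background)

    def ratio_ch2_over_ch1(self) -> Optional[float]:
        # Assumption: the ratio is formed from the two net values.
        # A net at or below zero means the spot is no brighter than its
        # surroundings, so no meaningful ratio can be formed.
        net1, net2 = self.nets()
        if net1 <= 0 or net2 <= 0:
            return None
        return net2 / net1


# Hypothetical spots: one clean, one barely distinguishable from background.
clean = Spot("clean spot", ch1_mean=1200, ch1_background=200,
             ch2_mean=3200, ch2_background=200)
noisy = Spot("noisy spot", ch1_mean=250, ch1_background=240,
             ch2_mean=230, ch2_background=245)

for spot in (clean, noisy):
    print(spot.label, spot.nets(), spot.ratio_ch2_over_ch1())
```

The sketch deliberately does not say which channel is red and which is green; that is for you to work out from the spots you examined.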

Resolution: Which genes are over- and under-expressed in response to high light intensity?

The Stanford MicroArray Database site is a place to get and look at microarray data, but it's not a particularly good place to analyze microarray data. For that, we turn to CyanoBIKE.

  • Log onto CyanoBIKE (VCU mirror)
     
  • Enter the following:
    (you can get the function either from the alphabetical list or from the GENOME/Microarray-Analysis menu). The function gives you values from a microarray experiment for a given gene and experimental condition.
     
    • For the experiment, go to the DATA menu, Microarrays submenu, Synechocystis submenu, and click Hihara_2001, a series of microarray experiments measuring the effect of high light conditions on gene expression in the cyanobacterium Synechocystis PCC 6803.
    • For the gene, insert the systematic name of a gene you copied from the Stanford MicroArray Database that gave a green spot.
    • For the condition, enter 2, since the second condition was 1 hr of high light intensity.
    • Choose as an option +TARGET-VALUES, directing the function to return the experimental values, i.e. fluorescence intensity of the microarray spot probed with mRNA isolated under the high light condition.

     
  • Execute the function. Notice that you get back a list of six numbers: Hihara et al. repeated each experimental condition six times.
     
  • Modify the function by replacing +TARGET-VALUES with +CONTROL-VALUES, directing the function to return the fluorescence intensity of the microarray spot probed with mRNA isolated under normal light conditions.
     
  • Execute the function and compare the control values to the experimental values.
    SQ12. Did the gene expression go up under high light? Down? No change? Not sure?
    SQ13. How would you calculate a ratio of expression? Consider using DIVIDE and MEAN.
  • Repeat the same commands but with gene names you found in your perusal of the microarray spots.

  • Enter the following:
    (you can get the function either from the alphabetical list or from the GENOME/Microarray-Analysis menu). The function searches a microarray experiment, within a given set of genes and under a given experimental condition, for values that meet your specifications.
     
    • For the experiment, go to the DATA menu, Microarrays submenu, Synechocystis submenu, and click Hihara_2001. Don't be tempted to type in the name. It won't work.
    • For the gene, insert the GENES-OF function with the argument S6803, so that you'll search amongst all genes of Synechocystis
    • For the condition, enter 2, as before
    • For the search condition (the menu labeled "ratio"), leave alone the default -- ratio of experimental value to control value
    • For the comparator (the menu labeled ">"), leave alone the default -- searching for values greater than the specified value
    • For the value, specify 3 (requiring that the ratio of experimental to control exceed 3)
    • Choose as options +RATIO (show the ratio) and +DESCRIPTION-SHORT (show a short description)

     
  • Execute the function. You can display the results in a nice format by surrounding the function with DISPLAY-LIST and clicking the EACH icon.
    SQ14. Does the list of genes supposedly upregulated by high intensity light make sense from their descriptions?
  • Change the function so that it looks for genes with ratios less than 0.333.
    SQ15. Does the list of genes supposedly downregulated by high intensity light make sense from their descriptions?
    SQ16. Can you trust this list? Is a factor of 3 reliable? Why not 2? Why not 4? How to decide?
  • Go back to the INSIDE-MICROARRAY function, change the gene to SLL1683, and choose as options +RATIO, +TARGET-VALUES, and +CONTROL-VALUES.
    SQ17. Do you believe that this gene is significantly downregulated by high light intensity? How to decide?
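
The calculations behind these steps can also be sketched outside CyanoBIKE. The Python below is a stand-in for illustration only, not the BioBIKE implementation: the gene labels, the function names `ratio_of_means` and `median_replicate_ratio`, and all intensities are made up, and none of the numbers come from the Hihara data. It computes a ratio from the mean of six experimental and six control values, applies the thresholds of 3 and 1/3, and then looks at the replicates one by one, assuming (hypothetically) that the i-th experimental value and the i-th control value come from the same array.

```python
from statistics import mean, median

# Made-up replicate intensities (six repeats per condition), NOT Hihara data.
replicates = {
    "gene_A": {"target": [900, 1100, 1000, 950, 1050, 1000],
               "control": [250, 250, 200, 300, 250, 250]},
    "gene_B": {"target": [100, 120, 80, 100, 110, 90],
               "control": [400, 380, 420, 400, 410, 390]},
    "gene_C": {"target": [50, 2000, 60, 40, 55, 45],  # one wild replicate
               "control": [100, 100, 100, 100, 100, 100]},
}


def ratio_of_means(values):
    """Mean experimental value divided by mean control value."""
    return mean(values["target"]) / mean(values["control"])


def median_replicate_ratio(values):
    """Median of the per-replicate ratios; less swayed by a single outlier."""
    pairs = zip(values["target"], values["control"])
    return median(t / c for t, c in pairs)


# Apply the same cutoffs used in the search above.
up = [g for g, v in replicates.items() if ratio_of_means(v) > 3]
down = [g for g, v in replicates.items() if ratio_of_means(v) < 1 / 3]

print("ratio > 3:", up)
print("ratio < 1/3:", down)
for gene, values in replicates.items():
    print(gene, round(ratio_of_means(values), 2),
          round(median_replicate_ratio(values), 2))
```

In this invented data set, gene_C passes the cutoff of 3 on the ratio of means even though five of its six replicates show lower signal in the experimental condition; a single aberrant spot is enough. That is the kind of case to keep in mind when weighing SQ16 and SQ17.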