Sci. Aging Knowl. Environ., 12 December 2001
Cornfields Fertilize Microarray Techniques: Old-time agricultural statistics dig up new set of age-related genes
Key Words: microarray ANOVA variance statistics gene expression
Abstract: From an airplane window, the brown-green checkerboard of plowed and planted farmland resembles the red and green pattern of a DNA microarray. This laboratory tool can identify--on a single slide--which of an organism's genes are active at any time. The similarity to agricultural fields extends beyond appearance: Classical statistics, originally developed to evaluate crop performance, are now being applied to microarray interpretations. A new study analyzed how fruit fly gene activities differ between males and females as well as between young and old flies. Although many genes were affected by sex, only a small fraction changed with age--a different set of genes than that found by nonstatistical methods in a previous study. The new results bolster the view that microarrays yield more detailed and meaningful information when traditional statistical methods are used (see Making Aging Visible and related discussion).
DNA microarrays register what a cell is doing by measuring its messenger RNA (mRNA). Each of many thousands of spots on a microarray slide is a unique fragment of a single gene. In a microarray experiment, researchers first convert mRNAs from cell contents into their matching DNA (cDNA) strands and attach a distinctive fluorescent dye to each sample--for instance, those from old and young animals. Then they pour the cDNA mixture onto the slide and measure the glow at every spot. To assess how much cDNA stuck to each gene--which reflects the amount of mRNA the animals carried--researchers typically compare the brightness of each spot with that from a reference sample on the same slide. Usually they test each sample only once, and if the signal from the two samples differs by a large amount (usually twofold), they conclude that the genes' activities differ. But microarray results vary in the same way that corn productivity from multiple plots in the same field can--even when the farmer tries to provide them all with equal amounts of water, light, and fertilizer. The same sample might produce different amounts of fluorescence depending on the way it was diluted, the particular slide used, or some other unknown variable. Critics charge that the lack of replication precludes researchers from considering the extent to which the same sample can produce different results on separate trials. Moreover, reference samples add bias and additional variation as well as unnecessary inefficiency, because the researchers measure them many times even though they are of no scientific interest.
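The twofold rule described above, and the reason a single slide can mislead, can be sketched in a few lines of Python. This is only an illustration: the function names and all signal values are hypothetical and do not come from any study. The same sample is imagined to be read on three slides against a fixed reference.

```python
import math

def log2_ratio(sample_signal, reference_signal):
    """Log2 of the sample-to-reference fluorescence ratio for one spot."""
    return math.log2(sample_signal / reference_signal)

def twofold_call(sample_signal, reference_signal, threshold=2.0):
    """Apply the conventional rule: call a gene 'changed' when the ratio
    is at least `threshold`-fold in either direction."""
    return abs(log2_ratio(sample_signal, reference_signal)) >= math.log2(threshold)

# Hypothetical readings of the SAME sample on three different slides.
# The spread stands in for dilution, slide, and other unknown effects.
replicate_signals = [2100.0, 1400.0, 950.0]
reference_signal = 1000.0

calls = [twofold_call(s, reference_signal) for s in replicate_signals]
print(calls)
```

With these made-up numbers, only the first slide would pass the twofold cutoff. A researcher who ran that slide alone would report a change that the other two slides do not support.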
Several groups, including Jin's, have challenged the status quo by bringing traditional statistical techniques to bear on microarray analysis. Jin's group appears to be the first to apply the mixed-model method: a statistical approach that simultaneously compares many characteristics of different samples without using a reference and hinges on having several sets of results for each sample. The model is "mixed" because it accounts for accidental variation, such as the diameter of a spot, while examining traits such as gender, age, and strain. The analysis allows researchers to probe whether these characteristics of interest enhance or inhibit each other rather than operate independently.
On each microarray, Jin and colleagues deposited cDNA from young and old flies of a single sex and one of two possible strains; they repeated this procedure six times for each of the four combinations of sex and strain to generate a total of 24 slides. Then they looked for fluctuations in fluorescence between the same spots from animals of different ages on the same or on different slides. But first, they rendered the signals from the slides comparable to one another by adjusting their brightness so that, on average, they emitted the same overall intensity. The researchers then compared the fluorescence readings for each spot to the average brightness of the entire collection rather than to a reference sample. The result: Gene expression was affected most strongly by sex, less by strain, and only weakly by age. In addition, the small set of age-related genes they identified differed from that found by Zou and colleagues a year earlier using a traditional reference-sample technique, twofold changes, and only one microarray per sample. The disparity between the findings might stem from the experimental material itself: The two groups used animals of different ages and strains. But it could also be due to differences in statistical techniques: Because Jin and colleagues conducted the test on six duplicate slides, they were able to determine statistically that some twofold measurements from a single slide were insignificant. At the other extreme, some age-related genes showed only small--1.2-fold--increases in activity levels between samples, but the changes were statistically significant because they were consistently reproduced on all six slides. The results hint that the statistical methods used could bring much-needed rigor to the interpretation of microarrays, turning them from fallow fields into fertile ground.
--Katharine Miller, suggested by Greg Liszt.
Making Aging Visible http://www.maths.lth.se/matstat/bioinformatics/sageke
Citation: K. Miller, Cornfields Fertilize Microarray Techniques: Old-time agricultural statistics dig up new set of age-related genes. Science's SAGE KE (12 December 2001), http://sageke.sciencemag.org/cgi/content/abstract/sageke;2001/11/nw41
Science of Aging Knowledge Environment. ISSN 1539-6150