Copyright © 2005 Elsevier Ltd. All rights reserved.
Comments on selected fundamental aspects of microarray analysis
Received 26 May 2005.
Abstract
Microarrays are becoming a ubiquitous tool of research in life sciences. However, the working principles of microarray-based methodologies are often misunderstood or apparently ignored by the researchers who actually perform and interpret experiments. This in turn seems to lead to a common over-expectation regarding the explanatory and/or knowledge-generating power of microarray analyses.
In this note we explain the basic principles of five major groups of analytical techniques used in the study and interpretation of microarray data: principal component analysis (PCA), independent component analysis (ICA), the t-test, analysis of variance (ANOVA), and self-organizing maps (SOM). We discuss answers to selected practical questions related to the analysis of microarray data. We also take a closer look at the experimental setup and the rules that have to be observed in order to exploit microarrays efficiently. Finally, we discuss in detail the scope and limitations of microarray-based methods. We emphasize that no amount of statistical analysis can compensate for (or replace) a well-thought-out experimental setup. We conclude that microarrays are indeed useful tools in life sciences, but they should by no means be expected to generate complete answers to complex biological questions. We argue that even well-posed questions, formulated within a microarray-specific terminology, cannot be completely answered with the use of microarray analyses alone.
Keywords: Microarrays; Fundamental tools; FAQ section
Article Outline
- 1. Introduction
- 2. Fundamentals and basic terminology
- 2.1. General introduction to microarrays
- 2.1.1. The historical background
- 2.1.2. Today's microarrays
- 2.2. Applications
- 3. Data representation and analysis
- 3.1. The raw data
- 3.2. The data table and some preliminary considerations and manipulations
- 3.2.1. Translation
- 3.2.2. Normalization
- 3.3. Graphic exploration
- 3.3.1. Preliminary considerations
- 3.3.2. By hand (with a spreadsheet)
- 3.3.3. PCA
- 3.3.4. ICA
- 3.3.5. A brief remark
- 3.4. Statistical tests
- 3.4.1. Preliminary considerations
- 3.4.2. ANOVA
- 3.4.3. Paired t-test
- 3.4.4. t-Test
- 3.4.5. In conclusion
- 3.5. Graphic exploration and statistical tests in comparison
- 3.6. And the clustering approach?
- 3.7. And SOM?
- 4. Intricacies of microarray-based methods
- 4.1. FAQ
- 4.1.1. Missing values
- 4.1.2. The correction of the background noise on the membranes, glass plates or silicon chips
- 4.1.3. Dealing with data containing a large number of very small or zero values
- 4.1.4. Taking the ratio or not?
- 4.1.5. The problem posed by the two fluorescent dyes used with glass plates
- 4.1.6. How many genes should I put on my microarray?
- 4.1.7. What can I do if my signal is outside the linear range (of my machine)?
- 4.1.8. How does one tackle a temporal series?
- 4.1.9. How do we find genes for an accurate diagnosis of a disease?
- 4.1.10. How do we determine the relative importance of a factor?
- 4.1.11. What does the p-value tell me? What about false positives?
- 4.1.12. Will I not miss out on a few genes?
- 4.2. How to plan one's experiment?
- 4.2.1. The type of factors
- 4.2.2. The ideal situation: a fully crossed factorial design
- 4.2.3. The reality
- 4.2.4. The combination of factors
- 5. The answer to all our questions?
- 6. Software and data used
- Acknowledgements
- References












+ (3.737 × 0.251) = 19.0. The other coordinates are obtained accordingly.
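The surviving arithmetic above reads like the last term of a weighted sum, in which each measured value is multiplied by a weight and the products are added to give one coordinate. Assuming that this is the projection of an observation onto a component (as in PCA), a minimal sketch of the calculation could look as follows. The function names and all numbers are hypothetical and chosen only for illustration; they are not taken from the article.

```python
def project(row, loadings):
    """Coordinate of one observation on one component.

    Computed as the sum of (value * loading) over all variables.
    """
    if len(row) != len(loadings):
        raise ValueError("row and loadings must have the same length")
    return sum(x * w for x, w in zip(row, loadings))


def project_all(rows, components):
    """Coordinates of every observation on every component."""
    return [[project(row, comp) for comp in components] for row in rows]


# Hypothetical data: two observations measured on three variables.
rows = [
    [2.0, 4.0, 1.0],
    [0.0, 8.0, 2.0],
]
# Hypothetical loadings for two components (one list per component).
components = [
    [0.5, 0.25, -1.0],
    [1.0, 0.0, 0.5],
]

coordinate = project(rows[0], components[0])
coordinates = project_all(rows, components)
print(coordinate)
print(coordinates)
```

In a real PCA the loadings come from the eigenvectors of the covariance (or correlation) matrix, and the data are usually centered before projection; both steps are omitted here to keep the weighted-sum idea visible.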
