Establishing glucose- and ABA-regulated transcription networks in Arabidopsis by microarray analysis and promoter classification using a Relevance Vector Machine

Yunhai Li (1), Kee Khoon Lee (3), Sean Walsh (2), Caroline Smith (1), Sophie Hadingham (1), Karim Sorefan (1), Gavin Cawley (3), and Michael W. Bevan (1,4)

  (1) Department of Cell and Developmental Biology, John Innes Centre, Norwich NR4 7UH, United Kingdom
  (2) Computational Biology Department, John Innes Centre, Norwich NR4 7UH, United Kingdom
  (3) The School of Computing Sciences, University of East Anglia, Norwich NR4 7TJ, United Kingdom

Abstract

Establishing transcriptional regulatory networks by analysis of gene expression data and promoter sequences shows great promise. We developed a novel promoter classification method using a Relevance Vector Machine (RVM) and Bayesian statistical principles to identify discriminatory features in the promoter sequences of genes that can correctly classify transcriptional responses. The method was applied to microarray data obtained from Arabidopsis seedlings treated with glucose or abscisic acid (ABA). Of those genes showing >2.5-fold changes in expression level, ∼70% were correctly predicted as being up- or down-regulated (under 10-fold cross-validation), based on the presence or absence of a small set of discriminative promoter motifs. Many of these motifs have known regulatory functions in sugar- and ABA-mediated gene expression. One promoter motif that was not known to be involved in glucose-responsive gene expression was identified as the strongest classifier of glucose-up-regulated gene expression. We show it confers glucose-responsive gene expression in conjunction with another promoter motif, thus validating the classification method. We were able to establish a detailed model of glucose and ABA transcriptional regulatory networks and their interactions, which will help us to understand the mechanisms linking metabolism with growth in Arabidopsis. This study shows that machine learning strategies coupled to Bayesian statistical methods hold significant promise for identifying functionally significant promoter sequences.
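
The classification setup described in the abstract (promoters encoded by the presence or absence of motifs, scored under 10-fold cross-validation) can be illustrated with a minimal sketch. Everything specific here is an assumption made for illustration: the motif strings, the toy promoter sequences, and all function names are hypothetical, and a simple Bernoulli naive Bayes classifier stands in for the Relevance Vector Machine, which is not reproduced here.

```python
import math
import random

# Hypothetical placeholder motifs; NOT the motifs reported in the study.
MOTIFS = ["ACGTG", "AATATCT"]


def featurize(promoter, motifs=MOTIFS):
    """Encode a promoter as a binary presence/absence vector over motifs."""
    return [1 if m in promoter else 0 for m in motifs]


def fit_bernoulli_nb(X, y):
    """Fit a Bernoulli naive Bayes model with add-one smoothing.

    This is a stand-in classifier for illustration, not an RVM.
    """
    model = {}
    n_features = len(X[0])
    for label in (0, 1):
        rows = [x for x, t in zip(X, y) if t == label]
        prior = (len(rows) + 1) / (len(X) + 2)
        probs = [
            (sum(r[j] for r in rows) + 1) / (len(rows) + 2)
            for j in range(n_features)
        ]
        model[label] = (prior, probs)
    return model


def predict(model, x):
    """Return the label (1 = up-regulated, 0 = down-regulated) with the higher log score."""
    def log_score(label):
        prior, probs = model[label]
        return math.log(prior) + sum(
            math.log(p if v else 1 - p) for p, v in zip(probs, x)
        )
    return max((0, 1), key=log_score)


def cross_val_accuracy(X, y, k=10, seed=0):
    """Fraction of held-out examples classified correctly under k-fold cross-validation."""
    idx = list(range(len(X)))
    random.Random(seed).shuffle(idx)
    folds = [idx[i::k] for i in range(k)]
    correct = 0
    for fold in folds:
        held_out = set(fold)
        train = [i for i in idx if i not in held_out]
        model = fit_bernoulli_nb([X[i] for i in train], [y[i] for i in train])
        correct += sum(predict(model, X[i]) == y[i] for i in fold)
    return correct / len(X)


# Toy data (fabricated for the demo only): "up" promoters carry the first motif.
up = ["TTTTACGTGTTTT"] * 10
down = ["TTTTTTTTTTTTT"] * 10
X = [featurize(p) for p in up + down]
y = [1] * 10 + [0] * 10
accuracy = cross_val_accuracy(X, y, k=10)
```

On this perfectly separable toy set the held-out accuracy is trivially 1.0; real expression data would be far noisier, as the ∼70% figure quoted in the abstract indicates.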

Footnotes

  • [Supplemental material is available online at www.genome.org. The microarray data from this study have been submitted to ArrayExpress under accession no. E-MEXP-475.]

  • Article published online ahead of print. Article and publication date are at http://www.genome.org/cgi/doi/10.1101/gr.4237406.

  • 4 Corresponding author. E-mail michael.bevan{at}bbsrc.ac.uk; fax 01603 450025.

  • Received June 6, 2005.

  • Accepted November 14, 2005.