ABSTRACT
Microarray experiments can reveal useful information about transcriptional regulation. We search for regulatory elements in the region upstream of the translation start of coexpressed genes. Here we present two modifications to the original Gibbs sampling algorithm [12]. First, we introduce a probability distribution to estimate the number of copies of the motif in a sequence. Second, we incorporate a higher-order background model. We have successfully tested our algorithm on several data sets. We first show results on two selected data sets: sequences from plants containing the G-box motif, and the upstream sequences of bacterial genes regulated by the O2-responsive protein FNR. In both cases the motif sampler finds the expected motifs. Finally, the sampler is tested on four clusters of coexpressed genes from a wounding experiment in Arabidopsis thaliana. We find several putative motifs that are related to the pathways involved in the plant defense mechanism.
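As a rough illustration of the higher-order background model, the following minimal Python sketch (not the authors' implementation) estimates an order-m Markov background from a set of background sequences. It then computes, for one sequence, the normalized distribution over motif start positions from which a Gibbs site sampler would draw. The function names (`train_background`, `consensus_pwm`, `log_odds`, `site_distribution`), the pseudocount, the toy sequences, and the 0.85 consensus weight are all assumptions made for illustration. Sequences are assumed to contain only A, C, G, and T, and the copy-number distribution mentioned in the abstract is not modeled here.

```python
import math
from collections import defaultdict

ALPHABET = "ACGT"  # assumption: sequences contain only these four bases


def train_background(sequences, order=3, pseudocount=1.0):
    """Estimate P(base | previous `order` bases) with additive smoothing."""
    counts = defaultdict(lambda: defaultdict(float))
    for seq in sequences:
        for i in range(order, len(seq)):
            counts[seq[i - order:i]][seq[i]] += 1.0

    def prob(context, base):
        # Unseen (or too-short) contexts fall back to a uniform distribution.
        row = counts.get(context, {})
        total = sum(row.values()) + pseudocount * len(ALPHABET)
        return (row.get(base, 0.0) + pseudocount) / total

    return prob


def consensus_pwm(consensus, match=0.85):
    """Hypothetical helper: a motif matrix peaked on a consensus string."""
    other = (1.0 - match) / (len(ALPHABET) - 1)
    return [{b: (match if b == c else other) for b in ALPHABET}
            for c in consensus]


def log_odds(seq, start, pwm, bg_prob, order):
    """Log-likelihood ratio of motif model vs. background for one window."""
    score = 0.0
    for j, column in enumerate(pwm):
        pos = start + j
        base = seq[pos]
        context = seq[max(0, pos - order):pos]
        score += math.log(column[base]) - math.log(bg_prob(context, base))
    return score


def site_distribution(seq, pwm, bg_prob, order):
    """Normalized weights over start positions (one Gibbs sampling step)."""
    width = len(pwm)
    scores = [log_odds(seq, s, pwm, bg_prob, order)
              for s in range(len(seq) - width + 1)]
    top = max(scores)  # subtract the maximum for numerical stability
    weights = [math.exp(x - top) for x in scores]
    total = sum(weights)
    return [w / total for w in weights]


if __name__ == "__main__":
    # Toy data, invented for this sketch.
    background = ["TTATTTCTTTTGTTTTATTTCTTTAT", "ATTTTGTTTCTTTTATTTTGTTTTTA"]
    bg = train_background(background, order=2)
    pwm = consensus_pwm("CACGTG")
    seq = "TTTATTTCACGTGTTTATTT"
    dist = site_distribution(seq, pwm, bg, order=2)
    best = max(range(len(dist)), key=dist.__getitem__)
    print(best, round(dist[best], 3))
```

In a full sampler, a start position would be drawn from this distribution for each sequence in turn, and the motif matrix would be re-estimated from the currently selected sites.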
- {1} T. L. Bailey and C. Elkan. Unsupervised learning of multiple motifs in biopolymers using expectation maximization. Machine Learning, 21:51-80, 1995.
- {2} G. F. Birkenmeier and C. A. Ryan. Wound signaling in tomato plants, evidence that ABA is not a primary signal for defense gene activation. Plant Physiol, 117(2):687-693, 1998.
- {3} P. Bucher. Regulatory elements and expression profiles. Current Opinion in Structural Biology, 9:400-407, 1999.
- {4} F. De Smet, G. Thijs, K. Marchal, B. De Moor, and Y. Moreau. Quality-based clustering of gene expression profiles. Submitted, 2000.
- {5} A. L. Delcher, D. Harmon, S. Kasif, O. White, and S. L. Salzberg. Improved microbial gene identification with GLIMMER. Nucleic Acids Research, 27(23):4636-4641, 1999.
- {6} J. L. DeRisi, V. R. Iyer, and P. O. Brown. Exploring the metabolic and genetic control of gene expression on a genomic scale. Science, 278:680-, 1997.
- {7} M. B. Eisen, P. T. Spellman, P. O. Brown, and D. Botstein. Cluster analysis and display of genome-wide expression patterns. Proc. Natl. Acad. Sci. USA, 95:14863-14868, 1998.
- {8} L. J. Heyer, S. Kruglyak, and S. Yooseph. Exploring expression data: Identification and analysis of coexpressed genes. Genome Research, 9:1106-1115, 1999.
- {9} J. D. Hughes, P. W. Estep, S. Tavazoie, and G. M. Church. Computational identification of cis-regulatory elements associated with groups of functionally related genes in Saccharomyces cerevisiae. Journal of Molecular Biology, 296:1205-1214, 2000.
- {10} L. J. Jensen and S. Knudsen. Automatic discovery of regulatory patterns in promoter regions based on whole cell expression data and functional annotation. Bioinformatics, 16(4):326-333, 2000.
- {11} A. Krogh. Two methods for improving performance of an HMM and their application for gene finding. In Proceedings ISMB'97, pages 179-186, 1997.
- {12} C. E. Lawrence, S. F. Altschul, M. S. Boguski, J. S. Liu, A. F. Neuwald, and J. C. Wootton. Detecting subtle sequence signals: a Gibbs sampling strategy for multiple alignment. Science, 262:208-214, 1993.
- {13} R. J. Lipshutz, S. P. A. Fodor, T. R. Gingeras, and D. J. Lockhart. High density synthetic oligonucleotide arrays. Nature Genetics Supplement, 21:20-24, January 1999.
- {14} J. S. Liu, A. F. Neuwald, and C. E. Lawrence. Bayesian models for multiple local sequence alignment and Gibbs sampling strategies. Journal of the American Statistical Association, 90(432):1156-1170, 1995.
- {15} A. V. Lukashin and M. Borodovsky. GeneMark.hmm: new solutions for gene finding. Nucleic Acids Research, 26:1107-1115, 1998.
- {16} K. Marchal. The O2 paradox of Azospirillum brasilense under diazotrophic conditions. PhD thesis, FLTBW, KULeuven, 1999.
- {17} E. Mjolsness, T. Mann, R. Castaño, and B. Wold. From coexpression to coregulation: An approach to inferring transcriptional regulation among gene classes from large-scale expression data. In Proceedings NIPS 2000, volume 12, pages 928-934, 2000.
- {18} A. F. Neuwald, J. S. Liu, and C. E. Lawrence. Gibbs motif sampling: detection of bacterial outer membrane protein repeats. Protein Science, 4:1618-1632, 1995.
- {19} N. Pavy, S. Rombauts, P. Déhais, C. Mathé, D. V. V. Ramana, P. Leroy, and P. Rouzé. Evaluation of gene prediction software using a genomic data set: Application to Arabidopsis thaliana sequences. Bioinformatics, 15:887-899, 1999.
- {20} H. Pena-Cortes, J. J. Sanchez-Serrano, R. Mertens, L. Willmitzer, and S. Prat. Abscisic acid is involved in the wound-induced expression of the proteinase inhibitor II gene in potato and tomato. Proc. Natl. Acad. Sci. USA, 86:9851-9855, 1989.
- {21} P. Reymond and E. E. Farmer. Jasmonate and salicylate as global signals for defense gene expression. Curr Opin Plant Biol, 1(5):404-411, 1998.
- {22} P. Reymond, H. Weber, M. Damond, and E. E. Farmer. Differential gene expression in response to mechanical wounding and insect feeding in Arabidopsis. Plant Cell, 12:707-719, 2000.
- {23} S. Rombauts, P. Déhais, M. Van Montagu, and P. Rouzé. PlantCARE, a plant cis-acting regulatory element database. Nucleic Acids Research, 27:295-296, 1999.
- {24} F. P. Roth, J. D. Hughes, P. W. Estep, and G. M. Church. Finding DNA regulatory motifs within unaligned noncoding sequences clustered by whole genome mRNA quantitation. Nature Biotechnology, 16:939-945, 1998.
- {25} J. Rouster, R. Leah, J. Mundy, and V. Cameron-Mills. Identification of a methyl jasmonate-responsive region in the promoter of a lipoxygenase 1 gene expressed in barley grain. Plant Journal, 11(3):513-523, 1997.
- {26} M. Schena. Genome analysis with gene expression microarrays. BioEssays, 18(5):427-431, 1996.
- {27} M. Schena, D. Shalon, R. W. Davis, and P. O. Brown. Quantitative monitoring of gene expression patterns with a complementary DNA microarray. Science, 270:467-470, 1995.
- {28} G. Sherlock. Analysis of large-scale gene expression data. Curr. Opin. Immunol., 12:201-205, 2000.
- {29} P. T. Spellman, G. Sherlock, M. Q. Zhang, V. R. Iyer, K. Anders, M. B. Eisen, P. O. Brown, D. Botstein, and B. Futcher. Comprehensive identification of cell cycle-regulated genes of the yeast S. cerevisiae by microarray hybridization. Molecular Biology of the Cell, 9:3273-3297, 1998.
- {30} Z. Szallasi. Genetic network analysis in light of massively parallel biological data acquisition. In Proceedings PSB'99, volume 4, pages 5-16, 1999.
- {31} S. Tavazoie, J. D. Hughes, M. J. Campbell, R. J. Cho, and G. M. Church. Systematic determination of genetic network architecture. Nature Genetics, 22(7):281-285, 1999.
- {32} G. Thijs, M. Lescot, K. Marchal, S. Rombauts, B. De Moor, P. Rouzé, and Y. Moreau. A higher order background model improves the detection by Gibbs sampling of potential promoter regulatory elements in DNA sequences. Technical Report 00-128, ESAT-SISTA/COSIC, KULeuven, 2000. Submitted to Genome Research.
- {33} J. van Helden, B. André, and J. Collado-Vides. Extracting regulatory sites from upstream region of yeast genes by computational analysis of oligonucleotide frequencies. Journal of Molecular Biology, 281:827-842, 1998.
- {34} J. van Helden, A. F. Rios, and J. Collado-Vides. Discovering regulatory elements in noncoding sequences by analysis of spaced dyads. Nucleic Acids Research, 28(8):1808-1818, 2000.
- {35} A. Vanet, L. Marsan, A. Labigne, and M. F. Sagot. Inferring regulatory elements from a whole genome, an analysis of Helicobacter pylori sigma(80) family of promoter signals. Journal of Molecular Biology, 297(2):335-353, 2000.
- {36} X. Wen, S. Fuhrman, G. S. Michaels, D. B. Carr, S. Smith, J. L. Barker, and R. Somogyi. Large-scale temporal gene expression mapping of central nervous system development. Proc. Natl. Acad. Sci. USA, 95:334-339, 1998.
- {37} C. T. Workman and G. D. Stormo. ANN-Spec: a method for discovering transcription binding sites with improved specificity. In Proceedings PSB'2000, volume 5, Honolulu, Hawaii, 2000.
- {38} M. Q. Zhang. Large-scale gene expression data analysis: A new challenge to computational biologists. Genome Research, 9:681-688, 1999.
- {39} J. Zhu and M. Q. Zhang. Cluster, function and promoter: analysis of yeast expression array. In Proceedings PSB'2000, volume 5, pages 467-486, 2000.
Index Terms
- A Gibbs sampling method to detect over-represented motifs in the upstream regions of co-expressed genes