Comparing early and late data fusion methods for gene expression prediction

Re, Matteo

doi:10.1007/s00500-010-0599-6

Focus
Published: 21 March 2010

Volume 15, pages 1497–1504, (2011)

Soft Computing

Matteo Re¹

166 Accesses
5 Citations

Abstract

The most basic molecular mechanism that allows a living cell to adapt dynamically to variations in its intra- and extracellular environment is its ability to regulate the expression of many of its genes. At the biomolecular level, this ability is mainly due to interactions between transcription factors and regulatory motifs located in the core promoter regions. A crucial question investigated in recently published works is whether, and to what extent, the transcription patterns of large sets of genes can be predicted using only the information encoded in the promoter regions. Although encouraging results have been obtained in experiments on the prediction of gene expression patterns, the assumption that all the signals required for the regulation of gene expression are contained in the gene promoter regions is an oversimplification, as pointed out by recent findings demonstrating the existence of many regulatory levels involved in the fine modulation of gene transcription. In this contribution, we investigate the potential improvement in gene expression prediction performance achievable with early and late data integration methods, in order to provide a complete overview of the capabilities of data fusion approaches on a problem that can be counted among the most difficult in modern bioinformatics.
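
As a rough illustration of the distinction between early and late integration, the following Python sketch may help. It is not the method used in the article: the nearest-centroid classifier, the toy feature values, the second data source (`histone`), and all function names are assumptions chosen only to keep the example small. Early fusion concatenates the feature vectors of the sources and trains a single model; late fusion trains one model per source and then combines their outputs, here by averaging per-source class probabilities.

```python
from math import exp
from statistics import mean

def fit_centroids(X, y):
    """Return {label: centroid} for feature rows X and labels y."""
    centroids = {}
    for label in set(y):
        rows = [x for x, lab in zip(X, y) if lab == label]
        centroids[label] = [mean(col) for col in zip(*rows)]
    return centroids

def posterior(centroids, x):
    """Turn negative squared distances to each centroid into probabilities."""
    raw = {lab: exp(-sum((a - b) ** 2 for a, b in zip(x, c)))
           for lab, c in centroids.items()}
    total = sum(raw.values())
    return {lab: v / total for lab, v in raw.items()}

# Hypothetical toy data: two feature sources describing the same four genes.
promoter = [[1.0, 0.0], [0.9, 0.1], [0.1, 1.0], [0.0, 0.8]]
histone = [[0.2], [0.1], [0.9], [1.0]]
labels = ["up", "up", "down", "down"]

def early_fusion_predict(new_promoter, new_histone):
    """Early fusion: concatenate the sources, then train a single model."""
    X = [p + h for p, h in zip(promoter, histone)]
    probs = posterior(fit_centroids(X, labels), new_promoter + new_histone)
    return max(probs, key=probs.get)

def late_fusion_predict(new_promoter, new_histone):
    """Late fusion: one model per source, then average their probabilities."""
    p1 = posterior(fit_centroids(promoter, labels), new_promoter)
    p2 = posterior(fit_centroids(histone, labels), new_histone)
    combined = {lab: (p1[lab] + p2[lab]) / 2 for lab in p1}
    return max(combined, key=combined.get)
```

Calling `early_fusion_predict([0.95, 0.05], [0.15])` and `late_fusion_predict([0.95, 0.05], [0.15])` classifies a new gene under each strategy. On this toy data the two agree, but in general they can disagree, since late fusion lets each source cast its own normalized vote before the outputs are combined.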

Acknowledgments

The author gratefully acknowledges partial support by the PASCAL2 Network of Excellence under EC grant no. 216886. This publication only reflects the author’s views. The author also expressly thanks Giorgio Valentini for examining early versions of the manuscript.

Author information

Authors and Affiliations

Dipartimento di Scienze dell’Informazione, DSI, Università degli Studi di Milano, via Comelico 39, Milan, Italy
Matteo Re

Corresponding author

Correspondence to Matteo Re.

About this article

Cite this article

Re, M. Comparing early and late data fusion methods for gene expression prediction. Soft Comput 15, 1497–1504 (2011). https://doi.org/10.1007/s00500-010-0599-6

Published: 21 March 2010
Issue Date: August 2011
DOI: https://doi.org/10.1007/s00500-010-0599-6
