Information Sciences
Volume 146, Issues 1-4, October 2002, Pages 75-88
 
doi:10.1016/S0020-0255(02)00216-5
Copyright © 2002 Elsevier Science Inc. All rights reserved.

Interpreting microarray expression data using text annotating the genes

Michael Molla a (corresponding author), Peter Andreae a, 1, Jeremy Glasner b, Frederick Blattner b and Jude Shavlik a, c

a Department of Computer Sciences, University of Wisconsin–Madison, Madison, WI, USA
b Department of Genetics, University of Wisconsin–Madison, Madison, WI, USA
c Department of Biostatistics and Medical Informatics, University of Wisconsin–Madison, Madison, WI, USA

Received 1 June 2001; 
revised 2 November 2001; 
accepted 1 February 2002. 
Available online 4 October 2002.


Abstract

Microarray expression data is being generated by the gigabyte all over the world with undoubted exponential increases to come. Annotated genomic data is also rapidly pouring into public databases. Our goal is to develop automated ways of combining these two sources of information to produce insight into the operation of cells under various conditions. Our approach is to use machine-learning techniques to identify characteristics of genes that are up-regulated or down-regulated in a particular microarray experiment. We seek models that are (a) accurate, (b) easy to interpret, and (c) stable to small variations in the training data. This paper explores the effectiveness of two standard machine-learning algorithms for this task: Naïve Bayes (based on probability) and PFOIL (based on building rules). Although we do not anticipate using our learned models to predict expression levels of genes, we cast the task in a predictive framework, and evaluate the quality of the models in terms of their predictive power on genes held out from the training. The paper reports on experiments using actual E. coli microarray data, discussing the strengths and weaknesses of the two algorithms and demonstrating the trade-offs between accuracy, comprehensibility, and stability.
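
As a rough illustration of the predictive framing described in the abstract, the following Python sketch trains a Naïve Bayes classifier on words annotating genes and scores it on held-out genes. It is not the authors' implementation: the gene annotations, the "up"/"down" labels, the function names, and the choice of a Bernoulli word model with Laplace smoothing are all assumptions made for this example.

```python
import math
from collections import Counter

def train_naive_bayes(examples):
    """examples: list of (set of annotation words, label) pairs."""
    class_counts = Counter(label for _, label in examples)
    word_counts = {label: Counter() for label in class_counts}
    for words, label in examples:
        word_counts[label].update(words)
    vocab = {w for words, _ in examples for w in words}
    return class_counts, word_counts, vocab

def predict(model, words):
    """Return the label with the highest log-posterior for a gene's words."""
    class_counts, word_counts, vocab = model
    total = sum(class_counts.values())
    scores = {}
    for label, n in class_counts.items():
        score = math.log(n / total)  # class prior
        for w in vocab:
            # Laplace-smoothed estimate of P(word present | class)
            p = (word_counts[label][w] + 1) / (n + 2)
            score += math.log(p if w in words else 1 - p)
        scores[label] = score
    return max(scores, key=scores.get)

# Hypothetical toy data: annotation words per gene and an expression label.
train = [
    ({"heat", "shock", "chaperone"}, "up"),
    ({"heat", "stress", "protease"}, "up"),
    ({"shock", "chaperone", "folding"}, "up"),
    ({"ribosomal", "protein", "translation"}, "down"),
    ({"ribosomal", "subunit", "translation"}, "down"),
    ({"flagellar", "motility", "protein"}, "down"),
]
held_out = [
    ({"heat", "chaperone", "folding"}, "up"),
    ({"ribosomal", "translation", "subunit"}, "down"),
]

model = train_naive_bayes(train)
correct = sum(predict(model, words) == label for words, label in held_out)
accuracy = correct / len(held_out)
print(f"held-out accuracy: {accuracy:.2f}")
```

The per-word probabilities estimated for each class are what would make such a model inspectable: words far more likely under one class than the other suggest characteristics of the up-regulated or down-regulated genes.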

Article Outline

1. Introduction
2. Algorithm descriptions
2.1. Naïve Bayes
2.2. PFOIL
2.3. Improving stability of PFOIL
3. Experimental methodology
4. Experimental results and discussion
5. Discussion
6. Related work
7. Future directions and conclusions
Acknowledgements
References


