Information Sciences
Volume 178, Issue 1, 2 January 2008, Pages 88-105
 
 
doi:10.1016/j.ins.2007.08.013
Copyright © 2007 Elsevier Inc. All rights reserved.

Direct integration of microarrays for selecting informative genes and phenotype classification

Youngmi Yoon (a, c), Jongchan Lee (a), Sanghyun Park (a, corresponding author), Sangjay Bien (a), Hyun Cheol Chung (b) and Sun Young Rha (b)

(a) Department of Computer Science, Yonsei University, 134 Sinchon-dong, Seodaemun-gu, Seoul 120-749, South Korea
(b) Department of Internal Medicine, Cancer Metastasis Research Center, Yonsei University College of Medicine, South Korea
(c) Department of Information Technology, Gachon University of Medicine and Science, South Korea

Received 7 February 2007; 
revised 25 July 2007; 
accepted 1 August 2007. 
Available online 23 August 2007.


Abstract

The ability to provide thousands of gene expression values simultaneously makes microarray data very useful for phenotype classification. A major constraint in phenotype classification is that the number of genes greatly exceeds the number of samples. We overcame this constraint in two ways: we increased the number of samples by integrating independently generated microarrays that had been designed with the same biological objectives, and we reduced the number of genes involved in the classification by selecting a small set of informative genes. This allowed us to make the most of the abundant microarray data being stockpiled by thousands of different research groups while improving classification accuracy. Our goal is to implement a feature (gene) selection method that is applicable to integrated microarrays, and to build a highly accurate classifier that permits straightforward biological interpretation. In this paper, we propose a two-stage approach. First, we performed a direct integration of individual microarrays by transforming each expression value into a rank value within its sample, and identified informative genes by calculating the number of swaps needed to reach a perfectly split sequence. Second, we built a classifier, a parameter-free ensemble method, that uses only the pre-selected informative genes. Using this classifier, derived from large, integrated microarray sample datasets, we achieved high accuracy, sensitivity, and specificity in the classification of an independent test dataset.

Keywords: Data mining; Microarray data analysis; Microarray data integration; Microarray data classification; Informative gene selection
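
The first stage described in the abstract (rank transformation within a sample, then counting swaps to a perfectly split sequence) can be illustrated with a minimal Python sketch. The abstract does not give the exact scoring procedure, so everything here is an assumption made for illustration: the function names `rank_within_sample` and `swaps_to_perfect_split` are hypothetical, class labels are taken to be 0 and 1, "swaps" is read as the minimum number of adjacent swaps, and tied ranks are left in input order.

```python
from typing import List, Sequence


def rank_within_sample(expression: Sequence[float]) -> List[int]:
    """Replace each gene's expression value by its rank (1 = lowest)
    within a single sample, so that samples measured on different
    scales become comparable."""
    order = sorted(range(len(expression)), key=lambda g: expression[g])
    ranks = [0] * len(expression)
    for rank, gene in enumerate(order, start=1):
        ranks[gene] = rank
    return ranks


def swaps_to_perfect_split(gene_ranks: Sequence[int],
                           labels: Sequence[int]) -> int:
    """For one gene, order the samples by that gene's rank and count the
    minimum number of adjacent swaps needed to bring the label sequence
    to a perfect split (all 0s then all 1s, or all 1s then all 0s).
    A lower count suggests a more informative gene.

    Assumption: samples with tied ranks keep their input order."""
    by_rank = sorted(range(len(labels)), key=lambda i: gene_ranks[i])
    ordered = [labels[i] for i in by_rank]

    # Swaps needed to reach 0...0 1...1: each 0 must cross every 1
    # that currently precedes it.
    ones_seen = 0
    swaps_zero_first = 0
    for label in ordered:
        if label == 1:
            ones_seen += 1
        else:
            swaps_zero_first += ones_seen

    # Every (0, 1) pair is out of order for exactly one of the two targets.
    n_ones = sum(ordered)
    n_zeros = len(ordered) - n_ones
    swaps_one_first = n_zeros * n_ones - swaps_zero_first
    return min(swaps_zero_first, swaps_one_first)
```

Under these assumptions, genes would be ranked by their swap count and those with the fewest swaps kept as candidate informative genes. How the paper itself handles ties or normalizes the count is not stated in the abstract.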

Article Outline

1. Introduction
2. Related work
2.1. Microarray data integration
2.2. Informative gene selection
2.3. Classification
3. System overview
4. System implementation
4.1. Microarray data integration and informative gene selection
4.2. k-GeneTriple classification method
5. Experimental results
5.1. Determining the optimal number of rules (k) by LOOCV
5.2. Accuracy of the informative gene selection method
5.3. Accuracy of the classification method
5.3.1. Accuracy of the classification method using Affymetrix data
5.3.2. Accuracy of the classification method using cDNA microarray
5.4. Run-time comparison of k-GeneTriple and TSP
5.5. Effectiveness of the rank-based microarray data integration in classification
6. Conclusion
Acknowledgements
References














