Computational Statistics & Data Analysis
Volume 52, Issue 1, 15 September 2007, Pages 578-595
 
doi:10.1016/j.csda.2007.02.003
Copyright © 2007 Elsevier B.V. All rights reserved.

A simple and efficient method for variable ranking according to their usefulness for learning

José R. Quevedo (a), Antonio Bahamonde (a) and Oscar Luaces (a, corresponding author)

(a) Artificial Intelligence Center, Campus de Viesques, University of Oviedo at Gijón, E-33204 Gijón, Spain

Available online 15 February 2007.


Abstract

The selection of a subset of input variables is often based on a previously constructed ranking that orders the variables according to a given relevance criterion. The objective is then to linearize the search by estimating the quality of only those subsets that contain the topmost ranked variables. We present an algorithm that ranks input variables according to their usefulness in the context of a learning task. The algorithm combines simple, classical techniques such as correlation and orthogonalization, which make it fast and allow it to deal explicitly with redundancy. Additionally, the proposed ranker is endowed with a simple polynomial expansion of the input variables to cope with nonlinear problems. A comparison with some state-of-the-art rankers showed that this combination of simple components is able to yield high-quality rankings of input variables. The experimental validation is carried out on a wide range of artificial data sets, and the quality of the rankings is assessed using a ROC-inspired setting to avoid biased estimations due to any particular learning algorithm.
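
To make the idea of combining correlation with orthogonalization more concrete, the following is a minimal sketch and not the authors' algorithm, whose details are not given in this abstract. It assumes a greedy scheme: at each step the variable most correlated with the current target residual is picked, and its direction is then projected out of the residual and of all remaining variables (a Gram-Schmidt-style step), so that redundant copies of an already ranked variable score close to zero. The function names `rank_variables` and `_abs_cos`, the `degree` parameter, and the choice of raw powers as the polynomial expansion are all hypothetical.

```python
import numpy as np

def _abs_cos(u, v, eps=1e-9):
    # Absolute cosine between two centered vectors (equals |Pearson correlation|).
    nu, nv = np.linalg.norm(u), np.linalg.norm(v)
    if nu < eps or nv < eps:
        return 0.0  # a vanished (fully explained) vector carries no information
    return abs(float(u @ v)) / (nu * nv)

def rank_variables(X, y, degree=1):
    """Illustrative greedy ranker: correlation scoring plus orthogonalization.

    Assumption: each variable is expanded into its powers 1..degree and is
    scored by its best-correlated power. This is a sketch, not the paper's method.
    """
    X = np.asarray(X, dtype=float)
    r = np.asarray(y, dtype=float) - np.mean(y)  # centered target residual
    feats = {}
    for j in range(X.shape[1]):
        P = np.column_stack([X[:, j] ** d for d in range(1, degree + 1)])
        feats[j] = P - P.mean(axis=0)  # center every expanded column
    ranking = []
    while feats:
        best_j, best_col, best_score = None, None, -1.0
        for j, P in feats.items():
            for k in range(P.shape[1]):
                s = _abs_cos(P[:, k], r)
                if s > best_score:
                    best_j, best_col, best_score = j, P[:, k].copy(), s
        ranking.append(best_j)
        del feats[best_j]
        norm = np.linalg.norm(best_col)
        if norm < 1e-9:
            continue  # nothing left to project out
        q = best_col / norm
        r = r - (r @ q) * q  # remove the explained part of the target
        for j in feats:
            # remove the chosen direction from every remaining column
            feats[j] = feats[j] - np.outer(q, q @ feats[j])
    return ranking

# Usage on synthetic data: column 1 duplicates column 0, column 3 is pure noise.
rng = np.random.default_rng(0)
x0, x2, x3 = rng.normal(size=(3, 200))
X = np.column_stack([x0, x0, x2, x3])
y = 3 * x0 + x2 + 0.1 * rng.normal(size=200)
print(rank_variables(X, y))
```

In this toy setting the duplicated column is pushed to the end of the ranking once its twin has been selected, which is the kind of explicit redundancy handling the abstract alludes to. Passing `degree=2` or higher lets a variable be ranked by a nonlinear (polynomial) relation with the target.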

Keywords: Variable ranking; Dimensionality reduction

Article Outline

1. Introduction
2. Some state-of-the-art rankers
2.1. The wrapper approach
2.2. The filter approach
3. A simple ranker
3.1. Nonlinear correlation-based ranking
3.2. Redundancy detection
4. Experimental results
4.1. Performance estimation
4.2. Artificial data sets construction
4.3. Parameter setting for the algorithms
4.4. Summary of results
4.4.1. Redundancy analysis
4.4.2. Number of irrelevant input variables
4.4.3. Degree of the polynomial relation with the class
4.4.4. Weston et al.'s nonlinear problems
5. Some limitations of [formula]
5.1. Apparently useless variables
5.2. Highly anti-correlated complementary variables
6. Conclusions
Acknowledgements
References












 