Computational Statistics & Data Analysis
Volume 52, Issue 1, 15 September 2007, Pages 249-257
 
doi:10.1016/j.csda.2007.01.012
Copyright © 2007 Elsevier B.V. All rights reserved.

Robust variable selection using least angle regression and elemental set sampling

Lauren McCann (a, b) and Roy E. Welsch (a, b, corresponding author)

(a) GlaxoSmithKline, 1250 S. Collegeville Road, Collegeville, PA 19426, USA
(b) Sloan School of Management, Massachusetts Institute of Technology, 77 Massachusetts Avenue, Room E53-383, Cambridge, MA 02139, USA

Available online 1 February 2007.


Abstract

The problem of selecting variables or features in a regression model in the presence of both additive (vertical) and leverage outliers is addressed. Since variable selection and the detection of anomalous data are not separable problems, the focus is on methods that select variables and outliers simultaneously. For selection, the fast forward selection algorithm, least angle regression (LARS), is used, but it is not robust. To achieve robustness to additive outliers, a dummy variable identity matrix is appended to the design matrix allowing both real variables and additive outliers to be in the selection set. For leverage outliers, these selection methods are used on samples of elemental sets in a manner similar to that used in high breakdown robust estimation. These results are compared to several other selection methods of varying computational complexity and robustness. The extension of these methods to situations where the number of variables exceeds the number of observations is discussed.
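The dummy-variable idea in the abstract can be illustrated with a small sketch. It is not the authors' implementation: to stay short it swaps in incremental forward stagewise regression, a simple relative of LARS, and the toy data, the column scaling, the step size `eps`, and the helper names `augment_with_dummies` and `stagewise_entry_order` are all assumptions made for illustration. An n-by-n identity block is appended to the design matrix so that each observation has its own dummy column, and a dummy that enters the selection set early flags that observation as a possible additive outlier.

```python
import math

def augment_with_dummies(X):
    """Return the columns of [X_scaled | I]: X centered and scaled to unit
    norm, followed by one dummy (identity) column per observation."""
    n, p = len(X), len(X[0])
    cols = []
    for j in range(p):
        col = [row[j] for row in X]
        mean = sum(col) / n
        centered = [v - mean for v in col]
        norm = math.sqrt(sum(v * v for v in centered))
        cols.append([v / norm for v in centered])
    for i in range(n):
        cols.append([1.0 if k == i else 0.0 for k in range(n)])
    return cols

def stagewise_entry_order(cols, y, eps=0.01, n_steps=200):
    """Incremental forward stagewise (a stand-in for LARS): at each step,
    nudge the coefficient of the column most correlated with the residual.
    Returns the order in which columns first enter, and the coefficients."""
    mean_y = sum(y) / len(y)
    resid = [v - mean_y for v in y]  # centering y acts as a crude intercept
    coef = [0.0] * len(cols)
    order = []
    for _ in range(n_steps):
        corr = [sum(c * r for c, r in zip(col, resid)) for col in cols]
        j = max(range(len(cols)), key=lambda k: abs(corr[k]))
        if j not in order:
            order.append(j)
        step = eps if corr[j] > 0 else -eps
        coef[j] += step
        resid = [r - step * c for r, c in zip(resid, cols[j])]
    return order, coef

# Toy data (hypothetical): y depends on the first variable only, and
# observation 3 carries an additive outlier of +15.
X = [[1, 1], [2, -1], [3, 1], [4, -1], [5, 1], [6, -1], [7, 1], [8, -1]]
y = [2.0 * row[0] for row in X]
y[3] += 15.0

p = len(X[0])
cols = augment_with_dummies(X)
order, coef = stagewise_entry_order(cols, y)

# Columns 0..p-1 are real variables; column p + i is the dummy for row i.
for j in order:
    label = f"variable {j}" if j < p else f"outlier dummy for observation {j - p}"
    print(label, round(coef[j], 3))
```

On this toy data the dummy for the contaminated observation and the true predictor are the first two columns to enter, so the variable and the outlier are selected together. The sketch covers only the additive-outlier step; the elemental set sampling that the abstract uses for leverage outliers is not shown.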

Keywords: Robust regression; Variable selection; LARS; Outliers; Elemental sets

Article Outline

1. Introduction
2. Vertical or additive outliers
3. Variable selection
4. High breakdown methods
5. Selecting the variables when sampling
6. Simulation results
7. Mortality data example
7.1. Selected models
7.2. Which model is best?
8. Comparison with KVZ
9. Algorithmic speed
10. Conclusion
References

 