Copyright © 2006 Elsevier B.V. All rights reserved.
Received 1 March 2005;
Abstract
The problem of fitting a straight line to a finite collection of points in the plane is an important problem in statistical estimation. Robust estimators are widely used because of their lack of sensitivity to outlying data points. The least median-of-squares (LMS) regression line estimator is among the best known robust estimators. Given a set of $n$ points in the plane, it is defined to be the line that minimizes the median squared residual or, more generally, the line that minimizes the residual of any given quantile $q$, where $0 < q \le 1$. This problem is equivalent to finding the strip defined by two parallel lines of minimum vertical separation that encloses at least half of the points.
The best known exact algorithm for this problem runs in $O(n^2)$ time. We consider two types of approximations, a residual approximation, which approximates the vertical height of the strip to within a given error bound $\varepsilon_r \ge 0$, and a quantile approximation, which approximates the fraction of points that lie within the strip to within a given error bound $\varepsilon_q \ge 0$. We present two randomized approximation algorithms for the LMS line estimator. The first is a conceptually simple quantile approximation algorithm, which given fixed $q$ and $\varepsilon_q > 0$ runs in $O(n \log n)$ time. The second is a practical algorithm, which can solve both types of approximation problems or be used as an exact algorithm. We prove that when used as a quantile approximation, this algorithm's expected running time is . We present empirical evidence that the latter algorithm is quite efficient for a wide variety of input distributions, even when used as an exact algorithm.
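To make the strip formulation in the abstract concrete, the following is a minimal brute-force sketch in Python. It is not either of the algorithms the article presents. It assumes the standard property that an optimal strip has a bounding line through two of the input points, so only pairwise slopes are tried, and it assumes the strip must enclose k = ceil(q·n) points, a convention the abstract does not specify. The function names are hypothetical.

```python
import math
from itertools import combinations


def best_intercept_for_slope(points, slope, q=0.5):
    """For a fixed slope, return (intercept, residual) of the narrowest
    vertical strip that covers k = ceil(q * n) points (assumed convention)."""
    n = len(points)
    k = max(1, math.ceil(q * n))
    # Vertical offset of every point from the line y = slope * x.
    offsets = sorted(y - slope * x for x, y in points)
    # The narrowest strip covers k consecutive offsets in sorted order.
    width, start = min(
        (offsets[i + k - 1] - offsets[i], i) for i in range(n - k + 1)
    )
    # The fitted line is the strip's center; the residual is half its height.
    return offsets[start] + width / 2.0, width / 2.0


def lms_brute_force(points, q=0.5):
    """Try every slope defined by a pair of points (assumption: one of them
    is optimal) and keep the line with the smallest quantile residual.
    Returns (slope, intercept, residual), or None if all x values coincide."""
    best = None
    for (x1, y1), (x2, y2) in combinations(points, 2):
        if x1 == x2:
            continue  # vertical direction, not a valid regression slope
        slope = (y2 - y1) / (x2 - x1)
        intercept, residual = best_intercept_for_slope(points, slope, q)
        if best is None or residual < best[2]:
            best = (slope, intercept, residual)
    return best
```

For example, given five points on y = 2x + 1 and two gross outliers, `lms_brute_force` recovers slope 2 and intercept 1 with zero residual, which illustrates the estimator's insensitivity to outlying points. The sketch takes roughly cubic time in n, so it only illustrates the objective being minimized and is far slower than the bounds quoted in the abstract.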
Keywords: Least median-of-squares regression; Robust estimation; Line fitting; Approximation algorithms; Randomized algorithms; Line arrangements
Article Outline
- 1. Introduction
- 1.1. Exact algorithms for LMS
- 1.2. Approximating LMS
- 1.3. Summary of results
- 2. Computational methods
- 3. A randomized quantile approximation algorithm
- 4. A practical approach: slope decomposition
- 5. Analysis of the slope-decomposition algorithm
- 6. Experimental results
- 6.1. Input size and quantile error factor
- 6.2. Quantile error and residual error factors
- 6.3. Inlier noise
- 6.4. Different distributions and actual error
- 7. Concluding remarks
- References












