A weighted least-squares approach to clusterwise regression

Schlittgen, Rainer

doi:10.1007/s10182-011-0155-4


Original Paper
Published: 01 March 2011

Volume 95, pages 205–217, (2011)

AStA Advances in Statistical Analysis


566 Accesses
19 Citations

Abstract

Clusterwise regression aims to partition a data set into clusters, each characterized by its own coefficients in a linear regression model. In this paper, we propose a method for determining such a partition that borrows an idea from robust regression. We start with a random weighting to obtain an initial partition and then continue in the spirit of M-estimators. The residuals from all cluster regressions are used to assign the observations to the different groups. As the target function, we use the coefficient of determination \(R^{2}_{w}\) for the overall model, suitably defined for weighted regression.
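The abstract does not state the formula for \(R^{2}_{w}\). As a point of orientation only, one common way to define a coefficient of determination for weighted regression is shown below; the weights \(w_i\), fitted values \(\hat{y}_i\), and weighted mean \(\bar{y}_w\) are notation assumed here, and the definition used in the paper may differ.

\[
R^{2}_{w} \;=\; 1 - \frac{\sum_{i=1}^{n} w_i \,(y_i - \hat{y}_i)^{2}}{\sum_{i=1}^{n} w_i \,(y_i - \bar{y}_w)^{2}},
\qquad
\bar{y}_w \;=\; \frac{\sum_{i=1}^{n} w_i \, y_i}{\sum_{i=1}^{n} w_i}.
\]

With all weights equal, this reduces to the usual unweighted \(R^{2}\).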

Target functions for the clusterwise regression problem may have a large number of local optima, which derivative-based optimization methods cannot handle. The approach commonly employed to overcome this problem is to start several times from random partitions and then to improve each resulting partition. Because our procedure is very fast, it can be used with many random starts. Finally, the solution with the highest determination coefficient \(R^{2}_{w}\) for the overall model is chosen. The performance of the method is investigated with Monte Carlo simulations and compared to the finite-mixture approach to clusterwise regression. A sequence of bootstrap tests is proposed to determine the number of clusters.
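The general scheme described in the abstract (random start weights, one weighted regression per cluster, reassignment by residuals, many random starts, best determination coefficient wins) can be illustrated with a minimal Python sketch. This is not the author's implementation: the function names, the uniform start weights, and the simulated data are assumptions made here, and hard 0/1 weights are swapped in after the first step in place of the M-estimator-style weighting of the paper, so the target reduces to the ordinary \(R^{2}\) of the pooled piecewise model.

```python
import numpy as np


def weighted_ls(X, y, w):
    """Coefficients minimizing sum_i w_i * (y_i - x_i @ b)**2."""
    sw = np.sqrt(w)
    beta, *_ = np.linalg.lstsq(X * sw[:, None], y * sw, rcond=None)
    return beta


def overall_r2(X, y, labels, betas):
    """R^2 of the pooled model: each point is predicted by its own cluster's fit."""
    fitted = np.einsum("ij,ij->i", X, betas[labels])
    sse = np.sum((y - fitted) ** 2)
    sst = np.sum((y - y.mean()) ** 2)
    return 1.0 - sse / sst


def fit_from_random_start(X, y, k, rng, max_iter=100):
    """One run: random weights, then alternate fitting and reassignment."""
    n = len(y)
    # Assumed start scheme: independent uniform weights per point and cluster.
    W = rng.uniform(0.05, 1.0, size=(n, k))
    labels = np.full(n, -1)
    for _ in range(max_iter):
        betas = np.array([weighted_ls(X, y, W[:, j]) for j in range(k)])
        sq_resid = (y[:, None] - X @ betas.T) ** 2   # shape (n, k)
        new_labels = sq_resid.argmin(axis=1)         # smallest residual wins
        if np.array_equal(new_labels, labels):
            break                                    # partition is stable
        labels = new_labels
        # Simplification (not the paper's weighting): hard 0/1 weights.
        W = np.zeros((n, k))
        W[np.arange(n), labels] = 1.0
    betas = np.array([weighted_ls(X, y, W[:, j]) for j in range(k)])
    return overall_r2(X, y, labels, betas), labels, betas


def clusterwise_regression(X, y, k, n_starts=20, seed=0):
    """Repeat from several random starts and keep the run with the highest R^2."""
    rng = np.random.default_rng(seed)
    runs = [fit_from_random_start(X, y, k, rng) for _ in range(n_starts)]
    return max(runs, key=lambda run: run[0])


# Simulated example (hypothetical data): two regression lines with small noise.
rng = np.random.default_rng(1)
x = rng.uniform(0.0, 3.0, size=200)
group = rng.integers(0, 2, size=200)
y = np.where(group == 0, 1.0 + 2.0 * x, 20.0 - 3.0 * x)
y = y + rng.normal(0.0, 0.1, size=200)
X = np.column_stack([np.ones_like(x), x])

best_r2, best_labels, best_betas = clusterwise_regression(X, y, k=2)
print(round(best_r2, 3))
print(best_betas)
```

Because each run is cheap, increasing `n_starts` is the simplest way to reduce the risk of ending in a poor local optimum, which is the motivation the abstract gives for using many random starts.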



Author information

Authors and Affiliations

Institute of Statistics and Econometrics, University of Hamburg, Von-Melle-Park 5, 20146, Hamburg, Germany
Rainer Schlittgen


Corresponding author

Correspondence to Rainer Schlittgen.

Electronic Supplementary Material

The electronic supplementary material consists of three R files (13 kB, 3.58 kB, and 6.35 kB).


About this article

Cite this article

Schlittgen, R. A weighted least-squares approach to clusterwise regression. AStA Adv Stat Anal 95, 205–217 (2011). https://doi.org/10.1007/s10182-011-0155-4


Received: 24 March 2010
Accepted: 10 February 2011
Published: 01 March 2011
Issue Date: June 2011
DOI: https://doi.org/10.1007/s10182-011-0155-4
