Abstract
One of the common ways to cope with the multicollinearity problem in multiple regression analysis is to use dimension reduction techniques. Among these techniques, the present study focuses on Partial Least Squares Regression (PLSR) and Principal Component Regression (PCR). The study aims to determine the cases in which the two techniques give similar results and the cases in which, and the extent to which, they differ in terms of dimension reduction. For this purpose, the performance of the techniques is examined on two real datasets. In addition, a Monte Carlo simulation is conducted to evaluate the performance of these techniques under different conditions, based on the Root Mean Square Error of Cross Validation (RMSECV) criterion.
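For readers unfamiliar with the criterion, RMSECV is commonly written as below. This is the standard textbook form, given as a sketch; the abstract does not state the exact definition or cross-validation scheme used in the paper. The symbols are assumptions introduced here: n is the number of observations, a is the number of retained components, y_i is the observed response, and \hat{y}_{(i)}(a) is the prediction for observation i from a model with a components fitted without the cross-validation segment containing i.

```latex
\[
  \mathrm{RMSECV}(a) \;=\; \sqrt{\frac{1}{n}\sum_{i=1}^{n}\bigl(y_i - \hat{y}_{(i)}(a)\bigr)^{2}}
\]
```

Under this form, the number of components for PLSR or PCR is typically chosen as the value of a at which RMSECV(a) is smallest or stops decreasing appreciably, so a technique that reaches its minimum with fewer components achieves the stronger dimension reduction.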
Acknowledgements
The authors are grateful to the reviewers for their valuable comments and suggestions, which improved the quality of this paper.
Funding
This work was supported by the Scientific Research Projects of Eskisehir Osmangazi University [grant number 201519A112].
About this article
Cite this article
Guven, G., Samkar, H. Examination of Dimension Reduction Performances of PLSR and PCR Techniques in Data with Multicollinearity. Iran J Sci Technol Trans Sci 43, 969–978 (2019). https://doi.org/10.1007/s40995-018-0565-1
DOI: https://doi.org/10.1007/s40995-018-0565-1