
Comparison of PLS algorithms when number of objects is much larger than number of variables


Abstract

The NIPALS and SIMPLS algorithms are the most commonly used algorithms for partial least squares analysis. When the number of objects, N, is much larger than the number of explanatory variables, K, and/or the number of response variables, M, the NIPALS algorithm can be time-consuming. Although SIMPLS is less time-consuming than NIPALS and may be preferred over it, kernel algorithms have been developed specifically for the case where N is much larger than the number of variables. In this study, the NIPALS, SIMPLS and several kernel algorithms have been used to build partial least squares regression models. Their performance has been compared in terms of the total CPU time spent on the calculation of latent variables, leave-one-out cross-validation and the bootstrap. According to the numerical results, one of the kernel algorithms suggested by Dayal and MacGregor (J Chemom 11:73–85, 1997) is the fastest.
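To make concrete why NIPALS becomes expensive for large N, the sketch below is a minimal NumPy implementation of the classical NIPALS loop, written for the notation used in the paper (X is N × K, Y is N × M, A latent variables). It is not the paper's code; the function name, starting column and convergence tolerance are illustrative choices. Every pass of the inner loop multiplies the full N-row data matrices, and both X and Y are deflated after each component, so the work grows with N.

```python
# Minimal NIPALS-PLS sketch (illustrative, not the paper's code).
import numpy as np

def nipals_pls(X, Y, A, tol=1e-10, max_iter=500):
    """Extract A latent variables from mean-centered X (N x K) and Y (N x M)."""
    X, Y = X.copy(), Y.copy()
    T, U, W, P, C = [], [], [], [], []
    for _ in range(A):
        u = Y[:, [0]]                              # N x 1 starting Y-score
        for _ in range(max_iter):
            w = X.T @ u                            # K x 1 X-weights
            w /= np.linalg.norm(w)
            t = X @ w                              # N x 1 X-scores
            c = Y.T @ t / (t.T @ t)                # M x 1 Y-weights
            u_new = Y @ c / (c.T @ c)              # N x 1 Y-scores
            if np.linalg.norm(u_new - u) < tol * np.linalg.norm(u_new):
                u = u_new
                break
            u = u_new
        p = X.T @ t / (t.T @ t)                    # K x 1 X-loadings
        X = X - t @ p.T                            # deflate X: O(NK) work per component
        Y = Y - t @ c.T                            # deflate Y: O(NM) work per component
        for store, v in zip((T, U, W, P, C), (t, u, w, p, c)):
            store.append(v)
    T, U, W, P, C = (np.hstack(m) for m in (T, U, W, P, C))
    B_pls = W @ np.linalg.inv(P.T @ W) @ C.T       # K x M regression coefficients
    return B_pls, T, U, W, P, C
```

The kernel algorithms compared in the paper avoid this N-dependence inside the loop by working with cross-product matrices of size K × K and K × M instead; a sketch of that idea is given after the Abbreviations section below.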


Abbreviations

X : N × K matrix of explanatory variables
Y : N × M matrix of response variables
F : N × M matrix of residuals
B_PLS : K × M matrix of PLS regression coefficients
T : N × A matrix of PLS latent variables for X
U : N × A matrix of PLS latent variables for Y
W : K × A matrix of weights of the deflated X matrix on the latent variables T
R : K × A matrix of weights of the original X matrix on the latent variables T
C : M × A matrix of weights of Y on the latent variables U
P : K × A matrix of loadings for X
t_a : the a-th column vector of T
u_a : the a-th column vector of U
w_a : the a-th column vector of W
r_a : the a-th column vector of R
c_a : the a-th column vector of C
p_a : the a-th column vector of P

Uppercase bold variables represent matrices and lowercase bold variables represent column vectors throughout the paper. The transpose of a matrix is denoted by " ′ ". N, K, M and A are the number of objects, the number of explanatory variables, the number of response variables and the number of latent variables, respectively. The notations used in the paper are given above. It is assumed that the columns of X and Y are mean-centered and scaled to have mean zero and standard deviation one prior to PLS model estimation.
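Using this notation, the following is a minimal NumPy sketch of a kernel-style PLS fit in the spirit of Dayal and MacGregor (1997): the cross-product matrices X′X (K × K) and X′Y (K × M) are formed once, only X′Y is deflated, and every per-component step then involves only K- and M-dimensional quantities, so the loop's cost does not depend on N. This is an illustrative reconstruction under the stated assumptions, not the paper's own code; the function name and interface are assumptions.

```python
# Kernel-style PLS sketch working only with cross-product matrices
# (illustrative reconstruction, not the paper's code).
import numpy as np

def kernel_pls(X, Y, A):
    """Fit a PLS model with A latent variables; X is N x K, Y is N x M,
    both assumed mean-centered (and scaled) beforehand."""
    K, M = X.shape[1], Y.shape[1]
    XtX = X.T @ X                         # K x K, formed once
    XtY = X.T @ Y                         # K x M, deflated inside the loop
    W = np.zeros((K, A)); R = np.zeros((K, A))
    P = np.zeros((K, A)); C = np.zeros((M, A))
    for a in range(A):
        # X-weights w_a from the (deflated) cross-product matrix
        if M == 1:
            w = XtY[:, 0].copy()
        else:
            # dominant eigenvector of (X'Y)'(X'Y), mapped back to K-space
            evals, evecs = np.linalg.eigh(XtY.T @ XtY)
            w = XtY @ evecs[:, -1]
        w /= np.linalg.norm(w)
        # r_a: weights with respect to the original (undeflated) X
        r = w.copy()
        for j in range(a):
            r -= (P[:, j] @ w) * R[:, j]
        tt = r @ XtX @ r                  # t_a' t_a
        p = (XtX @ r) / tt                # X-loadings p_a
        c = (XtY.T @ r) / tt              # Y-weights c_a
        XtY = XtY - tt * np.outer(p, c)   # deflate X'Y only
        W[:, a], R[:, a], P[:, a], C[:, a] = w, r, p, c
    B_pls = R @ C.T                       # K x M regression coefficients
    return B_pls, W, R, P, C
```

A hypothetical call, assuming X and Y have already been mean-centered and scaled: B_pls, W, R, P, C = kernel_pls(X, Y, A=5). Forming X′X and X′Y is a one-time pass over the N rows; everything after that is independent of N.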

References

  • Boos DD (2003) Introduction to the bootstrap world. Stat Sci 18: 168–174

  • Dayal BS, MacGregor JF (1997) Improved PLS algorithms. J Chemom 11: 73–85

  • De Jong S (1993) SIMPLS: an alternative approach to partial least squares regression. Chemom Intell Lab Syst 18: 251–263

  • De Jong S, Ter Braak CJF (1994) Short communication: comments on the PLS kernel algorithm. J Chemom 8: 169–174

  • Efron B, Tibshirani R (1986) Bootstrap methods for standard errors, confidence intervals, and other measures of statistical accuracy. Stat Sci 1: 54–77

  • Lindgren F, Rännar S (1998) Alternative partial least squares (PLS) algorithms. Perspectives Drug Discov Des 12/13/14: 105–113

  • Lindgren F, Geladi P, Wold S (1993) The kernel algorithm for PLS. J Chemom 7: 45–59

  • Picard RR, Cook RD (1984) Cross-validation of regression models. J Am Stat Assoc 79: 575–583

  • Rännar S, Lindgren F, Geladi P, Wold S (1994) A PLS kernel algorithm for data sets with many variables and fewer objects. Part 1: theory and algorithm. J Chemom 8: 111–125

  • Shao J (1993) Linear model selection by cross-validation. J Am Stat Assoc 88: 486–494

  • Wold S, Sjöström M, Eriksson L (2001) PLS-regression: a basic tool of chemometrics. Chemom Intell Lab Syst 58: 109–130

Author information

Correspondence to Aylin Alin.

Cite this article

Alin, A. Comparison of PLS algorithms when number of objects is much larger than number of variables. Stat Papers 50, 711–720 (2009). https://doi.org/10.1007/s00362-009-0251-7
