A novel wavelength interval selection based on split regularized regression for spectroscopic data

  • Original Paper
  • Journal of Mathematical Chemistry

Abstract

Wavelength selection has become a critical step in the analysis of near-infrared (NIR) spectroscopic data, which are characterized by high collinearity and a large number of spectral variables. In this study, a novel wavelength interval selection method based on split regularized regression and partial least squares (SplitReg-PLS) is developed. SplitReg-PLS is a two-step approach that combines the advantages of the SplitReg and PLS methods. SplitReg splits the variables into groups and pools the regularized estimates of the regression coefficients together by group. PLS regression, one of the most popular methods for multivariate calibration, is then performed on the group variables selected by SplitReg. The SplitReg-PLS method automatically selects consecutive, strongly correlated, and interpretable spectral variables related to the response, which provides a flexible framework for variable selection. The performance of the proposed procedure is evaluated on three real NIR datasets. The results indicate that SplitReg-PLS is an effective wavelength interval selection strategy.
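The two-step structure described in the abstract (select variables with a regularized regression, then fit PLS on the selected variables only) can be illustrated with a minimal sketch. It is not the authors' implementation: SplitReg itself is not reproduced here, and a plain elastic net fitted by coordinate descent is used as a stand-in selector. The function names `elastic_net_cd` and `pls1_fit`, the synthetic "spectra", the penalty settings, and the choice of three PLS components are all assumptions made for illustration.

```python
import numpy as np

def elastic_net_cd(X, y, lam, alpha=0.5, n_iter=200):
    """Elastic net by cyclic coordinate descent (stand-in selector, NOT SplitReg).
    Minimizes (1/2n)||y - Xb||^2 + lam*(alpha*|b|_1 + (1-alpha)/2*||b||^2)."""
    n, p = X.shape
    beta = np.zeros(p)
    resid = y.copy()
    col_sq = (X ** 2).sum(axis=0) / n
    for _ in range(n_iter):
        for j in range(p):
            resid += X[:, j] * beta[j]           # remove variable j from the fit
            rho = X[:, j] @ resid / n
            shrunk = np.sign(rho) * max(abs(rho) - lam * alpha, 0.0)
            beta[j] = shrunk / (col_sq[j] + lam * (1.0 - alpha))
            resid -= X[:, j] * beta[j]           # add the updated variable back
    return beta

def pls1_fit(X, y, n_components):
    """PLS1 regression coefficients via NIPALS for centered X and y."""
    Xk, yk = X.copy(), y.copy()
    W, P, q = [], [], []
    for _ in range(n_components):
        w = Xk.T @ yk
        w /= np.linalg.norm(w)                   # weight vector
        t = Xk @ w                               # score vector
        tt = t @ t
        p_load = Xk.T @ t / tt                   # X loading
        c = yk @ t / tt                          # y loading
        Xk -= np.outer(t, p_load)                # deflate X
        yk -= c * t                              # deflate y
        W.append(w); P.append(p_load); q.append(c)
    W, P, q = np.array(W).T, np.array(P).T, np.array(q)
    return W @ np.linalg.solve(P.T @ W, q)

# Synthetic, smooth "spectra" (assumed data): 60 samples, 100 wavelengths,
# with the response driven by the interval of wavelengths 40..49.
rng = np.random.default_rng(0)
n, p = 60, 100
Z = rng.normal(size=(n, p + 4))
X = np.mean([Z[:, k:k + p] for k in range(5)], axis=0)  # moving average -> collinear neighbours
y = X[:, 40:50].sum(axis=1) + 0.05 * rng.normal(size=n)

Xs = (X - X.mean(axis=0)) / X.std(axis=0)        # standardize predictors
yc = y - y.mean()                                # center response

# Step 1: sparse regularized regression selects candidate wavelengths.
alpha = 0.5
lam_max = np.max(np.abs(Xs.T @ yc)) / (n * alpha)
beta = elastic_net_cd(Xs, yc, lam=0.1 * lam_max, alpha=alpha)
selected = np.flatnonzero(beta)

# Step 2: PLS regression restricted to the selected wavelengths.
coef = pls1_fit(Xs[:, selected], yc, n_components=min(3, selected.size))
y_hat = Xs[:, selected] @ coef
r2 = 1.0 - np.sum((yc - y_hat) ** 2) / np.sum(yc ** 2)

print("selected wavelength indices:", selected)
print("training R^2:", round(r2, 3))
```

In practice the penalty strength and the number of PLS components would be chosen by cross-validation rather than fixed as above, and the selector in step 1 would be replaced by an actual SplitReg fit.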

Acknowledgements

This study was financially supported by the Hunan Provincial Department of Education Foundation of China (Grant No. 20A086). The study was approved by the university’s review board. We are grateful to all employees of this institute for their encouragement and support of this research.

Author information

Corresponding author

Correspondence to Xin Huang.

Ethics declarations

Conflict of interest

The authors declare they have no conflict of interest.

About this article

Cite this article

Huang, X., Xia, L. A novel wavelength interval selection based on split regularized regression for spectroscopic data. J Math Chem 61, 877–892 (2023). https://doi.org/10.1007/s10910-022-01444-6

  • DOI: https://doi.org/10.1007/s10910-022-01444-6
