doi:10.1016/j.chroma.2007.10.100
Copyright © 2007 Elsevier B.V. All rights reserved.
Robust partial least squares model for prediction of green tea antioxidant capacity from chromatograms
M. Daszykowski (a), Y. Vander Heyden (b) and B. Walczak (a)
(a) Department of Chemometrics, Institute of Chemistry, Silesian University, 9 Szkolna Street, 40-006 Katowice, Poland
(b) Department of Analytical Chemistry and Pharmaceutical Technology, Vrije Universiteit Brussel, Laarbeeklaan 103, B-1090 Brussels, Belgium
Received 24 September 2007;
revised 29 October 2007;
accepted 30 October 2007.
Available online 6 November 2007.
Abstract
In this paper, a robust version of the partial least squares model (partial robust M-regression, PRM) was built to predict the total antioxidant capacity of green tea extracts. To construct a calibration model, chromatograms obtained by a fast high-performance liquid chromatographic method on a monolithic silica column were related to the total antioxidant capacity of green tea extracts as determined by the Trolox antioxidant capacity method. Since natural samples are the subject of the study, some outlying samples are present in the data, as shown in an earlier work. Therefore, to construct reliable calibration models, these samples were detected and removed prior to modeling. With the applied robust partial least squares approach, in which a weighting scheme is embedded to down-weight the negative influence of outliers upon the model, it is possible to construct a robust calibration model without prior identification of outlying objects. It was shown that a robust model, allowing satisfactory prediction for test samples, can be used in controlling the antioxidant capacity of green tea extracts based on their chromatograms. The constructed robust partial least squares model was shown to have virtually the same fit and predictive power as the classical partial least squares model built after the outlying samples were removed from the data.
Keywords: Robust PLS; Partial robust M-regression; Fingerprints; Chemometrics; Outliers
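The weighting scheme mentioned in the abstract can be illustrated with a minimal Python sketch. It assumes the "Fair" weight function with tuning constant c = 4, residuals scaled by the median absolute deviation, and score distances scaled by their median, which is how PRM is commonly described in the literature. None of these details are stated on this page, and all function names are hypothetical.

```python
from statistics import median


def fair_weight(z, c=4.0):
    """Fair weight function (assumed): 1 at z = 0, decaying smoothly as |z| grows."""
    return 1.0 / (1.0 + abs(z) / c) ** 2


def residual_weights(residuals, c=4.0):
    """Down-weight samples with large y-residuals, scaled by a robust spread (MAD)."""
    med = median(residuals)
    mad = 1.4826 * median([abs(r - med) for r in residuals])
    if mad == 0.0:
        # Degenerate case: no spread to scale by, so leave all samples at full weight.
        return [1.0 for _ in residuals]
    return [fair_weight(r / mad, c) for r in residuals]


def leverage_weights(distances, c=4.0):
    """Down-weight samples whose scores lie far from a robust center.

    `distances` holds each sample's distance to that center in score space.
    """
    scale = median(distances)
    if scale == 0.0:
        return [1.0 for _ in distances]
    return [fair_weight(d / scale, c) for d in distances]


def global_weights(residuals, distances, c=4.0):
    """Global weight per sample: product of its residual and leverage weights."""
    w_res = residual_weights(residuals, c)
    w_lev = leverage_weights(distances, c)
    return [wr * wl for wr, wl in zip(w_res, w_lev)]


# Toy data (invented for illustration): the last sample has a very large residual.
toy_residuals = [0.1, -0.2, 0.15, -0.1, 5.0]
toy_distances = [1.0, 1.2, 0.9, 1.1, 1.0]
toy_weights = global_weights(toy_residuals, toy_distances)
```

In a full PRM fit, weights of this kind would be recomputed after each weighted PLS step until they stabilize. Samples with a global weight close to zero contribute little to the fit, which is how such a model can tolerate outliers without removing them beforehand.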
Fig. 1. The classical partial least squares model with one factor for total antioxidant capacity of green tea extracts, presented as y predicted vs. y observed for calibration set (○) and test samples (*).
Fig. 2. Weights of the PRM model with three factors: (a) global weights and (b) leverage weights vs. residual weights.
Fig. 3. Distance–distance plot presenting leverage and residual distances for calibration set objects computed for the PRM model with three factors.
Fig. 4. Two superimposed signals; a median chromatogram of calibration set samples with six major peaks (gray solid line) and a chromatogram of sample 42 (black line), with an extra peak denoted as (*).
Fig. 5. Calibration models with three factors for prediction of total antioxidant capacity of green tea extracts based on their chromatograms, presented as y predicted vs. y observed for calibration set (○) and test samples (*): (a) PRM model for complete data (enlarged region without two outlying samples 41 and 43), and (b) classical PLS model for clean data (without outlying samples 41–43).
Table 1. (a) Classical PLS model with one factor constructed for complete green tea data (* is the RMSE value computed without samples 41 and 43); (b) PRM model constructed for complete green tea data (assumed fraction of data contamination 10% and L1-median data centering); and (c) classical PLS model with three factors constructed for green tea data without outliers.
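The L1-median centering named in the Table 1 caption can be sketched as follows. This is a minimal illustration using Weiszfeld iterations, which is one common way to approximate the L1-median. The page does not say which algorithm the authors used, and the function name, tolerance, and iteration limit are assumptions.

```python
import math


def l1_median(points, tol=1e-9, max_iter=500):
    """Approximate the L1-median (spatial median) of a list of equal-length points."""
    n = len(points)
    dim = len(points[0])
    # Start from the coordinate-wise mean.
    center = [sum(p[j] for p in points) / n for j in range(dim)]
    for _ in range(max_iter):
        numerator = [0.0] * dim
        denominator = 0.0
        for p in points:
            d = math.dist(p, center)
            if d < 1e-12:
                # Simplification: skip a point that coincides with the current
                # center to avoid division by zero.
                continue
            for j in range(dim):
                numerator[j] += p[j] / d
            denominator += 1.0 / d
        if denominator == 0.0:
            break
        new_center = [v / denominator for v in numerator]
        shift = math.dist(new_center, center)
        center = new_center
        if shift < tol:
            break
    return center


# Toy data (invented for illustration): four clustered points and one far outlier.
toy_points = [[0.0, 0.0], [1.0, 0.0], [0.0, 1.0], [1.0, 1.0], [100.0, 100.0]]
toy_center = l1_median(toy_points)
toy_mean = [sum(p[j] for p in toy_points) / len(toy_points) for j in range(2)]
```

On this toy data the mean is pulled far toward the outlier, while the L1-median stays inside the cluster of four points. Centering data on such a robust location estimate, instead of the mean, keeps a few extreme samples from shifting the origin of the model.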
