Comparison of two multiplicative signal correction strategies for calibration transfer without standards

https://doi.org/10.1016/j.chemolab.2007.11.009

Abstract

Two multiplicative signal correction (MSC) algorithms are compared for the standardization of data from two near-infrared (NIR) spectrometers. Absorbance spectra were measured from 1000–2200 nm for a set of 45 jet fuel samples. Data from one instrument were standardized to match data from a second instrument using windowed MSC (W-MSC) and moving window MSC (MW-MSC). For W-MSC, user-defined windows were selected; for MW-MSC, the window size was optimized using a two-step procedure: 1) assigning a cutoff window to avoid over-processing and 2) selecting a specific window size based on sample leverage. For reproducibility studies performed over time on a single instrument, data extending through the last day of the study (63 days outside the calibration) required no preprocessing except a peak alignment correction on day 58. For analysis between the two instruments, successful results were obtained using a sub-region of the data from 1000–1700 nm processed by MW-MSC with a 441-point window. A method of selecting an appropriate window size is proposed based on statistical significance testing.

Introduction

Near-infrared (NIR) spectroscopy is becoming a widely used analytical technique for a variety of applications [1]. These include quality control monitoring of pharmaceutical, petroleum, agricultural, and other products, environmental analysis, medical diagnostics, and academic research. The rise in popularity of NIR is due largely to advances in chemometric data processing strategies, which have allowed the weak, semi-selective spectral signatures to be analyzed effectively [2]. The advantages of NIR include the potential to provide a non-destructive, rapid analysis with little or no sample preparation. The main disadvantage is the difficulty in extracting the relevant analytical information from the raw data. Typically, the analytical signal is a small component of the overall signal, which may include features due to matrix constituents or instrumental drift. Multivariate calibration techniques such as principal component regression (PCR) or partial least-squares (PLS) are usually required for quantitative analysis [3].

Formulating a quantitative calibration with NIR data generally involves preparation and measurement of anywhere from 50 to several hundred samples. Since a considerable amount of time and resources is devoted to construction of the calibration model, frequent recalibration is not desirable. Many factors can cause a calibration model to become obsolete or otherwise inappropriate for external data. External data, in this discussion, refer to any data not included in the calibration model, such as data collected at a different time or using a different instrument. A calibration model may be unsuitable for external data due to day-to-day or long-term instrumental drift, instrument-to-instrument variations, alterations in data collection protocol, or unexpected sample composition changes. Day-to-day drift may be caused by temperature or humidity variations, sample cell cleanliness, detector response, or other factors in the environment that fluctuate from day to day. Long-term drift may be caused by deterioration in optical components, changes in detector response, and/or routine maintenance operations such as replacing or swapping out parts. Slight alterations in sample matrix composition may also cause changes in the data. The sources of variation discussed above often result in profound spectral artifacts, making it difficult to construct calibration models that are robust over time and across instruments. Features due to matrix signatures or instrumental profiles are often much higher in magnitude than the target analytical signal. In turn, the PLS algorithm may require the use of several latent variables (LVs) to model the data effectively.

Analysis of external data which contains drift features absent from the calibration data generally results in high prediction errors. For this reason, preprocessing techniques are commonly applied to NIR data for the removal of drift or other noise components [4]. These include baseline correction, first or second derivative calculations, digital filtering, signal averaging, smoothing functions, signal correction algorithms, and normalization procedures. One or a combination of the above techniques is often sufficient to accommodate day-to-day instrumental drift. Multiplicative signal correction [5], [6], [7] (MSC), also termed multiplicative scatter correction, was originally developed to compensate for the effects of light scattering in reflectance spectroscopy. It has since become a widely used technique for removing general drift features such as day-to-day intensity variations. Modifications of the algorithm have also emerged as preprocessing techniques [8], [9], [10].
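For orientation, the conventional full-spectrum form of MSC can be written as below. This is the standard textbook formulation rather than anything specific to this study, and the symbols are introduced here only for illustration: x_i is the i-th measured spectrum, r is the reference spectrum (commonly the mean calibration spectrum), a_i and b_i are the offset and slope estimated by least squares, and e_i is the residual.

```latex
\mathbf{x}_i = a_i \mathbf{1} + b_i \mathbf{r} + \mathbf{e}_i ,
\qquad
\mathbf{x}_i^{\mathrm{corr}} = \frac{\mathbf{x}_i - \hat{a}_i \mathbf{1}}{\hat{b}_i}
```

Estimating the offset and slope over a restricted wavelength range instead of the whole spectrum gives the windowed variants.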

For longer-term drift or instrument-to-instrument variations, traditional preprocessing techniques may fail to standardize the data adequately. Two remedies are possible when this occurs. The first is calibration updating [11], [12], [13], either by performing a full re-calibration or by incorporating a sufficient amount of new data into the model. The second approach is calibration transfer [14], [15]. These techniques generally involve mathematical transformation of the spectra, or of the model coefficients, so that external data are appropriately matched to the calibration data. For NIR data exhibiting high instrument-to-instrument variations, piecewise direct standardization [16], [17], [18] (PDS) is often considered the gold standard. For this technique, data pertaining to the new setting or instrument must be collected from several (perhaps five to seven or more) strategically chosen samples in order to compute the transformation coefficients. Aside from the disadvantage of supplementary data collection, if samples degrade over time and must be regenerated, the sample preparation step adds further uncertainty to the method. A calibration transfer technique that does not require external data collection was proposed by Blank et al. [19]. This technique employs finite impulse response (FIR) filtering based on a reference spectrum, and may also be described as a moving window MSC (MW-MSC) approach. They compared FIR filtering to PDS, which requires a subset of samples to be measured on the secondary instrument, whereas MW-MSC requires no re-collection of data. We have had little success with PDS, both in previous work and with fuel data.
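A minimal sketch of the moving window idea follows, assuming that each point is corrected with an offset and slope fitted by least squares against the reference spectrum inside a window centered on that point. The function names, the edge handling (windows truncated at the ends of the spectrum), and the fallback for flat segments are assumptions made for illustration; they are not taken from Blank et al. or from this study.

```python
from statistics import fmean


def fit_line(ref, spec):
    """Least-squares fit spec ~ a + b * ref; returns (a, b)."""
    ref_mean, spec_mean = fmean(ref), fmean(spec)
    sxx = sum((r - ref_mean) ** 2 for r in ref)
    sxy = sum((r - ref_mean) * (s - spec_mean) for r, s in zip(ref, spec))
    if sxx == 0.0 or sxy == 0.0:
        # No usable slope in this segment: fall back to an offset-only fit.
        return spec_mean - ref_mean, 1.0
    b = sxy / sxx
    return spec_mean - b * ref_mean, b


def moving_window_msc(spec, ref, window):
    """Correct each point of `spec` using a local fit against `ref`.

    `window` is an odd number of points. Windows are truncated at the
    spectrum edges, which is an assumption of this sketch.
    """
    if window < 3 or window % 2 == 0:
        raise ValueError("window must be an odd integer >= 3")
    half = window // 2
    corrected = []
    for i, value in enumerate(spec):
        lo, hi = max(0, i - half), min(len(spec), i + half + 1)
        a, b = fit_line(ref[lo:hi], spec[lo:hi])
        corrected.append((value - a) / b)
    return corrected


# Toy example: the same sample seen with a different gain and offset.
ref = [0.10, 0.12, 0.20, 0.35, 0.30, 0.22, 0.15, 0.11]
spec = [0.05 + 1.2 * r for r in ref]
corrected = moving_window_msc(spec, ref, window=5)
```

The narrower the window, the more closely the corrected spectrum is forced toward the reference, which is why the choice of window size matters.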

The aim of this study was to explore the moving window MSC (MW-MSC) approach for the standardization of NIR data collected on two spectrometers. The data from the two instruments had markedly different intensity profiles in sub-regions of the spectra; therefore, a windowed MSC approach was taken. This allowed the preprocessing to focus on narrow portions of the spectra that were difficult to standardize between the two data sets. When MSC is applied in a piece-wise fashion, the main concern is selecting a window size that does not over-process the data. If a window is too narrow, the spectrum undergoing processing may be matched too closely to the reference spectrum such that sample-specific information is removed. In this study, we compare MW-MSC with a windowed MSC based on user-defined spectral windows (W-MSC). For MW-MSC, window size was selected using a two-step procedure. First, a minimum window size cutoff was determined by computing model significance values for a range of window sizes. Random permutation models were used as the basis for the significance test [20], [21]. Once the cutoff was determined, the final window size was based on Hotelling's T² statistic [12], [22], also referred to as leverage [13], of the external data applied to the model. Choosing wavelengths that are more robust with respect to drift is one way to improve calibration transfer [23]. Successful results required wavelength reduction to eliminate the region of the spectrum that showed high variability between the two instruments. In this study, a calibration set consisting of 45 jet fuel samples was used to model fuel density.
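The two ingredients of the window selection procedure can be sketched roughly as follows. This is a simplified illustration and not the authors' implementation: a univariate line fit (`rss_of_line_fit`) stands in for the PLS model, T² is computed from scores that are assumed to be mean-centered and mutually orthogonal, and all function names and the default of 199 permutations are hypothetical.

```python
import random
from statistics import fmean, variance


def rss_of_line_fit(x, y):
    """Residual sum of squares of a least-squares line y ~ a + b * x.

    Stand-in for the error of a real calibration model (e.g. PLS).
    """
    x_mean, y_mean = fmean(x), fmean(y)
    sxx = sum((xi - x_mean) ** 2 for xi in x)
    sxy = sum((xi - x_mean) * (yi - y_mean) for xi, yi in zip(x, y))
    b = sxy / sxx
    a = y_mean - b * x_mean
    return sum((yi - (a + b * xi)) ** 2 for xi, yi in zip(x, y))


def permutation_p_value(x, y, error_fn, n_perm=199, seed=0):
    """Share of shuffled-response models that fit at least as well as the real one."""
    rng = random.Random(seed)
    true_error = error_fn(x, y)
    shuffled = list(y)
    hits = 0
    for _ in range(n_perm):
        rng.shuffle(shuffled)  # break the link between x and y
        if error_fn(x, shuffled) <= true_error:
            hits += 1
    return (hits + 1) / (n_perm + 1)


def hotelling_t2(cal_scores, new_scores):
    """T² of one sample: squared scores scaled by the calibration score variances."""
    n_lv = len(new_scores)
    variances = [variance([row[k] for row in cal_scores]) for k in range(n_lv)]
    return sum(t * t / v for t, v in zip(new_scores, variances))


# Toy data: a strong linear relationship with a small alternating disturbance.
x = [i / 10 for i in range(20)]
y = [2 * xi + 1 + 0.01 * (-1) ** i for i, xi in enumerate(x)]
p_value = permutation_p_value(x, y, rss_of_line_fit)

cal_scores = [[1.0, 0.5], [-1.0, 0.5], [1.0, -0.5], [-1.0, -0.5]]
t2_external = hotelling_t2(cal_scores, [2.0, 1.0])
```

In the procedure described above, window sizes whose models fail the significance test would be excluded first, and the T² values of the external data would then be compared across the remaining window sizes.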

Jet fuel samples

Jet fuel samples consisted of 12 Jet A, 22 Jet A-1, 9 JP-8, and 2 JP-5 (45 total) and were obtained from various parts of the world. The different classes and origins did not appear to have a relationship with measured density values. In principal component analysis plots, samples were not clustered according to class and no outliers were detected. Reference density values were obtained using ASTM method D4052, which employs a digital density meter. The range of values was 0.7902–0.8235 ± 

Discussion

The goal in this study was to standardize NIR data between two instruments using W-MSC or MW-MSC, both of which are methods that preclude re-sampling of data. Results will be presented for calibration on instrument A and prediction on instrument B. The reference spectrum for all forms of MSC was the mean of D45_A.
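As a rough illustration of the windowed variant, the sketch below builds the reference as the point-wise mean of a set of calibration spectra and then applies MSC separately inside each user-defined window. The window boundaries are given as index pairs, points outside every window are left unchanged, and all names are hypothetical; none of these choices is confirmed by the text.

```python
from statistics import fmean


def mean_spectrum(spectra):
    """Point-wise mean of equally sampled spectra, used as the MSC reference."""
    return [fmean(column) for column in zip(*spectra)]


def fit_line(ref, spec):
    """Least-squares fit spec ~ a + b * ref; returns (a, b)."""
    ref_mean, spec_mean = fmean(ref), fmean(spec)
    sxx = sum((r - ref_mean) ** 2 for r in ref)
    sxy = sum((r - ref_mean) * (s - spec_mean) for r, s in zip(ref, spec))
    if sxx == 0.0 or sxy == 0.0:
        # No usable slope in this segment: fall back to an offset-only fit.
        return spec_mean - ref_mean, 1.0
    b = sxy / sxx
    return spec_mean - b * ref_mean, b


def windowed_msc(spec, ref, windows):
    """Apply MSC independently inside each (start, stop) index window.

    Points not covered by any window are passed through unchanged,
    which is an assumption of this sketch.
    """
    corrected = list(spec)
    for start, stop in windows:
        a, b = fit_line(ref[start:stop], spec[start:stop])
        corrected[start:stop] = [(s - a) / b for s in spec[start:stop]]
    return corrected


# Toy calibration set and an external spectrum whose gain and offset
# differ between the two halves of the wavelength axis.
calibration = [
    [0.10, 0.14, 0.22, 0.30, 0.28, 0.20, 0.16, 0.12],
    [0.12, 0.18, 0.26, 0.34, 0.30, 0.24, 0.16, 0.14],
]
ref = mean_spectrum(calibration)
external = [0.1 + 2.0 * r for r in ref[:4]] + [-0.05 + 0.5 * r for r in ref[4:]]
corrected = windowed_msc(external, ref, windows=[(0, 4), (4, 8)])
```

Because each window gets its own offset and slope, a region where the two instruments disagree strongly can be corrected without distorting the rest of the spectrum.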

Fig. 5 illustrates how the T² statistic may be used as a guide to preprocessing. In plot A, prediction results for various data sets are shown for D45_A calibration models. In this

Conclusions

In this study, two windowed MSC algorithms were investigated for calibration transfer and a method of selecting an appropriate window size is proposed based on statistical significance testing. The advantage of these techniques is that no new data are required for standardization. For MW-MSC, a two-step procedure for selecting window size was proposed. First, a minimum allowable window size was set using only the calibration data (instrument A). The purpose of this was to guard against

Acknowledgements

The authors wish to thank the National Research Council (NRC), the Office of Naval Research and the Naval Air Systems Command (NAVAIR 4.4.5) through the Navy Fuels & Lubes CFT for supporting this work. They also wish to thank the following individuals for their assistance in acquiring fuel samples and specification data: Stan Seto (GE Aircraft Engines c/o Belcan Corporation), and Jeffrey M. Johnson (Boeing Commercial Airplane Group).

References (25)

  • B. Buchanan et al., TrAC (1986)
  • O.E. De Noord, Chemom. Intell. Lab. Syst. (1994)
  • H. Martens et al., J. Pharm. Biomed. Anal. (1991)
  • I.S. Helland et al., Chemom. Intell. Lab. Syst. (1995)
  • D.M. Haaland et al., Vibr. Spectrosc. (2002)
  • C.L. Stork et al., Chemom. Intell. Lab. Syst. (1999)
  • C. Miller, Chemom. Intell. Lab. Syst. (1995)
  • O.E. De Noord, Chemom. Intell. Lab. Syst. (1994)
  • R.N. Feudale et al., Chemom. Intell. Lab. Syst. (2002)
  • E. Bouveresse et al., Chemom. Intell. Lab. Syst. (1996)
  • F. Despagne et al., Anal. Chim. Acta (2000)
  • H. van der Voet, Chemom. Intell. Lab. Syst. (1994)

Cited by (39)

    • Are standard sample measurements still needed to transfer multivariate calibration models between near-infrared spectrometers? The answer is not always

      2021, TrAC - Trends in Analytical Chemistry
      Citation Excerpt :

      One of the challenges of the MSC preprocessing-based methods is to choose the right window size. Although the authors proposed a strategy based on sample leverage to decide the optimal window size, the method failed to find widespread and successful applications in the scientific domain [74]. In some cases, the instrumental differences can manifest themselves mainly as shifts in the peaks of the resulting signals [75].

    • A parameter-free framework for calibration enhancement of near-infrared spectroscopy based on correlation constraint

      2021, Analytica Chimica Acta
      Citation Excerpt :

      In these methods using the provided standard, prediction bias can be reduced through introducing a transfer function that could correct the spectra, model coefficients or predictions acquired under different conditions. PDS has been regarded as a gold standard method, and thus intensively compared with any new method when proposed [14]. A linear regression model is generally built upon the spectra from different instruments in a certain moving window and then the spectra taken on new instrument can be transferred in the model application.

    • A correlation-analysis-based wavelength selection method for calibration transfer

      2020, Spectrochimica Acta - Part A: Molecular and Biomolecular Spectroscopy
      Citation Excerpt :

      Liu et al. [27] assumed that the spectra of samples with similar physical and chemical properties measured on different instruments were linearly correlated, and thus developed a standard-free algorithm (linear model correction, LMC). Since the multiplicative scatter correction (MSC) can reduce the scattering effect in the reflection spectrum, K.E. Kramer et al. [28] proposed two methods (windowed MSC and moving window MSC) based on this pretreatment method to realize the calibration transfer between two different instruments without standards. B. Malli et al. [29] evaluated the performance of a variety of calibration transfer methods under the condition of no real standards or only a few standards, and ranked the transfer performance of these methods, some of which referred to methods in machine learning (e.g., transfer component analysis (TCA)).


    For Publication in Chemometrics and Intelligent Laboratory Systems.
