Improved data-based fault detection strategy and application to distillation columns

https://doi.org/10.1016/j.psep.2017.01.017

Highlights

  • Developed a novel data-based fault detection (FD) method.

  • Combined wavelet-based multiscale PLS modeling with the GLR test for FD.

  • A distillation column is used to evaluate the performance of the developed method.

  • The detection results show the superior performance of the new combined strategy.

Abstract

Chemical and petrochemical processes require continuous monitoring to detect abnormal events and to sustain normal operations. Furthermore, process monitoring enhances productivity, efficiency, and safety in process industries. Here, we propose an innovative statistical approach that exploits the advantages of multiscale partial least squares (MSPLS) models and generalized likelihood ratio (GLR) tests for fault detection in processes. Specifically, we combine an MSPLS algorithm with wavelet analysis to create our modeling framework. Then, we use GLR hypothesis testing based on the uncorrelated residuals obtained from the MSPLS model to improve fault detection. We use simulated distillation column data to evaluate the MSPLS-based GLR chart. Results show that our MSPLS-based GLR method is more powerful than the PLS-based Q and GLR methods and the MSPLS-based Q method, especially in the early detection of small faults with abrupt or incipient behavior.

Introduction

Modern automated industrial processes rely on the precise control of process conditions. Detection of anomalies or deviations in such processes is essential. Furthermore, to improve the reliability, safety and efficiency of advanced process control methods, fault detection and fault diagnosis have become important in numerous technical processes. For example, chemical processes require monitoring approaches that can detect abnormalities while sustaining normal operations. Increasing attention to fault detection and safety has led to the development of several fault detection techniques that can be grouped into two main families: model-based approaches and data-based approaches (Venkatasubramanian et al., 2003, Yin and Zhu, 2015, Harrou et al., 2014, Yin et al., 2012, Yin et al., 2014, Gao et al., 2015). The merits of both model-based and data-based process-monitoring techniques have been demonstrated in practice over the past four decades. In model-based approaches, a residual signal is generated from a mathematical model of a system and then used as an indicator of a fault (Venkatasubramanian et al., 2003, Isermann, 2006). The most commonly used analytical model-based approaches for residual signal generation include the observer-based approach, the parity space approach and the parameter estimation-based approach (Venkatasubramanian et al., 2003, Isermann, 2006). Unfortunately, deriving accurate models of monitored systems, especially complex industrial processes such as chemical and environmental processes, can be difficult. Also, modeling complex systems can be very time consuming. In the absence of an explicit model, and if measurement signals are the only available resource for process monitoring, data-driven implicit models are a suitable alternative.
Unlike the model-based approaches, data-based techniques provide efficient tools for extracting useful features for the design of monitoring schemes based on empirical models derived from the available process data (Yin et al., 2012, Yin et al., 2014, Venkatasubramanian et al., 2003, Harrou et al., 2016b, Harrou et al., 2016a, Zhao et al., 2013). Such methods require minimal a priori knowledge of process physics, but they depend on the availability of quality input data (Qin, 2012). Multivariate Statistical Process Control (MSPC) is one such data-based technique. MSPC and its associated statistical techniques are increasingly used in the control of continuous and batch processes in process industries.

Multivariate statistical process monitoring can provide early warnings of abnormal changes in process operations. Principal component analysis (PCA) and partial least squares (PLS) are two basic methods of multivariate analysis and are reputed to be powerful tools for monitoring multivariate processes with highly correlated process data. They have been extensively applied in the field of chemometrics (Liang and Zhang, 2012, Abdi and Williams, 2010, Chiang et al., 2001). Some chemical processes, such as distillation, are usually modeled by two sets of variables (inputs and outputs). PLS regression is widely used to model multivariate input-output process data (Wold et al., 1984, Yin et al., 2015, Madakyaru et al., 2012). Unlike PCA, PLS finds an optimum pair of latent variables both from the predictor (input) and predicted (output) variables that have the largest covariance (Geladi and Kowalski, 1986, Harrou et al., 2015). Extracting useful data with PLS modeling and then using monitoring indices leads to the detection of faults in the monitored process (Harrou et al., 2013c). Several PLS variants have been proposed to overcome the shortcomings of the classical PLS, such as multiway PLS (Nomikos and MacGregor, 1995), multi-block PLS (MacGregor et al., 1994), dynamic PLS (Lee et al., 2004) as well as kernel PLS (Jia and Zhang, 2016). Very recently, an improvement to the PLS method was reported that reduces the number of required latent variables and thereby lowers the computational load compared with the conventional PLS method (Yin et al., 2016a).

However, the presence of measurement errors (noise) in the data and model uncertainties degrade the quality of fault detection techniques. In addition, most chemical process data generally include features and noise occurring over both time and frequency. Nevertheless, the majority of fault detection approaches, including PCA and PLS, are based on time-domain data (operating on a single time scale), and thus they do not take into consideration the multiscale characteristics of the data. As a consequence, model-based and model-free data denoising methods are used for data filtering. For example, extended Kalman filtering and particle filtering are utilized to denoise collected data for fault diagnosis (Yin et al., 2016b, Yin and Zhu, 2015). When a filter is not available, multiscale representation of data using wavelets, which is a powerful feature-extraction tool, has been found to efficiently separate deterministic and stochastic features (Bakshi, 1998a). Wavelet-based multiscale representation of data has been used extensively in the literature to improve the effectiveness and robustness of fault detection strategies (Bakshi, 1998a, Yoon and MacGregor, 2004, Ganesan et al., 2004, Li and Yao, 2005).

The detection of incipient anomalies is crucial to maintaining the normal operations of a system by providing early fault warnings. The problem is that incipient anomalies are often too weak to be detected by conventional monitoring methods. In particular, conventional MSPLS-based monitoring indices such as the T² and Q charts cannot detect small changes in process data (Harrou et al., 2016a). Combining the advantages of MSPLS with those of generalized likelihood ratio (GLR) hypothesis testing should improve fault detection. GLR hypothesis testing, which is very popular in the framework of model-based fault detection, has demonstrated good fault detection capacity (Harrou et al., 2014, Harrou et al., 2013c, Basseville and Nikiforov, 1993). Here, we draw on wavelet-based multiscale representation of data to improve a PLS-based hypothesis testing fault detection method. Specifically, to consider the multivariate and multiscale nature of process dynamics, we use an MSPLS algorithm combining PLS and wavelet analysis as the modeling framework. Then, we apply GLR hypothesis testing using residuals obtained from the MSPLS model to improve the fault detection abilities of the latent variable-based fault detection method. Results from simulated distillation column data show that the MSPLS-GLR approach can achieve better fault detection efficiency than a PLS-based GLR approach.

The remainder of this paper is organized as follows. Section 2 gives a brief overview of the PLS model. In Section 3, the multiscale PLS approach is briefly reviewed, and Section 4 introduces GLR hypothesis testing and its use in anomaly detection. Next, the concept of combining MSPLS modeling with the GLR test is presented in Section 5. Section 6 applies the proposed MSPLS-GLR procedure to a simulated distillation column process. Finally, Section 7 concludes this paper.

Section snippets

PLS modeling

PLS is a basic multivariate projection method used in multivariate statistical process monitoring (Höskuldsson, 1988). The purpose of PLS is to analyze relationships between input data, X, and output data, Y. Specifically, PLS finds an optimum pair of latent variables in both X and Y such that these transformed variables have the largest covariance (Yin et al., 2012, Harrou et al., 2015). PLS has been widely used in economics, sociology and chemometrics.
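
As a rough illustration of the covariance-maximization idea above, the sketch below computes the first PLS weight vector for the single-output case, where (after mean-centering) it has the closed form w = XᵀY / ‖XᵀY‖. The restriction to one output, the function names `first_pls_weight` and `score_covariance`, and the toy data are assumptions made for illustration; they are not taken from the paper.

```python
import math
import random


def center(values):
    """Subtract the mean from a list of numbers."""
    mean = sum(values) / len(values)
    return [v - mean for v in values]


def centered_columns(X):
    """Return the mean-centered columns of X, given as a list of rows."""
    return [center([row[j] for row in X]) for j in range(len(X[0]))]


def first_pls_weight(X, y):
    """Unit vector w that maximizes cov(Xw, y) for a single output y.

    Assumption: single-output PLS, where w is proportional to X^T y.
    """
    cols = centered_columns(X)
    yc = center(y)
    g = [sum(a * b for a, b in zip(col, yc)) for col in cols]  # X^T y
    norm = math.sqrt(sum(v * v for v in g))
    return [v / norm for v in g]


def score_covariance(X, y, w):
    """Sample covariance between the latent score t = Xw and y."""
    cols = centered_columns(X)
    yc = center(y)
    n = len(y)
    t = [sum(cols[j][i] * w[j] for j in range(len(w))) for i in range(n)]
    return sum(a * b for a, b in zip(t, yc)) / (n - 1)


# Hypothetical toy data: the output depends on the first two inputs only.
rng = random.Random(0)
X = [[rng.gauss(0, 1) for _ in range(3)] for _ in range(200)]
y = [2.0 * row[0] - 1.0 * row[1] + rng.gauss(0, 0.1) for row in X]

w = first_pls_weight(X, y)
print("first PLS weight:", [round(v, 3) for v in w])
print("cov(Xw, y):", round(score_covariance(X, y, w), 3))
```

By the Cauchy-Schwarz inequality, no other unit-norm direction yields a larger covariance with y than this w, which is the property the later latent variables build on after deflation.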

Consider a pair of datasets, XN×M and YN

Multiscale PLS modeling

As noted earlier, data collected from engineering processes are usually noisy and correlated in time, which makes fault detection more difficult because the presence of noise degrades detection quality and most methods are developed for independent observations. Wavelet-based multiscale modeling of data is an efficient tool for extracting features and is well suited to denoising and decorrelating time series data (Ganesan et al., 2004). Here, we merge multiscale modeling with PLS to improve the
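
The denoising role of the wavelet decomposition can be sketched as follows. This is a minimal example, assuming an orthonormal Haar wavelet, soft thresholding, and the universal threshold σ√(2 ln N) with σ estimated from the finest-scale detail coefficients; the paper's own wavelet family, decomposition depth, and thresholding rule are not shown in this snippet, and the function names and test signal are hypothetical.

```python
import math
import random
import statistics


def haar_forward(signal, levels):
    """Orthonormal Haar DWT; returns (approximation, details), finest details first.

    Assumption: len(signal) is divisible by 2**levels.
    """
    approx, details = list(signal), []
    for _ in range(levels):
        pairs = list(zip(approx[0::2], approx[1::2]))
        details.append([(a - b) / math.sqrt(2) for a, b in pairs])
        approx = [(a + b) / math.sqrt(2) for a, b in pairs]
    return approx, details


def haar_inverse(approx, details):
    """Invert haar_forward, rebuilding from the coarsest scale to the finest."""
    out = list(approx)
    for det in reversed(details):
        out = [v for a, d in zip(out, det)
               for v in ((a + d) / math.sqrt(2), (a - d) / math.sqrt(2))]
    return out


def soft(c, thr):
    """Soft-threshold one coefficient: shrink toward zero by thr."""
    return math.copysign(max(abs(c) - thr, 0.0), c)


def denoise(signal, levels=4):
    """Shrink detail coefficients with the universal threshold, then reconstruct."""
    approx, details = haar_forward(signal, levels)
    # Robust noise estimate from the finest-scale details (median absolute value).
    sigma = statistics.median(abs(d) for d in details[0]) / 0.6745
    thr = sigma * math.sqrt(2.0 * math.log(len(signal)))
    shrunk = [[soft(d, thr) for d in det] for det in details]
    return haar_inverse(approx, shrunk)


def mse(a, b):
    return sum((x - y) ** 2 for x, y in zip(a, b)) / len(a)


# Hypothetical test signal: a step change buried in Gaussian noise.
rng = random.Random(0)
clean = [1.0] * 128 + [3.0] * 128
noisy = [c + rng.gauss(0, 0.5) for c in clean]
mse_noisy = mse(noisy, clean)
mse_denoised = mse(denoise(noisy), clean)
print("MSE before denoising:", round(mse_noisy, 4))
print("MSE after denoising: ", round(mse_denoised, 4))
```

In a multiscale PLS setting, a model would then be fitted on the filtered (or scale-wise) data rather than on the raw measurements; that step is omitted here.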

Generalized likelihood ratio test-based fault detection

Detecting a particular fault that occurs in a monitored process requires checking whether the current measurements are statistically different from the a priori known faultless measurements (i.e., measurements without anomalies). Fault detection, which is a binary decision-making process, consists of distinguishing fault events from non-fault events based on some relevant data features. In this work, we present a fault detection algorithm based on the uncorrelated residuals obtained from the MSPLS model.
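
One common form of the GLR test can be sketched as below. It assumes a window of scalar residuals that are independent and Gaussian with known variance σ², and tests a zero mean (no fault) against an unknown nonzero mean (fault). Under these assumptions the statistic reduces to n·r̄²/σ² and follows a chi-square distribution with one degree of freedom when no fault is present. The exact statistic, window length, and threshold used in the paper are not shown in this snippet, so the names and settings here are illustrative only.

```python
import statistics


def glr_statistic(window, sigma2):
    """GLR statistic for H0: mean = 0 vs H1: mean != 0.

    Assumption: i.i.d. Gaussian residuals with known variance sigma2.
    Equals 2 * log(likelihood ratio) with the mean replaced by its MLE.
    """
    n = len(window)
    mean = sum(window) / n
    return n * mean * mean / sigma2


def glr_threshold(alpha=0.05):
    """Upper-alpha quantile of chi-square(1), via the squared normal quantile."""
    z = statistics.NormalDist().inv_cdf(1.0 - alpha / 2.0)
    return z * z


def glr_chart(residuals, sigma2, window=4, alpha=0.05):
    """Slide a window over the residuals; return (index, statistic, alarm) tuples."""
    limit = glr_threshold(alpha)
    chart = []
    for end in range(window, len(residuals) + 1):
        stat = glr_statistic(residuals[end - window:end], sigma2)
        chart.append((end - 1, stat, stat > limit))
    return chart


# Hypothetical residuals: fault-free at first, then a sustained mean shift.
residuals = [0.2, -0.3, 0.1, -0.1, 0.3, -0.2, 1.0, 1.1, 0.9, 1.2]
for index, stat, alarm in glr_chart(residuals, sigma2=0.25):
    print(index, round(stat, 2), "ALARM" if alarm else "ok")
```

Averaging over a window is what gives this statistic its sensitivity to small sustained shifts: a shift too small to exceed the limit in a single sample accumulates through the factor n.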

The MSPLS-based GLR fault-detection scheme

In this section, MSPLS is coupled with GLR hypothesis testing to design an innovative fault detection scheme with improved detection abilities. In general, we obtain the model first and then perform the fault detection procedure accordingly. MSPLS indicates the capabilities of the modeling and monitoring process at different frequency bands. MSPLS using wavelets is used for data denoising and for reducing autocorrelation in the data. After the reference MSPLS model is identified, it is used to

Monitoring a simulated distillation column

In this section, the fault detection ability of the proposed MSPLS-GLR technique is assessed using simulated data, and the results are compared with those obtained using the traditional PLS-GLR method. In all monitoring charts, the red-shaded area marks the region where the fault is injected into the test data, and the horizontal dashed line marks the 95% control limit.

Conclusion

Statistical process control is an important statistical tool for monitoring chemical processes. Data observed from chemical processes are usually noisy and correlated in time, which makes the fault detection more difficult as the presence of noise degrades fault detection quality and most methods are developed for independent observations. Multiscale representation of data using wavelets is a powerful feature-extraction tool that is well suited to denoising and decorrelating time series data.

Acknowledgements

The work reported in this paper was supported by the King Abdullah University of Science and Technology (KAUST) Office of Sponsored Research (OSR) under Award No: OSR-2015-CRG4-2582.

References (56)

  • S. Qin

    Survey on data-driven industrial process monitoring and diagnosis

    Annu. Rev. Control

    (2012)
  • V. Venkatasubramanian et al.

    A review of process fault detection and diagnosis: part III: process history based methods

    Comput. Chem. Eng.

    (2003)
  • S. Yin et al.

    A comparison study of basic data-driven fault diagnosis and process monitoring methods on the benchmark Tennessee Eastman process

    J. Process Control

    (2012)
  • Y. Zhao et al.

    Pattern recognition-based chillers fault detection method using support vector data description (SVDD)

    Appl. Energy

    (2013)
  • H. Abdi et al.

    Principal component analysis

    Wiley Interdiscip. Rev.: Comput. Stat.

    (2010)
  • H. Aradhye et al.

    Multiscale SPC using wavelets: theoretical analysis and properties

    AIChE J.

    (2003)
  • B. Bakshi

    Multiscale PCA with application to multivariate statistical process monitoring

    AIChE J.

    (1998)
  • B. Bakshi

    Multiscale analysis and modeling using wavelets

    J. Chemom.

    (1999)
  • M. Basseville et al.

    Detection of Abrupt Changes: Theory and Application. Vol. 104

    (1993)
  • L. Chiang et al.

    Fault Detection and Diagnosis in Industrial Systems

    (2001)
  • I. Daubechies

    Orthonormal bases of compactly supported wavelets

    Commun. Pure Appl. Math.

    (1988)
  • D. Donoho et al.

    Wavelets on the interval and fast wavelet transforms

    Appl. Comput. Harmon. Anal.

    (1993)
  • D. Donoho et al.

    Ideal spatial adaptation by wavelet shrinkage

    Biometrika

    (1994)
  • D.L. Donoho et al.

    Wavelet shrinkage: asymptopia?

    J. R. Stat. Soc. B

    (1995)
  • R. Ganesan et al.

    Wavelet-based multiscale statistical process monitoring: a literature review

    IIE Trans.

    (2004)
  • R. Gao et al.

    Wavelets: Theory and Applications for Manufacturing

    (2010)
  • Z. Gao et al.

    A survey of fault diagnosis and fault-tolerant techniques–part II: fault diagnosis with knowledge-based and hybrid/active approaches

    IEEE Trans. Ind. Electron.

    (2015)