Statistical performance monitoring of dynamic multivariate processes using state space modelling

https://doi.org/10.1016/S0098-1354(02)00012-1

Abstract

There is increasing interest in extending the concept of multivariate statistical process control to incorporate system dynamics into the process performance monitoring representation. A methodology has previously been proposed where the system dynamics and the correlation structure of the process data are captured within a state space representation. The system states and the state space model parameters are identified using the multivariate statistical projection techniques of canonical variate analysis (CVA) and partial least squares (PLS). A number of metrics based on Hotelling's T2 statistic are proposed for the monitoring of the state of the system. Control limits for these metrics are calculated in two ways: using the empirical reference distribution, and assuming that the metrics follow known, theoretically derived probability distributions. Two model forms are proposed. In the first, the number of inputs and outputs in the model is constant for all variables, whilst the second approach assumes that the number of inputs and outputs can vary. The modelling and process performance monitoring ability of the CVA and PLS state space representations, and the sensitivity of the various metrics in identifying faults, are investigated using a comprehensive simulation of a continuous stirred tank co-polymerisation reactor.

Introduction

A number of multivariate approaches have been proposed for the development of statistical process control schemes to monitor the performance of dynamic processes. In most of these approaches, well-known multivariate statistical tools, such as principal component analysis (PCA) and partial least squares (PLS), have been extended to include system dynamics. For example, Ku, Storer, and Georgakis (1995) proposed applying PCA to an extended data matrix that included past values of each variable. They termed this dynamic principal component analysis. For monitoring the performance of the process, a T2 statistic based on the k retained principal components and the squared prediction error (SPE) of the variable predictions were calculated. The control limits for these two statistics were calculated assuming that the T2 and SPE statistics followed F and χ2 distributions, respectively.
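The lagged-matrix idea can be illustrated with a minimal NumPy sketch. It is not the implementation of Ku et al.; the function names `lagged_matrix` and `dpca_statistics`, the autoscaling step, and the choice of SVD for the PCA are assumptions made for illustration only.

```python
import numpy as np

def lagged_matrix(X, n_lags):
    """Augment each row x(t) with its n_lags previous rows:
    [x(t), x(t-1), ..., x(t-n_lags)]. The first n_lags rows are dropped."""
    n = X.shape[0]
    blocks = [X[n_lags - j : n - j] for j in range(n_lags + 1)]
    return np.hstack(blocks)

def dpca_statistics(X, n_lags, k):
    """Return (T2, SPE) for every row of the lagged, autoscaled data matrix,
    using k retained principal components (hypothetical helper)."""
    Z = lagged_matrix(X, n_lags)
    Z = (Z - Z.mean(axis=0)) / Z.std(axis=0, ddof=1)    # autoscale columns
    # PCA through the SVD: rows of Vt are the loading vectors.
    _, s, Vt = np.linalg.svd(Z, full_matrices=False)
    P = Vt[:k].T                                        # retained loadings
    scores = Z @ P
    score_var = s[:k] ** 2 / (Z.shape[0] - 1)           # variance of each score
    t2 = np.sum(scores ** 2 / score_var, axis=1)        # Hotelling's T2
    residual = Z - scores @ P.T                         # part not explained by k PCs
    spe = np.sum(residual ** 2, axis=1)                 # squared prediction error
    return t2, spe

# Illustrative data only: three white-noise variables, two lags, four components.
rng = np.random.default_rng(0)
X = rng.standard_normal((200, 3))
t2, spe = dpca_statistics(X, n_lags=2, k=4)
```

In practice the loadings and score variances would be estimated once from in-control reference data and then applied to new observations; the sketch computes both on the same data for brevity.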

More recently, Negiz and Cinar (1997b) proposed using a state space model based on canonical variate analysis (CVA). CVA is similar in concept to PLS in that the method calculates linear combinations of the ‘past’ values of the system inputs and/or the outputs that are most highly correlated with linear combinations of the ‘future’ values of the outputs of the process. They demonstrated the superiority of the CVA state space representation over the dynamic PCA algorithm of Ku et al. for modelling dynamic systems through application of the techniques to a transfer function simulation comprising one input and two outputs.
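As a rough illustration of how CVA pairs linear combinations of the past with linear combinations of the future, the sketch below computes canonical variates from the SVD of the whitened cross-covariance matrix. This is a textbook formulation, not necessarily the algorithm used by Negiz and Cinar; the function name `cva`, the Cholesky-based whitening, and the simulated first-order process are all assumptions.

```python
import numpy as np

def cva(past, future, k):
    """Canonical variate analysis between 'past' and 'future' matrices
    (rows are time points). Returns the k states, the k canonical
    correlations, and the k x dim(past) matrix J with states = J p(t).
    Requires k <= min(past.shape[1], future.shape[1])."""
    n = past.shape[0]
    P = past - past.mean(axis=0)
    F = future - future.mean(axis=0)
    Spp = P.T @ P / (n - 1)
    Sff = F.T @ F / (n - 1)
    Spf = P.T @ F / (n - 1)
    # Whitening with Cholesky factors: S = L L^T, so L^{-1} S L^{-T} = I.
    Lp = np.linalg.cholesky(Spp)
    Lf = np.linalg.cholesky(Sff)
    M = np.linalg.solve(Lp, Spf) @ np.linalg.inv(Lf).T
    U, s, _ = np.linalg.svd(M)
    J = np.linalg.solve(Lp.T, U[:, :k]).T
    states = P @ J.T
    return states, s[:k], J

# Illustrative single-input, single-output process (assumed, not from the paper).
rng = np.random.default_rng(1)
n = 500
u = rng.standard_normal(n)
y = np.zeros(n)
for t in range(1, n):
    y[t] = 0.8 * y[t - 1] + 0.5 * u[t - 1] + 0.1 * rng.standard_normal()

idx = np.arange(2, n - 1)
past = np.column_stack([y[idx - 1], y[idx - 2], u[idx - 1], u[idx - 2]])
future = np.column_stack([y[idx], y[idx + 1]])
states, correlations, J = cva(past, future, k=1)
```

By construction the canonical variates of the past have an identity sample covariance matrix, which is the property exploited later for the T2 statistic.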

A number of comparisons of CVA with other techniques for developing state space models, including partial least squares (PLS), numerical algorithms for state space subspace system identification (N4SID) and balanced realisation (BR), have been carried out (Juricek, Larimore, & Seborg, 1998; Negiz & Cinar, 1997a; Simoglou, Martin, & Morris, 1999a). In all these studies CVA was found to outperform the other methods in terms of model stability and parsimony, i.e. fewer identified parameters were required in the final model. Simoglou, Martin, and Morris (1999b) described a modified T2 statistic based on the knowledge that the CVA states have an identity covariance matrix, in contrast to an arbitrary covariance matrix where the diagonal elements are calculated in descending order. A comparison between CVA and PLS T2-based monitoring schemes was carried out. It was shown that although both approaches were able to detect the faults introduced into the system, CVA provided more rapid detection.
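To make the role of the identity covariance matrix concrete, the standard definition of Hotelling's T2 for a zero-mean, k-dimensional state vector x(t) with covariance matrix Σ is written out below, together with its simplification when Σ is the identity. This is the generic textbook form, given as a reading aid; the exact modified statistic of Simoglou et al. (1999b) is not reproduced in this text and may differ in detail.

```latex
T^2(t) = x(t)^{\mathsf T}\,\Sigma^{-1}\,x(t)
\qquad\text{and, for } \Sigma = I_k,\qquad
T^2(t) = x(t)^{\mathsf T} x(t) = \sum_{i=1}^{k} x_i(t)^2
```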

Within this paper, a number of metrics are proposed based on Hotelling's T2 statistic for both the retained and the excluded CVA and PLS derived states. The T2 statistic calculated for the excluded latent variables monitors the noise in the process measurements, whilst the T2 metric based on the retained latent variables monitors the systematic information. Monitoring the non-systematic part of the measurements provides an additional fault detection tool that is capable of detecting special events entering the system. T2 statistics based on both the retained and excluded CVA and PLS latent variables corresponding to the future of the process are also proposed. Through the development of these statistics, the space spanned by the outputs of the system can also be monitored. In contrast to PLS, the CVA approach enables the development of such statistics, since the latent variables based on the outputs of the system are orthogonal and thus the calculation of the T2 statistics is computationally reliable.
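The split between retained and excluded latent variables can be sketched as two separate sums of scaled squares. The sketch assumes zero-mean, mutually uncorrelated latent variables, so that the covariance matrix is diagonal; the function name `split_t2` and its arguments are hypothetical.

```python
import numpy as np

def split_t2(latent, k, variances=None):
    """Return (T2_retained, T2_excluded) for each row of `latent`.
    The first k columns are treated as retained, the rest as excluded.
    Assumes zero-mean, mutually uncorrelated latent variables."""
    latent = np.asarray(latent, dtype=float)
    if variances is None:
        # Ideally these come from in-control reference data.
        variances = latent.var(axis=0, ddof=1)
    scaled = latent ** 2 / np.asarray(variances, dtype=float)
    return scaled[:, :k].sum(axis=1), scaled[:, k:].sum(axis=1)

# One observation of three latent variables with known variances 1, 4 and 9.
retained, excluded = split_t2([[1.0, 2.0, 3.0]], k=2, variances=[1.0, 4.0, 9.0])
```

Charting the two values separately keeps a change in the systematic part of the data distinguishable from a change in the noise.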

In addition, the multi-input–multi-output CVA and PLS state space models result in the calculation of two types of residuals, those associated with the errors from the predictions of the system states and those calculated from the system outputs. According to the state space model assumptions, the residuals are independent and identically distributed processes and are thus suitable for developing T2 statistics. The T2 statistics based on the state space model residuals provide additional dynamic MSPC tools that are capable of detecting a range of process faults such as disturbances and biased sensors.
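A T2 statistic on model residuals can be sketched as follows, assuming the residuals have already been obtained from a fitted state space model. The mean and covariance are estimated from in-control reference residuals; the function name `residual_t2` and the simulated biased-sensor example are assumptions for illustration.

```python
import numpy as np

def residual_t2(reference_residuals, new_residuals):
    """Hotelling's T2 of each row of `new_residuals`, using the mean and
    covariance estimated from `reference_residuals` (rows are time points)."""
    ref = np.asarray(reference_residuals, dtype=float)
    new = np.atleast_2d(np.asarray(new_residuals, dtype=float))
    mean = ref.mean(axis=0)
    cov = np.atleast_2d(np.cov(ref, rowvar=False))
    centred = new - mean
    # Solve cov @ z = centred^T instead of forming the inverse explicitly.
    z = np.linalg.solve(cov, centred.T)
    return np.einsum("ij,ji->i", centred, z)

# Illustrative residuals: white noise, then the same noise with a constant
# offset on the first output to mimic a biased sensor.
rng = np.random.default_rng(5)
reference = rng.standard_normal((200, 3))
in_control = residual_t2(reference, reference)
biased = residual_t2(reference, reference + np.array([5.0, 0.0, 0.0]))
```

The same function applies to either residual type, state residuals or output residuals, since only the dimension of the residual vector changes.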

A limitation of previous studies (e.g. Ku et al., 1995; Negiz & Cinar, 1997b; Simoglou et al., 1999b) was that the control limits for the T2 statistic and residuals were calculated based on the assumption that the data were serially independent. However, when process data are collected from a dynamic system, serial correlation will generally be present. Thus, the average run length (the expected number of observations until a control chart identifies an observation outside the control limits) of the resulting T2 control chart will be less than expected for in-control operation if the serial correlation is positive, resulting in too many false alarms (Vasilopoulos & Stamboulis, 1978). To address this issue, control limits based on the empirical reference distribution (Willemain & Runger, 1996) are investigated and compared to those calculated based on the assumption that the statistical measures follow a known statistical distribution. Finally, two model types are investigated: the first assumes that the number of inputs and outputs is fixed, and the second explores the case where the number of inputs and outputs differs for different variables. The various monitoring statistics, model forms and control limits are investigated and compared using a comprehensive simulation of a continuous stirred tank co-polymerisation reactor (Achilias & Kiparissides, 1994).
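One simple reading of an empirical control limit is a percentile of the monitoring statistic computed on a long in-control reference run. The sketch below takes that reading; it is not claimed to reproduce the procedure of Willemain and Runger (1996), and the function names are hypothetical. The helper `nominal_arl` states the in-control average run length of 1/alpha, which holds only when successive points are independent.

```python
import numpy as np

def empirical_limit(reference_stat, alpha=0.01):
    """Upper control limit taken as the (1 - alpha) empirical quantile of a
    monitoring statistic observed during in-control operation."""
    values = np.asarray(reference_stat, dtype=float)
    return float(np.quantile(values, 1.0 - alpha))

def false_alarm_rate(stat, limit):
    """Fraction of observations that exceed the control limit."""
    return float(np.mean(np.asarray(stat, dtype=float) > limit))

def nominal_arl(alpha):
    """In-control average run length for independent points: 1 / alpha."""
    return 1.0 / alpha

# Illustrative use: a limit from reference values 0..100 at alpha = 0.05.
limit = empirical_limit(range(101), alpha=0.05)
```

Because the limit is read directly from the reference data, no distributional form is assumed, at the cost of needing enough in-control observations to estimate the upper tail.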

Dynamic performance monitoring is a particularly challenging area of multivariate statistical process control, being very demanding in terms of process engineering understanding, modelling skills and statistical expertise with regard to the appropriateness of the relevant monitoring statistics. The paper aims to identify and define those metrics that would potentially enable assured monitoring of the performance of dynamic processes. The paper contributes to the on-going discussions of dynamic performance monitoring.


State space models

The concept of state space modelling is based on describing a system in terms of k first-order difference equations that are combined into a first-order vector–matrix difference equation. The stochastic state space model which forms the basis of this work is that proposed by Larimore (1983):

$$x(t+1) = F\,x(t) + G\,u(t) + w(t)$$

$$y(t) = H\,x(t) + A\,u(t) + B\,w(t) + e(t)$$

where x, u and y are the system states, inputs and outputs respectively; F (k×k) is the state matrix; G (k×n_u) is the input matrix; H (n_y×k) is the output
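The two model equations can be stepped forward in time with a short simulation. This is a minimal sketch: the function name `simulate_state_space` is hypothetical, the dimensions of A and B are inferred from conformability (A is n_y×n_u, B is n_y×k), and the numerical values in the usage example are arbitrary.

```python
import numpy as np

def simulate_state_space(F, G, H, A, B, u, w, e, x0=None):
    """Simulate x(t+1) = F x(t) + G u(t) + w(t),
                y(t)   = H x(t) + A u(t) + B w(t) + e(t).
    u is (n, n_u), w is (n, k), e is (n, n_y); returns (states, outputs)."""
    n = u.shape[0]
    k = F.shape[0]
    x = np.zeros(k) if x0 is None else np.asarray(x0, dtype=float)
    states = np.zeros((n, k))
    outputs = np.zeros((n, H.shape[0]))
    for t in range(n):
        states[t] = x
        outputs[t] = H @ x + A @ u[t] + B @ w[t] + e[t]
        x = F @ x + G @ u[t] + w[t]          # advance the state one step
    return states, outputs

# Noise-free, single-state example with a unit step input (arbitrary values).
F = np.array([[0.5]])
G = np.array([[1.0]])
H = np.array([[1.0]])
A = np.array([[0.0]])
B = np.array([[0.0]])
u = np.ones((3, 1))
w = np.zeros((3, 1))
e = np.zeros((3, 1))
states, outputs = simulate_state_space(F, G, H, A, B, u, w, e)
```

With these values the state starts at 0 and moves to 1 and then 1.5, heading towards the steady state of 2 implied by F = 0.5 and G = 1.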

Hotelling's T2 based on the states and the future values of the state space model

Statistical monitoring metrics are developed based on the state space model described by the state and output equations above. Although the use of exogenous variables is standard in process control, their inclusion in performance monitoring schemes has been less widely reported (Larimore, 1997). By monitoring them directly, changes in their behaviour can be detected more quickly than if information relating to any changes in their performance is based on other process variables. While it is not essential to incorporate such

Introduction and model form

The metrics described in the previous section and summarised in Table 1 were applied to a comprehensive simulation of a continuous stirred tank copolymerisation reactor (Achilias & Kiparissides, 1994). Monomers A (methyl methacrylate) and B (vinyl acetate) are added continuously to a perfectly mixed tank along with initiator (azobisisobutyronitrile), solvent (benzene), chain transfer agent T (acetaldehyde) and inhibitor (m-dinitrobenzene). Coolant flows through the jacket to remove the heat of

Conclusions

A multivariate approach for the monitoring of a continuous dynamic process has been proposed. It involved the development of state space models using CVA and PLS. The CVA and PLS state space models provided statistics based on the past and future latent variables, state and output residuals. Eight monitoring statistics were considered. The first four were based on the past and future latent variables and were calculated using Hotelling's T2 statistic. The remaining four were based on the state

Acknowledgements

The authors acknowledge the support of the EPSRC IMI Project System Capability Enhancement in High Performance Monitoring (SCIENTIA) GR/L28029.
