Statistical performance monitoring of dynamic multivariate processes using state space modelling
Introduction
A number of multivariate approaches have been proposed for the development of statistical process control schemes to monitor the performance of dynamic processes. In most of these approaches, well-known multivariate statistical tools, such as principal component analysis (PCA) and partial least squares (PLS), have been extended to include system dynamics. For example, Ku, Storer, and Georgakis (1995) proposed applying PCA to an extended data matrix that included past values of each variable, an approach they termed dynamic principal component analysis. To monitor the performance of the process, a T2 statistic based on the k retained principal components and the squared prediction error (SPE) of the variable predictions were calculated. The control limits for these two statistics were calculated assuming that the T2 and SPE statistics followed F and χ2 distributions, respectively.
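The extended data matrix used in dynamic PCA can be illustrated with a minimal sketch. The function name `lagged_matrix`, the `lags` parameter and the sample values are assumptions made for illustration; they are not taken from Ku et al. (1995). Only the augmentation step is shown, and PCA would then be applied to the augmented rows.

```python
def lagged_matrix(data, lags):
    """Augment each observation with its `lags` previous observations.

    data: list of rows, one row of variable values per time step.
    Returns rows [x_t, x_{t-1}, ..., x_{t-lags}], starting at t = lags.
    """
    if lags < 0:
        raise ValueError("lags must be non-negative")
    if len(data) <= lags:
        raise ValueError("need more observations than lags")
    augmented = []
    for t in range(lags, len(data)):
        row = []
        # Current values first, then progressively older values.
        for back in range(lags + 1):
            row.extend(data[t - back])
        augmented.append(row)
    return augmented


# Two variables observed over five time steps (made-up values).
X = [[1.0, 10.0], [2.0, 20.0], [3.0, 30.0], [4.0, 40.0], [5.0, 50.0]]
X_aug = lagged_matrix(X, lags=2)
```

With two lags, each augmented row holds three consecutive observations of both variables, so five observations of two variables yield three rows of six columns.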
More recently, Negiz and Cinar (1997b) proposed using a state space model based on canonical variate analysis (CVA). CVA is similar in concept to PLS in that the method calculates linear combinations of the ‘past’ values of the system inputs and/or the outputs that are most highly correlated with linear combinations of the ‘future’ values of the outputs of the process. They demonstrated the superiority of the CVA state space representation over the dynamic PCA algorithm of Ku et al. for modelling dynamic systems through application of the techniques to a transfer function simulation comprising one input and two outputs.
A number of comparisons of CVA with other techniques for developing state space models, including partial least squares (PLS), numerical algorithms for subspace state space system identification (N4SID) and balanced realisation (BR), have been carried out (Juricek, Larimore, & Seborg, 1998; Negiz & Cinar, 1997a; Simoglou, Martin, & Morris, 1999a). In all these studies CVA was found to outperform the other methods in terms of model stability and parsimony, i.e. fewer identified parameters were required in the final model. Simoglou, Martin, and Morris (1999b) described a modified T2 statistic based on the knowledge that the CVA states have an identity covariance matrix, in contrast to an arbitrary covariance matrix where the diagonal elements are calculated in descending order. A comparison between CVA and PLS T2-based monitoring schemes was carried out. It was shown that, although both approaches were able to detect the faults introduced into the system, CVA provided more rapid detection.
Within this paper, a number of metrics are proposed based on Hotelling's T2 statistic for both the retained and the excluded CVA and PLS derived states. The T2 statistic calculated for the excluded latent variables monitors the noise in the process measurements, whilst the T2 metric based on the retained latent variables monitors the systematic information. Monitoring the non-systematic part of the measurements provides an additional fault detection tool that is capable of detecting special events entering the system. T2 statistics based on both the retained and excluded CVA and PLS latent variables corresponding to the future of the process are also proposed. Through the development of these statistics, the space spanned by the outputs of the system can also be monitored. In contrast to PLS, the CVA approach enables the development of such statistics, since the latent variables based on the outputs of the system are orthogonal and thus the calculation of the T2 statistics is computationally reliable.
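The split into retained and excluded statistics can be sketched as follows, under the assumption that the states are zero-mean with an identity covariance matrix, in which case the quadratic form z'S⁻¹z reduces to a sum of squares. The function name `t2_split`, the state values and the choice of two retained states are hypothetical, and any sample-size scaling the authors may apply is not shown.

```python
def t2_split(states, n_retained):
    """Return (T2 of retained states, T2 of excluded states) for one sample.

    Assumes zero-mean states with identity covariance, so each T2 value
    is simply the sum of squared state values in its group.
    """
    if not 0 <= n_retained <= len(states):
        raise ValueError("n_retained out of range")
    t2_retained = sum(z * z for z in states[:n_retained])
    t2_excluded = sum(z * z for z in states[n_retained:])
    return t2_retained, t2_excluded


# One observation of five states (made-up values), two of them retained.
z = [2.0, -1.0, 0.5, 0.0, 1.5]
t2_ret, t2_exc = t2_split(z, n_retained=2)
```

Each value would then be compared against its own control limit, so that a disturbance affecting only the excluded states is not masked by the retained ones.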
In addition, the multi-input–multi-output CVA and PLS state space models result in the calculation of two types of residuals, those associated with the errors from the predictions of the system states and those calculated from the system outputs. According to the state space model assumptions, the residuals are independent and identically distributed processes and are thus suitable for developing T2 statistics. The T2 statistics based on the state space model residuals provide additional dynamic MSPC tools that are capable of detecting a range of process faults such as disturbances and biased sensors.
A limitation of previous studies (e.g. Ku et al., 1995; Negiz & Cinar, 1997b; Simoglou et al., 1999b) was that the control limits for the T2 statistic and residuals were calculated on the assumption that the data were serially independent. However, when process data are collected from a dynamic system, serial correlation will generally be present. Thus, if the serial correlation is positive, the average run length (the expected number of observations until a control chart identifies an observation outside the control limits) of the resulting T2 control chart will be shorter than expected for in-control operation, resulting in too many false alarms (Vasilopoulos & Stamboulis, 1978). To address this issue, control limits based on the empirical reference distribution (Willemain & Runger, 1996) are investigated and compared to those calculated on the assumption that the statistical measures follow a known statistical distribution. Finally, two model types were investigated: the first assumed that the number of inputs and outputs was fixed, whereas the second explored the case where the number of inputs and outputs differed for different variables. The various monitoring statistics, model forms and control limits are investigated and compared using a comprehensive simulation of a continuous stirred tank co-polymerisation reactor (Achilias & Kiparissides, 1994).
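The idea of a control limit taken from an empirical reference distribution can be illustrated with a simple quantile sketch. This is an assumption-laden simplification rather than the procedure of Willemain and Runger (1996): the function names, the rank rule and the reference values are hypothetical, and the reference data are assumed to come from in-control operation.

```python
import math


def empirical_limit(reference, alpha):
    """Upper control limit as the (1 - alpha) empirical quantile.

    reference: values of a monitoring statistic from in-control operation.
    Uses the order statistic at rank ceil((1 - alpha) * n).
    """
    if not reference:
        raise ValueError("reference must not be empty")
    if not 0.0 < alpha < 1.0:
        raise ValueError("alpha must be in (0, 1)")
    ordered = sorted(reference)
    # Round before ceil so floating-point noise cannot bump the rank up.
    rank = math.ceil(round((1.0 - alpha) * len(ordered), 9))
    return ordered[rank - 1]


def false_alarm_rate(values, limit):
    """Fraction of values that fall above the control limit."""
    return sum(v > limit for v in values) / len(values)


# Twenty in-control values of a statistic (made-up data).
reference = [float(v) for v in range(1, 21)]
limit = empirical_limit(reference, alpha=0.05)
rate = false_alarm_rate(reference, limit)
```

Because the limit is read directly from the reference data, it needs no distributional assumption, although a long in-control record is required for the tail quantile to be estimated reliably.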
Dynamic performance monitoring is a particularly challenging area of multivariate statistical process control, being very demanding in terms of process engineering understanding, modelling skills and statistical expertise with regard to the appropriateness of the relevant monitoring statistics. The paper aims to identify and define those metrics that would potentially enable assured monitoring of the performance of dynamic processes. The paper contributes to the on-going discussions of dynamic performance monitoring.
Section snippets
State space models
The concept of state space modelling is based on describing a system in terms of k first-order difference equations that are combined into a first-order vector–matrix difference equation. The stochastic state space model which forms the basis of this work is that proposed by Larimore (1983), where x, u and y are the system states, inputs and outputs, respectively; F (k×k) is the state matrix; G (k×nu) is the input matrix; H (ny×k) is the output
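For orientation, a first-order vector–matrix difference equation using the matrices F, G and H named above is commonly written in the form sketched below. The noise terms w_t and v_t and the matrices A and B are assumptions here, since their definitions are not given in this excerpt, so this should be read as a plausible general form rather than the exact model used by the authors.

```latex
\begin{aligned}
x_{t+1} &= F x_t + G u_t + w_t \\
y_t     &= H x_t + A u_t + B w_t + v_t
\end{aligned}
```

In this form the first equation propagates the k states forward by one time step, and the second maps the states and inputs to the ny outputs.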
Hotelling's T2 based on the states and the future values of the state space model
Statistical monitoring metrics are developed based on the state space model described in the previous section. Although the use of exogenous variables is standard in process control, their inclusion in performance monitoring schemes has been less widely reported (Larimore, 1997). By monitoring them directly, changes in their behaviour can be detected more quickly than if information relating to any changes in their performance is based on other process variables. While it is not essential to incorporate such
Introduction and model form
The metrics described in the previous section and summarised in Table 1 were applied to a comprehensive simulation of a continuous stirred tank copolymerisation reactor (Achilias & Kiparissides, 1994). Monomers A (methyl methacrylate) and B (vinyl acetate) are added continuously to a perfectly mixed tank along with initiator (azobisisobutyronitrile), solvent (benzene), chain transfer agent T (acetaldehyde) and inhibitor (m-dinitrobenzene). Coolant flows through the jacket to remove the heat of
Conclusions
A multivariate approach for the monitoring of a continuous dynamic process has been proposed. It involved the development of state space models using CVA and PLS. The CVA and PLS state space models provided statistics based on the past and future latent variables, state and output residuals. Eight monitoring statistics were considered. The first four were based on the past and future latent variables and were calculated using Hotelling's T2 statistic. The remaining four were based on the state
Acknowledgements
The authors acknowledge the support of the EPSRC IMI Project System Capability Enhancement in High Performance Monitoring (SCIENTIA) GR/L28029.
References (25)
et al. On the validity of the steady-state approximations in high conversion diffusion-controlled free-radical polymerisation reactions. Polymer (1994)
The blockwise bootstrap for general empirical processes of stationary-sequences. Stochastic Processes and their Applications (1995)
et al. Structure identification of non-linear dynamic systems—a survey on input–output approaches. Automatica (1990)
et al. Disturbance detection and isolation by dynamic principal component analysis. Chemometrics and Intelligent Laboratory Systems (1995)
Statistical process control of multivariate processes. Control Engineering Practice (1995)
et al. Non-parametric confidence bounds for process monitoring charts. Journal of Process Control (1996)
et al. PLS, balanced and canonical variate realisation techniques for identifying VARMA models in state space. Chemometrics and Intelligent Laboratory Systems (1997)
Akaike, H. (1973). Information theory and an extension of the maximum likelihood principles. 2nd International...
Markovian representation of stochastic processes by canonical variables. Siam Journal of Control (1975)
et al. Time series modelling for statistical process control. Journal of Business and Economic Statistics (1988)
An introduction to multivariate statistical analysis
Bootstrap confidence intervals. Statistical Science