Fault detection with Conditional Gaussian Network
Introduction
Nowadays, system failures can lead to serious consequences for people, the environment or equipment, and fixing them can be expensive and even dangerous. To avoid these situations, modern complex systems must detect any change in their nominal operation early, before it becomes critical. To this end, several detection methods have been developed and enhanced in recent years. These methods can be broadly divided into two principal approaches: model-based methods and data-driven methods. Model-based methods are powerful, efficient and widely used. They rely on an analytical representation of the system (a detailed physical model). However, obtaining this representation for complex, large-scale systems is often impossible or very difficult, and requires a lot of time and money. For this reason, data-driven methods have received significant attention. Unlike model-based methods, they use only measurements taken directly from the system (or transformations of them) at different times (historical data).
Several data-driven methods for fault detection have been proposed (Yin et al., 2012, Ding, 2012, Qin, 2012, Venkatasubramanian et al., 2003, Chiang et al., 2001). Many of them are based on a rigorous statistical treatment of system data. One can mention the Subspace aided APproach (SAP), a powerful data-driven tool developed to address the difficulty of building an accurate physical model for complex systems. Partial Least Squares (PLS), Principal Component Analysis (PCA) and their variants (dynamic, non-linear, kernel, and probabilistic) are statistical methods widely used for data reduction and fault detection.
PCA is a well-known and powerful data-driven technique, widely used for fault detection and in many other fields because of its simple model building and its efficiency in handling large amounts of data. To determine at any moment whether the system is In Control (IC) or Out of Control (OC), PCA is, according to Ding et al. (2010) and Qin (2003), associated with statistics that have quadratic forms. These statistics are used not only with PCA but also with many other data-driven and model-based methods. Two well-known examples are the T2 and SPE (Squared Prediction Error) statistics. They are generally combined to complement each other and thus enhance fault sensitivity.
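The way PCA is paired with the T2 and SPE statistics can be illustrated with a minimal sketch. It assumes standardized in-control training data, an eigendecomposition of the correlation matrix, and control limits taken as empirical quantiles of the training statistics; the function names (`fit_pca`, `t2_spe`, `empirical_limits`, `is_out_of_control`), the synthetic data and the 1% false alarm rate are illustrative assumptions, not the procedure of the paper.

```python
import numpy as np

def fit_pca(X, n_components):
    """Fit a PCA model on in-control data X (rows are observations)."""
    mean = X.mean(axis=0)
    std = X.std(axis=0, ddof=1)
    Z = (X - mean) / std                      # standardize each variable
    R = np.cov(Z, rowvar=False)               # correlation matrix of X
    eigvals, eigvecs = np.linalg.eigh(R)      # eigenvalues in ascending order
    order = np.argsort(eigvals)[::-1]         # reorder to descending
    eigvals, eigvecs = eigvals[order], eigvecs[:, order]
    return {
        "mean": mean,
        "std": std,
        "P": eigvecs[:, :n_components],       # loadings of the retained subspace
        "lam": eigvals[:n_components],        # variances of the retained scores
    }

def t2_spe(model, x):
    """Return (T2, SPE) for one observation x."""
    z = (x - model["mean"]) / model["std"]
    t = model["P"].T @ z                      # scores in the retained subspace
    t2 = float(np.sum(t ** 2 / model["lam"])) # quadratic form on the scores
    residual = z - model["P"] @ t             # part of z not explained by the model
    spe = float(residual @ residual)
    return t2, spe

def empirical_limits(model, X, alpha=0.01):
    """Assumed thresholds: (1 - alpha) quantiles of the training statistics."""
    stats = np.array([t2_spe(model, x) for x in X])
    return tuple(np.quantile(stats, 1.0 - alpha, axis=0))

def is_out_of_control(model, x, limits):
    """Flag OC when either statistic exceeds its limit."""
    t2, spe = t2_spe(model, x)
    return bool(t2 > limits[0] or spe > limits[1])

# Synthetic correlated data standing in for in-control measurements.
rng = np.random.default_rng(0)
X_train = rng.normal(size=(500, 5)) @ rng.normal(size=(5, 5))
model = fit_pca(X_train, n_components=2)
limits = empirical_limits(model, X_train)

# Hypothetical fault: a large shift on the first variable.
x_fault = X_train[0] + 20.0 * model["std"] * np.eye(5)[0]
print(is_out_of_control(model, x_fault, limits))
```

Combining the two tests reflects the complementarity mentioned above: T2 monitors variation inside the retained subspace, while SPE monitors what that subspace fails to explain.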
Meanwhile, in recent decades, Bayesian networks (BNs) have also been proposed for fault detection (Yu and Rashid, 2013, Verron et al., 2010a, Huang, 2008, Roychoudhury et al., 2006, Schwall and Gerdes, 2002, Lerner et al., 2000). BNs are powerful tools that can be designed by experts and/or learned from data. They offer a probabilistic/statistical framework able to integrate information from different sources, which may be of interest for fault detection. Indeed, using and fusing all the information available on the system (such as causal influences (e.g. graphical representations of variable dependencies), probabilistic fault detection decisions, maintainability information, component reliability and so on) could lead to better decisions. From this perspective, we propose to use a BN to model PCA fault detection techniques.
Another important challenge is handling missing observations on-line. The most common approaches are based on imputation methods, which try to fill in the missing values. However, these methods are time-consuming and depend strongly on the missing rate of the original sample. The proposed network, unlike most Bayesian networks proposed for fault detection, is able to respect a false alarm rate, model the PCA fault detection scheme and handle missing observations automatically, without delay or imputation. The main contributions of this paper can be summarized in a few points: (1) a generalized form of the quadratic statistics (e.g. T2, SPE) under a probabilistic tool, (2) a probabilistic framework for fault detection, managing both PCA (systematic and residual subspaces) and the statistics under a single BN using discrete and Gaussian nodes, and (3) probabilities about the system state can be provided even when on-line data are missing (a non-imputation method to handle unobserved variables).
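The idea of handling missing values without imputation can be sketched under a simplifying assumption: if the in-control data are modelled as a multivariate Gaussian with known mean `mu` and covariance `sigma`, the observed variables alone are again Gaussian, with the corresponding sub-vector and sub-matrix. A quadratic statistic can then be computed on the observed entries only. The function name and the numbers below are hypothetical, and this is not the network proposed in the paper.

```python
import numpy as np

def quadratic_statistic(x, mu, sigma):
    """Quadratic (Mahalanobis-type) statistic on the observed entries of x.

    Missing entries are encoded as NaN and are marginalized out by keeping
    only the matching rows and columns of mu and sigma (no imputation).
    Returns (statistic, number of observed variables).
    """
    x = np.asarray(x, dtype=float)
    observed = ~np.isnan(x)
    if not observed.any():
        return 0.0, 0                          # nothing observed, no evidence
    d = x[observed] - mu[observed]
    sub_sigma = sigma[np.ix_(observed, observed)]
    stat = float(d @ np.linalg.solve(sub_sigma, d))
    return stat, int(observed.sum())

# Hypothetical in-control model with three variables.
mu = np.zeros(3)
sigma = np.array([[2.0, 0.5, 0.0],
                  [0.5, 1.0, 0.3],
                  [0.0, 0.3, 4.0]])

# The second variable is missing on-line.
stat, n_obs = quadratic_statistic([1.0, np.nan, 2.0], mu, sigma)
print(stat, n_obs)
```

Under this Gaussian assumption the statistic follows a chi-square distribution whose degrees of freedom equal the number of observed variables, so the threshold used for the decision has to be adapted to `n_obs`.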
The remainder of this paper is structured as follows. Section 2 gives a brief description of the definitions and tools needed to develop our proposals. Section 3 introduces the development of PCA under Conditional Gaussian Networks (CGNs) for fault detection. This is followed by a comparison between our proposal and standard PCA on two case studies. Finally, conclusions and outlooks are given in the last section.
Section snippets
Definition
A Bayesian Network (BN) (Jensen and Nielsen, 2007) is a probabilistic graphical model. It consists of the following:
- a directed acyclic graph G = (V, E), where V is the set of vertices of G (nodes) and E is the set of edges of G (arcs),
- a finite probabilistic space (Ω, Z, p), with Ω a non-empty space, Z a collection of the subspaces of Ω, and p a probability measure on Z with p(Ω) = 1,
- a set of random variables associated with the vertices of the graph and defined on (Ω, Z, p),
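As a small illustration of these ingredients, the sketch below builds a two-node discrete network and uses the standard BN property that the joint distribution factorizes into one conditional distribution per node given its parents. The network (Fault -> Alarm) and all probability values are hypothetical and chosen only for the example.

```python
# Hypothetical network: Fault -> Alarm (both variables are binary).
p_fault = {True: 0.01, False: 0.99}

# Conditional probability table p(Alarm | Fault), indexed [fault][alarm].
p_alarm_given_fault = {
    True: {True: 0.95, False: 0.05},
    False: {True: 0.02, False: 0.98},
}

def joint(fault, alarm):
    """p(Fault, Alarm) = p(Fault) * p(Alarm | Fault)."""
    return p_fault[fault] * p_alarm_given_fault[fault][alarm]

def posterior_fault(alarm):
    """p(Fault = True | Alarm = alarm), obtained with Bayes' rule."""
    evidence = sum(joint(f, alarm) for f in (True, False))
    return joint(True, alarm) / evidence

print(posterior_fault(True))
```

Observing a child node (here the alarm) updates the probability of its parent, which is the mechanism a BN-based detector relies on to report probabilities about the system state.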
The proposed probabilistic framework
In this section, we propose original CGNs for fault detection. Under these networks, we simultaneously handle PCA and the quadratic statistics that come with it. For clarity, we first introduce PCA under a CGN, then propose a probabilistic framework for statistics such as T2 and SPE, and finally give the proposed CGNs for fault detection. These CGNs can be used as an alternative to the PCA scheme for fault detection. Note, however, that like PCA they may be suitable for some applications and
Tennessee Eastman Process
In order to compare our proposal with the conventional PCA fault detection scheme (see Section 2.2), we test both on the Tennessee Eastman Process (TEP), an industrial chemical process (see Fig. 9). Its simulation, provided by the Eastman Chemical Company, is widely used as a benchmark problem for control techniques and for comparing fault detection and/or diagnosis methods.
The TEP consists of five major units namely, reactor, condenser, compressor, separator and stripper
Conclusions and outlooks
The main contribution of this paper is a new tool for fault detection. First, we have transposed standard PCA (systematic and residual subspaces) into a BN, and more precisely a CGN. Second, we have proposed a probabilistic framework for statistics such as T2 and SPE. To do so, it was necessary to define probabilistic control limits that match the decisions made by comparing the quadratic statistics to their thresholds. Finally, we have introduced a CGN
Acknowledgments
Mohamed Amine Atoui is supported by a Ph.D. grant from “la Région Pays de la Loire”. The authors gratefully acknowledge the contribution of the reviewers' comments.
References (31)
- et al. On the application of PCA technique to fault diagnosis. Tsinghua Sci. Technol. (2010)
- et al. A plant-wide industrial process control problem. Comput. Chem. Eng. (1993)
- Bayesian methods for control loop monitoring and diagnosis. J. Process Control (2008)
- et al. Process monitoring based on probabilistic PCA. Chemom. Intell. Lab. Syst. (2003)
- Survey on data-driven industrial process monitoring and diagnosis. Annu. Rev. Control (2012)
- et al. A review of process fault detection and diagnosis part III: process history based methods. Comput. Chem. Eng. (2003)
- et al. Fault detection and isolation of faults in a multivariate process with Bayesian network. J. Process Control (2010)
- et al. Fault diagnosis of industrial systems by Conditional Gaussian Network including a distance rejection criterion. Eng. Appl. Artif. Intell. (2010)
- et al. A comparison study of basic data-driven fault diagnosis and process monitoring methods on the benchmark Tennessee Eastman Process. J. Process Control (2012)
- (2006)
- Fault Detection and Diagnosis in Industrial Systems
- Pattern Classification
- Bayesian network classifiers. Mach. Learn.