Industrial process fault detection based on KGLPP model with Cam weighted distance
Introduction
With the rapid development of digital computation and sensor technology, data-driven process monitoring and fault detection methods have become necessary for improving the reliability and safety of industrial processes [1], [2], [3]. In high-dimensional, complex production processes, multiple correlations exist between variables. To handle these correlations, multivariate statistical projection methods are widely used to analyze massive, high-dimensional data by projecting the monitored sample vector from the high-dimensional variable space onto a latent space spanned by fewer latent variables [4], [5]. Commonly used methods include principal component analysis (PCA) [6], independent component analysis (ICA) [7], Fisher discriminant analysis (FDA) [8], locality preserving projections (LPP) [9], and neighborhood preserving embedding (NPE) [10]. However, these methods preserve only global or only local information. Several approaches that preserve both global and local structures have been proposed [11], [12], [13], [14]. For example, Zhang [12] developed a global–local structure analysis (GLSA) model that combines LPP and PCA. Luo [11] developed global–local preserving projections (GLPP) based on GLSA and revealed its intrinsic relationship with PCA and LPP. Similarly, Ma [13] designed an NPE-based local and nonlocal embedding algorithm that maximizes the scatter of nonlocal samples while minimizing the scatter of neighboring samples. Tang [15] introduced FDA into GLPP to maximize between-class scatter and minimize within-class scatter, which improves fault identification performance.
However, industrial processes involve control loops, physicochemical reactions, and complex correlations. Strong nonlinear relationships exist between the process variables, so linear methods rarely yield ideal detection results. The kernel trick has been applied to nonlinear fault detection by mapping the input space into a feature space, as in KPCA [16], KLPP [9], KFLD [17], and KGLPP [18], to approximate the nonlinear functional relationship [19]. These approaches measure the similarity between a sample and its neighbors by Euclidean distance, which assumes a normal distribution. Although they can solve nonlinear problems to a certain extent, they risk choosing false nearest neighbors and miscalculating the similarity between neighbors, because non-uniform and non-normal distributions are more pronounced in the feature space. Industrial process data are themselves usually multivariate and non-uniformly distributed [20], [21], [22], [23], [24]. The extent of the local structure directly affects the accuracy of fault detection, which means that the neighborhood distance measure influences detection performance. Therefore, a more suitable similarity measure for high-dimensional nonlinear spaces should be developed. Zhang [25] introduced the objective function of the adaptive maximum margin criterion into the neighborhood preserving projection method to maximize the scatter of data from different classes. To address the fundamental issue of neighborhood construction in manifold learning, Sun [26] used kernel sparse representation for neighborhood optimization. Wang [27] emphasized that seeking the local structure in the original feature space is error-prone in terms of neighbor finding and similarity measurement, and proposed a locality adaptive projection approach to preserve the neighborhood. However, these methods do not consider the non-uniform characteristics of the data.
When neighbors are selected by Euclidean distance in data whose density varies strongly, most of the neighbors come from high-density areas, while few come from sparse regions. Ignoring this characteristic causes substantial information redundancy in dense areas and insufficient information in sparse areas; consequently, detection performance declines.
This study is based on the idea that neighborhood distances should not rely on Euclidean distance alone but should be weighted differently according to the surroundings of each sample. A nonlinear fault detection method based on KGLPP with a Cam weighted distance neighborhood (KGLPP-Cam) is proposed. KGLPP-Cam maps the input space into a high-dimensional feature space for nonlinear fault detection. It preserves both local and global information and draws neighborhood information more evenly, thereby avoiding incomplete information in neighborhood selection. Monitoring statistics are computed in the feature space to monitor the state of the process. Furthermore, when the process changes, one or several variables change, which alters some principal components (PCs) [28], and the changes triggered by different faults vary. Rate-of-change functions of the test data and the normal training data in each projection direction are calculated to describe the sensitivity of the different projection directions to process changes. These functions make it possible to adaptively select projection PCs and reconstruct the PC space for different faults. A genetic algorithm (GA) is used to optimize the parameters and reduce the influence of parameter selection on the algorithm. Lastly, the effectiveness and superiority of the proposed approach are tested on a nonlinear numerical simulation system and the Tennessee Eastman (TE) process. The main contributions are as follows:
- (1) The variables of high-dimensional industrial process data are interrelated, and multivariate statistical projection is used to embed the high-dimensional space into a low-dimensional manifold for better analysis. On non-uniformly distributed data, Euclidean distance biases the amount of information captured when a local structure is projected. Cam weighted distance therefore replaces the traditional Euclidean distance so that neighborhood information is drawn from more reasonable sources.
- (2) The sensitivity of each PC differs from fault to fault. Rate-of-change functions of the test data and the normal training data in each projection direction are established to better describe how sensitive the different projection directions are to process changes. Thus, PCs can be adaptively selected for different faults.
- (3) The parameters heavily affect the detection performance of the proposed method, and adjusting them manually is time-consuming. They are optimized with GA, and the optimized parameters improve the detection performance.
The remainder of this paper is organized as follows. Section 2 introduces the basic theory of Cam weighted distance and KGLPP. Section 3 proposes the nonlinear fault detection method based on KGLPP-Cam and discusses its main procedure. Section 4 applies the KGLPP-Cam algorithm to a numerical case and the TE process and compares it with other detection methods to demonstrate its superiority. Section 5 presents the conclusion.
Cam distance
The idea of Cam weighted distance is to measure the distance between a point and its neighbors differently according to the point's surroundings [29]. Given normally distributed data, a transformation can be used to approximate the non-uniform distribution of the data set.
Definition 1 (Cam distribution). Consider a p-dimensional random vector that follows a p-dimensional Gaussian distribution with a given probability density function. A transformation is then defined as follows:
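The formulas of Definition 1 did not survive in this excerpt. As a hedged sketch only, the form in which the Cam distribution and the resulting distance are commonly stated in the Cam weighted distance literature [29] is given below. The symbols X, Y, a, b, τ, θ and the name CamDist are assumptions introduced here; the notation and normalization used in this paper may differ.

```latex
% Assumed standard form; symbols are not taken from this excerpt.
% X is a standard p-dimensional Gaussian vector with density f.
\[
X \sim N(0, I_p), \qquad
f(x) = (2\pi)^{-p/2} \exp\!\left(-\tfrac{1}{2}\, x^{\top} x\right)
\]
% The transformation stretches X along a unit direction tau.
\[
Y = X\,(a + b\cos\theta), \qquad a \ge b \ge 0, \qquad
\cos\theta = \frac{X^{\top}\tau}{\lVert X \rVert}, \qquad \lVert\tau\rVert = 1
\]
% Distance from a centre x_0 to a point x, rescaled by the same factor.
\[
\mathrm{CamDist}(x_0, x) = \frac{\lVert x - x_0 \rVert}{a + b\cos\theta}, \qquad
\cos\theta = \frac{(x - x_0)^{\top}\tau}{\lVert x - x_0 \rVert}
\]
```

Under this form, a sets the overall scale of the neighborhood, b sets how strongly it is stretched, and τ gives the direction of the stretch, so points lying toward τ count as closer than points at the same Euclidean distance on the opposite side.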
Adjacency matrix calculation
Cam weighted distance applies different weights to the Euclidean distance according to the density and orientation of a neighborhood, which makes it more suitable for locally non-uniformly distributed data. Thus, we propose the KGLPP-Cam method to improve the performance of nonlinear fault detection. First, the k-nearest neighbor set is determined, and the parameters of the Cam weighted distance are estimated. Then, the neighbors are reselected using the Cam weighted distance. It should be pointed out
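The two-stage neighbor selection described in this section can be sketched as follows. This is a minimal illustration under stated assumptions, not the authors' implementation: it works in the input space rather than the kernel feature space; it assumes the Cam weighted distance has the form ||x - x0|| / (a + b cos θ) commonly given in [29], where θ is the angle between x - x0 and a unit direction tau; and it estimates a, b and tau by simple moment matching under a standard normal model. The function names and the symbols a, b and tau are hypothetical.

```python
import numpy as np
from math import exp, lgamma, sqrt


def cam_parameters(x0, neighbors):
    """Moment-matching estimates of (a, b, tau) from the Euclidean k-NN of x0.

    Assumption: neighbors - x0 behaves like Y = X (a + b cos(theta)) with
    X ~ N(0, I_p), so E||Y|| = a * c1 and E[Y] = (b * c1 / p) * tau.
    """
    diffs = neighbors - x0
    p = diffs.shape[1]
    # c1 is the mean norm of a standard p-dimensional Gaussian vector.
    c1 = sqrt(2.0) * exp(lgamma((p + 1) / 2.0) - lgamma(p / 2.0))
    mean_len = np.linalg.norm(diffs, axis=1).mean()
    centre = diffs.mean(axis=0)
    centre_norm = np.linalg.norm(centre)
    a = mean_len / c1
    b = min(p * centre_norm / c1, a)  # enforce a >= b >= 0
    tau = centre / centre_norm if centre_norm > 0 else np.zeros(p)
    return a, b, tau


def cam_distance(x0, X, a, b, tau, eps=1e-12):
    """Euclidean distance from x0 to each row of X, rescaled by a + b cos(theta)."""
    diffs = X - x0
    norms = np.linalg.norm(diffs, axis=1)
    cos_theta = np.divide(diffs @ tau, norms,
                          out=np.zeros_like(norms), where=norms > 0)
    return norms / (a + b * cos_theta + eps)  # eps guards a zero denominator


def reselect_neighbors(X, i, k):
    """Pick k Euclidean neighbors of X[i], fit Cam parameters, then reselect."""
    x0 = X[i]
    others = np.delete(np.arange(len(X)), i)
    euclid = np.linalg.norm(X[others] - x0, axis=1)
    knn = others[np.argsort(euclid)[:k]]
    a, b, tau = cam_parameters(x0, X[knn])
    cam = cam_distance(x0, X[others], a, b, tau)
    return others[np.argsort(cam)[:k]], (a, b, tau)
```

Calling `reselect_neighbors(X, i, k)` on an `(n, p)` float array returns the indices of the reselected neighbors of sample `i` together with the fitted parameters. With `b = 0` the weighting reduces to a uniformly scaled Euclidean distance, so any change in the neighbor set comes from the directional term.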
Case study
The proposed KGLPP-Cam algorithm for nonlinear fault detection is tested on a numerical example and the Tennessee Eastman process. The Gaussian kernel is used in both case studies. KPCA, GLPP, and KGLPP with Euclidean distance are compared with KGLPP-Cam to test the effectiveness and superiority of the proposed algorithm in nonlinear fault detection.
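The kernel formula is missing from this excerpt. As a hedged sketch, the block below computes a Gaussian kernel matrix using one common parameterization, k(x, y) = exp(-||x - y||^2 / width), and then centres it in feature space as is usual for kernel projection methods. The parameter name `width`, its scaling, and both function names are assumptions; the paper may define the kernel width differently.

```python
import numpy as np


def gaussian_kernel(X, Y, width):
    """Kernel matrix with K[i, j] = exp(-||X[i] - Y[j]||^2 / width)."""
    sq_dists = (
        (X ** 2).sum(axis=1)[:, None]
        + (Y ** 2).sum(axis=1)[None, :]
        - 2.0 * X @ Y.T
    )
    # Clip tiny negative values caused by floating-point round-off.
    return np.exp(-np.maximum(sq_dists, 0.0) / width)


def center_kernel(K):
    """Centre a square training kernel matrix in feature space."""
    n = K.shape[0]
    one_n = np.full((n, n), 1.0 / n)
    return K - one_n @ K - K @ one_n + one_n @ K @ one_n
```

A larger `width` makes distant samples look more similar, so the kernel width is typically one of the parameters that has to be tuned for each case study.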
Conclusion
This study proposes a novel nonlinear fault detection method based on KGLPP with a Cam weighted distance neighborhood. This combination not only preserves global and local information but also makes the neighborhood information more reasonable. In addition, the new approach can adaptively select PCs and avoids the drawback of manual parameter adjustment. The numerical simulation and the TE case study show that the proposed method can significantly increase the performance of nonlinear fault
CRediT authorship contribution statement
Chenghong Huang: Conceptualization, Methodology, Software, Writing – original draft. Yi Chai: Conceptualization, Supervision, Funding acquisition. Bowen Liu: Formal analysis, Investigation. Qiu Tang: Software, Validation. Fei Qi: Investigation, Data curation.
Declaration of Competing Interest
The authors declare that they have no known competing financial interests or personal relationships that could have appeared to influence the work reported in this paper.
Acknowledgment
All authors have read and agreed to the published version of the manuscript.
Chenghong Huang was born in Chongqing, China. She received the B.E. degree in automation from the School of Automation, Chongqing University, Chongqing, China, in 2017. She is now a Ph.D. candidate in control theory and control engineering at the School of Automation, Chongqing University, Chongqing, China. Her current research interests include fault detection and diagnosis and the operational safety theory of dynamic systems.
References (38)
Review on data-driven modeling and monitoring for plant-wide industrial processes. Chemometr. Intell. Lab. Syst. (2017)
A geometry constrained dictionary learning method for industrial process monitoring. Inform. Sci. (2021)
Parallel quality-related dynamic principal component regression method for chemical process monitoring. J. Process Control (2019)
Statistical process monitoring with independent component analysis. J. Process Control (2004)
Fault diagnosis based on Fisher discriminant analysis and support vector machines. Comput. Chem. Eng. (2004)
Fault diagnosis method based on incremental enhanced supervised locally linear embedding and adaptive nearest neighbor classifier. Measurement (2014)
Nonlocal structure constrained neighborhood preserving embedding model and its application for fault detection. Chemometr. Intell. Lab. Syst. (2015)
Nonlinear process monitoring using kernel principal component analysis. Chem. Eng. Sci. (2004)
Nonlinear process monitoring based on kernel global–local preserving projections. J. Process Control (2016)
Multimode process fault detection using local neighborhood similarity analysis. Chin. J. Chem. Eng. (2014)
Multi-mode operation of principal component analysis with k-nearest neighbor algorithm to monitor compressors for liquefied natural gas mixed refrigerant processes. Comput. Chem. Eng.
Sparse locality preserving discriminative projections for face recognition. Neurocomputing
Machine health monitoring based on locally linear embedding with kernel sparse representation for neighborhood optimization. Mech. Syst. Signal Process.
Locality adaptive preserving projections for linear dimensionality reduction. Expert Syst. Appl.
Improving nearest neighbor classification with cam weighted distance. Pattern Recognit.
Whole process monitoring based on unstable neuron output information in hidden layers of deep belief network. IEEE Trans. Cybern.
Efficient locality weighted sparse representation for graph-based learning. Knowl.-Based Syst.
Automated feature learning for nonlinear process monitoring – An approach using stacked denoising autoencoder and k-nearest neighbor rule. J. Process Control
Fault detection in industrial processes using canonical variate analysis and dynamic principal component analysis. Chemometr. Intell. Lab. Syst.
Yi Chai received the B.E. degree from the Department of Electronic Engineering, National University of Defense Technology, Changsha, China, in 1982, and the M.S. and Ph.D. degrees from the Department of Automation, Chongqing University, Chongqing, China, in 1994 and 2001, respectively. He is currently a professor and doctoral advisor at Chongqing University. His research interests include nonlinear dynamic systems, signal processing, information fusion, fault detection and diagnosis, and intelligent systems.
Bowen Liu was born in Chongqing, China. He received the B.E. and M.S. degrees in electrical engineering from Guangxi University, Nanning, China, in 2013 and 2017, respectively. He is now a Ph.D. candidate in control theory and control engineering at the School of Automation, Chongqing University, Chongqing, China. His current research interests include fault detection and diagnosis, pattern recognition, and their application to large-scale process systems.
Qiu Tang was born in Chongqing, China. She received the B.E. degree in automation from Qingdao University, Qingdao, China, in 2015, and the Ph.D. degree in control theory and control engineering from Chongqing University, Chongqing, China. She is now a postdoctoral researcher at Shandong University. Her current research interests include fault detection and diagnosis, pattern recognition, and their application to large-scale process systems.
Fei Qi received the B.E. degree from the Department of Computer Science and Technology, Nanjing University of Technology, Nanjing, China, in 2003, and the M.S. degree in automation from the School of Automation, Chongqing University, Chongqing, China, in 2011. He is now a Ph.D. candidate in control theory and control engineering at the School of Automation, Chongqing University, Chongqing, China. His current research interests include fault detection and diagnosis, and system modeling and control.