Elsevier

Journal of Process Control

Volume 106, October 2021, Pages 110-121

Industrial process fault detection based on KGLPP model with Cam weighted distance

https://doi.org/10.1016/j.jprocont.2021.09.004

Highlights

  • Neighbor information is obtained with adaptive orientation and scale.

  • The number of PCs for different faults is selected adaptively.

  • A heuristic method is used to optimize the parameters.

Abstract

The nearest neighbor selection of multivariate statistical projection analysis methods assumes locally constant probabilities. However, ignoring the non-uniform distribution of the data causes information redundancy in data-dense regions and insufficient information in data-sparse regions, which degrades detection performance. In this study, a weighted distance named Cam weighted distance is used to reselect the neighbors and thereby overcome this limitation. A nonlinear industrial fault detection method based on KGLPP-Cam is developed. The proposed method not only preserves global and local information but also adapts the orientation and scale of the neighborhood to its surroundings when gathering neighbor information. T² and SPE statistics are calculated for fault detection. A change ratio function is constructed to select sensitive principal components (PCs) adaptively and to better describe the sensitivity of different projection directions to process change information. The proposed method is examined through a numerical example and the Tennessee Eastman (TE) process.

Introduction

With the rapid development of digital computation and sensor technology, data-driven process monitoring and fault detection methods have become necessary to improve the reliability and safety of industrial processes [1], [2], [3]. In high-dimensional, complex production processes, multiple correlations exist between variables. To handle this, multivariate statistical projection analysis methods are widely used to analyze the correlation of massive, high-dimensional data by projecting the monitored sample vector from the high-dimensional variable space to a latent space spanned by fewer latent variables [4], [5]. Commonly used methods include principal component analysis (PCA) [6], independent component analysis (ICA) [7], Fisher discriminant analysis (FDA) [8], locality preserving projections (LPP) [9], and neighborhood preserving embedding (NPE) [10]. However, these methods preserve only global or only local information. Several studies on preserving both global and local structures have been proposed [11], [12], [13], [14]. For example, Zhang [12] developed a global–local structure analysis (GLSA) model that combines LPP and PCA. Luo [11] developed global–local preserving projections (GLPP) based on GLSA and revealed its intrinsic relationship with PCA and LPP. Similarly, Ma [13] designed an NPE-based local and nonlocal embedding algorithm that maximizes nonlocal samples and minimizes neighborhood samples. Tang [15] introduced FDA into GLPP to maximize the scatter between classes and minimize the scatter within classes, which enhances fault identification performance.

However, industrial processes involve control loops, physicochemical reactions, and complex correlations. Strong nonlinear relationships exist between the process variables, so linear methods rarely yield ideal detection results. The kernel trick has been explored for nonlinear fault detection: methods such as KPCA [16], KLPP [9], KFLD [17], and KGLPP [18] map the input space into a feature space to approximate the nonlinear functional relationship [19]. These approaches represent the similarity between data points and their neighbors in terms of Euclidean distance, which assumes the distribution to be normal. Although they can solve nonlinear problems to a certain extent, they risk choosing false nearest neighbors and miscalculating the similarity between neighbors because non-uniform and non-normal distributions are more pronounced in a feature space. Industrial process data are themselves usually multivariate and non-uniformly distributed [20], [21], [22], [23], [24]. The range of the local structure directly affects the accuracy of fault detection, which means that the neighborhood distance measure influences detection performance. Therefore, a more suitable similarity measure for high-dimensional nonlinear spaces should be developed. Zhang [25] introduced the objective function of the adaptive maximum margin criterion method into the neighborhood preserving projection method to maximize the scatter of data from different classes. To address the fundamental issue of neighborhood construction in manifold learning, Sun [26] used kernel sparse representation for neighborhood optimization. Wang [27] emphasized that seeking the local structure in the original feature space is error-prone in terms of neighbor finding and similarity measurement, and proposed a locality adaptive projection approach to preserve the neighborhood. However, they did not consider non-uniform characteristics. When a neighborhood with highly variable density is selected using Euclidean distance, most of the neighbors come from the high-density area, while only a few come from sparse regions. Ignoring this characteristic causes considerable information redundancy in data-dense areas and insufficient information in data-sparse areas; consequently, detection performance declines.
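The imbalance described above can be illustrated with a small sketch. The data are synthetic and the helper names (`knn_indices`, `count_by_side`) are hypothetical, chosen only for this illustration. The sketch places a query point on the boundary between a dense and a sparse region and counts which side its Euclidean k-nearest neighbors come from.

```python
import numpy as np


def knn_indices(points, query, k):
    """Indices of the k points nearest to `query` in plain Euclidean distance."""
    dists = np.linalg.norm(points - query, axis=1)
    return np.argsort(dists)[:k]


def count_by_side(points, query, neighbor_idx):
    """Count selected neighbors left and right of the query along axis 0."""
    left = int(np.sum(points[neighbor_idx, 0] < query[0]))
    return left, len(neighbor_idx) - left


# Synthetic, illustrative data: a dense half-plane next to a sparse one.
rng = np.random.default_rng(0)
dense = rng.uniform([-1.0, -1.0], [0.0, 1.0], size=(400, 2))   # x in [-1, 0)
sparse = rng.uniform([0.0, -1.0], [1.0, 1.0], size=(20, 2))    # x in [0, 1)
points = np.vstack([dense, sparse])

query = np.array([0.0, 0.0])  # sits on the boundary between the two regions
neighbors = knn_indices(points, query, k=20)
n_dense, n_sparse = count_by_side(points, query, neighbors)
print(f"neighbors from dense side: {n_dense}, from sparse side: {n_sparse}")
```

Because the dense side holds twenty times as many samples per unit area, nearly all selected neighbors fall on that side, so the sparse direction is underrepresented in the local structure of the query point.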

This study is based on the idea that the neighborhood distance should be a Euclidean distance that is weighted differently according to the surroundings of each point. A nonlinear fault detection method based on KGLPP with a Cam weighted distance neighborhood (KGLPP-Cam) is proposed. KGLPP-Cam maps the input space into a high-dimensional feature space for nonlinear fault detection. It preserves both local and global information and gathers neighbor information more evenly, thereby avoiding incomplete information in neighborhood selection. T² and SPE are computed in the feature space to monitor the state of the process. Furthermore, when the process information varies, one or several variables change, which alters some of the PCs [28], and the changes triggered by different faults differ. The rate of change between the test data and the normal training data in each projection direction is calculated to describe the sensitivity of the different projection directions to process change information. This function can adaptively select projection PCs and reconstruct the PC space for different faults. A genetic algorithm (GA) is used to optimize the parameters and reduce the influence of parameter selection on the algorithm. Lastly, the proposed approach is tested on a nonlinear numerical simulation system and the Tennessee Eastman (TE) process to demonstrate its effectiveness and superiority. The main contributions are as follows:

  • (1) The variables of high-dimensional industrial process data are interrelated, and the multivariate statistical projection method is used to embed the high-dimensional space into a low-dimensional manifold for better analysis. When a local structure is projected on non-uniformly distributed data, Euclidean distance biases the amount of information drawn from each region. Cam weighted distance is therefore used in place of the traditional Euclidean distance to make the information source more reasonable.

  • (2) The sensitivity of each PC differs from fault to fault. The rate of change between the test data and the normal training data in each projection direction is established to better describe the sensitivity of the different projection directions to process change information. Thus, PCs for different faults can be selected adaptively.

  • (3) Parameters heavily affect the detection performance of the proposed method, and adjusting them manually is time consuming. They are optimized using GA, and the optimized parameters improve the detection performance.
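As background for the T² and SPE monitoring statistics used in this study, the following is a minimal sketch of how they are commonly computed from a projection model. It uses linear PCA on synthetic data as a stand-in for the KGLPP-Cam projection, which this excerpt does not specify. The function names (`fit_monitor`, `t2_spe`, `simulate`), the mixing matrix `A`, and the sensor bias are assumptions made only for illustration.

```python
import numpy as np


def fit_monitor(X_train, n_components):
    """Fit a linear PCA monitoring model (stand-in for the KGLPP projection)."""
    mean = X_train.mean(axis=0)
    std = X_train.std(axis=0, ddof=1)
    Z = (X_train - mean) / std
    eigvals, eigvecs = np.linalg.eigh(np.cov(Z, rowvar=False))
    order = np.argsort(eigvals)[::-1][:n_components]  # largest variance first
    return {"mean": mean, "std": std, "P": eigvecs[:, order], "lam": eigvals[order]}


def t2_spe(model, X):
    """Return (T2, SPE) for each row of X."""
    Z = (X - model["mean"]) / model["std"]
    T = Z @ model["P"]                          # scores in the retained subspace
    t2 = np.sum(T**2 / model["lam"], axis=1)    # variance-scaled score distance
    residual = Z - T @ model["P"].T             # part not explained by the PCs
    spe = np.sum(residual**2, axis=1)           # squared prediction error
    return t2, spe


# Synthetic example: five variables driven by two latent factors plus noise.
rng = np.random.default_rng(1)
A = np.array([[1.0, 1.0, 1.0, 0.0, 0.0],
              [0.0, 0.0, 1.0, 1.0, 1.0]])


def simulate(n):
    """Draw n samples of the hypothetical five-variable process."""
    return rng.normal(size=(n, 2)) @ A + 0.1 * rng.normal(size=(n, 5))


X_train, X_normal, X_fault = simulate(500), simulate(200), simulate(200)
X_fault[:, 0] += 5.0  # hypothetical sensor bias that breaks the correlation

model = fit_monitor(X_train, n_components=2)
t2_normal, spe_normal = t2_spe(model, X_normal)
t2_fault, spe_fault = t2_spe(model, X_fault)
print("mean SPE normal / fault:", spe_normal.mean(), spe_fault.mean())
```

In this sketch T² measures variation inside the retained subspace, while SPE measures what the retained directions cannot explain, which is why a bias that violates the learned correlation shows up strongly in SPE. Control limits, kernel mapping, and adaptive PC selection are omitted.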

The remaining parts of this paper are organized as follows. Section 2 introduces the basic theory of Cam weighted distance and KGLPP. Section 3 proposes a nonlinear fault detection method based on KGLPP-Cam and discusses the main process. Section 4 applies the KGLPP-Cam algorithm to numerical cases and TE processes and provides a comparison of the KGLPP-Cam algorithm with other detection methods to demonstrate its superiority. Section 5 presents the conclusion.

Cam distance

The idea of Cam weighted distance is to measure the distance between a point and its neighbors differently according to the point's surroundings [29]. Given normally distributed data Y, a transformation can be used to approximate the non-uniform distribution of the data set.

Definition 1 Cam Distribution

For a p-dimensional random vector $Y=[y_1,y_2,\ldots,y_p]^T\in\mathbb{R}^p$ that follows a p-dimensional Gaussian distribution $N(0,I)$, the probability density function is $f(y)=\frac{1}{(2\pi)^{p/2}}e^{-\frac{1}{2}y^Ty}$. Defining a transformation X as follows: X=(a

Adjacent matrix calculation

Cam weighted distance applies different weights to the Euclidean distance according to the density and orientation of a neighborhood. It is therefore better suited to locally non-uniformly distributed data. Thus, we propose a KGLPP-Cam method to improve the performance of nonlinear fault detection. First, the k-nearest neighbor set $\Omega_k(x_i)$ is determined, and the parameters $a_i$, $b_i$, and $\tau_i$ of the Cam weighted distance are estimated. Then, the neighbors are reselected using Cam weighted distance. It should be pointed out

Case study

The proposed KGLPP-Cam algorithm for nonlinear fault detection is tested on a numerical example and the Tennessee Eastman process. The Gaussian kernel $k(x_i,x_j)=\exp\left(-\frac{\|x_i-x_j\|^2}{2h^2}\right)$ is used in both case studies. KPCA, GLPP, and KGLPP with Euclidean distance are compared with KGLPP-Cam to test the effectiveness and superiority of the proposed algorithm in nonlinear fault detection.
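A Gaussian kernel matrix of this kind can be sketched as follows. The sketch assumes the common parameterization exp(−‖xi − xj‖² / (2h²)) with width h. The centering step is the one customary in kernel methods such as KPCA, and this excerpt does not say whether the authors apply it. The function names, the random data, and the value h = 2.0 are illustrative assumptions.

```python
import numpy as np


def gaussian_kernel(A, B, h):
    """Gaussian kernel matrix between rows of A and rows of B with width h."""
    sq_dists = (np.sum(A**2, axis=1)[:, None]
                + np.sum(B**2, axis=1)[None, :]
                - 2.0 * A @ B.T)
    sq_dists = np.maximum(sq_dists, 0.0)  # guard against tiny negative round-off
    return np.exp(-sq_dists / (2.0 * h**2))


def center_kernel(K):
    """Center a square training kernel matrix in feature space."""
    n = K.shape[0]
    ones = np.full((n, n), 1.0 / n)
    return K - ones @ K - K @ ones + ones @ K @ ones


rng = np.random.default_rng(2)
X = rng.normal(size=(50, 3))      # synthetic samples, not process data
K = gaussian_kernel(X, X, h=2.0)  # h = 2.0 is an arbitrary illustrative width
K_centered = center_kernel(K)
```

A small h makes the kernel matrix close to the identity and a large h makes all entries close to one, so detection results depend on h. This is consistent with the authors' choice to tune parameters with GA rather than by hand.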

Conclusion

This study proposes a novel nonlinear fault detection method based on KGLPP with Cam weighted distance neighborhood. This incorporation not only preserves global and local information but also makes the information of neighborhood more reasonable. In addition, the new approach can adaptively select PCs and avoid the drawback of manual parameter adjustments. The numerical simulation and the TE case study show that the proposed method can significantly increase the performance of nonlinear fault

CRediT authorship contribution statement

Chenghong Huang: Conceptualization, Methodology, Software, Writing – original draft. Yi Chai: Conceptualization, Supervision, Funding acquisition. Bowen Liu: Formal analysis, Investigation. Qiu Tang: Software, Validation. Fei Qi: Investigation, Data curation.

Declaration of Competing Interest

The authors declare that they have no known competing financial interests or personal relationships that could have appeared to influence the work reported in this paper.

Acknowledgment

All authors have read and agreed to the published version of the manuscript.

Chenghong Huang was born in Chongqing, China. She received the B.E. degree in automation from the School of Automation, Chongqing University, Chongqing, China, in 2017. She is now a Ph.D. candidate in control theory and control engineering at the School of Automation, Chongqing University, Chongqing, China. Her current research interests include fault detection and diagnosis and the operational safety theory of dynamic systems.

Yi Chai received the B.E. degree from the Department of Electronic Engineering, National University of Defense Technology, Changsha, China, in 1982, and the M.S. and Ph.D. degrees from the Department of Automation, Chongqing University, Chongqing, China, in 1994 and 2001, respectively. He is currently a professor and doctoral advisor at Chongqing University. His research interests include nonlinear dynamic systems, signal processing, information fusion, fault detection and diagnosis, and intelligent systems.

Bowen Liu was born in Chongqing, China. He received the B.E. and M.S. degrees in Electrical Engineering from Guangxi University, Nanning, China, in 2013 and 2017, respectively. He is now a Ph.D. candidate in control theory and control engineering at the School of Automation, Chongqing University, Chongqing, China. His current research interests include fault detection and diagnosis, pattern recognition, and their application to large-scale process systems.

Qiu Tang was born in Chongqing, China. She received the B.E. degree in automation from Qingdao University, Qingdao, China, in 2015. She received the Ph.D. degree in Control Theory and Control Engineering from Chongqing University, Chongqing, China. She is now a postdoctoral researcher at Shandong University. Her current research interests include fault detection and diagnosis, pattern recognition, and their application to large-scale process systems.

Fei Qi received the B.E. degree from the Department of Computer Science and Technology, Nanjing University of Technology, Nanjing, China, in 2003, and the M.S. degree in automation from the School of Automation, Chongqing University, Chongqing, China, in 2011. He is now a Ph.D. candidate in control theory and control engineering at the School of Automation, Chongqing University, Chongqing, China. His current research interests include fault detection and diagnosis, and system modeling and control.

This work is supported by the National Natural Science Foundation of China No. 61633005.