Variable selection and data pre-processing in NN modelling of complex chemical processes

doi:10.1016/j.compchemeng.2005.01.004

Computers & Chemical Engineering

Volume 29, Issue 7, 15 June 2005, Pages 1647-1659

https://doi.org/10.1016/j.compchemeng.2005.01.004

Abstract

Neural network models are now a powerful tool for identifying complicated processes. However, because they are data-driven “black box” models, they cannot avoid the consequences of the “garbage in–garbage out” rule. This work proposes a simultaneous data balancing-variable selection procedure based on traditional statistical techniques and modern information theoretic approaches. It is applied to a complicated dataset of restricted quality from a commercial aldol condensation unit (BASF). Based on the pre-processed database, a neural model for predicting the process yield has been developed. The results confirm the importance of the pre-processing stage, both for generalization accuracy and for the simpler network structure that results from the data-variable selection procedure. Finally, the model trends have been analyzed to assess the qualitative characteristics of the model, which was then used in industrial test runs and led to an improvement in process operation.

Introduction

Neural networks (NNs) are among the most promising modelling techniques of our time. As “universal approximators” they can handle non-linear multivariate systems of any complexity level, while as “black box” models they do not require extensive knowledge of the process to be modelled. Instead, they are built from databases and tolerate the faults and noise those databases contain (Hornik, Stinchcombe, & White, 1989). Although they are not new in concept, interest in them has increased significantly in the last decade, mainly because of the tremendous evolution of digital computing. Published applications of neural networks in chemical engineering include fault diagnosis in chemical plants (Venkatasubramanian & Chan, 1989), system identification and control (Polland, Broussard, Garrison & San, 1992; Psichogios & Ungar, 1991), sensor data analysis (Piovoso & Owens, 1991), and chemical composition analysis (Weixiang, Dezhao, & Shangxu, 1998).

As a result of this modelling success, a number of software packages for the design and development of neural networks have become available. Most of these packages offer a variety of options for the network architecture, the parameters of the training algorithm, the stopping criteria and the model analysis. Above all, they provide a user-friendly environment that hides the details of the complicated mathematical and computational training procedure and makes neural network development much easier. However, neural network modelling also involves a data pre-processing stage, which can be decisive for the development of a successful model: no matter how powerful the neural network modelling technique may be, it cannot escape the “garbage in–garbage out” curse of “black box” modelling.

More specifically, as far as commercial databases are concerned, raw data obtained from plant operation should not be used unprocessed in identification studies. First, the special cases of process start-up and shutdown have to be recognized and the corresponding data removed. Second, outliers should be detected in order to avoid using data that correspond to measurement errors or operation faults. Not all outliers are of this nature, however: some reflect extreme yet possible process set-ups and therefore contain useful information. Cooperation between the neural network model developer and the process engineers is of great importance for all these aspects as well as for variable selection. The proper variables have to be selected from a large number of potential input variables, which may also correlate strongly with each other because of the process control system. Furthermore, for effective modelling the data must be information rich over the process operation range (Neelkanten & Guiver, 1998). A well-balanced dataset may cure some of the problems that arise because the variable ranges in industrial datasets are usually restricted by the operational limitations set by the process control system.
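The screening steps described above (flagging suspicious samples and spotting strongly correlated inputs) can be illustrated with a minimal Python sketch. It is not the procedure used in the paper: the modified z-score rule, the thresholds 3.5 and 0.9, and the variable names `T_feed`, `T_jacket` and `pH` are all assumptions chosen for illustration. Flagged samples are candidates for review with the process engineers rather than automatic deletion, since some outliers carry useful information.

```python
from statistics import median


def robust_outlier_flags(values, threshold=3.5):
    """Flag values whose modified z-score (median/MAD based) exceeds threshold.

    The threshold of 3.5 is an assumed, commonly used default.
    """
    med = median(values)
    mad = median([abs(v - med) for v in values])
    if mad == 0:
        # No spread around the median: nothing can be flagged reliably.
        return [False] * len(values)
    return [abs(0.6745 * (v - med) / mad) > threshold for v in values]


def pearson(x, y):
    """Pearson correlation coefficient of two equally long sequences."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    return sxy / (sxx * syy) ** 0.5


def correlated_pairs(columns, limit=0.9):
    """Return pairs of variable names whose |correlation| reaches the limit."""
    names = sorted(columns)
    return [
        (a, b)
        for i, a in enumerate(names)
        for b in names[i + 1:]
        if abs(pearson(columns[a], columns[b])) >= limit
    ]


# Hypothetical example data (not taken from the paper).
yield_values = [10.0, 10.2, 9.9, 10.1, 10.0, 25.0]
flags = robust_outlier_flags(yield_values)

columns = {
    "T_feed": [1.0, 2.0, 3.0, 4.0, 5.0],
    "T_jacket": [2.1, 4.0, 6.2, 7.9, 10.1],
    "pH": [7.0, 5.0, 8.0, 6.0, 7.0],
}
pairs = correlated_pairs(columns)
```

A median-based rule is used here because a single extreme value inflates the ordinary standard deviation and can mask itself; any other screening rule agreed with the plant staff could be substituted.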

From this short description, it is clear that the data pre-processing stage is in fact a complicated task of data and variable selection, in which a substantial number of aspects must be taken into account. Because neural network modelling is usually applied to complex, multidimensional problems, no standard data pre-processing procedure has yet been developed that treats all these aspects and could therefore be integrated into a neural network software package and performed automatically. Instead, users very commonly follow case-oriented data pre-processing methods, relying to a great extent on their intuition and on the idiosyncrasies of the specific dataset.

In this paper, a data pre-processing method is proposed that combines typical statistical techniques (Hair, Anderson, Tatham, & Black, 1998) with information theoretic approaches for variable selection and data preparation in neural network modelling (Sridhar, Bartlett, & Seagrave, 1998). The method handles data selection systematically, sets several variable selection levels simultaneously and indicates the points at which the experience of the process engineers is valuable. A database from a commercial aldol condensation process has been used to implement both the method and the neural network modelling technique. The product of this process is an important intermediate in the production of fungicidal agrochemicals. The complexity of the modelling task, which makes a neural network modelling effort worthwhile, arises mainly from the side reactions combined with the crystallization that occurs in the semi-batch reactor, as will be discussed later. Finally, the paper examines the effect of the proposed pre-processing methods on model accuracy and generalization capability, analyzes the theoretical consistency of the model in some case studies and validates the model predictions with operational test runs on the plant.
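To make the idea of an information theoretic variable ranking more concrete, the following Python sketch estimates the mutual information between each candidate input and the output from equal-width histograms and sorts the inputs by that score. It is a simplified illustration only, not the subset selection algorithm cited above: the binning scheme, the default of four bins and the variable names `flow` and `const` are assumptions.

```python
from collections import Counter
from math import log2


def discretize(values, bins=4):
    """Map each value to an equal-width bin index in 0..bins-1."""
    lo, hi = min(values), max(values)
    width = (hi - lo) / bins or 1.0  # avoid division by zero for constants
    return [min(int((v - lo) / width), bins - 1) for v in values]


def mutual_information(x, y, bins=4):
    """Histogram estimate of I(X;Y) in bits."""
    xb, yb = discretize(x, bins), discretize(y, bins)
    n = len(xb)
    px, py, pxy = Counter(xb), Counter(yb), Counter(zip(xb, yb))
    # Sum of p(i,j) * log2( p(i,j) / (p(i) * p(j)) ) over occupied cells.
    return sum(
        (c / n) * log2(c * n / (px[i] * py[j])) for (i, j), c in pxy.items()
    )


def rank_inputs(inputs, output, bins=4):
    """Return (name, score) pairs sorted by decreasing mutual information."""
    scores = {
        name: mutual_information(col, output, bins)
        for name, col in inputs.items()
    }
    return sorted(scores.items(), key=lambda kv: kv[1], reverse=True)


# Hypothetical example data (not taken from the paper).
output = [0.0, 1.0, 2.0, 3.0, 4.0, 5.0, 6.0, 7.0]
inputs = {
    "flow": [0.0, 1.0, 2.0, 3.0, 4.0, 5.0, 6.0, 7.0],  # fully informative
    "const": [5.0] * 8,                                 # carries no information
}
ranking = rank_inputs(inputs, output)
```

Histogram estimates are sensitive to the number of bins and to small sample sizes, so a pairwise ranking like this is best read as a first indication. It also ignores redundancy between inputs, which a true subset selection has to account for.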

Section snippets

Combinatory data pre-processing method

Typically, data pre-processing methods for neural network modelling consist of two stages. In the first stage, simple techniques for missing-data handling and outlier detection perform a first-level data screening. Then, based on these results, more sophisticated multivariate techniques are applied in order to reduce the dimensionality of the input variable space (variable selection) and to homogenize the information distribution of the input and output variables (data sample selection). Usually,
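As an illustration of what homogenizing the information distribution by sample selection can look like, the Python sketch below thins over-represented ranges of the output so that no range dominates the training set. It is one plausible reading under stated assumptions, not the balancing procedure of the paper: the equal-width bins, the per-bin cap and the function name `balance_by_output` are hypothetical.

```python
import random
from collections import defaultdict


def balance_by_output(samples, output_index, bins=5, cap=None, seed=0):
    """Keep at most `cap` samples per equal-width bin of the output variable.

    If cap is None, the size of the smallest occupied bin is used.
    """
    ys = [s[output_index] for s in samples]
    lo, hi = min(ys), max(ys)
    width = (hi - lo) / bins or 1.0  # avoid division by zero for constants
    groups = defaultdict(list)
    for s in samples:
        b = min(int((s[output_index] - lo) / width), bins - 1)
        groups[b].append(s)
    if cap is None:
        cap = min(len(g) for g in groups.values())
    rng = random.Random(seed)  # fixed seed keeps the selection reproducible
    kept = []
    for b in sorted(groups):
        g = groups[b]
        kept.extend(g if len(g) <= cap else rng.sample(g, cap))
    return kept


# Hypothetical data: ten samples near the usual yield, two at low yield.
samples = [(i, 0.90 + 0.001 * i) for i in range(10)] + [(10, 0.50), (11, 0.52)]
balanced = balance_by_output(samples, output_index=1, bins=2)
```

Capping at the smallest bin discards data aggressively; with a restricted industrial dataset a larger, explicitly chosen `cap` may be the more sensible compromise.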

Process description

A brief description of the process is given to illustrate the complexity of the modelling task, which is a typical field of application for NN modelling.

Commercial database—implementation of pre-processing methods

The training and validation of the NN model was based upon industrial data provided by BASF (Schwarzheide). The 19 candidate input variables and the one output variable (the primary product yield P1, where P1 is simply denoted as P in the following) are presented in Table 1.

From the description of the variables it is clear that operational variables of the reactor and the separation unit as well as laboratory variables relative to the reagent quality have been taken into account. The commercial

Results and discussion

The training of the NNs was carried out with ATLAN-tec’s NeuroModel® software package (Version 1.41). This package automatically normalizes the data with a linear transformation into the range [0.1, 0.9]. It uses multilayer perceptrons (MLP) with one hidden layer and, for training, a version of the error back-propagation (EBP) algorithm with a momentum term. The nodes of the hidden layer use a sigmoid transfer function and the number of the hidden nodes is a parameter that can be
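The linear normalization into [0.1, 0.9] mentioned above corresponds to x' = 0.1 + 0.8 (x - x_min) / (x_max - x_min). The Python sketch below shows this mapping and its inverse. It illustrates the formula only and does not reproduce the software package's internals; the function name `fit_linear_scaler` and the example values are assumptions.

```python
def fit_linear_scaler(values, lo=0.1, hi=0.9):
    """Return (forward, inverse) functions mapping [min, max] onto [lo, hi]."""
    vmin, vmax = min(values), max(values)
    span = vmax - vmin
    if span == 0:
        raise ValueError("a constant variable cannot be scaled linearly")

    def forward(v):
        # Linear map: vmin -> lo, vmax -> hi.
        return lo + (hi - lo) * (v - vmin) / span

    def inverse(s):
        # Undo the map to recover the value in original units.
        return vmin + (s - lo) * span / (hi - lo)

    return forward, inverse


# Hypothetical temperature readings (not taken from the paper).
readings = [20.0, 35.0, 50.0]
forward, inverse = fit_linear_scaler(readings)
scaled = [forward(v) for v in readings]
```

The scaler should be fitted on the training data only and then reused unchanged for validation data and for converting network outputs back to physical units with `inverse`.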

Conclusions

The data pre-processing stage in neural network modelling is of great importance, especially for commercial datasets of restricted quality. The simultaneous “data balancing-variable selection” procedure proposed in this paper has been applied to the complex case of a commercial aldol condensation process, and the results show that the generalization accuracy of the neural networks improves compared with that achieved when only the typical “outliers removal”

Acknowledgements

Financial support provided by BASF Schwarzheide GmbH and the exchange of knowledge with its research and operating sections, particularly with Udo Rotermund, Olaf Otto, Rainer Noack and Jan Rudloff, are gratefully acknowledged.

References (14)

K. Hornik et al. Multilayer feedforward networks are universal approximators. Neural Networks (1989)
D. Sridhar et al. Information theoretic subset selection for neural network models. Computers and Chemical Engineering (1998)
Z. Weixiang et al. Potential function based neural networks and its application to the classification of complex chemical patterns. Computer and Chemistry (1998)
R. Zauner et al. On the influence of mixing on crystal precipitation processes – Application of the segregated feed model. Chemical Engineering Science (2002)
D. Berry et al. Synthesis of reactive crystallization processes. AIChE Journal (1997)
J. Clayden et al. Organic chemistry (2001)
J. Hair et al. Multivariate data analysis (1998)

There are more references available in the full text version of this article.

