Comparison of Machine Learning Methods for Image Reconstruction Using the LSTM Classifier in Industrial Electrical Tomography

Kłosowski, Grzegorz; Rymarczyk, Tomasz; Niderla, Konrad; Rzemieniak, Magdalena; Dmowski, Artur; Maj, Michał

doi:10.3390/en14217269

¹ Faculty of Management, Lublin University of Technology, 20-618 Lublin, Poland

² Faculty of Transport and Computer Science, University of Economics and Innovation in Lublin, 20-209 Lublin, Poland

³ Research & Development Centre Netrix S.A., 20-704 Lublin, Poland

* Author to whom correspondence should be addressed.

Energies 2021, 14(21), 7269; https://doi.org/10.3390/en14217269

Submission received: 13 October 2021 / Revised: 28 October 2021 / Accepted: 29 October 2021 / Published: 3 November 2021

(This article belongs to the Special Issue Engineering and Medical Aspects of the Use of Electromagnetic Field Energy 2021)

Abstract

Electrical tomography is a non-invasive method of monitoring the interior of objects, which is used in various industries. In particular, it is possible to monitor industrial processes inside reactors and tanks using tomography. Tomography enables real-time observation of crystals or gas bubbles growing in a liquid. However, obtaining high-resolution tomographic images is problematic because it involves solving the so-called ill-posed inverse problem. Noisy input data cause problems, too. Therefore, the use of appropriate hardware solutions to eliminate this phenomenon is necessary. An important cause of inaccurate tomographic images may also be the incorrect selection of the algorithmic methods used to convert the measurements into the output images. In the dynamically changing environment of a tank reactor, selecting the optimal algorithmic method used to create a tomographic image becomes an optimization problem. This article presents an original concept of intelligently selecting the machine learning method depending on the reconstructed case. A long short-term memory network was used as a classifier to choose one of the five homogeneous methods: elastic net, linear regression with the least-squares learner, linear regression with support vector machine learner, support vector machine model, or artificial neural networks. In the presented research, tomographic images of selected measurement cases, reconstructed using the five methods, were compared. Then, the accuracy of the method selection performed by the long short-term memory classifier was verified. The results proved that the new concept of long short-term memory classification ensures better tomographic reconstruction efficiency than imaging all measurement cases with a single homogeneous method.

Keywords:

electrical tomography; industrial tomography; machine learning; neural networks; long short-term memory (LSTM) networks

1. Introduction

Industrial tank reactors serve an essential role in many processes that involve technology lines. A chemical reactor is a tank designed to carry out the reactions that occur within it. The purpose of industrial tank reactors is to ensure that the economic parameters of chemical processes are optimal [1]. It is possible due to the reactor’s optimal design and the skilful overlap of the three sub-processes occurring inside the reactor: mass, momentum, and heat transfer. Process control can thus be based on the dynamic selection of mixing intensity, temperature, pressure, substrate proportion, and others. The study described here applies to reactors that have interactions between solids and liquids and gas and liquids.

Monitoring the states of dynamic systems is done for two main reasons. The first is detecting approaching failures [2], which include damage to the technological infrastructure, excessively high deviations in crucial process parameters, or interruptions in the process's continuity. An effective monitoring system is intended to detect a problem early enough to take practical corrective steps.

The requirement to regulate the course of the industrial process is the second reason for utilizing industrial process monitoring [3]. It is critical for guaranteeing a high degree of quality. Effective monitoring methods must be employed to control multiphase processes, including chemicals that can change aggregate states dynamically. Given the aggressive conditions under which the reactions occur inside the reactor, this is a demanding task. The problems with using invasive sensors are the inability to directly examine any part of the reactor's interior, the limited accuracy of the measurements taken, the requirement to use multiple monitoring systems at the same time, and the high uncertainty in determining the dynamic state of the process based on incomplete data (indirect method). Electrical capacitance tomography (ECT) [4,5,6,7,8,9,10,11,12,13,14,15,16], electrical impedance tomography (EIT) [17,18,19,20,21,22,23,24], magnetoacoustic tomography [25], ultrasound and radio tomography [26,27], X-ray tomography [28], optical tomography [29,30], and other non-invasive technologies are used to monitor industrial operations. Recently, an increasing number of research papers in the field of industrial system operation have included the use of various computational methods, such as intelligent predictive methods [31], fuzzy logic [32,33], machine learning [34], numerical modelling [35], deep learning [18,36], and binary programming [37].

The present non-invasive monitoring techniques utilized in industrial processes do not fully meet the current operational needs. The images obtained of the phenomena and processes under examination are frequently blurry, ambiguous, and challenging to interpret, and they contain errors in the number of artefacts (crystals or gas bubbles) found in the reactor as well as in their sizes and locations. As a result, redundant systems are employed to gather exact information about the condition of the monitored process, increasing operational expenses dramatically. These challenges and flaws in monitoring chemical tank reactors create the need for improved methods. Implementing a better monitoring strategy will boost the reliability of the processes occurring inside the reactors while decreasing the running costs of industrial systems.

This study aims to present an improved method of monitoring and optimization of chemical processes in heterogeneous tank reactors, in which reactions occur between solid and liquid and gas and liquid. The applied method concerns electrical tomography, and the innovation is the original method of parallel use of many homogeneous machine learning methods and the long short-term memory (LSTM) classifier in selecting the optimal method for a given measurement case [38]. The advantage of the described algorithmic concept over other non-invasive methods is the ability to automatically adapt the method to a specific measurement case, increased resistance to interference arising during measurements, higher accuracy of reconstruction than with homogeneous methods, high imaging resolution, low cost of use, and high speed of operation. Furthermore, the use of the proposed concept enables automatic adjustment of the tomographic system to a specific measurement case. As a result, the system automatically responds to changes inside the tank reactor, selecting the method of converting the measurements into images depending on the measurement vector. The LSTM classifier performs this task. The other part of this study describes the original, intelligent hybrid system enabling effective monitoring of chemical reactions using process electrical tomography.

The Method Oriented Ensemble (MOE) concept presented in Figure 1 is a novelty here and, at the same time, constitutes the authors’ contribution. The vector of 96 measurements is the input to the LSTM classifier. The LSTM network generates one of five classes at the output, determining the optimal reconstruction method for a given measurement case. The values of all 2883 pixels of the image are generated by separately trained models of the selected homogeneous method. In addition, each pixel has its own specially trained machine learning model. In this way, the tomographic image is reconstructed according to the new MOE concept. Using the measurement vector as a sequence of input variables to the recurrent deep LSTM neural network is also new [39]. The research results showed the very high effectiveness of such an approach in terms of the classification/selection of the optimal homogeneous method within MOE.

The article is divided into four parts. The Introduction offers a synopsis of issues linked to industrial tomography, a review of known methods of imaging the interior of reactors and pipes, a description of several types of tomography, and the authors' contribution. The second part, Materials and Methods, discusses the research facility (a physical model of the reactor), the technique of gathering training data, the innovative method-oriented ensemble (MOE) idea, the five homogeneous methods utilized in the MOE, and the LSTM classifier. The third part comprises test results derived using both real and simulation data, together with a discussion of these results. Finally, part four presents an overview and synthesis of the most relevant components of the research work carried out and the results and conclusions obtained. It also includes information about upcoming research endeavours.

2. Materials and Methods

2.1. Research Object

The subject of the research is the physical model of the tank reactor. The main element of the model is a plastic cylinder around which 16 electrodes are placed. The cylinder’s diameter is 200 mm. The container was filled with tap water. Empty plastic tubes with a diameter of 20 mm are placed inside the cylinder. The task of the electrical impedance tomography (EIT) system was to reconstruct the cross-section of the tank correctly. The reconstruction’s quality is determined by the visibility and clarity of inclusions (tubes) in terms of diameter, shape, number and position of inclusions to each other and with the tank wall. Figure 2a shows the test stand with an electrical impedance tomograph connected to the reservoir electrodes. The Netrix S.A. Research and Development Center made the prototype of the EIT measuring device (tomograph). Figure 2b,c shows a plastic cylinder used as a physical model of a tank reactor. Synthetic tubes filled with air are immersed in the cylinder.

2.2. Data Preparation

Based on the above physical model, a dedicated simulation algorithm was developed to generate the learning cases used to train the machine learning systems. Each training case was generated with the assumption of homogeneity of the distribution of electrical conductivity. For the obtained conductivity distribution, the measurement voltages are determined by the finite element method. The EIDORS toolbox was used for this purpose [40]. In individual cases, the number of internal inclusions was selected randomly. It was assumed that the draw would yield a maximum of five objects, each of a round shape. The radius of the inclusions and their electrical conductivity correspond to the actual tests performed with the EIT system. In the next stage of calculations, the centre of each internal object is drawn.
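
The random drawing of inclusions described above can be sketched as follows. This is only an illustration under stated assumptions: the function names `draw_inclusions` and `conductivity_at` are hypothetical, at least one inclusion is drawn per case, centres are sampled uniformly so that each tube lies fully inside the tank, and overlapping inclusions are not rejected. The authors' actual generator, built on EIDORS, is not reproduced here.

```python
import math
import random

TANK_RADIUS_MM = 100.0  # cylinder diameter of 200 mm (Section 2.1)
TUBE_RADIUS_MM = 10.0   # tube diameter of 20 mm (Section 2.1)


def draw_inclusions(rng, max_count=5):
    """Draw 1..max_count round inclusions lying fully inside the tank.

    Assumptions: at least one inclusion per case, uniform sampling over
    the admissible disc, and no check for overlapping inclusions.
    """
    inclusions = []
    for _ in range(rng.randint(1, max_count)):
        # sqrt() makes the centres uniform over the disc area
        r = (TANK_RADIUS_MM - TUBE_RADIUS_MM) * math.sqrt(rng.random())
        phi = 2.0 * math.pi * rng.random()
        inclusions.append((r * math.cos(phi), r * math.sin(phi), TUBE_RADIUS_MM))
    return inclusions


def conductivity_at(x, y, inclusions, background=1.0, tube=1e-5):
    """Return the tube value inside any inclusion, else the background value."""
    for cx, cy, radius in inclusions:
        if math.hypot(x - cx, y - cy) <= radius:
            return tube
    return background


case = draw_inclusions(random.Random(42))
```

Evaluating `conductivity_at` at the centroid of every finite element would give the reference image of one training case; the corresponding voltages would still have to come from a forward solver.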

Figure 3 shows one of the 35,000 generated cases used to train the predictive system. The cross-section of the tank contains one random inclusion. The inclusion is visible as a variation in colour on the mesh of 2883 finite elements.

The colours correspond to conventional units correlated with the electrical conductivity of the interior of the tested object. Ninety-six voltage measurements were assigned to the conductivity values (Figure 3b). These are not the voltage values, but arbitrary units correlated with them. As the polarity of the electrodes changes during individual measurements, the voltage changes in the range (−0.3; +0.3). Gaussian noise with a standard deviation of 4% was added to each measurement value. Finite elements (pixels) in the image have the values 1 (for background) or $10^{-5}$ (for plastic tubes).
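
The noise step can be sketched as below. This is a minimal illustration, not the authors' code: the helper `add_relative_noise` is hypothetical, and it assumes that the 4% standard deviation is taken relative to the magnitude of each individual measurement, which the text does not state explicitly.

```python
import random


def add_relative_noise(measurements, rel_std=0.04, seed=None):
    """Perturb each value with zero-mean Gaussian noise.

    Assumption: the noise standard deviation equals rel_std times the
    absolute value of the measurement being perturbed.
    """
    rng = random.Random(seed)
    return [v + rng.gauss(0.0, rel_std * abs(v)) for v in measurements]


# Hypothetical 96-element measurement vector in arbitrary units within (-0.3, 0.3).
clean = [0.3 * ((i % 16) - 7.5) / 8.0 for i in range(96)]
noisy = add_relative_noise(clean, seed=0)
```

If the 4% were instead meant relative to the full measurement range, the per-value scaling `rel_std * abs(v)` would be replaced by a single constant.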

2.3. The Concept of the Method Oriented Ensemble (MOE)

The proposed novel paradigm necessitates the training of several machine learning models for each pixel individually. Because the total number of pixels in the case at hand is 2883, each of the five homogeneous approaches requires 2883 regression prediction models to be trained. Elastic Net (EN), Linear Regression with Least Squares learner (LR-LS), Linear Regression with Support Vector Machine learner (LR-SVM), Support Vector Machine (SVM), and Artificial Neural Network (ANN) are all used in the method-oriented ensemble (MOE) concept. The process of modelling the MOE concept is presented in Figure 4. The presented flowchart is consistent with Algorithm 1 and Figure 1.

Algorithm 1. The pseudocode algorithm for training MOE

m = 96 % number of measurements
n = 2883 % number of finite elements in reconstruction mesh (pixels)
Train n models f_{1} (x_{1 \dots m}) \to y_{1 \dots n} with method #1 (e.g., EN)
Train n models f_{2} (x_{1 \dots m}) \to y_{1 \dots n} with method #2 (e.g., LR-LS)
Train n models f_{3} (x_{1 \dots m}) \to y_{1 \dots n} with method #3 (e.g., LR-SVM)
Train n models f_{4} (x_{1 \dots m}) \to y_{1 \dots n} with method #4 (e.g., SVM)
Train n models f_{5} (x_{1 \dots m}) \to y_{1 \dots n} with method #5 (e.g., ANN)
% Assigning the RMSE for each method and pixel
for i = 1:5 % for 5 methods: EN, LR-LS, LR-SVM, SVM, ANN
    for j = 1:n % for n = 2883 pixels
        calculate RMSE(i, j) % root mean square error for the i-th method and j-th pixel
    end
    meanRMSE(i) = mean(RMSE(i,:)) % mean RMSE of the i-th method over all 2883 pixels
end
Prepare the training set to train the LSTM classifier. Inputs: 96 measurements. Output: 5 categories/classes.
Select the method with the lowest meanRMSE.
Reconstruct all n pixels using the selected method.
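
One plausible way to turn the RMSE comparison into class labels for the LSTM training set is sketched below. It assumes that, for each training case, the label is the method whose reconstructed image has the lowest RMSE against the reference image; the pseudocode only states that the method with the lowest mean RMSE is selected, so this per-case reading, the function names, and the toy numbers are assumptions.

```python
import math


def rmse(predicted, reference):
    """Root mean square error between two equal-length sequences."""
    squared = sum((p - r) ** 2 for p, r in zip(predicted, reference))
    return math.sqrt(squared / len(reference))


def best_method(reconstructions, reference):
    """Return the method with the lowest RMSE and all per-method errors.

    reconstructions: dict mapping method name -> list of pixel values
    reference: list of reference pixel values for the same case
    """
    errors = {name: rmse(img, reference) for name, img in reconstructions.items()}
    return min(errors, key=errors.get), errors


# Toy case with 4 pixels instead of 2883 (values are made up).
reference = [1.0, 1.0, 0.0, 1.0]
reconstructions = {
    "EN": [0.9, 1.1, 0.2, 1.0],
    "LR-LS": [0.8, 1.2, 0.4, 0.9],
    "ANN": [1.0, 1.0, 0.1, 1.0],
}
label, errors = best_method(reconstructions, reference)
```

Repeating this for every training case would yield pairs of a 96-element measurement vector and a class label, which is the form of data a five-class classifier needs.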

Algorithm 1 presents pseudocode for training a method-oriented ensemble (MOE) model. Based on a vector of 96 measurements, a specially trained LSTM classifier determines which of the five homogeneous approaches (EN, LR-LS, LR-SVM, SVM, or ANN) is best for a given measurement case. It is worth noting that training five predictive models for each of the 2883 pixels yields 14,415 models (5 × 2883), to which the LSTM classifier that selects the method must be added. The new MOE idea is therefore characterized by high computational complexity. However, compared to the considerable improvement in image reconstruction quality that can be attained, this is a minor drawback.

2.4. Elastic Net (EN)

When reconstructing tomographic images of real objects with low conductivity, the electrode data are frequently noisy. It is due to electrode insulation imperfections, the impacts of fast-varying, low-intensity currents created by multiplexers, the effects of electromagnetic fields, and various other reasons. Industrial reactors are another example of technological devices with a high noise level as determined by tomographic data. Electrical signal interference is one of the primary impediments to developing tomographic methods for such objects [41].

Elastic net regularization was used to make the reconstruction more robust to noise and distortions in the input data [42]. We begin with a linear system described by the equation of state $Y = X β + ε$, where $Y \in R^{n}$ is the vector of output variables (reconstruction), $X \in R^{n \times (k + 1)}$ is the matrix of input variables, $β \in R^{k + 1}$ denotes the vector of unknown parameters, and $ε \in R^{n}$ reflects the sequence of disturbances.

Elastic net is a compromise between the L₁ and L₂ norms or, to put it another way, between Robert Tibshirani's LASSO (Least Absolute Shrinkage and Selection Operator) and ridge regression, also known as Tikhonov regularization. The approach is also successful when there are numerous correlated predictors or when the number of discretized current elements is substantially larger than the number of measurement points. The elastic net is determined by the optimization problem (1):

\min_{(β_{0}, β^{'}) \in R^{k + 1}} \frac{1}{2 n} \sum_{i = 1}^{n} {(y_{i} - β_{0} - x_{i} β^{'})}^{2} + λ P_{α} (β^{'}),

(1)

where $(y_{i} - β_{0} - x_{i} β^{'})$ are the linear model residuals, $x_{i}$ is the vector of measurements, $y_{i}$ is the corresponding reference value, $β_{0}$ is the intercept, equal to the mean of the response variable, $β^{'}$ denotes the unknown parameters, $λ$ is the parameter that specifies the penalty for regularization, and $P_{α}$ is the elastic net penalty defined by (2):

P_{α} (β^{'}) = (1 - α) \frac{1}{2} {‖ β^{'} ‖}_{L_{2}}^{2} + α {‖ β^{'} ‖}_{L_{1}},

(2)

If the linear model in problem (1) includes an intercept (the point at which the regression line intersects the y axis), then the first column of the matrix X in the linear equation $Y = X β + ε$ is a column vector of ones. The elastic net penalty $P_{α}$ is a weighted combination of the L₁ and L₂ norms of the unknown parameters $β^{'}$, as shown by Equation (2). The trade-off between LASSO and ridge regression is represented by the parameter $0 \leq α \leq 1$: the value $α = 0$ gives pure ridge regression, whereas $α = 1$ gives pure LASSO.
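
The penalty and the objective of problem (1) can be evaluated with a few lines of code. The sketch below assumes the standard elastic net form, in which the L₂ term is the squared norm; the function names are illustrative and no solver is included.

```python
def elastic_net_penalty(beta, alpha):
    """P_alpha(beta) = (1 - alpha) / 2 * ||beta||_2^2 + alpha * ||beta||_1."""
    l1 = sum(abs(b) for b in beta)
    l2_squared = sum(b * b for b in beta)
    return (1.0 - alpha) * 0.5 * l2_squared + alpha * l1


def elastic_net_objective(X, y, beta0, beta, lam, alpha):
    """Objective of problem (1): squared residuals over 2n plus lam * penalty."""
    n = len(y)
    residuals = [
        yi - beta0 - sum(xij * bj for xij, bj in zip(xi, beta))
        for xi, yi in zip(X, y)
    ]
    data_term = sum(r * r for r in residuals) / (2.0 * n)
    return data_term + lam * elastic_net_penalty(beta, alpha)


beta = [0.5, -2.0, 0.0]
ridge_part = elastic_net_penalty(beta, alpha=0.0)  # 0.5 * (0.25 + 4.0) = 2.125
lasso_part = elastic_net_penalty(beta, alpha=1.0)  # 0.5 + 2.0 = 2.5
```

Setting `alpha` between 0 and 1 blends the two penalties, which is the trade-off the parameter α controls.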

2.5. Linear Regression with Least-Squares Learner (LR-LS)

A linear regression-based form of the approach was employed in the investigation, in which the SVM learner was replaced by a least-squares learner [43]. The algorithm employed in the LR-LS method is quite similar to that of the LR-SVM method described in Section 2.6. In this instance, LASSO regression with L₁ regularization was also utilized, as was the linear regression model without regularization. The distinction between LR-SVM and LR-LS is the loss function. For LR-LS, the loss function is the mean square error, which can be calculated as $\mathrm{MSE} = ℓ [y, f (x)] = \frac{1}{2} {[y - f (x)]}^{2}$ with the response range $y \in (- \infty, \infty)$. In LR-LS, the bias b equals the weighted median of y over all training cases multiplied by the number of training cases.

2.6. Linear Regression with Support Vector Machine Learner (LR-SVM)

The LR-SVM algorithm combines linear regression (LR) with the support vector machine (SVM) approach [44]. The algorithm has been tuned to work best with the multidimensional vectors of data provided as input. It employs the L₁ (LASSO) regularization technique, in which the absolute value of the coefficient magnitudes is added to the loss function as a penalty component. In this study, the "learner" was a linear regression model based on the SVM approach [45]. The linear regression model has the form $f (x) = x β + b$, where β denotes a vector of p coefficients, x denotes an observation of p predictor variables, and b denotes a scalar bias. The loss function in the implemented algorithm is the epsilon-insensitive loss, $ℓ [y, f (x)] = \max [0, | y - f (x) | - ε]$, where $y \in (- \infty, \infty)$ is the response value. The LASSO cost function is represented by Equation (3):

\min_{b, β} (\frac{\sum_{i = 1}^{n} {(y_{i} - b - x_{i}^{T} β)}^{2}}{2 n} + λ \sum_{j = 1}^{p} | β_{j} |),

(3)

where n is the number of observations, $x_{i}^{T}$ is the transposed vector of length p at observation i, and $y_{i}$ is the value of the reconstructed pixel for observation i. The regularization parameter $λ$ must be non-negative; in this study, $λ = 1 / n$. The parameters b and $β$ are a scalar bias and a vector of length p, respectively. The number of non-zero $β$ coefficients decreases as $λ$ grows. The LR-SVM combines regularized support vector machine and least-squares regression methods. The model minimizes the objective function by employing stochastic gradient descent (SGD), reducing the time required for computation. A ridge penalty is applied to support vector machines in the outlined method, which is then optimized using a dual SGD for SVM. The criterion for terminating the iteration process is $‖ \frac{B_{t} - B_{t - 1}}{B_{t}} ‖ < ϰ$, where $ϰ$ is the relative tolerance on the linear coefficients $β_{t}^{'}$ and the bias term $b_{t}$, and $B_{t} = [β_{t}^{'}, b_{t}]$.
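
The difference between the two linear learners comes down to the loss function, and the stopping rule is a simple relative-change test. The sketch below illustrates both; it is not the authors' implementation. It assumes that the division in the stopping criterion is element-wise, that the norm is Euclidean, and that no entry of $B_{t}$ is zero.

```python
import math


def squared_loss(y, fx):
    """Least-squares loss of LR-LS: 0.5 * (y - f(x))^2."""
    return 0.5 * (y - fx) ** 2


def epsilon_insensitive_loss(y, fx, eps):
    """SVM regression loss of LR-SVM: max(0, |y - f(x)| - eps)."""
    return max(0.0, abs(y - fx) - eps)


def converged(B_t, B_prev, tol):
    """Relative-change stopping test ||(B_t - B_prev) / B_t|| < tol.

    Assumptions: element-wise division, Euclidean norm, and every
    entry of B_t is non-zero.
    """
    rel = [(bt - bp) / bt for bt, bp in zip(B_t, B_prev)]
    return math.sqrt(sum(r * r for r in rel)) < tol


small_error = epsilon_insensitive_loss(1.0, 0.95, eps=0.1)  # inside the tube
large_error = epsilon_insensitive_loss(1.0, 0.5, eps=0.1)   # penalized linearly
```

Errors smaller than ε cost nothing under the epsilon-insensitive loss, whereas the squared loss penalizes every deviation.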

2.7. Support Vector Machine (SVM)

Support Vector Machines (SVM) are based on the premise that a decision space may be partitioned by constructing boundaries that separate objects belonging to different classes. Both regression and classification problems may be solved using SVM, a standard machine learning method. Vladimir Vapnik and his colleagues first proposed this idea in 1992. There are four types of SVM: classification type 1 (C-SVM), classification type 2 ($ν$-SVM), regression type 1 ($ε$-SVM), and regression type 2 ($ν$-SVM). Every support vector model falls into one of these four categories. We use SVM regression analysis to find the functional dependence between a dependent variable y and its independent variable x. Regression analysis assumes the relationship $y = f (x) + noise$, that is, a deterministic dependence $f (x)$ with some random noise added on top of it. The main goal is to determine the form of the function f that best predicts the dependent variable's value in new cases that the SVM model has not previously "seen". The SVM model is trained on a set of learning cases. In the proposed concept, each SVM subsystem is responsible for generating a single pixel value. The total number of trained SVM models in an EIT system is equal to the output image resolution, which gives $(96 \to SVM \to 1) \times 2883$. The technique utilized in this study implements the regression type 2 problem (also known as $ν$-SVM), in which the error function (4) is minimized [44]:

\frac{1}{2} w^{T} w + C (ν ε + \frac{1}{N} \sum_{i = 1}^{N} (ξ_{i} + ξ_{i}^{*})),

(4)

under the following conditions (5):

\begin{cases} (w^{T} Φ (x_{i}) + b) - y_{i} \leq ε + ξ_{i} \\ y_{i} - (w^{T} Φ (x_{i}) + b) \leq ε + ξ_{i}^{*} \\ ξ_{i}, ξ_{i}^{*} \geq 0, \; i = 1, \dots, N, \; ε \geq 0 \end{cases},

(5)

where $ε$ and $ν$ are the penalty parameters, C is the capacity constant, $w$ is a vector of coefficients, b is a constant, and $ξ_{i}, ξ_{i}^{*}$ are the slack parameters that handle overlapping cases. The index i runs over the N learning examples, the independent variables are represented by $x_{i}$, and the regression targets are represented by $y_{i}$. The input data are mapped to a new feature space via the kernel mapping $Φ$. It is important to note that C significantly impacts the error, and its value must be carefully chosen to avoid overfitting the model. Each of the SVM subsystems was trained using 4000 training cases.

2.8. Artificial Neural Network (ANN)

Artificial neural networks in the form of a multilayer perceptron were employed in the investigation. The collection of 35,000 cases was split into three subsets (training, validation, and testing) in a 70:15:15 ratio. The training subset contains cases that have been validated and assessed. As a result, 2883 models were trained, which equals the number of pixels of the tomographic image. To optimize the models, error backpropagation with the conjugate gradient method was used. The structure of a single ANN dedicated to each of the 2883 pixels is 96-10-1: ninety-six measurements are input to the network, ten neurons are in the hidden layer, and one regression neuron is in the output layer. The transfer function of the hidden layer is the hyperbolic tangent $\tanh (x) = \frac{e^{2 x} - 1}{e^{2 x} + 1}$. The output layer uses a linear activation function.
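
The forward pass of a single 96-10-1 perceptron is small enough to write out directly. The following sketch uses randomly initialized weights purely for illustration; the initialization range, the function names, and the constant test input are assumptions, and no training is performed.

```python
import math
import random


def init_mlp(n_in=96, n_hidden=10, seed=0):
    """Random small weights for a 96-10-1 perceptron (illustrative only)."""
    rng = random.Random(seed)
    W1 = [[rng.uniform(-0.1, 0.1) for _ in range(n_in)] for _ in range(n_hidden)]
    b1 = [0.0] * n_hidden
    W2 = [rng.uniform(-0.1, 0.1) for _ in range(n_hidden)]
    b2 = 0.0
    return W1, b1, W2, b2


def forward(x, params):
    """Predict one pixel value: tanh hidden layer, linear output neuron."""
    W1, b1, W2, b2 = params
    hidden = [
        math.tanh(sum(w * xi for w, xi in zip(row, x)) + b)
        for row, b in zip(W1, b1)
    ]
    return sum(w * h for w, h in zip(W2, hidden)) + b2


params = init_mlp()
pixel = forward([0.1] * 96, params)
```

A full image would require 2883 such networks, one per pixel, each fed with the same 96-element measurement vector.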

2.9. The Long Short-Term Memory (LSTM) Network for Classification

The classification of measurement cases to homogeneous methods within MOE was accomplished using a deep LSTM network. This method differs from the previous ones because it can learn long-term dependencies between time steps in a time series or other sequence data. Using the measurement vector as a sequence of input variables to the recurrent deep LSTM neural network is a novel technique. The study's findings demonstrated that such a strategy is highly effective in selecting the best homogeneous method within MOE.

Figure 5 illustrates the flow of a time series X with C features (channels) of length S through an LSTM layer. The output (also known as the hidden state) and the cell state at time step t are denoted by $h_{t}$ and $c_{t}$, respectively [46]. The first LSTM block uses the network's initial state and the first time step of the sequence to compute the first output and the updated cell state. At time step t, the block uses the current state of the network $(c_{t - 1}, h_{t - 1})$ and the next time step of the sequence to compute the output $h_{t}$ and the updated cell state $c_{t}$.

The hidden state (also known as the output state) and the cell state are the two states that make up the state of the layer. The output of the LSTM layer for the time step t is stored in the hidden state at time step t. In each time step, the cell state contains information that has been gained from the preceding time steps. Thus, at each time step, the layer either adds information to the cell state or removes information from it. The layer manages these changes through the use of gates. The cell state and hidden state of the layer are controlled by the following four components: the input gate (i), which controls the level of cell state update; the forget gate (f), which controls the level of cell state reset (forget); the cell candidate (g), which adds information to the cell state; and the output gate (o), which controls the level of cell state added to the hidden state.

The flow of data at time step t is depicted in Figure 6. The diagram illustrates how the gates forget, update, and output the cell and hidden states and how they interact with one another. An LSTM network with seven layers was used.

The gates regulate the information held in the cell state. The learnable parameters of the layer are the input weights W, the recurrent weights R, and the biases b, which are concatenated from the per-component blocks as shown in Equation (6).

W = \begin{bmatrix} W_{i} \\ W_{f} \\ W_{g} \\ W_{o} \end{bmatrix}, \quad R = \begin{bmatrix} R_{i} \\ R_{f} \\ R_{g} \\ R_{o} \end{bmatrix}, \quad b = \begin{bmatrix} b_{i} \\ b_{f} \\ b_{g} \\ b_{o} \end{bmatrix} .

(6)

The symbols i, f, g, and o signify the input gate, forget gate, cell candidate, and output gate, respectively. The state of a cell at a particular time step t is denoted as $c_{t} = f_{t} ⊙ c_{t - 1} + i_{t} ⊙ g_{t}$, where ⊙ represents the Hadamard product, or in other words, element-wise multiplication of vectors. At time step t, the hidden state is defined as $h_{t} = o_{t} ⊙ σ_{c} (c_{t})$, where $σ_{c}$ is the state activation function. Equation (7) defines the LSTM layer's components at time step t,

\begin{cases} i_{t} = σ_{g} (W_{i} x_{t} + R_{i} h_{t - 1} + b_{i}) \\ f_{t} = σ_{g} (W_{f} x_{t} + R_{f} h_{t - 1} + b_{f}) \\ g_{t} = σ_{c} (W_{g} x_{t} + R_{g} h_{t - 1} + b_{g}) \\ o_{t} = σ_{g} (W_{o} x_{t} + R_{o} h_{t - 1} + b_{o}) \end{cases},

(7)

where $σ_{g}$ denotes the gate activation function. The sigmoid activation function was employed in both BiLSTM layers (see Table 1). It can be expressed as $σ (x) = {(1 + e^{- x})}^{- 1}$.
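
A single time step of the recurrences in Equations (6) and (7) can be sketched in plain Python. The dictionary layout of the weights and the zero-valued toy parameters are assumptions made for the example; tanh is used as the state activation and the sigmoid as the gate activation.

```python
import math


def sigmoid(x):
    """Gate activation: 1 / (1 + exp(-x))."""
    return 1.0 / (1.0 + math.exp(-x))


def matvec(M, v):
    """Multiply a matrix (list of rows) by a vector."""
    return [sum(m * x for m, x in zip(row, v)) for row in M]


def lstm_step(x_t, h_prev, c_prev, W, R, b):
    """One LSTM time step.

    W, R, b are dicts keyed by 'i', 'f', 'g', 'o' holding the input
    weights, recurrent weights and biases of each component.
    """
    def pre(k):
        wx = matvec(W[k], x_t)
        rh = matvec(R[k], h_prev)
        return [a + r + bias for a, r, bias in zip(wx, rh, b[k])]

    i_t = [sigmoid(z) for z in pre("i")]
    f_t = [sigmoid(z) for z in pre("f")]
    g_t = [math.tanh(z) for z in pre("g")]
    o_t = [sigmoid(z) for z in pre("o")]
    c_t = [f * c + i * g for f, c, i, g in zip(f_t, c_prev, i_t, g_t)]
    h_t = [o * math.tanh(c) for o, c in zip(o_t, c_t)]
    return h_t, c_t


# Toy layer: 1 input feature, 2 hidden units, all parameters zero.
zeros_W = {k: [[0.0], [0.0]] for k in "ifgo"}
zeros_R = {k: [[0.0, 0.0], [0.0, 0.0]] for k in "ifgo"}
zeros_b = {k: [0.0, 0.0] for k in "ifgo"}
h, c = lstm_step([0.3], [0.0, 0.0], [1.0, -1.0], zeros_W, zeros_R, zeros_b)
```

With all parameters at zero, every gate equals 0.5 and the cell candidate equals 0, so the step simply halves the previous cell state. Feeding the 96 measurements one value at a time through such a step is one way to read the measurement vector as a sequence; a bidirectional layer would run a second pass in reverse order.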

For neural networks, there are no rigid criteria for choosing network parameters (e.g., the number of layers, the number of hidden units in LSTM layers, normalization requirements, or the weight and bias initialization functions) for specific types of problems. Accordingly, the network and training parameters are selected empirically, which was also the case here. Table 1 presents the parameters of the neural network used to classify the homogeneous methods; it has two bidirectional LSTM (BiLSTM) layers with 128 hidden units each. The experiments demonstrated that a smaller number of hidden units worsens network quality, while increasing the number of units or adding further layers extends the learning process without improving the quality of the LSTM network.

The first layer of the LSTM model is the sequence input layer, which feeds sequence data into the network. The second is the bidirectional LSTM (BiLSTM) layer, which learns long-term dependencies between the time steps of a signal or sequence in both directions (forward and backward). These dependencies are significant when the network needs to learn from the complete time series at each time step. The third is the batch normalization layer, which normalizes the input data across all observations for each channel independently. Placed between learnable layers and nonlinear operations, batch normalization speeds up training and reduces the sensitivity to network initialization. The fourth layer is again a BiLSTM layer. The fifth layer is fully connected: it multiplies the numerical input values by a weight matrix and adds a vector of biases. The sixth is the softmax layer, which applies the softmax function to its input. For classification problems, a softmax layer and a classification layer commonly follow the final fully connected layer. The last is the classification layer. It computes the cross-entropy loss for classification and weighted classification tasks with mutually exclusive classes and infers the number of classes from the output size of the previous layer.

In deep convolutional networks, one or more fully connected layers follow the convolution and downsampling layers. When the input to a fully connected layer is a sequence, as with LSTM, the fully connected layer acts on each time step independently. If the output of the layer preceding the fully connected layer is an array A₁ of dimension X by Y by Z, then the output of the fully connected layer is an array A₂ of size X' by Y by Z. The element of A₂ at time step t is ${WA}_{t} + b$, where $A_{t}$ is time step t of A₁, W is the weight matrix, and b is the bias. The Glorot initializer was used to generate the weights for this layer in this research [47]. The softmax is the penultimate layer. It is a common type of layer in deep classification neural networks. A fully connected layer always precedes the softmax layer. The formula $y_{r} (x) = e^{a_{r} (x)} / \sum_{j = 1}^{k} e^{a_{j} (x)}$ denotes the softmax activation function, with $0 \leq y_{r} \leq 1$ and $\sum_{j = 1}^{k} y_{j} = 1$.
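
The softmax formula above maps the five outputs of the fully connected layer to class probabilities. The following sketch shows the computation and the resulting method choice; the scores are made up, and subtracting the maximum before exponentiation is a common numerical safeguard rather than something stated in the text.

```python
import math


def softmax(a):
    """Softmax with max-subtraction for numerical stability."""
    m = max(a)
    exps = [math.exp(v - m) for v in a]
    total = sum(exps)
    return [e / total for e in exps]


METHODS = ["EN", "LR-LS", "LR-SVM", "SVM", "ANN"]
scores = [0.2, -1.0, 0.5, 3.0, 1.1]  # made-up fully connected layer outputs
probs = softmax(scores)
chosen = METHODS[probs.index(max(probs))]
```

The class with the highest probability determines which homogeneous method reconstructs the image for the given measurement vector.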

For classification problems with mutually exclusive classes, the final layer computes the cross-entropy loss. To ensure a sufficient number of training sets, aggregation of measurement data with comparable properties was performed. The LSTM network was trained using the adaptive moment estimation (ADAM) algorithm [48]. The following parameters apply to the BiLSTM layer: tanh function for state activation, sigmoid function for gate activation, mini-batch size = 100, starting learning rate = 0.01, sequence length = 96 (longest), gradient threshold = 1. The parameters listed above were determined empirically. Various model variants featuring a dropout layer, with dropout probabilities from 0.1 to 0.5, were examined. The experiments revealed that adding dropout layers did not affect the network's generalization in the scenario analyzed. Therefore, we opted against incorporating a dropout layer into the proposed prediction model. The training progress of the LSTM network using a raw input is depicted in Figure 7.

Cross-entropy and accuracy were used to assess the LSTM model’s quality. Accuracy is defined as the proportion of correctly identified observations across all instances (8).

\mathrm{Accuracy} = \frac{N_{c}}{N} \cdot 100\%,

(8)

where N_c is the number of correctly reconstructed pixels, and N is the total number of pixels [49]. Equation (9) defines the cross-entropy loss between network predictions and target values

\mathrm{Loss} = - \frac{1}{N} \sum_{i = 1}^{M} T_{i} \log (X_{i}),

(9)

where N is the number of observations, M is the number of responses, T_i are the target values, and X_i are the network outputs. The training-progress plot confirms that training proceeded correctly. It shows the classification accuracy of each mini-batch, which approaches 100% when training progresses well. At the end of the training procedure, the classifier’s accuracy oscillates between 99% and 100%. Training took approximately 12 min. The computation was carried out on a personal computer configured as follows: 2.80 GHz Intel® Core™ i5-8400 CPU, 16 GB RAM, NVIDIA GeForce RTX 2070 GPU. Parallel computing with GPU was used [50,51].
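Equations (8) and (9) can be illustrated with a short NumPy sketch. It assumes one-hot encoded targets and softmax outputs arranged as N observations by M classes, with the sum in Equation (9) taken over both observations and classes; the function names and the clipping constant are choices made for this example.

```python
import numpy as np

def accuracy_percent(predicted, target):
    """Equation (8): share of correct items, expressed in percent."""
    predicted = np.asarray(predicted)
    target = np.asarray(target)
    return 100.0 * float(np.mean(predicted == target))

def cross_entropy(T, X, eps=1e-12):
    """Equation (9): T holds one-hot targets, X holds softmax outputs, both (N, M)."""
    T = np.asarray(T, dtype=float)
    X = np.asarray(X, dtype=float)
    n_observations = T.shape[0]
    # Clipping keeps log() finite when an output is exactly zero.
    return float(-np.sum(T * np.log(np.clip(X, eps, 1.0))) / n_observations)

targets = np.eye(2)                           # two observations, two classes
uniform_outputs = np.full((2, 2), 0.5)        # an uninformative classifier
loss_uniform = cross_entropy(targets, uniform_outputs)  # equals ln(2)
loss_perfect = cross_entropy(targets, targets)          # equals 0
```

The uninformative classifier gives a loss of ln 2 per observation, while a perfect prediction gives zero, which is the value the training loss approaches when training goes well.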

The LSTM learning process, using the measurement vector as the input signal, is illustrated together with the loss indicator in Figure 8. The graph depicts the training loss, which is the cross-entropy loss for each mini-batch. When training proceeds correctly, the loss should approach zero. The shape of this plot corroborates the information in Figure 7.

Both Figure 7 and Figure 8 show a very high quality of the LSTM classifier. The curve shapes, resembling a logarithm and a hyperbola without fluctuations, indicate that the learning process proceeded correctly.

Figure 9 shows the confusion matrix of the LSTM classifier on the set of 1000 test cases. As can be seen, the EN method was the most difficult to classify correctly: 13.6% of these cases were incorrectly categorized as the ANN method. The cumulative accuracy for the entire testing set was 98.2%.

3. Results and Discussion

Experiments were carried out to reconstruct EIT measurements based on real and simulation data to verify the effectiveness of the new MOE method. The first experiment visualized the inside of a physical model for three different scenarios. In the first case, two tubes were immersed in the water. In the second case, the position of the tubes was changed, and their number was increased to three. Finally, the third case included four tubes arranged symmetrically at equal distances from one another and from the walls of the water container.

Reconstructions based on real measurements do not allow for objective comparisons because reference images are lacking. Therefore, a series of experiments based on simulation data was also carried out to enable an objective, index-based comparison of individual reconstructions. In this way, the quality of the proposed MOE method was verified using error, deviation and correlation indicators.

3.1. Visualizations of Real Measurements

Figure 10 shows a schematic diagram of the physical model of the tested tank. Figure 10a shows a dimensioned side view of the cylinder with a tube submerged. Figure 10b–d show top views of the test tank with three cases of differently distributed inclusions. The considered measurement cases contain 2, 3 and 4 tubes, respectively, immersed in the cylinder. The diameter of the tested cross-section of the tank model is 32.5 cm. The diameter of each plastic tube submerged in the water is 28 mm. Sixteen electrodes were evenly spaced around the cylinder at a height of 14 cm from the bottom.
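The electrode layout implied by these dimensions can be worked out with a small sketch. The tank diameter and electrode count are taken from the text; placing the first electrode at angle zero, numbering counter-clockwise, and positioning the electrodes exactly on the circle of the tested cross-section are assumptions made only for illustration.

```python
import math

TANK_DIAMETER_CM = 32.5   # diameter of the tested cross-section
N_ELECTRODES = 16         # electrodes evenly spaced around the cylinder

def electrode_positions(diameter_cm=TANK_DIAMETER_CM, n=N_ELECTRODES):
    """Return (x, y) coordinates in cm of n evenly spaced electrodes."""
    radius = diameter_cm / 2.0
    return [
        (radius * math.cos(2.0 * math.pi * k / n),
         radius * math.sin(2.0 * math.pi * k / n))
        for k in range(n)
    ]

positions = electrode_positions()
angular_step_deg = 360.0 / N_ELECTRODES                   # 22.5 degrees
arc_spacing_cm = math.pi * TANK_DIAMETER_CM / N_ELECTRODES  # about 6.38 cm
```

With these assumptions, neighbouring electrodes are 22.5 degrees apart, which corresponds to roughly 6.4 cm along the circumference.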

Figure 11 shows the reconstructions of the three measurement cases presented earlier in Figure 10. The LSTM classifier indicated the SVM method in all tested cases. Real measurements always contain a certain level of noise, which makes such reconstructions more difficult than imaging simulated measurements. In Figure 11a, an artefact appears in the central part of the tank, but it is less pronounced than the images of the two tubes. The other reconstructions (Figure 11b,c) are successful.

3.2. Comparison of the Reconstructions Based on Simulation Data

Four widely used metrics were used to evaluate the quality of tomographic reconstructions objectively: Root Mean Square Error (RMSE), Relative Image Error (RIE), Mean Absolute Percentage Error (MAPE), and Image Correlation Coefficient (ICC). The root mean square error is calculated using Equation (10)

\mathrm{RMSE} = \sqrt{\frac{\sum_{i = 1}^{n} (y_{i} - \hat{y}_{i})^{2}}{n}},

(10)

where n is the number of finite elements or the picture resolution,

y_{i}

denotes the i-th pixel’s pattern conductivity, and

{\hat{y}}_{i}

denotes the reconstructed conductivity of the same pixel. RIE is calculated as in Equation (11)

\mathrm{RIE} = \frac{\| y^{'} - y \|}{\| y \|},

(11)

where

y

is the ground-truth (reference) conductivity distribution and

y^{'}

is the reconstructed conductivity distribution. MAPE is calculated using the following Equation (12)

\mathrm{MAPE} = \frac{1}{n} \sum_{i = 1}^{n} \left| \frac{y_{i} - \hat{y}_{i}}{y_{i}} \right| .

(12)

The ICC metric is described by Equation (13)

ICC = \frac{\sum_{i = 1}^{n} (y_{i} - \bar{y}) ({\hat{y}}_{i} - \bar{\hat{y}})}{\sqrt{\sum_{i = 1}^{n} {(y_{i} - \bar{y})}^{2} \sum_{i = 1}^{n} {({\hat{y}}_{i} - \bar{\hat{y}})}^{2}}},

(13)

where

\bar{y}

denotes the mean of the reference (ground-truth) conductivity distribution and

\bar{\hat{y}}

denotes the mean of the reconstructed EIT conductivity distribution.
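Equations (10) to (13) translate directly into NumPy. The sketch below is a hedged illustration rather than the authors' code: the Euclidean norm in RIE is an assumption (the text does not name the norm), and the four-pixel reference image is a hypothetical toy that only borrows the background value of 1 and the inclusion value of 10⁻⁵ mentioned in the paper.

```python
import numpy as np

def rmse(y, y_hat):
    """Equation (10): root mean square error."""
    return float(np.sqrt(np.mean((y - y_hat) ** 2)))

def rie(y, y_hat):
    """Equation (11): relative image error (Euclidean norm assumed)."""
    return float(np.linalg.norm(y_hat - y) / np.linalg.norm(y))

def mape(y, y_hat):
    """Equation (12): mean absolute percentage error, as a fraction."""
    return float(np.mean(np.abs((y - y_hat) / y)))

def icc(y, y_hat):
    """Equation (13): image correlation coefficient."""
    dy = y - y.mean()
    dh = y_hat - y_hat.mean()
    return float(np.sum(dy * dh) / np.sqrt(np.sum(dy ** 2) * np.sum(dh ** 2)))

# Toy reference: three background pixels (1) and one inclusion pixel (1e-5).
y_ref = np.array([1.0, 1.0, 1.0, 1e-5])
y_rec = y_ref + 0.01            # reconstruction with a uniform offset of 0.01

metrics = {
    "RMSE": rmse(y_ref, y_rec),
    "RIE": rie(y_ref, y_rec),
    "MAPE": mape(y_ref, y_rec),
    "ICC": icc(y_ref, y_rec),
}
```

In this toy case the offset of 0.01 yields an RMSE of 0.01 and an ICC of 1, yet the MAPE is about 250 because the single inclusion pixel contributes a relative error of 1000. This shows why MAPE becomes very large when reference values are close to zero.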

The lower the RMSE, RIE, and MAPE values, the higher the tomographic image quality. For ICC, the opposite holds: ICC = 1 indicates perfect reconstruction, whereas ICC = 0 indicates the worst-case scenario. Reconstructions based on simulation measurements are shown in Figure 12. The first row of Figure 12 shows the reference images of the tested model. Reconstructions using the homogeneous EN, LR-LS, LR-SVM, SVM and ANN methods are shown in the following rows.

Table 2 corresponds with Figure 12. Table 2 presents quantitative indicators that enable the comparison of individual homogeneous methods included in the new MOE concept. The best values of indicators are marked in blue.

Visual (subjective) inspection of the images in Figure 12 suggests that, for the first measurement case with a single inclusion, the best reconstructions were obtained by the EN and ANN methods. According to the indicators in Table 2, the best method is ANN. The second measurement case is not so clear-cut. According to the RMSE and RIE indicators in Table 2, the best reconstruction is SVM; according to MAPE, it is ANN; and according to ICC, it is EN. Notably, no indicator selected the LR-LS or LR-SVM methods, which also look visually worse than the others. The third case, with three inclusions, is best reconstructed by the LR-SVM method according to RMSE and RIE, by SVM according to MAPE, and by EN according to ICC. Visual observation confirms this choice. In the last, fourth measurement case, all indicators in Table 2 indicated the superiority of the LR-SVM method. The SVM method fares very similarly, although slightly worse, which visual observation of Figure 12 confirms. Since the purpose of tomography is not a precise measurement of electrical conductivity inside the tested objects but an accurate and quick visualization of inclusions, the MAPE index, which reflects the absolute values of the patterns and reconstructions, should be treated as optional or supplementary. The large values of the MAPE index result from the 2883 pixels included in the tomographic image and from the fact that the reference background values are “1” while the inclusions are close to zero (precisely 10⁻⁵).
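The per-indicator choice of the best homogeneous method can be reproduced from Table 2 with a few lines of NumPy. The RMSE and ICC values below are copied from Table 2; the helper name `best_method` is hypothetical and introduced only for this sketch.

```python
import numpy as np

METHODS = ["EN", "LR-LS", "LR-SVM", "SVM", "ANN"]

# Rows are measurement cases 1 to 4, columns follow METHODS (values from Table 2).
RMSE = np.array([
    [0.289, 0.133, 0.145, 0.135, 0.068],
    [0.299, 0.1519, 0.123, 0.117, 0.132],
    [0.304, 0.157, 0.077, 0.084, 0.140],
    [0.317, 0.135, 0.076, 0.085, 0.152],
])
ICC = np.array([
    [0.875, 0.536, 0.392, 0.530, 0.914],
    [0.905, 0.709, 0.843, 0.853, 0.794],
    [0.986, 0.826, 0.960, 0.953, 0.870],
    [0.950, 0.900, 0.969, 0.962, 0.876],
])

def best_method(values, higher_is_better=False):
    """Return the best method name for every case (row)."""
    idx = values.argmax(axis=1) if higher_is_better else values.argmin(axis=1)
    return [METHODS[i] for i in idx]

best_by_rmse = best_method(RMSE)                        # lower RMSE is better
best_by_icc = best_method(ICC, higher_is_better=True)   # higher ICC is better
```

The resulting lists agree with the last column of Table 2: ANN, SVM, LR-SVM, LR-SVM for RMSE, and ANN, EN, EN, LR-SVM for ICC.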

Figure 13 shows how many of the 1000 testing cases were assigned by the LSTM classifier to each homogeneous method. As can be seen, in the test set under consideration, the LR-SVM method was indicated most often, while the LR-LS method was chosen least frequently. It should be noted that the method quality indicator for the LSTM classifier was the RMSE.

Impedance tomography is a promising field of science for identifying inclusions inside industrial reactors, containers, and pipes [52,53]. The presented research findings substantiate this assertion, although the complexity of the model that converts the measurement data into the output image and the need to solve the ill-posed inverse problem make it difficult to develop an effective and universal technique. The novel MOE approach described in this article uses considerable computational resources to maximize the resulting image’s quality. With the ongoing development of new homogeneous mathematical methods, hybrid (ensemble) methods should not be overlooked. In this regard, the MOE technique is more advanced. From the start, MOE makes every effort to fully use the information available in the measurement data. Because distinct trained models are used for each pixel in the output image, each pixel’s value, and hence its colour, is calculated using the entire vector of 96 measurements.

Very good results were obtained thanks to the unconventional deployment of a recurrent LSTM network. LSTM is most often used to process time series and sequences. We therefore treated the measurement vector as a collection of structured data (a sequence) with a single step in our research. This technique proved correct, as indicated by the near-perfect classification of homogeneous methods using the MOE method.
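How a bidirectional LSTM layer processes a measurement vector presented as a single-step sequence can be sketched as follows. This is a simplified illustration, not the trained network: the weights are random, the gate ordering (input, forget, candidate, output) is an assumption, and the helper names are hypothetical. The sizes (96 inputs, 128 hidden units, 256 activations) and the tanh and sigmoid activations follow the description in the paper.

```python
import numpy as np

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

def lstm_step(x, h_prev, c_prev, W, R, b):
    """One LSTM time step. W: (4H, D), R: (4H, H), b: (4H,)."""
    H = h_prev.size
    z = W @ x + R @ h_prev + b
    i = sigmoid(z[0:H])            # input gate
    f = sigmoid(z[H:2 * H])        # forget gate
    g = np.tanh(z[2 * H:3 * H])    # cell candidate (state activation)
    o = sigmoid(z[3 * H:4 * H])    # output gate
    c = f * c_prev + i * g         # new cell state
    h = o * np.tanh(c)             # new hidden state
    return h, c

def random_params(rng, input_size, hidden):
    """Random placeholder weights; a trained model would supply real ones."""
    W = rng.standard_normal((4 * hidden, input_size)) * 0.1
    R = rng.standard_normal((4 * hidden, hidden)) * 0.1
    b = np.zeros(4 * hidden)
    return W, R, b

def bilstm_single_step(x, fwd, bwd, hidden):
    """With one time step, both directions see the same input from a zero state."""
    zeros = np.zeros(hidden)
    h_f, _ = lstm_step(x, zeros, zeros, *fwd)
    h_b, _ = lstm_step(x, zeros, zeros, *bwd)
    return np.concatenate([h_f, h_b])

rng = np.random.default_rng(0)
x = rng.random(96)                                   # one vector of 96 measurements
features = bilstm_single_step(
    x, random_params(rng, 96, 128), random_params(rng, 96, 128), hidden=128
)
```

The concatenated output has 256 entries, matching the number of activations listed for the BiLSTM layers in Table 1.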

Differences in image reconstruction across separate models are due to the different methods, but not exclusively. For instance, even when one homogeneous approach (e.g., elastic net or SVM) is used with the same input vector for all pixels, reconstructing an image with 2883 pixels requires tuning the parameters and hyperparameters of the models trained for the individual finite elements (pixels).

The objective of machine learning methods is to extract the essential information from input data, that is, the information that can be used to create the output image successfully. There are numerous ways of obtaining it. Preprocessing can be used to clean, standardize, encode and decode, augment or reduce, and extract features from data. It turns out that, to get the highest possible reconstruction quality, the number of iterations and model variants can also be multiplied. This approach is superior to typical preprocessing procedures in that it is applicable to practically any problem. As a result, the tomographic images of the inclusion distribution inside the reactors are more precise, more legible, and of higher resolution. This is one of the benefits of the novel MOE concept described in this paper.

4. Conclusions

This publication presents a new algorithmic MOE method that optimizes EIT reconstructions in industrial applications by simultaneously using several machine learning methods. Moreover, the method permits visualization of the distribution of inclusions inside the tested reactor.

The fundamental aspect that distinguishes MOE from other methods is the high chance of increasing the quality of reconstructed images without using new, previously unknown homogeneous methods. MOE is a concept of making greater use of computing resources and of the available alternative methods. To apply the novel MOE methodology, it is enough to train a few models that use existing prediction methods, such as SVM, linear regression, neural networks, decision trees, elastic net, LASSO, k-nearest neighbours, etc., and then use them as described in this article. Regardless of how many methods are employed in the MOE, it is almost certain that the results achieved will be better than if each homogeneous approach were used alone. An interesting finding is the unusual use of the LSTM network as a method classifier. This was made possible by treating the measurement vector as a sequence of measurements.

Future studies will be carried out to improve the MOE approach with increased use of deep learning and recurrent neural networks.

Author Contributions

Development of the system concept, measurement methodology, image reconstruction and supervision, T.R.; Development of software for mathematical models and the measurement concept of electrical impedance tomography for industrial tomography research, G.K.; Development of a measurement concept in a real model of a tank reactor, preparation of a measuring station, measurements, development of measurement methodology, preparation of descriptions in the article, M.M. and A.D.; Development of the numerical methods and techniques, K.N.; Literature review, formal analysis, general review, and editing of the manuscript, M.R. All authors have read and agreed to the published version of the manuscript.

Funding

This research received no external funding.

Institutional Review Board Statement

Not applicable.

Informed Consent Statement

Not applicable.

Data Availability Statement

Kłosowski, Grzegorz (2021), “Training dataset”, Mendeley Data, V1, doi: 10.17632/4hjt9dzcvd.1.

Conflicts of Interest

The authors declare no conflict of interest.

References

1. Park, S.; Na, J.; Kim, M.; Lee, J.M. Multi-objective Bayesian optimization of chemical reactor design using computational fluid dynamics. Comput. Chem. Eng. 2018, 119, 25–37.
2. Kosicka, E.; Kozłowski, E.; Mazurkiewicz, D. Intelligent Systems of Forecasting the Failure of Machinery Park and Supporting Fulfilment of Orders of Spare Parts. In Proceedings of the First International Conference on Intelligent Systems in Production Engineering and Maintenance ISPEM, Wrocław, Poland, 28–29 September 2017; pp. 54–63.
3. Wang, M. Industrial Tomography: Systems and Applications; Elsevier: Amsterdam, The Netherlands, 2015; ISBN 1782421181.
4. Wajman, R.; Banasiak, R.; Babout, L. On the Use of a Rotatable ECT Sensor to Investigate Dense Phase Flow: A Feasibility Study. Sensors 2020, 20, 4854.
5. Ye, Z.; Banasiak, R.; Soleimani, M. Planar array 3D electrical capacitance tomography. Insight Non-Destr. Test. Cond. Monit. 2013, 55, 675–680.
6. Mosorov, V.; Rybak, G.; Sankowski, D. Plug Regime Flow Velocity Measurement Problem Based on Correlability Notion and Twin Plane Electrical Capacitance Tomography: Use Case. Sensors 2021, 21, 2189.
7. Romanowski, A.; Chaniecki, Z.; Koralczyk, A.; Woźniak, M.; Nowak, A.; Kucharski, P.; Jaworski, T.; Malaya, M.; Rózga, P.; Grudzień, K. Interactive Timeline Approach for Contextual Spatio-Temporal ECT Data Investigation. Sensors 2020, 20, 4793.
8. Voss, A.; Pour-Ghaz, M.; Vauhkonen, M.; Seppänen, A. Retrieval of the saturated hydraulic conductivity of cement-based materials using electrical capacitance tomography. Cem. Concr. Compos. 2020, 112, 103639.
9. Garbaa, H.; Jackowska-Strumiłło, L.; Grudzień, K.; Romanowski, A. Application of electrical capacitance tomography and artificial neural networks to rapid estimation of cylindrical shape parameters of industrial flow structure. Arch. Electr. Eng. 2016, 65, 657–669.
10. Grudzien, K.; Chaniecki, Z.; Romanowski, A.; Sankowski, D.; Nowakowski, J.; Niedostatkiewicz, M. Application of twin-plane ECT sensor for identification of the internal imperfections inside concrete beams. In Proceedings of the 2016 IEEE International Instrumentation and Measurement Technology Conference, Taipei, Taiwan, 23–26 May 2016; pp. 1–6.
11. Kryszyn, J.; Smolik, W.T.; Radzik, B.; Olszewski, T.; Szabatin, R. Switchless charge-discharge circuit for electrical capacitance tomography. Meas. Sci. Technol. 2014, 25, 115009.
12. Kryszyn, J.; Smolik, W. Toolbox for 3D modelling and image reconstruction in electrical capacitance tomography. Inform. Control. Meas. Econ. Environ. Prot. 2017, 7, 137–145.
13. Majchrowicz, M.; Kapusta, P.; Jackowska-Strumiłło, L.; Sankowski, D. Acceleration of image reconstruction process in the electrical capacitance tomography 3D in heterogeneous, multi-GPU system. Inform. Control. Meas. Econ. Environ. Prot. 2017, 7, 37–41.
14. Romanowski, A. Big Data-Driven Contextual Processing Methods for Electrical Capacitance Tomography. IEEE Trans. Ind. Inform. 2019, 15, 1609–1618.
15. Soleimani, M.; Mitchell, C.N.; Banasiak, R.; Wajman, R.; Adler, A. Four-dimensional electrical capacitance tomography imaging using experimental data. Prog. Electromagn. Res. 2009, 90, 171–186.
16. Banasiak, R.; Wajman, R.; Jaworski, T.; Fiderek, P.; Fidos, H.; Nowakowski, J.; Sankowski, D. Study on two-phase flow regime visualization and identification using 3D electrical capacitance tomography and fuzzy-logic classification. Int. J. Multiph. Flow 2014, 58, 1–14.
17. Dusek, J.; Hladky, D.; Mikulka, J. Electrical impedance tomography methods and algorithms processed with a GPU. In Proceedings of the 2017 Progress in Electromagnetics Research Symposium - Spring (PIERS), St. Petersburg, Russia, 22–25 May 2017; pp. 1710–1714.
18. Kłosowski, G.; Rymarczyk, T. Using neural networks and deep learning algorithms in electrical impedance tomography. Inform. Autom. Pomiary Gospod. Ochr. Sr. 2017, 7, 99–102.
19. Voutilainen, A.; Lehikoinen, A.; Vauhkonen, M.; Kaipio, J.P. Three-dimensional nonstationary electrical impedance tomography with a single electrode layer. Meas. Sci. Technol. 2010, 21, 35107.
20. Duraj, A.; Korzeniewska, E.; Krawczyk, A. Classification algorithms to identify changes in resistance. Przegląd Elektrotechn. 2015, 1, 82–84.
21. Szczesny, A.; Korzeniewska, E. Selection of the method for the earthing resistance measurement. Przegląd Elektrotechn. 2018, 94, 178–181.
22. Dusek, J.; Mikulka, J. Measurement-Based Domain Parameter Optimization in Electrical Impedance Tomography Imaging. Sensors 2021, 21, 2507.
23. Chen, B.; Abascal, J.; Soleimani, M. Extended Joint Sparsity Reconstruction for Spatial and Temporal ERT Imaging. Sensors 2018, 18, 4014.
24. Liu, S.; Huang, Y.; Wu, H.; Tan, C.; Jia, J. Efficient Multitask Structure-Aware Sparse Bayesian Learning for Frequency-Difference Electrical Impedance Tomography. IEEE Trans. Ind. Inform. 2021, 17, 463–472.
25. Ziolkowski, M.; Gratkowski, S.; Zywica, A.R. Analytical and numerical models of the magnetoacoustic tomography with magnetic induction. COMPEL Int. J. Comput. Math. Electr. Electron. Eng. 2018, 37, 538–548.
26. Rymarczyk, T.; Adamkiewicz, P.; Polakowski, K.; Sikora, J. Effective ultrasound and radio tomography imaging algorithm for two-dimensional problems. Przegląd Elektrotechn. 2018, 94, 62–69.
27. Liang, G.; Dong, F.; Kolehmainen, V.; Vauhkonen, M.; Ren, S. Nonstationary Image Reconstruction in Ultrasonic Transmission Tomography Using Kalman Filter and Dimension Reduction. IEEE Trans. Instrum. Meas. 2021, 70, 1–12.
28. Babout, L.; Grudzień, K.; Wiącek, J.; Niedostatkiewicz, M.; Karpiński, B.; Szkodo, M. Selection of material for X-ray tomography analysis and DEM simulations: Comparison between granular materials of biological and non-biological origins. Granul. Matter 2018, 20, 38.
29. Korzeniewska, E.; Sekulska-Nalewajko, J.; Gocławski, J.; Dróżdż, T.; Kiełbasa, P. Analysis of changes in fruit tissue after the pulsed electric field treatment using optical coherence tomography. Eur. Phys. J. Appl. Phys. 2020, 91, 30902.
30. Sekulska-Nalewajko, J.; Gocławski, J.; Korzeniewska, E. A method for the assessment of textile pilling tendency using optical coherence tomography. Sensors 2020, 20, 3687.
31. Sobaszek, Ł.; Gola, A.; Świć, A. Predictive Scheduling as a Part of Intelligent Job Scheduling System; Springer: Cham, Switzerland, 2018; pp. 358–367.
32. Kłosowski, G.; Gola, A.; Świć, A. Application of fuzzy logic controller for machine load balancing in discrete manufacturing system. In Proceedings of the International Conference on Intelligent Data Engineering and Automated Learning, Wroclaw, Poland, 14–16 October 2015; Volume 9375, pp. 256–263.
33. Rymarczyk, T.; Kłosowski, G.; Kozłowski, E. A Non-Destructive System Based on Electrical Tomography and Machine Learning to Analyze the Moisture of Buildings. Sensors 2018, 18, 2285.
34. Rymarczyk, T.; Kłosowski, G.; Hoła, A.; Sikora, J.; Wołowiec, T.; Tchórzewski, P.; Skowron, S. Comparison of Machine Learning Methods in Electrical Tomography for Detecting Moisture in Building Walls. Energies 2021, 14, 2777.
35. Lopato, P.; Chady, T.; Sikora, R.; Gratkowski, S.; Ziolkowski, M. Full wave numerical modelling of terahertz systems for nondestructive evaluation of dielectric structures. COMPEL Int. J. Comput. Math. Electr. Electron. Eng. 2013, 32, 736–749.
36. Psuj, G. Multi-Sensor Data Integration Using Deep Learning for Characterization of Defects in Steel Elements. Sensors 2018, 18, 292.
37. Kozłowski, E.; Mazurkiewicz, D.; Kowalska, B.; Kowalski, D. Binary linear programming as a decision-making aid for water intake operators. In Proceedings of the First International Conference on Intelligent Systems in Production Engineering and Maintenance ISPEM, Wrocław, Poland, 28–29 September 2017; Advances in Intelligent Systems and Computing Series. Springer: Cham, Switzerland, 2018; Volume 637, pp. 199–208.
38. Yuen, B.; Dong, X.; Lu, T. Inter-Patient CNN-LSTM for QRS Complex Detection in Noisy ECG Signals. IEEE Access 2019, 7, 169359–169370.
39. Saadatnejad, S.; Oveisi, M.; Hashemi, M. LSTM-Based ECG Classification for Continuous Monitoring on Personal Wearable Devices. IEEE J. Biomed. Heal. Inform. 2019, 24, 515–523.
40. Adler, A.; Lionheart, W.R.B. Uses and abuses of EIDORS: An extensible software base for EIT. Physiol. Meas. 2006, 27, S25–S42.
41. Rymarczyk, T.; Kłosowski, G.; Kozłowski, E.; Tchórzewski, P. Comparison of Selected Machine Learning Algorithms for Industrial Electrical Tomography. Sensors 2019, 19, 1521.
42. Kryszyn, J.; Wanta, D.M.; Smolik, W.T. Gain Adjustment for Signal-to-Noise Ratio Improvement in Electrical Capacitance Tomography System EVT4. IEEE Sens. J. 2017, 17, 8107–8116.
43. Xiao, L. Dual averaging methods for regularized stochastic learning and online optimization. J. Mach. Learn. Res. 2010, 11, 2543–2596.
44. Ho, C.H.; Lin, C.J. Large-scale linear support vector regression. J. Mach. Learn. Res. 2012, 13, 3323–3348.
45. Hsieh, C.J.; Chang, K.W.; Lin, C.J.; Keerthi, S.S.; Sundararajan, S. A dual coordinate descent method for large-scale linear SVM. In Proceedings of the 25th International Conference on Machine Learning, Helsinki, Finland, 5–9 July 2008.
46. Beale, M.H.; Hagan, M.T.; Demuth, H.B. Deep Learning Toolbox User’s Guide; The Mathworks Inc.: Herborn, Germany, 2018.
47. Glorot, X.; Bengio, Y. Understanding the difficulty of training deep feedforward neural networks. In Proceedings of the Thirteenth International Conference on Artificial Intelligence and Statistics, Sardinia, Italy, 13–15 May 2010; pp. 249–256.
48. Kingma, D.P.; Ba, J.L. Adam: A method for stochastic optimization. In Proceedings of the 3rd International Conference on Learning Representations, ICLR 2015 - Conference Track, San Diego, CA, USA, 7–9 May 2015.
49. Kłosowski, G.; Rymarczyk, T.; Kania, K.; Świć, A.; Cieplak, T. Maintenance of industrial reactors supported by deep learning driven ultrasound tomography. Eksploat. Niezawodn. 2020, 22, 138–147.
50. Mikulka, J. GPU-Accelerated Reconstruction of T2 Maps in Magnetic Resonance Imaging. Meas. Sci. Rev. 2015, 15, 210–218.
51. Majchrowicz, M.; Kapusta, P.; Jackowska-Strumiłło, L.; Banasiak, R.; Sankowski, D. Multi-GPU, Multi-Node Algorithms for Acceleration of Image Reconstruction in 3D Electrical Capacitance Tomography in Heterogeneous Distributed System. Sensors 2020, 20, 391.
52. Rzasa, M.R.; Czapla-Nielacna, B. Analysis of the Influence of the Vortex Shedder Shape on the Metrological Properties of the Vortex Flow Meter. Sensors 2021, 21, 4697.
53. Shi, X.; Tan, C.; Dong, F.; Santos, E.N.D.; Da Silva, M.J. Conductance Sensors for Multiphase Flow Measurement: A Review. IEEE Sens. J. 2021, 21, 12913–12925.

Figure 1. Principle of the Method Oriented Ensemble (MOE) concept.

Figure 2. The key elements of the test stand: (a)—electrical impedance tomograph connected to the electrodes; (b)—a physical model of a tank with plastic tubes immersed in water, (c)—cylinder diameter in (cm).

Figure 3. A sample training measurement case generated with the Eidors toolbox: (a)—cross-section with a visible inclusion; (b)—values of 96 measurements corresponding to the cross-section containing the inclusion.

Figure 4. The flowchart of new concept (MOE) modelling.

Figure 5. LSTM network structure for classification of raw electrical measurements.

Figure 6. Gates interaction in the LSTM network [46].

Figure 7. Training accuracy for LSTM network.

Figure 8. Training loss for LSTM network.

Figure 9. Confusion matrix for the testing set of the LSTM classification.

Figure 10. Real measurements: (a)—dimensioned test stand for real cases. Three variants of the arrangement of phantoms in the tested tank with 16 electrodes: (b)—2 phantoms, (c)—3 phantoms, (d)—4 phantoms.

Figure 11. Image reconstructions based on real measurements: (a)—a case with two inclusions, (b)—a case with three inclusions, (c)—a case with four inclusions.

Figure 12. Image reconstructions based on simulation measurements.

Figure 13. Results of the selections made by the LSTM classifier on a set of 1000 testing cases.

Table 1. Layers of the LSTM network for classification with activations, learnables, and states.

#	Layer Description	Activations	Learnable Parameters (Weights and Biases)	Total Learnables	States
1	Sequence input with 96 dimensions	96	-	0	-
2	BiLSTM with 128 hidden units	256	Input weights: 1024 × 96; Recurrent weights: 1024 × 128; Bias: 1024 × 1	230,400	Hidden state: 256 × 1; Cell state: 256 × 1
3	Batch normalization	256	Offset: 256 × 1; Scale: 256 × 1	512	-
4	BiLSTM with 128 hidden units	256	Input weights: 1024 × 256; Recurrent weights: 1024 × 128; Bias: 1024 × 1	394,240	Hidden state: 256 × 1; Cell state: 256 × 1
5	Fully connected layer	5	Weights: 5 × 256; Bias: 5 × 1	1285	-
6	Softmax	5	-	0	-
7	Classification output (cross entropy)	5	-	0	-
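The learnable-parameter totals in Table 1 follow from the listed weight shapes, and the short sketch below recomputes them. It assumes four gates per LSTM direction and two directions per BiLSTM layer, which is consistent with the 1024-row weight matrices listed for 128 hidden units; the function names are hypothetical.

```python
def bilstm_learnables(input_size, hidden_units):
    """Input weights + recurrent weights + biases of a bidirectional LSTM layer."""
    rows = 4 * hidden_units * 2          # 4 gates x hidden units x 2 directions
    return rows * input_size + rows * hidden_units + rows

def fully_connected_learnables(input_size, output_size):
    """Weight matrix plus bias vector."""
    return output_size * input_size + output_size

def batch_norm_learnables(channels):
    """One offset and one scale per channel."""
    return 2 * channels

layer_totals = {
    "BiLSTM 1 (96 inputs)": bilstm_learnables(96, 128),
    "Batch normalization": batch_norm_learnables(256),
    "BiLSTM 2 (256 inputs)": bilstm_learnables(256, 128),
    "Fully connected": fully_connected_learnables(256, 5),
}
network_total = sum(layer_totals.values())
```

For the first BiLSTM layer this gives 1024 × 96 + 1024 × 128 + 1024 = 230,400, and for the second 1024 × 256 + 1024 × 128 + 1024 = 394,240. Together with the batch normalization (512) and fully connected (1285) layers, the network has 626,437 learnable parameters under these assumptions.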

Table 2. Reconstruction quality indicators for homogenous methods—testing set. The best values of indicators are marked in blue.

Case Number	Indicator	Methods of Reconstruction					Best Homogenous Method (MOE Concept)
Case Number	Indicator	EN	LR-LS	LR-SVM	SVM	ANN	Best Homogenous Method (MOE Concept)
1	RMSE	0.289	0.133	0.145	0.135	0.068	ANN
	RIE	0.293	0.135	0.147	0.137	0.069	ANN
	MAPE	1610.2	1880.7	2142.1	1988.4	126.6	ANN
	ICC	0.875	0.536	0.392	0.530	0.914	ANN
2	RMSE	0.299	0.1519	0.123	0.117	0.132	SVM
	RIE	0.306	0.155	0.126	0.120	0.135	SVM
	MAPE	2845.92	2659.4	2191.2	1990.1	1087.3	ANN
	ICC	0.905	0.709	0.843	0.853	0.794	EN
3	RMSE	0.304	0.157	0.077	0.084	0.140	LR-SVM
	RIE	0.318	0.164	0.081	0.087	0.147	LR-SVM
	MAPE	4467.8	2935.4	846.1	835.3	1239.5	SVM
	ICC	0.986	0.826	0.960	0.953	0.870	EN
4	RMSE	0.317	0.135	0.076	0.085	0.152	LR-SVM
	RIE	0.336	0.143	0.081	0.090	0.161	LR-SVM
	MAPE	5952.1	2536.1	916.7	988.2	1584.6	LR-SVM
	ICC	0.950	0.900	0.969	0.962	0.876	LR-SVM

Publisher’s Note: MDPI stays neutral with regard to jurisdictional claims in published maps and institutional affiliations.

© 2021 by the authors. Licensee MDPI, Basel, Switzerland. This article is an open access article distributed under the terms and conditions of the Creative Commons Attribution (CC BY) license (https://creativecommons.org/licenses/by/4.0/).

