Molten steel temperature prediction model based on Bootstrap Feature Subsets Ensemble Regression Trees
Introduction
Ladle furnace (LF) steel refining technology plays a substantial role in the secondary metallurgy process. When the molten steel treated in the LF is handed over to downstream secondary metallurgy units or to a continuous caster, the main purpose of LF refining is to obtain the qualified molten steel temperature and composition [1], [2], [3], [4], [5]. In practice, the molten steel temperature cannot be measured continuously, which makes precise control difficult to realize. Therefore, it is important to build a precise molten steel temperature prediction model.
Several molten steel temperature models of LF based on thermodynamics and the conservation of energy were developed in earlier works [6]. ``However, these models cannot be used efficiently for online accurate prediction because the parameters are hard to obtain. It is attributed to the harsh operating environment of ladle metallurgy, especially the high temperatures and corrosive slag associated with the process. Practically, some of the parameters are estimated according to experience. Consequently, it is hard to ensure the accuracy of mechanism models'' [1].
To overcome the limitations of mechanism models, various data-driven modeling methods have been applied to establish temperature prediction models. For example, Sun et al. [7] build a temperature model based on Neural Networks (NN). Later, a PLS-SVM based temperature model is proposed by Wang [8], where the input variables are first processed with the Partial Least Squares (PLS) algorithm to remove linear dependency, and then a Support Vector Machine (SVM) is used to establish the prediction model. However, these temperature models learn from only a single model, and their performance is hard to improve when the object is complicated or the samples are highly noisy [9].
Compared with a single model, an ensemble model can further improve the accuracy and the generalization [10], [11]. In the past decade, ensemble methods have been applied to establish temperature models. Tian & Mao [1] propose an ensemble temperature prediction model based on a modified AdaBoost.RT algorithm. Lv et al. [9] propose a hybrid model in which a pruned Bagging model based on negative correlation learning is used to predict the unknown parameters and the undefined function of the thermal model simultaneously.
Today, the accuracy and the generalization of most existing molten steel temperature models cannot satisfy the requirements of industrial production, because these models have been estimated on small-scale data. With the development of information and computer techniques, large-scale data are accumulated from the production process in LF. These data contain more useful information and make it possible to improve the accuracy and the generalization of the temperature prediction.
However, large-scale data impose more restrictions on modeling and increase the cost of building models [12]. Furthermore, high-complexity models increase the computational burden in the phase of actual use [13] and cannot be used efficiently for online accurate prediction. The existing ensemble temperature models are not suitable for large-scale data. In [1], the Extreme Learning Machine (ELM) [14] is selected as the base learner and the modified AdaBoost.RT is employed as the ensemble structure. Although the learning speed of the ELM is extremely fast, AdaBoost.RT is a serial ensemble in which every new sub-model relies on the previously built sub-models. In [9], the sub-models are also pruned one by one, which turns Bagging into a serial ensemble. A serial ensemble is often more complex than a single model [15], especially on large-scale data. Additionally, data sampled from the production process are noisy, which reduces the accuracy and the generalization of the temperature prediction.
To deal with the large-scale and noise issues, the Bootstrap Feature Subsets Ensemble Regression Trees (BFSE-RTs) method is proposed in this paper. Firstly, low-dimensional feature subsets are constructed based on the multivariate fuzzy Taylor theorem [16], which saves memory space in computers and means that ``smaller-scale'' data sets are used. Secondly, to eliminate the noisy data, the bootstrap sampling approach [17] for independent identically distributed data is introduced into the feature subsets, so that the bootstrap replications consist of smaller-scale and lower-dimensional samples. Thirdly, considering its simplicity, the Regression Tree (RT) [18] is employed as the base learner, and an RT is built on each bootstrap replication. The BFSE-RTs method is expected to make full use of the large-scale data accumulated from the production process in LF, to improve the accuracy and the generalization of the temperature prediction, and to meet the requirements that the Root Mean Square Error (RMSE) of the temperature prediction is less than 3 °C and its maximum error is less than 10 °C.
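The three steps above can be sketched in code. The snippet below is an illustrative outline only, not the authors' implementation: feature subsets are drawn at random (the paper constructs them via the multivariate fuzzy Taylor theorem), each base learner is a regression tree grown on a bootstrap replication of that subset, and predictions are averaged; all function names and parameter values are hypothetical.

```python
import numpy as np
from sklearn.tree import DecisionTreeRegressor

rng = np.random.default_rng(0)

def fit_bfse_rts(X, y, n_estimators=50, subset_size=5):
    """Fit one regression tree per (feature subset, bootstrap replication)."""
    models = []
    n, d = X.shape
    for _ in range(n_estimators):
        feats = rng.choice(d, size=subset_size, replace=False)  # low-dim feature subset
        rows = rng.integers(0, n, size=n)                       # bootstrap sample of rows
        tree = DecisionTreeRegressor(max_depth=4)
        tree.fit(X[np.ix_(rows, feats)], y[rows])
        models.append((feats, tree))
    return models

def predict_bfse_rts(models, X):
    """Ensemble output is the average over all regression trees."""
    preds = [tree.predict(X[:, feats]) for feats, tree in models]
    return np.mean(preds, axis=0)

# Synthetic demo with 10 input features, matching the dimensionality of the
# LF data described later in the paper (the data themselves are synthetic).
X = rng.normal(size=(500, 10))
y = 2.0 * X[:, 0] + np.sin(X[:, 1]) + rng.normal(scale=0.1, size=500)
models = fit_bfse_rts(X, y)
rmse = np.sqrt(np.mean((predict_bfse_rts(models, X) - y) ** 2))
```

Because each tree sees only a low-dimensional bootstrap replication, the sub-models are cheap to build and fully independent, so, unlike the serial ensembles discussed above, they could be trained in parallel.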
The remainder of this paper is organized as follows. Section 2 introduces the mechanism of the production process of LF. In Section 3, existing ensemble methods are reviewed. In Section 4, the BFSE-RTs method is proposed, and its differences from other ensembles are introduced. In Section 5, the experimental investigations are carried out, and the BFSE-RTs temperature model is compared with the FSE-RTs, the RF, the stacking trees, the modified AdaBoost.RT, and the pruned Bagging temperature models. In Section 6, the conclusions of this paper are summarized.
The mechanism of production process of LF
To establish the data-driven model of the temperature in LF, the energy change during the refining process in Fig. 1 is considered [1], [6], [14].
The refining process starts at the entry of LF and ends at the exit of LF. The data are sampled over the entire refining process, which is subdivided into many different sampling periods. In each sampling period, the same steps are executed: argon blowing and slag adding, power supply, and power off.
In any sampling period,
- (i)
Review of ensembles
Ensembles of learnt models constitute one of the main directions in machine learning and data mining [19]. An aggregation of many simple but different predictors can reduce the complexity and achieve better performance which cannot be achieved by a single model [11], [20]. Thus, we focus on building an ensemble temperature prediction model.
A training set L consists of data (X_i, Y_i), i = 1, …, N, where X_i ∈ Rⁿ is the n-dimensional input, Y_i ∈ R is the 1-dimensional output, and N is the size of L (i.e.
The FSE-RTs method
To handle the large-scale data, the Feature Subsets Ensemble (FSE) method, proposed in our previous paper [12] and based on the multivariate fuzzy Taylor theorem, is selected. In the FSE method, low-dimensional feature subsets are constructed, which saves memory space in computers and means that ``smaller-scale'' data sets are used.
Considering its simplicity, the RT [18] is employed as the base learner of the FSE model. In Fig. 2 the structure of an FSE-RTs model is shown. The generalized
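The FSE-RTs structure can be illustrated as follows. This is only a sketch under simplifying assumptions, not the construction from [12]: the features are partitioned into contiguous low-dimensional subsets (rather than subsets derived from the fuzzy Taylor theorem), one regression tree is grown per subset on the full training data, and the outputs are averaged.

```python
import numpy as np
from sklearn.tree import DecisionTreeRegressor

rng = np.random.default_rng(1)

# Synthetic stand-in data: 10 input features, 1 output.
X = rng.normal(size=(400, 10))
y = X[:, 2] - 0.5 * X[:, 7] + rng.normal(scale=0.1, size=400)

# Partition the 10 features into five 2-dimensional subsets (illustrative choice).
subsets = [list(range(i, i + 2)) for i in range(0, 10, 2)]

# One regression tree per feature subset, trained on all rows (no bootstrap):
# this is what distinguishes FSE-RTs from the bootstrap-based BFSE-RTs.
trees = [DecisionTreeRegressor(max_depth=4).fit(X[:, s], y) for s in subsets]

# The ensemble output averages the per-subset predictions.
y_hat = np.mean([t.predict(X[:, s]) for s, t in zip(subsets, trees)], axis=0)
```

Each tree stores and splits on only 2 of the 10 columns, which is the memory saving the ``smaller-scale'' data sets refer to.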
Experimentation
The number of input features is 10, and 1714 production data from a 300 t LF are used. Ninety-five percent of the data are randomly selected without replacement from the original data set to train the model, and the rest are used for testing. The data are normalized to the range [−1, 1].
For all estimators, we repeat this process 20 times and report the statistical results (i.e., the average and standard deviation) of the obtained performance. Each estimator is constructed using 10-fold
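The evaluation protocol described above can be outlined as below. This is a hypothetical sketch (synthetic data, a trivial placeholder predictor in place of the real models): per-column scaling into [−1, 1], a random 95/5 split without replacement, and 20 repetitions over which the mean and standard deviation of the RMSE are collected.

```python
import numpy as np

rng = np.random.default_rng(2)
data = rng.normal(size=(1714, 11))  # 10 input features + 1 output, as in the text

# Min-max normalization of every column into [-1, 1].
lo, hi = data.min(axis=0), data.max(axis=0)
scaled = 2.0 * (data - lo) / (hi - lo) - 1.0

rmses = []
for _ in range(20):
    idx = rng.permutation(len(scaled))          # random split without replacement
    n_train = int(0.95 * len(scaled))
    train, test = scaled[idx[:n_train]], scaled[idx[n_train:]]
    # A real experiment would fit an estimator on `train` here; as a
    # placeholder, predict the training-set mean of the output column.
    pred = np.full(len(test), train[:, -1].mean())
    rmses.append(np.sqrt(np.mean((test[:, -1] - pred) ** 2)))

mean_rmse, std_rmse = np.mean(rmses), np.std(rmses)
```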
Conclusions
The main contribution of this paper is to significantly improve the accuracy and the generalization of the temperature prediction in LF by the proposed BFSE-RTs method on the large-scale, noisy data accumulated from the production process.
On both the training and the testing sets, the RMSE and the maximum error of the BFSE-RTs temperature model are less than those of the FSE-RTs, the RF, the stacking trees, the modified AdaBoost.RT and the pruned Bagging temperature models. The BFSE-RTs
Acknowledgments
We would like to thank the Editor-in-Chief, the Associate Editor, and the anonymous reviewers for their careful reading of the manuscript and constructive comments.
This work is supported by the National Natural Science Foundation of China (Nos.61425002, 31370778, 613700057, 61300015, 31170797, 61103057), Program for Changjiang Scholars and Innovative Research Team in University (No.IRT_15R07), the Program for Liaoning Innovative Research Team in University (No.LT2015002), the Basic Research
References (36)
- et al., An ensemble ELM based on modified AdaBoost.RT algorithm for predicting the temperature of molten steel in ladle furnace, IEEE Trans. Autom. Sci. Eng. (2010)
- et al., Ladle furnace liquid steel temperature prediction model based on optimally pruned bagging, J. Iron Steel Res. Int. (2012)
- et al., End temperature prediction of molten steel in LF based on CBR, Steel Res. Int. (2012)
- et al., Multi-kernel learnt partial linear regularization network and its application to predict the liquid steel temperature in ladle furnace, Knowl. Based Syst. (2012)
- et al., Corrigendum to ``Multi-kernel learnt partial linear regularization network and its application to predict the liquid steel temperature in ladle furnace'', Knowl. Based Syst. (2013)
- et al., Steady state heat transfer of ladle furnace during steel production process, J. Iron Steel Res. Int. (2006)
- An intelligent ladle furnace control system
- The Research of LF Liquid Steel Temperature Prediction based on PLS-SVM Algorithm (2007)
- et al., Pruned bagging aggregated hybrid prediction models for forecasting the steel temperature in ladle furnace, Steel Res. Int. (2014)
- et al., Neural network ensembles, IEEE Trans. Pattern Anal. Mach. Intell. (1990)
- Random forests, Mach. Learn.
- Ensemble fixed-size LS-SVMs applied for the Mach number prediction in transonic wind tunnel, IEEE Trans. Aerospace Electron. Syst.
- Recurrent sparse support vector regression machines trained by active learning in the time-domain, Expert Syst. Appl.
- Hybrid modeling of molten steel temperature prediction in LF, ISIJ Int.
- Feature selection with ensembles, artificial variables, and redundancy elimination, J. Mach. Learn. Res.
- Fuzzy Mathematics: Approximation Theory
- Bootstrap methods: another look at the jackknife, Ann. Stat.