Molten steel temperature prediction model based on Bootstrap Feature Subsets Ensemble Regression Trees
Introduction
Ladle furnace (LF) steel refining technology plays a substantial role in the secondary metallurgy process. When the molten steel treated in the LF is handed over to downstream secondary metallurgy units or to a continuous caster, the main purpose of LF refining is to obtain the qualified molten steel temperature and composition [1], [2], [3], [4], [5]. In practice, the molten steel temperature cannot be measured continuously, which makes precise control difficult to realize. Therefore, it is important to build a precise molten steel temperature prediction model.
Several molten steel temperature models of LF based on thermodynamics and the conservation of energy were developed in earlier works [6]. ``However, these models cannot be used efficiently for online accurate prediction because the parameters are hard to obtain. It is attributed to the harsh operating environment of ladle metallurgy, especially the high temperatures and corrosive slag associated with the process. Practically, some of the parameters are estimated according to experience. Consequently, it is hard to ensure the accuracy of mechanism models'' [1].
To overcome the limitations of mechanism models, various data-driven modeling methods have been applied to establish temperature prediction models. For example, Sun et al. [7] build a temperature model based on Neural Networks (NN). Later, a PLS-SVM based temperature model is proposed by Wang [8], where the input variables are first processed with the Partial Least Squares (PLS) algorithm to remove linear dependency, and then a Support Vector Machine (SVM) is used to establish the prediction model. However, these temperature models learn from only a single model, and their performance is hard to improve when the object is complicated or the samples are highly noisy [9].
Compared with a single model, an ensemble model can further improve the accuracy and the generalization [10], [11]. In the past decade, ensemble methods have been applied to establish temperature models. Tian & Mao [1] propose an ensemble temperature prediction model based on a modified AdaBoost.RT algorithm. Lv et al. [9] propose a hybrid model in which a pruned Bagging model based on negative correlation learning is used to predict the unknown parameters and the undefined function of the thermal model simultaneously.
Today, the accuracy and the generalization of most existing molten steel temperature models cannot satisfy the requirements of industrial production, because these models have been estimated on small-scale data. With the development of information and computer techniques, large-scale data are accumulated from the production process in LF. These data contain more useful information and make it possible to improve the accuracy and the generalization of the temperature prediction.
However, large-scale data impose more restrictions on modeling and increase the cost of building models [12]. Furthermore, high-complexity models increase the computational burden in the phase of actual use [13] and cannot be used efficiently for online accurate prediction. The existing ensemble temperature models are not suitable for large-scale data. In [1], the Extreme Learning Machine (ELM) [14] is selected as the base learner and the modified AdaBoost.RT is employed as the ensemble structure. Although the learning speed of the ELM is extremely fast, AdaBoost.RT is a serial ensemble in which every new sub-model relies on the previously built sub-models. In [9], the sub-models are also pruned one by one, which turns Bagging into a serial ensemble. A serial ensemble is often more complex than a single model [15], especially on large-scale data. Additionally, data sampled from the production process are noisy, which reduces the accuracy and the generalization of the temperature prediction.
To deal with the large-scale and noise issues, the Bootstrap Feature Subsets Ensemble Regression Trees (BFSE-RTs) method is proposed in this paper. Firstly, low-dimensional feature subsets are constructed based on the multivariate fuzzy Taylor theorem [16], which saves memory space in computers and means that ``smaller-scale'' data sets are used. Secondly, to eliminate the noisy data, the bootstrap sampling approach [17] for independent identically distributed data is introduced into the feature subsets, so that the bootstrap replications consist of smaller-scale and lower-dimensional samples. Thirdly, considering its simplicity, the Regression Tree (RT) [18] is employed as the base learner, and an RT is built on each bootstrap replication. The BFSE-RTs method is expected to make full use of the large-scale data accumulated from the production process in LF, to improve the accuracy and the generalization of the temperature prediction, and to meet the requirements that the Root Mean Square Error (RMSE) of the temperature prediction is less than 3 °C and its maximum error is less than 10 °C.
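The three steps above can be sketched in code. The snippet below is an illustrative outline only, not the authors' implementation: feature subsets are drawn at random (the paper constructs them via the multivariate fuzzy Taylor theorem), each base learner is a regression tree grown on a bootstrap replication of that subset, and predictions are averaged; all function names and parameter values are hypothetical.

```python
import numpy as np
from sklearn.tree import DecisionTreeRegressor

rng = np.random.default_rng(0)

def fit_bfse_rts(X, y, n_estimators=50, subset_size=5):
    """Fit one regression tree per (feature subset, bootstrap replication)."""
    models = []
    n, d = X.shape
    for _ in range(n_estimators):
        feats = rng.choice(d, size=subset_size, replace=False)  # low-dim feature subset
        rows = rng.integers(0, n, size=n)                       # bootstrap sample of rows
        tree = DecisionTreeRegressor(max_depth=4)
        tree.fit(X[np.ix_(rows, feats)], y[rows])
        models.append((feats, tree))
    return models

def predict_bfse_rts(models, X):
    """Ensemble output is the average over all regression trees."""
    preds = [tree.predict(X[:, feats]) for feats, tree in models]
    return np.mean(preds, axis=0)

# Synthetic demo with 10 input features, matching the dimensionality of the
# LF data described later in the paper (the data themselves are synthetic).
X = rng.normal(size=(500, 10))
y = 2.0 * X[:, 0] + np.sin(X[:, 1]) + rng.normal(scale=0.1, size=500)
models = fit_bfse_rts(X, y)
rmse = np.sqrt(np.mean((predict_bfse_rts(models, X) - y) ** 2))
```

Because each tree sees only a low-dimensional bootstrap replication, the sub-models are cheap to build and fully independent, so, unlike the serial ensembles discussed above, they could be trained in parallel.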
The remainder of this paper is organized as follows. Section 2 introduces the mechanism of the production process of LF. In Section 3, existing ensemble methods are reviewed. In Section 4, the BFSE-RTs method is proposed, and its differences from other ensembles are introduced. In Section 5, the experimental investigations are carried out, and the BFSE-RTs temperature model is compared with the FSE-RTs, the RF, the stacking trees, the modified AdaBoost.RT, and the pruned Bagging temperature models. In Section 6, the conclusions of this paper are summarized.
The mechanism of production process of LF
To establish the data-driven model of the temperature in LF, the energy change during the refining process in Fig. 1 is considered [1], [6], [14].
The refining process starts at the entry of LF and ends at the exit of LF. The data are sampled over the entire refining process, which is subdivided into many different sampling periods. In each sampling period, the same steps are executed: argon blowing and slag adding, power supply, and power off.
In any sampling period,
- (i)
Review of ensembles
Ensembles of learnt models constitute one of the main directions in machine learning and data mining [19]. An aggregation of many simple but different predictors can reduce the complexity and achieve better performance which cannot be achieved by a single model [11], [20]. Thus, we focus on building an ensemble temperature prediction model.
A training set L consists of data (X_i, Y_i), i = 1, …, N, where X_i ∈ Rⁿ is the n-dimensional input, Y_i ∈ R is the 1-dimensional output, and N is the size of L (i.e.
The FSE-RTs method
To handle the large-scale data, the Feature Subsets Ensemble (FSE) method, proposed in our previous paper [12] and based on the multivariate fuzzy Taylor theorem, is selected. In the FSE method, low-dimensional feature subsets are constructed, which saves memory space in computers and means that ``smaller-scale'' data sets are used.
Considering its simplicity, the RT [18] is employed as the base learner of the FSE model. In Fig. 2 the structure of an FSE-RTs model is shown. The generalized
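The FSE-RTs structure can be illustrated as follows. This is only a sketch under simplifying assumptions, not the construction from [12]: the features are partitioned into contiguous low-dimensional subsets (rather than subsets derived from the fuzzy Taylor theorem), one regression tree is grown per subset on the full training data, and the outputs are averaged.

```python
import numpy as np
from sklearn.tree import DecisionTreeRegressor

rng = np.random.default_rng(1)

# Synthetic stand-in data: 10 input features, 1 output.
X = rng.normal(size=(400, 10))
y = X[:, 2] - 0.5 * X[:, 7] + rng.normal(scale=0.1, size=400)

# Partition the 10 features into five 2-dimensional subsets (illustrative choice).
subsets = [list(range(i, i + 2)) for i in range(0, 10, 2)]

# One regression tree per feature subset, trained on all rows (no bootstrap):
# this is what distinguishes FSE-RTs from the bootstrap-based BFSE-RTs.
trees = [DecisionTreeRegressor(max_depth=4).fit(X[:, s], y) for s in subsets]

# The ensemble output averages the per-subset predictions.
y_hat = np.mean([t.predict(X[:, s]) for s, t in zip(subsets, trees)], axis=0)
```

Each tree stores and splits on only 2 of the 10 columns, which is the memory saving the ``smaller-scale'' data sets refer to.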
Experimentation
The number of input features is 10, and 1714 production data from a 300 t LF are used. Ninety-five percent of the data are randomly selected without replacement from the original data set to train the model, and the rest are used for testing. The data are normalized to the range [−1, 1].
For all estimators, we repeat this process 20 times and report the statistical results (i.e., the average and standard deviation) of the obtained performance. Each estimator is constructed using 10-fold
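The evaluation protocol described above can be outlined as below. This is a hypothetical sketch (synthetic data, a trivial placeholder predictor in place of the real models): per-column scaling into [−1, 1], a random 95/5 split without replacement, and 20 repetitions over which the mean and standard deviation of the RMSE are collected.

```python
import numpy as np

rng = np.random.default_rng(2)
data = rng.normal(size=(1714, 11))  # 10 input features + 1 output, as in the text

# Min-max normalization of every column into [-1, 1].
lo, hi = data.min(axis=0), data.max(axis=0)
scaled = 2.0 * (data - lo) / (hi - lo) - 1.0

rmses = []
for _ in range(20):
    idx = rng.permutation(len(scaled))          # random split without replacement
    n_train = int(0.95 * len(scaled))
    train, test = scaled[idx[:n_train]], scaled[idx[n_train:]]
    # A real experiment would fit an estimator on `train` here; as a
    # placeholder, predict the training-set mean of the output column.
    pred = np.full(len(test), train[:, -1].mean())
    rmses.append(np.sqrt(np.mean((test[:, -1] - pred) ** 2)))

mean_rmse, std_rmse = np.mean(rmses), np.std(rmses)
```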
Conclusions
The main contribution of this paper is to significantly improve the accuracy and the generalization of the temperature prediction in LF by the proposed BFSE-RTs method on the large-scale, noisy data accumulated from the production process.
On both the training and the testing sets, the RMSE and the maximum error of the BFSE-RTs temperature model are less than those of the FSE-RTs, the RF, the stacking trees, the modified AdaBoost.RT and the pruned Bagging temperature models. The BFSE-RTs
Acknowledgments
We would like to thank the Editor-in-Chief, the Associate Editor, and the anonymous reviewers for their careful reading of the manuscript and constructive comments.
This work is supported by the National Natural Science Foundation of China (Nos.61425002, 31370778, 613700057, 61300015, 31170797, 61103057), Program for Changjiang Scholars and Innovative Research Team in University (No.IRT_15R07), the Program for Liaoning Innovative Research Team in University (No.LT2015002), the Basic Research
References (36)
- et al., An ensemble ELM based on modified AdaBoost.RT algorithm for predicting the temperature of molten steel in ladle furnace, IEEE Trans. Autom. Sci. Eng. (2010)
- et al., Ladle furnace liquid steel temperature prediction model based on optimally pruned bagging, J. Iron Steel Res. Int. (2012)
- et al., End temperature prediction of molten steel in LF based on CBR, Steel Res. Int. (2012)
- et al., Multi-kernel learnt partial linear regularization network and its application to predict the liquid steel temperature in ladle furnace, Knowl. Based Syst. (2012)
- et al., Corrigendum to ``Multi-kernel learnt partial linear regularization network and its application to predict the liquid steel temperature in ladle furnace'', Knowl. Based Syst. (2013)
- et al., Steady state heat transfer of ladle furnace during steel production process, J. Iron Steel Res. Int. (2006)
- An intelligent ladle furnace control system
- The Research of LF Liquid Steel Temperature Prediction based on PLS-SVM Algorithm (2007)
- et al., Pruned bagging aggregated hybrid prediction models for forecasting the steel temperature in ladle furnace, Steel Res. Int. (2014)
- et al., Neural network ensembles, IEEE Trans. Pattern Anal. Mach. Intell. (1990)
- Random forests, Mach. Learn.
- Ensemble fixed-size LS-SVMs applied for the Mach number prediction in transonic wind tunnel, IEEE Trans. Aerospace Electron. Syst.
- Recurrent sparse support vector regression machines trained by active learning in the time-domain, Expert Syst. Appl.
- Hybrid modeling of molten steel temperature prediction in LF, ISIJ Int.
- Feature selection with ensembles, artificial variables, and redundancy elimination, J. Mach. Learn. Res.
- Fuzzy Mathematics: Approximation Theory
- Bootstrap methods: another look at the jackknife, Ann. Stat.