Knowledge-Based Systems

Volume 101, 1 June 2016, Pages 48-59

Molten steel temperature prediction model based on bootstrap Feature Subsets Ensemble Regression Trees

https://doi.org/10.1016/j.knosys.2016.02.018

Highlights

  • Large-scale and noisy data impose strong restrictions on building temperature models.

  • To solve these two issues, the BFSE-RTs method is proposed in this paper.

  • First, feature subsets are constructed based on the multivariate fuzzy Taylor theorem.

  • Second, smaller-scale and lower-dimensional bootstrap replications are used.

  • Third, considering its simplicity, an RT is built on each bootstrap replication of the feature subsets.

Abstract

Molten steel temperature prediction is important in the Ladle Furnace (LF). Most existing temperature models have been built on small-scale data, and their accuracy and generalization cannot satisfy the requirements of industrial production. Today, large-scale data containing more useful information are accumulated from the production process; however, the data are noisy. Large-scale and noisy data impose strong restrictions on building a temperature model. To address these two issues, the Bootstrap Feature Subsets Ensemble Regression Trees (BFSE-RTs) method is proposed in this paper. Firstly, low-dimensional feature subsets are constructed based on the multivariate fuzzy Taylor theorem, which saves memory space and means that "smaller-scale" data sets are used. Secondly, to eliminate the noise, the bootstrap sampling approach for independent and identically distributed data is applied to the feature subsets; the bootstrap replications thus consist of smaller-scale and lower-dimensional samples. Thirdly, considering its simplicity, a Regression Tree (RT) is built on each bootstrap replication. Lastly, the BFSE-RTs method is used to establish a temperature model by analyzing the metallurgical process of the LF. Experiments demonstrate that the BFSE-RTs model outperforms other estimators, improves the accuracy and the generalization, and meets the requirements on the RMSE and the maximum error of the temperature prediction.

Introduction

Ladle Furnace (LF) steel refining technology plays a substantial role in the secondary metallurgical process. When the ladle is taken over by downstream secondary metallurgy units or by a continuous caster, the main purpose of LF refining is to obtain molten steel with the qualified temperature and composition [1], [2], [3], [4], [5]. In practice, the molten steel temperature cannot be measured continuously, which makes precise control difficult. Therefore, it is important to build a precise molten steel temperature prediction model.

Several molten steel temperature models of the LF based on thermodynamics and the conservation of energy were developed in earlier works [6]. "However, these models cannot be used efficiently for online accurate prediction because the parameters are hard to obtain. This is attributed to the harsh operating environment of ladle metallurgy, especially the high temperatures and corrosive slag associated with the process. Practically, some of the parameters are estimated according to experience. Consequently, it is hard to ensure the accuracy of mechanism models" [1].

To overcome the limitations of mechanism models, various data-driven modeling methods have been applied to establish temperature prediction models. For example, Sun et al. [7] build a temperature model based on Neural Networks (NN). Later, a PLS-SVM based temperature model is proposed by Wang [8], where the input variables are first processed with the Partial Least Squares (PLS) algorithm to remove linear dependency, and then a Support Vector Machine (SVM) is used to establish the prediction model. However, these temperature models learn only a single model, and their performance is hard to improve when the object is complicated or the samples are highly noisy [9].

Compared with a single model, an ensemble model can further improve the accuracy and the generalization [10], [11]. In the past decade, ensemble methods have been applied to establish temperature models. Tian and Mao [1] propose an ensemble temperature prediction model based on a modified AdaBoost.RT algorithm. Lv et al. [9] propose a hybrid model in which a pruned Bagging model based on negative correlation learning is used to predict the unknown parameters and the undefined function of the thermal model simultaneously.

Today, the accuracy and the generalization of most existing molten steel temperature models cannot satisfy industrial production, because these models have been estimated on small-scale data. With the development of information and computer technology, large-scale data are now accumulated from the production process in the LF. These data contain more useful information and make it possible to improve the accuracy and the generalization of the temperature prediction.

However, large-scale data impose more restrictions on modeling and increase the cost of building models [12]. Furthermore, high-complexity models increase the computational burden in the phase of actual use [13] and cannot be used efficiently for online accurate prediction. The existing ensemble temperature models are not suitable for large-scale data. In [1], the Extreme Learning Machine (ELM) [14] is selected as the base learner and the modified AdaBoost.RT is employed as the ensemble structure. Although the learning speed of the ELM is extremely fast, AdaBoost.RT is a serial ensemble in which every new sub-model relies on previously built sub-models. In [9], the sub-models are also pruned one by one, which turns Bagging into a serial ensemble. A serial ensemble is often more complex than a single model [15], especially on large-scale data. Additionally, the data sampled from the production process are noisy, which reduces the accuracy and the generalization of the temperature prediction.

To deal with the large-scale and noise issues, the Bootstrap Feature Subsets Ensemble Regression Trees (BFSE-RTs) method is proposed in this paper. Firstly, low-dimensional feature subsets are constructed based on the multivariate fuzzy Taylor theorem [16], which saves memory space and means that "smaller-scale" data sets are used. Secondly, to eliminate the noisy data, the bootstrap sampling approach [17] for independent and identically distributed data is introduced into the feature subsets; the bootstrap replications thus consist of smaller-scale and lower-dimensional samples. Thirdly, considering its simplicity, the Regression Tree (RT) [18] is employed as the base learner, and an RT is built on each bootstrap replication. The BFSE-RTs method is expected to successfully utilize the large-scale data accumulated from the production process in the LF, to improve the accuracy and the generalization of the temperature prediction, and to meet the requirements that the Root Mean Square Error (RMSE) of the temperature prediction is less than 3 °C and its maximum error is less than 10 °C.
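This page does not reproduce the authors' pseudocode, so the following is only a minimal sketch of the BFSE-RTs idea in Python. The class name BFSERTs, the parameters n_subsets, subset_dim and trees_per_subset, and the random grouping of columns used to form the feature subsets are hypothetical simplifications; the paper derives the subsets from the multivariate fuzzy Taylor theorem [16], which is not reproduced here, and scikit-learn's DecisionTreeRegressor merely stands in for the RT base learner.

```python
import numpy as np
from sklearn.tree import DecisionTreeRegressor

class BFSERTs:
    """Minimal sketch of a bootstrap feature-subsets ensemble of regression trees."""

    def __init__(self, n_subsets=10, subset_dim=3, trees_per_subset=5, random_state=0):
        self.n_subsets = n_subsets                # number of low-dimensional feature subsets
        self.subset_dim = subset_dim              # dimensionality of each subset
        self.trees_per_subset = trees_per_subset  # bootstrap replications per subset
        self.rng = np.random.default_rng(random_state)
        self.members = []                         # list of (feature_indices, fitted_tree)

    def fit(self, X, y):
        n_samples, n_features = X.shape
        for _ in range(self.n_subsets):
            # Illustrative subset construction: a random low-dimensional group of columns
            # (the paper instead builds the subsets from the fuzzy Taylor expansion).
            cols = self.rng.choice(n_features, size=self.subset_dim, replace=False)
            for _ in range(self.trees_per_subset):
                # Bootstrap replication: resample rows with replacement (i.i.d. sampling).
                rows = self.rng.integers(0, n_samples, size=n_samples)
                tree = DecisionTreeRegressor(max_depth=6)
                tree.fit(X[np.ix_(rows, cols)], y[rows])
                self.members.append((cols, tree))
        return self

    def predict(self, X):
        # Average the predictions of all regression trees in the ensemble.
        preds = [tree.predict(X[:, cols]) for cols, tree in self.members]
        return np.mean(preds, axis=0)
```

Because each tree is grown on its own smaller, lower-dimensional bootstrap replication, the members can be fitted independently of one another, in contrast to the serial AdaBoost.RT and pruned Bagging ensembles discussed above.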

The remainder of this paper is organized as follows. Section 2 introduces the mechanism of the production process of the LF. In Section 3, existing ensemble methods are reviewed. In Section 4, the BFSE-RTs method is proposed, and its differences from other ensembles are introduced. In Section 5, the experimental investigations are carried out, and the BFSE-RTs temperature model is compared with the FSE-RTs, the RF, the stacking trees, the modified AdaBoost.RT, and the pruned Bagging temperature models. In Section 6, the conclusions of this paper are summarized.

Section snippets

The mechanism of the production process of LF

To establish the data-driven model of the temperature in the LF, the energy change during the refining process shown in Fig. 1 is considered [1], [6], [14].

The refining process starts at the entry of the LF and ends at its exit. The data are sampled over the entire refining process, which is subdivided into many different sampling periods. In any sampling period, the same steps are executed: argon blowing and slag adding, power supply, and power off.

In any sampling period,

  • (i)

Review of ensembles

Ensembles of learnt models constitute one of the main directions in machine learning and data mining [19]. An aggregation of many simple but different predictors can reduce the complexity and achieve performance that a single model cannot [11], [20]. Thus, we focus on building an ensemble temperature prediction model.

A training set $L$ consists of data $\{(X_k, Y_k)\}_{k=1}^{N}$, where $X = \{x_1, \ldots, x_n\} \in \mathbb{R}^n$ is the $n$-dimensional input, $Y \in \mathbb{R}$ is the 1-dimensional output, and $N$ is the size of $L$ (i.e.

The FSE-RTs method

To handle the large-scale data, the Feature Subsets Ensemble (FSE) method, proposed in our previous paper [12] and based on the multivariate fuzzy Taylor theorem, is selected. In the FSE method, low-dimensional feature subsets are constructed, which saves memory space and means that "smaller-scale" data sets are used.
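As a rough numerical reading of this memory argument (the subset dimensionality $d = 3$ below is an assumption for illustration only; the actual subsets follow from the construction in [12]):

```latex
% Illustration only: d = 3 is an assumed subset dimensionality.
\underbrace{N \times n}_{\text{full training matrix}}
\quad \text{vs.} \quad
\underbrace{N \times d}_{\text{one feature subset}}, \qquad d \ll n;
\qquad \text{e.g. } \frac{d}{n} = \frac{3}{10} = 0.3 ,
```

so with the $n = 10$ input features of the LF data, each base RT would be grown on a matrix roughly 30% of the size of the full one, and the subsets can be stored and processed separately.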

Considering its simplicity, the RT [18] is employed as the base learner of the FSE model. The structure of an FSE-RTs model is shown in Fig. 2. The generalized

Experimentation

The number of input features is 10, and 1714 production data from a 300 t LF are used. Ninety-five percent of the data are randomly selected without replacement from the original data set to train the model, and the rest are used for testing. The data are normalized to the range [−1, 1].

For all estimators, we repeat this process 20 times and present the statistical results (average and standard deviation) of the obtained performance. Every estimator is constructed using 10-fold
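Read concretely, this protocol could be sketched as follows (a sketch only: the helper names min_max_scale and evaluate are hypothetical, BFSERTs is the sketch class from the Introduction, only the inputs are normalized here so that the reported errors stay in °C, and the 10-fold cross-validation used to configure each estimator is omitted):

```python
import numpy as np
from sklearn.model_selection import train_test_split
from sklearn.metrics import mean_squared_error

def min_max_scale(A, lo=-1.0, hi=1.0):
    """Normalize each column of A to the range [lo, hi]."""
    a_min, a_max = A.min(axis=0), A.max(axis=0)
    return lo + (A - a_min) * (hi - lo) / (a_max - a_min)

def evaluate(X, y, n_repeats=20, seed=0):
    """Repeat the 95%/5% split and report mean/std of RMSE and maximum error."""
    X = min_max_scale(X)                 # inputs scaled to [-1, 1]; y kept in deg C
    rmses, max_errs = [], []
    for r in range(n_repeats):
        # 95% of the samples for training (drawn without replacement), 5% for testing.
        X_tr, X_te, y_tr, y_te = train_test_split(
            X, y, test_size=0.05, random_state=seed + r)
        model = BFSERTs(random_state=seed + r).fit(X_tr, y_tr)
        pred = model.predict(X_te)
        rmses.append(np.sqrt(mean_squared_error(y_te, pred)))
        max_errs.append(np.max(np.abs(y_te - pred)))
    return np.mean(rmses), np.std(rmses), np.mean(max_errs), np.std(max_errs)
```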

Conclusions

The main contribution of this paper is to significantly improve the accuracy and the generalization of the temperature prediction in the LF by the proposed BFSE-RTs method on the large-scale and noisy data accumulated from the production process.

On both the training and the testing sets, the RMSE and the maximum error of the BFSE-RTs temperature model are less than those of the FSE-RTs, the RF, the stacking trees, the modified AdaBoost.RT and the pruned Bagging temperature models. The BFSE-RTs

Acknowledgments

We would like to thank the Editor-in-Chief, the Associate Editor, and the anonymous reviewers for their careful reading of the manuscript and constructive comments.

This work is supported by the National Natural Science Foundation of China (Nos. 61425002, 31370778, 613700057, 61300015, 31170797, 61103057), the Program for Changjiang Scholars and Innovative Research Team in University (No. IRT_15R07), the Program for Liaoning Innovative Research Team in University (No. LT2015002), the Basic Research

References (36)

  • H.X. Tian et al.

    An ensemble ELM based on modified AdaBoost.RT algorithm for predicting the temperature of molten steel in ladle furnace

    IEEE Trans. Autom. Sci. Eng.

    (2010)
  • W. et al.

    Ladle furnace liquid steel temperature prediction model based on optimally pruned bagging

    J. Iron Steel Res. Int.

    (2012)
  • F. He et al.

    End temperature prediction of molten steel in LF based on CBR

    Steel Res. Int.

    (2012)
  • W. Lv et al.

    Multi-kernel learnt partial linear regularization network and its application to predict the liquid steel temperature in ladle furnace

    Knowl. Based Syst.

    (2012)
  • W. Lv et al.

    Corrigendum to "Multi-kernel learnt partial linear regularization network and its application to predict the liquid steel temperature in ladle furnace"

    Knowl. Based Syst.

    (2013)
  • U. Camdali et al.

    Steady state heat transfer of ladle furnace during steel production process

    J. Iron Steel Res. Int.

    (2006)
  • Y.G. Sun

    An intelligent ladle furnace control system

  • X.L. Wang

    The Research of LF Liquid Steel Temperature Prediction based on PLS-SVM Algorithm

    (2007)
  • W. Lv et al.

    Pruned bagging aggregated hybrid prediction models for forecasting the steel temperature in ladle furnace

    Steel Res. Int.

    (2014)
  • L.K. Hansen et al.

    Neural network ensembles

    IEEE Trans. Pattern Anal. Mach. Intell.

    (1990)
  • L. Breiman

    Random forests

    Mach. Learn.

    (2001)
  • X. Wang et al.

    Ensemble fixed-size LS-SVMs applied for the Mach number prediction in transonic wind tunnel

    IEEE Trans. Aerospace Electron. Syst.

    (2015)
  • V. Ceperic et al.

    Recurrent sparse support vector regression machines trained by active learning in the time-domain

    Expert Syst. Appl.

    (2012)
  • H. Tian et al.

    Hybrid modeling of molten steel temperature prediction in LF

    ISIJ Int.

    (2008)
  • E. Tuv et al.

    Feature selection with ensembles, artificial variables, and redundancy elimination

    J. Mach. Learn. Res.

    (2009)
  • G.A. Anastassiou

    Fuzzy Mathematics: Approximation Theory

    (2010)
  • B. Efron

    Bootstrap methods: another look at the jackknife

    Ann. Stat.

    (1979)
  • L. Breiman, J.H. Friedman, R.A. Olshen, C.J. Stone, Classification and regression trees, CRC Press,...