A randomized-algorithm-based decomposition-ensemble learning methodology for energy price forecasting
Introduction
In the era of big data, energy price prediction, which aims to capture the evolution laws of energy systems from abundant historical observations (a typical case of big data) and thus provide a reliable evaluation of the future, has become an increasingly popular but challenging issue [1]. On the one hand, with the rapid development of the Internet and big data techniques, a wealth of data is available concerning energy markets, which urgently calls for innovation of energy price forecasting techniques toward fast algorithms. Taking oil prices as an example, besides historical time series from different oil markets (e.g., Brent and West Texas Intermediate), there is also myriad information on influencing factors, such as market factors (e.g., supply and demand) [2] and external factors (e.g., substitutability with other energy resources, weather, stock levels, economic growth, political changes, demographics, emergency events, and even psychological expectations) [3]. Accordingly, a fast learning algorithm is highly desirable to process these big data effectively and produce prediction results rapidly. On the other hand, given that a high level of noise is unavoidable within energy systems, how to capture the true information and enhance prediction accuracy remains a key issue in energy price prediction. For example, an accurate prediction of oil prices can help oil-related sectors improve plans for production, marketing and investment, control potential risks and increase future profits [4]. For these purposes, this paper focuses on improving existing forecasting techniques toward efficient and fast algorithms in the context of big data.
According to the existing studies, there is an abundance of forecasting techniques for energy prices, which generally fall into three major groups: econometric approaches, computational intelligence (CI) techniques, and hybrid algorithms (integrating two or more single models of the aforementioned types) [5]. In energy price prediction, for example, popular econometric models include the auto-regressive integrated moving average (ARIMA) [6], generalized autoregressive conditional heteroskedasticity (GARCH) [7], random walk (RW) [8], vector auto-regression (VAR) [9] and error correction models (ECM) [10]. However, these models rest on assumptions of stationarity and linearity, which contradict the nonlinear, nonstationary nature of real energy systems. In this context, CI techniques have become the most dominant approaches in energy price forecasting, including artificial neural networks (ANNs) [5], support vector regression (SVR) [11], least squares support vector regression (LSSVR) [12] and various CI-based optimization tools. However, these conventional CI techniques have two intrinsic shortcomings: long training times and parameter sensitivity [13]. For example, ANNs, which use gradient descent to tune parameters (such as weights and biases), take a long training time and frequently fall into local optima [14]. Similarly, SVR and LSSVR, which use iterative procedures (such as grid search or trial and error) to determine regularization and kernel parameters, cannot avoid the twin problems of long training times and parameter sensitivity [15]. Owing to the respective disadvantages of the first two types, the third type, i.e., hybrid techniques combining two or more single algorithms, has emerged and offers excellent performance in energy price prediction.
In particular, the decomposition-ensemble learning paradigm, based on the promising concept of “decomposition and ensemble”, has been widely considered an excellent case among hybrid methods [16]. A typical decomposition-ensemble model includes three major steps: data decomposition, which breaks the original complex data into relatively simple components to reduce data complexity; individual prediction, which models each extracted component independently; and results ensemble, which aggregates the individual predictions into the final prediction [12,16]. The superiority of decomposition-ensemble techniques in terms of prediction accuracy has been demonstrated in forecasting energy prices such as oil prices [6] and gas prices [17]. However, the “decomposition and ensemble” strategy poses a big challenge: a large computational burden from modeling all the decomposed components individually. Furthermore, most existing decomposition-ensemble models employ CI-based individual predictors with iterative tuning processes, such as ANN, SVR and LSSVR [5,11,12], which largely aggravates the time-consumption problem [14]. In addition, the performance of these CI algorithms depends heavily on parameters of the iterative learning process that must be set in advance, and an inappropriate setting of any one of them can markedly degrade the final prediction [15]. In this sense, the emerging decomposition-ensemble learning methodology severely suffers from the twin problems of long training times and parameter sensitivity, and this study tries to address both issues.
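The three-step decompose–predict–aggregate structure can be sketched in a few lines. The sketch below is purely illustrative: it substitutes a simple moving-average decomposition for EEMD and a linear autoregression for the CI-based individual predictors, since the additive "decomposition and ensemble" workflow is the same; all function names, window sizes and the synthetic price series are hypothetical.

```python
import numpy as np

def decompose(series, windows=(3, 12)):
    """Step 1 (stand-in for EEMD): split a series into detail components
    plus a final trend via moving averages; the parts sum back to the series."""
    components, residual = [], series.astype(float)
    for w in windows:
        trend = np.convolve(residual, np.ones(w) / w, mode="same")
        components.append(residual - trend)   # detail at this scale
        residual = trend
    components.append(residual)               # final trend
    return components

def fit_predict_component(comp, lag=4):
    """Step 2: one-step-ahead linear autoregression fitted per component."""
    X = np.column_stack([comp[i:len(comp) - lag + i] for i in range(lag)])
    y = comp[lag:]
    coef, *_ = np.linalg.lstsq(X, y, rcond=None)
    return comp[-lag:] @ coef                 # forecast for the next step

# Step 3: results ensemble -- sum the component forecasts (additive decomposition).
rng = np.random.default_rng(0)
price = np.cumsum(rng.normal(0, 1, 200)) + 50.0   # synthetic "price" series
forecast = sum(fit_predict_component(c) for c in decompose(price))
```

Because the decomposition is additive, summing the per-component forecasts reconstructs a forecast of the original series; this is exactly why simple summation is a valid ensemble rule in this paradigm.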
Fortunately, the twin issues of long training times and parameter sensitivity can be overcome nicely by using the idea of randomization, thereby enabling an extremely efficient and fast decomposition-ensemble learning methodology. Based on randomization, several randomized algorithms have recently been proposed and have shown excellent capabilities in terms of speed and prediction accuracy. In particular, randomized algorithms employ randomly fixed parameters, random feature mappings, randomly generated samples or randomly selected variables instead of the iteratively tuned ones in conventional CI techniques, which effectively ensures an extremely fast learning speed and an excellent generalization performance [14]. Furthermore, with no stopping criteria, learning rates, learning epochs or other parameters of an iterative learning process to set, the problem of parameter sensitivity is greatly alleviated. Typical cases include the extreme learning machine (ELM) [18] and random vector functional link (RVFL) network [19], which use randomly fixed input weights and hidden biases in neural networks; random kitchen sinks (RKS) [20], which use random feature mappings to approximate shift-invariant kernels; and random forest (RF) [21], which uses randomly bootstrapped samples and randomly selected variables to grow decision trees. These randomized algorithms have been applied extensively to various complex systems such as electricity load [22,23], electricity production [24], pedestrian detection [25], wave energy flux [26], water demand [27], wind farm power ramp rates [28], mineral prospectivity [29], etc. Therefore, this paper introduces such emerging randomized algorithms to formulate efficient and fast decomposition-ensemble learning models. For a better overview of conventional CI techniques and randomized algorithms, Table 1 summarizes the advantages and disadvantages of these intelligent approaches.
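As a concrete illustration of the randomization idea, a minimal ELM can be written so that the input weights and hidden biases are drawn at random and never tuned, leaving only the output weights to be solved in closed form by least squares; no iterative training and no learning rate or epoch settings are involved. The network size and toy regression task below are assumptions for illustration only.

```python
import numpy as np

def elm_fit_predict(X_train, y_train, X_test, n_hidden=50, seed=0):
    """Minimal extreme learning machine: random fixed hidden layer,
    output weights obtained by a single least-squares solve."""
    rng = np.random.default_rng(seed)
    W = rng.normal(size=(X_train.shape[1], n_hidden))  # random input weights (never tuned)
    b = rng.normal(size=n_hidden)                      # random hidden biases (never tuned)
    H = np.tanh(X_train @ W + b)                       # hidden-layer activations
    beta, *_ = np.linalg.lstsq(H, y_train, rcond=None) # closed-form output weights
    return np.tanh(X_test @ W + b) @ beta

# Toy usage: learn y = sin(x) from noisy samples on a random train/test split.
rng = np.random.default_rng(1)
x = np.linspace(0, 2 * np.pi, 200).reshape(-1, 1)
y = np.sin(x).ravel() + rng.normal(0, 0.05, 200)
idx = rng.permutation(200)
train, test = idx[:150], idx[150:]
pred = elm_fit_predict(x[train], y[train], x[test])
```

The single `lstsq` call replaces the entire gradient-descent training loop of a conventional ANN, which is the source of the speed advantage claimed for randomized algorithms.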
By using randomized algorithms, some randomized-algorithm-based decomposition-ensemble learning models have recently been developed and have obtained satisfactory forecasting results. For example, Tang et al. [35] introduced ELM as the individual predictor into the “decomposition and ensemble” framework and observed the effectiveness of the proposed methodology for oil price prediction in terms of both time savings and accuracy. Wang et al. [36] used a two-phase decomposition technique and a modified extreme learning machine to forecast the air quality index. Lu and Shao [37] developed an ensemble learning approach with ELM as the individual forecasting tool for computer product sales forecasting. Shrivastava [38] built a wavelet-based ELM decomposition-ensemble model for electricity price forecasting. Tang et al. [15] developed an RVFL-based decomposition-ensemble learning paradigm for oil price forecasting. However, to the best of our knowledge, few decomposition-ensemble learning paradigms have employed other, potentially more competitive randomized algorithms such as RKS. Therefore, this study fills this literature gap by introducing various promising randomized algorithms and conducting a thorough comparison to explore whether the idea of randomization indeed improves the existing decomposition-ensemble learning paradigms in terms of speed and accuracy.
Generally speaking, this study aims to formulate efficient and fast decomposition-ensemble learning models for energy price forecasting by using the emerging randomized algorithms in the individual prediction step, thereby addressing the twin problems of long training times and parameter sensitivity. The major contributions of this study can be summarized from two perspectives. First, by introducing various randomized algorithms, a series of randomized-algorithm-based decomposition-ensemble models is formulated. Second, a thorough comparison is conducted to check whether the idea of randomization indeed improves the existing decomposition-ensemble learning paradigms in terms of speed and accuracy, and to identify the most efficient and fastest model for energy price prediction.
The rest of this study is organized as follows. Section 2 describes the formulation process of the proposed methodology. For illustration and verification, the proposed methodology is applied to predict the Brent crude oil spot prices and the Henry Hub natural gas prices, with the results presented in Section 3. Finally, Section 4 summarizes the major contributions of the paper and discusses some interesting directions for future research.
Methodology formulation
This section presents the formulation process of the proposed methodology. In particular, Section 2.1 designs the general model framework, and Sections 2.2 (EEMD) and 2.3 (Randomized algorithms) describe the related techniques in detail.
Empirical study
For illustration, the Brent crude oil spot prices and the Henry Hub natural gas prices are selected as the study samples. The experiment is designed in Section 3.1, and Section 3.2 presents the empirical results and discusses whether the proposed models using randomized algorithms statistically improve energy price prediction in terms of speed, accuracy and robustness.
Conclusions
To solve the time-consuming issue in existing decomposition-ensemble learning paradigms, this study introduces the emerging randomized algorithms to develop an efficient and fast decomposition-ensemble learning methodology for energy price forecasting. In particular, the randomized algorithms use randomization, in the form of randomly fixed parameters, random feature mappings, randomly generated samples or randomly selected variables rather than iteratively tuned ones, which effectively ensures an extremely fast learning speed and an excellent generalization performance.
Acknowledgements
This work is supported by grants from the National Natural Science Foundation of China (NSFC Nos. 71622011, 71433001 and 71301006), the National Program for Support of Top Notch Young Professionals, and Beijing Advanced Innovation Center for Soft Matter Science and Engineering.
References (45)
- et al., Crude oil price behaviour before and after military conflicts and geopolitical events, Energy (2017)
- et al., A deep learning ensemble approach for crude oil price forecasting, Energy Econ (2017)
- et al., Modeling natural gas price volatility: the case of the UK gas market, Energy (2014)
- et al., Forecasting oil price movements with crack spread futures, Energy Econ (2009)
- et al., Modeling the price relationships between crude oil, energy crops and biofuels, Energy (2016)
- et al., Modeling and forecasting cointegrated relationships among heavy oil and product prices, Energy Econ (2005)
- et al., Forecasting carbon price using empirical mode decomposition and evolutionary least squares support vector regression, Appl Energy (2017)
- et al., A survey of randomized algorithms for training neural networks, Inf Sci (2016)
- et al., Short-term electricity price forecasting with empirical mode decomposition based ensemble kernel machines, Procedia Comput Sci (2017)
- et al., Short-term electricity demand and gas price forecasts using wavelet transforms and adaptive models, Energy (2010)
- Extreme learning machine: theory and applications, Neurocomputing
- Learning and generalization characteristics of the random vector functional-link net, Neurocomputing
- Random vector functional link network for short-term electricity load demand forecasting, Inf Sci
- GEFCom2012: electric load forecasting and backcasting with semi-parametric models, Int J Forecast
- A benchmark of statistical regression methods for short-term forecasting of photovoltaic electricity production, part I: deterministic forecast of hourly production, Sol Energy
- A high accuracy pedestrian detection system combining a cascade AdaBoost detector and random vector functional-link net, Sci World J
- Short-term forecasting of the wave energy flux: analogues, random forests, and physics-based models, Ocean Eng
- Predictive models for forecasting hourly urban water demand, J Hydrol
- Machine learning predictive models for mineral prospectivity: an evaluation of neural networks, random forest, regression trees and support vector machines, Ore Geol Rev
- Advantages and disadvantages of using artificial neural networks versus logistic regression for predicting medical outcomes, J Clin Epidemiol
- A hybrid particle swarm optimization–back-propagation algorithm for feedforward neural network training, Appl Math Comput
- A novel hybrid model for air quality index forecasting based on two-phase decomposition technique and modified extreme learning machine, Sci Total Environ