
Neurocomputing

Volume 127, 15 March 2014, Pages 172-180

Particle swarm optimization for construction of neural network-based prediction intervals

https://doi.org/10.1016/j.neucom.2013.08.020

Abstract

Point forecasts become unreliable and uninformative as the level of uncertainty in the data increases. Prediction intervals (PIs) have been proposed in the literature to quantify the uncertainties associated with point forecasts. In this paper, the recently introduced Lower Upper Bound Estimation (LUBE) method (Khosravi et al., 2011 [1]) is applied and extended for the construction of PIs. The LUBE method adopts a neural network (NN) with two outputs to directly generate the upper and lower bounds of PIs without making any assumption about the data distribution. A new width evaluation index that is suitable for NN training is proposed. Further, a new cost function is developed for the comprehensive evaluation of PIs based on their width and coverage probability. The width index in this cost function is replaced by the new one, and particle swarm optimization (PSO) with a mutation operator is used to minimize the cost function and adjust the NN parameters in the LUBE method. These two changes bring dramatic improvements in the quality of results and in speed. Results for six synthetic and real-world case studies indicate that the proposed PSO-based LUBE method is very efficient in constructing high quality PIs in a short time.

Introduction

Since neural networks (NNs) were first proposed by McCulloch and Pitts in 1943, they have been successfully applied in many different areas. Multilayer feedforward NNs are theoretically universal approximators [2]. Owing to their strong approximation capacity and learning ability, NNs are well suited to prediction and regression problems. The literature contains numerous applications, including, but not limited to, electric power systems [3], [4], [5], transportation systems [6], pattern recognition [7], [8], [9], and financial price forecasting [10].

In regression applications, NNs have mainly been used to generate point forecasts. However, the reliability of point forecasts drops significantly when the level of uncertainty increases. The problem becomes more severe when datasets are multivalued, sparse or noisy, or when the target values are affected by probabilistic events [11]. Real-world man-made systems are becoming more and more complex and carry a high level of uncertainty, so point forecasts are likely to be unreliable and questionable in these applications. Point forecasts also have another disadvantage compared with prediction intervals (PIs): they provide prediction values but convey no information about prediction accuracy [11]. For example, a point forecast can be judged by its prediction error, but it says nothing about the probability that the prediction is correct. This makes decision-making more problematic, as only limited information is conveyed by the predicted values.

Unlike point forecasts, PIs are by their nature a powerful tool for uncertainty modeling [12]. By definition, a PI consists of lower and upper bounds that bracket a future unknown target value with a prescribed probability ((1 − α)%) called the confidence level [13]. PIs not only provide a range within which targets are highly likely to lie, but also carry an indication of their accuracy, namely the confidence level. PIs are therefore more reliable and informative than point forecasts for decision makers. Using high quality PIs, decision makers can confidently draw up future plans, better manage risks, and maximize their benefits.
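For a target t_i with input x_i, this definition can be stated formally as below; the notation L(x_i) and U(x_i) for the lower and upper bounds is introduced here only for illustration.

% A PI [L(x_i), U(x_i)] with nominal confidence level (1 - \alpha) x 100%
% brackets the unknown future target t_i with probability 1 - \alpha:
\Pr\bigl( L(x_i) \le t_i \le U(x_i) \bigr) = 1 - \alpha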

Traditional methods for the construction of NN-based PIs are the delta, Bayesian and bootstrap methods [12]. In spite of the advantages of PIs, applications of these methods are still less popular than NN point forecasts. One important reason is that their implementation is complex. For instance, the delta and Bayesian methods need to calculate the Jacobian matrix and the Hessian matrix of the NN parameters, respectively [14]. In each iteration, the Jacobian or Hessian matrix has to be updated, which is very time consuming. The calculation of derivatives also suffers from singularity problems that decrease the reliability of PIs. In addition, these traditional methods make assumptions about the data distribution: the delta method assumes that the noise is normally distributed and applies the t-distribution [15], while the bootstrap method assumes that an ensemble of NN models produces a less biased estimate of the true regression of the targets [16]. Implementation difficulties, special assumptions about the data distribution, and massive computational requirements hinder the widespread application of these methods for decision-making.

Khosravi et al. [1] proposed a new method for the construction of PIs called the lower upper bound estimation (LUBE) method. The LUBE method applies a NN with two outputs to directly generate the upper and lower bounds of PIs. It makes no assumptions about the data distribution and avoids the calculation of derivatives such as the Jacobian and Hessian matrices; it is therefore easy to implement and fast in generating PIs. Comparative results reported in [1] reveal that the LUBE method is simpler, faster and more reliable than traditional methods for PI construction. For high quality PIs in real-world applications, a higher coverage probability and a narrower width are always desired. After introducing two quantitative PI evaluation indices, a new width evaluation index that is suitable for NN training is proposed. A comprehensive measure that evaluates both the coverage probability and the width of PIs is also developed. This type of measure has been successfully applied in different areas, such as electrical load forecasting [17], travel time prediction [12] and industrial systems [18].

The resulting cost function is nonlinear, complex, discontinuous and nondifferentiable, so traditional derivative-based algorithms cannot be applied to minimize it. In [1], the simulated annealing algorithm is applied to minimize the cost function. The obtained results can be significantly improved by using more powerful optimization methods such as the particle swarm optimization (PSO) algorithm. PSO has proven powerful for parameter optimization, and in particular for the optimization of NN connection weights, in numerous studies. A mutation operator, which helps maintain diversity in evolutionary algorithms, is also integrated into PSO to improve its exploratory capability and help it escape local optima.
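As an illustration of this kind of derivative-free optimizer, the following is a minimal sketch of PSO with a simple Gaussian mutation operator applied to a random fraction of the particles in each iteration. The swarm size, inertia and acceleration coefficients, mutation rate, search bounds and the absolute-value test cost are illustrative assumptions, not the settings of the paper; in the proposed method the particle positions encode the NN weights and the cost is the PI-based cost function.

import numpy as np

def pso_with_mutation(cost, dim, n_particles=30, n_iter=200,
                      w=0.7, c1=1.5, c2=1.5, mutation_rate=0.1,
                      bounds=(-1.0, 1.0), seed=0):
    """Minimize a (possibly nondifferentiable) cost function with PSO.

    A Gaussian mutation perturbs a random subset of particles each
    iteration to preserve diversity and help escape local optima.
    """
    rng = np.random.default_rng(seed)
    lo, hi = bounds
    pos = rng.uniform(lo, hi, size=(n_particles, dim))
    vel = np.zeros((n_particles, dim))
    pbest, pbest_cost = pos.copy(), np.array([cost(p) for p in pos])
    gbest = pbest[np.argmin(pbest_cost)].copy()
    gbest_cost = pbest_cost.min()

    for _ in range(n_iter):
        r1, r2 = rng.random((n_particles, dim)), rng.random((n_particles, dim))
        vel = w * vel + c1 * r1 * (pbest - pos) + c2 * r2 * (gbest - pos)
        pos = pos + vel

        # Mutation operator: add Gaussian noise to randomly chosen particles.
        mutate = rng.random(n_particles) < mutation_rate
        pos[mutate] += rng.normal(0.0, 0.1 * (hi - lo), size=(int(mutate.sum()), dim))

        # Update personal and global bests.
        cur_cost = np.array([cost(p) for p in pos])
        improved = cur_cost < pbest_cost
        pbest[improved], pbest_cost[improved] = pos[improved], cur_cost[improved]
        if pbest_cost.min() < gbest_cost:
            gbest_cost = pbest_cost.min()
            gbest = pbest[np.argmin(pbest_cost)].copy()

    return gbest, gbest_cost

# Usage on a simple nondifferentiable test cost (illustrative only).
best_x, best_cost = pso_with_mutation(lambda x: float(np.sum(np.abs(x))), dim=5)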

The main contributions of this paper are listed below. (1) A new PI width evaluation index, which is suitable for NN training, is proposed to help improve the quality of PIs. (2) PSO with a mutation operator is integrated into the LUBE method for the first time, yielding the PSO-based LUBE method; this combination has a very strong search ability. (3) Results from six case studies indicate that the quality of PIs is significantly improved compared with the same case studies in two journal papers [1], [11]. (4) The PI construction time of the PSO-based LUBE method is also much shorter than that of both the delta [11] and the LUBE [1] methods.

The rest of this paper is organized as follows. Section 2 describes the indices used for PI evaluation. Section 3 introduces the proposed PSO-based LUBE method. Case studies, results and discussions are presented in Sections 4 and 5, respectively. Finally, Section 6 concludes the paper and outlines future work.

Section snippets

PI evaluation indices

Before discussing the evaluation indices for PIs, the evaluation indices for point forecasts are reviewed. Two frequently used indices are the mean square error (MSE) and the mean absolute percentage error (MAPE). Both MSE and MAPE are used as the cost function in learning algorithms, such as back propagation (BP), for training NN parameters. These indices are also calculated to examine the performance of models on test samples. Likewise, PIs, which estimate the lower and upper bounds
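For reference, the PI indices discussed in this section are commonly written as below. This is a sketch of the standard definitions, namely the coverage probability (PICP), the normalized average width (PINAW), the proposed normalized root-mean-square width (PINRW), and a combined coverage-width cost (CWC); the exact forms and the constants γ, η and μ should be taken from Section 2 and [1] rather than from this sketch.

% Coverage: fraction of the N targets that fall inside their PIs.
\mathrm{PICP} = \frac{1}{N}\sum_{i=1}^{N} c_i, \qquad
c_i = \begin{cases} 1, & t_i \in [L_i, U_i] \\ 0, & \text{otherwise} \end{cases}

% Width: normalized average width and normalized root-mean-square width,
% where R is the range of the targets.
\mathrm{PINAW} = \frac{1}{NR}\sum_{i=1}^{N}\bigl(U_i - L_i\bigr), \qquad
\mathrm{PINRW} = \frac{1}{R}\sqrt{\frac{1}{N}\sum_{i=1}^{N}\bigl(U_i - L_i\bigr)^{2}}

% Combined cost: the width term is penalized exponentially whenever
% PICP drops below the nominal confidence level \mu.
\mathrm{CWC} = \mathrm{PINRW}\,\Bigl(1 + \gamma(\mathrm{PICP})\,e^{-\eta\,(\mathrm{PICP}-\mu)}\Bigr)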

LUBE method

The lower upper bound estimation (LUBE) method is a newly proposed method for the construction of PIs [1]. The basic idea of the LUBE method is to adopt a NN with two outputs to directly generate the upper and lower bounds of PIs. The first and second outputs correspond to the upper and lower bounds of PIs, respectively. A schematic NN with two outputs for the LUBE method is shown in Fig. 1; the actual NN architecture varies between applications.

For a typical three-layered NN, the mathematical
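To make the two-output structure concrete, a minimal sketch of a forward pass through such a network is given below. The single sigmoid hidden layer, the layer sizes and the random weights are assumptions for illustration only; they are not the exact architecture or formulation used in the paper, where the weights are tuned by the PSO-based training described later.

import numpy as np

def lube_forward(x, W1, b1, W2, b2):
    """Forward pass of a three-layered NN with two outputs.

    Output 0 is read as the upper bound and output 1 as the lower
    bound of the PI for the input feature vector x.
    """
    h = 1.0 / (1.0 + np.exp(-(W1 @ x + b1)))  # sigmoid hidden layer
    y = W2 @ h + b2                           # linear output layer, shape (2,)
    return y[0], y[1]                         # (upper bound, lower bound)

# Illustrative dimensions: 3 inputs, 5 hidden neurons, 2 outputs.
rng = np.random.default_rng(0)
W1, b1 = rng.normal(size=(5, 3)), np.zeros(5)
W2, b2 = rng.normal(size=(2, 5)), np.zeros(2)
upper, lower = lube_forward(np.array([0.1, -0.2, 0.3]), W1, b1, W2, b2)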

Datasets

Six case studies are used to examine the performance of the PSO-based LUBE method. The first two cases are synthetic, with heterogeneous and homogeneous noise respectively. The other four cases come from real-world systems. Table 1 summarizes these case studies, which are as follows:

  • 1.

    Ding10 is a one-dimensional synthetic mathematical function, f(x) = x^2 + sin(x) + 2 + ε, where x is randomly generated in [−10, 10], and ε is the added noise with a heterogeneous distribution (a data-generation sketch is given after this list).

  • 2.

    HAS is a five-dimensional synthetic
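As an illustration of the first case, the following is a minimal sketch of generating Ding10-style samples. The heteroscedastic (input-dependent) Gaussian noise used here is an assumption, since the exact heterogeneous distribution is not specified in this snippet; the sample size is also illustrative.

import numpy as np

def make_ding10(n_samples=1000, seed=0):
    """Generate Ding10-style samples: f(x) = x^2 + sin(x) + 2 + eps."""
    rng = np.random.default_rng(seed)
    x = rng.uniform(-10.0, 10.0, size=n_samples)
    # Assumed heterogeneous noise: standard deviation grows with |x|.
    eps = rng.normal(0.0, 0.5 + 0.05 * np.abs(x))
    y = x ** 2 + np.sin(x) + 2.0 + eps
    return x.reshape(-1, 1), y

X, y = make_ding10()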

Training process

For the construction of NN-based PIs, the presentation of results consists of three main parts: the training process, the test results, and a discussion of computation time. As shown in Fig. 4, the training process illustrates the convergence behavior of the algorithm; through training, the value of the cost function becomes smaller and smaller. The test results are based on the PIs constructed for test samples; through testing, the performance of the trained NN is validated. In this section, the training process,

Conclusions

A recently introduced PI construction method, the LUBE method, is applied and extended in this paper. To quantitatively evaluate the quality of PIs, evaluation indices for both coverage probability and width are introduced. In particular, a new width evaluation index, PINRW, which is suitable for NN training, is proposed. Compared with PINAW, which gives equal weight to all intervals, PINRW magnifies the wider PIs (interval widths are squared before averaging, so wide intervals are penalized more heavily). Case study results show that the new index can obtain on average narrower PIs with

Acknowledgments

This work was supported by the National Research Foundation programme grant NRF-2007EWT-CERP01-0954 (R-263-000-522-272). A. Khosravi would like to acknowledge the financial support of the Centre for Intelligent Systems Research (CISR) at Deakin University. The authors are grateful to the reviewers for their helpful comments and suggestions.


References (31)

  • A. Khosravi et al., Lower upper bound estimation method for construction of neural network-based prediction intervals, IEEE Trans. Neural Netw. (2011)

  • H. Hippert et al., Neural networks for short-term load forecasting: a review and evaluation, IEEE Trans. Power Syst. (2001)

  • K. Methaprayoon et al., An integration of ANN wind power estimation into unit commitment considering the forecasting uncertainty, IEEE Trans. Ind. Appl. (2007)

  • D. Srinivasan et al., Neural networks for real-time traffic signal control, IEEE Trans. Intell. Transp. Syst. (2006)

  • A. Jain et al., Statistical pattern recognition: a review, IEEE Trans. Pattern Anal. Mach. Intell. (2000)

Hao Quan obtained his B.E. degree in Water Conservancy and Hydropower Engineering from Huazhong University of Science and Technology (HUST), Wuhan, China, in 2008. He joined the National University of Singapore (NUS) in January 2011, and he is currently pursuing the Ph.D. degree in the Department of Electrical and Computer Engineering (ECE) at NUS. His main areas of interest are uncertainty modeling in distributed power systems, neural networks, evolutionary computation, unit commitment scheduling, etc.

Dipti Srinivasan received the M.E. and Ph.D. degrees in electrical engineering from the National University of Singapore (NUS) in 1991 and 1994, respectively. She worked at the University of California at Berkeley's Computer Science Division as a Postdoctoral Researcher from 1994 to 1995. In June 1995, she joined the faculty of the Electrical and Computer Engineering Department at the National University of Singapore, where she is an Associate Professor. From 1998 to 1999 she was a Visiting Faculty in the Department of Electrical and Computer Engineering at the Indian Institute of Science, Bangalore. Her research interest is in the development of hybrid neural network architectures, learning methods and their practical applications for large complex engineered systems, such as the electric power system and urban transportation systems. Dr. Srinivasan is currently serving as an Associate Editor of the IEEE Transactions on Neural Networks and the IEEE Transactions on Intelligent Transportation Systems. She was awarded the IEEE PES Outstanding Engineer award in 2010.

Abbas Khosravi received the B.Sc. degree in electrical engineering from Sharif University of Technology, Iran, in 2002, the M.Sc. degree in electrical engineering from Amirkabir University of Technology, Iran, in 2005, and the Ph.D. degree from Deakin University, Australia, in 2010. In 2006–2007, he was with the eXiT Group, University of Girona, Spain, conducting research in the field of artificial intelligence. Currently he is a research fellow in the Centre for Intelligent Systems Research (CISR) at Deakin University. His primary research interests include the theory and application of neural networks and fuzzy logic systems for modeling, analysis, control, and optimization of operations within complex systems. Dr. Khosravi is the recipient of the Alfred Deakin Postdoctoral Research Fellowship (2011).
