Prediction of rainfall time series using modular soft computing methods
Introduction
An accurate and timely rainfall forecast is crucial for reservoir operation and flood prevention because it can extend the lead time of the flow forecast beyond the response time of the watershed, particularly for small and medium-sized mountainous basins.
Rainfall prediction is a very complex problem. Modeling rainfall time series with conventional approaches is far from trivial, since the hydrologic processes involve various inherently complex predictors, such as geomorphologic and climatic factors, that are still not well understood. As such, the artificial neural network becomes an attractive inductive approach to rainfall prediction owing to its high nonlinearity, flexibility and data-driven learning, which allow models to be built without any prior knowledge of catchment behavior and flow processes. Such models are based purely on information retrieved from hydro-meteorological data and act as black boxes.
Many studies have been conducted on the quantitative precipitation forecast (QPF) using diverse techniques including numerical weather prediction (NWP) models and remote sensing observations (Davolio et al., 2008, Diomede et al., 2008, Ganguly and Bras, 2003, Sheng et al., 2006, Yates et al., 2000), statistical models (Chan and Shi, 1999, Chu and He, 1995, DelSole and Shukla, 2002, Li and Zeng, 2008, Munot and Kumar, 2007, Nayagam et al., 2008), the chaos-based approach (Jayawardena and Lai, 1994), the non-parametric nearest-neighbors method (Toth et al., 2000), and soft computing-based methods including artificial neural networks (ANN), support vector regression (SVR) and fuzzy logic (FL) (Brath et al., 2002, Dorum et al., 2010, Guhathakurta, 2008, Nasseri et al., 2008, Pongracz et al., 2001, Sedki et al., 2009, Silverman and Dracup, 2000, Sivapragasam et al., 2001, Surajit and Goutami, 2007, Talei et al., 2010, Toth et al., 2000, Venkatesan et al., 1997). Contemporary studies have focused on soft computing-based methods, and several examples can be mentioned. Venkatesan et al. (1997) employed the ANN to predict the all India summer monsoon rainfall with different meteorological parameters as model inputs. Chattopadhyay and Chattopadhyay (2008a) constructed an ANN model to predict monsoon rainfall in India based on the rainfall series alone. Fuzzy logic theory was applied to monthly rainfall prediction by Pongracz et al. (2001). Toth et al. (2000) applied three time series models, auto-regressive moving average (ARMA), ANN and the k-nearest-neighbors (KNN) method, to short-term rainfall prediction. The results showed that the ANN gave the largest improvement in runoff forecasting accuracy when the predicted rainfall was used as input to the rainfall-runoff model. The ANN has also been applied to general circulation models (GCM). Chadwick et al. (2011) employed an artificial neural network approach to downscale GCM temperature and rainfall fields to regional model scale over Europe. Sachindra et al. (2011) developed a model with various soft computing techniques capable of statistically downscaling monthly GCM outputs to catchment scale monthly streamflows, accounting for climate change.
Recently, models based on combining concepts have received more attention in hydrologic forecasting. Depending on the combination method, combined models can be categorized into ensemble models and modular (or hybrid) models. The basic idea behind ensemble models is to build several different or similar models for the same process and to merge their outputs through a combining method (Abrahart and See, 2002, Kim et al., 2006, Shamseldin et al., 1997, Shamseldin and O'Connor, 1999, Xiong et al., 2001). For example, Xiong et al. (2001) used a Takagi-Sugeno fuzzy technique to combine several conceptual rainfall-runoff models. Coulibaly et al. (2005) employed an improved weighted-average method to coalesce forecasted daily reservoir inflows from the KNN model, conceptual model and ANN model. Kim et al. (2006) investigated five combining methods for improving ensemble streamflow prediction.
Physical processes in rainfall and/or runoff are generally composed of a number of sub-processes, so accurate modeling with a single global model is often not possible. Modular models are therefore proposed, in which sub-processes are first identified and separate models (also called local or expert models) are then established for each of them (Solomatine and Ostfeld, 2008). In these modular models, the split of training data can be soft or crisp. A soft split means that the subsets can overlap and the overall forecasting output is the weighted average of the local model outputs (Shrestha and Solomatine, 2006, Zhang and Govindaraju, 2000, Wang et al., 2006, Wu et al., 2008). Zhang and Govindaraju (2000) examined the performance of modular networks in predicting monthly discharges based on the Bayesian concept. Wu et al. (2008) employed a distributed SVR for daily river stage prediction. By contrast, there is no overlap of data in a crisp split, and the final forecasting output is generated explicitly from one of the local models (Corzo and Solomatine, 2007, Jain and Srinivasulu, 2006, See and Openshaw, 2000, Sivapragasam and Liong, 2005, Solomatine and Xue, 2004). Solomatine and Xue (2004) used M5 model trees and neural networks in a flood-forecasting problem. Sivapragasam and Liong (2005) divided the flow range into three regions, and employed different SVR models to predict daily flows in high, medium and low regions.
Apart from the adoption of the modular model, predictions may also be improved by suitable data-preprocessing techniques. Besides the conventional rescaling or standardization of training data, preprocessing methods from the perspective of signal analysis are also crucial because a rainfall time series may be viewed as a quasi-periodic signal contaminated by various noises. Hence techniques such as singular spectrum analysis (SSA) were recently introduced to the hydrology field by some researchers (Marques et al., 2006, Partal and Kişi, 2007, Sivapragasam et al., 2001). Sivapragasam et al. (2001) established a hybrid model of support vector machine (SVM) and SSA for rainfall and runoff predictions. The hybrid model considerably improved performance in comparison with the original SVM model. The application of wavelet analysis to precipitation was undertaken by Partal and Kişi (2007), whose results indicated that wavelet analysis was highly promising. In addition, the issue of lagged predictions in the ANN model was mentioned by some researchers (Dawson and Wilby, 2001, Jain and Srinivasulu, 2004, De Vos and Rientjes, 2005, Muttil and Chau, 2006). A main reason for lagged predictions was the use of previously observed data as ANN inputs (De Vos and Rientjes, 2005). An effective solution was to obtain new model inputs by applying a moving average (MA) to the original data series.
The scope of this study was to investigate the effect of the moving average (MA) and SSA as data-preprocessing techniques and to couple them with modular models to improve model performance for rainfall prediction. The modular model included three local models associated with three crisp subsets (low-, medium- and high-intensity rainfall) clustered by the fuzzy C-means (FCM) method. The ANN was first used to choose the data-preprocessing method from MA and SSA. Depending on the selected data-preprocessing technique, modular models were then employed to perform rainfall prediction. Generally, the ANN is very efficient in processing large training samples due to its parallel information processing configuration. Its biggest drawback is that the model outputs are variable because of the random initialization of weights and biases. The SVR generalizes well and gives more stable model outputs. However, it is suitable only for a small training sample (e.g. below 200) because the training time increases exponentially with the size of the training sample. For the current rainfall data, the majority of subsets after the data split are small samples, except for the low-intensity daily rainfall. Therefore, three local SVRs (hereafter referred to as MSVR) were employed for monthly rainfall data, whereas two local SVRs and one ANN (hereafter referred to as ANN-SVR) were adopted for daily rainfall data. For the daily rainfall records, the low-intensity subset was modeled by the ANN because it made up the overwhelming majority of the training data. For comparison, the global ANN and the persistence model were used as benchmarks. To ensure the generality of this study, four cases consisting of two monthly rainfall series and two daily rainfall series from India and China were explored.
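The crisp three-way split described above can be illustrated with a minimal sketch. It is not the authors' implementation: it assumes clustering on rainfall magnitude alone, a fuzzifier of 2 and evenly spaced initial centres, and the function names `fuzzy_c_means_1d` and `crisp_split` are hypothetical. Because FCM membership decreases with distance to a centre, assigning each value to its nearest centre is equivalent to picking the cluster with the largest membership.

```python
def fuzzy_c_means_1d(values, n_clusters=3, fuzzifier=2.0, n_iter=200, tol=1e-6):
    """Fuzzy C-means on scalar values; returns the cluster centres.

    Assumptions (not from the paper): 1-D data, fuzzifier 2, deterministic
    initial centres spaced evenly between the minimum and maximum value.
    """
    lo, hi = min(values), max(values)
    if lo == hi:
        raise ValueError("values must not all be identical")
    centres = [lo + (hi - lo) * (j + 0.5) / n_clusters for j in range(n_clusters)]
    power = 2.0 / (fuzzifier - 1.0)
    for _ in range(n_iter):
        memberships = []
        for x in values:
            dists = [abs(x - c) for c in centres]
            if 0.0 in dists:
                # The point sits exactly on a centre: full membership there.
                hit = dists.index(0.0)
                row = [1.0 if j == hit else 0.0 for j in range(n_clusters)]
            else:
                inv = [d ** (-power) for d in dists]
                total = sum(inv)
                row = [v / total for v in inv]
            memberships.append(row)
        # Centre update: mean of the data weighted by membership ** fuzzifier.
        new_centres = []
        for j in range(n_clusters):
            weights = [row[j] ** fuzzifier for row in memberships]
            weighted_sum = sum(w * x for w, x in zip(weights, values))
            new_centres.append(weighted_sum / sum(weights))
        shift = max(abs(a - b) for a, b in zip(centres, new_centres))
        centres = new_centres
        if shift < tol:
            break
    return centres


def crisp_split(values, centres):
    """Label each value 0 (low), 1 (medium) or 2 (high) by its nearest centre."""
    ordered = sorted(centres)
    return [min(range(len(ordered)), key=lambda j: abs(x - ordered[j]))
            for x in values]


# Illustrative rainfall depths (made-up numbers, not the study data).
rain = [0.0, 1.0, 2.0, 1.5, 20.0, 22.0, 21.0, 60.0, 65.0, 62.0]
labels = crisp_split(rain, fuzzy_c_means_1d(rain))
print(labels)
```

Each label would then route a sample to its own local model (an SVR or ANN in the study), so exactly one local model produces the final forecast for that sample.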
Moving average (MA)
The moving average method smooths data by replacing each data point with the average of the K neighboring data points, where K may be called the length of the memory window. The basic idea behind the method is that any large irregular component at any point in time will exert a smaller effect if we average the point with its immediate neighbors (Newbold et al., 2003). The most common moving average method is the unweighted moving average, in which each value of the data carries the same weight in
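An unweighted moving average with memory window K can be sketched as follows. This is an illustration only: it assumes a trailing (backward-looking) window so that only current and past values are used, which suits forecasting inputs, and it averages over fewer points at the start of the series. The function name `moving_average` and the sample values are hypothetical.

```python
def moving_average(series, k):
    """Unweighted trailing moving average with memory window k.

    Each output point is the mean of the current value and the k - 1
    values before it (fewer at the start of the series).
    """
    if k < 1:
        raise ValueError("k must be at least 1")
    smoothed = []
    window_sum = 0.0
    for i, value in enumerate(series):
        window_sum += value
        if i >= k:
            # Drop the value that has just left the window.
            window_sum -= series[i - k]
        smoothed.append(window_sum / min(i + 1, k))
    return smoothed


rain = [0.0, 12.0, 3.0, 0.0, 30.0, 6.0]
print(moving_average(rain, 3))  # [0.0, 6.0, 5.0, 5.0, 11.0, 12.0]
```

The isolated 30.0 peak is reduced to 11.0 in the smoothed series, which shows how a large irregular component exerts a smaller effect after averaging.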
Case study
Two daily mean rainfall series (at Zhenwan and Wuxi raingauge stations, respectively) from Zhenshui and Da'ninghe watersheds of China, and two monthly mean rainfall series from India and Zhongxian raingauge station of China, were analyzed in this study.
The Zhenshui basin is located in the north of Guangdong province and adjoined by Hunan province and Jiangxi province. The basin belongs to a second-order tributary of the Pearl River and has an area of 7554 km². The daily rainfall time series data
Decomposition of rainfall data
The decomposition of the daily average rainfall series requires identifying the window length m (or the singular number) if the interval of neighboring points in the discrete time series is taken as the lag time (i.e. τ = 1 day for daily rainfall data or 1 month for monthly rainfall data). A reasonable value of m should give rise to a clear resolution of the original signal. The present study does not need to accurately resolve any trends or oscillations in the raw rainfall signal. A rough
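The role of the window length m can be seen in a minimal sketch of basic SSA with a lag of one sampling step: embedding into a trajectory matrix, singular value decomposition, and diagonal averaging. This is a textbook form under stated assumptions, not the authors' code; it relies on NumPy, and the name `ssa_decompose` and the synthetic series are hypothetical.

```python
import numpy as np


def ssa_decompose(series, m):
    """Split a series into m additive components using basic SSA.

    m is the window length; the lag between neighbouring points is one
    sampling step. Returns (components, singular_values), where
    components has shape (m, len(series)) and its rows sum to the series.
    """
    x = np.asarray(series, dtype=float)
    n = x.size
    if not 2 <= m <= n // 2:
        raise ValueError("window length m must satisfy 2 <= m <= n // 2")
    k = n - m + 1
    # Trajectory matrix: column j holds the lagged window x[j : j + m].
    traj = np.column_stack([x[j:j + m] for j in range(k)])
    u, s, vt = np.linalg.svd(traj, full_matrices=False)
    components = np.zeros((len(s), n))
    for i in range(len(s)):
        elementary = s[i] * np.outer(u[:, i], vt[i])
        # Diagonal averaging: entries with row + col == t map back to x[t].
        flipped = elementary[::-1]
        for t in range(n):
            components[i, t] = flipped.diagonal(t - (m - 1)).mean()
    return components, s


# Synthetic signal (made-up): a 12-step cycle plus a linear trend.
t = np.arange(48)
signal = np.sin(2 * np.pi * t / 12) + 0.05 * t
components, singular_values = ssa_decompose(signal, m=6)
print(np.allclose(components.sum(axis=0), signal))
```

Summing only a subset of the components gives a filtered series, which is how SSA would serve as a preprocessing step before model training.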
Results
The overall performance of each model in terms of RMSE, CE, and PI is presented in Table 2 for the two monthly rainfall series and Table 3 for the two daily rainfall series. The two benchmark models, persistence and ANN, performed very poorly in all four cases except India. The performances of ANN-MA and ANN-SSA indicate that the data-preprocessing methods resulted in considerable improvement in the accuracy of the rainfall forecasting. Moreover, the MA seems
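The three scores can be computed as in the sketch below. The snippet above does not define them, so the code assumes the commonly used forms: root mean square error, the Nash-Sutcliffe coefficient of efficiency (CE), and the persistence index (PI), which compares the model against a forecast that simply repeats the last observation. Function names and sample values are hypothetical.

```python
import math


def rmse(obs, pred):
    """Root mean square error between observations and predictions."""
    return math.sqrt(sum((o - p) ** 2 for o, p in zip(obs, pred)) / len(obs))


def coefficient_of_efficiency(obs, pred):
    """Nash-Sutcliffe CE: 1 is perfect, 0 matches predicting the mean."""
    mean_obs = sum(obs) / len(obs)
    sse = sum((o - p) ** 2 for o, p in zip(obs, pred))
    spread = sum((o - mean_obs) ** 2 for o in obs)
    return 1.0 - sse / spread


def persistence_index(obs, pred, lead=1):
    """PI: 1 is perfect, 0 matches the persistence forecast at this lead."""
    steps = range(lead, len(obs))
    sse = sum((obs[t] - pred[t]) ** 2 for t in steps)
    naive = sum((obs[t] - obs[t - lead]) ** 2 for t in steps)
    return 1.0 - sse / naive


observed = [2.0, 4.0, 6.0, 8.0]
persistence_forecast = [2.0, 2.0, 4.0, 6.0]  # previous observation repeated
print(persistence_index(observed, persistence_forecast))  # 0.0
```

A PI near zero for a trained model therefore signals that it is doing no better than the persistence benchmark, which is the symptom associated with lagged predictions.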
Conclusions
The purpose of this study was to investigate the effect of modular models coupled with data-preprocessing techniques in improving the accuracy of rainfall forecasting. The modular models consisted of three local SVR and/or ANN models. A three-layer feed-forward ANN was used to examine two data-preprocessing techniques, MA and SSA. Results show that the MA was superior to the SSA. Four rainfall records, India, Zhongxian, Wuxi and Zhenwan, from India and China, were used as testing cases.
With the help of
References (66)
- et al. Input determination for neural network models in water resources applications: Part 1—background and methodology. J. Hydrol. (2005)
- et al. Optimized scenario for rainfall forecasting using genetic algorithm coupled with artificial neural network. Expert Syst. Appl. (2010)
- et al. Integrated approach to model decomposed flow hydrograph using artificial neural network and conceptual techniques. J. Hydrol. (2006)
- et al. Analysis and prediction of chaos in rainfall and stream-flow time-series. J. Hydrol. (1994)
- Constructing neural network sediment estimation models using a data-driven algorithm. Math. Comput. Simulation (2008)
- et al. Singular spectral analysis and forecasting of hydrological time series. Phys. Chem. Earth (2006)
- et al. Optimized scenario for rainfall forecasting using genetic algorithm coupled with artificial neural network. Expert Syst. Appl. (2008)
- et al. Wavelet and Neuro-fuzzy conjunction model for precipitation forecasting. J. Hydrol. (2007)
- et al. Fuzzy rule-based prediction of monthly precipitation. Phys. Chem. Earth Part B-Hydrol. Oceans Atmos. (2001)
- et al. Evolving neural network using real coded genetic algorithm for daily rainfall–runoff forecasting. Expert Syst. Appl. (2009)
- Methods for combining the outputs of different rainfall-runoff models. J. Hydrol.
- A novel application of a neuro-fuzzy computational technique in event-based rainfall–runoff modeling. Expert Syst. Appl.
- Comparison of short-term rainfall prediction models for real-time flood forecasting. J. Hydrol.
- Singular-spectrum analysis: a toolkit for short, noisy and chaotic signals. Physica D
- Forecasting daily streamflow using hybrid ANN models. J. Hydrol.
- River stage prediction based on a distributed support vector regression. J. Hydrol.
- A non-linear combination of the forecasts of rainfall-runoff models by the first-order Takagi-Sugeno fuzzy system. J. Hydrol.
- Support vector regression for real-time flood stage forecasting. J. Hydrol.
- Multi-model data fusion for river flow forecasting: an evaluation of six alternative methods based on two contrasting catchment. Hydrol. Earth Syst. Sci.
- Neural networks and non-parametric methods for improving real time flood forecasting through conceptual hydrological models. Hydrol. Earth Syst. Sci.
- An artificial neural network technique for downscaling GCM outputs to RCM spatial scale. Nonlinear Processes Geophys.
- Prediction of the summer monsoon rainfall over South China. Int. J. Climatol.
- Identification of the best hidden layer size for three-layered neural net in predicting monsoon rainfall in India. J. Hydroinformatics
- Comparative study among different neural net learning algorithms applied to rainfall time series. Meteorol. Appl.
- Neural networks: a review from a statistical perspective. Stat. Sci.
- Long-range prediction of Hawaiian winter rainfall using canonical correlation-analysis. Int. J. Climatol.
- Baseflow separation techniques for modular artificial neural network modelling in flow forecasting. Hydrol. Sci. J.
- Improving daily reservoir inflow forecasts with model combination. J. Hydrol. Eng.
- Hydrological modeling using artificial neural networks. Prog. Phys. Geography
- A meteo-hydrological prediction system based on a multi-model approach for precipitation forecasting. Nat. Hazards Earth Syst. Sci.
- Linear prediction of Indian monsoon rainfall. J. Climate
- Constraints of artificial neural networks for rainfall-runoff modeling: trade-offs in hydrological state representation and model evaluation. Hydrol. Earth Syst. Sci.
- Discharge prediction based on multi-model precipitation forecasts. Meteorol. Atmos. Phys.