Abstract
Accurate forecasting of water consumption is the basis of water resources planning and management. However, fluctuations in water consumption are difficult to predict because the series are non-stationary and non-linear. In this paper, a multiple random forests model that integrates the wavelet transform with random forests regression (W-RFR) is proposed for predicting daily urban water consumption in southwest China. The raw time series were first decomposed into low- and high-frequency subseries using the discrete wavelet transform (DWT). Random forests regression (RFR) was then used to forecast each subseries. The inputs and outputs of the RFR model for each subseries were constructed from the delay time and the embedding dimension of the attractor reconstruction, both computed by the C-C method. The forecasts of the subseries were then summed to give the final result. Four performance criteria, namely the correlation coefficient (R), mean absolute percentage error (MAPE), normalized root mean square error (NRMSE) and threshold statistic (TS), were used to evaluate the forecasting capacity of the W-RFR. The results indicated that the W-RFR can capture the basic dynamics of daily urban water consumption. The forecasting performance of the proposed approach was also compared with that of two other models, the RFR and feed-forward neural network (FFNN) models. The proposed model gave the most precise predictions of the three, which is attributed to good feature extraction from a multi-scale perspective and the favorable feature-learning performance of the decision trees.
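The four performance criteria named above can be illustrated with a minimal Python sketch. The abstract does not give the formulas, so the definitions below are assumptions based on common usage: NRMSE is normalized here by the mean of the observations (other normalizers such as the range or standard deviation are also used), and TS is taken as the percentage of forecasts whose absolute relative error falls below a chosen level. The function names and sample values are hypothetical and are not taken from the paper.

```python
import math


def correlation(obs, pred):
    """Pearson correlation coefficient R between observed and forecast values."""
    n = len(obs)
    mean_o = sum(obs) / n
    mean_p = sum(pred) / n
    cov = sum((o - mean_o) * (p - mean_p) for o, p in zip(obs, pred))
    sd_o = math.sqrt(sum((o - mean_o) ** 2 for o in obs))
    sd_p = math.sqrt(sum((p - mean_p) ** 2 for p in pred))
    return cov / (sd_o * sd_p)


def mape(obs, pred):
    """Mean absolute percentage error, in percent (observations must be non-zero)."""
    return 100.0 * sum(abs((o - p) / o) for o, p in zip(obs, pred)) / len(obs)


def nrmse(obs, pred):
    """RMSE divided by the mean observation (assumed normalization)."""
    n = len(obs)
    rmse = math.sqrt(sum((o - p) ** 2 for o, p in zip(obs, pred)) / n)
    return rmse / (sum(obs) / n)


def threshold_statistic(obs, pred, level):
    """Percentage of forecasts with absolute relative error below `level` percent
    (assumed definition of TS)."""
    hits = sum(1 for o, p in zip(obs, pred) if abs((o - p) / o) * 100.0 < level)
    return 100.0 * hits / len(obs)


if __name__ == "__main__":
    # Hypothetical daily consumption values and forecasts, for illustration only.
    observed = [100.0, 200.0, 300.0, 400.0]
    forecast = [110.0, 190.0, 300.0, 420.0]
    print("R     =", correlation(observed, forecast))
    print("MAPE  =", mape(observed, forecast))
    print("NRMSE =", nrmse(observed, forecast))
    print("TS(6) =", threshold_statistic(observed, forecast, 6.0))
```

Under these assumed definitions, higher R and TS and lower MAPE and NRMSE indicate better forecasts.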







Acknowledgements
This work is supported by the Project in the National Science and Technology Pillar Programme during the Twelfth Five-year Plan Period (2012BAJ25B06-003) and the Key Project of University Natural Science Research of Anhui, China (KJ2016A168).
Ethics declarations
Conflict of Interest
The authors declare no conflict of interest.
Cite this article
Chen, G., Long, T., Xiong, J. et al. Multiple Random Forests Modelling for Urban Water Consumption Forecasting. Water Resour Manage 31, 4715–4729 (2017). https://doi.org/10.1007/s11269-017-1774-7