Deep belief improved bidirectional LSTM for multivariate time series forecasting

Keruo Jiang; Zhen Huang; Xinyan Zhou; Chudong Tong; Minjie Zhu; Heshan Wang

doi:10.3934/mbe.2023739

Mathematical Biosciences and Engineering

2023, Volume 20, Issue 9: 16596-16627. doi: 10.3934/mbe.2023739

Research article

1. State Grid Ningbo Electric Power Supply Company, Ningbo 315000, China
2. Faculty of Electrical Engineering & Computer Science, Ningbo University, Ningbo 315211, China
3. School of Electronic and Information Engineering, Zhengzhou University, Zhengzhou 450001, China

Received: 28 June 2023 Revised: 04 August 2023 Accepted: 07 August 2023 Published: 17 August 2023

Multivariate time series (MTS) play essential roles in daily life because most real-world time series datasets are multivariate and rich in time-dependent information. Traditional forecasting methods for MTS are time-consuming and subject to complicated limitations. One efficient method being explored for dynamical systems is the long short-term memory network (LSTM). However, existing MTS models only partially exploit the hidden spatial relationships in the data, and do not do so as effectively as LSTMs. Shallow LSTMs are inadequate for extracting features from high-dimensional MTS, whereas a multilayer bidirectional LSTM (BiLSTM) can learn more MTS features in both directions. This study proposes a novel improved BiLSTM network (DBI-BiLSTM) based on a deep belief network (DBN), a bidirectional propagation technique, and a chained structure. The deep structure consists of a DBN layer and multiple stacked BiLSTM layers, which increase the feature representation capacity of DBI-BiLSTM and allow the model to further learn the extended features in two directions. First, the input is processed by the DBN to obtain comprehensive features. Then, the known features, divided into clusters by a global sensitivity analysis method, are used as the inputs of every BiLSTM layer. Meanwhile, the outputs of the preceding shallower layer are combined with the clustered features to form the input signals for the next deeper layer. One-step-ahead prediction performance is illustrated on four real-world time series datasets. The simulation results confirm that DBI-BiLSTM not only outperforms traditional shallow artificial neural networks (ANNs), deep LSTMs, and some recently improved LSTMs, but also learns more features of the MTS data. Compared with the conventional LSTM, the percentage improvement of DBI-BiLSTM on the four MTS datasets is 85.41%, 75.47%, 61.66% and 30.72%, respectively.
Keywords:
- deep long short-term memory
- time series forecasting
- feature extraction
- deep belief network
Citation: Keruo Jiang, Zhen Huang, Xinyan Zhou, Chudong Tong, Minjie Zhu, Heshan Wang. Deep belief improved bidirectional LSTM for multivariate time series forecasting[J]. Mathematical Biosciences and Engineering, 2023, 20(9): 16596-16627. doi: 10.3934/mbe.2023739
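
The chained input structure described in the abstract can be sketched in a few lines of NumPy. This is only an illustrative forward pass under stated assumptions, not the authors' implementation: the DBN stage is replaced by random stand-in features, the sensitivity-based clustering is hard-coded as a hypothetical `clusters` list with one cluster per layer, "combined" is read as concatenation along the feature axis, and all weights are random and untrained. The names `LSTMLayer`, `BiLSTMLayer` and `chained_forward` are made up for this sketch.

```python
import numpy as np

rng = np.random.default_rng(0)

def sigmoid(x):
    return 1.0 / (1.0 + np.exp(-x))

class LSTMLayer:
    """Single-direction LSTM with random, untrained weights (illustration only)."""
    def __init__(self, n_in, n_hidden):
        self.n_hidden = n_hidden
        # One stacked matrix holding the input, forget, candidate and output gates.
        self.W = rng.normal(0.0, 0.1, size=(4 * n_hidden, n_in + n_hidden))
        self.b = np.zeros(4 * n_hidden)

    def forward(self, x):
        """x has shape (T, n_in); returns hidden states of shape (T, n_hidden)."""
        h = np.zeros(self.n_hidden)
        c = np.zeros(self.n_hidden)
        states = []
        for x_t in x:
            z = self.W @ np.concatenate([x_t, h]) + self.b
            i, f, g, o = np.split(z, 4)
            c = sigmoid(f) * c + sigmoid(i) * np.tanh(g)  # cell state update
            h = sigmoid(o) * np.tanh(c)                   # hidden state update
            states.append(h)
        return np.stack(states)

class BiLSTMLayer:
    """Runs one LSTM forward in time and one backward, then concatenates them."""
    def __init__(self, n_in, n_hidden):
        self.fwd = LSTMLayer(n_in, n_hidden)
        self.bwd = LSTMLayer(n_in, n_hidden)

    def forward(self, x):
        h_f = self.fwd.forward(x)
        h_b = self.bwd.forward(x[::-1])[::-1]  # reverse time, run, re-align
        return np.concatenate([h_f, h_b], axis=1)  # shape (T, 2 * n_hidden)

def chained_forward(features, clusters, n_hidden=8):
    """Assumed chaining: layer k sees cluster k plus the output of layer k-1."""
    prev = None
    for cols in clusters:
        x = features[:, cols]
        if prev is not None:
            x = np.concatenate([x, prev], axis=1)
        prev = BiLSTMLayer(x.shape[1], n_hidden).forward(x)
    return prev

T, F = 20, 6
features = rng.normal(size=(T, F))    # stand-in for DBN-extracted features
clusters = [[0, 1], [2, 3], [4, 5]]   # hypothetical sensitivity-based grouping
states = chained_forward(features, clusters)
print(states.shape)  # (20, 16)
```

In a trained model, a regression head on the final states would produce the one-step-ahead forecast; that part, along with the DBN pretraining and the sensitivity analysis itself, is omitted here.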


© 2023 the Author(s), licensee AIMS Press. This is an open access article distributed under the terms of the Creative Commons Attribution License (http://creativecommons.org/licenses/by/4.0)