Abstract
In this chapter, we present three recurrent neural network architectures that we employ for the prediction of real-valued time series. All the models reviewed here can be trained with the previously discussed backpropagation through time procedure. First, we present the most basic architecture, the Elman recurrent neural network. Then, we introduce two popular gated architectures: the long short-term memory and the gated recurrent unit. We discuss the main advantages of these more sophisticated architectures, in particular their ability to capture much longer temporal dependencies by maintaining an internal memory over longer periods. For each of the reviewed networks, we provide the details and the equations for updating the internal state and computing the output at each time step. We also give a brief overview of each network's main applications in previous works on real-valued time series forecasting.
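The state update and output computation described above can be sketched for the simplest of the three architectures, the Elman network. The following is a minimal illustration only, not the chapter's implementation: the dimensions, random weights, and the function name `elman_step` are assumptions, and the standard Elman equations \(h_t = \tanh(W_{xh} x_t + W_{hh} h_{t-1} + b_h)\), \(y_t = W_{hy} h_t + b_y\) are used.

```python
import numpy as np

rng = np.random.default_rng(0)
n_in, n_hidden, n_out = 3, 4, 1  # assumed sizes, for illustration

W_xh = 0.1 * rng.standard_normal((n_hidden, n_in))      # input-to-hidden weights
W_hh = 0.1 * rng.standard_normal((n_hidden, n_hidden))  # recurrent weights
W_hy = 0.1 * rng.standard_normal((n_out, n_hidden))     # hidden-to-output weights
b_h = np.zeros(n_hidden)
b_y = np.zeros(n_out)

def elman_step(x_t, h_prev):
    """One time step: update the internal state, then compute the output."""
    h_t = np.tanh(W_xh @ x_t + W_hh @ h_prev + b_h)
    y_t = W_hy @ h_t + b_y
    return h_t, y_t

# Process a short real-valued sequence, carrying the state forward in time.
h = np.zeros(n_hidden)
for x in rng.standard_normal((5, n_in)):
    h, y = elman_step(x, h)
```

The gated architectures (LSTM, GRU) extend this single update with multiplicative gates that control what the state retains, which is what lets them hold information over much longer spans.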
Notes
1. The logistic sigmoid is defined as \(\sigma(x) = \frac{1}{1+e^{-x}}\).
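The footnote's definition translates directly to code. A minimal numpy version (the function name `sigmoid` is ours) squashes any real input into \((0, 1)\), which is why gated architectures use it for their gates:

```python
import numpy as np

def sigmoid(x):
    # Logistic sigmoid: sigma(x) = 1 / (1 + e^{-x}), maps R into (0, 1).
    return 1.0 / (1.0 + np.exp(-x))
```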
Copyright information
© 2017 The Author(s)
Cite this chapter
Bianchi, F.M., Maiorino, E., Kampffmeyer, M.C., Rizzi, A., Jenssen, R. (2017). Recurrent Neural Network Architectures. In: Recurrent Neural Networks for Short-Term Load Forecasting. SpringerBriefs in Computer Science. Springer, Cham. https://doi.org/10.1007/978-3-319-70338-1_3
Publisher Name: Springer, Cham
Print ISBN: 978-3-319-70337-4
Online ISBN: 978-3-319-70338-1