Temperature-Based Deep Boltzmann Machines

Abstract

Deep learning techniques have been paramount in recent years, mainly due to their outstanding results in a number of applications that range from speech recognition to face-based user identification. Among such techniques, Deep Boltzmann Machines (DBMs), which are composed of layers of Restricted Boltzmann Machines stacked on top of each other, are among the most widely used. In this work, we evaluate the concept of temperature in DBMs: temperature plays a key role in Boltzmann-related distributions, but it has never been considered in this context to date. The main contribution of this paper is therefore to take this information into account, as well as the impact of replacing the standard Sigmoid function with another one, and to evaluate their influence on DBMs in the task of binary image reconstruction. We expect this work to foster future research on the use of different temperatures during DBM learning.
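
The role of temperature can be made concrete with a small sketch. In a temperature-based model, each RBM layer's hidden-unit activation is a Sigmoid whose input is scaled by a temperature parameter, so that T = 1 recovers the standard case, lower values of T sharpen the activation, and higher values flatten it. The function names and the parameter T below are ours, for illustration only; this is a minimal sketch of the general idea, not the authors' implementation.

    import numpy as np

    def temperature_sigmoid(x, T=1.0):
        # Logistic sigmoid scaled by temperature T (T = 1 is the standard Sigmoid).
        return 1.0 / (1.0 + np.exp(-x / T))

    def hidden_activation(v, W, b, T=1.0):
        # P(h = 1 | v) for a Bernoulli RBM layer under temperature T.
        return temperature_sigmoid(v @ W + b, T)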

Notes

  1. Notice that \(\tilde{\mathbf{v}}\) and \(\tilde{\mathbf{h}}\) are obtained by sampling from \(\mathbf{h}\) and \(\tilde{\mathbf{v}}\), respectively (a code sketch of this sampling step follows these notes).

  2. http://yann.lecun.com/exdb/mnist/.

  3. The images are originally available in grayscale with a resolution of \(28\times 28\) pixels, but they were resized to \(14\times 14\).

  4. The original training set was reduced to \(2\%\) of its former size, which corresponds to 1200 images.

  5. https://people.cs.umass.edu/~marlin/data.shtml.

  6. https://archive.ics.uci.edu/ml/datasets/Semeion+Handwritten+Digit.

  7. Similar architectures have been commonly employed in the literature [12, 16, 24, 25, 34].

  8. Notice that all parameters and architectures have been empirically chosen [21].

  9. One sampling iteration was used for all learning algorithms.

  10. We did not fine-tune parameters using back-propagation, since the main goal of this paper is to show that temperature does affect the behavior of DBMs.
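
As a companion to notes 1 and 9, the sketch below illustrates one Gibbs sampling iteration for a single Bernoulli RBM layer at temperature T: \(\mathbf{h}\) is sampled from \(\mathbf{v}\), \(\tilde{\mathbf{v}}\) from \(\mathbf{h}\), and \(\tilde{\mathbf{h}}\) is computed from \(\tilde{\mathbf{v}}\). All names (W for weights, b and c for hidden and visible biases) are our assumptions for illustration, not the authors' code.

    import numpy as np

    rng = np.random.default_rng(0)

    def sigmoid_T(x, T=1.0):
        # Temperature-scaled logistic sigmoid (T = 1 is the standard case).
        return 1.0 / (1.0 + np.exp(-x / T))

    def sample_bernoulli(p):
        # Draw binary states from element-wise Bernoulli probabilities.
        return (rng.random(p.shape) < p).astype(float)

    def gibbs_step(v, W, b, c, T=1.0):
        # One sampling iteration (cf. notes 1 and 9): h is sampled from v,
        # v_tilde from h, and h_tilde is computed from v_tilde.
        h = sample_bernoulli(sigmoid_T(v @ W + b, T))
        v_tilde = sample_bernoulli(sigmoid_T(h @ W.T + c, T))
        h_tilde = sigmoid_T(v_tilde @ W + b, T)
        return h, v_tilde, h_tilde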

References

  1. Bekenstein JD (1973) Black holes and entropy. Phys Rev D 7(8):2333

  2. Beraldo e Silva L, Lima M, Sodré L, Perez J (2014) Statistical mechanics of self-gravitating systems: mixing as a criterion for indistinguishability. Phys Rev D 90(12):123004

  3. Duong CN, Luu K, Quach KG, Bui TD (2015) Beyond principal components: deep Boltzmann machines for face modeling. In: 2015 IEEE conference on computer vision and pattern recognition, CVPR'15, pp 4786–4794

  4. Gadjiev B, Progulova T (2015) Origin of generalized entropies and generalized statistical mechanics for superstatistical multifractal systems. In: International workshop on Bayesian inference and maximum entropy methods in science and engineering, vol 1641, pp 595–602

  5. Goh H, Thome N, Cord M, Lim JH (2012) Unsupervised and supervised visual codes with restricted Boltzmann machines. In: Computer vision—ECCV 2012: 12th European conference on computer vision, Florence, Italy, October 7–13, 2012, Part V. Springer, Berlin, pp 298–311. doi:10.1007/978-3-642-33715-4_22

  6. Gompertz B (1825) On the nature of the function expressive of the law of human mortality, and on a new mode of determining the value of life contingencies. Philos Trans R Soc Lond 115:513–583

  7. Gordon BL (2002) Maxwell–Boltzmann statistics and the metaphysics of modality. Synthese 133(3):393–417

  8. Hinton G, Salakhutdinov R (2006) Reducing the dimensionality of data with neural networks. Science 313(5786):504–507

  9. Hinton G, Salakhutdinov R (2011) Discovering binary codes for documents by learning deep generative models. Top Cogn Sci 3(1):74–91

  10. Hinton GE (2002) Training products of experts by minimizing contrastive divergence. Neural Comput 14(8):1771–1800

  11. Hinton GE (2012) A practical guide to training restricted Boltzmann machines. In: Montavon G, Orr GB, Müller K-R (eds) Neural networks: tricks of the trade, 2nd edn. Springer, Berlin, pp 599–619. doi:10.1007/978-3-642-35289-8_32

  12. Hinton GE, Osindero S, Teh YW (2006) A fast learning algorithm for deep belief nets. Neural Comput 18(7):1527–1554

  13. Kennedy J (2011) Particle swarm optimization. In: Sammut C, Webb GI (eds) Encyclopedia of machine learning. Springer, pp 760–766

  14. Larochelle H, Mandel M, Pascanu R, Bengio Y (2012) Learning algorithms for the classification restricted Boltzmann machine. J Mach Learn Res 13(1):643–669

  15. LeCun Y, Bengio Y, Hinton GE (2015) Deep learning. Nature 521:436–444

  16. Li G, Deng L, Xu Y, Wen C, Wang W, Pei J, Shi L (2016) Temperature based restricted Boltzmann machines. Sci Rep 6:19133

  17. Mendes G, Ribeiro M, Mendes R, Lenzi E, Nobre F (2015) Nonlinear Kramers equation associated with nonextensive statistical mechanics. Phys Rev E 91(5):052106

  18. Niven RK (2005) Exact Maxwell–Boltzmann, Bose–Einstein and Fermi–Dirac statistics. Phys Lett A 342(4):286–293

  19. Papa JP, Rosa GH, Costa KAP, Marana AN, Scheirer W, Cox DD (2015) On the model selection of Bernoulli restricted Boltzmann machines through harmony search. In: Proceedings of the genetic and evolutionary computation conference, GECCO'15. ACM, New York, pp 1449–1450

  20. Papa JP, Rosa GH, Marana AN, Scheirer W, Cox DD (2015) Model selection for discriminative restricted Boltzmann machines through meta-heuristic techniques. J Comput Sci 9:14–18

  21. Papa JP, Scheirer W, Cox DD (2016) Fine-tuning deep belief networks using harmony search. Appl Soft Comput 46:875–885

  22. Ranzato M, Boureau YL, LeCun Y (2008) Sparse feature learning for deep belief networks. In: Platt J, Koller D, Singer Y, Roweis S (eds) Advances in neural information processing systems, vol 20. Curran Associates Inc., Red Hook, pp 1185–1192

  23. Rényi A (1961) On measures of entropy and information. In: Neyman J (ed) Proceedings of the fourth Berkeley symposium on mathematical statistics and probability, vol 1, pp 547–561

  24. Salakhutdinov R, Hinton G (2009) Semantic hashing. Int J Approx Reason 50(7):969–978

  25. Salakhutdinov R, Hinton GE (2009) Deep Boltzmann machines. In: Proceedings of the twelfth international conference on artificial intelligence and statistics, AISTATS'09, pp 448–455

  26. Salakhutdinov R, Hinton GE (2012) An efficient learning procedure for deep Boltzmann machines. Neural Comput 24(8):1967–2006

  27. Schmidhuber J (2015) Deep learning in neural networks: an overview. Neural Netw 61:85–117

  28. Shim JW, Gatignol R (2010) Robust thermal boundary conditions applicable to a wall along which temperature varies in lattice-gas cellular automata. Phys Rev E 81(4):046703

  29. Smolensky P (1986) Information processing in dynamical systems: foundations of harmony theory. Technical report, DTIC Document

  30. Sohn K, Lee H, Yan X (2015) Learning structured output representation using deep conditional generative models. In: Cortes C, Lawrence ND, Lee DD, Sugiyama M, Garnett R (eds) Advances in neural information processing systems, vol 28. Curran Associates Inc., Red Hook, pp 3465–3473

  31. Srivastava N, Hinton GE, Krizhevsky A, Sutskever I, Salakhutdinov R (2014) Dropout: a simple way to prevent neural networks from overfitting. J Mach Learn Res 15(1):1929–1958

  32. Tieleman T (2008) Training restricted Boltzmann machines using approximations to the likelihood gradient. In: Proceedings of the 25th international conference on machine learning, ICML'08. ACM, New York, pp 1064–1071

  33. Tomczak JM, Gonczarek A (2016) Learning invariant features using subspace restricted Boltzmann machine. Neural Process Lett 9896:1–10

  34. Wicht B, Fischer A, Hennebert J (2016) On CPU performance optimization of restricted Boltzmann machine and convolutional RBM. In: IAPR workshop on artificial neural networks in pattern recognition. Springer, pp 163–174

  35. Wilcoxon F (1945) Individual comparisons by ranking methods. Biom Bull 1(6):80–83. doi:10.2307/3001968

Acknowledgements

The authors would like to thank FAPESP (Grants #2014/16250-9, #2014/12236-1, and #2016/19403-6), Capes, and CNPq (Grant #306166/2014-3).

Author information

Corresponding author

Correspondence to João Paulo Papa.

Cite this article

Passos, L.A., Papa, J.P. Temperature-Based Deep Boltzmann Machines. Neural Process Lett 48, 95–107 (2018). https://doi.org/10.1007/s11063-017-9707-2
