Abstract
The Restricted Boltzmann Machine (RBM) is a kind of stochastic neural network that can be used as a basic building block for deep architectures. Since Hinton addressed the problem of computational inefficiency with a so-called greedy layer-wise unsupervised pre-training algorithm, deep learning has attracted much more attention and has achieved significant success in areas such as speech recognition, object recognition, and natural language processing. In addition to initializing deep networks, RBMs can also be used to learn features from raw data. In this paper, we propose a method for learning more discriminative features with RBMs, based on a novel objective function. We test our idea on the MNIST handwritten digit dataset. In our experiments, the features learnt by the RBM were fed to a multinomial logistic regression classifier, and the results show that our objective function yields much higher classification accuracy.
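The pipeline described in the abstract (RBM features fed to a multinomial logistic regression) can be illustrated with a minimal sketch. It is not the objective function proposed in the paper: it assumes a plain binary RBM trained with standard one-step contrastive divergence (CD-1), and it uses a tiny hand-made binary dataset in place of MNIST. All names (`TinyRBM`, `train_softmax`, `predict`) and all hyperparameters are hypothetical and chosen only for illustration.

```python
import math
import random


def sigmoid(x):
    return 1.0 / (1.0 + math.exp(-x))


def softmax(z):
    m = max(z)  # subtract the max for numerical stability
    e = [math.exp(x - m) for x in z]
    s = sum(e)
    return [x / s for x in e]


class TinyRBM:
    """Binary RBM trained with standard CD-1 (not the paper's objective)."""

    def __init__(self, n_visible, n_hidden, seed=0):
        self.rng = random.Random(seed)
        self.nv, self.nh = n_visible, n_hidden
        # W[i][j] couples visible unit i to hidden unit j.
        self.W = [[self.rng.gauss(0.0, 0.01) for _ in range(n_hidden)]
                  for _ in range(n_visible)]
        self.b = [0.0] * n_visible  # visible biases
        self.c = [0.0] * n_hidden   # hidden biases

    def hidden_probs(self, v):
        """p(h_j = 1 | v); these activations are used as features."""
        return [sigmoid(self.c[j] + sum(v[i] * self.W[i][j] for i in range(self.nv)))
                for j in range(self.nh)]

    def visible_probs(self, h):
        return [sigmoid(self.b[i] + sum(h[j] * self.W[i][j] for j in range(self.nh)))
                for i in range(self.nv)]

    def cd1_update(self, v0, lr=0.1):
        ph0 = self.hidden_probs(v0)
        h0 = [1.0 if self.rng.random() < p else 0.0 for p in ph0]  # sample hidden
        v1 = self.visible_probs(h0)   # mean-field reconstruction
        ph1 = self.hidden_probs(v1)
        for i in range(self.nv):
            for j in range(self.nh):
                self.W[i][j] += lr * (v0[i] * ph0[j] - v1[i] * ph1[j])
            self.b[i] += lr * (v0[i] - v1[i])
        for j in range(self.nh):
            self.c[j] += lr * (ph0[j] - ph1[j])


def train_softmax(xs, ys, n_classes, epochs=200, lr=0.5):
    """Multinomial logistic regression fitted by per-sample gradient descent."""
    d = len(xs[0])
    W = [[0.0] * d for _ in range(n_classes)]
    b = [0.0] * n_classes
    for _ in range(epochs):
        for x, y in zip(xs, ys):
            p = softmax([b[k] + sum(W[k][t] * x[t] for t in range(d))
                         for k in range(n_classes)])
            for k in range(n_classes):
                g = p[k] - (1.0 if k == y else 0.0)  # cross-entropy gradient
                b[k] -= lr * g
                for t in range(d):
                    W[k][t] -= lr * g * x[t]
    return W, b


def predict(W, b, x):
    scores = [b[k] + sum(w * xi for w, xi in zip(W[k], x)) for k in range(len(b))]
    return max(range(len(scores)), key=scores.__getitem__)


# Toy stand-in for MNIST: two classes of 6-pixel binary patterns.
data = [[1, 1, 1, 0, 0, 0], [1, 1, 0, 0, 0, 0],
        [0, 0, 0, 1, 1, 1], [0, 0, 0, 0, 1, 1]]
labels = [0, 0, 1, 1]

rbm = TinyRBM(n_visible=6, n_hidden=2, seed=0)
for _ in range(100):          # unsupervised feature learning
    for v in data:
        rbm.cd1_update(v)

features = [rbm.hidden_probs(v) for v in data]   # learned representation
W, b = train_softmax(features, labels, n_classes=2)
predictions = [predict(W, b, f) for f in features]
accuracy = sum(p == y for p, y in zip(predictions, labels)) / len(labels)
print("training accuracy on toy data:", accuracy)
```

In this sketch the classifier never sees the raw pixels, only the hidden-unit probabilities, so its accuracy reflects how discriminative the learned features are. That is the quantity the abstract reports as improving under the proposed objective.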
Acknowledgment
This work is supported by the National Natural Science Foundation of China (Nos. 61425002, 61572093, 61402066, 61402067, 31370778, 61370005), the Program for Changjiang Scholars and Innovative Research Team in University (No. IRT_15R07), the Program for Liaoning Innovative Research Team in University (No. LT2015002), the Basic Research Program of the Key Lab in Liaoning Province Educational Department (Nos. LZ2014049, LZ2015004), the Natural Science Foundation of Liaoning Province (No. 2014020132), the Scientific Research Fund of Liaoning Provincial Education (Nos. L2015015, L2014499), the Liaoning BaiQianWan Talents Program (No. 2013921007), and the Program for Liaoning Key Lab of Intelligent Information Processing and Network Technology in University.
Copyright information
© 2016 Springer International Publishing AG
About this paper
Cite this paper
Guo, S., Zhou, C., Wang, B., Zhou, S. (2016). A Method of Discriminative Features Extraction for Restricted Boltzmann Machines. In: Yin, H., et al. Intelligent Data Engineering and Automated Learning – IDEAL 2016. IDEAL 2016. Lecture Notes in Computer Science, vol 9937. Springer, Cham. https://doi.org/10.1007/978-3-319-46257-8_23
DOI: https://doi.org/10.1007/978-3-319-46257-8_23
Publisher Name: Springer, Cham
Print ISBN: 978-3-319-46256-1
Online ISBN: 978-3-319-46257-8
eBook Packages: Computer Science (R0)