Transforming Auto-Encoders

Hinton, Geoffrey E.; Krizhevsky, Alex; Wang, Sida D.

doi:10.1007/978-3-642-21735-7_6

Part of the book series: Lecture Notes in Computer Science (LNTCS, volume 6791)

Included in the following conference series:

International Conference on Artificial Neural Networks

Abstract

The artificial neural networks that are used to recognize shapes typically use one or more layers of learned feature detectors that produce scalar outputs. By contrast, the computer vision community uses complicated, hand-engineered features, like SIFT [6], that produce a whole vector of outputs including an explicit representation of the pose of the feature. We show how neural networks can be used to learn features that output a whole vector of instantiation parameters and we argue that this is a much more promising way of dealing with variations in position, orientation, scale and lighting than the methods currently employed in the neural networks community. It is also more promising than the hand-engineered features currently used in computer vision because it provides an efficient way of adapting the features to the domain.
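The contrast the abstract draws between scalar outputs and vector outputs can be illustrated with a small sketch. This is not the method of the paper: the feature here is a fixed, hand-written template rather than a learned one, the only pose parameters are a row and a column, and every name (`scalar_feature`, `vector_feature`, `place`, `corner`) is hypothetical. It only shows what information a scalar detector discards and a vector-valued feature keeps.

```python
def correlate_at(image, template, r, c):
    """Sum of elementwise products of the template with the patch at (r, c)."""
    return sum(
        template[i][j] * image[r + i][c + j]
        for i in range(len(template))
        for j in range(len(template[0]))
    )


def all_responses(image, template):
    """Template response at every valid top-left position in the image."""
    rows = len(image) - len(template) + 1
    cols = len(image[0]) - len(template[0]) + 1
    return {(r, c): correlate_at(image, template, r, c)
            for r in range(rows) for c in range(cols)}


def scalar_feature(image, template):
    """Scalar output: the strongest response anywhere; position is discarded."""
    return max(all_responses(image, template).values())


def vector_feature(image, template):
    """Vector output: (strength, row, col) of the strongest response."""
    responses = all_responses(image, template)
    (r, c), strength = max(responses.items(), key=lambda kv: kv[1])
    return (strength, r, c)


def place(template, r, c, size=6):
    """Hypothetical helper: blank size-by-size image with the template at (r, c)."""
    image = [[0] * size for _ in range(size)]
    for i, row in enumerate(template):
        for j, value in enumerate(row):
            image[r + i][c + j] = value
    return image


# Hand-written toy feature (an assumption for illustration, not a learned one).
corner = [[1, 1],
          [1, 0]]
img_a = place(corner, 0, 1)
img_b = place(corner, 3, 2)

# The scalar detector cannot tell the two images apart.
print(scalar_feature(img_a, corner), scalar_feature(img_b, corner))
# The vector-valued feature also reports where the feature was found.
print(vector_feature(img_a, corner), vector_feature(img_b, corner))
```

In this toy setting the scalar detector returns the same value for both images, while the vector-valued feature additionally reports the position. The abstract's proposal goes further: the features and their instantiation parameters are learned, and the parameters can also cover orientation, scale and lighting.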

References

1. Berkes, P., Turner, R.E., Sahani, M.: A structured model of video reproduces primary visual cortical organisation. PLoS Computational Biology 5(9), 1–16 (2009)
2. Freeman, W., Adelson, E.: The design and use of steerable filters. IEEE Transactions on Pattern Analysis and Machine Intelligence 13(9), 891–906 (1991)
3. Hinton, G.E.: Shape representation in parallel systems. In: Proc. 7th International Joint Conference on Artificial Intelligence, vol. 2, pp. 1088–1096 (1981)
4. LeCun, Y., Bottou, L., Bengio, Y., Haffner, P.: Gradient-based learning applied to document recognition. Proceedings of the IEEE 86(11), 2278–2324 (1998)
5. Lee, H., Grosse, R., Ranganath, R., Ng, A.: Convolutional deep belief networks for scalable unsupervised learning of hierarchical representations. In: Proc. 26th International Conference on Machine Learning (2009)
6. Lowe, D.G.: Object recognition from local scale-invariant features. In: Proc. International Conference on Computer Vision (1999)
7. Memisevic, R., Hinton, G.: Learning to represent spatial transformations with factored higher-order Boltzmann machines. Neural Comp. 22, 1473–1492 (2010)
8. Nair, V., Hinton, G.E.: Rectified linear units improve restricted Boltzmann machines. In: Proc. 27th International Conference on Machine Learning (2010)
9. Pelli, D.G., Tillman, K.A.: The uncrowded window of object recognition. Nature Neuroscience 11, 1129–1135 (2008)
10. Ranzato, M., Huang, F., Boureau, Y., LeCun, Y.: Unsupervised learning of invariant feature hierarchies with applications to object recognition. In: Proc. Computer Vision and Pattern Recognition Conference (CVPR 2007). IEEE Press, Los Alamitos (2007)
11. Riesenhuber, M., Poggio, T.: Hierarchical models of object recognition in cortex. Nature Neuroscience 2, 1019–1025 (1999)
12. Zemel, R.S., Mozer, M.C., Hinton, G.E.: Traffic: Recognizing objects using hierarchical reference frame transformations. In: Touretzky, D.S. (ed.) Advances in Neural Information Processing Systems, pp. 266–273. Morgan Kaufmann, San Mateo (1990)

Author information

Authors and Affiliations

Department of Computer Science, University of Toronto, Canada
Geoffrey E. Hinton, Alex Krizhevsky & Sida D. Wang

Editor information

Editors and Affiliations

Department of Information and Computer Science, Aalto University School of Science, P.O. Box 15400, 00076, Aalto, Finland
Timo Honkela & Samuel Kaski
School of Physics, Astronomy and Informatics, Department of Informatics, Nicolaus Copernicus University, ul. Grudziadzka 5, 87-100, Torun, Poland
Włodzisław Duch
Department of Statistical Science, University College London, 1-19 Torrington Place, WC1E 7HB, London, UK
Mark Girolami

About this paper

Cite this paper

Hinton, G.E., Krizhevsky, A., Wang, S.D. (2011). Transforming Auto-Encoders. In: Honkela, T., Duch, W., Girolami, M., Kaski, S. (eds) Artificial Neural Networks and Machine Learning – ICANN 2011. ICANN 2011. Lecture Notes in Computer Science, vol 6791. Springer, Berlin, Heidelberg. https://doi.org/10.1007/978-3-642-21735-7_6

DOI: https://doi.org/10.1007/978-3-642-21735-7_6
Publisher Name: Springer, Berlin, Heidelberg
Print ISBN: 978-3-642-21734-0
Online ISBN: 978-3-642-21735-7
eBook Packages: Computer Science, Computer Science (R0)
