Abstract
We introduce a learning architecture that serves compression while also satisfying the constraints of factored reinforcement learning. Our novel Cartesian factors make it possible to decrease the number of variables relevant to the ongoing task, an exponential gain in the size of the state space. We demonstrate the workings, the limitations, and the promise of these abstractions: we develop a representation of space in allothetic coordinates from egocentric observations and argue that the lower-dimensional allothetic representation can be used for path planning. Our results on learning Cartesian factors indicate that (a) shallow autoencoders perform well in our numerical example, and (b) if deeper networks are needed, e.g., for classification or regression, then sparsity should also be enforced at (some of) the intermediate layers.
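To make the abstract's point (b) concrete, the following is a minimal sketch of how sparsity can be enforced at an intermediate layer of an autoencoder, in the winner-take-all / k-sparse style: after a linear encoding, only the k largest-magnitude hidden activations are kept and the rest are zeroed before decoding. All names, sizes, and the choice of a purely linear encoder/decoder are illustrative assumptions, not the paper's actual architecture.

```python
import numpy as np

def k_sparse_forward(x, W_enc, W_dec, k):
    """One forward pass of a k-sparse autoencoder layer:
    keep only the k largest-magnitude hidden activations per sample,
    zero out the rest, then decode."""
    h = x @ W_enc                                 # linear encoder
    drop = np.argsort(np.abs(h), axis=1)[:, :-k]  # indices of the units to silence
    h_sparse = h.copy()
    np.put_along_axis(h_sparse, drop, 0.0, axis=1)
    x_hat = h_sparse @ W_dec                      # linear decoder
    return h_sparse, x_hat

rng = np.random.default_rng(0)
x = rng.normal(size=(8, 16))                      # toy batch: 8 samples, 16 dims
W_enc = rng.normal(scale=0.1, size=(16, 32))      # overcomplete hidden layer
W_dec = rng.normal(scale=0.1, size=(32, 16))

h, x_hat = k_sparse_forward(x, W_enc, W_dec, k=4)
# at most k hidden units remain active for each sample
```

In a deeper network, the same top-k selection would be applied at one or more intermediate layers, which is the kind of enforced sparsity the abstract argues for.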
Acknowledgments
This work was supported by the EIT Digital grant (Grant No. 16257). Helpful comments from Gábor Szirtes are gratefully acknowledged.
Copyright information
© 2016 Springer International Publishing Switzerland
Cite this paper
Lőrincz, A., Sárkány, A., Milacski, Z.Á., Tősér, Z. (2016). Estimating Cartesian Compression via Deep Learning. In: Steunebrink, B., Wang, P., Goertzel, B. (eds) Artificial General Intelligence. AGI 2016. Lecture Notes in Computer Science(), vol 9782. Springer, Cham. https://doi.org/10.1007/978-3-319-41649-6_30
Print ISBN: 978-3-319-41648-9
Online ISBN: 978-3-319-41649-6