DOI: 10.1145/2908812.2908890
Research article · Open access

Convolution by Evolution: Differentiable Pattern Producing Networks

Published: 20 July 2016

ABSTRACT

In this work we introduce a differentiable version of the Compositional Pattern Producing Network (CPPN), called the DPPN. Unlike a standard CPPN, a DPPN's topology is evolved while its weights are learned. A Lamarckian algorithm that combines evolution and learning produces DPPNs that reconstruct an image. Our main result is that DPPNs can be evolved and trained to compress the weights of a denoising autoencoder from 157,684 to roughly 200 parameters, while achieving reconstruction accuracy comparable to that of a fully connected network with more than two orders of magnitude more parameters. The regularization ability of the DPPN allows it to rediscover (approximate) convolutional network architectures embedded within a fully connected architecture. Such convolutional architectures are the current state of the art for many computer vision applications, so it is satisfying that DPPNs can discover this structure rather than having it built in by design. When trained on MNIST and tested on the Omniglot dataset, DPPNs generalize better than directly encoded fully connected autoencoders. DPPNs are therefore a new framework for integrating learning and evolution.
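
The differentiable half of this idea is easiest to see with the topology held fixed: a DPPN then reduces to a small network of CPPN-style activation functions that maps coordinates to values and is trained end to end by gradient descent. The NumPy snippet below is a minimal sketch under that assumption, not the authors' implementation; the layer sizes, the sine activation, the synthetic target image, and all variable names are illustrative choices of our own.

```python
# Minimal sketch, assuming a fixed topology: a one-hidden-layer DPPN that
# maps pixel coordinates (x, y) to an intensity, trained by gradient
# descent to reconstruct a small image. Sizes and activations are
# assumptions for illustration, not details taken from the paper.
import numpy as np

rng = np.random.default_rng(0)
H = 16                                    # hidden units (assumed size)

# Coordinate grid for a 28x28 image, scaled to [-1, 1].
xs = np.linspace(-1.0, 1.0, 28)
gx, gy = np.meshgrid(xs, xs)
X = np.stack([gx.ravel(), gy.ravel()], axis=1)        # (784, 2)

# Synthetic target image: a filled disc (stand-in for an MNIST digit).
target = ((X ** 2).sum(axis=1) < 0.5).astype(float)   # (784,)

# DPPN parameters: coords -> sin hidden layer -> sigmoid intensity.
W1 = rng.normal(0.0, 1.0, (2, H))
b1 = np.zeros(H)
w2 = rng.normal(0.0, 0.1, H)
b2 = 0.0

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

lr, n = 0.5, X.shape[0]
for step in range(2001):
    # Forward pass: query the DPPN at every pixel coordinate.
    Z1 = X @ W1 + b1                      # (784, H)
    Hh = np.sin(Z1)                       # CPPN-style periodic activation
    y = sigmoid(Hh @ w2 + b2)             # (784,) predicted intensities

    # Mean-squared reconstruction error and hand-derived gradients.
    err = y - target
    loss = (err ** 2).mean()
    d2 = (2.0 / n) * err * y * (1.0 - y)  # grad at output pre-activation
    gw2 = Hh.T @ d2
    gb2 = d2.sum()
    d1 = np.outer(d2, w2) * np.cos(Z1)    # back through the sin layer
    gW1 = X.T @ d1
    gb1 = d1.sum(axis=0)

    # Plain gradient descent on all DPPN weights.
    W1 -= lr * gW1; b1 -= lr * gb1
    w2 -= lr * gw2; b2 -= lr * gb2
    if step % 500 == 0:
        print(f"step {step:4d}  loss {loss:.4f}")
```

In the full algorithm described by the abstract, an inner learning loop like this one is wrapped in evolution: the DPPN's topology is mutated and recombined between learning phases, and, in the Lamarckian step, the gradient-tuned weights are written back into the genome rather than discarded. The weight-compression result rests on the same coordinate-to-value mapping: since each of a network's 157,684 weights can be indexed by the coordinates of the units it connects, a roughly 200-parameter DPPN can be queried once per weight to emit the entire autoencoder.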

Published in

GECCO '16: Proceedings of the Genetic and Evolutionary Computation Conference 2016
July 2016, 1196 pages
ISBN: 9781450342063
DOI: 10.1145/2908812
Copyright © 2016 ACM
Publisher: Association for Computing Machinery, New York, NY, United States


Acceptance Rates

GECCO '16 paper acceptance rate: 137 of 381 submissions (36%). Overall GECCO acceptance rate: 1,669 of 4,410 submissions (38%).
