ABSTRACT
Deep neural networks (DNNs) have produced state-of-the-art results in many benchmarks and problem domains. However, the success of DNNs depends on the proper configuration of their architectures and hyperparameters. Such configuration is difficult, and as a result DNNs are often not used to their full potential. In addition, DNNs in commercial applications often need to satisfy real-world design constraints such as size or number of parameters. To make configuration easier, automatic machine learning (AutoML) systems for deep learning have been developed, focusing mostly on optimization of hyperparameters.
This paper takes AutoML a step further. It introduces an evolutionary AutoML framework called LEAF that optimizes not only hyperparameters but also network architectures and network size. LEAF makes use of both state-of-the-art evolutionary algorithms (EAs) and distributed computing frameworks. Experimental results on medical image classification and natural language analysis show that the framework can be used to achieve state-of-the-art performance. In particular, LEAF demonstrates that architecture optimization provides a significant boost over hyperparameter optimization alone, and that networks can be minimized at the same time with little drop in performance. LEAF therefore forms a foundation for democratizing and improving AI, as well as making AI practical in future applications.
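To make the idea of evolutionary AutoML concrete, the sketch below shows a minimal generational loop of the kind such a system runs: maintain a population of candidate configurations, evaluate them, keep the fittest, and mutate survivors to form the next generation. This is an illustrative toy, not LEAF's actual algorithm: the search space, the `fitness` proxy (which penalizes a bad learning rate and large size, standing in for the paper's joint accuracy/size objective), and all names are hypothetical assumptions; a real system would train and validate each candidate network.

```python
import random

# Hypothetical search space; a real system would also evolve the topology.
SPACE = {
    "layers": [1, 2, 3, 4],
    "units": [32, 64, 128, 256],
    "lr": [1e-4, 1e-3, 1e-2],
}

def random_genome(rng):
    # Sample one value per hyperparameter.
    return {k: rng.choice(v) for k, v in SPACE.items()}

def mutate(genome, rng):
    # Resample a single randomly chosen hyperparameter.
    child = dict(genome)
    key = rng.choice(list(SPACE))
    child[key] = rng.choice(SPACE[key])
    return child

def size(genome):
    # Crude proxy for parameter count.
    return genome["layers"] * genome["units"]

def fitness(genome):
    # Stand-in for validation accuracy minus a size penalty;
    # a real system would train the network and measure accuracy.
    return 1.0 - abs(genome["lr"] - 1e-3) - 0.0001 * size(genome)

def evolve(generations=20, pop_size=10, seed=0):
    rng = random.Random(seed)
    pop = [random_genome(rng) for _ in range(pop_size)]
    for _ in range(generations):
        # Truncation selection: keep the better half, refill by mutation.
        pop.sort(key=fitness, reverse=True)
        survivors = pop[: pop_size // 2]
        pop = survivors + [
            mutate(rng.choice(survivors), rng)
            for _ in range(pop_size - len(survivors))
        ]
    return max(pop, key=fitness)

best = evolve()
```

Because the fitness rewards both the good learning rate and small size, the loop illustrates how a single evolutionary search can trade off performance against network size; in a real deployment each fitness evaluation is a full training run, which is why distributed evaluation matters.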