Abstract
Reinforcement learning agents need a reward signal to learn successful policies. When this signal is sparse or the corresponding gradient is deceptive, such agents need a dedicated mechanism to efficiently explore their search space without relying on the reward. Searching for a large diversity of behaviors and using Motion Planning (MP) algorithms are two options in this context. In this paper, we build on the common roots between these two options to investigate the properties of two diversity search algorithms: Novelty Search and the Goal Exploration Process. These algorithms look for diversity in an outcome space, or behavioral space, which is generally hand-designed to represent what matters for a given task. The relation to MP algorithms reveals that the smoothness, or lack of smoothness, of the mapping between the policy parameter space and the outcome space plays a key role in search efficiency. In particular, we show empirically that, if the mapping is smooth enough, i.e. if two close policies in the parameter space lead to similar outcomes, then diversity algorithms tend to inherit the exploration properties of MP algorithms. By contrast, if it is not, diversity algorithms lose the properties of their MP counterparts and their performance strongly depends on heuristics such as filtering mechanisms.
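To make the selection-expansion view concrete, the following is a minimal, illustrative sketch (not the authors' implementation) of a novelty-search-style loop. The 2-D `outcome` mapping, the mutation scale `sigma`, and the k-nearest-neighbor novelty score are all hypothetical choices made for illustration; the toy mapping is deliberately smooth, so nearby parameters yield nearby outcomes, which is the regime in which diversity search behaves like a motion planner.

```python
import math
import random

def outcome(theta):
    """Toy smooth mapping from 2-D policy parameters to a 2-D outcome.

    Nearby parameters yield nearby outcomes; this smoothness assumption is
    what lets the selection-expansion loop below spread through outcome space.
    """
    x, y = theta
    return (math.sin(3 * x) + x, math.cos(3 * y) + y)

def novelty(o, archive, k=5):
    """Mean Euclidean distance from outcome o to its k nearest archived outcomes."""
    dists = sorted(math.dist(o, a) for (_, a) in archive)
    return sum(dists[:k]) / min(k, len(dists))

def novelty_search(iterations=200, sigma=0.2, seed=0):
    rng = random.Random(seed)
    theta0 = (0.0, 0.0)
    archive = [(theta0, outcome(theta0))]  # (parameters, outcome) pairs
    for _ in range(iterations):
        # Selection: pick the archived policy whose outcome is most novel,
        # analogous to node selection in sampling-based motion planning.
        parent, _ = max(archive, key=lambda entry: novelty(entry[1], archive))
        # Expansion: perturb its parameters with Gaussian noise,
        # analogous to extending the tree from the selected node.
        child = tuple(p + rng.gauss(0.0, sigma) for p in parent)
        archive.append((child, outcome(child)))
    return archive
```

If `outcome` were replaced by a highly non-smooth mapping, the expansion step would no longer produce outcomes near the selected node, and the loop would lose this planner-like spreading behavior.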
Notes
1. Also called behavioral space in the literature.
2. Additional details are available at https://arxiv.org/pdf/2104.04768.pdf.
3. The non-rectangular shape of \(\mathcal {O}\) in the 3D ballistic throw environment makes some cells of the expansion grid unreachable, which explains why GEP eventually covers only about 60% of \(\mathcal {O}\).
Acknowledgements
This work was partially supported by the French National Research Agency (ANR), Project ANR-18-CE33-0005 HUSKI.
Copyright information
© 2021 Springer Nature Switzerland AG
Cite this paper
Chenu, A., Perrin-Gilbert, N., Doncieux, S., Sigaud, O. (2021). Selection-Expansion: A Unifying Framework for Motion-Planning and Diversity Search Algorithms. In: Farkaš, I., Masulli, P., Otte, S., Wermter, S. (eds) Artificial Neural Networks and Machine Learning – ICANN 2021. ICANN 2021. Lecture Notes in Computer Science(), vol 12894. Springer, Cham. https://doi.org/10.1007/978-3-030-86380-7_46
DOI: https://doi.org/10.1007/978-3-030-86380-7_46
Publisher Name: Springer, Cham
Print ISBN: 978-3-030-86379-1
Online ISBN: 978-3-030-86380-7
eBook Packages: Computer Science, Computer Science (R0)