Selection-Expansion: A Unifying Framework for Motion-Planning and Diversity Search Algorithms

  • Conference paper
Artificial Neural Networks and Machine Learning – ICANN 2021 (ICANN 2021)

Part of the book series: Lecture Notes in Computer Science (LNTCS, volume 12894)


Abstract

Reinforcement learning agents need a reward signal to learn successful policies. When this signal is sparse or the corresponding gradient is deceptive, such agents need a dedicated mechanism to explore their search space efficiently without relying on the reward. Searching for a large diversity of behaviors and using Motion Planning (MP) algorithms are two options in this context. In this paper, we build on the common roots of these two options to investigate the properties of two diversity search algorithms, Novelty Search and the Goal Exploration Process. These algorithms look for diversity in an outcome space, or behavioral space, which is generally hand-designed to represent what matters for a given task. The relation to MP algorithms reveals that the smoothness, or lack of smoothness, of the mapping between the policy parameter space and the outcome space plays a key role in search efficiency. In particular, we show empirically that, if the mapping is smooth enough, i.e. if two policies that are close in the parameter space lead to similar outcomes, then diversity search algorithms tend to inherit the exploration properties of MP algorithms. By contrast, if it is not, they lose the properties of their MP counterparts and their performance depends strongly on heuristics such as filtering mechanisms.
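The selection-expansion loop shared by MP and diversity search algorithms can be sketched as follows. This is a minimal, hypothetical Novelty Search implementation, not the paper's code (the `novelty`, `novelty_search`, and `smooth` names are illustrative): selection picks the policy parameters whose outcome is most novel relative to an archive of past outcomes, and expansion mutates those parameters with Gaussian noise.

```python
import numpy as np

def novelty(outcome, archive, k=3):
    """Novelty = mean distance to the k nearest outcomes in the archive."""
    if len(archive) < k:
        return float("inf")
    dists = np.sort(np.linalg.norm(np.asarray(archive) - outcome, axis=1))
    return float(dists[:k].mean())

def novelty_search(mapping, dim, iters=200, pop=10, sigma=0.1, seed=0):
    """Minimal selection-expansion loop in parameter space.

    mapping: policy parameters -> outcome (the parameter-to-outcome mapping
    whose smoothness the paper studies).
    """
    rng = np.random.default_rng(seed)
    params = [rng.normal(size=dim) for _ in range(pop)]
    archive = [mapping(p) for p in params]
    for _ in range(iters):
        # Selection: the parameter vector whose outcome is most novel.
        scores = [novelty(mapping(p), archive) for p in params]
        parent = params[int(np.argmax(scores))]
        # Expansion: Gaussian mutation of the selected parameters.
        child = parent + sigma * rng.normal(size=dim)
        params.append(child)
        archive.append(mapping(child))
    return np.asarray(archive)

# A smooth mapping: close parameters yield close outcomes.
smooth = lambda p: p[:2]
arch = novelty_search(smooth, dim=5)
print(arch.shape)  # (pop + iters) outcomes in a 2D outcome space
```

With a smooth mapping such as `smooth` above, small mutations in parameter space translate into local steps in outcome space, which is what lets the loop behave like the expansion step of an MP algorithm; replacing `smooth` with a discontinuous mapping breaks this correspondence.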


Notes

  1. Also called behavioral space in the literature.

  2. Additional details are available at https://arxiv.org/pdf/2104.04768.pdf.

  3. The non-rectangular shape of \(\mathcal {O}\) in the 3D ballistic throw environment makes some cells of the expansion grid unreachable, which explains why GEP eventually covers only about \(60\%\) of \(\mathcal {O}\).
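The coverage measure referred to in note 3 can be sketched as the fraction of cells of a regular grid over the outcome space \(\mathcal {O}\) that contain at least one reached outcome. The helper below is a hypothetical illustration of such a metric, not the paper's implementation; `grid_coverage` and its parameters are assumed names.

```python
import numpy as np

def grid_coverage(outcomes, low, high, bins=10):
    """Fraction of cells of a regular grid over [low, high] that
    contain at least one reached outcome."""
    outcomes = np.asarray(outcomes)
    # Map each outcome to its cell index along every dimension.
    idx = np.floor((outcomes - low) / (high - low) * bins).astype(int)
    idx = np.clip(idx, 0, bins - 1)  # outcomes on the upper bound fall in the last cell
    occupied = {tuple(cell) for cell in idx}
    return len(occupied) / bins ** outcomes.shape[1]

# Uniformly scattered outcomes fill almost every cell of a 10x10 grid.
pts = np.random.default_rng(0).uniform(0.0, 1.0, size=(1000, 2))
cov = grid_coverage(pts, low=np.array([0.0, 0.0]), high=np.array([1.0, 1.0]))
print(cov)
```

If part of \(\mathcal {O}\) is unreachable, as in the 3D ballistic throw environment, the denominator still counts those cells, which is why the reported coverage saturates below 100%.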


Acknowledgements

This work was partially supported by the French National Research Agency (ANR), Project ANR-18-CE33-0005 HUSKI.


Copyright information

© 2021 Springer Nature Switzerland AG

About this paper


Cite this paper

Chenu, A., Perrin-Gilbert, N., Doncieux, S., Sigaud, O. (2021). Selection-Expansion: A Unifying Framework for Motion-Planning and Diversity Search Algorithms. In: Farkaš, I., Masulli, P., Otte, S., Wermter, S. (eds) Artificial Neural Networks and Machine Learning – ICANN 2021. ICANN 2021. Lecture Notes in Computer Science(), vol 12894. Springer, Cham. https://doi.org/10.1007/978-3-030-86380-7_46

  • DOI: https://doi.org/10.1007/978-3-030-86380-7_46

  • Publisher Name: Springer, Cham

  • Print ISBN: 978-3-030-86379-1

  • Online ISBN: 978-3-030-86380-7

  • eBook Packages: Computer Science, Computer Science (R0)
