Double Critic Deep Reinforcement Learning for Mapless 3D Navigation of Unmanned Aerial Vehicles

  • Regular paper
  • Published:
Journal of Intelligent & Robotic Systems

Abstract

This paper presents a novel deep reinforcement learning-based system for 3D mapless navigation of Unmanned Aerial Vehicles (UAVs). Instead of using an image-based sensing approach, we propose a simple learning system that uses only a few sparse range readings from a distance sensor to train a learning agent. We base our approaches on two state-of-the-art double-critic Deep-RL models: Twin Delayed Deep Deterministic Policy Gradient (TD3) and Soft Actor-Critic (SAC). We show that both approaches outperform an approach based on the Deep Deterministic Policy Gradient (DDPG) technique and the BUG2 algorithm. Moreover, our new Deep-RL structure based on Recurrent Neural Networks (RNNs) outperforms the structure currently used to perform mapless navigation of mobile robots. Overall, we conclude that double-critic Deep-RL approaches with RNNs are better suited to perform mapless navigation and obstacle avoidance of UAVs.



Acknowledgement

We thank the National Council for Scientific and Technological Development (CNPq), the Coordination for the Improvement of Higher Education Personnel (CAPES) - Finance Code 001, PRH-ANP, and all participants of VersusAI. We also note that the data used in this work differ completely from those in our previous work; a new evaluation was carried out for all statistics presented.

Funding

This work was mainly funded by the Coordination for the Improvement of Higher Education Personnel (CAPES). It also received support from the National Council for Scientific and Technological Development (CNPq) and from the National Agency of Petroleum, Natural Gas, and Biofuels (PRH-ANP).

Author information

Contributions

- Ricardo Bedin Grando conceptualized the study, wrote the article, developed and programmed the experiments, and collected and analyzed the test data.

- Junior Costa de Jesus wrote the article and gathered and analyzed the test data.

- Victor Augusto Kich wrote the article, programmed the experiments, and gathered and analyzed the test data.

- Alisson Henrique Kolling wrote the article, programmed the experiments, and gathered and analyzed the test data.

- Paulo Lilles Jorge Drews Jr. conceptualized the research, wrote the article, and led the debate on the article’s major topics.

Corresponding author

Correspondence to Ricardo Bedin Grando.

Ethics declarations

Ethics approval

All authors have ethically approved this work.

Consent for Publication

All authors and co-authors have approved this paper and given their permission for it to be published.

Competing interests

The authors declare that they have no competing interests.

Additional information

Availability of data and material

The data and material are available in the GitHub repository: https://github.com/ricardoGrando/hydrone_deep_rl_jint.

Consent to Participate

All authors gave their consent to participate in this article.

Publisher’s Note

Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.

About this article

Cite this article

Grando, R.B., de Jesus, J., Kich, V.A. et al. Double Critic Deep Reinforcement Learning for Mapless 3D Navigation of Unmanned Aerial Vehicles. J Intell Robot Syst 104, 29 (2022). https://doi.org/10.1007/s10846-021-01568-y


Keywords

Navigation