Abstract
An intelligent tactical decision method based on deep reinforcement learning is proposed to address the air combat decision-making problem of Unmanned Combat Aerial Vehicles (UCAVs). The increasing complexity of the air combat environment leads to a curse of dimensionality when conventional reinforcement learning is applied to the problem. In this paper, a deep neural network is employed as the function approximator and combined with Q-learning to fit the action-value function accurately, which effectively mitigates the curse of dimensionality suffered by traditional reinforcement learning. To verify the validity of the algorithm, the proposed deep Q-network (DQN) is simulated on an air combat platform. The simulation results show that the DQN algorithm performs well in terms of both reward and action-value utility, and the proposed method offers a new approach to research on intelligent UCAV decision-making.
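The abstract describes the approach only at a high level: a deep neural network approximates the action-value function and is trained with Q-learning. The sketch below illustrates that idea as a minimal DQN agent; the PyTorch framework, network architecture, replay buffer, target network, and hyperparameters are illustrative assumptions and not the authors' implementation, and the state/action encoding of the air combat platform is left abstract.

```python
import random
from collections import deque

import torch
import torch.nn as nn
import torch.optim as optim


class QNetwork(nn.Module):
    """Fully connected network approximating Q(s, a) for a discrete action set."""

    def __init__(self, state_dim: int, n_actions: int, hidden: int = 128):
        super().__init__()
        self.net = nn.Sequential(
            nn.Linear(state_dim, hidden), nn.ReLU(),
            nn.Linear(hidden, hidden), nn.ReLU(),
            nn.Linear(hidden, n_actions),
        )

    def forward(self, state: torch.Tensor) -> torch.Tensor:
        return self.net(state)


class DQNAgent:
    """Minimal DQN: epsilon-greedy exploration, replay buffer, TD-target update."""

    def __init__(self, state_dim, n_actions, gamma=0.99, lr=1e-3, eps=0.1):
        self.n_actions = n_actions
        self.gamma = gamma
        self.eps = eps
        self.q_net = QNetwork(state_dim, n_actions)
        self.target_net = QNetwork(state_dim, n_actions)
        self.target_net.load_state_dict(self.q_net.state_dict())
        self.optimizer = optim.Adam(self.q_net.parameters(), lr=lr)
        self.buffer = deque(maxlen=50_000)

    def act(self, state):
        # Epsilon-greedy selection over the discrete maneuver set.
        if random.random() < self.eps:
            return random.randrange(self.n_actions)
        with torch.no_grad():
            q = self.q_net(torch.as_tensor(state, dtype=torch.float32))
        return int(q.argmax().item())

    def store(self, s, a, r, s_next, done):
        self.buffer.append((s, a, r, s_next, done))

    def sync_target(self):
        # Periodically copy online weights into the target network.
        self.target_net.load_state_dict(self.q_net.state_dict())

    def train_step(self, batch_size=64):
        if len(self.buffer) < batch_size:
            return None
        batch = random.sample(self.buffer, batch_size)
        s, a, r, s_next, done = (torch.as_tensor(x, dtype=torch.float32)
                                 for x in zip(*batch))
        a = a.long()
        # Q(s, a) for the actions actually taken.
        q_sa = self.q_net(s).gather(1, a.unsqueeze(1)).squeeze(1)
        # TD target: r + gamma * max_a' Q_target(s', a'); no bootstrap at terminal states.
        with torch.no_grad():
            target = r + self.gamma * (1 - done) * self.target_net(s_next).max(dim=1).values
        loss = nn.functional.mse_loss(q_sa, target)
        self.optimizer.zero_grad()
        loss.backward()
        self.optimizer.step()
        return loss.item()
```

In a training loop against the air combat simulator, act, store, and train_step would be called at each simulation step, with sync_target invoked every few hundred steps to stabilize the TD targets.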
Copyright information
© 2017 Springer Nature Singapore Pte Ltd.
About this paper
Cite this paper
Liu, P., Ma, Y. (2017). A Deep Reinforcement Learning Based Intelligent Decision Method for UCAV Air Combat. In: Mohamed Ali, M., Wahid, H., Mohd Subha, N., Sahlan, S., Md. Yunus, M., Wahap, A. (eds) Modeling, Design and Simulation of Systems. AsiaSim 2017. Communications in Computer and Information Science, vol 751. Springer, Singapore. https://doi.org/10.1007/978-981-10-6463-0_24
DOI: https://doi.org/10.1007/978-981-10-6463-0_24
Publisher Name: Springer, Singapore
Print ISBN: 978-981-10-6462-3
Online ISBN: 978-981-10-6463-0
eBook Packages: Computer Science (R0)