Comparing API Call Sequence Algorithms for Malware Detection

  • Conference paper
Web, Artificial Intelligence and Network Applications (WAINA 2020)

Part of the book series: Advances in Intelligent Systems and Computing (AISC, volume 1150)

Abstract

Malware has become increasingly sophisticated and difficult to detect, owing to evasion techniques such as anti-emulation, encapsulation, obfuscation, packing, anti-virtualization, and anti-debugging. New malware variants are generated by removing, replacing, and adding useless API calls in the malicious code. To cope with the growing number of malware samples, new detection methods are needed that can quickly analyze large sets of malware and their variants. In this work, the sequences of state transitions performed by applications during their execution are modeled by Markov chains and used for malware classification. The implemented Markov chain-based detector is compared with the sequence alignment algorithm, which is widely used in the literature. The dataset includes 7.3K malware and 1.2K benign Windows applications collected from public datasets. Experimental results show that the Markov chain detector detects malware with an F-measure of up to 95% and outperforms the detector based on sequence alignment.
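
As an illustration of the general idea of modeling API call traces with Markov chains, the following minimal Python sketch estimates first-order transition probabilities from traces and labels a new trace by comparing its log-likelihood under a malware chain and a benign chain. This is not the paper's implementation: the function names, the toy traces, the likelihood-comparison rule, and the `floor` probability for unseen transitions are all assumptions made for this example.

```python
from collections import Counter, defaultdict
import math


def fit_chain(traces):
    """Estimate first-order transition probabilities from API call traces."""
    counts = defaultdict(Counter)
    for trace in traces:
        # Count each observed transition (current call -> next call).
        for src, dst in zip(trace, trace[1:]):
            counts[src][dst] += 1
    # Normalize the counts of each source state into probabilities.
    return {
        src: {dst: n / sum(c.values()) for dst, n in c.items()}
        for src, c in counts.items()
    }


def log_likelihood(chain, trace, floor=1e-6):
    """Log-probability of a trace's transitions; unseen ones get `floor` (assumed)."""
    return sum(
        math.log(chain.get(src, {}).get(dst, floor))
        for src, dst in zip(trace, trace[1:])
    )


def classify(trace, malware_chain, benign_chain):
    """Label a trace by the chain that explains it better (assumed decision rule)."""
    if log_likelihood(malware_chain, trace) > log_likelihood(benign_chain, trace):
        return "malware"
    return "benign"


# Toy traces with hypothetical API call names, for illustration only.
malware_traces = [["open", "write", "connect"],
                  ["open", "write", "write", "connect"]]
benign_traces = [["open", "read", "close"],
                 ["open", "read", "read", "close"]]

malware_chain = fit_chain(malware_traces)
benign_chain = fit_chain(benign_traces)

print(classify(["open", "write", "connect"], malware_chain, benign_chain))
print(classify(["open", "read", "close"], malware_chain, benign_chain))
```

A practical detector would be trained on traces extracted from sandbox reports and would typically feed the transition probabilities to a classifier as features. The simple likelihood comparison is used here only to keep the sketch short.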

Author information

Corresponding author

Correspondence to Massimo Ficco.

Copyright information

© 2020 Springer Nature Switzerland AG

About this paper

Cite this paper

Ficco, M. (2020). Comparing API Call Sequence Algorithms for Malware Detection. In: Barolli, L., Amato, F., Moscato, F., Enokido, T., Takizawa, M. (eds) Web, Artificial Intelligence and Network Applications. WAINA 2020. Advances in Intelligent Systems and Computing, vol 1150. Springer, Cham. https://doi.org/10.1007/978-3-030-44038-1_77
