Abstract
Malicious JavaScript is one of the most common tools attackers use to exploit vulnerabilities in web applications. It can be used to spread malware, conduct phishing, or collect sensitive information. Although many types of malicious JavaScript are difficult to detect, generalizing the signature of a malicious script can help catch more complex scripts that use obfuscation techniques. This paper aims to detect malicious JavaScript through structure and attribute analysis of abstract syntax trees (ASTs), which capture the generalized semantic meaning of the source code. We apply a graph convolutional neural network (GCN) to the AST features and obtain a graph representation via neural message passing with neighborhood aggregation. An attention layer enriches the method by tracking the parts of a script that may carry the signature of malicious intent. We comprehensively evaluate the proposed approach on a real-world dataset for detecting malicious websites. The proposed method demonstrates promising performance in terms of detection accuracy and robustness against obfuscated samples.
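The pipeline described in the abstract (AST to graph, neighborhood aggregation, attention-weighted readout) can be illustrated with a minimal sketch. This is not the authors' implementation: the toy AST, the `NODE_TYPES` vocabulary, the helper names, and the fixed `query` vector are all hypothetical, and the learned weight matrices and nonlinearities of a real GCN are omitted so that only the aggregation and attention steps remain visible.

```python
import math

# Hypothetical toy AST (ESTree-like node types) for a script shaped like
# eval(unescape("...")). Real ASTs come from a JavaScript parser.
TOY_AST = {
    "type": "Program",
    "children": [
        {"type": "CallExpression", "children": [
            {"type": "Identifier", "children": []},
            {"type": "CallExpression", "children": [
                {"type": "Identifier", "children": []},
                {"type": "Literal", "children": []},
            ]},
        ]},
    ],
}

# Assumed node-type vocabulary, used for one-hot node attributes.
NODE_TYPES = ["Program", "CallExpression", "Identifier", "Literal"]


def ast_to_graph(root):
    """Flatten the tree into a list of node types and a parent-child edge list."""
    types, edges = [], []

    def visit(node, parent):
        idx = len(types)
        types.append(node["type"])
        if parent is not None:
            edges.append((parent, idx))
        for child in node["children"]:
            visit(child, idx)

    visit(root, None)
    return types, edges


def one_hot(node_type):
    return [1.0 if node_type == t else 0.0 for t in NODE_TYPES]


def aggregate(features, edges):
    """One message-passing round: mean over each node's neighbors and itself."""
    neighbors = [[i] for i in range(len(features))]  # self-loops
    for a, b in edges:  # treat edges as undirected
        neighbors[a].append(b)
        neighbors[b].append(a)
    dim = len(features[0])
    return [
        [sum(features[j][d] for j in nbrs) / len(nbrs) for d in range(dim)]
        for nbrs in neighbors
    ]


def attention_readout(hidden, query):
    """Softmax-weighted sum of node vectors; also returns the per-node weights."""
    scores = [sum(h * q for h, q in zip(vec, query)) for vec in hidden]
    peak = max(scores)  # subtract the maximum for numerical stability
    exps = [math.exp(s - peak) for s in scores]
    total = sum(exps)
    weights = [e / total for e in exps]
    dim = len(hidden[0])
    graph_vec = [
        sum(w * vec[d] for w, vec in zip(weights, hidden)) for d in range(dim)
    ]
    return graph_vec, weights


types, edges = ast_to_graph(TOY_AST)
hidden = aggregate([one_hot(t) for t in types], edges)
# Arbitrary, unlearned query vector chosen only for illustration.
query = [0.0, 1.0, 0.0, 2.0]
graph_vec, weights = attention_readout(hidden, query)
```

In this sketch `graph_vec` stands in for the graph representation that a classifier would consume, and `weights` assigns each AST node a share of attention that could be inspected to see which parts of the script dominate the representation. In a trained model the query and the per-layer transformations would be learned from labeled scripts rather than fixed by hand.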
Acknowledgements
This research was partially supported by the Ministry of Education, Science, Sports, and Culture, Grant-in-Aid for Scientific Research (B) 21H03444.
Copyright information
© 2021 Springer Nature Switzerland AG
About this paper
Cite this paper
Rozi, M.F., Ban, T., Ozawa, S., Kim, S., Takahashi, T., Inoue, D. (2021). JStrack: Enriching Malicious JavaScript Detection Based on AST Graph Analysis and Attention Mechanism. In: Mantoro, T., Lee, M., Ayu, M.A., Wong, K.W., Hidayanto, A.N. (eds) Neural Information Processing. ICONIP 2021. Lecture Notes in Computer Science, vol 13109. Springer, Cham. https://doi.org/10.1007/978-3-030-92270-2_57
DOI: https://doi.org/10.1007/978-3-030-92270-2_57
Publisher Name: Springer, Cham
Print ISBN: 978-3-030-92269-6
Online ISBN: 978-3-030-92270-2
eBook Packages: Computer Science, Computer Science (R0)