ABSTRACT
Existing techniques for adversarial malware generation mutate feature vectors extracted from malware. However, most (if not all) of these techniques share a common limitation: the feasibility of the resulting attacks is unknown. The synthesized mutations may violate the inherent constraints imposed by the malware's code structure, causing the malicious payload to crash or malfunction. To address this limitation, we present Malware Recomposition Variation (MRV), an approach that conducts semantic analysis of existing malware to systematically construct new malware variants, which malware detectors can use to test and strengthen their detection signatures and models. In particular, we use two variation strategies (a malware evolution attack and a malware confusion attack) that follow the structures of existing malware to enhance the feasibility of the attacks. Given existing malware, we conduct semantic-feature mutation analysis and phylogenetic analysis to synthesize mutation strategies. Based on these strategies, we perform program transplantation to automatically mutate malware bytecode and generate new malware variants. We evaluate MRV on actual malware variants; our empirical evaluation on 1,935 benign Android apps and 1,917 malware samples shows that MRV produces malware variants that are highly likely to evade detection while retaining their malicious behaviors. We also propose and evaluate three defense mechanisms to counter MRV.
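As a rough illustration of what a semantic-feature mutation analysis could look like, the Python sketch below counts how often each feature differs between pairs of variants in one malware family and ranks the features by that count. This is only an assumed simplification for intuition, not the MRV implementation: the feature names (`trigger`, `permission`, `sink`), the dictionary representation, and the helper `rank_mutated_features` are all hypothetical and do not come from the paper.

```python
from collections import Counter
from itertools import combinations


def rank_mutated_features(variants):
    """Rank features by how many variant pairs disagree on their value.

    `variants` is a list of dicts mapping a (hypothetical) feature name to
    its value. Features that never differ are left out of the ranking.
    """
    counts = Counter()
    for a, b in combinations(variants, 2):
        # Consider every feature that appears in either variant.
        for feature in a.keys() | b.keys():
            if a.get(feature) != b.get(feature):
                counts[feature] += 1
    # Most frequently mutated first; ties broken alphabetically.
    return [f for f, _ in sorted(counts.items(), key=lambda kv: (-kv[1], kv[0]))]


# Hypothetical feature vectors for three variants of one family.
family = [
    {"trigger": "boot", "permission": "SEND_SMS", "sink": "sms"},
    {"trigger": "ui_click", "permission": "SEND_SMS", "sink": "sms"},
    {"trigger": "timer", "permission": "SEND_SMS", "sink": "http"},
]
ranking = rank_mutated_features(family)
print(ranking)
```

Under these assumptions, features that vary often within a family are plausible candidates for mutation because existing variants already show them changing, while features that never vary are left alone.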
Index Terms
- Malware Detection in Adversarial Settings: Exploiting Feature Evolutions and Confusions in Android Apps