Malware Detection in Adversarial Settings: Exploiting Feature Evolutions and Confusions in Android Apps

Authors:
Wei Yang, University of Illinois at Urbana-Champaign
Deguang Kong, Yahoo Research
Tao Xie, University of Illinois at Urbana-Champaign
Carl A. Gunter, University of Illinois at Urbana-Champaign

ACSAC '17: Proceedings of the 33rd Annual Computer Security Applications Conference, December 2017, Pages 288–302. https://doi.org/10.1145/3134600.3134642

Published: 04 December 2017

ABSTRACT

Existing techniques for adversarial malware generation mutate feature vectors extracted from malware. However, most (if not all) of these techniques share a common limitation: the feasibility of the resulting attacks is unknown. The synthesized mutations may break inherent constraints imposed by the malware's code structure, causing crashes or malfunctioning malicious payloads. To address this limitation, we present Malware Recomposition Variation (MRV), an approach that conducts semantic analysis of existing malware to systematically construct new malware variants, which malware detectors can use to test and strengthen their detection signatures/models. In particular, we use two variation strategies (i.e., malware evolution attack and malware confusion attack) that follow the structures of existing malware to enhance the feasibility of the attacks. Given existing malware, we conduct semantic-feature mutation analysis and phylogenetic analysis to synthesize mutation strategies. Based on these strategies, we perform program transplantation to automatically mutate malware bytecode and generate new malware variants. We evaluate our MRV approach on actual malware variants; our empirical evaluation on 1,935 benign Android apps and 1,917 malware samples shows that MRV produces malware variants that are highly likely to evade detection while retaining their malicious behaviors. We also propose and evaluate three defense mechanisms to counter MRV.
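The feasibility limitation described in the abstract can be illustrated with a toy sketch. It is not the MRV implementation: the feature names, weights, threshold, and the `REQUIRED` constraint are all hypothetical, and a simple linear scorer stands in for a real detector. The sketch contrasts an unconstrained feature-vector mutation, which evades the toy detector but removes a feature the payload depends on, with a mutation that only explores feature sets satisfying the constraint.

```python
from typing import Dict, Optional, Set, Tuple

# Hypothetical binary features and weights of a toy linear detector.
WEIGHTS: Dict[str, float] = {
    "sends_sms": 2.0,
    "reads_contacts": 1.5,
    "triggered_by_boot": 1.0,
    "triggered_by_ui_click": -1.0,
}
THRESHOLD = 2.0  # assumed decision threshold: score >= THRESHOLD means "malware"

# Hypothetical structural constraint: the payload stops working
# if any of these features is removed.
REQUIRED: Set[str] = {"sends_sms"}


def score(features: Set[str]) -> float:
    """Sum the weights of the features that are present."""
    return sum(WEIGHTS[f] for f in features)


def is_detected(features: Set[str]) -> bool:
    return score(features) >= THRESHOLD


def is_feasible(features: Set[str]) -> bool:
    """A variant is feasible only if every required feature survives."""
    return REQUIRED <= features


def greedy_mutation(features: Set[str], respect_constraints: bool) -> Set[str]:
    """Toggle one feature per step, always taking the largest score drop,
    until the toy detector is evaded or no toggle lowers the score."""
    current = set(features)
    while is_detected(current):
        best: Optional[Tuple[float, Set[str]]] = None
        for name in WEIGHTS:
            candidate = current ^ {name}  # flip one feature on or off
            if respect_constraints and not is_feasible(candidate):
                continue
            gain = score(current) - score(candidate)
            if gain > 0 and (best is None or gain > best[0]):
                best = (gain, candidate)
        if best is None:
            break  # no remaining toggle reduces the score
        current = best[1]
    return current


original = {"sends_sms", "reads_contacts", "triggered_by_boot"}
naive = greedy_mutation(original, respect_constraints=False)
constrained = greedy_mutation(original, respect_constraints=True)
print("naive:", sorted(naive), "feasible:", is_feasible(naive))
print("constrained:", sorted(constrained), "feasible:", is_feasible(constrained))
```

In this toy setting both searches evade the scorer, but only the constrained one keeps the required feature. That is the gap between feature-space evasion and a working variant that the abstract points to.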

Index Terms

1. Security and privacy
  1. Intrusion/anomaly detection and malware mitigation

Published in

ACSAC '17: Proceedings of the 33rd Annual Computer Security Applications Conference
December 2017
618 pages
ISBN: 9781450353458
DOI: 10.1145/3134600

Copyright © 2017 ACM
Permission to make digital or hard copies of all or part of this work for personal or classroom use is granted without fee provided that copies are not made or distributed for profit or commercial advantage and that copies bear this notice and the full citation on the first page. Copyrights for components of this work owned by others than the author(s) must be honored. Abstracting with credit is permitted. To copy otherwise, or republish, to post on servers or to redistribute to lists, requires prior specific permission and/or a fee. Request permissions from [email protected].
Publisher
Association for Computing Machinery
New York, NY, United States
Author Tags
Adversarial classification
Malware detection
Qualifiers
- research-article
- Research
- Refereed limited
Acceptance Rates
Overall Acceptance Rate: 104 of 497 submissions, 21%