research-article

Learning stateful models for network honeypots

Authors:
Tammo Krueger, Technische Universität Berlin, Berlin, Germany
Hugo Gascon, Technische Universität Berlin, Berlin, Germany
Nicole Krämer, Technische Universität München, München, Germany
Konrad Rieck, University of Göttingen, Göttingen, Germany

AISec '12: Proceedings of the 5th ACM workshop on Security and artificial intelligence, October 2012, Pages 37–48
https://doi.org/10.1145/2381896.2381904

Published: 19 October 2012

ABSTRACT

Attacks like call fraud and identity theft often involve sophisticated stateful attack patterns which, on top of normal communication, try to harm systems on a higher semantic level than usual attack scenarios. To detect this kind of threat via specially deployed honeypots, at least a minimal understanding of the inherent state machine of a specific service is needed to lure potential attackers and to sustain a communication for a sufficiently large number of steps. To this end, we propose PRISMA, a method for protocol inspection and state machine analysis, which infers a functional state machine and message format of a protocol from network traffic alone. We apply our method to three real-life network traces ranging from 10,000 up to 2 million messages of both binary and textual protocols. We show that PRISMA is capable of simulating complete and correct sessions based on the learned models. A case study on malware traffic reveals the different states of the execution, rendering PRISMA a valuable tool for malware analysis.
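
To make the idea of learning a state machine from traffic and then simulating sessions more concrete, the following is a minimal sketch and not PRISMA's actual algorithm, which the abstract does not spell out. It assumes each message has already been mapped to a discrete label (here hypothetical, hand-written FTP-like commands), fits a plain first-order Markov chain over those labels, and samples a session from it. The names `fit_markov_chain`, `simulate_session`, `START`, and `END` are illustrative choices.

```python
import random
from collections import Counter, defaultdict

# Artificial boundary states (assumed for this sketch).
START, END = "<start>", "<end>"


def fit_markov_chain(sessions):
    """Estimate first-order transition probabilities from labelled sessions."""
    counts = defaultdict(Counter)
    for session in sessions:
        path = [START, *session, END]
        # Count every observed transition between consecutive labels.
        for prev, nxt in zip(path, path[1:]):
            counts[prev][nxt] += 1
    # Normalise the counts of each state into a probability distribution.
    return {
        state: {nxt: n / sum(successors.values()) for nxt, n in successors.items()}
        for state, successors in counts.items()
    }


def simulate_session(model, rng, max_steps=50):
    """Sample one session by walking the chain from START until END."""
    state, session = START, []
    for _ in range(max_steps):
        successors = model[state]
        state = rng.choices(list(successors), weights=list(successors.values()))[0]
        if state == END:
            break
        session.append(state)
    return session


# Hypothetical, hand-labelled sessions; real traces would first need
# their messages grouped into such labels.
sessions = [
    ["USER", "PASS", "LIST", "QUIT"],
    ["USER", "PASS", "RETR", "QUIT"],
    ["USER", "PASS", "LIST", "RETR", "QUIT"],
]
model = fit_markov_chain(sessions)
simulated = simulate_session(model, random.Random(0))
```

Every sampled session follows only transitions that were observed in the training data, which is the property a honeypot needs in order to answer in a protocol-conformant order. A richer model would also have to generate message contents, which this sketch leaves out.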

Index Terms

Learning stateful models for network honeypots
1. Networks
  1. Network services
    1. Network monitoring

Published in
AISec '12: Proceedings of the 5th ACM workshop on Security and artificial intelligence
October 2012
116 pages
ISBN: 9781450316644
DOI: 10.1145/2381896
General Chair: Ting Yu, North Carolina State University, USA
Program Chairs: V. N. Venkatakrishnan, University of Illinois at Chicago, USA; Apu Kapadia, Indiana University, Bloomington, USA
Copyright © 2012 ACM
Permission to make digital or hard copies of all or part of this work for personal or classroom use is granted without fee provided that copies are not made or distributed for profit or commercial advantage and that copies bear this notice and the full citation on the first page. Copyrights for components of this work owned by others than ACM must be honored. Abstracting with credit is permitted. To copy otherwise, or republish, to post on servers or to redistribute to lists, requires prior specific permission and/or a fee. Request permissions from permissions@acm.org
Publisher
Association for Computing Machinery
New York, NY, United States
Author Tags
clustering
honeypots
markov models
non-negative matrix factorization
state machine inference
Acceptance Rates
AISec '12 Paper Acceptance Rate: 10 of 24 submissions, 42%
Overall Acceptance Rate: 94 of 231 submissions, 41%
Article Metrics
- Total Citations: 41
- Total Downloads: 528
- Downloads (Last 12 months): 22
- Downloads (Last 6 weeks): 3