DOI: 10.1145/2381896.2381904
research-article

Learning stateful models for network honeypots

Published: 19 October 2012

ABSTRACT

Attacks such as call fraud and identity theft often involve sophisticated stateful attack patterns which, on top of normal communication, try to harm systems on a higher semantic level than usual attack scenarios. To detect this kind of threat via specially deployed honeypots, at least a minimal understanding of the inherent state machine of a specific service is needed to lure potential attackers and to sustain communication for a sufficiently large number of steps. To this end, we propose PRISMA, a method for protocol inspection and state machine analysis, which infers a functional state machine and message format of a protocol from network traffic alone. We apply our method to three real-life network traces ranging from 10,000 up to 2 million messages of both binary and textual protocols. We show that PRISMA is capable of simulating complete and correct sessions based on the learned models. A case study on malware traffic reveals the different states of the execution, rendering PRISMA a valuable tool for malware analysis.
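
To make the idea of learning a state machine from traffic and then simulating sessions more concrete, here is a minimal Python sketch. It is not PRISMA's algorithm, which the abstract does not spell out. It is a much simpler stand-in that assumes every message has already been labeled with a type, fits a first-order Markov chain over those types, and samples a new session from it. The function names (`learn_transitions`, `simulate`), the `<start>`/`<end>` markers, and the FTP-like sessions are all hypothetical and chosen only for illustration.

```python
from collections import defaultdict
import random

# Artificial marker states for session start and end (an assumption of this sketch).
START, END = "<start>", "<end>"


def learn_transitions(sessions):
    """Estimate transition probabilities between consecutive message types."""
    counts = defaultdict(lambda: defaultdict(int))
    for session in sessions:
        path = [START, *session, END]
        for cur, nxt in zip(path, path[1:]):
            counts[cur][nxt] += 1
    # Normalize the raw counts of each state into a probability distribution.
    model = {}
    for state, successors in counts.items():
        total = sum(successors.values())
        model[state] = {n: c / total for n, c in successors.items()}
    return model


def simulate(model, rng, max_steps=50):
    """Sample one session by walking the chain from START until END."""
    state, session = START, []
    for _ in range(max_steps):
        successors = model[state]
        state = rng.choices(list(successors), weights=list(successors.values()))[0]
        if state == END:
            break
        session.append(state)
    return session


# Hypothetical, hand-labeled FTP-like sessions (illustrative data only).
sessions = [
    ["USER", "PASS", "LIST", "QUIT"],
    ["USER", "PASS", "RETR", "QUIT"],
    ["USER", "PASS", "LIST", "RETR", "QUIT"],
]

model = learn_transitions(sessions)
sample = simulate(model, random.Random(0))
print(sample)
```

In this toy model, a sampled session can only follow transitions that were observed in the training sessions, which is the property a honeypot needs in order to answer in a plausible protocol state. A real system would also have to derive the message types and message formats from raw bytes, which this sketch takes as given.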

Published in

AISec '12: Proceedings of the 5th ACM workshop on Security and artificial intelligence
October 2012
116 pages
ISBN: 9781450316644
DOI: 10.1145/2381896

Copyright © 2012 ACM

Publisher: Association for Computing Machinery, New York, NY, United States


Acceptance Rates

AISec '12 paper acceptance rate: 10 of 24 submissions, 42%. Overall acceptance rate: 94 of 231 submissions, 41%.
