Computer Communications
Volume 31, Issue 1, 15 January 2008, Pages 58-72
 
doi:10.1016/j.comcom.2007.10.010
Copyright © 2007 Elsevier B.V. All rights reserved.

Processing of massive audit data streams for real-time anomaly intrusion detection

Wei Wang (a, corresponding author), Xiaohong Guan (a, b) and Xiangliang Zhang (a)

(a) State Key Laboratory for Manufacturing Systems (SKLMS) and MOE Key Lab for Intelligent Networks and Network Security (KLINNS), Xi’an Jiaotong University, Xi’an 710049, China
(b) Center for Intelligent and Networked Systems, Tsinghua University, Beijing 100080, China

Received 26 October 2006; 
accepted 2 October 2007. 
Available online 13 October 2007.


Abstract

Intrusion detection is an important technique in the defense-in-depth network security framework. Most current intrusion detection models lack the ability to process massive audit data streams for real-time anomaly detection. In this paper, we present an effective anomaly intrusion detection model based on Principal Component Analysis (PCA). Because it considers the frequency property of audit events rather than their transition property or correlation property, the model is better suited to high-speed, real-time processing of massive data streams from various data sources. It can serve as a general framework on which a practical Intrusion Detection System (IDS) can be implemented in various computing environments. In this method, a multi-pronged anomaly detection model is used to monitor various computer system and network behaviors. Three sources of data are used to validate the model and the method: system call data from the University of New Mexico (lpr) and from the KLINNS Lab of Xi’an Jiaotong University (ftp), shell command data from the AT&T Research laboratory, and network data from MIT Lincoln Lab. The frequencies of individual system calls generated by one process, the frequencies of individual commands embedded in one command block, and the features extracted from one network connection are each transformed into an input data vector. Our method reduces the dimensionality of these high-dimensional data vectors, so detection is handled in a lower dimension with high efficiency and low use of system resources. The distance between a vector and its reconstruction in the reduced subspace is used for anomaly detection. Empirical results show that our model is promising in terms of detection accuracy and computational efficiency, and is thus amenable to real-time intrusion detection.

Keywords: Intrusion detection; Principal Component Analysis; Hidden Markov models; Network security; Data streams
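
The pipeline summarized in the abstract (frequency vectors, PCA dimension reduction, and a distance between each vector and its reconstruction) can be illustrated with a minimal sketch. This is not the authors' implementation. The system call names, the toy traces, the number of retained components (k = 1), and the rule of taking the largest training distance as the threshold are all assumptions made for illustration; only NumPy is used.

```python
import numpy as np

def frequency_vector(events, vocabulary):
    """Relative frequency of each vocabulary item within one event sequence."""
    index = {name: i for i, name in enumerate(vocabulary)}
    counts = np.zeros(len(vocabulary))
    for event in events:
        if event in index:
            counts[index[event]] += 1
    total = counts.sum()
    return counts / total if total > 0 else counts

def fit_pca(X, k):
    """Return the training mean and the top-k principal directions (as rows)."""
    mean = X.mean(axis=0)
    # Rows of vt are the principal directions of the centered data,
    # ordered by decreasing singular value.
    _, _, vt = np.linalg.svd(X - mean, full_matrices=False)
    return mean, vt[:k]

def reconstruction_distance(x, mean, components):
    """Euclidean distance between x and its reconstruction from the subspace."""
    centered = x - mean
    reconstructed = components.T @ (components @ centered)
    return float(np.linalg.norm(centered - reconstructed))

# Hypothetical vocabulary and traces (illustrative only, not from the paper).
vocabulary = ["open", "read", "write", "close", "execve"]
normal_traces = [
    ["open", "read", "read", "close"],
    ["open", "read", "write", "close"],
    ["open", "write", "write", "close"],
    ["open", "read", "read", "write", "close"],
]
X_train = np.array([frequency_vector(t, vocabulary) for t in normal_traces])

# Assumption: keep one principal component and use the largest
# training distance as the detection threshold.
mean, components = fit_pca(X_train, k=1)
threshold = max(reconstruction_distance(x, mean, components) for x in X_train)

suspicious_trace = ["execve", "execve", "open", "execve"]
anomaly_score = reconstruction_distance(
    frequency_vector(suspicious_trace, vocabulary), mean, components
)
is_anomalous = anomaly_score > threshold
```

In this toy setting the suspicious trace is dominated by a call that never appears in the training data, so its frequency vector lies far from the subspace learned from normal behavior and its reconstruction distance exceeds the threshold. Once the mean and components are fitted, scoring a new vector costs only two small matrix-vector products, which is consistent with the abstract's emphasis on low resource use.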

Article Outline

1. Introduction
2. Intrusion detection methods based on the transition properties and correlation properties of audit events
2.1. HMM-based intrusion detection method considering the transition properties of audit events
2.1.1. Data sets
2.1.2. Testing results based on the HMM method
2.2. The method considering the correlation properties of audit events
3. The proposed intrusion detection method based on Principal Component Analysis
3.1. Principal Component Analysis
3.2. Intrusion detection model based on PCA
3.2.1. Data preparation
3.2.2. Dimension reduction and feature extraction
3.2.3. Classification
4. Experiments and testing
4.1. Experiments on system call data
4.1.1. Data sets
4.1.2. Testing results and analysis
4.2. Experiments on shell command data
4.2.1. Data sets
4.2.2. Testing results and analysis
4.3. Experiments on network data
4.3.1. Data sets
4.3.2. Testing results and analysis
5. Concluding remarks
Acknowledgements
References
Vitae








