An N-Gram and STF-IDF model for masquerade detection in a UNIX environment

Geng, Dai; Odaka, Thmohiro; Kuroiwa, Jousuke; Ogura, Hisakazu

doi:10.1007/s11416-010-0143-3

Original Paper
Published: 13 May 2010

Journal in Computer Virology, Volume 7, pages 133–142 (2011)

Abstract

A masquerader is someone who impersonates another user and operates a computer system with privileged access. The computer security problems caused by masqueraders are serious. Although anomaly detection is considered the best way to detect masqueraders, it remains in the research phase because of its low detection probability and high error rate. A number of methods, such as the Support Vector Machine (SVM), the Hidden Markov Model (HMM), and the Naïve Bayes (N. Bayes) classifier, have been investigated to improve detection accuracy. The present paper proposes a method that integrates Data Mining and Natural Language Processing, namely the N-Gram_Square root Term Frequency-Inverse Document Frequency (N-Gram_STF-IDF) method. In the proposed method, the sequences to be examined are segmented according to their N-Gram characteristics, and non-normal users are then detected using an STF-IDF classifier. We perform an experiment on the Schonlau and Greenberg data sets using the proposed method and compare the results with those obtained using various other methods.
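The abstract does not give the weighting formula or the decision rule, so the following is only a minimal sketch of the two named ingredients, not the authors' implementation. It assumes that "square root term frequency" means the n-gram count is replaced by its square root, and that the inverse document frequency takes the common form log(N / df). The function names `ngrams` and `stf_idf_weights` and the toy command blocks are hypothetical.

```python
import math
from collections import Counter


def ngrams(commands, n):
    """Return the overlapping n-grams (as tuples) of a command sequence."""
    return [tuple(commands[i:i + n]) for i in range(len(commands) - n + 1)]


def stf_idf_weights(blocks, n):
    """Weight every n-gram in every block by sqrt(tf) * idf.

    Assumed form (not taken from the paper): idf = log(N / df), where N is
    the number of blocks and df is the number of blocks containing the n-gram.
    """
    counts = [Counter(ngrams(block, n)) for block in blocks]

    # Document frequency: in how many blocks does each n-gram occur?
    doc_freq = Counter()
    for block_counts in counts:
        doc_freq.update(block_counts.keys())

    total = len(blocks)
    return [
        {gram: math.sqrt(tf) * math.log(total / doc_freq[gram])
         for gram, tf in block_counts.items()}
        for block_counts in counts
    ]


# Toy UNIX command blocks (hypothetical data, for illustration only).
blocks = [
    ["ls", "cd", "ls", "cd", "ls"],
    ["ls", "cd", "vi", "make"],
    ["gcc", "make", "gdb", "make"],
]
weights = stf_idf_weights(blocks, 2)
```

Under these assumptions, an n-gram that occurs in every block receives weight zero, and the square root dampens the influence of commands that a user repeats many times within one block. How the weighted vectors are then compared to flag a masquerader is left open here, since the abstract does not describe it.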

References

DTI: Information security breaches survey 2006. Technical report, DTI (Department of Trade and Industry, Britain) (2006)
Gordon, L.A., Loeb, M.P., Lucyshyn, W., Richardson, R.: CSI/FBI Computer crime and security survey 2006. Computer Security Institute publications (2006)
Yampolskiy, R.V.: Human computer interaction based intrusion detection. In: Fourth International Conference on Information Technology, 2007, ITNG'07, pp. 837–842 (2007)
Axelsson, S.: Intrusion detection systems: a survey and taxonomy. Department of Computer Engineering, Chalmers University, Tech. Rep. 1:99–15 (2000)
Murali, A., Rao, M.: A survey on intrusion detection approaches. In: First International Conference on Information and Communication Technologies, ICICT 2005, pp. 233–240 (2005)
Schonlau, M., DuMouchel, W., Ju, W.H., Karr, A.F., Theus, M., Vardi, Y.: Computer intrusion: detecting masquerades. Stat. Sci. 16, 58–74 (2001)
Huang, S.H.S., Wu, H.C.: Analysis of user command behavior and masquerade detection. J. Inf. Assur. Secur. 4, 265–273 (2009)
Liao, Y., Vemuri, V.R., Pasos, A.: Adaptive anomaly detection with evolving connectionist systems. J. Netw. Comput. Appl. 30(1), 60–80 (2007)
Guan, X., Wang, W., Zhang, X.: Fast intrusion detection based on a non-negative matrix factorization model. J. Netw. Comput. Appl. 32(1), 31–44 (2009)
Greenberg, S.: Using unix: collected traces of 168 users. Department of Computer Science, University of Calgary. Technical Report 88(333), 45 (1988)
Sebastiani, F.: Machine learning in automated text categorization. ACM Comput. Surv. (CSUR) 34(1), 1–47 (2002)
Maxion, R.A., Townsend, T.N.: Masquerade detection augmented with error analysis. IEEE Trans. Reliab. 53(1), 124–147 (2004)
Kim, H.S., Cha, S.D.: Empirical evaluation of SVM-based masquerade detection using UNIX commands. Comput. Secur. 24(2), 160–168 (2005)
Warrender, C., Forrest, S., Pearlmutter, B.: Detecting intrusions using system calls: alternative data models. In: IEEE Symposium on Security and Privacy, pp. 133–145. IEEE Computer Society, USA (1999)
Oka, M., Oyama, Y., Abe, H., Kato, K.: Anomaly detection using layered networks based on eigen co-occurrence matrix. Lecture Notes in Computer Science, pp. 223–237 (2004)
Jian, Z., Shirai, H., Takahashi, I., Kuroiwa, J., Odaka, T., Ogura, H.: Masquerade detection by boosting decision stumps using UNIX commands. Comput. Secur. 26(4), 311–318 (2007)
Latendresse, M.: Masquerade detection via customized grammars. In: Second International Conference on Detection of Intrusions and Malware, and Vulnerability Assessment. LNCS, vol. 3548, pp. 141–159. Springer, Berlin (2005)
Cavnar, W.B., Trenkle, J.M.: N-gram-based text categorization. In: Proceedings of SDAIR-94, 3rd Annual Symposium on Document Analysis and Information Retrieval, pp. 161–175 (1994)
Jones, K.S., et al.: A statistical interpretation of term specificity and its application in retrieval. J. Documentation 60, 493–502 (2004)
Salton, G., Buckley, C.: Term-weighting approaches in automatic text retrieval. Inf. Process. Manage. 24(5), 513–523 (1988)
Debole, F., Sebastiani, F.: Supervised term weighting for automated text categorization. In: Proceedings of the 2003 ACM Symposium on Applied Computing, pp. 784–788. ACM, New York (2003)

Author information

Authors and Affiliations

Graduate School of Engineering, University of Fukui, Fukui, Japan
Dai Geng, Thmohiro Odaka, Jousuke Kuroiwa & Hisakazu Ogura

Corresponding author

Correspondence to Dai Geng.

About this article

Cite this article

Geng, D., Odaka, T., Kuroiwa, J. et al. An N-Gram and STF-IDF model for masquerade detection in a UNIX environment. J Comput Virol 7, 133–142 (2011). https://doi.org/10.1007/s11416-010-0143-3

Received: 09 June 2009
Accepted: 08 April 2010
Published: 13 May 2010
Issue Date: May 2011
DOI: https://doi.org/10.1007/s11416-010-0143-3
