Abstract
In security-related areas there is concern over novel "zero-day" attacks that penetrate system defenses and wreak havoc. The best methods for countering these threats are recognizing "nonself," as in an artificial immune system, or recognizing "self" through clustering. In either case, the concern remains that something that appears similar to self could be missed. Given this situation, one might incorrectly assume that preferring a tight fit to self over generalizability reduces false positives in this type of learning problem. This article confirms that in anomaly detection, as in other forms of classification, a tight fit, although important, does not supersede model generality. This is shown using three systems, each with a different geometric bias in the decision space. The first two use spherical and ellipsoid clusters built with a k-means algorithm modified to work on the one-class (blind) classification problem. The third wraps the self points with a multidimensional convex hull (polytope) algorithm capable of learning disjunctive concepts via a thresholding constant. All three algorithms are tested on the Voting dataset from the UCI Machine Learning Repository, the MIT Lincoln Labs intrusion detection dataset, and the lossy-compressed steganalysis domain.
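The first two detectors described above can be illustrated with a minimal one-class k-means sketch: cluster only the "self" points, record each cluster's maximum member distance as a spherical radius, and flag any new point that falls outside every sphere as anomalous. The function names, the radius rule, and the `slack` parameter below are assumptions of this illustration, not the authors' published algorithm.

```python
# Illustrative one-class ("blind") k-means anomaly detector.
# Assumption: a point is "self" if it lies inside at least one
# learned cluster sphere; the slack factor and radius rule are
# hypothetical choices, not taken from the article.
import math
import random

def dist(a, b):
    """Euclidean distance between two points (tuples of floats)."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))

def mean(pts):
    """Component-wise mean of a non-empty list of points."""
    n = len(pts)
    return tuple(sum(xs) / n for xs in zip(*pts))

def kmeans(points, k, iters=50, seed=0):
    """Plain Lloyd's k-means over the self points only."""
    rng = random.Random(seed)
    centers = rng.sample(points, k)
    for _ in range(iters):
        clusters = [[] for _ in range(k)]
        for p in points:
            i = min(range(k), key=lambda c: dist(p, centers[c]))
            clusters[i].append(p)
        # Keep the old center if a cluster ends up empty.
        centers = [mean(c) if c else centers[i]
                   for i, c in enumerate(clusters)]
    return centers, clusters

def fit_self(points, k, slack=1.1):
    """Learn spherical clusters; radius = max member distance * slack."""
    centers, clusters = kmeans(points, k)
    radii = [max(dist(p, c) for p in cl) * slack if cl else 0.0
             for c, cl in zip(centers, clusters)]
    return centers, radii

def is_anomalous(point, centers, radii):
    """Anomalous ("nonself") iff outside every cluster sphere."""
    return all(dist(point, c) > r for c, r in zip(centers, radii))
```

The ellipsoid variant in the article would replace the single radius with a per-dimension extent (a Mahalanobis-style distance); the spherical case above keeps the geometric bias easiest to see.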
References
Avcibas I, Memon N, Sankur B (2002) Image steganalysis with binary similarity measures. In: International conference on image processing, Rochester, NY
Barber CB, Dobkin DP, Huhdanpaa HT (1997) The quickhull algorithm for convex hulls. ACM Trans Math Softw 22:469–483
Barber CB, Huhdanpaa HT (2002) Qhull, version 2002.1. Computer software. Available at: http://www.thesa.com/software/qhull/
Barron AR (1991) Approximation and estimation bounds for artificial neural networks. In: Proceedings of the fourth annual workshop on computational learning theory, Morgan Kaufmann, Palo Alto, CA, pp 243–249
Baum EB, Haussler D (1988) What size net gives valid generalization? In: Proceedings of neural information processing systems, New York, pp 81–90
Blake CL, Merz CJ (1998) UCI repository of machine learning databases. University of California, Department of Information and Computer Science, Irvine, CA. Available at: http://www.ics.uci.edu/~mlearn/MLRepository.html
Brotherton T, Johnson T (2001) Anomaly detection for advanced military aircraft using neural networks. In: IEEE Aerospace Conference, Big Sky, MT
Chang CI, Chiang SS (2002) Anomaly detection and classification for hyperspectral imagery. IEEE Trans Geosci Remote Sens 40(6):1314–1325
Cho SB, Park HJ (2003) Efficient anomaly detection by modeling privilege flows using hidden Markov model. Comput Secur 22(1):45–55
Cohen WW (1988) Generalizing number and learning from multiple examples in explanation based learning. Mach Learn 256–269
Coxeter HSM (1973) Regular polytopes, 3rd ed. Dover, New York
Dasgupta D, Gonzalez F (2002) An immunity-based technique to characterize intrusions in computer networks. IEEE Trans Evol Comput 6(3):281–291
Delany SJ, Cunningham P (2006) ECUE: A spam filter that uses machine learning to track concept drift. Technical Report TCD-CS-2006-05. Trinity College Dublin, Computer Science Department, Ireland
Denning DE (1987) An intrusion detection model. IEEE Trans Softw Eng SE-13:222–232
Drummond C, Holte R (2005) Learning to live with false alarms. In: KDD-2005 workshop on data mining methods for anomaly detection, 21–25 August, Chicago, IL, pp 21–24
Duda RO, Hart PE, Stork DG (2001) Pattern classification, 2nd edn. Wiley, New York
Eskin, E (2000) Anomaly detection over noisy data using learned probability distributions. In: Proceedings of the international conference on machine learning, Stanford University, Stanford, CA
Eskin E, Arnold A, Prerau M, Portnoy L, Stolfo S (2002) A geometric framework for unsupervised anomaly detection: detecting intrusions in unlabeled data. Appl Data Min Comp Secur 82–102
Farid H, Lyu S (2003) Higher-order wavelet statistics and their application to digital forensics. In: IEEE workshop on statistical analysis in computer vision, Madison, WI
Fan W, Miller M, Stolfo S, Lee W, Chan P (2004) Using artificial anomalies to detect unknown and known network intrusions. Knowl Inf Syst 6(5):507–527
Fridrich J, Goljan M, Du R (2001) Detecting LSB steganography in color and gray-scale images. In: IEEE Multimedia Magazine, Special Issue on Security, October 2001, pp 22–28
Gupta A, Sekar R (2003) An approach for detecting self-propagating email using anomaly detection. In: Recent advances in intrusion detection: 6th international symposium, RAID 2003, Lecture notes in computer science, Pittsburgh, PA, 8–10 September
Haines J, Lippmann R, Fried D, Tran E, Boswell S, Zissman M (1999) DARPA intrusion detection system evaluation: design and procedures. MIT Lincoln Laboratory Technical Report, Cambridge
Hamerly G, Elkan C (2003) Learning the k in k-means. Adv Neural Inf Process Syst 15:289–296 (NIPS)
Inoue H, Forrest S (2002) Anomaly intrusion detection in dynamic execution environments. In: New security paradigms workshop, pp 52–60
Jackson J (2003) Targeting covert messages: A unique approach for detecting novel steganography. Masters Thesis, Air Force Institute of Technology, Wright Patterson AFB, OH
Kharrazi M, Sencar T, Memon N (2005) Benchmarking steganographic and steganalysis techniques. In: IEEE SPIE, San Jose, CA, 16–20 January
Kubler TL (2006) Ant clustering with locally weighting ant perception and diversified memory. Masters Thesis, Air Force Institute of Technology, Wright Patterson AFB, OH
Lambert T (1998) Convex hull algorithms applet, UNSW School of Computer Science and Engineering. Available at: http://www.cse.unsw.edu.au/~lambert/java/3d/hull.html
Lane T, Brodley C (2003) An empirical study of two approaches to sequence learning for anomaly detection. Mach Learn 51(1):73–107
Lazarevic A, Ertoz L, Ozgur A, Srivastava J, Kumar V (2003) Evaluation of outlier detection schemes for detecting network intrusions. In: Proceedings of the third SIAM international conference on data mining, San Francisco, CA
Lyu S, Farid H (2002) Detecting hidden messages using higher-order statistics and support vector machines. In: Information hiding: 5th international workshop, IH 2002, Noordwijkerhout, The Netherlands, 7–9 October
Lyu S, Farid H (2004) Steganalysis using color wavelet statistics and one-class support vector machines. In: SPIE symposium on electronic Imaging, San Jose, CA
MacQueen J (1967) Some methods for classification and analysis of multivariate observations. In: Proceedings of the 5th Berkeley symposium on mathematical statistics and probability, pp 281–297
Mahoney M, Chan P (2003) An analysis of the 1999 DARPA/Lincoln laboratory evaluation data for network anomaly detection. In: Proceedings of the recent advances in intrusion detection, RAID 2003. Pittsburgh, PA, 8–10 September
McBride B, Peterson G (2004) Blind data classification using hyper-dimensional convex polytopes. In: Proceedings of the 17th international FLAIRS conference, Miami, FL, pp 520–526
McBride BT, Peterson GL, Gustafson SC (2005) A new blind method for detecting novel steganography. Digit Invest 2:50–70
Melnik O (2002) Decision region connectivity analysis: A method for analyzing high-dimensional classifiers. Mach Learn 48(1–3)
Mitchell TM (1982) Generalization as search. Artif Intell 18:203–226
Mitchell TM, Keller RM, Kedar-Cabelli ST (1986) Explanation-based generalization: A unifying view. Mach Learn 1(1):47–80
Nguyen H, Melnik O, Nissim K (2003) Explaining high-dimensional data. Unpublished presentation. Available at: http://dimax.rutgers.edu/~hnguyen/GOAL.ppt. Accessed 4 Aug 2003
O’Rourke J (1998) Computational geometry in C, 2nd edn. Cambridge University Press, Cambridge, UK
Pelleg D, Moore A (2000) X-means: extending K-means with efficient estimation of the number of clusters. In: Proceedings of the 17th international conference on machine learning (ICML), pp 727–734
Peterson GL, Mills RF, McBride BT, Alred WC (2005) A comparison of generalizability for anomaly detection. In: KDD-2005 workshop on data mining methods for anomaly detection, 21–25 August, Chicago, IL, pp 53–57
Thrun S (1995) Lifelong learning: a case study. Technical Report CMU-CS-95-208, Carnegie Mellon University, Computer Science Department, Pittsburgh, PA
Wah BW (1999) Generalization and generalizability measures. IEEE Trans Knowl Data Eng 11(1):175–186
Widmer G, Kubat M (1996) Learning in the presence of concept drift and hidden contexts. Mach Learn 23(1):68–101
Wong C, Chen C, Yeh S (2000) K-means-based fuzzy classifier design. In: The ninth IEEE international conference on fuzzy systems, vol. 1, pp 48–52
Author information
Gilbert “Bert” Peterson is an Assistant Professor of Computer Engineering at the Air Force Institute of Technology. Dr. Peterson received a BS degree in Architecture, and an M.S. and Ph.D. in Computer Science at the University of Texas at Arlington. He teaches and conducts research in digital forensics and artificial intelligence.
Brent McBride is a Communications and Information Systems officer in the United States Air Force. He received a B.S. in Computer Science from Brigham Young University and an M.S. in Computer Science from the Air Force Institute of Technology. He currently serves as Senior Software Engineer at the Air Force Wargaming Institute.
Cite this article
Peterson, G.L., McBride, B.T. The importance of generalizability for anomaly detection. Knowl Inf Syst 14, 377–392 (2008). https://doi.org/10.1007/s10115-007-0072-8