Abstract
Learning in the real world is interactive, incremental, and dynamic in multiple dimensions: new data may appear at any time, from any source, and in any form. Incremental learning is therefore increasingly important in real-world data mining scenarios. Decision trees, owing to their characteristics, have been widely used for incremental learning. In this paper, we propose a novel incremental decision tree algorithm based on rough set theory. To improve computational efficiency, when a new instance arrives the algorithm follows the given decision tree adaptation strategies and either modifies an existing leaf node in the currently active decision tree or adds a new leaf node to it. This avoids the high time complexity that traditional incremental methods incur by rebuilding the decision tree many times. Moreover, a rough set based attribute reduction method is used to filter redundant attributes out of the original attribute set, and two basic notions of rough sets, the significance of attributes and the dependency of attributes, serve as heuristic information for selecting splitting attributes. Finally, we apply the proposed algorithm to intrusion detection. The experimental results demonstrate that our algorithm can provide competitive solutions to incremental learning.
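The two rough set notions named in the abstract can be illustrated with a minimal sketch. It assumes the standard Pawlak definitions, which may differ in detail from the ones used in the paper: the dependency of a decision attribute on a set of condition attributes is the fraction of objects in the positive region, and the significance of an attribute is the drop in dependency when that attribute is removed. The function names, the attribute names (`protocol`, `flag`, `label`), and the toy records are hypothetical and are not taken from the paper or from the KDD Cup 99 data.

```python
from collections import defaultdict


def partition(rows, attrs):
    """Group row indices into equivalence classes by their values on attrs."""
    blocks = defaultdict(list)
    for i, row in enumerate(rows):
        blocks[tuple(row[a] for a in attrs)].append(i)
    return list(blocks.values())


def dependency(rows, cond, dec):
    """Fraction of rows whose cond-equivalence class has a single dec value
    (size of the positive region divided by the size of the universe)."""
    if not rows:
        return 0.0
    positive = 0
    for block in partition(rows, cond):
        if len({rows[i][dec] for i in block}) == 1:
            positive += len(block)
    return positive / len(rows)


def significance(rows, cond, dec, attr):
    """Drop in dependency when attr is removed from the condition attributes."""
    rest = [a for a in cond if a != attr]
    return dependency(rows, cond, dec) - dependency(rows, rest, dec)


# Hypothetical toy decision table (illustrative values only).
rows = [
    {"protocol": "tcp", "flag": "SF", "label": "normal"},
    {"protocol": "tcp", "flag": "S0", "label": "attack"},
    {"protocol": "udp", "flag": "SF", "label": "normal"},
    {"protocol": "udp", "flag": "SF", "label": "attack"},
]
cond = ["protocol", "flag"]

print(dependency(rows, cond, "label"))            # 0.5
print(significance(rows, cond, "label", "flag"))  # 0.5
print(significance(rows, cond, "label", "protocol"))  # 0.25
```

In this toy table, `flag` has the larger significance, so a heuristic of this kind would prefer it as a splitting attribute. How the paper combines significance and dependency when choosing a split is not stated in the abstract, so that step is not sketched here.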
Cite this article
Jiang, F., Sui, Y. & Cao, C. An incremental decision tree algorithm based on rough sets and its application in intrusion detection. Artif Intell Rev 40, 517–530 (2013). https://doi.org/10.1007/s10462-011-9293-z
DOI: https://doi.org/10.1007/s10462-011-9293-z