An efficient algorithm for incrementally mining frequent closed itemsets

Applied Intelligence

Abstract

The purpose of mining frequent itemsets is to identify groups of items that frequently appear together in a transaction database, that is, groups whose support meets or exceeds a user-specified threshold. However, a transaction database may contain a very large number of frequent itemsets, which hinders decision making. Recently, the mining of frequent closed itemsets has become a major research issue because the set of frequent closed itemsets is a condensed yet complete representation of the frequent itemsets: all frequent itemsets can be derived from the frequent closed itemsets. Nonetheless, the number of transactions in a transaction database can increase rapidly within a short period, and some transactions may become outdated. Thus, the frequent closed itemsets may change when new transactions are added to, or old transactions are deleted from, the transaction database. Updating the previously closed itemsets when transactions are added or removed is challenging.
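To make the "condensed yet complete" property concrete, the following is a minimal brute-force sketch in Python, not the method of the paper. The toy transactions, the threshold of 2, and all function names are assumptions chosen for illustration. It enumerates the frequent itemsets, keeps those with no proper superset of equal support (the closed ones), and then recovers the support of any frequent itemset as the largest support among its closed supersets.

```python
from itertools import combinations


def support(itemset, transactions):
    """Count the transactions that contain every item of `itemset`."""
    return sum(1 for t in transactions if itemset <= t)


def frequent_itemsets(transactions, min_support):
    """Brute-force enumeration; only suitable for tiny examples."""
    items = sorted(set().union(*transactions))
    result = {}
    for size in range(1, len(items) + 1):
        for combo in combinations(items, size):
            candidate = frozenset(combo)
            count = support(candidate, transactions)
            if count >= min_support:
                result[candidate] = count
    return result


def closed_itemsets(frequent):
    """Keep itemsets that have no proper superset with the same support."""
    return {
        itemset: count
        for itemset, count in frequent.items()
        if not any(itemset < other and count == other_count
                   for other, other_count in frequent.items())
    }


def derive_support(itemset, closed):
    """Support of an itemset = largest support among its closed supersets."""
    counts = [c for s, c in closed.items() if itemset <= s]
    return max(counts) if counts else None  # None: itemset is not frequent


# Hypothetical toy database (an assumption, not data from the paper).
transactions = [frozenset("abc"), frozenset("abc"),
                frozenset("ab"), frozenset("bd")]
frequent = frequent_itemsets(transactions, min_support=2)
closed = closed_itemsets(frequent)

print(len(frequent), "frequent itemsets,", len(closed), "closed")
```

In this toy database seven itemsets are frequent but only three are closed, and `derive_support` reproduces the support of each of the seven from those three.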

This study proposes an efficient algorithm for incrementally mining frequent closed itemsets without rescanning the original database. The proposed algorithm updates the closed itemsets by performing several operations on the previously closed itemsets and the added or deleted transactions, without searching the previously closed itemsets. The experimental results show that the proposed algorithm significantly outperforms previous methods, which spend a substantial amount of time searching the previously closed itemsets.
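The abstract does not describe the operations of the proposed algorithm, so the sketch below is not the authors' method. It illustrates a textbook baseline for the insertion case only, based on the fact that closed itemsets are closed under intersection: after adding a transaction t, the closed itemsets are the old ones, t itself, and every non-empty intersection of t with an old closed itemset. The function name `add_transaction`, the toy data, and the choice to keep all closed itemsets (filtering by threshold only at query time) are assumptions. Note that this baseline visits every stored closed itemset, which is the kind of searching cost the abstract says the proposed algorithm avoids.

```python
def add_transaction(closed, transaction):
    """Return an updated {closed itemset: support} map after one insertion.

    `closed` holds every closed itemset with support >= 1 for the
    transactions seen so far; the original database is never rescanned.
    """
    t = frozenset(transaction)
    if not t:
        return dict(closed)

    # Itemsets contained in t gain one occurrence. For each such itemset,
    # record its support before the insertion: the largest support among
    # the old closed itemsets whose intersection with t equals it.
    old_support = {t: 0}
    for itemset, count in closed.items():
        common = itemset & t
        if common:
            old_support[common] = max(old_support.get(common, 0), count)

    updated = dict(closed)  # itemsets not contained in t keep their support
    for itemset, count in old_support.items():
        updated[itemset] = count + 1
    return updated


# Hypothetical toy stream of transactions (an assumption for illustration).
closed = {}
for items in ["abc", "abc", "ab", "bd"]:
    closed = add_transaction(closed, items)

# Apply the user-specified threshold only when answering a query.
min_support = 2
frequent_closed = {s: c for s, c in closed.items() if c >= min_support}
print(sorted(("".join(sorted(s)), c) for s, c in frequent_closed.items()))
```

Deletion of outdated transactions is not covered by this sketch; it needs additional bookkeeping because an itemset can stop being closed when a transaction is removed.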

Correspondence to Yue-Shi Lee.

Cite this article

Yen, SJ., Lee, YS. & Wang, CK. An efficient algorithm for incrementally mining frequent closed itemsets. Appl Intell 40, 649–668 (2014). https://doi.org/10.1007/s10489-013-0487-8
