DOI: 10.1145/2790755.2790771
research-article

UP-Hist Tree: An Efficient Data Structure for Mining High Utility Patterns from Transaction Databases

Published: 13 July 2015

ABSTRACT

High-utility itemset mining is an emerging research area in data mining. Several algorithms proposed for finding high-utility itemsets in transaction databases rely on a data structure called the UP-tree. However, UP-tree-based algorithms generate many candidates because the UP-tree holds too little information to compute tight utility estimates for itemsets. In this paper, we present a data structure named the UP-Hist tree, which maintains a histogram of item quantities at each node of the tree. The histogram allows better utility estimates to be computed, and hence more effective pruning of the search space. Extensive experiments on real and synthetic datasets show that our UP-Hist-tree-based algorithm outperforms state-of-the-art pattern-growth algorithms in terms of the total number of candidate high-utility itemsets that are generated and need to be verified.
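
To make the idea of a per-node quantity histogram concrete, here is a minimal Python sketch. The abstract does not specify the node layout or how estimates are derived, so everything below is an assumption for illustration: the names `HistNode`, `insert`, and `quantity_bounds` are hypothetical, and the bound shown (summing the k smallest or k largest recorded quantities) is one plausible way a histogram could tighten quantity estimates, not necessarily the paper's method.

```python
from collections import Counter
from dataclasses import dataclass, field
from typing import Dict


@dataclass
class HistNode:
    """Hypothetical prefix-tree node holding an item and a quantity histogram."""
    item: str
    children: Dict[str, "HistNode"] = field(default_factory=dict)
    # Maps a purchase quantity to the number of transactions through this
    # node in which the item appeared with that quantity (assumed layout).
    hist: Counter = field(default_factory=Counter)

    @property
    def support(self) -> int:
        # Number of transactions that pass through this node.
        return sum(self.hist.values())


def insert(root, transaction):
    """Insert one transaction given as an ordered list of (item, quantity)."""
    node = root
    for item, qty in transaction:
        child = node.children.get(item)
        if child is None:
            child = HistNode(item)
            node.children[item] = child
        child.hist[qty] += 1
        node = child


def quantity_bounds(hist, k):
    """Return (min, max) total quantity over any k transactions in hist."""
    def take(pairs):
        total, left = 0, k
        for qty, count in pairs:
            if left == 0:
                break
            used = min(count, left)
            total += qty * used
            left -= used
        return total

    ordered = sorted(hist.items())
    return take(ordered), take(reversed(ordered))


root = HistNode("root")
insert(root, [("a", 1), ("b", 5)])
insert(root, [("a", 4), ("b", 2)])
insert(root, [("a", 2)])
a = root.children["a"]
print(a.support, quantity_bounds(a.hist, 2))
```

In this toy example, item `a` occurs with quantities 1, 4, and 2, so any two of those transactions contribute between 3 and 6 units of `a`. Multiplying such bounds by an item's unit profit would give lower and upper utility estimates, whereas a node that stores only an aggregate value cannot distinguish these cases.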

      • Published in

        IDEAS '15: Proceedings of the 19th International Database Engineering & Applications Symposium
        July 2015
        251 pages
        ISBN:9781450334143
        DOI:10.1145/2790755

        Copyright © 2015 ACM

        Permission to make digital or hard copies of all or part of this work for personal or classroom use is granted without fee provided that copies are not made or distributed for profit or commercial advantage and that copies bear this notice and the full citation on the first page. Copyrights for components of this work owned by others than ACM must be honored. Abstracting with credit is permitted. To copy otherwise, or republish, to post on servers or to redistribute to lists, requires prior specific permission and/or a fee. Request permissions from [email protected]

        Publisher

        Association for Computing Machinery

        New York, NY, United States

        Qualifiers

        • research-article
        • Research
        • Refereed limited

        Acceptance Rates

        Overall Acceptance Rate: 74 of 210 submissions, 35%
