MBA: Market Basket Analysis Using Frequent Pattern Mining Techniques


Sallam Osman Fageeri
Mohammad Abu Kausar
Arockiasamy Soosaimanickam

Abstract

Market Basket Analysis (MBA) is a data mining technique that uses frequent pattern mining algorithms to discover co-occurrence patterns among items that are frequently purchased together. It is commonly used in retail and e-commerce to generate association rules that describe the relationships between items, and to make recommendations to customers based on their previous purchases. These patterns yield insights that can improve sales and marketing strategies. Although numerous works have addressed the computational cost of discovering frequent itemsets, the problem still needs further exploration and development. In this paper, we introduce an efficient Bitwise-Based data structure technique for mining frequent patterns in large-scale databases. The algorithm scans the original database only once, using Bitwise-Based data representations together with a vertical database layout, in contrast to the well-known Apriori and FP-Growth algorithms. The Bitwise-Based technique avoids multiple passes over the original database and hence reduces execution time. Extensive experiments have been carried out to validate our technique, which outperforms Apriori, Eclat, FP-Growth, and H-mine in terms of execution time for Market Basket Analysis.
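
To make the idea of a bitwise vertical layout concrete, the following is a minimal sketch and not the authors' algorithm, whose details are not given in this abstract. It assumes a generic Eclat-style search: one pass over the transactions builds a bitmap per item (bit t is set when transaction t contains the item), and the support of an itemset is the number of set bits in the AND of its items' bitmaps. The function names, the sample baskets, and the `min_support` threshold are all hypothetical and chosen only for illustration.

```python
from typing import Dict, FrozenSet, Iterable, List, Tuple


def build_bitmaps(transactions: Iterable[Iterable[str]]) -> Dict[str, int]:
    """Single scan: bit t of bitmaps[item] is set when transaction t holds item."""
    bitmaps: Dict[str, int] = {}
    for tid, basket in enumerate(transactions):
        for item in set(basket):
            bitmaps[item] = bitmaps.get(item, 0) | (1 << tid)
    return bitmaps


def support(bits: int) -> int:
    """Number of transactions represented by a bitmap (its population count)."""
    return bin(bits).count("1")


def mine_frequent(
    transactions: List[Iterable[str]], min_support: int
) -> Dict[FrozenSet[str], int]:
    """Depth-first itemset search over bitmaps (illustrative, Eclat-style)."""
    bitmaps = build_bitmaps(transactions)
    # Keep only frequent single items, in a fixed order to avoid duplicates.
    frequent_items: List[Tuple[str, int]] = sorted(
        ((item, bits) for item, bits in bitmaps.items()
         if support(bits) >= min_support),
        key=lambda pair: pair[0],
    )
    result: Dict[FrozenSet[str], int] = {}

    def extend(prefix: Tuple[str, ...], prefix_bits: int,
               candidates: List[Tuple[str, int]]) -> None:
        for i, (item, bits) in enumerate(candidates):
            # Intersect transaction sets with a bitwise AND; the database
            # itself is never rescanned.
            new_bits = (prefix_bits & bits) if prefix else bits
            count = support(new_bits)
            if count >= min_support:
                itemset = prefix + (item,)
                result[frozenset(itemset)] = count
                extend(itemset, new_bits, candidates[i + 1:])

    extend((), 0, frequent_items)
    return result


# Hypothetical sample data: five baskets, absolute support threshold of 3.
baskets = [
    {"bread", "milk"},
    {"bread", "diapers", "beer"},
    {"milk", "diapers", "beer"},
    {"bread", "milk", "diapers"},
    {"bread", "milk", "beer"},
]
frequent = mine_frequent(baskets, min_support=3)
for itemset, count in sorted(frequent.items(), key=lambda kv: sorted(kv[0])):
    print(sorted(itemset), count)
```

In this sketch Python integers stand in for the bitmaps, so the intersection of two transaction sets is a single `&` operation. A production implementation would more likely use fixed-width machine words or a dedicated bitset structure.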

Article Details

How to Cite
Fageeri, S. O., Kausar, M. A., & Soosaimanickam, A. (2023). MBA: Market Basket Analysis Using Frequent Pattern Mining Techniques. International Journal on Recent and Innovation Trends in Computing and Communication, 11(5s), 15–21. https://doi.org/10.17762/ijritcc.v11i5s.6591
