Abstract
Data mining is one of the most important areas of the 21st century because its applications are wide-ranging, including medicine, finance, commerce and engineering, to name a few. Pattern mining is among the most important and challenging techniques employed in data mining. Patterns are collections of items that satisfy certain properties. Emerging Patterns are those whose frequencies change significantly from one dataset to another. They represent strong contrast knowledge and have been shown to be very successful for constructing accurate and robust classifiers. In this paper, we examine various kinds of patterns. We also investigate efficient pattern mining techniques and discuss how to exploit patterns to construct effective classifiers.
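To make the notion of an Emerging Pattern concrete, the following is a minimal sketch, not the paper's own algorithm. It assumes that "frequency" means support (the fraction of transactions containing the pattern) and that a "significant change" is measured by the growth rate, i.e. the ratio of the two supports, compared against a user-chosen threshold. The function names, the threshold `min_growth` and the toy datasets are all hypothetical and introduced only for illustration.

```python
from typing import FrozenSet, List

Itemset = FrozenSet[str]


def support(pattern: Itemset, dataset: List[Itemset]) -> float:
    """Fraction of transactions in `dataset` that contain every item of `pattern`."""
    if not dataset:
        return 0.0
    hits = sum(1 for transaction in dataset if pattern <= transaction)
    return hits / len(dataset)


def growth_rate(pattern: Itemset, background: List[Itemset], target: List[Itemset]) -> float:
    """Ratio of the pattern's support in `target` to its support in `background`.

    Assumed conventions: 0 if the pattern never occurs in `target`,
    infinity if it occurs in `target` but never in `background`.
    """
    s_background = support(pattern, background)
    s_target = support(pattern, target)
    if s_target == 0.0:
        return 0.0
    if s_background == 0.0:
        return float("inf")
    return s_target / s_background


def is_emerging(pattern: Itemset, background: List[Itemset],
                target: List[Itemset], min_growth: float = 2.0) -> bool:
    """True if the growth rate reaches the (hypothetical) threshold `min_growth`."""
    return growth_rate(pattern, background, target) >= min_growth


# Toy data, invented for illustration only.
background = [frozenset({"a", "b"}), frozenset({"a"}),
              frozenset({"b", "c"}), frozenset({"c"})]
target = [frozenset({"a", "b"}), frozenset({"a", "b", "c"}),
          frozenset({"a", "b"}), frozenset({"c"})]

ab = frozenset({"a", "b"})
print(support(ab, background), support(ab, target), growth_rate(ab, background, target))
```

In this toy setting the pattern {a, b} occurs in one of four background transactions and three of four target transactions, so its support rises from 0.25 to 0.75. A pattern that occurs in the target data but never in the background data receives an infinite growth rate under the convention assumed here.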
Cite this article
Ramamohanarao, K., Fan, H. Patterns Based Classifiers. World Wide Web 10, 71–83 (2007). https://doi.org/10.1007/s11280-006-0012-7
DOI: https://doi.org/10.1007/s11280-006-0012-7