Abstract
Contrast pattern miners and contrast pattern classifiers typically use a quality measure to evaluate the discriminative power of a pattern. Because many quality measures exist, comparative studies among them are important. However, previous studies mostly compare measures only by their impact on classification accuracy. In this paper, we present a comparative study of quality measures across several aspects: accuracy using the whole training set, accuracy using pattern subsets, and accuracy and compression when filtering patterns. Experiments with 10 quality measures on 25 repository databases show that different quality measures are strongly correlated and that the most accurate quality measures are not appropriate in contexts such as pattern filtering.
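To make the notion of correlated quality measures concrete, the following is a minimal sketch, not the experimental setup of the paper. It assumes three commonly used measures (confidence, growth rate, and weighted relative accuracy), which may or may not be among the 10 measures studied, and it uses a hypothetical toy set of patterns described only by their cover counts. Here `p` and `n` are the numbers of target-class and other-class objects a pattern covers, and `P` and `N` are the class sizes; all names are illustrative.

```python
from math import inf, sqrt

def confidence(p, n, P, N):
    """Fraction of the covered objects that belong to the target class."""
    return p / (p + n) if p + n else 0.0

def growth_rate(p, n, P, N):
    """Support in the target class divided by support in the other class."""
    pos_support, neg_support = p / P, n / N
    if neg_support == 0:
        return inf if pos_support > 0 else 0.0
    return pos_support / neg_support

def wracc(p, n, P, N):
    """Weighted relative accuracy: coverage times precision gain over the class prior."""
    covered = p + n
    if covered == 0:
        return 0.0
    return (covered / (P + N)) * (p / covered - P / (P + N))

def average_ranks(values):
    """Rank values from 1 (smallest); tied values share the mean of their ranks."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        mean_rank = (i + j) / 2 + 1
        for k in range(i, j + 1):
            ranks[order[k]] = mean_rank
        i = j + 1
    return ranks

def spearman(xs, ys):
    """Spearman correlation, computed as the Pearson correlation of the ranks."""
    rx, ry = average_ranks(xs), average_ranks(ys)
    mx, my = sum(rx) / len(rx), sum(ry) / len(ry)
    cov = sum((a - mx) * (b - my) for a, b in zip(rx, ry))
    var_x = sum((a - mx) ** 2 for a in rx)
    var_y = sum((b - my) ** 2 for b in ry)
    if var_x == 0 or var_y == 0:
        return float("nan")  # undefined when one measure is constant
    return cov / sqrt(var_x * var_y)

# Hypothetical toy data: (p, n) cover counts of five patterns, with 50 objects per class.
P, N = 50, 50
patterns = [(40, 5), (30, 0), (10, 10), (45, 30), (5, 1)]

conf_scores = [confidence(p, n, P, N) for p, n in patterns]
gr_scores = [growth_rate(p, n, P, N) for p, n in patterns]
wracc_scores = [wracc(p, n, P, N) for p, n in patterns]

print("confidence vs growth rate:", spearman(conf_scores, gr_scores))
print("confidence vs WRAcc:", spearman(conf_scores, wracc_scores))
```

For fixed class sizes, confidence and growth rate are both increasing functions of the ratio p/n, so they rank patterns identically. WRAcc additionally rewards coverage and can therefore order the same patterns differently.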
Copyright information
© 2013 Springer-Verlag Berlin Heidelberg
About this paper
Cite this paper
García-Borroto, M., Loyola-Gonzalez, O., Martínez-Trinidad, J.F., Carrasco-Ochoa, J.A. (2013). Comparing Quality Measures for Contrast Pattern Classifiers. In: Ruiz-Shulcloper, J., Sanniti di Baja, G. (eds) Progress in Pattern Recognition, Image Analysis, Computer Vision, and Applications. CIARP 2013. Lecture Notes in Computer Science, vol 8258. Springer, Berlin, Heidelberg. https://doi.org/10.1007/978-3-642-41822-8_39
DOI: https://doi.org/10.1007/978-3-642-41822-8_39
Publisher Name: Springer, Berlin, Heidelberg
Print ISBN: 978-3-642-41821-1
Online ISBN: 978-3-642-41822-8
eBook Packages: Computer Science, Computer Science (R0)