Abstract
This paper addresses the problem of transforming arbitrary data into binary data, as a preprocessing step for a supervised classification task. Because a binary mapping compresses the total information in the dataset, the goal is to design a mapping that preserves most of the information relevant to the classification problem. Most existing approaches to this problem are based on correlation or entropy measures between an individual binary variable and the partition into classes. In contrast, the approach proposed here is based on a global study of the combinatorial properties of a set of binary variables.
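To make the setting concrete, the following minimal sketch shows one way a set of binary variables can be judged as a whole rather than one variable at a time. It is an illustration under stated assumptions, not the method of the paper: the abstract does not specify the mapping, so the sketch assumes numeric attributes binarized by threshold cut-points, and it takes "preserving class-relevant information" to mean that no two objects of different classes receive the same binary code. The function names `binarize` and `is_consistent` and the toy data are hypothetical.

```python
def binarize(rows, cuts):
    """Map each numeric row to a tuple of bits.

    `cuts` is a list of (attribute_index, threshold) pairs (an assumed
    representation); each pair yields one binary variable that is 1 when
    the attribute value is at or above the threshold.
    """
    return [tuple(int(row[j] >= t) for j, t in cuts) for row in rows]


def is_consistent(codes, labels):
    """Return True if no binary code is shared by two different classes.

    This is a property of the whole set of binary variables, not of any
    single variable taken alone.
    """
    seen = {}
    for code, label in zip(codes, labels):
        if seen.setdefault(code, label) != label:
            return False
    return True


# Toy data (hypothetical): two numeric attributes, two classes.
rows = [(1.0, 5.0), (2.0, 1.0), (3.0, 5.0), (4.0, 1.0)]
labels = [0, 0, 1, 1]

good_cuts = [(0, 2.5)]  # threshold on the first attribute
bad_cuts = [(1, 3.0)]   # threshold on the second attribute

good = is_consistent(binarize(rows, good_cuts), labels)
bad = is_consistent(binarize(rows, bad_cuts), labels)
```

In this toy case the single cut on the first attribute keeps the two classes apart, while the cut on the second attribute merges objects of different classes into the same code, so the class information is lost.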
Copyright information
© 1999 Springer-Verlag Berlin Heidelberg
Cite this paper
Mayoraz, E., Moreira, M. (1999). Combinatorial Approach for Data Binarization. In: Żytkow, J.M., Rauch, J. (eds) Principles of Data Mining and Knowledge Discovery. PKDD 1999. Lecture Notes in Computer Science, vol 1704. Springer, Berlin, Heidelberg. https://doi.org/10.1007/978-3-540-48247-5_56
DOI: https://doi.org/10.1007/978-3-540-48247-5_56
Publisher Name: Springer, Berlin, Heidelberg
Print ISBN: 978-3-540-66490-1
Online ISBN: 978-3-540-48247-5
eBook Packages: Springer Book Archive