Abstract
Logical analysis of data (LAD) is a data analysis methodology that combines ideas and concepts from optimization, combinatorics, and Boolean functions. The central concept in LAD is that of patterns, or rules, which have been found to play a critical role in classification, ranked regression, clustering, detection of subclasses, feature selection, and other problems. The research area of LAD was defined and initiated by Peter L. Hammer, who was the catalyst of LAD-oriented research for decades, and whose consistent vision and efforts helped the methodology move from theory to data analysis applications, achieve maturity, and succeed in many medical, industrial, and economic case studies. This overview presents some of the basic aspects of LAD, from the definition of the main concepts to efficient algorithms for pattern generation, and from the complexity analysis of the difficult problems embedded in LAD to its biomedical applications. The paper focuses only on some recent developments in LAD that were of particular interest to Peter L. Hammer, who played a key role in obtaining all the results described here. The presentation is based on the original publications of Peter L. Hammer and his co-authors. We dedicate this paper to the memory of Peter L. Hammer.
Cite this article
Alexe, G., Alexe, S., Bonates, T.O. et al. Logical analysis of data – the vision of Peter L. Hammer. Ann Math Artif Intell 49, 265–312 (2007). https://doi.org/10.1007/s10472-007-9065-2
DOI: https://doi.org/10.1007/s10472-007-9065-2