Abstract
Current classification algorithms focus on vectorial data, given in Euclidean or kernel spaces. Many real-world data, such as biological sequences, are not vectorial and are often non-Euclidean, given only by (dis-)similarities, which calls for efficient and interpretable models. Current classifiers for such data require complex transformations and provide only crisp classifications without any measure of confidence, which is a standard requirement in the life sciences. In this paper we propose a prototype-based conformal classifier for dissimilarity data. The model complexity is adjusted automatically and confidence measures are provided. In experiments on dissimilarity data we investigate its effectiveness with respect to accuracy and model complexity in comparison to different state-of-the-art classifiers.
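To make the notion of a confidence measure on dissimilarity data concrete, the following is a minimal sketch of transductive conformal prediction computed from a dissimilarity matrix alone. It is an illustration under stated assumptions, not the method proposed in the paper: it uses a simple nearest-neighbour nonconformity score (nearest same-class dissimilarity divided by nearest other-class dissimilarity) rather than a prototype-based one, and the function names `nonconformity` and `conformal_predict` are hypothetical.

```python
def nonconformity(i, dissim, labels):
    """Assumed score: nearest same-class dissimilarity / nearest other-class dissimilarity."""
    n = len(labels)
    same = [dissim[i][j] for j in range(n) if j != i and labels[j] == labels[i]]
    other = [dissim[i][j] for j in range(n) if labels[j] != labels[i]]
    if not same:
        return float("inf")   # no same-class neighbour: maximally strange
    if not other:
        return 0.0            # no competing class: maximally conforming
    nearest_other = min(other)
    if nearest_other == 0:
        return float("inf")
    return min(same) / nearest_other


def conformal_predict(train_dissim, train_labels, test_dissim):
    """Return (prediction, confidence, credibility, p_values) for one test item.

    train_dissim: n x n list of pairwise dissimilarities between training items
    test_dissim:  length-n list of dissimilarities from the test item to each training item
    """
    n = len(train_labels)
    # Augment the matrix with the test item as row/column n.
    aug = [list(row) + [test_dissim[i]] for i, row in enumerate(train_dissim)]
    aug.append(list(test_dissim) + [0.0])

    p_values = {}
    for candidate in sorted(set(train_labels)):
        # Tentatively assign the candidate label to the test item.
        labels = list(train_labels) + [candidate]
        alphas = [nonconformity(i, aug, labels) for i in range(n + 1)]
        # p-value: fraction of items at least as nonconforming as the test item.
        p_values[candidate] = sum(a >= alphas[n] for a in alphas) / (n + 1)

    ranked = sorted(p_values, key=p_values.get, reverse=True)
    prediction = ranked[0]
    credibility = p_values[ranked[0]]
    second = p_values[ranked[1]] if len(ranked) > 1 else 0.0
    confidence = 1.0 - second
    return prediction, confidence, credibility, p_values
```

In this sketch the prediction is the label with the largest p-value, credibility is that p-value, and confidence is one minus the second-largest p-value. Only pairwise dissimilarities are needed, so no vectorial embedding of the data is assumed.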
Copyright information
© 2012 IFIP International Federation for Information Processing
About this paper
Cite this paper
Schleif, FM., Zhu, X., Hammer, B. (2012). A Conformal Classifier for Dissimilarity Data. In: Iliadis, L., Maglogiannis, I., Papadopoulos, H., Karatzas, K., Sioutas, S. (eds) Artificial Intelligence Applications and Innovations. AIAI 2012. IFIP Advances in Information and Communication Technology, vol 382. Springer, Berlin, Heidelberg. https://doi.org/10.1007/978-3-642-33412-2_24
DOI: https://doi.org/10.1007/978-3-642-33412-2_24
Publisher Name: Springer, Berlin, Heidelberg
Print ISBN: 978-3-642-33411-5
Online ISBN: 978-3-642-33412-2
eBook Packages: Computer Science (R0)