Abstract
Cluster ensembles offer a versatile alternative to individual clustering algorithms. In structural pattern recognition, however, cluster ensembles have rarely been studied. In the present paper, a general methodology for creating structural cluster ensembles is proposed. Our representation formalism is based on graphs and includes strings and trees as special cases. The basic idea of our approach is to view the dissimilarities of an input graph g to a number of prototype graphs as a vectorial description of g. Randomized prototype selection offers a convenient way to generate m different vector sets from the same graph set. Applying any available clustering algorithm to these vector sets results in a cluster ensemble with m clusterings, which can then be combined with an appropriate consensus function. In several experiments conducted on different graph sets, the cluster ensemble outperforms two single clustering procedures.
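The pipeline described in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation: strings (a special case of graphs, as noted above) stand in for graphs, Levenshtein distance stands in for a graph dissimilarity, plain k-means plays the role of "any available clustering algorithm", and a thresholded co-association matrix is one possible consensus function. All function names and parameters (`embed`, `cluster_ensemble`, `consensus`, `n_prototypes`, `threshold`) are assumptions made for this example.

```python
import random
from itertools import combinations

def edit_distance(a, b):
    """Levenshtein distance between two strings (assumed stand-in for a graph dissimilarity)."""
    prev = list(range(len(b) + 1))
    for i, ca in enumerate(a, 1):
        cur = [i]
        for j, cb in enumerate(b, 1):
            cur.append(min(prev[j] + 1, cur[j - 1] + 1, prev[j - 1] + (ca != cb)))
        prev = cur
    return prev[-1]

def embed(objects, prototypes, dist):
    """Describe each object by its vector of dissimilarities to the prototypes."""
    return [[float(dist(o, p)) for p in prototypes] for o in objects]

def kmeans(vectors, k, rng, iters=20):
    """Plain Lloyd k-means; returns one cluster label per vector."""
    centers = [list(v) for v in rng.sample(vectors, k)]
    labels = [0] * len(vectors)
    for _ in range(iters):
        labels = [
            min(range(k), key=lambda c: sum((x - y) ** 2 for x, y in zip(v, centers[c])))
            for v in vectors
        ]
        for c in range(k):
            members = [v for v, lab in zip(vectors, labels) if lab == c]
            if members:  # keep the old center if a cluster becomes empty
                centers[c] = [sum(col) / len(members) for col in zip(*members)]
    return labels

def cluster_ensemble(objects, dist, n_prototypes, k, m, seed=0):
    """Build m clusterings, each from a different random prototype set."""
    rng = random.Random(seed)
    ensemble = []
    for _ in range(m):
        prototypes = rng.sample(objects, n_prototypes)  # randomized prototype selection
        ensemble.append(kmeans(embed(objects, prototypes, dist), k, rng))
    return ensemble

def consensus(ensemble, threshold=0.5):
    """Link pairs that share a cluster in more than `threshold` of the
    clusterings and return the connected components as consensus labels."""
    n = len(ensemble[0])
    parent = list(range(n))

    def find(i):
        while parent[i] != i:
            parent[i] = parent[parent[i]]
            i = parent[i]
        return i

    for i, j in combinations(range(n), 2):
        share = sum(lab[i] == lab[j] for lab in ensemble) / len(ensemble)
        if share > threshold:
            parent[find(i)] = find(j)
    roots = [find(i) for i in range(n)]
    relabel = {r: idx for idx, r in enumerate(dict.fromkeys(roots))}
    return [relabel[r] for r in roots]

# Toy usage: six strings, m = 5 clusterings with 3 random prototypes each.
words = ["aaaa", "aaab", "aaba", "zzzzzzzz", "zzzzzzzy", "zzzzzzyz"]
ensemble = cluster_ensemble(words, edit_distance, n_prototypes=3, k=2, m=5)
labels = consensus(ensemble)
print(labels)
```

Because each ensemble member sees the data through a different prototype set, a single poor prototype draw or k-means initialization affects only one of the m clusterings; the consensus step is what absorbs such failures. For real graphs, `edit_distance` would be replaced by a graph dissimilarity such as an (approximate) graph edit distance.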
Copyright information
© 2009 Springer-Verlag Berlin Heidelberg
About this paper
Cite this paper
Riesen, K., Bunke, H. (2009). Cluster Ensembles Based on Vector Space Embeddings of Graphs. In: Benediktsson, J.A., Kittler, J., Roli, F. (eds) Multiple Classifier Systems. MCS 2009. Lecture Notes in Computer Science, vol 5519. Springer, Berlin, Heidelberg. https://doi.org/10.1007/978-3-642-02326-2_22
DOI: https://doi.org/10.1007/978-3-642-02326-2_22
Publisher Name: Springer, Berlin, Heidelberg
Print ISBN: 978-3-642-02325-5
Online ISBN: 978-3-642-02326-2
eBook Packages: Computer Science, Computer Science (R0)