Abstract
Many stability measures, such as Normalized Mutual Information (NMI), have been proposed for validating a cluster. This paper discusses the drawback of the common approach and proposes a new asymmetric criterion, the Alizadeh-Parvin-Moshki-Minaei (APMM) criterion, for assessing the association between a cluster and a partition. The APMM criterion compensates for the drawback of the common NMI measure. A clustering ensemble method based on aggregating a subset of primary clusters is also proposed. This method uses the average APMM as a fitness measure for cluster selection: the clusters that satisfy a predefined threshold on this measure are selected to participate in the clustering ensemble. To combine the chosen clusters, a co-association based consensus function is employed. Because the Evidence Accumulation Clustering (EAC) method cannot derive the co-association matrix from a subset of clusters, a new EAC-based method, called Extended EAC (EEAC), is employed to construct the co-association matrix from the chosen subset of clusters. The empirical studies show that the proposed method outperforms the other methods.
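For orientation, the co-association matrix used by standard EAC can be sketched as follows. This is a minimal illustration of the well-known full-partition case only, assuming NumPy and hard cluster labels; the function name `co_association` and the toy labels are hypothetical. It does not reproduce the APMM criterion or the EEAC construction, whose definitions are not given in the abstract.

```python
import numpy as np

def co_association(partitions):
    """Standard EAC co-association matrix built from full partitions.

    partitions: sequence of label vectors, one per base clustering,
    each of length n_samples. Entry (i, j) of the result is the
    fraction of partitions that place samples i and j in the same cluster.
    """
    labels = np.asarray(partitions)      # shape: (n_partitions, n_samples)
    n_partitions, n_samples = labels.shape
    counts = np.zeros((n_samples, n_samples))
    for row in labels:
        # Boolean matrix: True where two samples share a label in this partition.
        counts += (row[:, None] == row[None, :])
    return counts / n_partitions

# Toy ensemble (illustrative data, not from the chapter): 3 partitions of 4 samples.
partitions = [
    [0, 0, 1, 1],
    [0, 0, 0, 1],
    [1, 1, 0, 0],
]
C = co_association(partitions)
print(C)
```

Each entry is normalised by the total number of partitions, which presumes every sample appears in every partition. Once only a subset of clusters is kept, that normalisation no longer applies directly, which is the gap the abstract says EEAC is meant to fill.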
Copyright information
© 2011 Springer-Verlag Berlin Heidelberg
About this chapter
Cite this chapter
Alizadeh, H., Minaei, B., Parvin, H., Moshki, M. (2011). An Asymmetric Criterion for Cluster Validation. In: Mehrotra, K.G., Mohan, C., Oh, J.C., Varshney, P.K., Ali, M. (eds) Developing Concepts in Applied Intelligence. Studies in Computational Intelligence, vol 363. Springer, Berlin, Heidelberg. https://doi.org/10.1007/978-3-642-21332-8_1
DOI: https://doi.org/10.1007/978-3-642-21332-8_1
Publisher Name: Springer, Berlin, Heidelberg
Print ISBN: 978-3-642-21331-1
Online ISBN: 978-3-642-21332-8
eBook Packages: Engineering, Engineering (R0)