Dampster-Shafer Evidence Theory Based Multi-Characteristics Fusion for Clustering Evaluation

Yue, Shihong; Wu, Teresa; Wang, Yamin; Zhang, Kai; Liu, Weixia

doi:10.1007/978-3-642-16248-0_70

Part of the book series: Lecture Notes in Computer Science (LNAI, volume 6401)

Included in the following conference series:

International Conference on Rough Sets and Knowledge Technology

Abstract

Clustering is a widely used unsupervised learning method that groups data with similar characteristics. The performance of a clustering method can generally be evaluated through validity indices. However, most validity indices are designed for specific algorithms and specific structures of the data space. Moreover, these indices consist of a few within-cluster and between-cluster distance functions, and their applicability relies heavily on combining these functions correctly. In this research, we first summarize three common characteristics of any clustering evaluation: (1) the clustering outcome can be evaluated by a group of validity indices if some efficient validity indices are available, (2) the clustering outcome can be measured by an independent intra-cluster distance function, and (3) the clustering outcome can be measured by neighborhood-based functions. Considering the complementary and unstable nature of these clustering evaluations, we then apply Dempster-Shafer (D-S) evidence theory to fuse the three characteristics into a new index, termed fused Multiple Characteristic Indices (fMCI). The fMCI is generally capable of evaluating the clustering outcomes of arbitrary clustering methods on data spaces with more complex structures. We conduct a number of experiments to demonstrate that the fMCI is applicable to different clustering algorithms on different datasets, and that it achieves more accurate and robust clustering evaluation than existing indices.
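
For readers unfamiliar with the fusion step, the following is a minimal sketch of Dempster's rule of combination, the standard D-S operation for merging independent bodies of evidence. It is not the authors' fMCI construction, which this preview does not give. The function name `combine`, the candidate cluster counts, and every mass value below are hypothetical and chosen only for illustration.

```python
from functools import reduce
from itertools import product


def combine(m1, m2):
    """Fuse two mass functions with Dempster's rule of combination.

    A mass function maps frozensets of hypotheses to masses that sum to 1.
    """
    fused = {}
    conflict = 0.0
    for (a, wa), (b, wb) in product(m1.items(), m2.items()):
        common = a & b
        if common:
            # Agreeing evidence: the product mass goes to the intersection.
            fused[common] = fused.get(common, 0.0) + wa * wb
        else:
            # Disjoint focal sets contribute to the conflict mass.
            conflict += wa * wb
    if conflict >= 1.0:
        raise ValueError("sources are in total conflict; rule is undefined")
    # Renormalize by the non-conflicting mass.
    return {h: w / (1.0 - conflict) for h, w in fused.items()}


# Hypothetical setting: the hypotheses are candidate numbers of clusters.
theta = frozenset({2, 3, 4})  # frame of discernment (full ignorance)

# Assumed masses from the three kinds of evidence named in the abstract.
m_indices = {frozenset({3}): 0.6, frozenset({2}): 0.1, theta: 0.3}
m_intra = {frozenset({3}): 0.5, frozenset({4}): 0.2, theta: 0.3}
m_neighborhood = {frozenset({3, 4}): 0.7, theta: 0.3}

fused = reduce(combine, [m_indices, m_intra, m_neighborhood])
best = max(fused, key=fused.get)
print(sorted(best), round(fused[best], 3))
```

Mass left on the whole frame `theta` represents ignorance, which is how a source can contribute weak evidence without being forced to commit to a single hypothesis. With these assumed masses the fused belief concentrates on three clusters.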

Author information

Authors and Affiliations

School of Electric Engineering and Automation, Tianjin University, Tianjin, 300072, China
Shihong Yue, Yamin Wang, Kai Zhang & Weixia Liu
School of Computing, Informatics, Decision Systems Engineering, Arizona State University Tempe, AZ, 85287, USA
Teresa Wu

Editor information

Editors and Affiliations

School of Computer and Information Technology, Beijing Jiaotong University, 100044, Beijing, China
Jian Yu
Faculty of Economics, University of Catania, Corso Italia, 55, 95129, Catania, Italy
Salvatore Greco
Department of Mathematics and Computing Science, Saint Mary’s University, B3H 3C3, Halifax, Nova Scotia, Canada
Pawan Lingras
Institute of Computer Science and Technology, Chongqing University of Posts and Telecommunications, 400065, Chongqing, China
Guoyin Wang
Institute of Mathematics, Warsaw University, Banacha 2, 02-097, Warsaw, Poland
Andrzej Skowron

About this paper

Cite this paper

Yue, S., Wu, T., Wang, Y., Zhang, K., Liu, W. (2010). Dampster-Shafer Evidence Theory Based Multi-Characteristics Fusion for Clustering Evaluation. In: Yu, J., Greco, S., Lingras, P., Wang, G., Skowron, A. (eds) Rough Set and Knowledge Technology. RSKT 2010. Lecture Notes in Computer Science, vol 6401. Springer, Berlin, Heidelberg. https://doi.org/10.1007/978-3-642-16248-0_70

DOI: https://doi.org/10.1007/978-3-642-16248-0_70
Publisher Name: Springer, Berlin, Heidelberg
Print ISBN: 978-3-642-16247-3
Online ISBN: 978-3-642-16248-0
eBook Packages: Computer Science, Computer Science (R0)