Copyright © 2007 Elsevier B.V. All rights reserved.
Investigating diversity of clustering methods: An empirical comparison
Received 19 October 2006;
Abstract
The paper aims to shed light on the question of why clustering algorithms, despite being quantitative and hence supposedly objective in nature, yield different and varied results. To that end, we took 10 common clustering algorithms and tested them on four well-known datasets that are used in the literature as baselines with agreed-upon clusters. One additional method, Binary-Positive, developed by our team, was added to the analysis. The results affirm the unpredictable nature of the clustering process and point to the different assumptions made by different methods. One conclusion of the study is that the clustering method should be chosen carefully for any given application.
Keywords: Cluster analysis; Similarity; Binary-Positive data representation
Article Outline
- 1. Introduction
- 2. Study goals
- 3. Clustering algorithms
- 4. Dataset descriptions
- 4.1. Dataset I: chemical analysis of wines
- 4.2. Dataset II: Fisher’s Iris dataset
- 4.3. Dataset III: Ecoli sites analysis
- 4.4. Dataset IV: psychological balanced
- 5. Implementation description
- 5.1. Algorithm implementation and score evaluation
- 5.1.1. Result presentation
- 5.1.2. Basic score evaluation
- 5.1.3. Mismatch analysis
- 5.1.4. Score normalization
- 5.2. The score table
- 6. Graphic representation
- 7. Discussion
- 8. Summary and conclusions
- References
- Vitae