Computational Statistics & Data Analysis
Volume 52, Issue 2, 15 October 2007, Pages 687-701
 
doi:10.1016/j.csda.2007.03.013
Copyright © 2007 Elsevier B.V. All rights reserved.

DIVCLUS-T: A monothetic divisive hierarchical clustering method

Marie Chavent (a, corresponding author), Yves Lechevallier (b) and Olivier Briant (c)

(a) IMB, UMR CNRS 5251, Université Bordeaux 1, 351 cours de la Libération, 33405 Talence Cedex, France
(b) INRIA-Rocquencourt, 78153 Le Chesnay Cedex, France
(c) G-SCOP, ENSGI-INPG, 6 avenue Félix-Viallet, 38031 Grenoble, France

Available online 18 March 2007.


Abstract

DIVCLUS-T is a divisive hierarchical clustering algorithm based on a monothetic bipartitional approach, which allows the dendrogram of the hierarchy to be read as a decision tree. It is designed for either numerical or categorical data. Like the Ward agglomerative hierarchical clustering algorithm and the k-means partitioning algorithm, it is based on the minimization of the inertia criterion. Unlike Ward and k-means, however, it provides a simple and natural interpretation of the clusters. The price DIVCLUS-T pays in terms of inertia for this additional interpretability is studied by applying the three algorithms to six databases from the UCI Machine Learning Repository.

Keywords: Divisive clustering; Monothetic cluster; Decision dendrogram; Inertia criterion
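
To make the idea of a monothetic, inertia-based bipartition concrete, the following is a minimal sketch for numerical data only. It is not the authors' implementation: the function names `inertia` and `best_binary_question`, the unit weights, the squared Euclidean distance, and the use of midpoints between consecutive observed values as cut points are all assumptions made for illustration. Each candidate split is a binary question on a single variable ("is x_j <= c?"), and the question retained is the one whose two resulting clusters have the smallest total within-cluster inertia.

```python
from typing import List, Optional, Sequence, Tuple

Row = Sequence[float]


def inertia(rows: List[Row]) -> float:
    """Sum of squared Euclidean distances to the centroid (unit weights assumed)."""
    if not rows:
        return 0.0
    n = len(rows)
    p = len(rows[0])
    centroid = [sum(r[j] for r in rows) / n for j in range(p)]
    return sum((r[j] - centroid[j]) ** 2 for r in rows for j in range(p))


def best_binary_question(rows: List[Row]) -> Optional[Tuple[int, float, float]]:
    """Search all single-variable questions 'x_j <= cut'.

    Returns (variable index, cut value, within-cluster inertia) for the
    bipartition with the smallest within-cluster inertia, or None when no
    variable takes at least two distinct values.
    """
    best: Optional[Tuple[int, float, float]] = None
    for j in range(len(rows[0])):
        values = sorted({r[j] for r in rows})
        # Hypothetical choice: cut at the midpoint between consecutive values.
        for low, high in zip(values, values[1:]):
            cut = (low + high) / 2
            left = [r for r in rows if r[j] <= cut]
            right = [r for r in rows if r[j] > cut]
            within = inertia(left) + inertia(right)
            if best is None or within < best[2]:
                best = (j, cut, within)
    return best


if __name__ == "__main__":
    # Toy data invented for this sketch: two groups separated on variable 0.
    data = [(1.0, 5.0), (2.0, 6.0), (9.0, 5.0), (10.0, 6.0)]
    print(best_binary_question(data))
    print(inertia(data))
```

Because every split is defined by one variable and one threshold, the sequence of retained questions can be read from the root of the dendrogram downward like the tests of a decision tree. Ward and k-means minimize the same criterion without this single-variable restriction, so they are not confined to such rule-based partitions.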

Article Outline

1. Introduction
2. An example
3. The data table
4. The inertia criterion
4.1. Definitions
4.2. The inertia criterion for numerical or categorical data
5. DIVCLUS-T
5.1. The problem of how to split a cluster
5.1.1. Inertia of a bipartition
5.1.2. Binary questions
5.1.3. Choice of the binary question
5.2. Selecting the cluster to be split
5.3. An example for categorical data
5.4. Computational complexity
6. Empirical comparison with Ward and k-means
6.1. The proportion of inertia explained
6.2. Resampling
7. Conclusion
References





 