Experiments on Density-Constrained Graph Clustering

Authors:
Robert Görke, Karlsruhe Institute of Technology, Germany
Andrea Kappes, Karlsruhe Institute of Technology, Germany
Dorothea Wagner, Karlsruhe Institute of Technology, Germany

ACM Journal of Experimental Algorithmics, Volume 19, Article No.: 3.3, pp 1.1–1.31
https://doi.org/10.1145/2638551

Published: 07 January 2015

Abstract

Clustering a graph means identifying internally dense subgraphs that are only sparsely interconnected. Formalizations of this notion lead to measures that quantify the quality of a clustering and to algorithms that actually find clusterings. Since, most generally, corresponding optimization problems are hard, heuristic clustering algorithms are used in practice, or other approaches that are not based on an objective function. In this work, we conduct a comprehensive experimental evaluation of the qualitative behavior of greedy bottom-up heuristics driven by cut-based objectives and constrained by intracluster density, using both real-world data and artificial instances. Our study documents that a greedy strategy based on local movement is superior to one based on merging. We further reveal that the former approach generally outperforms alternative setups and reference algorithms from the literature in terms of its own objective, while a modularity-based algorithm competes surprisingly well. Finally, we exhibit which combinations of cut-based inter- and intracluster measures are suitable for identifying a hidden reference clustering in synthetic random graphs and discuss the skewness of the resulting cluster size distributions. Our results serve as a guideline to the usage of bicriterial, cut-based measures for graph clusterings.
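To make the two competing criteria in the abstract concrete, the following minimal Python sketch computes one possible intracluster density (the fraction of vertex pairs inside a cluster that are joined by an edge) and one possible cut-based intercluster measure (the number of edges running between clusters). These are illustrative stand-ins chosen for this sketch, not necessarily the measures evaluated in the article; the function names and the toy graph are hypothetical.

```python
def intracluster_density(edges, cluster):
    """Fraction of vertex pairs inside `cluster` that are joined by an edge."""
    n = len(cluster)
    if n < 2:
        return 1.0  # assumption of this sketch: tiny clusters count as dense
    inside = sum(1 for u, v in edges if u in cluster and v in cluster)
    return 2 * inside / (n * (n - 1))


def cut_size(edges, clustering):
    """Number of edges whose endpoints lie in different clusters."""
    label = {v: i for i, cluster in enumerate(clustering) for v in cluster}
    return sum(1 for u, v in edges if label[u] != label[v])


# Toy graph (hypothetical): two triangles joined by the single edge (2, 3).
EDGES = [(0, 1), (1, 2), (0, 2), (3, 4), (4, 5), (3, 5), (2, 3)]

split = [{0, 1, 2}, {3, 4, 5}]
merged = [{0, 1, 2, 3, 4, 5}]

for name, clustering in (("split", split), ("merged", merged)):
    densities = [intracluster_density(EDGES, c) for c in clustering]
    print(name, "min density:", min(densities), "cut:", cut_size(EDGES, clustering))
```

On this toy graph, merging everything into one cluster removes the cut entirely but lowers the density, which is the kind of trade-off a bicriterial measure has to balance.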

Reviews

Reviewer: Hui Liu

Graph clustering is the task of identifying dense subgraphs of a given graph such that these subgraphs are sparsely interconnected. In this paper, the authors conduct an experimental evaluation of greedy graph clustering algorithms. First, the problem (density-constrained clustering) and its background are briefly introduced. Second, two greedy algorithms are presented. The first is the greedy merge (GM) algorithm, which greedily merges pairs of clusters. The second is the greedy vertex moving (GVM) algorithm, which allows vertices to move to another cluster to improve the objective function. GM and GVM are then compared with respect to several criteria, including intercluster density, balancedness, cluster size distribution, and the effectiveness of different objective functions. The GVM algorithm is also compared with several reference algorithms. Finally, the GVM and multilevel modularity (ML-MOD) algorithms are used to compare different objective functions. Graph clustering is important to the modern Internet, with uses such as online shopping, social network analysis, and link prediction; it is a popular research area. Through various experiments, the authors show the advantages and disadvantages of different clustering algorithms. The results are interesting and practical, which makes them useful to software developers and algorithm researchers; they can serve as a guideline.

Online Computing Reviews Service
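The two greedy strategies named in the review can be sketched roughly as follows. This is a simplified illustration under stated assumptions, not the authors' implementation: the objective is taken to be the number of intercluster edges, the constraint is a minimum edge density per cluster, both start from singleton clusters, and every function name and the toy graph are hypothetical.

```python
def density(edges, cluster):
    """Edge density inside `cluster`; clusters with < 2 vertices count as dense."""
    n = len(cluster)
    if n < 2:
        return 1.0
    inside = sum(1 for u, v in edges if u in cluster and v in cluster)
    return 2 * inside / (n * (n - 1))


def edges_between(edges, a, b):
    """Number of edges with one endpoint in `a` and the other in `b`."""
    return sum(1 for u, v in edges if (u in a and v in b) or (u in b and v in a))


def greedy_merge(vertices, edges, min_density):
    """GM-style sketch: merge the feasible pair that removes the most cut edges."""
    clusters = [{v} for v in vertices]
    while True:
        best = None
        for i in range(len(clusters)):
            for j in range(i + 1, len(clusters)):
                if density(edges, clusters[i] | clusters[j]) < min_density:
                    continue  # merged cluster would violate the constraint
                gain = edges_between(edges, clusters[i], clusters[j])
                if gain > 0 and (best is None or gain > best[0]):
                    best = (gain, i, j)
        if best is None:
            return clusters
        _, i, j = best
        clusters[i] |= clusters[j]
        del clusters[j]


def greedy_vertex_moving(vertices, edges, min_density):
    """GVM-style sketch: move single vertices while the cut strictly shrinks."""
    label = {v: i for i, v in enumerate(vertices)}

    def members(c):
        return {u for u in vertices if label[u] == c}

    def cut():
        return sum(1 for u, v in edges if label[u] != label[v])

    improved = True
    while improved:
        improved = False
        for v in vertices:
            home = label[v]
            best_cut, best_target = cut(), home
            neighbors = {b if a == v else a for a, b in edges if v in (a, b)}
            for target in {label[u] for u in neighbors} - {home}:
                label[v] = target  # tentative move
                feasible = (density(edges, members(target)) >= min_density
                            and density(edges, members(home)) >= min_density)
                if feasible and cut() < best_cut:
                    best_cut, best_target = cut(), target
                label[v] = home  # undo the tentative move
            if best_target != home:
                label[v] = best_target
                improved = True
    return [c for c in (members(i) for i in set(label.values())) if c]


# Toy graph (hypothetical): two triangles joined by the single edge (2, 3).
EDGES = [(0, 1), (1, 2), (0, 2), (3, 4), (4, 5), (3, 5), (2, 3)]
VERTICES = list(range(6))

gm = greedy_merge(VERTICES, EDGES, min_density=0.9)
gvm = greedy_vertex_moving(VERTICES, EDGES, min_density=0.9)
print(sorted(sorted(c) for c in gm))
print(sorted(sorted(c) for c in gvm))
```

Both sketches terminate because each accepted step strictly reduces the number of intercluster edges. The practical difference is granularity: merging can only combine whole clusters, whereas vertex moving can also reassign individual vertices after clusters have formed.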

Published in

ACM Journal of Experimental Algorithmics Volume 19, 2014
402 pages
ISSN:1084-6654
EISSN:1084-6654
DOI:10.1145/2627368
Copyright © 2015 ACM
Permission to make digital or hard copies of all or part of this work for personal or classroom use is granted without fee provided that copies are not made or distributed for profit or commercial advantage and that copies bear this notice and the full citation on the first page. Copyrights for components of this work owned by others than the author(s) must be honored. Abstracting with credit is permitted. To copy otherwise, or republish, to post on servers or to redistribute to lists, requires prior specific permission and/or a fee. Request permissions from [email protected].
Publisher
Association for Computing Machinery
New York, NY, United States
Publication History
- Published: 7 January 2015
- Accepted: 1 June 2014
- Revised: 1 March 2014
- Received: 1 April 2012
Author Tags
Graph clustering
experimental evaluation
quality measures
Qualifiers
- research-article
- Research
- Refereed