Abstract
Social networks are ubiquitous. The discovery of close-knit clusters in these networks is of fundamental and practical interest. Existing clustering criteria are limited in that clusters typically do not overlap, all vertices are clustered and/or external sparsity is ignored. We introduce a new criterion that overcomes these limitations by combining internal density with external sparsity in a natural way. An algorithm is given for provably finding the clusters, provided there is a sufficiently large gap between internal density and external sparsity. Experiments on real social networks illustrate the effectiveness of the algorithm.
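To make the idea of combining internal density with external sparsity concrete, here is a minimal sketch. The abstract does not spell out the criterion, so the sketch rests on an assumption: a vertex set counts as a cluster when every member is adjacent to at least a fraction `beta` of the set (counting itself) and every non-member is adjacent to at most a fraction `alpha` of it. The names `fraction_adjacent` and `is_alpha_beta_cluster`, the adjacency-dict graph representation, and the toy graph are all hypothetical and chosen only for illustration. The sketch only checks a given candidate set; it is not the paper's algorithm for finding clusters.

```python
from typing import Dict, Hashable, Set

# Undirected graph as an adjacency dict: vertex -> set of neighbours.
Graph = Dict[Hashable, Set[Hashable]]


def fraction_adjacent(graph: Graph, v: Hashable, cluster: Set[Hashable]) -> float:
    """Fraction of the cluster that v is adjacent to.

    Assumption: a vertex inside the cluster counts as adjacent to itself.
    """
    hits = sum(1 for u in cluster if u == v or u in graph[v])
    return hits / len(cluster)


def is_alpha_beta_cluster(graph: Graph, cluster: Set[Hashable],
                          alpha: float, beta: float) -> bool:
    """Check the assumed criterion for one candidate vertex set."""
    if not cluster:
        return False
    # Internal density: every member sees at least a beta fraction of the set.
    internally_dense = all(
        fraction_adjacent(graph, v, cluster) >= beta for v in cluster
    )
    # External sparsity: every non-member sees at most an alpha fraction.
    externally_sparse = all(
        fraction_adjacent(graph, u, cluster) <= alpha
        for u in graph if u not in cluster
    )
    return internally_dense and externally_sparse


# Toy graph: two triangles {a, b, c} and {d, e, f} joined by the edge c-d.
toy_graph: Graph = {
    "a": {"b", "c"},
    "b": {"a", "c"},
    "c": {"a", "b", "d"},
    "d": {"c", "e", "f"},
    "e": {"d", "f"},
    "f": {"d", "e"},
}

triangle = {"a", "b", "c"}
print(is_alpha_beta_cluster(toy_graph, triangle, alpha=0.4, beta=1.0))
print(is_alpha_beta_cluster(toy_graph, triangle, alpha=0.2, beta=1.0))
```

In the toy graph the triangle is fully connected internally, while the outside vertex `d` touches one third of it, so the set passes only when `alpha` is at least 1/3. Because each candidate set is tested on its own, nothing in this check forces clusters to be disjoint or to cover every vertex.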
Copyright information
© 2007 Springer-Verlag Berlin Heidelberg
About this paper
Cite this paper
Mishra, N., Schreiber, R., Stanton, I., Tarjan, R.E. (2007). Clustering Social Networks. In: Bonato, A., Chung, F.R.K. (eds) Algorithms and Models for the Web-Graph. WAW 2007. Lecture Notes in Computer Science, vol 4863. Springer, Berlin, Heidelberg. https://doi.org/10.1007/978-3-540-77004-6_5
DOI: https://doi.org/10.1007/978-3-540-77004-6_5
Publisher Name: Springer, Berlin, Heidelberg
Print ISBN: 978-3-540-77003-9
Online ISBN: 978-3-540-77004-6
eBook Packages: Computer Science, Computer Science (R0)