Abstract
Finding communities in graphs is one of the most well-studied problems in data mining and social-network analysis. In many real applications, the underlying graph does not have a clear community structure. In those cases, selecting a single community turns out to be a fairly ill-posed problem, as the optimization criterion has to make a difficult choice between selecting a tight but small community or a more inclusive but sparser community.
To avoid the problem of selecting only a single community, we propose discovering a sequence of nested communities. More formally, given a graph and a starting set, our goal is to discover a sequence of communities, all containing the starting set, with each community forming a denser subgraph than the next. Discovering an optimal sequence of communities is a complex optimization problem, and hence we divide it into two subproblems: 1) discover the optimal sequence for a fixed order of graph vertices, a subproblem that we can solve efficiently, and 2) find a good order. We employ a simple heuristic for discovering an order, and we provide empirical and theoretical evidence that our order is good.
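To make the problem statement concrete, the following minimal sketch checks whether a candidate sequence of communities satisfies the stated constraints: every community contains the starting set, the communities are nested, and each is denser than the next. It is an illustration only, not the algorithm of the paper. The helper names (`edge_density`, `communities_from_order`, `is_valid_sequence`) are hypothetical, and edge density (fraction of connected vertex pairs) is an assumed density measure that the abstract does not specify.

```python
from itertools import combinations


def edge_density(adj, nodes):
    """Fraction of vertex pairs in `nodes` joined by an edge.

    Assumption: `adj` maps each vertex to the set of its neighbours
    and is symmetric (undirected graph). Edge density is an assumed
    measure; the abstract does not fix a definition.
    """
    nodes = set(nodes)
    n = len(nodes)
    if n < 2:
        return 0.0
    edges = sum(1 for u, v in combinations(nodes, 2) if v in adj[u])
    return edges / (n * (n - 1) / 2)


def communities_from_order(order, cuts):
    """Turn a fixed vertex order and increasing cut points into nested prefixes."""
    return [order[:k] for k in cuts]


def is_valid_sequence(adj, seed, communities):
    """Check the three constraints from the problem statement."""
    seed = set(seed)
    sets = [set(c) for c in communities]
    # Every community must contain the starting set.
    if any(not seed <= c for c in sets):
        return False
    # Communities must be strictly nested, smallest first.
    if any(not a < b for a, b in zip(sets, sets[1:])):
        return False
    # Each community must be strictly denser than the next one.
    dens = [edge_density(adj, c) for c in sets]
    return all(d1 > d2 for d1, d2 in zip(dens, dens[1:]))


# Toy graph: a triangle a-b-c with a path c-d-e attached.
adj = {
    "a": {"b", "c"},
    "b": {"a", "c"},
    "c": {"a", "b", "d"},
    "d": {"c", "e"},
    "e": {"d"},
}
order = ["a", "b", "c", "d", "e"]  # assumed fixed vertex order
seed = ["a", "b"]

good = communities_from_order(order, [3, 4, 5])
bad = communities_from_order(order, [2, 3, 5])  # first two are equally dense

print(is_valid_sequence(adj, seed, good))
print(is_valid_sequence(adj, seed, bad))
```

With a fixed order, each candidate community is a prefix of that order, so choosing the sequence reduces to choosing cut points. This is the shape of the first subproblem, although the sketch only validates a given choice and does not search for an optimal one.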
Copyright information
© 2013 Springer-Verlag Berlin Heidelberg
About this paper
Cite this paper
Tatti, N., Gionis, A. (2013). Discovering Nested Communities. In: Blockeel, H., Kersting, K., Nijssen, S., Železný, F. (eds) Machine Learning and Knowledge Discovery in Databases. ECML PKDD 2013. Lecture Notes in Computer Science, vol 8189. Springer, Berlin, Heidelberg. https://doi.org/10.1007/978-3-642-40991-2_3
DOI: https://doi.org/10.1007/978-3-642-40991-2_3
Publisher Name: Springer, Berlin, Heidelberg
Print ISBN: 978-3-642-40990-5
Online ISBN: 978-3-642-40991-2
eBook Packages: Computer Science (R0)