Abstract
The weight-based fusion model (WBFM) is among the simplest and most efficient models for modularity-driven community detection (CD) in node-attributed social networks (ASNs), which contain both links between social actors (“structure”) and the actors’ feature vectors (“attributes”). Roughly speaking, the WBFM first converts the attributes into an attributive network, so that the ASN is replaced by two networks, one structural and one attributive. The two networks are then fused into a composite network that is believed to carry information about both the structure and the attributes and that can be fed directly to traditional modularity-driven graph CD approaches. Although the WBFM is widely used, it has been understudied analytically and has so far rested only on heuristic grounds. In this paper, we disclose the mathematical machinery of the WBFM by revealing the objective function of the corresponding CD optimization process and establishing its connection with the traditional ASN CD quality measures. We also propose a pioneering non-manual parameter tuning scheme that provides the desired impact of the structure and the attributes on the CD results within the WBFM. Based on our theoretical results, we further present a well-tunable Leiden-based ASN CD algorithm that proves fast and accurate in multiple experiments with synthetic and real-world datasets.
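To make the pipeline described above concrete, the following is a minimal sketch of the three steps: building an attributive network, fusing it with the structural one, and scoring a partition by modularity on the composite network. It rests on assumptions that the abstract does not fix. The similarity `1 / (1 + distance)`, the convex combination with a hypothetical parameter `alpha`, and the helper names `attribute_weights`, `fuse` and `modularity` are all illustrative choices, not the authors' definitions. The sketch only evaluates a given partition; it does not run Leiden or any other optimizer.

```python
import math
from itertools import combinations


def attribute_weights(features):
    """Attributive network: one weighted edge per node pair.

    Assumed similarity (illustrative only): 1 / (1 + Euclidean distance).
    Edge keys are (u, v) tuples with u < v.
    """
    weights = {}
    for u, v in combinations(sorted(features), 2):
        dist = math.dist(features[u], features[v])
        weights[(u, v)] = 1.0 / (1.0 + dist)
    return weights


def fuse(structural, attributive, alpha):
    """Composite weights, assumed here to be a convex combination:
    alpha * structural + (1 - alpha) * attributive.
    Both inputs must use the same (u, v), u < v, key convention.
    """
    fused = {}
    for edge in set(structural) | set(attributive):
        w = alpha * structural.get(edge, 0.0) + (1.0 - alpha) * attributive.get(edge, 0.0)
        if w > 0.0:  # a zero weight means "no edge"
            fused[edge] = w
    return fused


def modularity(weights, community):
    """Newman-Girvan modularity of a disjoint partition of a weighted
    undirected graph: sum over communities of L_c / m - (d_c / (2 m))^2.
    """
    total = sum(weights.values())  # m, the total edge weight
    internal = {}       # L_c, weight of edges inside community c
    comm_strength = {}  # d_c, summed weighted degree of community c
    for (u, v), w in weights.items():
        cu, cv = community[u], community[v]
        comm_strength[cu] = comm_strength.get(cu, 0.0) + w
        comm_strength[cv] = comm_strength.get(cv, 0.0) + w
        if cu == cv:
            internal[cu] = internal.get(cu, 0.0) + w
    return sum(
        internal.get(c, 0.0) / total - (comm_strength[c] / (2.0 * total)) ** 2
        for c in comm_strength
    )


# Toy ASN (made-up data): a path a-b-c-d with one numeric attribute per node.
structural = {("a", "b"): 1.0, ("b", "c"): 1.0, ("c", "d"): 1.0}
features = {"a": (0.0,), "b": (0.0,), "c": (1.0,), "d": (1.0,)}
partition = {"a": 0, "b": 0, "c": 1, "d": 1}

attributive = attribute_weights(features)
composite = fuse(structural, attributive, alpha=0.5)
q = modularity(composite, partition)
print(f"composite modularity: {q:.4f}")
```

In this sketch, setting `alpha` to 1 recovers the purely structural network and setting it to 0 leaves only the attributive one, which is why the choice of such a parameter controls the relative impact of structure and attributes on the result.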
Notes
An edge weight may be zero, which indicates that there is no social connection.
For nominal or textual attributes, it is common to use one-hot encoding or embeddings to obtain their numerical representation.
Communities may overlap if necessary, but here we focus on disjoint ones.
As before, \(G_S=(\mathcal {V},\mathcal {E},\mathcal {W})\) is just the structure of G.
Acknowledgements
This research was financially supported by the Russian Science Foundation, Agreement 17-71-30029, with co-financing of Bank Saint Petersburg.
Ethics declarations
Conflict of interest
The authors declare that they have no conflict of interest.
Additional information
Publisher's Note
Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.
About this article
Cite this article
Chunaev, P., Gradov, T. & Bochenina, K. The machinery of the weight-based fusion model for community detection in node-attributed social networks. Soc. Netw. Anal. Min. 11, 109 (2021). https://doi.org/10.1007/s13278-021-00811-6
DOI: https://doi.org/10.1007/s13278-021-00811-6