Research paper
Influence maximization by rumor spreading on correlated networks through community identification
Introduction
With the popularization of Internet access through mobile devices, online social networks have emerged as a suitable medium for information transmission [1], [2]. News, rumors, and advertisements propagate quickly in these networks because of the low average degree of separation between users [2]. Information is also exchanged in communication networks, where users share files of many kinds, including images, audio, and video. Communication and social networks are also characterized by a very heterogeneous structure, in which most users are sparsely connected, whereas a small set of them have many connections [2]. Moreover, in some social networks, high-degree vertices tend to connect to low-degree vertices, defining a disassortative wiring pattern. This complex structure affects information propagation, defining a hierarchy among the nodes [1]. In other words, networks contain special nodes that are the most influential spreaders in the propagation process [3], [4], i.e., nodes that maximize the average size of outbreaks.
The identification of these influential nodes is essential to understand and control the spreading process on social networks [3]. In particular, the influence maximization problem (IMP) concerns the selection of a set of η spreaders that triggers the largest cascade of new adopters according to a spreading dynamic [5]. Finding this set of initial spreaders is NP-hard for most spreading models [1], which makes the IMP a challenge for network scientists. Since optimal solutions cannot be obtained in practice for most networks, the IMP is addressed by heuristic algorithms. For instance, one of the most studied methods is a hill-climbing greedy approach [1], which guarantees that the influence spread is within (1 − 1/e) of the optimal influence spread. This greedy algorithm outperforms the classic degree and centrality-based heuristics in influence spread [1], but it is still computationally very expensive. In addition, Morone and Makse [3] mapped the IMP onto optimal percolation in random networks to identify the nodes that should be removed to minimize the average size of outbreaks. They verified that this set is given by the nodes whose removal breaks the network down into many disconnected subgraphs. However, this set of nodes does not necessarily correspond to the optimal spreaders, as verified by Radicchi and Castellano [6]. Although all these works advanced the study of influence maximization, they disregard patterns of connections, such as degree-degree correlation and community structure, which have a fundamental impact on spreading dynamics [2].
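The hill-climbing greedy heuristic mentioned above can be illustrated with a minimal sketch. It assumes an independent-cascade dynamic with a uniform activation probability `p` as a stand-in spreading model, which is not the rumor model studied in this paper. The function names, the adjacency-list format, and the parameters `p` and `runs` are all illustrative assumptions.

```python
import random

def simulate_cascade(adj, seeds, p, rng):
    """One independent-cascade run; returns the number of activated nodes."""
    active = set(seeds)
    frontier = list(seeds)
    while frontier:
        next_frontier = []
        for u in frontier:
            for v in adj[u]:
                # Each newly active node gets one chance to activate each neighbor.
                if v not in active and rng.random() < p:
                    active.add(v)
                    next_frontier.append(v)
        frontier = next_frontier
    return len(active)

def estimate_spread(adj, seeds, p=0.1, runs=200, seed=0):
    """Monte Carlo estimate of the mean cascade size for a seed set."""
    rng = random.Random(seed)  # fixed seed: same random stream for every candidate
    return sum(simulate_cascade(adj, seeds, p, rng) for _ in range(runs)) / runs

def greedy_seeds(adj, eta, p=0.1, runs=200):
    """Pick eta seeds, each time adding the node with the largest estimated spread."""
    seeds = []
    for _ in range(eta):
        best, best_spread = None, -1.0
        for v in adj:
            if v in seeds:
                continue
            spread = estimate_spread(adj, seeds + [v], p, runs)
            if spread > best_spread:
                best, best_spread = v, spread
        seeds.append(best)
    return seeds
```

Every greedy step re-estimates the spread for each remaining candidate, so the cost grows with η × N × `runs` simulations, which is the expense the text refers to.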
Degree-degree correlation (or assortativity) is a network property in which nodes with similar features, such as degree, tend to be connected. Previous works verified that epidemics spread faster in assortative networks, but reach a larger part of the network in disassortative structures [7]. Assortativity also influences the spreading threshold [8] and the diffusion time [9]. Although degree-degree correlation influences the spreading dynamics, the role of this network property in the influence maximization problem has not yet been addressed (see, for instance, [1], [2]). Here, we analyze how degree-degree correlation affects the average size of outbreaks in rumor dynamics.
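As an illustration of how degree-degree correlation is commonly quantified, the sketch below computes the Pearson correlation between the degrees found at the two ends of each edge. It assumes a simple undirected graph given as an edge list without self-loops or duplicate edges. The function name is hypothetical, and the paper's own definition of the assortativity ρ is not shown in this excerpt.

```python
def degree_assortativity(edges):
    """Pearson correlation of the degrees at the two ends of each edge."""
    degree = {}
    for u, v in edges:
        degree[u] = degree.get(u, 0) + 1
        degree[v] = degree.get(v, 0) + 1

    # Count every undirected edge in both directions so the measure is symmetric.
    xs, ys = [], []
    for u, v in edges:
        xs += [degree[u], degree[v]]
        ys += [degree[v], degree[u]]

    n = len(xs)
    mean = sum(xs) / n  # xs and ys hold the same values, so they share one mean
    cov = sum((x - mean) * (y - mean) for x, y in zip(xs, ys)) / n
    var = sum((x - mean) ** 2 for x in xs) / n
    if var == 0:
        return float("nan")  # all edge ends have the same degree: undefined
    return cov / var
```

Positive values indicate an assortative wiring pattern (similar degrees connect), and negative values a disassortative one; a star graph, where the hub only touches leaves, gives the extreme value of −1.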
We also propose a method for identifying the most influential spreaders based on community organization. Communities are groups of nodes that are densely connected among themselves but have few connections with other groups [10]. Some authors verified that a good strategy to improve spreading efficiency is to distribute the seeds over the network so that their regions of influence overlap less [11], [12], [13], [14]. If the community structure is not considered, only suboptimal solutions can be obtained [12]. This happens because vertices belonging to the same community are likely to be more similar to each other and to share the same set of neighbors. Although communities influence the diffusion of information, only a few studies have considered community organization in the influence maximization problem [12], [13], [14], [15], [16], [17], [18]. Most of these works try to reduce the number of candidate vertices according to some evaluation method and the community structure. For instance, Galstyan et al. [12] employed the greedy approach for selecting the seeds in the smallest community and verified that this might cause a global activation cascade even for a small number of seeds. However, the results are restricted to random networks made up of two communities. Wang et al. [13] introduced a community-based greedy algorithm to find the η most influential nodes. The idea is to divide the network into communities and then, by a dynamic programming algorithm, incrementally select the community from which the next influential node is taken. The method involves a high computational cost, although it is an order of magnitude faster than the greedy algorithm. In a similar approach, Cao et al. [15] transformed the influence maximization problem into an optimal resource allocation problem in the network communities. Initially, the method assumes that the communities are disconnected. Then, the method selects η candidates from each community according to the degree centrality, and a dynamic programming algorithm identifies the final target nodes.
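The candidate-selection step attributed to Cao et al. [15] above can be sketched roughly as follows. This is only an illustration under stated assumptions: the community partition is given as a node-to-community mapping, degree is counted inside each community only (one reading of "the communities are disconnected"), and ties are broken by node label. The function name is hypothetical, and the dynamic-programming step that picks the final targets is not shown.

```python
from collections import defaultdict

def candidates_by_community(adj, community, eta):
    """Return up to eta highest-degree nodes for each community."""
    members = defaultdict(list)
    for node, label in community.items():
        members[label].append(node)

    def intra_degree(node):
        # Assumption: only neighbors in the same community count,
        # as if the communities were disconnected from each other.
        return sum(1 for nb in adj[node] if community[nb] == community[node])

    result = {}
    for label, nodes in members.items():
        ranked = sorted(nodes, key=lambda n: (-intra_degree(n), n))
        result[label] = ranked[:eta]
    return result
```

Restricting the search to a few candidates per community is what reduces the number of vertices that a subsequent, more expensive evaluation has to consider.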
Although these works provided essential results on the influence maximization problem, none of them addressed the impact of assortativity on the propagation dynamics. These methods are computationally expensive and consider a relatively low number of initial spreaders, i.e., up to spreaders. Moreover, classical rumor models are not addressed by these studies, although the model by Maki and Thompson [19] is often used to study information dynamics in networks [2], [20], [21], [22], [23]. Thus, in the present work, we analyze the impact of degree-degree correlation on the influence maximization problem, with information spreading modeled by the Maki-Thompson rumor model. We also introduce a simple approach to maximize information diffusion that takes the community structure of the network into account. We perform exhaustive simulations on eight real and ten artificial complex networks and verify that assortativity plays a significant role in the influence maximization problem. For instance, increasing the number of initial spreaders may not increase the size of the outbreak. Moreover, the outbreak sizes obtained by selecting influential spreaders through communities are statistically similar to those of the greedy algorithm. However, our method has a much lower computational cost and is therefore more suitable in practice.
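For readers unfamiliar with the Maki-Thompson model, a minimal simulation sketch is given below. It assumes the usual three states (ignorant, spreader, stifler), asynchronous updates in which one randomly chosen spreader contacts one random neighbor, and two parameters that are naming assumptions of this sketch: `lam` (probability that a contacted ignorant becomes a spreader) and `alpha` (probability that the contacting spreader becomes a stifler after meeting a spreader or stifler). The returned quantity counts the initial seeds as informed, which may differ from the paper's Eq. (1).

```python
import random

def maki_thompson(adj, seeds, lam=1.0, alpha=1.0, rng=None):
    """Run one Maki-Thompson rumor process; return the final informed fraction."""
    if lam <= 0 or alpha <= 0:
        raise ValueError("lam and alpha must be positive for the process to end")
    rng = rng or random.Random()
    state = {v: "ignorant" for v in adj}
    spreaders = list(dict.fromkeys(seeds))  # drop duplicate seeds, keep order
    for s in spreaders:
        state[s] = "spreader"

    while spreaders:
        idx = rng.randrange(len(spreaders))
        i = spreaders[idx]
        stop = False
        if not adj[i]:
            stop = True  # an isolated spreader has nobody to tell
        else:
            j = rng.choice(adj[i])
            if state[j] == "ignorant":
                if rng.random() < lam:
                    state[j] = "spreader"
                    spreaders.append(j)
            elif rng.random() < alpha:
                stop = True  # only the contacting spreader loses interest
        if stop:
            state[i] = "stifler"
            spreaders[idx] = spreaders[-1]  # O(1) removal by swapping with the last
            spreaders.pop()

    informed = sum(1 for s in state.values() if s != "ignorant")
    return informed / len(adj)
```

Averaging this function over many runs for a given seed set gives an estimate of the outbreak size that a seed-selection method tries to maximize.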
Concepts and methods
A social network can be represented as a graph made up of a set of vertices (nodes) and a set E of edges that connect pairs of vertices. Here, we consider only undirected and static networks. The degree k_i of a vertex i corresponds to the number of edges attached to i. The degree distribution P(k) of a network gives the probability that a randomly selected vertex has degree k. Social networks are characterized by a highly heterogeneous degree distribution, presenting a
Databases
We perform extensive numerical simulations in several artificial and real-world networks, evaluating the impact of the degree correlation in the influence maximization problem. The structural properties of the networks are summarized in Table 1, with the respective assortativity ρ, number of vertices N, average degree ⟨k⟩, average shortest path length ⟨g⟩ and the average clustering coefficient ⟨Cc⟩. Also, the highest modularity Q value and number of communities Nc identified by the fastgreedy
Impact of assortativity on artificial networks
We calculate the final fraction of influenced individuals according to Eq. (1). The number of initial seeds varies from two nodes to 10% of the total number of nodes (N). The impact of degree correlation on the influence maximization problem is illustrated in Fig. 2, where we show the relative maximum spreading, i.e., the fraction of maximum informed nodes within the population that was not initially informed (initial seeds). We observe that an unexpected phenomenon occurs when networks are
Conclusion
We have analyzed the role of degree-degree correlation in the influence maximization problem. To simulate the information spreading, we consider the rumor model proposed by Maki and Thompson [19], [20], which is more suitable to represent the information dynamics in social networks [2]. We have proposed a method to maximize the influence transmission based on network community organization. This method has been analyzed by performing simulations on eight real and ten artificial
Declaration of Competing Interest
The authors declare that they have no known competing financial interests or personal relationships that could have appeared to influence the work reported in this paper.
Acknowledgments
This research was carried out using the computational resources of the Center for Mathematical Sciences Applied to Industry (CeMEAI), funded by the São Paulo Research Foundation (FAPESP) under grant 2013/07375-0. D.A.VO. acknowledges CNPq (grant 140688/2013-7) and FAPESP (grants 2016/23698-1 and 2018/24260-5). F.A.R. acknowledges CNPq (grant 305940/2010-4) and FAPESP (grant 2016/25682-5). Luciano da F. Costa thanks CNPq (grants 307085/2018-0 and 307333/2013-2), FAPESP (2011/50761-2), and NAP-PRP-USP for
References (60)
- et al. The Bass diffusion model on networks with correlations and inhomogeneous advertising. Chaos Solitons Fractals (2016)
- et al. Community detection in networks: a user guide. Phys Rep (2016)
- et al. Maximizing influence spread in modular social networks by optimal resource allocation. Expert Syst Appl (2011)
- et al. Identifying influential nodes in complex networks with community structure. Knowl Based Syst (2013)
- et al. A multi-centrality index for graph-based keyword extraction. Inf Process Manag (2019)
- et al. Identifying influential nodes in complex networks based on a spreading influence related centrality. Phys A (2019)
- Tixier A.J., Rossi M.G., Malliaros F.D., Read J., Vazirgiannis M. Perturb and combine to identify influential...
- et al. Fast influencers in complex networks. Commun Nonlinear Sci Numer Simul (2019)
- et al. Influencer identification in dynamical complex systems. J Complex Netw (2019)
- et al. Maximizing the spread of influence through a social network. Theory Comput (2015)