Research paper
Influence maximization by rumor spreading on correlated networks through community identification

https://doi.org/10.1016/j.cnsns.2019.105094

Highlights

  • Network assortativity affects the results of influence maximization methods.

  • Adding more spreaders may not yield additional informed nodes at the end of the dynamics.

  • Selecting the best spreaders by communities performs similarly to the Greedy approach.

  • Selecting spreaders by communities is more suitable and less time-consuming.

Abstract

The identification of the minimal set of nodes that maximizes the propagation of information is one of the most relevant problems in network science. In this paper, we introduce a new method to find the set of initial spreaders that maximizes information propagation in complex networks. We evaluate this method in assortative networks and verify that degree-degree correlation plays a fundamental role in the spreading dynamics. Simulation results show that, regarding the average size of outbreaks, our algorithm is statistically similar to the greedy approach in real-world networks. However, our method is much less time-consuming than the greedy algorithm.

Introduction

With the popularization of Internet access through mobile devices, online social networks have emerged as a suitable medium for information transmission [1], [2]. News, rumors, and advertisements propagate fast in these networks due to the low average degree of separation between users [2]. Information is also exchanged in communication networks, where users share files with diverse content, including images, audio, and video. Communication and social networks are also characterized by a very heterogeneous structure, in which most users are poorly connected, whereas a small set of them have many connections [2]. Moreover, in some social networks, high-degree vertices tend to connect to low-degree vertices, defining a disassortative wiring pattern. This complex network structure affects information propagation, defining a hierarchy among the nodes [1]. This means that networks contain special nodes that are the most influential spreaders in the propagation process [3], [4], i.e., nodes that maximize the average size of outbreaks.

The identification of these influential nodes is essential to understand and control the spreading process on social networks [3]. In particular, the influence maximization problem (IMP) consists of selecting a set of η spreaders that triggers the largest cascade of new adopters under a given spreading dynamic [5]. Finding this set of initial spreaders is NP-hard for most spreading models [1], which makes the IMP a challenge for network scientists. Since it is not feasible to obtain optimal results for most networks, the IMP is addressed by heuristic algorithms. For instance, one of the most studied methods is a hill-climbing greedy approach [1], which guarantees that the influence spread is within a factor of (1 − 1/e) of the optimal influence spread. This greedy algorithm outperforms the classic degree and centrality-based heuristics in influence spread [1], but it is still very computationally expensive. In addition, Morone and Makse [3] mapped the IMP onto optimal percolation in random networks to identify the nodes that should be removed to minimize the average size of outbreaks. They verified that this set is given by the nodes whose removal breaks the network into many disconnected subgraphs. However, this set of nodes does not necessarily correspond to the optimal spreaders, as verified by Radicchi and Castellano [6]. Although all these works advanced the study of influence maximization, they disregard patterns of connections, such as degree-degree correlation and community structure, which have a fundamental impact on spreading dynamics [2].
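The hill-climbing greedy idea can be illustrated with a minimal sketch. It is not the implementation used in [1] or in this paper. The names `greedy_seeds` and `one_hop_coverage` are hypothetical, and the spread function is a deterministic toy objective (seeds plus their direct neighbors) standing in for the expected outbreak size, which in practice would be estimated by repeated simulation of the spreading model. The toy objective is monotone and submodular, which is the setting in which greedy approximation guarantees of this kind are known to hold.

```python
from typing import Callable, Dict, Hashable, List, Set

# Undirected graph as adjacency sets (an assumed representation).
Graph = Dict[Hashable, Set[Hashable]]


def one_hop_coverage(graph: Graph, seeds: Set[Hashable]) -> float:
    """Toy stand-in for outbreak size: the seeds plus their direct neighbors."""
    covered = set(seeds)
    for s in seeds:
        covered |= graph[s]
    return float(len(covered))


def greedy_seeds(
    graph: Graph,
    eta: int,
    spread: Callable[[Graph, Set[Hashable]], float] = one_hop_coverage,
) -> List[Hashable]:
    """Pick eta seeds, each time adding the node with the largest marginal gain."""
    seeds: List[Hashable] = []
    chosen: Set[Hashable] = set()
    for _ in range(min(eta, len(graph))):
        base = spread(graph, chosen)
        best_node, best_gain = None, float("-inf")
        # Dict order is insertion order, so ties are broken reproducibly.
        for v in graph:
            if v in chosen:
                continue
            gain = spread(graph, chosen | {v}) - base
            if gain > best_gain:
                best_node, best_gain = v, gain
        seeds.append(best_node)
        chosen.add(best_node)
    return seeds
```

Each of the η rounds evaluates the spread function once per remaining candidate, so the cost grows with η times the number of nodes times the cost of one spread estimate. With a simulation-based estimate, that product is what makes the greedy approach expensive.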

Degree-degree correlation (or assortativity) is a network property in which nodes with similar features, such as degree, tend to be connected. Previous works verified that epidemics spread faster in assortative networks, but the reach is more extensive in disassortative structures [7]. Assortativity also influences the spreading threshold [8] and the diffusion time [9]. Although degree-degree correlation influences the spreading dynamics, the role of this network property in the influence maximization problem has not yet been addressed (see, for instance, [1], [2]). Here, we analyze how degree-degree correlation affects the average size of outbreaks in rumor dynamics.
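A common way to quantify degree-degree correlation is the Pearson correlation between the degrees found at the two ends of each edge. The sketch below computes that coefficient for a simple undirected graph given as an edge list. The function name `degree_assortativity` is hypothetical, and the sketch assumes no self-loops or repeated edges. Whether this matches the exact estimator used in the paper is an assumption, since the excerpt does not give the formula.

```python
from typing import Dict, Hashable, Iterable, List, Tuple


def degree_assortativity(edges: Iterable[Tuple[Hashable, Hashable]]) -> float:
    """Pearson correlation of the degrees at the two ends of each edge."""
    edge_list = list(edges)
    degree: Dict[Hashable, int] = {}
    for u, v in edge_list:
        degree[u] = degree.get(u, 0) + 1
        degree[v] = degree.get(v, 0) + 1

    # Count every undirected edge in both directions so the measure is symmetric.
    xs: List[int] = []
    ys: List[int] = []
    for u, v in edge_list:
        xs += [degree[u], degree[v]]
        ys += [degree[v], degree[u]]

    n = len(xs)
    mean = sum(xs) / n  # xs and ys hold the same values, so they share a mean
    cov = sum((x - mean) * (y - mean) for x, y in zip(xs, ys)) / n
    var = sum((x - mean) ** 2 for x in xs) / n
    if var == 0:
        raise ValueError("assortativity is undefined when all edge-end degrees are equal")
    return cov / var
```

Positive values indicate assortative mixing (similar degrees connect) and negative values indicate disassortative mixing. A star graph, where the hub connects only to degree-one leaves, gives the extreme value of −1.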

We also propose a method for the identification of the most influential spreaders based on community organization. Communities are groups of nodes that are densely connected internally but have few connections with other groups [10]. Some authors verified that, to improve the spreading efficiency, a good strategy is to distribute the seeds over the network so as to produce lower overlap [11], [12], [13], [14]. If the community structure is not considered, then only suboptimal solutions can be obtained [12]. This happens because vertices belonging to the same community are likely to be more similar to each other and share the same set of neighbors. Although communities influence the diffusion of information, only a few studies have considered the community organization to study the influence maximization problem [12], [13], [14], [15], [16], [17], [18]. Indeed, most of these works try to reduce the number of candidate vertices according to some evaluation method and the community structure. For instance, Galstyan et al. [12] employed the greedy approach for selecting the seeds in the smallest community and verified that this might cause a global activation cascade even for a small number of seeds. However, the results are restricted to random networks made up of two communities. Wang et al. [13] introduced a community-based greedy algorithm to find the η most influential nodes. The idea is to divide the network into communities and then, by a dynamic programming algorithm, incrementally select the community from which the next influential node is taken. The method involves high computational cost, although it is an order of magnitude faster than the greedy algorithm. In a similar approach, Cao et al. [15] transformed the influence maximization problem into an optimal resource allocation problem in the network communities. Initially, the method assumes that the communities are disconnected. Then, the method selects η candidates from each community according to the degree centrality and a dynamic programming algorithm identifies the final target nodes.
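The general idea of distributing seeds over communities can be sketched as follows. This is only an illustration under stated assumptions, not the procedure of any cited work nor the method proposed in this paper, whose details are not given in this excerpt. The function name `seeds_by_community` is hypothetical, the community partition is assumed to be precomputed by some detection algorithm, and degree is used as the within-community ranking purely for simplicity.

```python
from collections import defaultdict
from typing import Dict, Hashable, List, Set

# Undirected graph as adjacency sets (an assumed representation).
Graph = Dict[Hashable, Set[Hashable]]


def seeds_by_community(
    graph: Graph, membership: Dict[Hashable, Hashable], eta: int
) -> List[Hashable]:
    """Take one seed from each of the eta largest communities.

    membership maps every node to a community label (assumed precomputed).
    Within a community, the highest-degree node is chosen (an assumption).
    """
    groups: Dict[Hashable, List[Hashable]] = defaultdict(list)
    for node, label in membership.items():
        groups[label].append(node)

    # Largest communities first; ties broken by label for reproducibility.
    ranked = sorted(groups.items(), key=lambda kv: (-len(kv[1]), str(kv[0])))

    seeds: List[Hashable] = []
    for _, members in ranked[:eta]:
        # max() keeps the first maximal element, so degree ties are stable.
        seeds.append(max(members, key=lambda n: len(graph[n])))
    return seeds
```

Because every seed comes from a different community, the seeds tend to share few neighbors, which is the low-overlap property discussed above. The cost is dominated by one pass over the nodes plus a sort of the communities, with no spreading simulations involved.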

Although these works provided essential results on the influence maximization problem, none of them addressed the impact of assortativity on the propagation dynamics. These methods are computationally expensive and consider a relatively low number of initial spreaders, i.e., up to η=50 spreaders. Moreover, classical rumor models are not addressed by these studies, although the model by Maki and Thompson [19] is often used to study information dynamics in networks [2], [20], [21], [22], [23]. Thus, in the present work, we provide an analysis of the impact of degree-degree correlation on the influence maximization problem, where the Maki-Thompson model describes the information spreading. We introduce a simple approach to maximize information diffusion that considers the community structure of the network. We perform exhaustive simulations in eight real and ten artificial complex networks and verify that assortativity plays a significant role in the influence maximization problem. For instance, increasing the number of initial spreaders may not increase the size of the outbreak. Moreover, the selection of influential spreaders through communities is statistically similar to the greedy algorithm. However, our method has a much lower computational cost and, therefore, is more suitable in practice.
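For readers unfamiliar with the Maki-Thompson rumor model, the following is a minimal simulation sketch under common textbook assumptions. Nodes are ignorant, spreader, or stifler. A spreader contacts a random neighbor. An ignorant neighbor becomes a spreader with probability `beta`, while contact with a spreader or stifler turns the contacting spreader into a stifler with probability `mu`. The names `maki_thompson`, `beta`, and `mu`, the asynchronous update order, and the handling of isolated seeds are assumptions of this sketch, not details taken from the paper.

```python
import random
from typing import Dict, Hashable, Iterable, Optional, Set

IGNORANT, SPREADER, STIFLER = "ignorant", "spreader", "stifler"


def maki_thompson(
    graph: Dict[Hashable, Set[Hashable]],
    seeds: Iterable[Hashable],
    beta: float = 1.0,
    mu: float = 1.0,
    rng: Optional[random.Random] = None,
) -> float:
    """Run one realization; return the final fraction of non-ignorant nodes."""
    if not (0.0 < beta <= 1.0 and 0.0 < mu <= 1.0):
        raise ValueError("beta and mu must lie in (0, 1]")
    rng = rng or random.Random()
    # Sorted neighbor lists keep runs reproducible for a fixed rng seed.
    neighbors = {v: sorted(graph[v]) for v in graph}
    state = {v: IGNORANT for v in graph}
    spreaders = list(dict.fromkeys(seeds))  # drop duplicate seeds
    for s in spreaders:
        state[s] = SPREADER

    while spreaders:
        i = rng.randrange(len(spreaders))
        u = spreaders[i]
        if not neighbors[u]:
            retire = True  # an isolated spreader has nobody to contact
        else:
            v = rng.choice(neighbors[u])
            if state[v] == IGNORANT:
                if rng.random() < beta:
                    state[v] = SPREADER
                    spreaders.append(v)
                retire = False
            else:
                # Only the contacting spreader may lose interest.
                retire = rng.random() < mu
        if retire:
            state[u] = STIFLER
            spreaders[i] = spreaders[-1]  # swap-remove in O(1)
            spreaders.pop()

    informed = sum(1 for s in state.values() if s != IGNORANT)
    return informed / len(graph)
```

A single run is stochastic, so an outbreak size would normally be estimated by averaging many realizations for the same seed set. Note that this sketch counts the seeds among the informed nodes, which may differ from the normalization used in the paper.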

Section snippets

Concepts and methods

A social network can be represented as a graph G=(V,E) made up of a set of N=|V| vertices (nodes) and a set E of edges that connect pairs of vertices. Here, we consider only undirected and static networks. The degree k_i of a vertex i corresponds to the number of edges attached to i. The degree distribution of a network P(k) gives the probability that a given randomly selected vertex has degree k. Social networks are characterized by highly heterogeneous degree distribution, presenting a

Databases

We perform extensive numerical simulations in several artificial and real-world networks, evaluating the impact of the degree correlation in the influence maximization problem. The structural properties of the networks are summarized in Table 1, with the respective assortativity ρ, number of vertices N, average degree ⟨k⟩, average shortest path length ⟨g⟩ and the average clustering coefficient ⟨Cc⟩. Also, the highest modularity Q value and number of communities Nc identified by the fastgreedy

Impact of assortativity on artificial networks

We calculate the final fraction of influenced individuals according to Eq. (1). The number of initial seeds varies from two nodes to 10% of the total number of nodes (N). The impact of degree correlation on the influence maximization problem is illustrated in Fig. 2, where we show the relative maximum spreading, i.e., the fraction of maximum informed nodes within the population that was not initially informed (initial seeds). We observe that an unexpected phenomenon occurs when networks are

Conclusion

We have analyzed the role of degree-degree correlation in the influence maximization problem. To simulate the information spreading, we consider the rumor model proposed by Maki and Thompson [19], [20], which is more suitable to represent the information dynamics in social networks [2]. We have proposed a method to maximize the influence transmission based on network community organization. This method has been analyzed by performing simulations on top of eight real and ten artificial

Declaration of Competing Interest

The authors declare that they have no known competing financial interests or personal relationships that could have appeared to influence the work reported in this paper.

Acknowledgments

This research was carried out using the computational resources of the Center for Mathematical Sciences Applied to Industry (CeMEAI), funded by the São Paulo Research Foundation (FAPESP) under grant 2013/07375-0. D.A.VO. acknowledges CNPq (grant 140688/2013-7) and FAPESP (grants 2016/23698-1 and 2018/24260-5). F.A.R. acknowledges CNPq (grant 305940/2010-4) and FAPESP (grant 2016/25682-5). Luciano da F. Costa thanks CNPq (grants 307085/2018-0 and 307333/2013-2), FAPESP (2011/50761-2), and NAP-PRP-USP for

References (60)

  • R. Pastor-Satorras et al. Epidemic processes in complex networks. Rev Mod Phys (2015)
  • F. Morone et al. Influence maximization in complex networks through optimal percolation. Nature (2015)
  • D.A. Vega-Oliveros et al. Rumor propagation with heterogeneous transmission in social networks. J Stat Mech (2017)
  • M. Richardson et al. Mining knowledge-sharing sites for viral marketing. Proceedings of the eighth ACM SIGKDD international conference on Knowledge discovery and data mining - KDD ’02 (2002)
  • F. Radicchi et al. Fundamental difference between superblockers and superspreaders in networks. Phys Rev E (2017)
  • I.Z. Kiss et al. The effect of network mixing patterns on epidemic dynamics and the efficacy of disease contact tracing. J Royal Soc Interface (2008)
  • M. Boguñá et al. Absence of epidemic threshold in scale-free networks with degree correlations. Phys Rev Lett (2003)
  • E. Balkanski et al. The importance of communities for learning to influence. NIPS (2017)
  • A. Galstyan et al. Maximizing influence propagation in networks with community structure. Phys Rev E (2009)
  • Y. Wang et al. Community-based greedy algorithm for mining top-k influential nodes in mobile social networks. Proceedings of the 16th ACM SIGKDD international conference on Knowledge discovery and data mining - KDD ’10 (2010)
  • L. Weng et al. Virality prediction and community structure in social networks. Sci Rep (2013)
  • D.A. Vega-Oliveros et al. Spreader selection by community to maximize information diffusion in social networks. SIMBig (2015)
  • M. Hosseini-Pozveh et al. A community-based approach to identify the most influential nodes in social networks. J Inf Sci (2017)
  • D.P. Maki et al. Mathematical models and applications, with emphasis on the social, life, and management sciences (1973)
  • D.A. Vega-Oliveros et al. Evaluating link prediction by diffusion processes in dynamic networks. Sci Rep (2019)
  • D.H. Zanette. Dynamics of rumor propagation on small world networks. Phys Rev E (2001)
  • Y. Moreno et al. Dynamics of rumor spreading in complex networks. Phys Rev E (2004)
  • J. Borge-Holthoefer et al. Absence of influential spreaders in rumor dynamics. Phys Rev E (2012)
  • M. Newman. Assortative mixing in networks. Phys Rev Lett (2002)
  • E. Estrada. Combinatorial study of degree assortativity in networks. Phys Rev E (2011)