Preventing the diffusion of information to vulnerable users while preserving PageRank

  • Regular Paper
International Journal of Data Science and Analytics

Abstract

Limiting the diffusion of information in social networks is important in viral marketing and computer security. To achieve this, existing works aim to prevent the diffusion of information to as many nodes as possible, by deleting a given number of edges. Thus, they adopt a collective approach and quantify the impact of deletion on the graph, based on the number of deleted edges. In this work, we propose a selective approach which quantifies the impact of edge deletion based on PageRank. Our approach allows specifying the nodes to which information diffusion should be prevented and their maximum allowable activation probability. Furthermore, it performs edge deletion while avoiding drastic changes to the ability of the network to propagate information. To realize our approach, we propose a measure that captures changes, caused by deletion, to the PageRank distribution of the graph. Our measure is called PageRank-Harm (PRH) and quantifies the contribution of an incoming edge \((u_l,u)\) to the PageRank score of the node u. Based on PRH, we define the following optimization problem: Given a subset of nodes and a threshold, find a subset of edges that has minimum PRH and whose deletion limits the activation probability of each specified node to at most the threshold. We show that the problem can be modeled as a Submodular Set Cover (SSC) problem and design an approximation algorithm, based on the well-known approximation algorithm for SSC. Furthermore, we develop an iterative heuristic that has similar effectiveness but also enables significant computational savings. Moreover, we propose a lazy edge selection technique that is used to improve the efficiency of both our approximation algorithm and the iterative heuristic, without affecting their effectiveness. Experiments on real and synthetic data show the effectiveness and efficiency of our methods.
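The greedy selection with lazy evaluation mentioned in the abstract can be illustrated generically. The sketch below is not the authors' algorithm: it is a minimal cost-benefit greedy for a submodular cover problem with lazy re-evaluation of marginal gains, under the assumption that the objective is monotone and submodular and that all costs are positive. The names `lazy_greedy_cover`, `harm`, `blocks` and `paths_blocked` are hypothetical; the costs merely stand in for per-edge PRH values, and the toy coverage function stands in for the reduction in activation probability, neither of which is computed as in the paper.

```python
import heapq
from typing import Callable, Dict, FrozenSet, List


def lazy_greedy_cover(
    costs: Dict[str, float],
    f: Callable[[FrozenSet[str]], float],
    target: float,
) -> List[str]:
    """Greedily pick items by marginal gain per unit cost until f(picked) >= target.

    Assumes f is monotone and submodular and every cost is positive.
    """
    picked: List[str] = []
    current = f(frozenset())
    round_no = 0  # number of items picked so far
    # heapq is a min-heap, so ratios are negated. Each entry records the
    # round in which its ratio was computed; older entries are upper bounds.
    heap = []
    for order, (item, cost) in enumerate(costs.items()):
        gain = f(frozenset([item])) - current
        heap.append((-gain / cost, order, item, round_no))
    heapq.heapify(heap)

    while current < target and heap:
        neg_ratio, order, item, stamp = heapq.heappop(heap)
        if stamp != round_no:
            # Stale bound: recompute against the current selection and requeue.
            gain = f(frozenset(picked + [item])) - current
            heapq.heappush(heap, (-gain / costs[item], order, item, round_no))
            continue
        if neg_ratio >= 0:
            break  # best fresh candidate adds nothing: target is unreachable
        picked.append(item)
        current = f(frozenset(picked))
        round_no += 1
    return picked


# Toy instance (hypothetical data): each "edge" blocks some diffusion paths.
blocks = {"a": {1, 2}, "b": {2, 3}, "c": {3, 4}, "d": {1, 2, 3}}
harm = {"a": 1.0, "b": 1.0, "c": 1.0, "d": 2.0}


def paths_blocked(edges: FrozenSet[str]) -> float:
    # Coverage function: number of distinct paths blocked by the chosen edges.
    return float(len(set().union(*(blocks[e] for e in edges))))


chosen = lazy_greedy_cover(harm, paths_blocked, target=4.0)
print(chosen)  # ['a', 'c']
```

In this toy run the candidate "d" is never re-evaluated after the first round: its stale ratio is an upper bound on its true ratio, so it can be skipped as long as a freshly evaluated candidate ranks above it. This is the kind of saving a lazy selection strategy aims for, without changing which items the greedy rule picks.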

Notes

  1. The restriction provided in [24] allows u to remain inactive, after each of its active in-neighbors has attempted to activate u.

  2. The value of a monotone measure for the deletion of \(E'\) cannot be larger than that for the deletion of \(E'\cup \{e\}\).

  3. This function can be written as \(\sum _{v\in G}(\mathcal {P}(v,Q_{S,v},E))-\sum _{v\in G}(\mathcal {P}(v,Q_{S,v},E'))\), which is supermodular. This is because: (I) \(\sum _{v\in G}\mathcal {P}(v,Q_{S,v},E)\) is constant, (II) \(\sum _{v\in G}(\mathcal {P}(v,Q_{S,v},E'))\) is submodular (as a sum of submodular functions [27]), which implies that \(-\sum _{v\in G}(\mathcal {P}(v,Q_{S,v},E'))\) is supermodular [25], and (III) the sum of a constant and a supermodular function is supermodular [27].
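Step (II) of note 3 uses the fact that negating a submodular function yields a supermodular one. A short derivation may help; the symbols \(f\), \(g\), \(C\), \(A\), \(B\) and \(e\) are generic and illustrative, not the paper's notation.

```latex
% f is submodular: marginal gains do not grow as the set grows.
\[
  f(A \cup \{e\}) - f(A) \;\ge\; f(B \cup \{e\}) - f(B),
  \qquad A \subseteq B,\ e \notin B .
\]
% Let g(X) = C - f(X) for a constant C. The constant cancels in every difference:
\[
  g(A \cup \{e\}) - g(A)
  = -\bigl(f(A \cup \{e\}) - f(A)\bigr)
  \;\le\; -\bigl(f(B \cup \{e\}) - f(B)\bigr)
  = g(B \cup \{e\}) - g(B) .
\]
```

The second line is the defining inequality of a supermodular function, so \(g = C - f\) is supermodular whenever \(f\) is submodular, which matches steps (I) to (III) of the note.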

References

  1. Albert, R., Jeong, H., Barabasi, A.: Error and attack tolerance of complex networks. Nature 406, 378–382 (2000)

  2. Avrachenkov, K., Litvak, N.: The effect of new links on google pagerank. Stoch. Models 22(2), 319–331 (2006)

  3. Avrachenkov, K., Litvak, N., Nemirovsky, D., Osipova, N.: Monte carlo methods in pagerank computation. SIAM J. Numer. Anal. 45(2), 890–904 (2007)

  4. Berkhin, P.: A survey on pagerank computing. Internet Math. 2(1), 73–120 (2005)

  5. Boldi, P., Santini, M., Vigna, S.: Paradoxical effects in pagerank incremental computations. Internet Math. 2(3), 387–404 (2005)

  6. Brin, S., Page, L.: The anatomy of a large-scale hypertextual web search engine. Comput. Netw. ISDN 30(1), 107–117 (1998)

  7. Budak, C., Agrawal, D., Abbadi, A.E.: Limiting the spread of misinformation in social networks. In: WWW, pp. 665–674 (2011)

  8. Chan, H., Akoglu, L., Tong, H.: Make it or break it: manipulating robustness in large networks. In: SDM, pp. 325–333 (2014)

  9. Chen, W., et al.: Influence maximization in social networks when negative opinions may emerge and propagate. In: SDM, pp. 379–390 (2011)

  10. Chen, W., Wang, Y., Yang, S.: Efficient influence maximization in social networks. In: KDD, pp. 199–208 (2009)

  11. Chen, W., Yuan, Y., Zhang, L.: Scalable influence maximization in social networks under the linear threshold model. In: ICDM, pp. 88–97 (2010)

  12. Chvatal, V.: A greedy heuristic for the set-covering problem. Math. Oper. Res. 4(3), 233–235 (1979)

  13. Csáji, B., Jungers, R., Blondel, V.: Pagerank optimization by edge selection. Discrete Appl. Math. 169, 73–87 (2014)

  14. Fujito, T.: Approximation algorithms for submodular set cover with applications. IEICE Trans. Inf. Syst. e83(3), 480–487 (2000)

  15. Gao, C., Liu, J., Zhong, N.: Network immunization and virus propagation in email networks: experimental evaluation and analysis. KAIS 27(2), 253–279 (2011)

  16. Goyal, A., Lu, W., Lakshmanan, L.V.S.: CELF++: optimizing the greedy algorithm for influence maximization in social networks. In: WWW, pp. 47–48 (2011)

  17. Goyal, A., Wei, L., Lakshmanan, L.: SIMPATH: an efficient algorithm for influence maximization under the linear threshold model. In: ICDM, pp. 211–220 (2011)

  18. Gupta, M., Pathak, A., Chakrabarti, S.: Fast algorithms for top-k personalized pagerank queries. In: WWW, pp. 1225–1226 (2008)

  19. Gupta, S.: A Conceptual Framework that Identifies Antecedents and Consequences of Building Socially Responsible International Brands. Thunderbird International Business Review, Glendale (2015)

  20. He, X., Song, G., Chen, W., Jiang, Q.: Influence blocking maximization in social networks under the competitive linear threshold model. In: SDM, pp. 463–474 (2012)

  21. Hethcote, H.W.: The mathematics of infectious diseases. SIAM Rev. 42(4), 599–653 (2000)

  22. Hideaki, I., Roberto, T.: Computing the pagerank variation for fragile web data. SICE J. Control Meas. Syst. Integr. 2(1), 1–9 (2009)

  23. Holme, P., Kim, B.J., Yoon, C.N., Han, S.K.: Attack vulnerability of complex networks. Phys. Rev. E 65, 056109 (2002)

  24. Kempe, D., Kleinberg, J., Tardos, E.: Maximizing the spread of influence through a social network. In: KDD, pp. 137–146 (2003)

  25. Khalil, E.B., Dilkina, B., Song, L.: Scalable diffusion-aware optimization of network topology. In: KDD, pp. 1226–1235 (2014)

  26. Kimura, M., Saito, K., Motoda, H.: Solving the contamination minimization problem on networks for the linear threshold model. In: PRICAI, pp. 977–984 (2008)

  27. Krause, A., Golovin, D.: Submodular function maximization. In: Bordeaux, L. (ed.) Tractability. Cambridge University Press (2013)

  28. Kuhlman, C., Tuli, G., Swarup, S., Marathe, M.V., Ravi, S.S.: Blocking simple and complex contagion by edge removal. In: ICDM, pp. 399–408 (2013)

  29. Langville, A.N., Meyer, C.D.: A survey of eigenvector methods for web information retrieval. SIAM Rev. 47(1), 135–161 (2005)

  30. Leskovec, J., Krause, A., Guestrin, C., Faloutsos, C., VanBriesen, J., Glance, N.: Cost-effective outbreak detection in networks. In: KDD, pp. 420–429 (2007)

  31. Lin, H., Bilmes, J.: Multi-document summarization via budgeted maximization of submodular functions. In: HLT, pp. 912–920 (2010)

  32. Lowry, E.S., Medlock, C.W.: Object code optimization. Commun. ACM 12(1), 13–22 (1969)

  33. Maestre, J.M., Ishii, H.: A cooperative game theory approach to the pagerank problem. In: American Control Conference, pp. 3820–3825 (2016)

  34. Minoux, M.: Accelerated greedy algorithms for maximizing submodular set functions. In: IFIP Conference on Optimization Techniques, pp. 234–243 (1978)

  35. Mirzasoleiman, B., Badanidiyuru, A., Karbasi, A., Vondrák, J., Krause, A.: Lazier than lazy greedy. In: AAAI, pp. 1812–1818 (2015)

  36. Nemhauser, G.L., Wolsey, L.A., Fisher, M.L.: An analysis of approximations for maximizing submodular set functions. Math. Program. 14(1), 265–294 (1978)

  37. Nguyen, N.P., Yan, G., Thai, M., Eidenbenz, S.: Containment of misinformation spread in online social networks. In: ACM WebSci, pp. 213–222 (2012)

  38. Saha, S., Adiga, A., Prakash, B.A., Vullikanti, A.K.S.: Approximation algorithms for reducing the spectral radius to control epidemic spread. In: SDM, pp. 568–576 (2015)

  39. Smith, N.C., Cooper-Martin, E.: Ethics and target marketing: the role of product harm and consumer vulnerability. J. Mark. 61(3), 1–20 (1997)

  40. Tong, H., Prakash, B.A., Eliassi-Rad, T., Faloutsos, M., Faloutsos, C.: Gelling, and melting, large graphs by edge manipulation. In: CIKM, pp. 245–254 (2012)

  41. Wolsey, L.A.: An analysis of the greedy algorithm for the submodular set covering problem. Combinatorica 2(4), 385–393 (1982)

  42. Xiang, B., et al.: Pagerank with priors: an influence propagation perspective. In: IJCAI, pp. 2740–2746 (2013)

  43. Yu, S., Gu, G., Barnawi, A., Guo, S., Stojmenovic, I.: Malware propagation in large-scale networks. IEEE TKDE 27(1), 170–179 (2015)

  44. Zhang, H., Li, K., Fu, X., Wang, B.: An efficient control strategy of epidemic spreading on scale-free networks. Chin. Phys. Lett. 26(6), 068901 (2009)

  45. Zhang, Y., Prakash, B.A.: DAVA: distributing vaccines over networks under prior information. In: SDM, pp. 46–54 (2014)

  46. Zhang, Y., Prakash, B.A.: Data-aware vaccine allocation over large networks. ACM TKDD 10(2), 1–32 (2015)

Author information

Corresponding author

Correspondence to Grigorios Loukides.

Ethics declarations

Conflicts of interest

On behalf of all authors, the corresponding author states that there is no conflict of interest.

Additional information

This paper is an extended version of the paper “Limiting the diffusion of information by a selective PageRank-preserving approach” that was presented at the IEEE International Conference on Data Science and Advanced Analytics (DSAA) 2016.

About this article

Cite this article

Loukides, G., Gwadera, R. Preventing the diffusion of information to vulnerable users while preserving PageRank. Int J Data Sci Anal 5, 19–39 (2018). https://doi.org/10.1007/s41060-017-0082-x

  • DOI: https://doi.org/10.1007/s41060-017-0082-x
