Abstract
Influence maximization aims to find the top-K influential individuals that maximize the influence spread within a social network, and it remains an important yet challenging problem. Most existing greedy algorithms focus on computing the exact influence spread, which leads to low computational efficiency and limits their application to real-world social networks. In this paper we show that, through supervised sampling, the influence spread can be estimated efficiently at only a negligible cost in precision, thus significantly reducing the execution time. Motivated by this, we propose ESMCE, a power-law exponent supervised Monte Carlo estimation method. In particular, ESMCE exploits the power-law exponent of the social network to guide the sampling and employs multiple iterative steps to guarantee the estimation accuracy. Moreover, ESMCE scales well and is well suited to large-scale social networks. Extensive experiments on six real-world social networks demonstrate that, compared with state-of-the-art greedy algorithms, ESMCE achieves a speedup of almost two orders of magnitude in execution time with only negligible error (2.21% on average) in influence spread.
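To make the idea of estimating influence spread by iterative Monte Carlo sampling concrete, the following is a minimal sketch and not the ESMCE method itself. It assumes an independent cascade diffusion model with a uniform activation probability `p`, which the abstract does not specify, and it uses a simple relative-change stopping rule in place of the power-law exponent guidance described above. The function names, the `batch` and `tol` parameters, and the adjacency-dict graph format are all hypothetical choices made for illustration.

```python
import random


def simulate_cascade(graph, seeds, p, rng):
    """Run one independent-cascade simulation (assumed model).

    graph: dict mapping node -> iterable of out-neighbours.
    Returns the number of nodes activated, including the seeds.
    """
    active = set(seeds)
    frontier = list(active)
    while frontier:
        next_frontier = []
        for u in frontier:
            for v in graph.get(u, ()):
                # Each newly active node gets one chance to activate v.
                if v not in active and rng.random() < p:
                    active.add(v)
                    next_frontier.append(v)
        frontier = next_frontier
    return len(active)


def estimate_spread(graph, seeds, p, rng, batch=100, tol=0.01, max_rounds=50):
    """Estimate expected spread by adding batches of simulations.

    Stops once the running mean changes by at most `tol` (relative)
    between consecutive rounds, or after `max_rounds` (>= 1) rounds.
    Returns (estimated spread, number of simulations used).
    """
    total, runs, prev_mean = 0, 0, None
    for _ in range(max_rounds):
        for _ in range(batch):
            total += simulate_cascade(graph, seeds, p, rng)
        runs += batch
        mean = total / runs
        if prev_mean is not None and abs(mean - prev_mean) <= tol * prev_mean:
            break
        prev_mean = mean
    return mean, runs


if __name__ == "__main__":
    # Hypothetical toy graph: a directed path with one branch.
    toy = {0: [1, 2], 1: [3], 2: [3], 3: [4], 4: []}
    spread, used = estimate_spread(toy, [0], 0.5, random.Random(42))
    print(f"estimated spread {spread:.2f} from {used} simulations")
```

The stopping rule trades accuracy for time: a looser `tol` ends the sampling earlier at the cost of a noisier estimate, which is the same trade-off the abstract describes in general terms.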
Cite this article
Liu, X., Li, S., Liao, X. et al. Know by a handful the whole sack: efficient sampling for top-k influential user identification in large graphs. World Wide Web 17, 627–647 (2014). https://doi.org/10.1007/s11280-012-0196-y
DOI: https://doi.org/10.1007/s11280-012-0196-y