Know by a handful the whole sack: efficient sampling for top-k influential user identification in large graphs

Liu, Xiaodong; Li, Shanshan; Liao, Xiangke; Peng, Shaoliang; Wang, Lei; Kong, Zhiyin

doi:10.1007/s11280-012-0196-y

Published: 16 December 2012

World Wide Web, Volume 17, pages 627–647 (2014)


Abstract

Influence maximization aims to find the top-k influential individuals that maximize the influence spread within a social network, and it remains an important yet challenging problem. Most existing greedy algorithms focus on computing the influence spread exactly, which leads to low computational efficiency and limits their application to real-world social networks. In this paper, we show that supervised sampling can estimate the influence spread efficiently at only a negligible cost in precision, significantly reducing execution time. Motivated by this, we propose ESMCE, a power-law exponent supervised Monte Carlo estimation method. ESMCE uses the power-law exponent of the social network to guide the sampling and employs multiple iterative steps to guarantee estimation accuracy. Moreover, ESMCE scales well and is well suited to large-scale social networks. Extensive experiments on six real-world social networks demonstrate that, compared with state-of-the-art greedy algorithms, ESMCE achieves a speedup of almost two orders of magnitude in execution time with only negligible error (2.21% on average) in influence spread.
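
The general idea of estimating influence spread by iterative Monte Carlo sampling can be illustrated with a minimal sketch. This is not the ESMCE algorithm: the abstract does not specify the diffusion model, the sampling rule, or how the power-law exponent enters, so everything below is an assumption made for illustration. The sketch assumes an independent cascade model with a uniform activation probability `p`, and it stops sampling once the standard error of the mean falls below a relative tolerance. The names `simulate_cascade`, `estimate_spread`, `batch`, and `rel_tol` are hypothetical.

```python
import math
import random


def simulate_cascade(graph, seeds, p, rng):
    """Run one independent-cascade simulation (assumed model) and
    return the number of nodes activated, seeds included."""
    active = set(seeds)
    frontier = list(active)
    while frontier:
        next_frontier = []
        for u in frontier:
            for v in graph.get(u, ()):
                # Each newly active node gets one chance per out-neighbor.
                if v not in active and rng.random() < p:
                    active.add(v)
                    next_frontier.append(v)
        frontier = next_frontier
    return len(active)


def estimate_spread(graph, seeds, p=0.1, batch=200, rel_tol=0.05,
                    max_runs=20000, seed=0):
    """Estimate expected spread by sampling in batches until the
    standard error is at most rel_tol * mean (a hypothetical stopping
    rule, not taken from the paper). Returns (mean, runs_used)."""
    rng = random.Random(seed)
    n, mean, m2 = 0, 0.0, 0.0  # Welford running mean and sum of squares
    while n < max_runs:
        for _ in range(batch):
            x = simulate_cascade(graph, seeds, p, rng)
            n += 1
            delta = x - mean
            mean += delta / n
            m2 += delta * (x - mean)
        std_err = math.sqrt(m2 / (n - 1) / n)
        if std_err <= rel_tol * mean:
            break
    return mean, n


if __name__ == "__main__":
    # Tiny directed graph as an adjacency dict (illustrative data only).
    toy = {0: [1, 2], 1: [3], 2: [3], 3: []}
    print(estimate_spread(toy, seeds=[0], p=0.5))
```

A greedy top-k selection would call an estimator like this once per candidate node per round, which is why reducing the number of simulations per estimate matters for execution time.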



Author information

Authors and Affiliations

School of Computer, National University of Defense Technology, Changsha, 410073, China
Xiaodong Liu, Shanshan Li, Xiangke Liao, Shaoliang Peng & Lei Wang
Science and Technology on Information Assurance Laboratory, Beijing, 010100, China
Zhiyin Kong


Corresponding author

Correspondence to Xiaodong Liu.


About this article

Cite this article

Liu, X., Li, S., Liao, X. et al. Know by a handful the whole sack: efficient sampling for top-k influential user identification in large graphs. World Wide Web 17, 627–647 (2014). https://doi.org/10.1007/s11280-012-0196-y


Received: 11 June 2012
Revised: 20 November 2012
Accepted: 22 November 2012
Published: 16 December 2012
Issue Date: July 2014