Embedding and predicting the event at early stage

World Wide Web

Abstract

Social media has become one of the most credible sources for delivering messages, breaking news, and events. Predicting the future dynamics of an event at a very early stage is highly valuable, e.g., for helping companies anticipate marketing trends before the event becomes mature. However, this prediction is non-trivial because a) social events are always mixed with "noise" under the same topic and b) the information available at the early stage is too sparse and limited to support an accurate prediction. To overcome these two problems, in this paper we design an event early embedding model (EEEM) that can 1) extract social events from noise, 2) find similar previous events, and 3) predict the future dynamics of a new event from very limited information. Specifically, a denoising approach derived from signal analysis is used to eliminate social noise and extract events. Moreover, we propose a novel prediction scheme based on the locally linear embedding algorithm to construct the volume of a new event from its k nearest neighbors. Whereas previous work makes predictions only by fitting historical volume dynamics, our predictive model is based on both the volume information and the content information of events. Extensive experiments conducted on a large-scale Twitter dataset demonstrate the capacity of our model to extract events and its promising prediction performance when both volume information and content information are considered. Compared with predicting from only the content feature or only the volume feature, we obtain the best performance when both are combined with our proposed fusion method.
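The neighbor-based prediction step can be illustrated with a minimal sketch. It is not the authors' formulation: it assumes each event is described by a fixed-length early-stage feature vector, uses the standard locally linear embedding reconstruction weights (sum-to-one least squares), and applies those weights to the neighbors' later volumes. The names `lle_weights`, `predict_future_volume`, `hist_early`, `hist_future`, and the regularization constant `reg` are hypothetical.

```python
import numpy as np

def lle_weights(x, neighbors, reg=1e-3):
    """Weights w (summing to 1) that minimise ||x - sum_j w_j * neighbors[j]||^2."""
    diffs = neighbors - x                 # (k, d) offsets from the query point
    gram = diffs @ diffs.T                # (k, k) local Gram matrix
    k = gram.shape[0]
    # Assumed regularisation: keeps the system solvable when k > d
    # or when some neighbours coincide with the query.
    trace = np.trace(gram)
    gram = gram + np.eye(k) * reg * (trace if trace > 0 else 1.0)
    w = np.linalg.solve(gram, np.ones(k))
    return w / w.sum()

def predict_future_volume(early, hist_early, hist_future, k=3):
    """Combine the futures of the k nearest historical events with LLE weights."""
    dists = np.linalg.norm(hist_early - early, axis=1)
    idx = np.argsort(dists)[:k]           # indices of the k nearest neighbours
    w = lle_weights(early, hist_early[idx])
    return w @ hist_future[idx]

# Toy usage with made-up numbers (not data from the paper).
hist_early = np.array([[1.0, 2.0, 3.0], [2.0, 3.0, 4.0], [9.0, 9.0, 9.0]])
hist_future = np.array([[5.0, 6.0], [7.0, 8.0], [1.0, 1.0]])
prediction = predict_future_volume(np.array([1.5, 2.5, 3.5]), hist_early, hist_future, k=2)
```

Because the weights sum to one, the prediction is an affine combination of the neighbors' future volumes; content features would have to be fused into the neighbor search or the weights separately.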

Notes

  1. https://twitter.com/

  2. https://www.google.com/trends/

  3. In practice, we find that V and Vg may have different lengths. To make them comparable, we simply pad the shorter one with zeros to match the length of the longer one.
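The padding described in note 3 can be sketched as follows. This is an illustration only: the note does not say at which end the zeros are added, so trailing zeros are an assumption here, and the function name `pad_to_same_length` is hypothetical.

```python
import numpy as np

def pad_to_same_length(v, vg):
    """Zero-pad the shorter of two volume series so both have equal length."""
    v = np.asarray(v, dtype=float)
    vg = np.asarray(vg, dtype=float)
    n = max(len(v), len(vg))
    # Assumption: zeros are appended at the end of the shorter series.
    v = np.pad(v, (0, n - len(v)))
    vg = np.pad(vg, (0, n - len(vg)))
    return v, vg

# Toy usage with made-up values.
v_padded, vg_padded = pad_to_same_length([1, 2, 3], [4])
```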

Acknowledgements

This work was supported in part by the National Natural Science Foundation of China under Project 61572108, Project 61632007 and Project 61502081.

Author information

Corresponding author

Correspondence to Yang Yang.

Additional information

This article belongs to the Topical Collection: Special Issue on Geo-Social Computing

Guest Editors: Guandong Xu, Wen-Chih Peng, Hongzhi Yin, Zi (Helen) Huang

About this article

Cite this article

Liu, Z., Yang, Y., Huang, Z. et al. Embedding and predicting the event at early stage. World Wide Web 22, 1055–1074 (2019). https://doi.org/10.1007/s11280-018-0545-6

  • DOI: https://doi.org/10.1007/s11280-018-0545-6
