On the Predictive Power of Web Intelligence and Social Media

The Best Way to Predict the Future Is to Tweet It

  • Conference paper
Big Data Analytics in the Social and Ubiquitous Context (SENSEML 2015, MUSE 2014, MSM 2014)

Abstract

With more information becoming widely accessible and new content created every day on today’s web, more people are turning to harvesting and analyzing such data to extract insights. But how useful such data is for seeing beyond the present is not clear. We present efforts to predict future events based on web intelligence – data harvested from the web – with specific emphasis on social media data and on timed event mentions, thereby quantifying the predictive power of such data. We focus on predicting crowd actions such as large protests and coordinated acts of cyber activism – predicting their occurrence, specific timeframe, and location. Using natural language processing, statements about events are extracted from content collected from hundreds of thousands of open content web sources. Attributes extracted include event type, entities involved and their role, sentiment and tone, and – most crucially – the reported timeframe for the occurrence of the event discussed – whether it be in the past, present, or future. Tweets (Twitter posts) that mention an event reported to occur in the future prove to be important predictors. These signals are enhanced by cross-referencing with the fragility of the situation as inferred from more traditional media, allowing us to sift out the social media trends that fizzle out before materializing as crowds on the ground.
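
The idea of a timed event mention can be illustrated with a minimal Python sketch. It assumes each extracted statement has already been reduced to a source label, a publication date, and the day on which the event is reported to occur; the `EventMention` record, the function names, and the fixed threshold are hypothetical and are not taken from the paper. The sketch only counts forward-looking tweets per reported event day, which is a simplification and not the predictive model the abstract describes.

```python
from collections import Counter
from dataclasses import dataclass
from datetime import date


@dataclass(frozen=True)
class EventMention:
    """Hypothetical record for one extracted statement about an event."""
    source: str       # assumed label such as "twitter" or "news"
    published: date   # when the statement was posted
    event_day: date   # the day the statement says the event occurs


def future_mention_counts(mentions, source="twitter"):
    """Count, per reported event day, the mentions from `source` that were
    published strictly before that day (forward-looking mentions)."""
    counts = Counter()
    for m in mentions:
        if m.source == source and m.published < m.event_day:
            counts[m.event_day] += 1
    return counts


def flag_days(counts, threshold):
    """Return, in date order, the event days whose forward-looking count
    reaches `threshold` (an illustrative cutoff, not a fitted parameter)."""
    return sorted(day for day, n in counts.items() if n >= threshold)
```

In this sketch a mention published on or after its reported event day is ignored, since it describes the present or the past rather than the future. A fuller treatment would combine such counts with the other extracted attributes, such as sentiment and the signals from traditional media.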

Author information

Corresponding author

Correspondence to Nathan Kallus.

Copyright information

© 2016 Springer International Publishing Switzerland

About this paper

Cite this paper

Kallus, N. (2016). On the Predictive Power of Web Intelligence and Social Media. In: Atzmueller, M., Chin, A., Janssen, F., Schweizer, I., Trattner, C. (eds) Big Data Analytics in the Social and Ubiquitous Context. SENSEML 2015, MUSE 2014, MSM 2014. Lecture Notes in Computer Science, vol 9546. Springer, Cham. https://doi.org/10.1007/978-3-319-29009-6_2

  • DOI: https://doi.org/10.1007/978-3-319-29009-6_2

  • Publisher Name: Springer, Cham

  • Print ISBN: 978-3-319-29008-9

  • Online ISBN: 978-3-319-29009-6

  • eBook Packages: Computer Science, Computer Science (R0)