Abstract
With more information becoming widely accessible and new content created every day on today’s web, more people are turning to harvesting such data and analyzing it to extract insights. But how far such data can help us see beyond the present is not clear. We present efforts to predict future events based on web intelligence (data harvested from the web), with specific emphasis on social media data and on timed event mentions, thereby quantifying the predictive power of such data. We focus on predicting crowd actions such as large protests and coordinated acts of cyber activism: their occurrence, specific timeframe, and location. Using natural language processing, statements about events are extracted from content collected from hundreds of thousands of open-content web sources. Attributes extracted include event type, entities involved and their roles, sentiment and tone, and, most crucially, the reported timeframe for the occurrence of the event discussed, whether in the past, present, or future. Tweets (Twitter posts) that mention an event reported to occur in the future prove to be important predictors. These signals are enhanced by cross-referencing with the fragility of the situation as inferred from more traditional media, allowing us to sift out the social media trends that fizzle out before materializing as crowds on the ground.
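As a rough illustration of the signal described above (mentions of an event reported to occur in the future), the sketch below counts forward-looking mentions per location and reported event day. The `Mention` record, its field names, the function `future_mention_counts`, and the sample data are all hypothetical and are not taken from the chapter, which does not specify its extraction or prediction pipeline here.

```python
from collections import Counter
from dataclasses import dataclass
from datetime import date


@dataclass(frozen=True)
class Mention:
    """Hypothetical record for one extracted event mention."""
    published: date    # when the tweet or article appeared
    event_date: date   # the day the text says the event occurs
    location: str      # where the event is reported to occur


def future_mention_counts(mentions):
    """Count mentions per (location, event_date), keeping only those
    published strictly before the event day they refer to."""
    counts = Counter()
    for m in mentions:
        if m.published < m.event_date:
            counts[(m.location, m.event_date)] += 1
    return counts


# Made-up sample data, for illustration only.
mentions = [
    Mention(date(2013, 6, 25), date(2013, 6, 30), "CountryA"),
    Mention(date(2013, 6, 26), date(2013, 6, 30), "CountryA"),
    Mention(date(2013, 7, 1), date(2013, 6, 30), "CountryA"),  # retrospective, ignored
    Mention(date(2013, 6, 8), date(2013, 6, 9), "CountryB"),
]
counts = future_mention_counts(mentions)
```

Counts like these could serve as one input feature to a predictive model, alongside signals inferred from more traditional media; how the chapter actually combines them is not stated in the abstract.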
Copyright information
© 2016 Springer International Publishing Switzerland
About this paper
Cite this paper
Kallus, N. (2016). On the Predictive Power of Web Intelligence and Social Media. In: Atzmueller, M., Chin, A., Janssen, F., Schweizer, I., Trattner, C. (eds) Big Data Analytics in the Social and Ubiquitous Context. SENSEML MUSE MSM 2015 2014 2014. Lecture Notes in Computer Science, vol 9546. Springer, Cham. https://doi.org/10.1007/978-3-319-29009-6_2
DOI: https://doi.org/10.1007/978-3-319-29009-6_2
Publisher Name: Springer, Cham
Print ISBN: 978-3-319-29008-9
Online ISBN: 978-3-319-29009-6
eBook Packages: Computer Science (R0)