research-article

Who Will Retweet This? Detecting Strangers from Twitter to Retweet Information

Published: 21 April 2015

Abstract

There has been much effort on studying how social media sites, such as Twitter, help propagate information in different situations, including spreading alerts and SOS messages in an emergency. However, existing work has not addressed how to actively identify and engage the right strangers at the right time on social media to help effectively propagate intended information within a desired time frame. To address this problem, we have developed three models: (1) a feature-based model that leverages people's exhibited social behavior, including the content of their tweets and social interactions, to characterize their willingness and readiness to propagate information on Twitter via the act of retweeting; (2) a wait-time model based on a user's previous retweeting wait times to predict his or her next retweeting time when asked; and (3) a subset selection model that automatically selects a subset of people from a set of available people using probabilities predicted by the feature-based model and maximizes retweeting rate. Based on these three models, we build a recommender system that predicts the likelihood of a stranger to retweet information when asked, within a specific time window, and recommends the top-N qualified strangers to engage with. Our experiments, including live studies in the real world, demonstrate the effectiveness of our work.
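The way the three models fit together can be illustrated with a minimal sketch. This is not the authors' implementation: it assumes a retweet probability per candidate is already available from some feature-based classifier, it replaces the wait-time model with a simple empirical fraction of past retweets that fell inside the time window, and it replaces the subset selection model with a plain top-N ranking. The names `Candidate`, `fraction_within_window`, and `recommend_top_n` are hypothetical.

```python
from dataclasses import dataclass
from typing import List


@dataclass
class Candidate:
    """Hypothetical candidate record; field names are illustrative only."""
    user_id: str
    retweet_prob: float           # assumed output of a feature-based classifier
    past_wait_hours: List[float]  # assumed history of retweeting wait times


def fraction_within_window(past_wait_hours: List[float], window_hours: float) -> float:
    """Empirical share of past retweets made within the time window.

    A stand-in for a wait-time model; returns 0.0 when there is no history.
    """
    if not past_wait_hours:
        return 0.0
    hits = sum(1 for wait in past_wait_hours if wait <= window_hours)
    return hits / len(past_wait_hours)


def recommend_top_n(candidates: List[Candidate], window_hours: float, n: int) -> List[str]:
    """Rank candidates by probability times on-time fraction; return the top n ids."""
    scored = [
        (c.retweet_prob * fraction_within_window(c.past_wait_hours, window_hours), c.user_id)
        for c in candidates
    ]
    scored.sort(key=lambda pair: pair[0], reverse=True)
    return [user_id for _, user_id in scored[:n]]


# Made-up example data, for illustration only.
candidates = [
    Candidate("alice", 0.9, [1.0, 30.0, 40.0, 50.0]),  # likely to retweet, but usually slow
    Candidate("bob", 0.6, [0.5, 2.0, 3.0, 4.0]),       # less likely, but fast
    Candidate("carol", 0.2, [1.0, 1.5]),
]
print(recommend_top_n(candidates, window_hours=6.0, n=2))  # ['bob', 'alice']
```

With a six-hour window, the fast but less willing candidate outranks the willing but slow one, which is the kind of trade-off the time-window requirement introduces.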

    Reviews

    Susan Loretta Fowler

    The authors of this paper are from Google, IBM Research, and Utah State University. These are smart folks, and the study shows it. Their premise was that one could predict, based on tweeters' past behavior and how they used language, who would be most likely to retweet public service messages. These information propagators are more likely to be willing and ready (within a set time frame) to spread information on request.

    The authors used two tweets, one location based and the other topic based, for their experiment. The location-based public safety tweet reported a shooting in the San Francisco Bay Area, and the topic-based bird flu tweet said that bird flu was expected to evolve in nature. Both tweets were obtained from news media sites.

    Their retweeting system succeeded when they sent tweets to people with these characteristics:

    • They have tweeted on the topic before.
    • They have many followers (100 was the bottom threshold).
    • They retweeted within a set period of time (possibly six or 12 hours, although the optimal time frame is not stated).
    • Their word usage in their own tweets tended toward inclusiveness, conscientiousness, and openness on the Big Five and Linguistic Inquiry and Word Count scales.

    The authors provide copious information about the statistical analyses in which they engaged, so anyone with an interest in the project should be able to reproduce the experiment. At the end of the paper, the authors state:

    "We found that our approaches were able to at least double the retweeting rates over two baselines. With our time estimation model, our approach also outperformed other approaches significantly by achieving a much higher retweeting rate within a given time window. ... In a live setting, our approach consistently outperformed the two baselines by almost doubling their retweeting rates. Overall, our approach effectively identifies qualified candidates for retweeting a message within a given time window."

    Emergency management groups could really use a system like this. However, there is no uniform resource locator (URL) or other information about how to access the authors' system (which was funded by the US Defense Advanced Research Projects Agency, or DARPA). Nevertheless, I suppose that any smart information technology (IT) person would be able to reproduce the system from the paper.

    Online Computing Reviews Service


    • Published in

      ACM Transactions on Intelligent Systems and Technology  Volume 6, Issue 3
      Survey Paper, Regular Papers and Special Section on Participatory Sensing and Crowd Intelligence
      May 2015
      319 pages
      ISSN:2157-6904
      EISSN:2157-6912
      DOI:10.1145/2764959
      • Editor: Huan Liu

      Copyright © 2015 ACM

      Permission to make digital or hard copies of all or part of this work for personal or classroom use is granted without fee provided that copies are not made or distributed for profit or commercial advantage and that copies bear this notice and the full citation on the first page. Copyrights for components of this work owned by others than ACM must be honored. Abstracting with credit is permitted. To copy otherwise, or republish, to post on servers or to redistribute to lists, requires prior specific permission and/or a fee. Request permissions from [email protected]

      Publisher

      Association for Computing Machinery

      New York, NY, United States

      Publication History

      • Published: 21 April 2015
      • Accepted: 1 September 2014
      • Revised: 1 August 2014
      • Received: 1 March 2014
      Published in tist Volume 6, Issue 3


      Qualifiers

      • research-article
      • Research
      • Refereed
