DOI: 10.1145/2124295.2124379
research-article

How user behavior is related to social affinity

Published: 08 February 2012

ABSTRACT

Previous research has suggested that people who are in the same social circle exhibit similar behaviors and tastes. The rise of social networks gives us insights into the social circles of web users, and recommendation services (including search engines, advertisement engines, and collaborative filtering engines) provide a motivation to adapt recommendations to the interests of the audience. An important primitive for supporting these applications is the ability to quantify how connected two users are in a social network. The shortest-path distance between a pair of users is an obvious candidate measure. This paper introduces a new measure of "affinity" in social networks that takes into account not only the distance between two users, but also the number of edge-disjoint paths between them, i.e. the "robustness" of their connection. Our measure is based on a sketch-based approach, and affinity queries can be answered extremely efficiently (at the expense of a one-time offline sketch computation). We compare this affinity measure against the "approximate shortest-path distance", a sketch-based distance measure with similar efficiency characteristics. Our empirical study is based on a Hotmail email exchange graph combined with demographic information and Bing query history, and a Twitter mention-graph together with the text of the underlying tweets. We found that users who are close to each other - either in terms of distance or affinity - have a higher similarity in terms of demographics, queries, and tweets.
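The two ingredients the abstract names, the distance between two users and the number of edge-disjoint paths between them, can be illustrated on a toy graph. The following is a minimal sketch and not the paper's affinity measure or its sketch-based algorithm: it computes both quantities exactly, using breadth-first search for the distance and unit-capacity augmenting paths for the disjoint-path count. The function names and the example graph are hypothetical and chosen only for illustration.

```python
from collections import deque


def build_adjacency(edges):
    """Build an undirected adjacency map from a list of (u, v) pairs."""
    adj = {}
    for u, v in edges:
        adj.setdefault(u, set()).add(v)
        adj.setdefault(v, set()).add(u)
    return adj


def shortest_path_distance(adj, source, target):
    """Exact hop distance via breadth-first search; None if unreachable."""
    if source == target:
        return 0
    dist = {source: 0}
    queue = deque([source])
    while queue:
        node = queue.popleft()
        for nbr in adj.get(node, ()):
            if nbr not in dist:
                dist[nbr] = dist[node] + 1
                if nbr == target:
                    return dist[nbr]
                queue.append(nbr)
    return None


def edge_disjoint_paths(adj, source, target):
    """Count edge-disjoint paths between two distinct nodes.

    Each undirected edge is modeled as two unit-capacity arcs, and
    augmenting paths are found with BFS until none remain.
    """
    if source == target:
        raise ValueError("source and target must differ")
    cap = {(u, v): 1 for u in adj for v in adj[u]}
    count = 0
    while True:
        parent = {source: None}
        queue = deque([source])
        while queue and target not in parent:
            node = queue.popleft()
            for nbr in adj.get(node, ()):
                if nbr not in parent and cap[(node, nbr)] > 0:
                    parent[nbr] = node
                    queue.append(nbr)
        if target not in parent:
            return count
        # Push one unit of flow back along the path that was found.
        node = target
        while parent[node] is not None:
            prev = parent[node]
            cap[(prev, node)] -= 1
            cap[(node, prev)] += 1
            node = prev
        count += 1


# Hypothetical toy graph: a and b share two neighbors; c hangs off b.
toy = build_adjacency([("a", "x"), ("x", "b"), ("a", "y"), ("y", "b"), ("b", "c")])
print(shortest_path_distance(toy, "a", "b"), edge_disjoint_paths(toy, "a", "b"))
print(shortest_path_distance(toy, "b", "c"), edge_disjoint_paths(toy, "b", "c"))
```

In this toy graph, b and c are adjacent but joined by a single edge, while a and b are two hops apart yet joined by two edge-disjoint paths. A pure distance measure ranks the b and c pair as closer; a measure that also accounts for robustness, as the abstract describes, has grounds to treat the a and b connection as stronger than its distance alone suggests.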

Supplemental Material

wsdm_day3_session2_2.mp4 (mp4, 96.5 MB)

Published in

WSDM '12: Proceedings of the fifth ACM international conference on Web search and data mining
February 2012, 792 pages
ISBN: 9781450307475
DOI: 10.1145/2124295

Copyright © 2012 ACM

Publisher

Association for Computing Machinery, New York, NY, United States

Acceptance Rates

Overall Acceptance Rate: 498 of 2,863 submissions, 17%
