ABSTRACT
Randomized experiments, or “A/B” tests, remain the gold standard for evaluating the causal effect of a policy intervention or product change. However, in experimental settings such as social networks, where users interact with and influence one another, the conventional no-interference assumption required for credible causal inference may be violated. Existing approaches to the network setting account for the fraction or count of treated neighbors in a user’s network, yet most do not consider local network structure beyond the number of neighbors. Our study provides an approach that accounts for both the local structure in a user’s social network, via motifs, and the treatment assignment conditions of neighbors. We propose a two-part approach. First, we introduce and employ “causal network motifs”, which are network motifs that characterize the assignment conditions in local ego networks. Second, we propose a tree-based algorithm for identifying different network interference conditions and estimating their average potential outcomes. Our approach can accommodate social network theories, such as structural diversity and echo chambers, and can help specify network interference conditions suited to each experiment. We test our method in a synthetic network setting and on a real-world experiment on a large-scale network; the results highlight how accounting for local structures can better capture different interference patterns in networks.
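To make the contrast between counting treated neighbors and using local structure concrete, here is a minimal sketch in Python. It is an illustration only, not the authors' implementation: the function names `build_adj` and `ego_motif_features`, the feature names, and the toy graph are all hypothetical, and the motifs are limited to pairs of neighbors that are either connected to each other (closed triads with the ego) or not (open triads), labeled by how many of the two are treated.

```python
from itertools import combinations


def build_adj(edges):
    """Build an undirected adjacency map (node -> set of neighbours)."""
    adj = {}
    for u, v in edges:
        adj.setdefault(u, set()).add(v)
        adj.setdefault(v, set()).add(u)
    return adj


def ego_motif_features(adj, treated, ego):
    """Count treatment-labelled open/closed triads around `ego`.

    Hypothetical feature set for illustration:
      frac_treated      share of the ego's neighbours that are treated
      closed_xx/open_xx number of neighbour pairs that are (closed) or are
                        not (open) connected to each other, where xx is
                        tt, tc or cc for both/one/neither treated
    """
    nbrs = adj[ego]
    n_treated = sum(treated[v] for v in nbrs)
    feats = {"frac_treated": n_treated / len(nbrs) if nbrs else 0.0}
    for shape in ("closed", "open"):
        for label in ("tt", "tc", "cc"):
            feats[f"{shape}_{label}"] = 0
    for u, v in combinations(sorted(nbrs), 2):
        shape = "closed" if v in adj[u] else "open"
        label = {2: "tt", 1: "tc", 0: "cc"}[treated[u] + treated[v]]
        feats[f"{shape}_{label}"] += 1
    return feats


# Two toy egos (0 and 5), each with four neighbours, two of them treated,
# and exactly one edge among the neighbours.
edges = [(0, 1), (0, 2), (0, 3), (0, 4), (1, 2),
         (5, 6), (5, 7), (5, 8), (5, 9), (6, 7)]
adj = build_adj(edges)
treated = {0: False, 1: True, 2: True, 3: False, 4: False,
           5: False, 6: True, 7: False, 8: True, 9: False}

a = ego_motif_features(adj, treated, 0)
b = ego_motif_features(adj, treated, 5)
print(a)
print(b)
```

Both egos have the same fraction of treated neighbors, so a fraction-based exposure condition would treat them identically. The motif counts differ: for ego 0 the two treated neighbors are connected to each other, while for ego 5 they are not. Features of this kind could then be passed to a tree-based partitioning step, which is outside the scope of this sketch.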