ABSTRACT
In this paper we describe a framework for uncovering potential causal relations between event mentions in streaming news media text. The framework relies on a dataset of manually labeled events to train a recurrent neural network for event detection. It then creates a time series of event clusters, where the clusters are formed from BERT contextual word embedding representations of the detected events. Using this time series dataset, we assess four methods based on Granger causality for inferring causal relations. Granger causality is a statistical notion of causality based on forecasting: a cause occurs before its effect and produces unique changes in it, so past values of the cause help predict future values of the effect. The four methods analyzed are the pairwise Granger test, VAR(1), BigVAR, and SiMoNe. The framework is applied to the New York Times dataset, which covers 246 months of news. This preliminary analysis offers insights into the nature of each method, identifies their differences and commonalities, and points out some of their strengths and weaknesses.
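The forecasting idea behind the pairwise Granger test can be illustrated with a minimal sketch. It is not the implementation used in the paper. It assumes a single fixed lag, ordinary least squares, and synthetic data. The names `residual_sum_of_squares` and `granger_f_stat` are hypothetical helpers introduced only for this example. The sketch compares a restricted model (the effect predicted from its own past) with an unrestricted model (the effect predicted from its own past plus the past of the candidate cause) and reports the resulting F-statistic.

```python
import numpy as np


def residual_sum_of_squares(X, y):
    """Fit y ~ X by ordinary least squares and return the residual sum of squares."""
    beta, *_ = np.linalg.lstsq(X, y, rcond=None)
    resid = y - X @ beta
    return float(resid @ resid)


def granger_f_stat(cause, effect, lag=1):
    """F-statistic for the hypothesis that `cause` Granger-causes `effect`.

    Hypothetical helper: fixed lag order, intercept included, no p-value.
    """
    cause = np.asarray(cause, dtype=float)
    effect = np.asarray(effect, dtype=float)
    total = len(effect)
    n = total - lag                      # number of usable observations
    y = effect[lag:]                     # values to be predicted

    # Column k holds the series shifted back by k steps (k = 1..lag).
    own_lags = np.column_stack(
        [effect[lag - k: total - k] for k in range(1, lag + 1)]
    )
    cause_lags = np.column_stack(
        [cause[lag - k: total - k] for k in range(1, lag + 1)]
    )
    const = np.ones((n, 1))

    X_restricted = np.hstack([const, own_lags])
    X_unrestricted = np.hstack([const, own_lags, cause_lags])

    rss_r = residual_sum_of_squares(X_restricted, y)
    rss_u = residual_sum_of_squares(X_unrestricted, y)

    dof = n - X_unrestricted.shape[1]    # residual degrees of freedom
    return ((rss_r - rss_u) / lag) / (rss_u / dof)


# Synthetic example (assumption): x drives y with a one-step delay.
rng = np.random.default_rng(0)
n_months = 246                           # same length as the series in the abstract
x = rng.normal(size=n_months)
y = np.zeros(n_months)
y[1:] = 0.8 * x[:-1] + 0.3 * rng.normal(size=n_months - 1)

f_xy = granger_f_stat(x, y, lag=1)       # does x help predict y?
f_yx = granger_f_stat(y, x, lag=1)       # does y help predict x?
print(f"F(x -> y) = {f_xy:.2f}, F(y -> x) = {f_yx:.2f}")
```

In this synthetic setting the statistic for the true direction (x to y) should be far larger than the one for the reverse direction. A full test would additionally compare the statistic against an F distribution with `lag` and `dof` degrees of freedom to obtain a p-value, and would choose the lag order from the data.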
Index Terms
- Assessing Causality Structures learned from Digital Text Media