short-paper

Assessing Causality Structures learned from Digital Text Media

Authors:

Mariano Maisonnave
Depto. de Cs. e Ing. de la Computación, Instituto de Cs. e Ing. de la Computación (ICIC UNS-CONICET), Bahía Blanca, Argentina

Fernando Delbianco
Depto. de Economía, Instituto de Matemática de Bahía Blanca (INMABB UNS-CONICET), Bahía Blanca, Argentina

Fernando Tohmé
Depto. de Economía, Instituto de Matemática de Bahía Blanca (INMABB UNS-CONICET), Bahía Blanca, Argentina

Ana G. Maguitman
Depto. de Cs. e Ing. de la Computación, Instituto de Cs. e Ing. de la Computación (ICIC UNS-CONICET), Bahía Blanca, Argentina

Evangelos E. Milios
Faculty of Computer Science, Dalhousie University, Halifax, Nova Scotia, Canada

DocEng '20: Proceedings of the ACM Symposium on Document Engineering 2020, September 2020, Article No.: 18, Pages 1–4. https://doi.org/10.1145/3395027.3419594

Published: 29 September 2020

ABSTRACT

In this paper we describe a framework to uncover potential causal relations between event mentions in streaming text from news media. This framework relies on a dataset of manually labeled events to train a recurrent neural network for event detection. It then creates a time series of event clusters, where clusters are based on BERT contextual word embedding representations of the identified events. Using this time series dataset, we assess four methods based on Granger causality for inferring causal relations. Granger causality is a statistical concept of causality that is based on forecasting. It states that a cause occurs before the effect and that the cause produces unique changes in the effect, so past values of the cause help predict future values of the effect. The four analyzed methods are the pairwise Granger test, VAR(1), BigVAR and SiMoNe. The framework is applied to the New York Times dataset, which covers news for a period of 246 months. This preliminary analysis delivers important insights into the nature of each method, identifies differences and commonalities, and points out some of their strengths and weaknesses.
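The forecasting idea behind the pairwise Granger test can be illustrated with a minimal sketch. It is not the authors' implementation: it assumes a single lag, uses synthetic monthly series in place of the event-cluster counts, and reports only the F-statistic rather than a p-value. The function names (`solve_linear_system`, `residual_sum_of_squares`, `granger_f_statistic`) and the coefficients 0.8 and 0.3 are hypothetical and chosen for illustration. The test compares a restricted model, which predicts the effect from its own past value, with an unrestricted model that also uses the past value of the candidate cause.

```python
import random


def solve_linear_system(a, b):
    """Solve a x = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    m = [row[:] + [b[i]] for i, row in enumerate(a)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(col + 1, n):
            factor = m[r][col] / m[col][col]
            for c in range(col, n + 1):
                m[r][c] -= factor * m[col][c]
    x = [0.0] * n
    for i in range(n - 1, -1, -1):
        tail = sum(m[i][j] * x[j] for j in range(i + 1, n))
        x[i] = (m[i][n] - tail) / m[i][i]
    return x


def residual_sum_of_squares(rows, targets):
    """Fit ordinary least squares via the normal equations; return the RSS."""
    k = len(rows[0])
    xtx = [[sum(r[i] * r[j] for r in rows) for j in range(k)] for i in range(k)]
    xty = [sum(r[i] * t for r, t in zip(rows, targets)) for i in range(k)]
    beta = solve_linear_system(xtx, xty)
    return sum(
        (t - sum(b * v for b, v in zip(beta, r))) ** 2
        for r, t in zip(rows, targets)
    )


def granger_f_statistic(cause, effect):
    """F-statistic for 'cause Granger-causes effect' using one lag (assumption)."""
    targets = effect[1:]
    # Restricted model: intercept + own lag of the effect.
    restricted = [[1.0, effect[t - 1]] for t in range(1, len(effect))]
    # Unrestricted model: additionally includes the lagged candidate cause.
    unrestricted = [
        [1.0, effect[t - 1], cause[t - 1]] for t in range(1, len(effect))
    ]
    rss_r = residual_sum_of_squares(restricted, targets)
    rss_u = residual_sum_of_squares(unrestricted, targets)
    dof = len(targets) - 3  # observations minus unrestricted parameters
    return (rss_r - rss_u) / (rss_u / dof)  # one restriction in the numerator


# Synthetic stand-in for two event-cluster series over 246 months:
# y depends on the previous month's x, while x is pure noise.
rng = random.Random(0)
n_months = 246
x = [rng.gauss(0, 1) for _ in range(n_months)]
y = [0.0]
for t in range(1, n_months):
    y.append(0.8 * x[t - 1] + 0.3 * rng.gauss(0, 1))

f_x_to_y = granger_f_statistic(x, y)
f_y_to_x = granger_f_statistic(y, x)
print(f"F(x -> y) = {f_x_to_y:.2f}, F(y -> x) = {f_y_to_x:.2f}")
```

A large F-statistic in one direction and a small one in the other is the asymmetry the pairwise test looks for. In practice the statistic would be compared against an F distribution with (1, n - 3) degrees of freedom, and an established statistics package would normally be used in place of this hand-rolled regression.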

Index Terms

1. Computing methodologies
  1. Artificial intelligence
    1. Natural language processing
      1. Information extraction
2. Mathematics of computing
  1. Probability and statistics
    1. Probabilistic representations
      1. Causal networks
    2. Statistical paradigms
      1. Time series analysis

Published in

DocEng '20: Proceedings of the ACM Symposium on Document Engineering 2020
September 2020
130 pages
ISBN:9781450380003
DOI:10.1145/3395027

Copyright © 2020 ACM
Permission to make digital or hard copies of all or part of this work for personal or classroom use is granted without fee provided that copies are not made or distributed for profit or commercial advantage and that copies bear this notice and the full citation on the first page. Copyrights for components of this work owned by others than the author(s) must be honored. Abstracting with credit is permitted. To copy otherwise, or republish, to post on servers or to redistribute to lists, requires prior specific permission and/or a fee. Request permissions from permissions@acm.org.
Publisher
Association for Computing Machinery
New York, NY, United States
Author Tags
Event Detection
Granger Causality
Time Series
Qualifiers
- short-paper
- Research
- Refereed limited
Acceptance Rates
Overall Acceptance Rate: 178 of 537 submissions, 33%