A Neural Model to Jointly Predict and Explain Truthfulness of Statements

Published: 28 December 2022

Abstract

Automated fact-checking (AFC) systems exist to combat disinformation; however, their complexity usually makes them opaque to the end user, making it difficult to foster trust in the system. In this article, we introduce the E-BART model with the hope of making progress on this front. E-BART is able to provide a veracity prediction for a claim and jointly generate a human-readable explanation for this decision. We show that E-BART is competitive with the state of the art on the e-FEVER and e-SNLI tasks. In addition, we validate the joint-prediction architecture by showing (1) that generating explanations does not significantly impede the model from performing well in its main task of veracity prediction, and (2) that predicted veracity and explanations are more internally coherent when generated jointly than separately. We also calibrate the E-BART model, allowing the output of the final model to be correctly interpreted as the confidence of correctness. Finally, we conduct an extensive human evaluation of the impact of generated explanations and observe that explanations increase human ability to spot misinformation and make people more skeptical about claims, and that explanations generated by E-BART are competitive with ground-truth explanations.
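To make the joint predict-and-explain idea concrete, here is a minimal sketch, assuming a HuggingFace-style BART backbone. This is not the authors' implementation: the class name, the mean-pooling choice, and the loss combination are illustrative assumptions. One shared encoder pass feeds both a veracity classification head and the explanation decoder, and a temperature-scaling helper shows the kind of post-hoc calibration the abstract mentions (in the style of Guo et al., 2017).

```python
import torch
import torch.nn as nn
from transformers import BartForConditionalGeneration


class JointVeracityExplainer(nn.Module):
    """Sketch: one BART backbone that classifies veracity and generates an explanation."""

    def __init__(self, model_name="facebook/bart-base", num_labels=3):
        super().__init__()
        self.bart = BartForConditionalGeneration.from_pretrained(model_name)
        # Linear veracity head over mean-pooled encoder states
        # (an assumption, not necessarily the paper's pooling choice).
        self.classifier = nn.Linear(self.bart.config.d_model, num_labels)

    def forward(self, input_ids, attention_mask, explanation_ids=None):
        # A single encoder pass is shared by both tasks; `explanation_ids`
        # are the tokenized gold explanation used as seq2seq targets
        # during training (out.loss is None at inference time).
        out = self.bart(input_ids=input_ids,
                        attention_mask=attention_mask,
                        labels=explanation_ids)
        enc = out.encoder_last_hidden_state                 # (B, T, H)
        mask = attention_mask.unsqueeze(-1).type_as(enc)
        pooled = (enc * mask).sum(dim=1) / mask.sum(dim=1)  # masked mean pool
        logits = self.classifier(pooled)                    # veracity logits
        return logits, out.loss                             # class logits + LM loss


def calibrated_confidence(logits, temperature):
    # Post-hoc temperature scaling (Guo et al., 2017): divide logits by a
    # scalar T fitted on held-out data so that softmax probabilities can be
    # read as confidence of correctness.
    return torch.softmax(logits / temperature, dim=-1)
```

Under these assumptions, training would minimize a combined objective such as `F.cross_entropy(logits, veracity_labels) + lm_loss`, so that the prediction and the explanation are driven by the same shared representation; the temperature `T` is a single scalar fitted on validation data after training.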


Published in

Journal of Data and Information Quality, Volume 15, Issue 1 (March 2023), 197 pages.
ISSN: 1936-1955
EISSN: 1936-1963
DOI: 10.1145/3578367

Permission to make digital or hard copies of all or part of this work for personal or classroom use is granted without fee provided that copies are not made or distributed for profit or commercial advantage and that copies bear this notice and the full citation on the first page. Copyrights for components of this work owned by others than ACM must be honored. Abstracting with credit is permitted. To copy otherwise, or republish, to post on servers or to redistribute to lists, requires prior specific permission and/or a fee. Request permissions from permissions@acm.org.

Publisher

Association for Computing Machinery, New York, NY, United States

Publication History

• Received: 22 November 2021
• Revised: 31 March 2022
• Accepted: 19 May 2022
• Online AM: 9 July 2022
• Published: 28 December 2022
