Abstract
Automated fact-checking (AFC) systems exist to combat disinformation; however, their complexity usually makes them opaque to end-users, which makes it difficult to foster trust in the system. In this article, we introduce the E-BART model with the hope of making progress on this front. E-BART provides a veracity prediction for a claim and jointly generates a human-readable explanation for that decision. We show that E-BART is competitive with the state of the art on the e-FEVER and e-SNLI tasks. In addition, we validate the joint-prediction architecture by showing (1) that generating explanations does not significantly impede the model's main task of veracity prediction, and (2) that predicted veracities and explanations are more internally coherent when generated jointly than separately. We also calibrate the E-BART model, allowing its output to be correctly interpreted as confidence in the prediction's correctness. Finally, we conduct an extensive human evaluation of the impact of generated explanations and observe that explanations increase people's ability to spot misinformation and make them more skeptical of claims, and that explanations generated by E-BART are competitive with ground-truth explanations.
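The calibration step mentioned above aims to make a classifier's confidence match its empirical accuracy. The abstract does not spell out the method, but a standard approach for neural classifiers is temperature scaling (Guo et al., 2017), evaluated with the expected calibration error (ECE). The sketch below is a minimal, generic illustration of that idea, not the authors' exact procedure; all function names and the grid-search fitting are illustrative assumptions.

```python
import numpy as np

def softmax(logits, T=1.0):
    # Temperature-scaled softmax; T > 1 softens overconfident predictions.
    z = logits / T
    z = z - z.max(axis=1, keepdims=True)  # numerical stability
    e = np.exp(z)
    return e / e.sum(axis=1, keepdims=True)

def nll(logits, labels, T):
    # Negative log-likelihood of the true labels under temperature T.
    probs = softmax(logits, T)
    return -np.mean(np.log(probs[np.arange(len(labels)), labels] + 1e-12))

def fit_temperature(logits, labels, grid=np.linspace(0.5, 5.0, 91)):
    # Pick the temperature minimizing NLL on a held-out validation set.
    return min(grid, key=lambda T: nll(logits, labels, T))

def expected_calibration_error(probs, labels, n_bins=10):
    # ECE: weighted average gap between confidence and accuracy per bin.
    conf = probs.max(axis=1)
    pred = probs.argmax(axis=1)
    edges = np.linspace(0.0, 1.0, n_bins + 1)
    ece = 0.0
    for lo, hi in zip(edges[:-1], edges[1:]):
        mask = (conf > lo) & (conf <= hi)
        if mask.any():
            acc = (pred[mask] == labels[mask]).mean()
            ece += mask.mean() * abs(acc - conf[mask].mean())
    return ece
```

A single scalar temperature rescales the logits without changing the predicted class, so veracity accuracy is unaffected while reported confidences become interpretable as probabilities of correctness.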