ABSTRACT
Social media have democratized content creation and have made it easy for anybody to spread information online. However, stripping traditional media of their gatekeeping role has left the public unprotected against biased, deceptive, and disinformative content, which can now travel online at breaking-news speed and influence major public events. For example, during the COVID-19 pandemic, a new blending of medical and political disinformation has given rise to the first global infodemic. We offer an overview of the emerging and interconnected research areas of fact-checking, disinformation, "fake news", propaganda, and media bias detection. We explore the general fact-checking pipeline and its important elements, such as check-worthiness estimation, spotting previously fact-checked claims, stance detection, source reliability estimation, detection of persuasion techniques, and detection of malicious users in social media. We also cover large-scale pre-trained language models, and the challenges and opportunities they offer for generating and for defending against neural fake news. Finally, we discuss the ongoing COVID-19 infodemic.