research-article

Explainable Machine Learning for Fake News Detection

Authors:
Julio C. S. Reis
Universidade Federal de Minas Gerais (UFMG), Belo Horizonte, Brazil

André Correia
Universidade Federal de Minas Gerais (UFMG), Belo Horizonte, Brazil

Fabrício Murai
Universidade Federal de Minas Gerais (UFMG), Belo Horizonte, Brazil

Adriano Veloso
Universidade Federal de Minas Gerais (UFMG), Belo Horizonte, Brazil

Fabrício Benevenuto
Universidade Federal de Minas Gerais (UFMG), Belo Horizonte, Brazil

WebSci '19: Proceedings of the 10th ACM Conference on Web Science, June 2019, Pages 17–26, https://doi.org/10.1145/3292522.3326027

Published: 26 June 2019

ABSTRACT

Recently, there have been many research efforts aiming to understand fake news phenomena and to identify typical patterns and features of fake news. Yet, the real discriminating power of these features is still unknown: some are more general, but others perform well only with specific data. In this work, we conduct a highly exploratory investigation that produced hundreds of thousands of models from a large and diverse set of features. These models are unbiased in the sense that their features are randomly chosen from the pool of available features. While the vast majority of models are ineffective, we were able to produce a number of models that yield highly accurate decisions, thus effectively separating fake news from actual stories. Specifically, we focused our analysis on models that rank a randomly chosen fake news story higher than a randomly chosen fact with more than 0.85 probability. For these models we found a strong link between features and model predictions, showing that some features are clearly tailored for detecting certain types of fake news, thus evidencing that different combinations of features cover a specific region of the fake news space. Finally, we present an explanation of factors contributing to model decisions, thus promoting civic reasoning by complementing our ability to evaluate digital content and reach warranted conclusions.
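
The selection criterion in the abstract (a model ranks a randomly chosen fake story above a randomly chosen factual one with probability greater than 0.85) is the pairwise reading of the area under the ROC curve. The following is a minimal sketch of that idea combined with random feature-subset sampling. It is not the authors' pipeline: the function names (`pairwise_auc`, `sample_feature_subsets`, `toy_score`), the toy data, and the use of a plain feature sum in place of a trained classifier are all assumptions made for illustration.

```python
import random
from itertools import product


def pairwise_auc(scores, labels):
    """Probability that a random fake story (label 1) outscores a random
    real story (label 0); ties count as half a win."""
    fake = [s for s, y in zip(scores, labels) if y == 1]
    real = [s for s, y in zip(scores, labels) if y == 0]
    if not fake or not real:
        raise ValueError("need at least one example of each class")
    wins = sum(1.0 if f > r else 0.5 if f == r else 0.0
               for f, r in product(fake, real))
    return wins / (len(fake) * len(real))


def sample_feature_subsets(n_features, subset_size, n_models, seed=0):
    """Draw one feature-index subset per model, uniformly at random."""
    rng = random.Random(seed)
    return [sorted(rng.sample(range(n_features), subset_size))
            for _ in range(n_models)]


def toy_score(row, subset):
    """Hypothetical stand-in for a trained classifier: sum of chosen features."""
    return sum(row[i] for i in subset)


# Invented toy data: feature 0 separates the classes, features 1 and 2 do not.
X = [
    [0.9, 0.2, 0.5],
    [0.8, 0.7, 0.1],
    [0.7, 0.4, 0.9],
    [0.2, 0.3, 0.6],
    [0.1, 0.8, 0.2],
    [0.3, 0.5, 0.8],
]
y = [1, 1, 1, 0, 0, 0]  # 1 = fake, 0 = real

subsets = sample_feature_subsets(n_features=3, subset_size=1, n_models=20)
results = {}
for subset in subsets:
    scores = [toy_score(row, subset) for row in X]
    results[tuple(subset)] = pairwise_auc(scores, y)

# Keep only models above the 0.85 threshold mentioned in the abstract.
selected = {s: auc for s, auc in results.items() if auc > 0.85}
print(selected)
```

In this toy setting only subsets built on the informative feature can clear the threshold, which mirrors the abstract's observation that most randomly assembled models are ineffective while a few are highly accurate.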

Index Terms

1. Applied computing
  1. Law, social and behavioral sciences
    1. Sociology
2. Human-centered computing
  1. Collaborative and social computing
    1. Collaborative and social computing theory, concepts and paradigms
      1. Social media

Published in
WebSci '19: Proceedings of the 10th ACM Conference on Web Science
June 2019
395 pages
ISBN: 9781450362023
DOI: 10.1145/3292522
General Chairs:
Paolo Boldi (Università degli Studi, Milano, Italy)
Brooke Foucault Welles (Northeastern University, Boston, USA)
Katharina Kinder-Kurlanda (GESIS - Leibniz Institute for the Social Sciences, Cologne, Germany)
Christo Wilson (Northeastern University, Boston, USA)
Program Chairs:
Isabella Peters (ZBW Leibniz Information Center for Economics & Kiel University, Kiel, Germany)
Wagner Meira (Universidade Federal de Minas Gerais, Belo Horizonte, Brazil)
Copyright © 2019 ACM
Publisher
Association for Computing Machinery
New York, NY, United States
Author Tags
civic reasoning
fake news
features
social media
Qualifiers
- research-article
Conference

Acceptance Rates
WebSci '19 Paper Acceptance Rate: 41 of 130 submissions, 32%
Overall Acceptance Rate: 218 of 875 submissions, 25%
Article Metrics
- Total Citations: 58
- Total Downloads: 2,121
- Downloads (Last 12 months): 266
- Downloads (Last 6 weeks): 23