Augmentation-Based Ensemble Learning for Stance and Fake News Detection

Salah, Ilhem; Jouini, Khaled; Korbaa, Ouajdi

doi:10.1007/978-3-031-16210-7_3

Part of the book series: Communications in Computer and Information Science (CCIS, volume 1653)

Included in the following conference series:

International Conference on Computational Collective Intelligence

Abstract

Data augmentation is an unsupervised technique that generates additional training data by slightly modifying existing data. Besides mitigating data scarcity, one of its main benefits is that it increases the diversity of the training data and hence improves a model's ability to generalize to unseen data. In this work, we investigate the use of text data augmentation for the task of stance and fake news detection.
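To make the idea of "slightly modifying existing data" concrete, the following is a minimal sketch using only the Python standard library. The two perturbations (random word swap and random word deletion) and the helper names `random_swap`, `random_deletion`, and `augment` are illustrative assumptions; the abstract does not state which augmentation techniques the chapter evaluates.

```python
import random


def random_swap(tokens, n_swaps, rng):
    """Return a copy of tokens with n_swaps random pairs of positions exchanged."""
    out = list(tokens)
    if len(out) < 2:
        return out
    for _ in range(n_swaps):
        i, j = rng.sample(range(len(out)), 2)
        out[i], out[j] = out[j], out[i]
    return out


def random_deletion(tokens, p, rng):
    """Drop each token with probability p, always keeping at least one token."""
    if not tokens:
        return []
    kept = [t for t in tokens if rng.random() >= p]
    return kept if kept else [rng.choice(tokens)]


def augment(text, label, n_copies=2, seed=0):
    """Return the original (text, label) pair plus n_copies perturbed variants.

    Hypothetical helper: each variant keeps the original label, alternating
    between a word swap and a word deletion.
    """
    rng = random.Random(seed)
    tokens = text.split()
    samples = [(text, label)]
    for k in range(n_copies):
        if k % 2 == 0:
            new_tokens = random_swap(tokens, 1, rng)
        else:
            new_tokens = random_deletion(tokens, 0.2, rng)
        samples.append((" ".join(new_tokens), label))
    return samples


for sample in augment("the senator denied the report", "disagree"):
    print(sample)
```

Each variant inherits the label of the text it was derived from, which is what allows the augmented samples to be added directly to the training set.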

In the first part of our work, we explore the effect of various text augmentation techniques on the performance of common classification algorithms. Besides identifying the best-performing (classification algorithm, augmentation technique) pairs, our study reveals that “the more, the better” is the wrong motto for text augmentation and that there is no one-size-fits-all text augmentation technique.

The second part of our work leverages the results of our study to propose a novel augmentation-based ensemble learning approach that can be seen as a mixture of stacking and bagging. The proposed approach uses text augmentation to enhance the diversity and accuracy of the base learners, and hence the predictive performance of the ensemble. Experiments conducted on two real-world datasets show that our ensemble learning approach achieves very promising predictive performance.
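The general idea of using augmentation to diversify base learners can be sketched as follows. This is not the authors' architecture: the abstract gives no details of it, so everything here is an assumption made for illustration. `TokenCountClassifier` is a toy base learner, the three augmenters are hypothetical, each learner is trained on the original data plus one differently augmented copy, and predictions are combined by a plain majority vote rather than by a trained stacking meta-learner.

```python
import random
from collections import Counter, defaultdict


def drop_words(text, rng, p=0.2):
    """Delete each word with probability p, keeping at least one word."""
    tokens = text.split()
    kept = [t for t in tokens if rng.random() >= p]
    return " ".join(kept or tokens[:1])


def repeat_word(text, rng):
    """Insert a copy of a randomly chosen word at a random position."""
    tokens = text.split()
    if tokens:
        tokens.insert(rng.randrange(len(tokens) + 1), rng.choice(tokens))
    return " ".join(tokens)


def swap_words(text, rng):
    """Exchange two randomly chosen words."""
    tokens = text.split()
    if len(tokens) >= 2:
        i, j = rng.sample(range(len(tokens)), 2)
        tokens[i], tokens[j] = tokens[j], tokens[i]
    return " ".join(tokens)


class TokenCountClassifier:
    """Toy base learner: scores a label by the relative frequency of the
    query's tokens among that label's training tokens."""

    def fit(self, texts, labels):
        self.counts = defaultdict(Counter)
        for text, label in zip(texts, labels):
            self.counts[label].update(text.lower().split())
        return self

    def predict(self, text):
        tokens = text.lower().split()

        def score(label):
            c = self.counts[label]
            total = sum(c.values()) or 1
            return sum(c[t] / total for t in tokens)

        # Sorting the labels makes tie-breaking deterministic.
        return max(sorted(self.counts), key=score)


class AugmentedEnsemble:
    """One base learner per augmenter; majority vote at prediction time."""

    def __init__(self, augmenters, seed=0):
        self.augmenters = augmenters
        self.seed = seed
        self.learners = []

    def fit(self, texts, labels):
        self.learners = []
        for k, augment in enumerate(self.augmenters):
            rng = random.Random(self.seed + k)
            # Original samples plus one augmented copy of each sample.
            aug_texts = list(texts) + [augment(t, rng) for t in texts]
            aug_labels = list(labels) * 2
            self.learners.append(TokenCountClassifier().fit(aug_texts, aug_labels))
        return self

    def predict(self, text):
        votes = Counter(learner.predict(text) for learner in self.learners)
        return votes.most_common(1)[0][0]


texts = [
    "report confirms the claim",
    "study supports the claim",
    "official denies the claim",
    "expert refutes the claim",
]
labels = ["agree", "agree", "disagree", "disagree"]
ensemble = AugmentedEnsemble([drop_words, repeat_word, swap_words]).fit(texts, labels)
print(ensemble.predict("confirms supports"))
```

Replacing the majority vote with a meta-learner trained on the base learners' predictions would move this sketch closer to stacking; drawing a bootstrap sample for each learner before augmenting would move it closer to bagging.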

Author information

Authors and Affiliations

MARS Research Lab LR17ES05, ISITCom, University of Sousse, H. Sousse, 4011, Tunisia
Ilhem Salah, Khaled Jouini & Ouajdi Korbaa

Corresponding author

Correspondence to Khaled Jouini.

Editor information

Editors and Affiliations

University of Craiova, Craiova, Romania
Costin Bădică
Vrije Universiteit Amsterdam, Amsterdam, The Netherlands
Jan Treur
Claude Bernard University Lyon 1, Villeurbanne Cedex, France
Djamal Benslimane
Wrocław University of Science and Technology, Wrocław, Poland
Bogumiła Hnatkowska
Wrocław University of Science and Technology, Wrocław, Poland
Marek Krótkiewicz

About this paper

Cite this paper

Salah, I., Jouini, K., Korbaa, O. (2022). Augmentation-Based Ensemble Learning for Stance and Fake News Detection. In: Bădică, C., Treur, J., Benslimane, D., Hnatkowska, B., Krótkiewicz, M. (eds) Advances in Computational Collective Intelligence. ICCCI 2022. Communications in Computer and Information Science, vol 1653. Springer, Cham. https://doi.org/10.1007/978-3-031-16210-7_3

DOI: https://doi.org/10.1007/978-3-031-16210-7_3
Published: 21 September 2022
Publisher Name: Springer, Cham
Print ISBN: 978-3-031-16209-1
Online ISBN: 978-3-031-16210-7
eBook Packages: Computer Science, Computer Science (R0)
