Detecting Arabic Spam Reviews in Social Networks Based on Classification Algorithms

Authors:
Hassan Najadat, Jordan University of Science and Technology, Irbid, Jordan
Mohammad A. Alzubaidi, Yarmouk University, Irbid, Jordan
Islam Qarqaz, Jordan University of Science and Technology, Irbid, Jordan

ACM Transactions on Asian and Low-Resource Language Information Processing, Volume 21, Issue 1, Article No. 11, pp. 1–13. https://doi.org/10.1145/3476115

Published: 01 November 2021

Abstract

Reviews and comments that users leave on social media are of great importance to companies and other business entities, which can evaluate new product ideas based on customer reactions. However, this use of social media is complicated by users who post spam in the form of reviews and comments.

Designing methodologies to automatically detect and block social media spam is difficult because spammers continuously develop new ways to post spam comments. Researchers have proposed several methods to detect English spam reviews, but few studies have addressed Arabic spam reviews. This article proposes a keyword-based method for detecting Arabic spam reviews. Keywords, or features, are subsets of words from the original text that are labelled as important. Term weights, the Term Frequency–Inverse Document Frequency (TF-IDF) matrix, and filter methods (such as information gain, chi-squared, deviation, correlation, and uncertainty) are used to extract keywords from Arabic text.
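
The keyword extraction step can be illustrated with a minimal sketch. It is not the authors' implementation: the toy corpus, the English placeholder tokens, and the function names `tf_idf`, `chi_squared`, and `top_keywords` are all assumptions made for illustration. It computes a basic TF-IDF weight (term frequency times the log of inverse document frequency) and ranks terms with the chi-squared filter, one of the filter methods named in the abstract.

```python
import math
from collections import Counter

def tf_idf(docs):
    """Return one {term: weight} dict per tokenized document."""
    n_docs = len(docs)
    # Document frequency: number of documents that contain each term.
    df = Counter(term for doc in docs for term in set(doc))
    weights = []
    for doc in docs:
        counts = Counter(doc)
        weights.append({
            term: (count / len(doc)) * math.log(n_docs / df[term])
            for term, count in counts.items()
        })
    return weights

def chi_squared(docs, labels, term, positive="spam"):
    """Chi-squared statistic of the 2x2 table (term present?, label positive?)."""
    a = b = c = d = 0
    for doc, label in zip(docs, labels):
        present = term in doc
        is_positive = label == positive
        if present and is_positive:
            a += 1
        elif present:
            b += 1
        elif is_positive:
            c += 1
        else:
            d += 1
    n = a + b + c + d
    denom = (a + b) * (c + d) * (a + c) * (b + d)
    return 0.0 if denom == 0 else n * (a * d - b * c) ** 2 / denom

def top_keywords(docs, labels, k):
    """Keep the k terms with the highest chi-squared score (ties: alphabetical)."""
    vocab = {term for doc in docs for term in doc}
    ranked = sorted(vocab, key=lambda t: (-chi_squared(docs, labels, t), t))
    return ranked[:k]

# Hypothetical toy corpus; real input would be tokenized Arabic comments.
docs = [
    ["win", "free", "prize", "click"],
    ["free", "offer", "click", "link"],
    ["great", "product", "fast", "delivery"],
    ["product", "arrived", "late", "great"],
]
labels = ["spam", "spam", "ham", "ham"]

weights = tf_idf(docs)
keywords = top_keywords(docs, labels, 4)
print(keywords)
```

In this sketch, terms that appear in only one class receive the highest chi-squared scores, so they survive the filter, while a term that occurs in every document would receive a TF-IDF weight of zero.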

The proposed method detects Arabic spam in Facebook comments. The dataset consists of 3,000 Arabic comments extracted from Facebook pages. Four machine learning algorithms are used in the detection process: C4.5, kNN, SVM, and Naïve Bayes. The results show that the Decision Tree classifier (C4.5) outperforms the other classification algorithms, with a detection accuracy of 92.63%.
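
As a rough illustration of the classification and evaluation step, the sketch below trains one of the four named classifiers, Naïve Bayes, on a toy corpus and measures detection accuracy. It is a simplified stand-in and not the authors' setup: the class `NaiveBayes`, the helper `accuracy`, and the English placeholder tokens are assumptions, and the toy result says nothing about the accuracy figures reported for the Facebook dataset.

```python
import math
from collections import Counter, defaultdict

class NaiveBayes:
    """Multinomial Naive Bayes with add-one (Laplace) smoothing."""

    def fit(self, docs, labels):
        self.class_counts = Counter(labels)
        self.term_counts = defaultdict(Counter)
        for doc, label in zip(docs, labels):
            self.term_counts[label].update(doc)
        self.vocab = {term for doc in docs for term in doc}
        return self

    def predict(self, doc):
        total = sum(self.class_counts.values())
        best_label, best_score = None, -math.inf
        for label in sorted(self.class_counts):
            counts = self.term_counts[label]
            denom = sum(counts.values()) + len(self.vocab)
            # Log prior plus log likelihood of each known term.
            score = math.log(self.class_counts[label] / total)
            for term in doc:
                if term in self.vocab:  # unseen terms are ignored
                    score += math.log((counts[term] + 1) / denom)
            if score > best_score:
                best_label, best_score = label, score
        return best_label

def accuracy(model, docs, labels):
    """Fraction of documents whose predicted label matches the true label."""
    correct = sum(model.predict(d) == y for d, y in zip(docs, labels))
    return correct / len(labels)

# Hypothetical toy data; real input would be keyword features of Arabic comments.
train_docs = [
    ["win", "free", "prize", "click"],
    ["free", "offer", "click", "link"],
    ["great", "product", "fast", "delivery"],
    ["product", "arrived", "late", "great"],
]
train_labels = ["spam", "spam", "ham", "ham"]
test_docs = [["click", "free", "link"], ["great", "fast", "product"]]
test_labels = ["spam", "ham"]

model = NaiveBayes().fit(train_docs, train_labels)
print(accuracy(model, test_docs, test_labels))
```

Comparing classifiers as the abstract describes would amount to computing the same accuracy measure for each model on the same held-out comments.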

Index Terms

Detecting Arabic Spam Reviews in Social Networks Based on Classification Algorithms
1. Computing methodologies
2. Information systems
  1. Information systems applications

Published in
ACM Transactions on Asian and Low-Resource Language Information Processing Volume 21, Issue 1
January 2022
442 pages
ISSN:2375-4699
EISSN:2375-4702
DOI:10.1145/3494068
Editor:
Imed Zitouni
Google, USA
Permission to make digital or hard copies of all or part of this work for personal or classroom use is granted without fee provided that copies are not made or distributed for profit or commercial advantage and that copies bear this notice and the full citation on the first page. Copyrights for components of this work owned by others than ACM must be honored. Abstracting with credit is permitted. To copy otherwise, or republish, to post on servers or to redistribute to lists, requires prior specific permission and/or a fee. Request permissions from [email protected].
Publisher
Association for Computing Machinery
New York, NY, United States
Publication History
- Published: 1 November 2021
- Accepted: 1 July 2021
- Revised: 1 June 2021
- Received: 1 July 2020
Author Tags
Arabic language
Facebook
spam detection
social networks
classification algorithms
Qualifiers
- research-article
- Refereed