Abstract
The reviews and comments that users leave on social media are of great value to companies and other business entities. For example, new product ideas can be evaluated based on customer reactions. However, this use of social media is complicated by spam posted in the form of reviews and comments.
Designing methods to automatically detect and block social media spam is difficult because spammers continuously develop new ways to post spam comments. Researchers have proposed several methods for detecting English spam reviews, but few studies have addressed Arabic spam reviews. This article proposes a keyword-based method for detecting Arabic spam reviews. Keywords, or features, are subsets of words from the original text that are labelled as important. Term weights, the Term Frequency–Inverse Document Frequency (TF-IDF) matrix, and filter methods (such as information gain, chi-squared, deviation, correlation, and uncertainty) are used to extract keywords from Arabic text.
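As a rough illustration of two of the keyword-scoring ideas named above, the sketch below computes TF-IDF weights and a chi-squared score for term presence against the spam label. It is a minimal example under stated assumptions, not the article's implementation: the toy tokens stand in for tokenized Arabic comments, the labels `spam` and `ham` are hypothetical, and the abstract does not specify the exact TF-IDF variant or the preprocessing used.

```python
import math
from collections import Counter

# Toy, already-tokenized comments (placeholders for tokenized Arabic text).
docs = [
    ["free", "offer", "click", "link"],
    ["great", "product", "fast", "delivery"],
    ["click", "link", "win", "prize"],
    ["product", "quality", "great"],
]
labels = ["spam", "ham", "spam", "ham"]  # hypothetical class names


def tf_idf(docs):
    """Return one {term: weight} dict per document.

    Uses tf = count / document length and idf = ln(N / df),
    one common variant among several.
    """
    n_docs = len(docs)
    # Document frequency: number of documents that contain each term.
    df = Counter(term for doc in docs for term in set(doc))
    weights = []
    for doc in docs:
        counts = Counter(doc)
        weights.append({
            term: (count / len(doc)) * math.log(n_docs / df[term])
            for term, count in counts.items()
        })
    return weights


def chi_squared(docs, labels, term, positive="spam"):
    """Chi-squared statistic of a 2x2 table: term presence vs. class."""
    a = b = c = d = 0
    for doc, label in zip(docs, labels):
        present = term in doc
        is_positive = label == positive
        if present and is_positive:
            a += 1
        elif present:
            b += 1
        elif is_positive:
            c += 1
        else:
            d += 1
    n = a + b + c + d
    denom = (a + b) * (c + d) * (a + c) * (b + d)
    return 0.0 if denom == 0 else n * (a * d - b * c) ** 2 / denom


# Rank the vocabulary by chi-squared score and keep the top terms as keywords.
vocab = sorted({term for doc in docs for term in doc})
keywords = sorted(vocab, key=lambda t: chi_squared(docs, labels, t), reverse=True)[:3]
print(keywords)
```

A filter method of this kind scores each term independently of any classifier, so the selected keywords can be reused as features for every algorithm being compared.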
The proposed method detects Arabic spam in Facebook comments. The dataset consists of 3,000 Arabic comments extracted from Facebook pages. Four machine learning algorithms are used in the detection process: C4.5, kNN, SVM, and Naïve Bayes. The results show that the decision tree classifier (C4.5) outperforms the other classification algorithms, with a detection accuracy of 92.63%.
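To make the classification step concrete, here is a minimal sketch of one of the four classifier families mentioned, a multinomial Naïve Bayes with add-one smoothing, together with the accuracy measure used to compare models. Everything in it is an assumption for illustration: the toy tokens, the `spam`/`ham` labels, and the train/test split are hypothetical, and the abstract does not describe the article's features, tooling, or evaluation protocol.

```python
import math
from collections import Counter, defaultdict


class NaiveBayes:
    """Multinomial Naive Bayes with add-one (Laplace) smoothing."""

    def fit(self, docs, labels):
        self.class_counts = Counter(labels)
        self.term_counts = defaultdict(Counter)
        for doc, label in zip(docs, labels):
            self.term_counts[label].update(doc)
        self.vocab = {t for counts in self.term_counts.values() for t in counts}
        return self

    def predict(self, doc):
        n_docs = sum(self.class_counts.values())
        best_label, best_score = None, -math.inf
        for label, n_label in self.class_counts.items():
            total = sum(self.term_counts[label].values())
            score = math.log(n_label / n_docs)  # log prior
            for term in doc:
                if term not in self.vocab:
                    continue  # skip terms never seen during training
                count = self.term_counts[label][term]
                score += math.log((count + 1) / (total + len(self.vocab)))
            if score > best_score:
                best_label, best_score = label, score
        return best_label


def accuracy(model, docs, labels):
    """Fraction of documents whose predicted label matches the true label."""
    correct = sum(model.predict(d) == y for d, y in zip(docs, labels))
    return correct / len(labels)


# Toy data standing in for keyword features extracted from comments.
train_docs = [
    ["free", "offer", "click", "link"],
    ["click", "link", "win", "prize"],
    ["great", "product", "fast", "delivery"],
    ["product", "quality", "great"],
]
train_labels = ["spam", "spam", "ham", "ham"]
test_docs = [["win", "free", "prize"], ["great", "quality"]]
test_labels = ["spam", "ham"]

model = NaiveBayes().fit(train_docs, train_labels)
print(accuracy(model, test_docs, test_labels))
```

Comparing several classifiers then amounts to fitting each one on the same training features and reporting `accuracy` on the same held-out comments.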
Index Terms
- Detecting Arabic Spam Reviews in Social Networks Based on Classification Algorithms