SENTIMENT ANALYSIS FOR ARABIC AND ENGLISH DATASETS

Document Type: Original Article

Authors

1 Communication and Computer Systems Department, Faculty of Engineering, Mansoura University - Egypt

2 Information Systems Department, Faculty of Computers and Information, Mansoura University - Egypt

Abstract

Sentiment analysis is an important topic that has attracted attention since 2001. It is essentially
text classification based on analyzing opinions expressed in writing (e.g., social media, blogs,
discussion groups, etc.). The widespread use of social networks has also led to a widespread
availability of opinionated posts, making research in the area more viable and important. Sentiment
analysis is needed to calculate the percentage of users who accept or reject something, based on
their comments. Although Arabic is the native language of hundreds of millions of people in twenty
countries across the Middle East and North Africa, research on Arabic sentiment analysis is
progressing at a very slow pace compared to that carried out on English [2]. In this paper, we
present our work, in which we start by testing on English texts that were collected from Amazon (book,
DVD, and electronics). Then we applied the same process to an Arabic dataset that we collected from
Arabic YouTube pages. We applied several machine learning algorithms to both the Arabic and English
datasets (decision trees, Naive Bayes, functions, and support vector machines). We also created a
Sentiword lexicon based on the corpus that we gathered. We then evaluated each method and
compared their accuracies.
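
As a rough illustration of the two quantities mentioned above (the percentage of comments per sentiment label, and the accuracy of a method against labeled data), a minimal lexicon-based sketch might look as follows. The tiny lexicon, the example comments, and the function names are all hypothetical and are not taken from the paper's corpus or lexicon.

```python
from collections import Counter

# Hypothetical toy lexicon (word -> polarity); not the lexicon built in the paper.
LEXICON = {"good": 1, "great": 1, "love": 1, "bad": -1, "poor": -1, "hate": -1}

def classify(comment):
    """Label a comment by summing the polarity of the lexicon words it contains."""
    score = sum(LEXICON.get(word, 0) for word in comment.lower().split())
    if score > 0:
        return "positive"
    if score < 0:
        return "negative"
    return "neutral"

def label_percentages(comments):
    """Return the percentage of comments assigned to each label."""
    counts = Counter(classify(c) for c in comments)
    total = len(comments)
    return {label: 100.0 * n / total for label, n in counts.items()}

def accuracy(comments, gold_labels):
    """Fraction of comments whose predicted label matches the gold label."""
    correct = sum(classify(c) == g for c, g in zip(comments, gold_labels))
    return correct / len(comments)

# Made-up example data, for illustration only.
comments = [
    "great book love it",
    "bad sound poor picture",
    "good price",
    "it arrived on monday",
]
gold = ["positive", "negative", "positive", "negative"]

print(label_percentages(comments))
print(accuracy(comments, gold))
```

The same `accuracy` function could be applied to the predictions of any classifier, which is one simple way to compare several methods on the same labeled data.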