Abstract
Emotion identification from text has recently gained attention because of its versatility in analyzing human-machine interaction. This work focuses on detecting emotions from textual data. Languages such as English, Chinese, and German are widely used for text classification; however, limited research has been done on resource-poor oriental languages. Roman Urdu (RU) is a resource-constrained language used extensively across Asia, and this work predicts emotions from RU text. For this purpose, a dataset is collected from different social media domains and annotated, following Paul Ekman's theory, with six basic emotions: happiness, surprise, anger, sadness, fear, and disgust. Dense word embedding representations of different languages are adopted by utilizing existing pre-trained models. BERT is additionally pre-trained and fine-tuned for the classification task. The proposed approach is compared with baseline machine learning and deep learning algorithms, as well as with different existing approaches for the same task. Based on the empirical evaluation, the proposed approach performs better than the existing state of the art, with an average accuracy of 91%.
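The abstract reports an average accuracy but does not state how the average is taken over the six emotion classes. As a minimal sketch, assuming plain example-level accuracy reported alongside per-class recall, the evaluation could look like the following. The label strings, the helper names `accuracy` and `per_class_recall`, and the toy predictions are all hypothetical and are not taken from the paper.

```python
from collections import Counter

# Illustrative label strings for the six Ekman emotions named in the abstract;
# the dataset's actual label spellings are an assumption here.
EMOTIONS = ["happiness", "surprise", "anger", "sadness", "fear", "disgust"]
LABEL_TO_ID = {name: idx for idx, name in enumerate(EMOTIONS)}


def accuracy(gold, pred):
    """Fraction of examples whose predicted label equals the gold label."""
    if len(gold) != len(pred):
        raise ValueError("gold and pred must have the same length")
    if not gold:
        raise ValueError("cannot score an empty evaluation set")
    correct = sum(g == p for g, p in zip(gold, pred))
    return correct / len(gold)


def per_class_recall(gold, pred):
    """Recall for each emotion that occurs in gold: correct / gold count."""
    totals = Counter(gold)
    hits = Counter(g for g, p in zip(gold, pred) if g == p)
    return {label: hits[label] / totals[label]
            for label in EMOTIONS if totals[label]}


# Hypothetical toy predictions, not results from the paper.
gold = ["happiness", "sadness", "anger", "fear",
        "disgust", "surprise", "sadness", "happiness"]
pred = ["happiness", "sadness", "anger", "sadness",
        "disgust", "surprise", "sadness", "anger"]

overall = accuracy(gold, pred)
recalls = per_class_recall(gold, pred)
```

Reporting per-class recall next to the overall figure shows whether a high average hides a weak class, which matters when some emotions are rare in social media text.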
Index Terms
- Towards Enhanced Identification of Emotion from Resource-Constrained Language through a novel Multilingual BERT Approach