research-article
Just Accepted

Towards Enhanced Identification of Emotion from Resource-Constrained Language through a novel Multilingual BERT Approach

Online AM: 19 April 2023

Abstract

Emotion identification from text has recently gained attention because of its usefulness in analyzing human-machine interaction. Languages such as English, Chinese, and German are widely studied in text classification; however, limited research has been done on resource-poor oriental languages. Roman Urdu (RU) is a resource-constrained language used extensively across Asia. This work focuses on predicting emotions from RU text. To this end, a dataset is collected from different social media domains and annotated, following Paul Ekman's theory, with six basic emotions: happy, surprise, angry, sad, fear, and disgusting. Dense word embedding representations of different languages are adopted, drawing on existing pre-trained models. BERT is additionally pre-trained and fine-tuned for the classification task. The proposed approach is compared with baseline machine learning and deep learning algorithms, and with other approaches to the same task. Based on the empirical evaluation, the proposed approach performs better than the existing state of the art, with an average accuracy of 91%.
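
The abstract reports an "average accuracy" over six emotion classes but does not say how the average is taken. As a minimal sketch, assuming the six label names listed above, the following shows two common readings: overall accuracy and the unweighted mean of per-class recall. The function names and the toy labels are hypothetical and are not taken from the paper.

```python
from collections import Counter

# The six labels named in the abstract.
EMOTIONS = ["happy", "surprise", "angry", "sad", "fear", "disgusting"]
LABEL_TO_ID = {name: i for i, name in enumerate(EMOTIONS)}


def accuracy(gold, pred):
    """Fraction of examples whose predicted label equals the gold label."""
    if len(gold) != len(pred):
        raise ValueError("gold and pred must have the same length")
    if not gold:
        raise ValueError("cannot score an empty evaluation set")
    return sum(g == p for g, p in zip(gold, pred)) / len(gold)


def per_class_recall(gold, pred):
    """Recall for each emotion that occurs in gold: correct / gold count."""
    totals = Counter(gold)
    hits = Counter(g for g, p in zip(gold, pred) if g == p)
    return {label: hits[label] / totals[label]
            for label in EMOTIONS if totals[label]}


def macro_average(scores):
    """Unweighted mean over the per-class scores."""
    return sum(scores.values()) / len(scores)


# Toy, made-up labels purely for illustration; not data from the paper.
gold = ["happy", "sad", "angry", "fear", "happy", "disgusting", "surprise", "sad"]
pred = ["happy", "sad", "angry", "sad", "happy", "disgusting", "happy", "sad"]

overall = accuracy(gold, pred)
recalls = per_class_recall(gold, pred)
macro = macro_average(recalls)
```

On this toy input the overall accuracy is 0.75, while the macro-averaged recall is about 0.67, because the two rare classes (surprise and fear) are both missed. The gap illustrates why the averaging scheme matters when classes are imbalanced.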

            • Published in

              ACM Transactions on Asian and Low-Resource Language Information Processing Just Accepted
              ISSN:2375-4699
              EISSN:2375-4702
              Copyright © 2023 Copyright held by the owner/author(s). Publication rights licensed to ACM

              Permission to make digital or hard copies of all or part of this work for personal or classroom use is granted without fee provided that copies are not made or distributed for profit or commercial advantage and that copies bear this notice and the full citation on the first page. Copyrights for components of this work owned by others than the author(s) must be honored. Abstracting with credit is permitted. To copy otherwise, or republish, to post on servers or to redistribute to lists, requires prior specific permission and/or a fee. Request permissions from [email protected].

              Publisher

              Association for Computing Machinery

              New York, NY, United States

              Publication History

              • Online AM: 19 April 2023
              • Accepted: 8 April 2023
              • Revised: 29 March 2023
              • Received: 24 February 2023