Abstract
Elongated words such as “Wiiiiiin” or “allloooo” are common in spoken communication, where they emphasize or exaggerate the meaning of the root word. Although elongated words rarely appear in formal writing or dictionaries, they are prevalent on social media. Accounting for elongation in sentiment analysis can therefore provide valuable insight into user sentiment. In this article, we analyze the impact of elongation on sentiment classification and present an in-depth study of the lexical forms that elongation takes. We propose a method that improves sentiment classification accuracy by combining elongation-based features with BERT (Bidirectional Encoder Representations from Transformers) models. Experiments on Twitter data show that our model achieves an average accuracy of 87% under 10-fold cross-validation.
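To make the notion of an elongation-based feature concrete, the following is a minimal sketch, not the feature set used in this article. It assumes that a run of three or more identical letters marks an elongated token; the function names and the three features returned are hypothetical choices made for illustration. Such a feature vector could, for example, be concatenated with a BERT sentence embedding before classification, which is likewise an assumption about how the fusion might be done.

```python
import re

# Assumed heuristic: three or more repeats of the same letter mark elongation.
# [^\W\d_] matches a single letter (word character that is not a digit or "_").
ELONGATION = re.compile(r"([^\W\d_])\1{2,}")
LETTER_RUN = re.compile(r"([^\W\d_])\1*")


def elongation_features(text):
    """Return simple, illustrative elongation statistics for one text."""
    tokens = text.split()
    elongated = [t for t in tokens if ELONGATION.search(t)]
    # Length of the longest run of one repeated letter anywhere in the text.
    longest_run = max(
        (len(m.group(0)) for m in LETTER_RUN.finditer(text)),
        default=0,
    )
    return {
        "n_elongated": len(elongated),
        "ratio_elongated": len(elongated) / len(tokens) if tokens else 0.0,
        "longest_run": longest_run,
    }


def collapse_elongation(token):
    """Shrink every run of 3+ identical letters to a single letter."""
    return ELONGATION.sub(r"\1", token)
```

Collapsing a run to a single letter is a deliberate simplification: it cannot tell whether the root word contains a legitimate double letter, so "coool" becomes "col" rather than "cool". Recovering the root word reliably would need a dictionary lookup or a similar check.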
Availability of data and materials
Data are available on request from the authors.
Funding
This research received no external funding.
Ethics declarations
Conflicts of interest
The authors declare that they have no conflict of interest.
About this article
Cite this article
Rafae, A., Erritali, M. & Roche, M. Fusion of BERT embeddings and elongation-driven features. Multimed Tools Appl (2024). https://doi.org/10.1007/s11042-024-18786-9