Fusion of BERT embeddings and elongation-driven features

Multimedia Tools and Applications

Abstract

Elongated words such as “Wiiiiiin” or “allloooo” are common in oral communication, where they emphasize or exaggerate the underlying meaning of the root word. Although such forms are rarely found in formal writing and dictionaries, they are prevalent on social media. Accounting for elongation in sentiment analysis can therefore provide valuable insight into user sentiment. In this article, we analyze the impact of elongation on sentiment classification and study the lexical forms that elongation takes. We propose a method that improves sentiment classification accuracy by combining elongation-based features with BERT (Bidirectional Encoder Representations from Transformers) embeddings. Experiments on Twitter data show that our model achieves an average accuracy of 87% under 10-fold cross-validation.
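
To illustrate what elongation-based features might look like in practice, a minimal Python sketch follows. It is not the authors' implementation: the three-letter repetition threshold, the three features, the helper names, and fusion by simple concatenation are all assumptions made for illustration, and the embedding is a stand-in list of floats rather than a real BERT output.

```python
import re

# Assumption: a run of three or more identical letters marks an elongation.
# Legitimate doubles such as "good" or "all" are deliberately not matched.
ELONGATION = re.compile(r"([^\W\d_])\1{2,}")


def is_elongated(token: str) -> bool:
    """Return True if the token contains a run of 3+ identical letters."""
    return ELONGATION.search(token) is not None


def normalize(token: str) -> str:
    """Collapse each run of 3+ identical letters to a single letter.

    This is lossy: a root word with a genuine double letter loses it.
    """
    return ELONGATION.sub(r"\1", token)


def elongation_features(text: str) -> list[float]:
    """Hypothetical feature set: count, ratio, and longest run length."""
    tokens = text.split()
    elongated = [t for t in tokens if is_elongated(t)]
    longest_run = max(
        (len(m.group(0)) for m in ELONGATION.finditer(text)), default=0
    )
    ratio = len(elongated) / len(tokens) if tokens else 0.0
    return [float(len(elongated)), ratio, float(longest_run)]


def fuse(embedding: list[float], text: str) -> list[float]:
    """Concatenate a sentence embedding with the elongation features."""
    return list(embedding) + elongation_features(text)


if __name__ == "__main__":
    print(normalize("Wiiiiiin"))
    print(elongation_features("Wiiiiiin we did it"))
    print(fuse([0.1, 0.2], "soooo good"))
```

In a real pipeline the stand-in list would be replaced by a sentence-level vector from a BERT encoder, and the fused vector would be passed to a classifier evaluated with 10-fold cross-validation, as the abstract describes.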

Availability of data and materials

Data available on request from the authors

Funding

This research received no external funding.

Author information

Corresponding author

Correspondence to Abderrahim Rafae.

Ethics declarations

Conflicts of interest

The authors declare that they have no conflict of interest.

Additional information

Publisher's Note

Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.

Rights and permissions

Springer Nature or its licensor (e.g. a society or other partner) holds exclusive rights to this article under a publishing agreement with the author(s) or other rightsholder(s); author self-archiving of the accepted manuscript version of this article is solely governed by the terms of such publishing agreement and applicable law.

About this article

Cite this article

Rafae, A., Erritali, M. & Roche, M. Fusion of BERT embeddings and elongation-driven features. Multimed Tools Appl (2024). https://doi.org/10.1007/s11042-024-18786-9

  • DOI: https://doi.org/10.1007/s11042-024-18786-9
