BERT-Based Sentiment Analysis: A Software Engineering Perspective

  • Conference paper
Database and Expert Systems Applications (DEXA 2021)

Part of the book series: Lecture Notes in Computer Science (LNISA, volume 12923)

Abstract

Sentiment analysis can guide the tools used in software engineering, including API recommendation systems and the selection of relevant libraries. Existing tools such as SentiCR and SentiStrength-SE exhibit low F1-scores, which defeats the purpose of deploying them and leaves considerable scope for improvement. Recent work shows that transformer-based pre-trained models (e.g., BERT, RoBERTa, ALBERT) achieve better results on text classification tasks. Against this background, the present research explores different BERT-based models for analyzing sentences in GitHub comments, Jira comments, and Stack Overflow posts. The paper presents three strategies for applying BERT-based models to sentiment analysis: in the first, BERT-based pre-trained models are fine-tuned; in the second, an ensemble model is built from BERT variants; and in the third, a compressed model (DistilBERT) is used. The experimental results show that the BERT-based ensemble approach and the compressed BERT model improve on prevailing tools by 6–12% in F1 measure on all three datasets.
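
As an illustration of the ensemble strategy and the F1 measure mentioned in the abstract, the following is a minimal sketch rather than the authors' implementation. The abstract does not say how the ensemble combines its BERT variants, so soft voting (averaging class probabilities) is assumed here. The three-label scheme, the macro-averaged F1, the function names, and the probability values are likewise illustrative assumptions, and the probabilities stand in for the outputs of fine-tuned models.

```python
from typing import List, Sequence

# Assumed label set; the abstract does not enumerate the classes.
LABELS = ["negative", "neutral", "positive"]


def soft_vote(model_probs: Sequence[Sequence[float]]) -> List[float]:
    """Average the class-probability vectors produced by several models."""
    n_models = len(model_probs)
    n_classes = len(model_probs[0])
    return [sum(p[c] for p in model_probs) / n_models for c in range(n_classes)]


def predict_label(model_probs: Sequence[Sequence[float]]) -> str:
    """Return the label with the highest averaged probability."""
    avg = soft_vote(model_probs)
    best = max(range(len(avg)), key=avg.__getitem__)
    return LABELS[best]


def macro_f1(y_true: Sequence[str], y_pred: Sequence[str],
             labels: Sequence[str] = LABELS) -> float:
    """Unweighted mean of per-class F1 scores (macro averaging is assumed)."""
    scores = []
    for label in labels:
        tp = sum(t == label and p == label for t, p in zip(y_true, y_pred))
        fp = sum(t != label and p == label for t, p in zip(y_true, y_pred))
        fn = sum(t == label and p != label for t, p in zip(y_true, y_pred))
        denom = 2 * tp + fp + fn
        # A class with no true or predicted instances contributes 0.
        scores.append(2 * tp / denom if denom else 0.0)
    return sum(scores) / len(scores)


if __name__ == "__main__":
    # Hypothetical probabilities for one sentence from three model variants.
    sentence_probs = [
        [0.6, 0.3, 0.1],
        [0.2, 0.5, 0.3],
        [0.5, 0.4, 0.1],
    ]
    print(predict_label(sentence_probs))
```

Soft voting lets a confident model outweigh two uncertain ones, which majority voting over hard labels would not; whether this matches the paper's ensemble design cannot be determined from the abstract alone.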

All authors have contributed equally.

Acknowledgment

We thank our institute, Indian Institute of Information Technology Allahabad (IIITA), India and Big Data Analytics (BDA) lab for allocating the centralised computing facility and other necessary resources to perform this research. We extend our thanks to our colleagues for their valuable guidance and suggestions.

Author information

Corresponding author

Correspondence to Narinder Singh Punn.

Copyright information

© 2021 Springer Nature Switzerland AG

About this paper

Cite this paper

Batra, H., Punn, N.S., Sonbhadra, S.K., Agarwal, S. (2021). BERT-Based Sentiment Analysis: A Software Engineering Perspective. In: Strauss, C., Kotsis, G., Tjoa, A.M., Khalil, I. (eds) Database and Expert Systems Applications. DEXA 2021. Lecture Notes in Computer Science, vol 12923. Springer, Cham. https://doi.org/10.1007/978-3-030-86472-9_13

  • DOI: https://doi.org/10.1007/978-3-030-86472-9_13

  • Publisher Name: Springer, Cham

  • Print ISBN: 978-3-030-86471-2

  • Online ISBN: 978-3-030-86472-9

  • eBook Packages: Computer Science, Computer Science (R0)
