BERT-Based Sentiment Analysis: A Software Engineering Perspective

Batra, Himanshu; Punn, Narinder Singh; Sonbhadra, Sanjay Kumar; Agarwal, Sonali

doi:10.1007/978-3-030-86472-9_13

Part of the book series: Lecture Notes in Computer Science (LNISA, volume 12923)

Included in the following conference series:

International Conference on Database and Expert Systems Applications

Abstract

Sentiment analysis can guide the tools used in software engineering, including API recommendation systems and the selection of relevant libraries. Existing tools such as SentiCR and SentiStrength-SE exhibit low F1-scores, which defeats the purpose of deploying them and leaves considerable scope for performance improvement. Recent advances show that transformer-based pre-trained models (e.g., BERT, RoBERTa, ALBERT) achieve better results on text classification tasks. Accordingly, the present research explores different BERT-based models for analysing sentences in GitHub comments, Jira comments, and Stack Overflow posts. The paper presents three strategies for applying BERT-based models to sentiment analysis: in the first, BERT-based pre-trained models are fine-tuned; in the second, an ensemble model is built from BERT variants; and in the third, a compressed model (DistilBERT) is used. The experimental results show that the BERT-based ensemble approach and the compressed BERT model improve the F1 measure by 6–12% over prevailing tools on all three datasets.
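As a rough illustration of the second strategy, the sketch below combines the class probabilities of several classifiers by soft voting and scores the result with a macro-averaged F1. The abstract does not say how the ensemble merges its members or which F1 averaging is reported, so soft voting, macro averaging, the three-class label order, the model names, and all probability values here are assumptions made only for illustration.

```python
from typing import Dict, List, Sequence

# Assumed label order; the abstract does not list the classes.
LABELS = ("negative", "neutral", "positive")


def soft_vote(prob_rows: Sequence[Sequence[float]]) -> int:
    """Average the per-model class probabilities for one sentence
    and return the index of the highest-scoring class."""
    n_models = len(prob_rows)
    n_classes = len(prob_rows[0])
    avg = [sum(row[c] for row in prob_rows) / n_models for c in range(n_classes)]
    return max(range(n_classes), key=avg.__getitem__)


def ensemble_predict(model_probs: Dict[str, List[List[float]]]) -> List[int]:
    """model_probs maps a model name to its per-sentence probability vectors."""
    per_model = list(model_probs.values())
    n_sentences = len(per_model[0])
    return [soft_vote([m[i] for m in per_model]) for i in range(n_sentences)]


def macro_f1(y_true: Sequence[int], y_pred: Sequence[int], n_classes: int) -> float:
    """Unweighted mean of the per-class F1 scores."""
    scores = []
    for c in range(n_classes):
        tp = sum(1 for t, p in zip(y_true, y_pred) if t == c and p == c)
        fp = sum(1 for t, p in zip(y_true, y_pred) if t != c and p == c)
        fn = sum(1 for t, p in zip(y_true, y_pred) if t == c and p != c)
        denom = 2 * tp + fp + fn
        scores.append(2 * tp / denom if denom else 0.0)
    return sum(scores) / n_classes


# Hypothetical softmax outputs for three sentences from three fine-tuned models.
model_probs = {
    "bert":    [[0.7, 0.2, 0.1], [0.2, 0.5, 0.3], [0.1, 0.3, 0.6]],
    "roberta": [[0.6, 0.3, 0.1], [0.1, 0.2, 0.7], [0.2, 0.2, 0.6]],
    "albert":  [[0.5, 0.4, 0.1], [0.3, 0.4, 0.3], [0.1, 0.5, 0.4]],
}
y_true = [0, 1, 2]  # hypothetical gold labels

preds = ensemble_predict(model_probs)
score = macro_f1(y_true, preds, n_classes=len(LABELS))
print([LABELS[i] for i in preds], round(score, 3))
```

In practice the probability vectors would come from the fine-tuned models themselves; the second sentence in the toy data shows that soft voting can differ from a simple majority vote when one model is much more confident than the others.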

All authors have contributed equally.

Acknowledgment

We thank our institute, Indian Institute of Information Technology Allahabad (IIITA), India and Big Data Analytics (BDA) lab for allocating the centralised computing facility and other necessary resources to perform this research. We extend our thanks to our colleagues for their valuable guidance and suggestions.

Author information

Authors and Affiliations

Indian Institute of Information Technology Allahabad, Prayagraj, India
Himanshu Batra, Narinder Singh Punn, Sanjay Kumar Sonbhadra & Sonali Agarwal

Corresponding author

Correspondence to Narinder Singh Punn.

Editor information

Editors and Affiliations

University of Vienna, Vienna, Austria
Christine Strauss
Johannes Kepler University of Linz, Linz, Oberösterreich, Austria
Gabriele Kotsis
Vienna University of Technology, Vienna, Austria
A Min Tjoa
Johannes Kepler University of Linz, Linz, Austria
Ismail Khalil

About this paper

Cite this paper

Batra, H., Punn, N.S., Sonbhadra, S.K., Agarwal, S. (2021). BERT-Based Sentiment Analysis: A Software Engineering Perspective. In: Strauss, C., Kotsis, G., Tjoa, A.M., Khalil, I. (eds) Database and Expert Systems Applications. DEXA 2021. Lecture Notes in Computer Science, vol 12923. Springer, Cham. https://doi.org/10.1007/978-3-030-86472-9_13

DOI: https://doi.org/10.1007/978-3-030-86472-9_13
Published: 31 August 2021
Publisher Name: Springer, Cham
Print ISBN: 978-3-030-86471-2
Online ISBN: 978-3-030-86472-9
eBook Packages: Computer Science, Computer Science (R0)
