ABSTRACT
Sentiment analysis aims at extracting opinions and/or emotions, mainly from written text. The most widely studied problem in sentiment analysis is polarity detection, which falls into the broader class of Natural Language Processing (NLP) text classification problems. To date, state-of-the-art approaches to text classification use neural language models built on popular architectures such as Transformers. However, these approaches are difficult to apply to low-resource languages and domains, such as the Italian language or small clinical trials. Motivated by this, this paper presents VADER-IT, a lexicon-based algorithm for polarity prediction in written text, which adapts the popular VADER to the Italian language. Unlike VADER, our system also predicts a polarity class (i.e., positive, negative or neutral). The system was tested on a dataset of 5495 healthcare-related reviews from QSalute (https://www.qsalute.it/), reaching a micro-averaged F1-score of 81% and a micro-averaged Jaccard score of 73%.
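To make the evaluation setting concrete, the following minimal Python sketch shows one way a normalized polarity score could be mapped to a class and how micro-averaged F1 and Jaccard scores are computed for single-label predictions. It is an illustration under stated assumptions, not the VADER-IT implementation: the function names, the toy scores and labels, and the 0.05 cut-off (the value commonly suggested for the original VADER compound score) are all hypothetical here.

```python
def polarity_class(score: float, threshold: float = 0.05) -> str:
    """Map a normalized score in [-1, 1] to a polarity class.

    The threshold is an assumption borrowed from common VADER usage;
    the rule actually used by VADER-IT is not stated in the abstract.
    """
    if score >= threshold:
        return "positive"
    if score <= -threshold:
        return "negative"
    return "neutral"


def micro_scores(y_true, y_pred):
    """Return (micro-F1, micro-Jaccard) for single-label predictions.

    Counts are pooled over all classes: a correct prediction is one true
    positive, while an error is one false positive (for the predicted
    class) plus one false negative (for the true class).
    """
    tp = fp = fn = 0
    for true_label, pred_label in zip(y_true, y_pred):
        if true_label == pred_label:
            tp += 1
        else:
            fp += 1
            fn += 1
    if tp + fp + fn == 0:
        return 0.0, 0.0  # no samples: scores are undefined, report 0
    micro_f1 = 2 * tp / (2 * tp + fp + fn)
    micro_jaccard = tp / (tp + fp + fn)
    return micro_f1, micro_jaccard


# Toy data (hypothetical): normalized scores and gold labels for 5 texts.
scores = [0.6, -0.4, 0.0, 0.2, -0.1]
gold = ["positive", "negative", "neutral", "neutral", "negative"]

predicted = [polarity_class(s) for s in scores]
micro_f1, micro_jaccard = micro_scores(gold, predicted)
print(predicted)
print(f"micro-F1 = {micro_f1:.3f}, micro-Jaccard = {micro_jaccard:.3f}")
```

In this toy run four of the five texts are classified correctly, so the pooled counts are 4 true positives, 1 false positive and 1 false negative.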
Index Terms
- An Italian lexicon-based sentiment analysis approach for medical applications