Disambiguating context-dependent polarity of words: An information retrieval approach

https://doi.org/10.1016/j.ipm.2017.03.007

Abstract

The paper introduces PolaritySim, a novel approach to disambiguating context-dependent sentiment polarity of words. The task of resolving the polarity of a given word instance as positive or negative is addressed as an information retrieval problem. At the pre-processing stage, two vectors of context features are built for each word w: one from all its occurrences in the positive polarity corpus (consumer reviews with high ratings) and another from its occurrences in the negative polarity corpus (reviews with low ratings). Lexico-syntactic context features are automatically generated from dependency parse graphs of the sentences containing the word. These two vectors are treated as “documents”, one with positive and one with negative polarity. To resolve the contextual polarity of a specific instance of the word w in a given sentence, its context feature vector is built in the same way, and is treated as the “query”. An information retrieval (IR) model is then applied to calculate the similarity of the “query” to each of the two “documents”, with the polarity of the best matching “document” attributed to the “query”. The method uses no prior polarity sentiment lexicons or purposefully annotated training datasets. The only external resource used is a readily available corpus of user-rated reviews. Evaluation on different domains shows that the method is more effective than state-of-the-art baselines, Support Vector Machines (SVM) and Multinomial Naive Bayes (MNB) classifiers, on three out of four datasets. PolaritySim, SVM and MNB were also evaluated with an out-of-domain training corpus. The results indicate that PolaritySim is more effective and robust than SVM and MNB when used with an out-of-domain corpus. We conclude that an IR-based approach can be an effective and robust alternative to machine learning approaches for disambiguating word-level polarity using either within-domain or out-of-domain training corpora.

Introduction

The popularity of online review sites has led to an abundance of content written by consumers. For example, a recently released Amazon corpus (McAuley, Targett, Shi, & van den Hengel, 2015) contains 142.8 million reviews across a wide range of categories covering the period from 1996 to 2014. Most consumer reviews have overall ratings, representing the reviewer’s satisfaction with the product or service. Rated reviews are readily available sources of rich contextual information representing how words are used in positive and negative contexts. We propose PolaritySim, an extensible method for identifying the context-dependent polarity of words expressing an opinion about another word or phrase (opinion target). The only external resource required is a review corpus with user-assigned numerical ratings. The method determines sentiment valence of words with ambiguous (e.g. “small”) or unambiguous (e.g. “beautiful”) sentiment, as well as words that do not carry sentiment valence on their own, but acquire it through context. For example, it correctly determines the negative polarity of “eat” in “This camcorder eats up tape”.

The task of disambiguating the polarity of a given word instance as positive or negative is addressed as an information retrieval (IR) problem. At the pre-processing stage, we build one vector of all contexts of the word w in the positive set (i.e. reviews with high ratings) and another vector of its contexts in the negative set (reviews with low ratings). The lexico-syntactic context features are automatically generated from the dependency parse graphs of all the sentences containing the word w in the positive or the negative corpus. The resulting positive and negative vectors are treated as “documents”. At run time, to determine the polarity of a specific instance of w in an unlabeled review, a context vector is built, which is treated as the “query”. The context features for this vector are derived only from the current sentence containing this instance of w. An information retrieval model is then applied to calculate the similarity of the “query” to each of the two “documents”.
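The query-versus-two-documents idea described above can be illustrated with a minimal Python sketch. It is not the paper's implementation. It assumes that context features have already been extracted as plain strings (the `relation:word` format and the toy features are invented here), it uses cosine similarity over raw counts as a stand-in for the IR model, and it returns a hypothetical "undecided" label on ties.

```python
from collections import Counter
from math import sqrt


def build_vector(feature_lists):
    """Sum context-feature counts over all given occurrences of a word."""
    vector = Counter()
    for features in feature_lists:
        vector.update(features)
    return vector


def cosine(query, document):
    """Cosine similarity between two sparse count vectors."""
    dot = sum(weight * document[feat] for feat, weight in query.items())
    norm_q = sqrt(sum(w * w for w in query.values()))
    norm_d = sqrt(sum(w * w for w in document.values()))
    if norm_q == 0 or norm_d == 0:
        return 0.0
    return dot / (norm_q * norm_d)


def resolve_polarity(query, pos_vector, neg_vector):
    """Attribute to the query the polarity of the better-matching 'document'."""
    pos_score = cosine(query, pos_vector)
    neg_score = cosine(query, neg_vector)
    if pos_score == neg_score:
        return "undecided"  # tie handling is an assumption of this sketch
    return "positive" if pos_score > neg_score else "negative"


# Toy, invented context features for the word "small".
# Each inner list holds the features of one occurrence in a rated review.
pos_contexts = [["amod:size", "nsubj:camera"], ["amod:size", "advmod:very"]]
neg_contexts = [["nsubj:screen", "advmod:too"], ["nsubj:portion", "advmod:too"]]

pos_v = build_vector(pos_contexts)  # positive "document"
neg_v = build_vector(neg_contexts)  # negative "document"

# "Query": features taken only from the sentence containing the instance.
query_v = build_vector([["nsubj:screen", "advmod:too"]])
label = resolve_polarity(query_v, pos_v, neg_v)
print(label)
```

In this toy case the query shares features only with the negative vector, so the instance of "small" is labelled negative. Any other similarity function with the same signature could replace `cosine` without changing the rest of the sketch.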

The PolaritySim method is extensible in a number of ways. For example, the words in the context features could be expanded with related words or the feature set can be expanded with co-occurring patterns from adjacent sentences. Section 4.3 describes one such extension, whereby words in the context features are expanded with related words generated using a Word2Vec model.
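As a rough illustration of the first kind of extension, the sketch below adds a copy of each context feature for every word related to its lexical part. The `relation:word` feature format, the `expand_features` helper and the hard-coded `related` dictionary are all assumptions made for this example. The dictionary merely stands in for neighbours that a trained Word2Vec model would supply.

```python
def expand_features(features, related_words, sep=":"):
    """Return the features plus one variant per related word of each feature's word."""
    expanded = list(features)
    for feature in features:
        relation, _, word = feature.partition(sep)
        for neighbour in related_words.get(word, []):
            candidate = f"{relation}{sep}{neighbour}"
            if candidate not in expanded:  # avoid duplicate features
                expanded.append(candidate)
    return expanded


# Hypothetical related-word lookup (stand-in for embedding neighbours).
related = {"screen": ["display", "monitor"]}

expanded = expand_features(["nsubj:screen", "advmod:too"], related)
print(expanded)
```

The expanded list can then be counted into a context vector in the same way as the original features, which is why the expansion is kept as a separate step here.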

The rest of the paper is organized as follows: Section 2 outlines the motivations and contributions of this work, Section 3 discusses related work, Section 4 presents the method, Section 5 describes the datasets and evaluation experiments, Section 6 contains the analysis of results, and Section 7 concludes the paper and suggests future research directions.

Section snippets

Motivation and contributions of the work

Most research efforts in the sentiment analysis field have been directed at identifying sentiment and its polarity at the sentence or document level. Two major sentiment analysis approaches to date have been: (a) lexicon-based and (b) machine learning based. In the first approach, the polarity of individual words is first determined by using a prior polarity lexicon, then possible polarity shifters are identified, usually by applying hand-crafted rules. Sentence or document level polarities are

Related work

Sentiment analysis has received considerable attention over the past fifteen years. The body of research in this field can be grouped into three categories based on the linguistic units for which sentiment is predicted: words/phrases, sentences and documents. The majority of research effort has been focused on detecting sentence- and document-level sentiment and its polarity. There exist a number of comprehensive surveys that summarize and describe approaches in each of the three categories (

Methodology

The overall system architecture is presented in Fig. 1, and the detailed description of each stage is given in the following sections. In Stage 1 (Section 4.1) the system pre-processes the positive and negative corpora to generate positive (posV) and negative (negV) vectors of context features for each word. In Stage 2 (Section 4.1), the system is given a sentence from an unlabeled document, and for each word instance, it builds a context feature vector (EvalV) using only the content of this

Evaluation

The evaluation was conducted on five datasets described in this section. Four datasets (Sections 5.3–5.5) were created by us to evaluate word-level polarity. We also report evaluation on the dataset from the SemEval Aspect-Based Sentiment Analysis (ABSA) shared task in Section 5.6.

Results and discussion

The results in Tables 3–8 show that PolaritySimQACW and PolaritySimTF.IDF are more effective model variants than PolaritySimCosine. Both PolaritySimTF.IDF and PolaritySimQACW outperformed all MNB and SVM baselines on the Restaurant and Photography datasets, as well as the official SVM baseline on the ABSA dataset. On the AmbAdj dataset PolaritySimQACW outperformed the best MNB and SVM variants with unigram and bigram features by 2.4%, whereas PolaritySim

Conclusion and future work

We described an effective method called PolaritySim for determining word-level contextual polarity that uses readily available consumer rated reviews as the only external resource. The advantage of PolaritySim is that it does not require manually constructed sentiment lexicons or corpora annotated at word or sentence level, which are labour-intensive resources to build. We approach the problem of word-level polarity determination as an IR problem, whereby the context vector representing the

Acknowledgments

The author would like to thank the following annotators: Mohamad Ahmadi, Kaheer Suleman, Stuart Sullivan, Jack Thomas and Andrew Toulis. This work has been supported by the NSERC Discovery grant (no. RGPIN 261439-2013).
