Extending general sentiment lexicon to specific domains in (semi-)automatic manner

Publication

Abstract

This paper describes an approach to the construction of a sentiment analysis system that uses both automatic and manual processes. The system includes a domain-specific sentiment lexicon, modifier patterns and rules that are used to derive the sentiment values of sentences in new texts.

The lexicon of single words (unigrams) is obtained automatically from the distribution of ratings for all words in the labelled training data. The sentiment values of phrases are derived from a manually built list of modifier patterns.
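As a rough illustration of deriving a unigram lexicon from rated texts, the following sketch assigns each word the mean rating of the training texts in which it occurs. The abstract does not say which statistic of the rating distribution the authors use, so the mean, the `min_count` threshold, the rating scale of -1 to 1, and the function name `build_unigram_lexicon` are all assumptions made for this example; the toy training data is invented.

```python
from collections import defaultdict


def build_unigram_lexicon(labelled_texts, min_count=2):
    """Map each word to the mean rating of the texts it occurs in.

    labelled_texts: iterable of (text, rating) pairs, rating assumed in [-1, 1].
    min_count: hypothetical threshold; words seen in fewer texts are dropped.
    """
    ratings_by_word = defaultdict(list)
    for text, rating in labelled_texts:
        # Count each word once per text so long texts do not dominate.
        for word in set(text.lower().split()):
            ratings_by_word[word].append(rating)
    return {
        word: sum(ratings) / len(ratings)
        for word, ratings in ratings_by_word.items()
        if len(ratings) >= min_count
    }


# Invented toy data: short Portuguese phrases with hand-assigned ratings.
training = [
    ("lucros sobem", 1.0),
    ("lucros caem", -1.0),
    ("vendas sobem", 1.0),
]
lexicon = build_unigram_lexicon(training)
```

In this toy run, "sobem" appears only in positively rated texts, while "lucros" appears in one positive and one negative text and so ends up neutral. Words seen only once are left out of the lexicon.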

Each pattern consists of a modifier and a focal element. Modifiers are of different types, depending on whether the operation is intensification, downtoning or reversal.
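One way to picture how a modifier acts on a focal element is to treat each modifier type as a multiplicative factor applied to the focal word's lexicon value. This is only a sketch under that assumption: the abstract does not give the actual patterns or arithmetic, and the `MODIFIERS` table, its factors, the clipping to the range -1 to 1, and the function `phrase_sentiment` are hypothetical.

```python
# Hypothetical modifier table: word -> (operation type, multiplicative factor).
MODIFIERS = {
    "muito": ("intensification", 1.5),  # "very"
    "pouco": ("downtoning", 0.5),       # "little"
    "não": ("reversal", -1.0),          # "not"
}


def phrase_sentiment(modifier, focal, lexicon):
    """Return the sentiment of a 'modifier + focal element' phrase.

    Returns None when the focal element is not in the lexicon.
    An unknown modifier leaves the focal value unchanged.
    """
    base = lexicon.get(focal)
    if base is None:
        return None
    _operation, factor = MODIFIERS.get(modifier, ("none", 1.0))
    # Clip so that intensification cannot leave the assumed [-1, 1] scale.
    return max(-1.0, min(1.0, base * factor))


# Invented lexicon entry for illustration.
toy_lexicon = {"positivo": 0.5}
intensified = phrase_sentiment("muito", "positivo", toy_lexicon)
downtoned = phrase_sentiment("pouco", "positivo", toy_lexicon)
reversed_value = phrase_sentiment("não", "positivo", toy_lexicon)
```

Because each result is traceable to one lexicon entry and one named operation, a system built this way can report which modifier changed which value, which is the kind of explanation the abstract refers to.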

This approach was applied to texts on economics and finance in European Portuguese. In our view, this line of work deserves more attention in the community, as the system not only performs reasonably well but can also provide understandable explanations to the user.