A significant portion of news and social media coverage contains opinions with clear economic relevance – product and travel reviews, for example, or the articles of well-known and respected bloggers who influence consumer purchase decisions. For decision makers who aim to engage with large user communities, sentiment analysis and other opinion mining techniques for acting upon this user-generated content are becoming imperative.
Turning Data into Opinions
The ever-increasing amount of such user-generated content and the limits of human cognition require automated approaches to analyzing the sentiment expressed in online content. As part of opinion mining, sentiment analysis identifies and aggregates polar opinions – i.e., positive or negative statements about facts.
The results can be provided as data services or in the form of visual tools. These tools color-code the results to indicate the automated classification. Colors in the following examples range from red (negative) to grey (neutral) and green (positive). They vary in saturation, depending on the degree of polarity. Vivid colors hint at emotionally charged issues, less saturated ones at a more neutral coverage.
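The color coding described above can be sketched in a few lines of Python. This is only an illustration under stated assumptions, not the mapping used by any particular tool: the function name `sentiment_to_hex`, the score range of -1 to 1, and the fixed lightness of 0.5 are all hypothetical choices.

```python
import colorsys


def sentiment_to_hex(score):
    """Map a polarity score in [-1, 1] to a hex color (illustrative only)."""
    # Clamp the input so out-of-range scores do not break the mapping.
    score = max(-1.0, min(1.0, score))
    # Assumption: hue 0 is red (negative), hue 1/3 is green (positive).
    hue = 1.0 / 3.0 if score > 0 else 0.0
    # Stronger polarity gives a more saturated color; 0 yields plain grey.
    saturation = abs(score)
    r, g, b = colorsys.hls_to_rgb(hue, 0.5, saturation)
    return "#{:02x}{:02x}{:02x}".format(
        *(int(channel * 255 + 0.5) for channel in (r, g, b))
    )
```

A neutral score of 0 produces grey, while scores approaching -1 or 1 produce increasingly vivid red or green.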
Multilingual Sentiment Analysis
Parlez-vous français? The user-generated content gathered from various social media platforms is noisy, unstructured, and multilingual. Processing this type of content and extracting opinions from multilingual resources requires robust text mining techniques. These techniques range from simple spelling correction to more complex tasks such as sentiment analysis or the identification of named entities (locations, persons, organizations).
Multilingual capabilities are becoming increasingly important when applying these techniques to global networks. This goes beyond merely detecting the language(s) used, a comparatively straightforward task. It requires specific language resources, for example sentiment lexicons, and a system that is capable of processing grammatical structures and considering the idiosyncrasies of specific language communities.
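To illustrate why language-specific resources matter, the following minimal sketch scores text with a separate lexicon and separate negation markers per language. Everything in it is an assumption made for illustration: the tiny `LEXICONS` and `NEGATORS` tables, the `polarity` function, and the simplistic rule that a negation marker flips the next sentiment term. It also assumes the language code has already been detected upstream.

```python
import re

# Hypothetical toy resources; real systems rely on far larger, curated lexicons.
LEXICONS = {
    "en": {"good": 1.0, "great": 1.0, "bad": -1.0, "terrible": -1.0},
    "fr": {"bon": 1.0, "excellent": 1.0, "mauvais": -1.0, "horrible": -1.0},
    "de": {"gut": 1.0, "hervorragend": 1.0, "schlecht": -1.0, "furchtbar": -1.0},
}
NEGATORS = {
    "en": {"not", "never"},
    "fr": {"pas", "jamais"},
    "de": {"nicht", "nie"},
}


def polarity(text, lang):
    """Sum lexicon values for one text; raises KeyError for unknown languages."""
    lexicon = LEXICONS[lang]
    negators = NEGATORS[lang]
    tokens = re.findall(r"\w+", text.lower())
    total = 0.0
    negate = False
    for token in tokens:
        if token in negators:
            # Simplification: the flag stays set until the next sentiment term.
            negate = True
        elif token in lexicon:
            value = lexicon[token]
            total += -value if negate else value
            negate = False
    return total
```

For example, `polarity("L'hôtel n'est pas bon", "fr")` flips the positive value of "bon" because "pas" precedes it. A real system would need proper grammatical analysis rather than this single rule.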
Dealing with Ambiguity
Is it Apple or an apple? Does the tweet refer to the multinational technology company or an edible fruit? Many automated systems cannot resolve such ambiguities. webLyzard addresses this problem through domain-specific knowledge, context awareness and the ability to identify relations between semantic concepts.
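One simple way to picture context awareness is to count how many known cue words for each sense appear near the ambiguous term. The sketch below is a deliberately naive illustration and does not represent webLyzard's approach: the `SENSES` table, its cue words, and the `disambiguate` function are hypothetical.

```python
import re

# Hypothetical context cues per sense; a real system would draw on
# domain-specific knowledge rather than a hand-written word list.
SENSES = {
    "apple": {
        "company": {"iphone", "mac", "ios", "stock", "shares", "ceo", "launch"},
        "fruit": {"pie", "juice", "tree", "orchard", "eat", "recipe", "fresh"},
    }
}


def disambiguate(term, text):
    """Return the sense whose cues overlap most with the text, or None."""
    tokens = set(re.findall(r"\w+", text.lower()))
    scores = {
        sense: len(cues & tokens) for sense, cues in SENSES[term].items()
    }
    ranked = sorted(scores.values(), reverse=True)
    # Give up when no cue matched or when the two best senses are tied.
    if ranked[0] == 0 or (len(ranked) > 1 and ranked[0] == ranked[1]):
        return None
    return max(scores, key=scores.get)
```

Returning `None` for ties and for texts without any cue keeps the sketch from guessing when the context carries no evidence.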
Recent articles published in Knowledge-Based Systems and IEEE Intelligent Systems, two prestigious scientific journals, present webLyzard’s novel approach to contextualization. These articles describe how the underlying methods were developed and point out their advantages over existing approaches. They include detailed evaluation sections based on user-generated online reviews of movies (IMDb), hotels (TripAdvisor) and products (Amazon).
The 2013 IEEE article focuses on disambiguation and the generation of contextualized sentiment lexicons. The 2014 follow-up article in Knowledge-Based Systems extends the method, using knowledge graph enrichment techniques to integrate external affective resources into the opinion mining process. The 2017 IEEE article then increases the granularity of the approach by focusing on specific aspects. Such aspects include product features (e.g., a digital camera’s resolution) or the perceptions of a specific event (e.g., as part of a sponsorship agreement). Another important extension is emotion detection, using more fine-grained affect models to distinguish a range of human emotions such as Joy, Trust, Anger and Fear.
- Weichselbraun, A., Gindl, S., Fischer, F., Vakulenko, S. and Scharl, A. (2017). “Aspect-Based Extraction and Analysis of Affective Knowledge from Social Media Streams”, IEEE Intelligent Systems, 32(3): 80-88.
- Weichselbraun, A., Gindl, S. and Scharl, A. (2014). “Enriching Semantic Knowledge Bases for Opinion Mining in Big Data Applications”, Knowledge-Based Systems, 69: 78-85.
- Weichselbraun, A., Gindl, S. and Scharl, A. (2013). “Extracting and Grounding Contextualized Sentiment Lexicons”, IEEE Intelligent Systems, 28(2): 39-46.
- Scharl, A. and Weichselbraun, A. (2008). “An Automated Approach to Investigating the Online Media Coverage of US Presidential Elections”, Journal of Information Technology & Politics, 5(1): 121-132.