Measures of bias in news and social media coverage are essential when investigating trends and differing perceptions of various interest groups. webLyzard uses sentiment information to (i) classify and sort search results, (ii) enrich visualizations such as tag clouds, geographic maps and information landscapes, and (iii) provide data services for accurately tagging third-party content.
A significant portion of news and social media coverage contains opinions with clear economic relevance – product and travel reviews, for example, or the articles of well-known and respected bloggers who influence consumer purchase decisions. Analyzing and acting upon user-generated content is becoming imperative for decision makers who aim to engage with large user communities.
Extracting Opinions from Multilingual Resources
The user-generated content gathered from various social media platforms is noisy, unstructured, and multilingual. Processing this type of content requires robust text mining techniques, ranging from simple spelling corrections to more complex tasks such as sentiment analysis or the identification of named entities (locations, persons, organizations). Multilingual capabilities are becoming increasingly important when applying these techniques to global networks. This goes beyond merely detecting the language(s) used, a comparatively straightforward task. It requires language-specific resources (sentiment lexicons) and a system capable of processing grammatical structures and considering the idiosyncrasies of specific language communities.
Context-Awareness and Dealing with Ambiguity
The ever-increasing number of articles and the limits of human cognition require automated approaches to analyzing the sentiment expressed in online content. As part of opinion mining, sentiment analysis identifies and aggregates polar opinions, i.e., positive or negative statements about facts. To achieve accurate results, one needs to deal with the inherent ambiguities of human language. webLyzard’s method of automatically determining sentiment has been continuously optimized since 2003, giving particular attention to the context of opinionated terms when resolving such ambiguities.
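As an illustration of why context matters, the following minimal sketch scores a sentence with a toy lexicon in which ambiguous terms take their polarity from other words in the same sentence. It is not webLyzard's implementation: the lexicon entries, the function names, and the simple averaging of context votes are all assumptions made for this example.

```python
import re

# Hypothetical toy lexicon of unambiguous terms (not webLyzard's resources).
BASE_LEXICON = {"excellent": 1.0, "friendly": 1.0, "terrible": -1.0, "noisy": -1.0}

# Hypothetical ambiguous terms: polarity depends on co-occurring context terms.
CONTEXT_LEXICON = {
    "unpredictable": {"plot": 1.0, "movie": 1.0, "steering": -1.0, "brakes": -1.0},
    "cold": {"beer": 1.0, "drink": 1.0, "room": -1.0, "staff": -1.0},
}


def tokenize(text):
    """Lowercase the text and split it into simple word tokens."""
    return re.findall(r"[a-z']+", text.lower())


def score_sentence(sentence):
    """Sum term polarities; ambiguous terms are resolved via context words."""
    tokens = tokenize(sentence)
    context = set(tokens)
    total = 0.0
    for token in tokens:
        if token in BASE_LEXICON:
            total += BASE_LEXICON[token]
        elif token in CONTEXT_LEXICON:
            hints = CONTEXT_LEXICON[token]
            # Collect the polarity suggested by each known context term.
            votes = [hints[word] for word in context if word in hints]
            if votes:
                total += sum(votes) / len(votes)
            # Without a known context term, the ambiguous term stays neutral.
    return total


print(score_sentence("The plot was unpredictable and the acting excellent."))
print(score_sentence("The steering felt unpredictable."))
```

In this sketch, "unpredictable" counts as positive next to "plot" and as negative next to "steering"; a production system would derive such context terms from large corpora rather than from a hand-written table.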
Recent articles published in Knowledge-Based Systems and IEEE Intelligent Systems, two prestigious scientific journals, present webLyzard’s novel approach to contextualization. These articles describe how the underlying methods were developed, point out their advantages over existing approaches, and include detailed evaluation sections based on user-generated online reviews of movies (IMDb), hotels (TripAdvisor), and products (Amazon).
- Weichselbraun, A., Gindl, S., Fischer, F., Vakulenko, S. and Scharl, A. (2017). “Aspect-Based Extraction and Analysis of Affective Knowledge from Social Media Streams”, IEEE Intelligent Systems, 32(3): 80-88.
- Weichselbraun, A., Gindl, S. and Scharl, A. (2014). “Enriching Semantic Knowledge Bases for Opinion Mining in Big Data Applications”, Knowledge-Based Systems, 69: 78-85.
- Weichselbraun, A., Gindl, S. and Scharl, A. (2013). “Extracting and Grounding Contextualized Sentiment Lexicons”, IEEE Intelligent Systems, 28(2): 39-46.
While the first IEEE article (2013) focuses on disambiguation and the generation of contextualized sentiment lexicons, the Knowledge-Based Systems article extends the presented methods in order to enrich semantic knowledge bases and integrate external affective resources into the opinion mining process. The second IEEE article (2017) outlines how to increase the granularity of the approach by focusing on aspects to which emotional values apply, for example product features (e.g., a digital camera’s resolution), or perceptions of a specific event (e.g., as part of a sponsorship agreement).
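The idea of aspect-level granularity can be sketched as follows. This is a simplified illustration rather than the method described in the 2017 article: the aspect list, the lexicon, and the rule of attributing each clause's polarity to the aspects mentioned in that clause are assumptions chosen for brevity.

```python
import re
from collections import defaultdict

# Hypothetical aspect terms and sentiment lexicon for a digital camera review.
ASPECTS = {"resolution", "battery", "lens"}
LEXICON = {"sharp": 1.0, "great": 1.0, "poor": -1.0, "weak": -1.0}


def aspect_sentiment(text):
    """Attribute the polarity of each clause to the aspects it mentions."""
    scores = defaultdict(float)
    # Split into clauses at punctuation and at the contrastive word "but".
    for clause in re.split(r"[.;,!?]|\bbut\b", text.lower()):
        tokens = re.findall(r"[a-z']+", clause)
        polarity = sum(LEXICON.get(token, 0.0) for token in tokens)
        for token in tokens:
            if token in ASPECTS:
                scores[token] += polarity
    return dict(scores)


print(aspect_sentiment("The resolution is great, but the battery is weak."))
```

A document-level score would report this review as neutral, whereas the aspect-level view separates the positive statement about the resolution from the negative one about the battery.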
Other Selected Publications
- Scharl, A., Sabou, M., Gindl, S., Rafelsberger, W. and Weichselbraun, A. (2012). “Leveraging the Wisdom of the Crowds for the Acquisition of Multilingual Language Resources”, Language Resources and Evaluation Conference (LREC-2012). Istanbul, Turkey.
- Weichselbraun, A., Gindl, S. and Scharl, A. (2011). “Using Games with a Purpose and Bootstrapping to Create Domain-Specific Sentiment Lexicons”, 20th ACM Conference on Information and Knowledge Management (CIKM-2011). Glasgow, UK: Association for Computing Machinery: 1053-1060.
- Gindl, S., Weichselbraun, A. and Scharl, A. (2010). “Cross-Domain Contextualization of Sentiment Lexicons”, 19th European Conference on Artificial Intelligence (ECAI-2010). Ed. H. Coelho et al. Lisbon, Portugal: IOS Press: 771-776.
- Scharl, A. and Weichselbraun, A. (2008). “An Automated Approach to Investigating the Online Media Coverage of US Presidential Elections”, Journal of Information Technology & Politics, 5(1): 121-132.
- Scharl, A., Pollach, I. and Bauer, C. (2003). “Determining the Semantic Orientation of Web-based Corpora”, Intelligent Data Engineering and Automated Learning, 4th International Conference (IDEAL-2003), Hong Kong (LNCS, Vol. 2690). Ed. J. Liu et al. Berlin: Springer. 840-849.