Lie Detector for Social Media

Barack Obama was not born in the USA! Or was he? Social networks are rife with lies and deception, half-truths and myths. But irrespective of whether a story turns out to be fact or fake, its rapid spread can have unexpected and far-reaching consequences.

PHEME Research Project

Assessing the truthfulness of stories that go viral on social media platforms is the aim of PHEME, a research initiative carried out within the European 7th Framework Programme. With a duration of three years and a budget of EUR 2.9 million, the project enables researchers from seven different countries to tackle the problem with a multi-disciplinary approach that combines big data analytics with advanced linguistic and visual methods. The results will be generic and applicable across domains, but first evaluated in three use cases: medicine, digital journalism, and climate science communication. In May 2015, leading international researchers met at the 24th International World Wide Web Conference in Florence, Italy, where the PHEME consortium organized the RDSM-2015 Workshop on Rumors and Deception in Social Media.

Rumor Mill 2.0

Traditional media channels no longer act as the sole gatekeepers who select newsworthy events and developments. Social media have an increasing influence on information diffusion processes – an ant quickly becomes an elephant, or a sneeze inflates into the threat of a global pandemic.

Coined by the evolutionary biologist Richard Dawkins in 1976 [1], the term “meme” stands for an idea or behavior that spreads among members of a community. Viral effects can amplify the spread of memes in virtual communities, which poses new challenges for policy makers and corporate decision makers. These stakeholders will be the main beneficiaries of the new technologies to be developed within the PHEME project.

Four ‘Vs’ of Big Data Analysis

Volume, variety and velocity – the three ‘Vs’ of big data analysis – represent obstacles for the automated analysis of digital content from social sources, obstacles that the webLyzard Web intelligence platform has successfully addressed. The PHEME project will focus on the fourth ‘V’: the veracity of the acquired knowledge.

Novel visualization techniques embedded in the webLyzard dashboard will help to identify and track four types of dubious truths or rumors: speculation, controversy, misinformation and disinformation.

Building upon previous research [2] into the diffusion of information across online media, a combination of text mining and social network analysis will assess veracity in three consecutive steps: (i) analyze the information contained in the document, (ii) cross-reference extracted facts with trustworthy data sources, and (iii) trace the propagation of information among network nodes.
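The three steps can be illustrated with a deliberately simplified Python sketch. It is not the PHEME or webLyzard implementation: the function names, the tiny TRUSTED_FACTS lookup table, and the share graph are all hypothetical, and exact string matching stands in for the text mining and fact checking a real system would need.

```python
from collections import deque

# Hypothetical trusted reference data (assumption): claim -> known truth value.
TRUSTED_FACTS = {
    "obama was born in the usa": True,
    "obama was not born in the usa": False,
}


def extract_claims(text):
    """Step (i): split a post into normalized, sentence-level claims."""
    normalized = text.lower().replace("!", ".").replace("?", ".")
    return [s.strip() for s in normalized.split(".") if s.strip()]


def check_claim(claim):
    """Step (ii): look the claim up in the trusted source.

    Returns True or False if the source covers the claim,
    and None if the claim cannot be verified.
    """
    return TRUSTED_FACTS.get(claim)


def trace_propagation(shares, origin):
    """Step (iii): breadth-first walk over a share graph.

    `shares` maps an account to the accounts that reshared from it.
    Returns the number of hops from `origin` to each reached account.
    """
    hops = {origin: 0}
    queue = deque([origin])
    while queue:
        node = queue.popleft()
        for follower in shares.get(node, []):
            if follower not in hops:
                hops[follower] = hops[node] + 1
                queue.append(follower)
    return hops


# Toy input: one post and a made-up resharing network.
post = "Obama was not born in the USA!"
shares = {"alice": ["bob", "carol"], "bob": ["dave"], "carol": ["dave"]}

claims = extract_claims(post)
verdicts = {claim: check_claim(claim) for claim in claims}
reach = trace_propagation(shares, "alice")

print(verdicts)
print(reach)
```

Keeping the steps as separate functions mirrors the "consecutive steps" described above: each stage could be replaced by a more capable component without touching the other two.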

Use Cases

The social context of storytelling has a significant impact on the diffusion of memes. The described set of “rumor intelligence” methods will therefore be tested in three specific domains (the figure below shows a screenshot of the current webLyzard dashboard that will be used for visualizing PHEME results):

  • Medicine – e.g., outbreak and spread of a contagious disease such as swine flu;
  • Digital Journalism – e.g., workflow improvements in the case of a breaking story;
  • Climate Science – e.g., filter for the veracity of statements about the causes and impacts of climate change (extending the advanced search capabilities of the Media Watch on Climate Change).

webLyzard Dashboard

Project Consortium

The PHEME research project is funded by the European Commission (EC) within the 7th Framework Programme (FP7) under Project No. 611233. It started on 01 Jan 2014 and will run for 36 months. The consortium includes the following organizations: University of Sheffield (UK), MODUL University Vienna (AT), Saarland University (DE), King’s College London (UK), University of Warwick (UK), Ontotext (BG), ATOS (ES), iHub (KE), and Swissinfo (CH).


[1] Dawkins, R. (1976). The Selfish Gene. Oxford: Oxford University Press.
[2] Scharl, A., Weichselbraun, A. and Liu, W. (2007). “Tracking and Modelling Information Diffusion across Interactive Online Media”, International Journal of Metadata, Semantics and Ontologies, 2(2): 136-145.