Story detection identifies and describes groups of related documents (stories) in digital content streams. webLyzard extracts a rich set of metadata for each identified story, including its origin in terms of publication time and author, its impact on the public debate, the temporal distribution of related publications, and the best keywords to summarize the story’s content.
The results are shown as an interactive Story Graph together with a list of top stories – each including a headline with the keywords and size of the cluster, a characteristic lead article, and a list of related documents. A short video tutorial shows how the Story Graph and other visual tools of the InVID project have been integrated into webLyzard’s visual analytics dashboard (see below for a summary of interactive graph features).
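As an illustration of the story metadata described above, the following minimal Python sketch derives a story's origin (earliest publication time and its author) and the temporal distribution of its documents. The `Document` class, the `summarize_story` function and all field names are hypothetical and do not reflect webLyzard's actual data model; the impact measure is not modeled here because the text does not define it.

```python
from collections import Counter
from dataclasses import dataclass
from datetime import datetime


@dataclass
class Document:
    """Hypothetical document record; field names are assumptions."""
    title: str
    author: str
    published: datetime


def summarize_story(documents):
    """Derive origin and per-day publication counts for one story."""
    # The origin is taken to be the earliest published document (an assumption).
    first = min(documents, key=lambda d: d.published)
    # Temporal distribution: number of documents per calendar day.
    per_day = Counter(d.published.date().isoformat() for d in documents)
    return {
        "origin_time": first.published,
        "origin_author": first.author,
        "size": len(documents),
        "documents_per_day": dict(sorted(per_day.items())),
    }


story = [
    Document("Follow-up report", "B. Writer", datetime(2018, 12, 2, 9, 0)),
    Document("First report", "A. Reporter", datetime(2018, 12, 1, 8, 30)),
    Document("Commentary", "C. Analyst", datetime(2018, 12, 2, 17, 45)),
]
summary = summarize_story(story)
print(summary)
```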
Clustering Digital Content Streams
The story detection component builds on highly scalable methods to cluster documents in real time, across multiple content sources and languages (English, French, German and Spanish). The methods are robust to noisy data (e.g. user postings from social media platforms, or results from speech-to-text conversion) and produce high-quality results even when applied to documents of very different structure and length. Three keywords per cluster serve as a label to describe its contents.
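The general idea of clustering a stream document by document and labeling each cluster with three keywords can be sketched as follows. This is not webLyzard's algorithm, which the text does not specify. It is a simple single-pass approach that assigns each incoming document to the most similar existing cluster by Jaccard similarity of token sets, or opens a new cluster when nothing is similar enough. The class `StoryClusterer`, the stop word list and the threshold value are all assumptions made for illustration.

```python
import re
from collections import Counter

# Hypothetical, tiny stop word list; a real system would use per-language resources.
STOPWORDS = {"the", "a", "an", "of", "in", "on", "and", "to", "for", "is", "as"}


def tokenize(text):
    """Lowercase the text and keep letter-only tokens that are not stop words."""
    return [t for t in re.findall(r"[^\W\d_]+", text.lower()) if t not in STOPWORDS]


def jaccard(a, b):
    """Jaccard similarity of two token sets (0.0 when both are empty)."""
    union = a | b
    return len(a & b) / len(union) if union else 0.0


class StoryClusterer:
    """Single-pass clustering sketch: each document is processed once, on arrival."""

    def __init__(self, threshold=0.2):
        self.threshold = threshold  # assumed value, would need tuning on real data
        self.clusters = []          # each cluster: {"counts": Counter, "docs": list}

    def add(self, text):
        """Assign a document to a cluster and return the cluster index."""
        tokens = tokenize(text)
        token_set = set(tokens)
        best, best_sim = None, 0.0
        for i, cluster in enumerate(self.clusters):
            sim = jaccard(token_set, set(cluster["counts"]))
            if sim > best_sim:
                best, best_sim = i, sim
        if best is None or best_sim < self.threshold:
            # Nothing similar enough: start a new story.
            self.clusters.append({"counts": Counter(), "docs": []})
            best = len(self.clusters) - 1
        self.clusters[best]["counts"].update(tokens)
        self.clusters[best]["docs"].append(text)
        return best

    def label(self, index, k=3):
        """Return the k most frequent tokens of a cluster as its label."""
        return [word for word, _ in self.clusters[index]["counts"].most_common(k)]


stream = [
    "Flood warning issued for river towns",
    "River flood warning extended to more towns",
    "Central bank raises interest rates",
    "Interest rates rise as central bank acts",
]
clusterer = StoryClusterer()
assignments = [clusterer.add(doc) for doc in stream]
print(assignments)  # [0, 0, 1, 1]
print(clusterer.label(0), clusterer.label(1))
```

Raw term frequency is the simplest possible labeling criterion; a production system would more likely weight keywords by how distinctive they are for the cluster.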
- Tooltips shown when hovering over individual stories indicate their duration, the number of documents that belong to a particular story, and the associated keywords. A synchronization mechanism automatically highlights the corresponding story in the Story View as well. On click, the tooltip lets users either focus on this particular story or exclude its content from the query.
- Labels can be deactivated via the settings icon in the upper right corner, which also provides other graph rendering options including the underlying metric (document count vs. weight) and methods to stack (silhouette, expand, zero, wiggle) and sort (default, inside-out, reverse) the stories.
- Clicking on the title or snippet of an article activates its full-text view.
- Clicking on the name of the source opens a separate window with the original article or posting.
- The play button selects the video to be displayed in the upper right corner.
- Clicking the number of articles in the grey headline triggers a search for those articles, while the down arrow expands the list of articles shown.
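The stacking methods listed among the graph rendering options above determine where each story's band starts and ends at every time step. The following Python sketch illustrates three of them under stated assumptions: "zero" stacks from a baseline of 0, "expand" normalizes each time step to a total of 1, and "silhouette" centers the stack around zero. These option names match those of the stack layout in D3 version 3, but the text does not say which library renders the graph, so the semantics here are an assumption. The function `stack_layers` and its input format are hypothetical, and "wiggle" is omitted because its baseline calculation is more involved.

```python
def stack_layers(series, offset="zero"):
    """Compute (lower, upper) bounds for each story at each time step.

    series: list of stories, each a list of non-negative values per time step
            (for example document counts or weights).
    offset: "zero", "expand" or "silhouette".
    """
    if offset not in {"zero", "expand", "silhouette"}:
        raise ValueError(f"unsupported offset: {offset}")

    n_steps = len(series[0])
    layers = [[None] * n_steps for _ in series]

    for t in range(n_steps):
        total = sum(story[t] for story in series)
        scale, base = 1.0, 0.0
        if offset == "expand":
            # Normalize so that all bands together fill the range 0..1.
            scale = 1.0 / total if total else 0.0
        elif offset == "silhouette":
            # Center the whole stack around zero.
            base = -total / 2.0
        for i, story in enumerate(series):
            height = story[t] * scale
            layers[i][t] = (base, base + height)
            base += height  # the next story sits on top of this one
    return layers


# Two stories observed over two time steps.
counts = [[1, 2], [3, 2]]
zero = stack_layers(counts, "zero")
expand = stack_layers(counts, "expand")
silhouette = stack_layers(counts, "silhouette")
print(zero)
print(expand)
print(silhouette)
```

The sort options (default, inside-out, reverse) would only change the order in which the stories are passed to such a function.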
Story detection was officially launched with release 2018-12 (Queensland Lizard) and developed in collaboration with MODUL Technology as part of the InVID Horizon 2020 Project.