Web Intelligence and Visual Analytics
webLyzard technology

eWRT – Extensible Web Retrieval Toolkit

Knowledge capture in the age of massive Web data requires robust and scalable mechanisms to acquire, consolidate and pre-process large amounts of heterogeneous data. The Extensible Web Retrieval Toolkit (eWRT) is a modular open-source Python API that addresses this requirement. It retrieves data from social media sources such as Twitter, Facebook, Google+ and YouTube. eWRT also includes various helper classes for effective caching and data management.
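As an illustration of the kind of caching helper mentioned above, the following is a minimal sketch of a disk-backed cache for retrieval functions. It is not eWRT's API: the `disk_cache` decorator, the `fetch` function and the `CALLS` list are hypothetical names introduced for this example, and the "download" is simulated so that the snippet runs without network access.

```python
import hashlib
import pickle
import tempfile
from functools import wraps
from pathlib import Path


def disk_cache(cache_dir):
    """Hypothetical helper: cache a function's results on disk.

    Each result is pickled into a file whose name is a hash of the
    function name and its arguments.
    """
    cache_path = Path(cache_dir)
    cache_path.mkdir(parents=True, exist_ok=True)

    def decorator(func):
        @wraps(func)
        def wrapper(*args, **kwargs):
            # Build a stable key from the call signature.
            key_source = repr((func.__name__, args, sorted(kwargs.items())))
            key = hashlib.sha256(key_source.encode("utf-8")).hexdigest()
            entry = cache_path / (key + ".pickle")
            if entry.exists():
                # Cache hit: return the stored result without calling func.
                with entry.open("rb") as handle:
                    return pickle.load(handle)
            # Cache miss: compute the result and store it for later calls.
            result = func(*args, **kwargs)
            with entry.open("wb") as handle:
                pickle.dump(result, handle)
            return result

        return wrapper

    return decorator


CALLS = []  # records how often the simulated download actually runs


@disk_cache(tempfile.mkdtemp())
def fetch(url):
    """Stand-in for an expensive Web request (assumption: no real I/O)."""
    CALLS.append(url)
    return "content of " + url


first = fetch("https://example.org/page")
second = fetch("https://example.org/page")  # served from the disk cache
```

After the second call, `CALLS` still holds a single entry, because the repeated request is answered from the cache directory. A production cache would also need expiry and invalidation rules, which this sketch leaves out.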


Available via GitHub, the eWRT toolkit provides components for (i) content acquisition and caching, (ii) converting doc, pdf and html files into text documents, (iii) natural language processing functions such as language detection and string similarity measures including Levenshtein and Soundex distances, (iv) comparing and visualizing ontologies, (v) text cleanup and string normalization, and (vi) streamlining Python programming tasks.
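To make the two string similarity measures named above concrete, here is a minimal, self-contained sketch of Levenshtein distance and American Soundex. The function names `levenshtein` and `soundex` are chosen for this example and are not taken from eWRT; the toolkit's own implementations and signatures may differ.

```python
def levenshtein(a, b):
    """Return the minimum number of single-character insertions,
    deletions and substitutions needed to turn a into b."""
    if len(a) < len(b):
        a, b = b, a  # keep the shorter string in the inner loop
    previous = list(range(len(b) + 1))
    for i, char_a in enumerate(a, start=1):
        current = [i]
        for j, char_b in enumerate(b, start=1):
            insert_cost = current[j - 1] + 1
            delete_cost = previous[j] + 1
            substitute_cost = previous[j - 1] + (char_a != char_b)
            current.append(min(insert_cost, delete_cost, substitute_cost))
        previous = current
    return previous[-1]


# Letter-to-digit table of the American Soundex scheme.
SOUNDEX_CODES = {}
for group, digit in (("bfpv", "1"), ("cgjkqsxz", "2"), ("dt", "3"),
                     ("l", "4"), ("mn", "5"), ("r", "6")):
    for letter in group:
        SOUNDEX_CODES[letter] = digit


def soundex(word):
    """Return the four-character American Soundex code of a word."""
    letters = [c for c in word.lower() if c.isalpha()]
    if not letters:
        return ""
    first = letters[0]
    encoded = first.upper()
    last_digit = SOUNDEX_CODES.get(first, "")
    for char in letters[1:]:
        if char in "hw":
            continue  # h and w do not separate letters with the same code
        digit = SOUNDEX_CODES.get(char, "")
        if digit and digit != last_digit:
            encoded += digit
        last_digit = digit  # vowels reset the previous code to ""
    return (encoded + "000")[:4]  # pad or truncate to four characters


distance = levenshtein("kitten", "sitting")
same_sound = soundex("Robert") == soundex("Rupert")
```

Levenshtein distance measures spelling differences character by character, while Soundex groups names that sound alike in English, so the two are often used for different matching tasks.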

Access the source code in the webLyzard Open Source Projects GitHub repository: https://github.com/weblyzard

eWRT has been jointly developed by researchers from MODUL University Vienna, webLyzard technology, the University of Applied Sciences Chur, and the Vienna University of Economics and Business. The library is currently being extended as part of the uComp Project, which investigates Embedded Human Computation for Knowledge Extraction and Evaluation.

eWRT References

  • Weichselbraun, A., Scharl, A. and Lang, H.-P. (2013). Knowledge Capture from Multiple Online Sources with the Extensible Web Retrieval Toolkit (eWRT). Seventh International Conference on Knowledge Capture (K-CAP 2013). Banff, Canada.