• Link to LinkedIn
  • Link to X
  • Link to Facebook
  • Link to Youtube
  • Link to Mail
Web Intelligence and Visual Analytics
webLyzard technology
  • Home
  • Solutions
    • Product Portfolio
    • Technology Showcases
  • Platform
    • Dashboard Overview
    • Visualization Tools
    • Data Services
  • Research
    • Research Projects
    • Horizon Europe Funding
    • Horizon Europe Dissemination
    • Publications
  • News
    • Latest Updates
    • Release History
    • Newsletter
  • About
    • Contact Details
    • Partners and Clients
    • Privacy Policy
  • Menu Menu
Wildcard Search and Regular Expressions

Wildcard Search and Regular Expressions

In specific situations, searching for simple terms may not be precise enough to cover search results for your topic of interest. This is especially noticeable when seeking results for ambiguous concepts, such as brand names like “Apple” and “Amazon”. Simple searches for these words will probably contain documents not referring to the brands, but to fruit or rivers in South America. For such cases, the webLyzard platform provides wildcard search capabilities and supports regular expressions (RegExp).

What is a Wildcard Search?

Wildcards are essential for state-of-the-art analytics solutions as they enable users to obtain precise results and disambiguate search terms effectively. By supporting symbols like asterisks (*) and question marks (?), wildcard searches enhance the accuracy of data retrieval, accommodates spelling variations, and allows for more comprehensive and nuanced searches across large datasets.

Defining Boomarks and Filters

Wildcard search and regular expressions (RegExp) help users define bookmarks and search filters. They grant the ability to disambiguate the search query and increase the precision of returned results. Regular expressions allow you to search for complex text patterns. They offer wildcard searches to represent unknown or variable characters in a search query. For example, wildcards can help you cover multiple variations of a phrase and search for unknown prefixes/suffixes. For the sake of simplicity and accessibility, the platform supports a limited set of RegExp notations, which are available in the Phrase Editor.

Wildcard Search in the webLyzard Dashboard

The supported RegEx set consists of the following:

  • ? – Question marks represent a single character; ‘networks?’, for example, considers both the singular (‘network’) as well as the plural (‘networks’) of the term.
  • () – Brackets define a character set or range to match. You can use brackets to mark more than one character as optional; e.g. ‘network(ed)?’ includes both ‘network’ and ‘networked’.
  • | – Vertical bars represent ”or” operator, used to specify alternative matches; e.g. ‘network(s|ed)?’ matches ‘network’, ‘networks’, and ‘networked’.

* Is the Star of Search

The notations listed above help avoid defining a new line for each word variation. By contrast, use the *-notation to find prefixes and suffixes that you might be missing. You can also place an asterisk in the middle of a word to see different variations. Note that you can only use this specific wildcard when searching for single terms and not multi-word phrases.

Examples:

  • Suffix matches: ‘network*’ will match ‘network’, ‘networks’, ‘networking’, ‘networked’, etc.
  • Prefix matches: ‘*work’ will match ‘work’, ‘network’, ‘framework’, etc.
  • In-Word matches: ‘colo*r*’ will match ‘color’, ‘colors’, ‘colours’, ‘colored’, ‘coloured’, etc.

Use asterisks sparingly to prevent unwanted overmatches. For example, ‘colo*r*’ will also match ‘Colorado’. Nonetheless, this technique can help explore variations you may not have thought of yet. Subsequently,  these variations can form the bases for additional list entries.

Regular Expressions in the Phrase Editor

Instead of having to define one line each per RegExp or searched phrase, select ‘list of phrases’ to activate the Phrase Editor. This editor allows you to define and manage lists of terms. Each line can contain either a single word, a phrase, or a regular expression (RegExp).

Enter at least one word or phrase per line. Press ENTER to add a new line. By default, at least one line has to match to include a document in the set of search results. There is no need to use quotation marks to identify phrases such as big data. The column on the right shows the number of documents matching the query defined by this particular line. It considers the currently selected content source(s) and time interval. You can sort the lines alphabetically or by the number of matching documents.

webLyzard Topic Editor with examples of wildcard search

Additional Options

Below the text input lines, you can specify the minimal number of RegExp lines that a document must match for inclusion in the search results. This can improve the precision of the query at the cost of lower results. It makes sense for terms that are ambiguous without additional context information (like Amazon the company vs. Amazon the river).

You can export both the regular expression list and the expanded term list as comma-separated values (CSV) or plain text files. To import existing term lists, just copy/paste text into the editor, which automatically creates the required number of lines.

On mouse-over, the editor provides an expand/shrink option to preview a list of all phrases matching the regular expression. The gear icon opens a tooltip to access the Topic Wizard. This visual editor for regular expressions helps you explore phrase variations. If the line is not in a valid RegExp format, it will show a brief help text and disable the wizard for this particular line.

Reserved and Special Characters

When searching for the reserved characters listed below in full-text fields (text, title), the system treats any of these special characters like a space without any inherent meaning. Therefore, it is not possible to search for them directly.

  • List of reserved characters:
    + – = && || > < ! ( ) { } [ ] ^ ” ~ * ? . , : \ /
  • Example: When searching for “eco-innovation”, the system treats “-” as a word separator (similar to a white space). It will also match documents containing “eco innovation”, “eco&innovation”, “eco/innovation”, etc.

Supported Fields

In addition to the Phrase Editor, the following fields support Regular Expressions: text, title, text or title, text (same sentence) and URL. Note that only the Phrase Editor supports the full set of regular expressions in the form listed above. The search box and any other fields only support the general wildcard (‘*’) token and the ?-token, which in these cases stands for a single, unknown character. The only exception applies when searching for an ‘exact phrase’ or using quotation marks (“”), which then treats it as a phrase. If this input consists of multiple words, the system will strip any wildcards and not consider them in the search.

When using the global filter to search, you can manually enter text to search within text and titles with the above-mentioned limited set of regular expression possibilities. The same search limitations apply when creating “pills” in the Lite Version of the webLyzard dashboard.

We invite you to try out these features with our UN Dashboard or to contact us for other domains or tailored solutions.

https://www.weblyzard.com/data/sites/21/wildcard-search-regular-expressions.png 280 280 haslauer https://www.weblyzard.com/data/sites/21/weblyzard-logo-2020.png haslauer2025-01-09 04:46:322025-07-22 09:09:14Wildcard Search and Regular Expressions
Search Search

CATEGORIES

  • News & Events
  • Use Cases
  • Data Services
  • Visualizations
  • Research Projects

Recent Updates

  • AI Visibility Tracking – Monitoring Generative Engine ResultsJanuary 19, 2026 - 5:14 am
  • TRANSMIXR Presentation at IBC 2025 - Newsroom AI Toolbox
    Newsroom of the Future at IBC 2025September 29, 2025 - 9:42 pm
  • Sustainability Reporting with Generative AIJuly 20, 2025 - 11:59 am
  • CLAIM Project - Thumbnail
    Hybrid AI Models to Detect DisinformationApril 21, 2025 - 9:22 pm
  • Generative AI (GenAI) Thumbnail
    Generative AI for Content LifecyclesMarch 18, 2025 - 8:22 pm

About

webLyzard technology is an Austrian SME founded in 2008. The unique capabilities of its big data platform are based on a strong R&D track record in the fields of knowledge extraction, artificial intelligence, visualization and the integration of geospatial and semantic Web technologies.

web·Lyz·ard

Function: intelligence platform; Etymology: composed from web (as in World Wide Web) and lyzard (as in analyzer). 1 : (broadly) enriches digital content; identified by its speed, accuracy and scalability. 2 : predicts trends to gain a deeper understanding of information flows.

Visual Tools

  • Trend Chart Thumbnail
    Trend Chart – Dynamic Content MetricsOctober 18, 2020 - 9:00 am
  • Story Graph / Streamgraph Thumbnail
    Story Detection and Story Graph VisualizationApril 10, 2020 - 9:02 am
  • Geographic Map of Europe
    Geographic Map – Geospatial AnalyticsOctober 18, 2019 - 11:15 am

Data Services

  • AI Visibility Tracking – Monitoring Generative Engine ResultsJanuary 19, 2026 - 5:14 am
  • Wildcard Search and Regular Expressions
    Wildcard Search and Regular ExpressionsJanuary 9, 2025 - 4:46 am
  • Knowledge Graph - SKB - Thumbnail
    Knowledge Graph – Semantic Knowledge BaseNovember 28, 2024 - 10:00 am
Link to: Social Services Transformation – BENEFITS Project Link to: Social Services Transformation – BENEFITS Project Social Services Transformation – BENEFITS ProjectBENEFITS - Transforming Social ServicesLink to: Generative AI for Content Lifecycles Link to: Generative AI for Content Lifecycles Generative AI (GenAI) ThumbnailGenerative AI for Content Lifecycles
Scroll to top Scroll to top Scroll to top