Search, Classify, Profile, and Manage Disparate Data Sources

IRI Workbench supports the discovery and definition of disparate data sources in both local and remote systems.

Built-in data scanning, data profiling, data classification, and search results reporting -- along with field-metadata creation and management facilities -- directly support data integration, data masking, data migration, data quality, and related activities in the award-winning data processing and protection products front-ended in IRI Workbench.

ER Diagramming

Data Classification

Metadata Discovery

Metadata Imports

Database Profiling

Flat-File Profiling

Schema Data Class Search

Directory Data Class Search

Dark Data Discovery

Data Quality Assessment

Data Classification

Define enterprise-wide data class libraries, automatically search your sources and catalog the data in them, and then apply transformation and protection rules that you matched to your classes.

Metadata Discovery

Connect to structured and semi-structured files and relational databases. Define or re-define column names, offsets, and data types so you can save, share, and re-use the metadata for your data sources in central data definition files (DDFs) that are compatible with every IRI software application.
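
As a rough illustration of why shared field metadata is useful, the sketch below keeps column names, offsets, and data types in one place and reuses them to parse a fixed-width record. It is written in plain Python and does not reproduce the IRI DDF syntax; the `FieldDef` class, the `LAYOUT` list, and the sample record are all hypothetical.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class FieldDef:
    """One field's metadata (hypothetical structure, not the IRI DDF format)."""
    name: str
    position: int      # 1-based character offset within the record
    size: int          # field width in characters
    dtype: type = str  # callable used to cast the raw text


# Hypothetical layout for a fixed-width customer record.
LAYOUT = [
    FieldDef("cust_id", 1, 6, int),
    FieldDef("name", 7, 12),
    FieldDef("balance", 19, 9, float),
]


def parse_record(line: str, layout: list[FieldDef]) -> dict:
    """Slice a fixed-width line with the shared layout and cast each value."""
    row = {}
    for field in layout:
        start = field.position - 1
        raw = line[start:start + field.size].strip()
        row[field.name] = field.dtype(raw)
    return row


record = "000042Ada Lovelace  1234.50"
print(parse_record(record, LAYOUT))
```

Because the layout lives apart from the parsing logic, any job that reads the same source can import it instead of redefining the fields.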

Metadata Imports

Integrated COBOL copybook and JCL sort parm converters, ASN.1 CDR and XLS/X file readers, XML and JSON parsers, and support for external data classification and discovery results also produce IRI-compatible data definitions or job configurations to save you time. In addition, third-party tools like Quest Mapping Manager, MITI MIMB, and DataSwitch can automatically generate the same IRI DDF from a wide variety of ETL, BI, and data modeling applications, so you can start processing the same data in the IRI Voracity platform sooner.

Database Profiling

Compile statistics, check referential integrity, and search for lookup, string-, pattern-, and fuzzy-matching values in any JDBC-connected data source.
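
To make the idea of a referential integrity check concrete, here is a minimal sketch that looks for orphaned foreign-key values with a left join. It uses Python's built-in `sqlite3` module with an in-memory database rather than a JDBC connection, and it is not how the Workbench profiling wizard is implemented; the table names, column names, and the `find_orphans` helper are assumptions made for illustration.

```python
import sqlite3


def find_orphans(conn, child, fk, parent, pk):
    """Return distinct child FK values that have no matching parent key.

    Identifiers are interpolated directly, so pass only trusted names.
    """
    sql = (
        f"SELECT DISTINCT c.{fk} FROM {child} AS c "
        f"LEFT JOIN {parent} AS p ON c.{fk} = p.{pk} "
        f"WHERE c.{fk} IS NOT NULL AND p.{pk} IS NULL "
        f"ORDER BY c.{fk}"
    )
    return [row[0] for row in conn.execute(sql)]


# Hypothetical sample schema: order 12 points at a customer that does not exist.
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE customers (id INTEGER PRIMARY KEY, name TEXT);
    CREATE TABLE orders (id INTEGER PRIMARY KEY, customer_id INTEGER);
    INSERT INTO customers VALUES (1, 'Ann'), (2, 'Bob');
    INSERT INTO orders VALUES (10, 1), (11, 2), (12, 7), (13, NULL);
""")

orphans = find_orphans(conn, "orders", "customer_id", "customers", "id")
print(orphans)
```

NULL foreign keys are skipped here on the assumption that an empty reference is allowed; a stricter check could report them separately.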

Flat-File Profiling

Compile statistics, and search for lookup, string-, pattern-, and fuzzy-matching values in any sequential file format that IRI supports.
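
The sketch below shows, in generic terms, what compiling column statistics and running a fuzzy-matching search over a delimited file can look like. It relies only on Python's standard `csv` and `difflib` modules; the sample data, the `profile_column` and `fuzzy_hits` helpers, and the 0.8 similarity threshold are assumptions, not the statistics or matching algorithm that IRI uses.

```python
import csv
import io
from difflib import SequenceMatcher


def profile_column(rows, column):
    """Compile simple statistics for one column of a parsed flat file."""
    values = [r[column] for r in rows]
    non_empty = [v for v in values if v.strip()]
    return {
        "count": len(values),
        "nulls": len(values) - len(non_empty),
        "distinct": len(set(non_empty)),
        "min_len": min(map(len, non_empty), default=0),
        "max_len": max(map(len, non_empty), default=0),
    }


def fuzzy_hits(rows, column, lookup, threshold=0.8):
    """Return (value, lookup_term) pairs whose similarity meets the threshold."""
    hits = []
    for r in rows:
        value = r[column]
        for term in lookup:
            ratio = SequenceMatcher(None, value.lower(), term.lower()).ratio()
            if ratio >= threshold:
                hits.append((value, term))
                break  # one lookup match per value is enough
    return hits


# Hypothetical CSV content with one misspelling and one empty value.
SAMPLE = "id,city\n1,Melbourne\n2,Melborne\n3,\n4,Orlando\n"
rows = list(csv.DictReader(io.StringIO(SAMPLE)))

stats = profile_column(rows, "city")
hits = fuzzy_hits(rows, "city", ["Melbourne"])
print(stats)
print(hits)
```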

Schema Data Class Search

Find and leverage all data schema-wide that matches attributes of your data classes or data class groups. Automatically scan through every column in the schema rather than one table at a time. Use this in conjunction with the Data Class DB Masking wizard.

There is also a Directory Data Class Search (and corresponding Data Class File Masking) wizard in the IRI FieldShield menu in IRI Workbench to find and de-identify PII in one or more flat files distributed across a LAN. Note that IRI DarkShield also supports the same data classes and masking functions for RDB schemas.

Directory Data Class Search

The Directory Data Class Search wizard in IRI Workbench matches data in structured files within one or more directories to configured data classes. The search process compares the matchers in the data classes with the data in those files to determine the best match, if any. The matchers can be either patterns or set file lookups. If only a few selected structured files need to be searched, use the Data Class Library editor for faster results.
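
The following sketch illustrates the general idea of comparing pattern and lookup matchers against column values and picking the best-matching data class. The scoring rule (share of non-empty values that match), the 0.5 cutoff, and the class names are all assumptions for illustration; the source does not describe how the wizard actually scores or ranks matches.

```python
import re
from dataclasses import dataclass
from typing import Callable, Optional


@dataclass
class DataClass:
    """A named data class with one matcher (hypothetical structure)."""
    name: str
    matcher: Callable[[str], bool]


def pattern_matcher(regex: str) -> Callable[[str], bool]:
    """Build a matcher that requires the whole value to fit the pattern."""
    compiled = re.compile(regex)
    return lambda value: compiled.fullmatch(value) is not None


def lookup_matcher(values) -> Callable[[str], bool]:
    """Build a case-insensitive matcher from a set of lookup values."""
    lookup = {v.lower() for v in values}
    return lambda value: value.lower() in lookup


def best_match(column_values, classes, min_score=0.5) -> Optional[str]:
    """Return the class matching the largest share of values, if any."""
    values = [v for v in column_values if v]
    if not values:
        return None
    best_name, best_score = None, 0.0
    for dc in classes:
        score = sum(dc.matcher(v) for v in values) / len(values)
        if score > best_score:
            best_name, best_score = dc.name, score
    return best_name if best_score >= min_score else None


CLASSES = [
    DataClass("US_SSN", pattern_matcher(r"\d{3}-\d{2}-\d{4}")),
    DataClass("FIRST_NAME", lookup_matcher(["Ann", "Bob", "Carla"])),
]

print(best_match(["123-45-6789", "987-65-4321", "n/a"], CLASSES))
print(best_match(["ann", "BOB", "Dave"], CLASSES))
print(best_match(["red", "green"], CLASSES))
```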

Dark Data Discovery

In addition to structured data, IRI DarkShield can also search and report on data in semi-structured and unstructured sources on-premise or in the cloud, including: MS Office and PDF files, Parquet and audio files, relational and NoSQL databases, EDI files in FHIR, HL7, JSON, X12 and XML formats, raw text, and images in many formats including DICOM. Leverage one or more search methods at the same time, including metadata/location or Regex pattern matches, exact or fuzzy matches to lookup values, signature detection, and multiple NER model frameworks using semi-supervised machine learning. You can also extract values from dark data and its associated metadata into flat, query-ready DDF files, and simultaneously mask or replace those values with IRI DarkShield or perform textual ETL with Voracity.

Data Quality Assessment

Use pattern definition and computational validation scripts to locate and verify the formats and values of data you define in data classes or groups (catalogs) for the purposes of discovery and function-rule assignment (e.g., in Voracity cleansing, transformation, or masking jobs). You can also use SortCL field-level if-then-else logic and "iscompare" functions to isolate null values and incorrect data formats in DB tables and flat files. Or, use outer joins to silo source values that do not conform to master (reference) data sets. Use data formatting templates and their date validation capabilities, for example, to check the correctness of input days and dates.
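
As a generic example of computational validation, the sketch below goes beyond a format pattern in two ways: a Luhn checksum confirms that a digit string is a plausible card number, and a calendar-aware parse rejects impossible dates. Both helpers are written in plain Python for illustration; they are not IRI validation scripts or SortCL functions, and the function names and default date format are assumptions.

```python
from datetime import datetime


def luhn_valid(number: str) -> bool:
    """Check a digit string with the Luhn checksum, ignoring separators."""
    digits = [int(c) for c in number if c.isdigit()]
    if len(digits) < 2:
        return False
    total = 0
    # Walk from the rightmost digit; double every second digit.
    for i, d in enumerate(reversed(digits)):
        if i % 2 == 1:
            d *= 2
            if d > 9:
                d -= 9
        total += d
    return total % 10 == 0


def date_valid(text: str, fmt: str = "%Y-%m-%d") -> bool:
    """Return True only if the text is a real calendar date in the format."""
    try:
        datetime.strptime(text, fmt)
        return True
    except ValueError:
        return False


print(luhn_valid("4111 1111 1111 1111"))
print(date_valid("2023-02-29"))
```

A value such as "2023-02-29" passes a simple digit pattern but fails the calendar check, which is the kind of error a format-only rule would miss.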

Frequently Asked Questions (FAQs)

1. What is data discovery in IRI Workbench?

Data discovery in IRI Workbench refers to the process of scanning, profiling, classifying, and cataloging data across various sources. It allows users to understand their data landscape, identify sensitive elements, convert legacy metadata in supported formats into IRI (SortCL) format, and prepare for data management tasks like integration, masking, migration, and quality assessment.

2. How does IRI Workbench support data classification?

IRI Workbench supports data classification through customizable Data Classes and Data Class Groups. These classifications help identify and categorize sensitive data elements, enabling users to apply appropriate masking, cleansing, or transformation rules across structured and unstructured data.

3. What kinds of data sources do IRI data classification and masking software support?

They support structured, semi-structured, and unstructured data sources on-premise or in the cloud (AWS, Azure, GCI, OCI, SharePoint Online, OneDrive). This includes data in relational databases via JDBC, fixed and delimited flat files (e.g., CSV) and sequential COBOL files, semi-structured NoSQL databases and files in LDIF, JSON, XML, FHIR, HL7, and X12 formats, Parquet and .SQL files, PDF and MS Office documents, common and DICOM image formats, audio files, and files containing signatures and handwritten PII.

4. How can IRI Workbench help me find personally identifiable information (PII)?

IRI Workbench uses search matchers as it scans data in the sources listed above to find defined data classes. AI-powered and traditional discovery methods include pattern, signature, handwriting, RDB data classification, and named entity recognition (NER) models, plus exact and fuzzy matching lookup values and metadata-based location parsers. Users can search across entire RDB schemas, NoSQL clusters, and directories in on-premise and cloud folder structures to find and classify PII, which can then be masked using FieldShield or DarkShield.

5. What is the difference between Schema Data Class Search and Directory Data Class Search?

The IRI FieldShield Schema Data Class Search wizard in IRI Workbench scans all columns across a database schema to match attributes in configured data classes, while the Directory Data Class Search finds the same classes of predefined data in all flat files in one or more directories on-premise or in cloud stores.

6. How do I find (discover), import, and use legacy metadata?

IRI Workbench can discover and define metadata like column names, data types, and offsets. It also supports imports from COBOL copybooks, JCL, XML, JSON, ASN.1 CDRs, XLS/X files, and more. Additionally, it can use external discovery results or generate metadata through third-party tools for IRI job use.

7. What tools are there for data profiling and database profiling?

IRI Workbench includes profiling wizards for both databases and flat files. You can compile statistics, validate referential integrity, and run lookup, pattern, string, and fuzzy matching operations to evaluate data quality and consistency.

8. How can I find PII in unstructured (dark) data sources?

IRI Workbench, through IRI DarkShield, can discover sensitive information in dark data sources like PDFs, Office documents, images, audio, EDI formats, and unstructured text. It supports multiple search methods, including pattern matching, NER models, lookup values, and signature detection.

9. How do I assess data quality using IRI Workbench?

IRI Workbench includes pattern validation and computational rule-checking to assess whether data conforms to expected formats and values. You can also use joins, conditional logic, and data formatting templates to detect nulls, outliers, or inconsistencies.

10. What is a Data Class?

A Data Class in IRI Workbench defines a category of data (e.g., first name, SSN, credit card number) and associates it with specific search matchers and rules (usually for data masking). These data classes standardize how specific kinds of data will be found and processed, regardless of location or format.

11. What are Data Class Groups and how are they used?

In the IRI Voracity data classification infrastructure front-ended in the IRI Workbench IDE, users can organize one or more related data classes into broader categories called Data Class Groups. For example, you can group classes of PII like names, addresses, emails, account numbers, and phone numbers into a data class group called “Customer Data” or “GDPR.” These groups can inherit default masking rules and be prioritized by sensitivity level, helping streamline rule application and data protection strategies.

12. What is the benefit of using sensitivity levels in Data Class Groups?

Sensitivity levels determine rule precedence when multiple Data Classes match the same data element. This ensures that the most restrictive or appropriate masking rule is applied based on the data’s criticality, enhancing compliance and risk management.
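
A minimal sketch of this precedence idea follows: when several data classes match the same element, the rule from the match with the highest sensitivity level is chosen. The numeric scale (higher means more sensitive), the class and group names, and the rule names are all hypothetical, and the source does not specify how ties are resolved; this sketch simply keeps the first match at the highest level.

```python
from dataclasses import dataclass
from typing import Optional, Sequence


@dataclass(frozen=True)
class ClassMatch:
    """One data class that matched an element (hypothetical structure)."""
    data_class: str
    group: str
    sensitivity: int  # assumed scale: higher means more sensitive
    rule: str         # name of the masking rule tied to the class or group


def select_rule(matches: Sequence[ClassMatch]) -> Optional[ClassMatch]:
    """Pick the match with the highest sensitivity; first one wins a tie."""
    if not matches:
        return None
    return max(matches, key=lambda m: m.sensitivity)


# Hypothetical case: a 10-digit value matches two different classes.
matches = [
    ClassMatch("PHONE_US", "Contact Data", 2, "redact_last_4"),
    ClassMatch("ACCOUNT_NUMBER", "Financial Data", 5, "format_preserving_encryption"),
]

chosen = select_rule(matches)
print(chosen.data_class, chosen.rule)
```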

13. Can I use IRI Workbench with both FieldShield and DarkShield?

Yes. IRI Workbench is the graphical front-end for both the IRI FieldShield and IRI DarkShield data masking tools. It provides a unified environment for configuring data classification, discovery, and data masking tasks for structured, semi-structured, and unstructured sources.

14. What formats are supported for pattern and fuzzy matching?

IRI Workbench supports regex patterns, lookup file values, and fuzzy-matching techniques for both string and numeric data types. These methods enhance the accuracy of sensitive data discovery in diverse datasets.

15. How do IRI metadata and data definitions support reusability?

Structured data definitions discovered or imported in IRI Workbench are stored in data definition files (DDFs). These can be reused across different IRI jobs, reducing setup time and maintaining consistency across projects.
