The Best Way to Find and Mask Unstructured Data

With so much personally identifiable information (PII) and other sensitive data hidden in unstructured (or so-called "dark data") files, you need a way to find, report on, and mask that PII efficiently, wherever it is. IRI DarkShield does all this so you can adhere to your business rules and privacy laws pertaining to this data. Consider some of the advantages of using DarkShield:

LAN-wide PII searching and masking

Just as you can use IRI FieldShield or IRI Voracity to find, classify, and mask PII at the column, table, or schema level, you can use DarkShield or Voracity on PII in one or more files, folders, and network-mounted drives.

Whether you search, mask, or do both at once, you have the option to secure data surgically, or en masse to save job specification and execution time.

Support for multiple file formats, search methods, and masking functions

Rather than requiring separate learning, specification, or execution steps for each file type you have, DarkShield handles proprietary formatting differences automatically. This means you can search, classify, extract, mask, and report on all DarkShield-supported file types at once -- from log files to emails to PDFs -- all in the same design spec.

DarkShield also supports multiple search methods (see below), and many of the same data masking functions available to FieldShield users.

You can also combine all of these features in the same DarkShield job. That is, you can locate the PII as you've defined it in data classes wherever it is on your LAN using multiple search techniques and, at the same time or later, automatically apply the masking functions you've assigned.

Use popular PII patterns, or your own

Save time by choosing from many pre-defined RegEx patterns for your searches, including credit cards, phone numbers, email addresses and national ID formats. Or, define your own pattern for any custom data format you need to classify, find, and mask.
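To make the idea of pattern-based searching concrete, here is a minimal Python sketch of finding and masking values that match regular expressions. It is not DarkShield's implementation or configuration syntax; the pattern definitions, the class names (EMAIL, US_PHONE, US_SSN), and the functions find_pii and mask_pii are all hypothetical and chosen only for illustration.

```python
import re

# Illustrative patterns only. These are assumptions for this sketch,
# not the pre-defined patterns shipped with DarkShield.
PATTERNS = {
    "EMAIL": re.compile(r"[A-Za-z0-9._%+-]+@[A-Za-z0-9.-]+\.[A-Za-z]{2,}"),
    "US_PHONE": re.compile(r"\b\d{3}[-. ]\d{3}[-. ]\d{4}\b"),
    "US_SSN": re.compile(r"\b\d{3}-\d{2}-\d{4}\b"),
}


def find_pii(text):
    """Return (class_name, matched_text, start_offset) tuples sorted by offset."""
    hits = []
    for name, pattern in PATTERNS.items():
        for match in pattern.finditer(text):
            hits.append((name, match.group(), match.start()))
    return sorted(hits, key=lambda hit: hit[2])


def mask_pii(text, fill="*"):
    """Overwrite every matched span with fill characters of the same length."""
    chars = list(text)
    for _, value, start in find_pii(text):
        chars[start:start + len(value)] = fill * len(value)
    return "".join(chars)


if __name__ == "__main__":
    sample = "Contact jane.doe@example.com or 555-123-4567; SSN 123-45-6789."
    for hit in find_pii(sample):
        print(hit)
    print(mask_pii(sample))
```

A custom format would simply be one more entry in the pattern table. Real-world patterns usually need additional validation (for example, checksum tests on card numbers) to limit false positives.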

Lookup value searches

Find PII values that exactly match -- and soon roughly match (using fuzzy search algorithms) -- values in a lookup file that IRI includes (like common American first and last names), or that you provide (like employees, products, formulas, or places).

For sensitive information that does not conform to a pattern, or that has too many members for a literal pattern definition or a poorly trained NER model, set lookups are an especially good search option.
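As a rough illustration of lookup-based searching, the following Python sketch checks each word in a text against a set of known values, with optional approximate matching. It is an assumption-laden sketch rather than DarkShield's algorithm: the functions load_lookup and find_lookup_hits are hypothetical, the lookup file is assumed to hold one value per line, and the fuzzy step uses the similarity ratio from Python's standard difflib module as a stand-in for whatever fuzzy method a product might use.

```python
import difflib
import re


def load_lookup(lines):
    """Build a lowercase lookup set from an iterable of lines (e.g. an open file)."""
    return {line.strip().lower() for line in lines if line.strip()}


def find_lookup_hits(text, lookup, fuzzy_cutoff=None):
    """Return (token, matched_entry, start_offset) for tokens found in the lookup set.

    If fuzzy_cutoff (between 0 and 1) is given, tokens that are not exact
    members are also compared to the set using difflib's similarity ratio.
    """
    hits = []
    candidates = sorted(lookup)
    for match in re.finditer(r"[A-Za-z']+", text):
        token = match.group().lower()
        if token in lookup:
            hits.append((match.group(), token, match.start()))
        elif fuzzy_cutoff is not None:
            close = difflib.get_close_matches(
                token, candidates, n=1, cutoff=fuzzy_cutoff
            )
            if close:
                hits.append((match.group(), close[0], match.start()))
    return hits


if __name__ == "__main__":
    names = load_lookup(["Smith\n", "Johnson\n", "Maria\n"])
    print(find_lookup_hits("Maria Smith met Jonson.", names))
    print(find_lookup_hits("Maria Smith met Jonson.", names, fuzzy_cutoff=0.85))
```

Exact set membership is a constant-time check per token, which is why a lookup scales to lists far too long to express as a literal pattern. The fuzzy branch compares against every entry and is correspondingly slower.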

NER model searches

Natural Language Processing (NLP) and Machine Learning (ML) technology in DarkShield supports Named Entity Recognition (NER) searches so you can find names, addresses, and other sensitive information in the context of your documents. This is especially valuable for locating, and then redacting or pseudonymizing, the names of people, which do not always match patterns or values in a lookup set.

Semi-supervised NLP model training

DarkShield also uniquely includes user-friendly NLP model builders and trainers that make use of your documents for machine learning. This improves the relevance, and thus accuracy, of your NER searches (e.g., for people's names).

Integrated data classification & masking

Save time and trouble as you define and catalog PII and other sensitive information in data classes or class groups using simple graphical wizards. More specifically, you match a chosen masking function to each data class (or group), so that mask will automatically be applied to that data in the remediation phase.
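The pairing of data classes with masking functions can be pictured as a simple rule table. The Python sketch below is only an illustration of that idea, not how DarkShield stores or applies its rules; the class names, the masking functions (redact, keep_last4, hash_token), and the remediate helper are all hypothetical.

```python
import hashlib


def redact(value):
    """Replace every character with an asterisk."""
    return "*" * len(value)


def keep_last4(value):
    """Mask all but the last four characters (short values are left as-is)."""
    return "*" * max(len(value) - 4, 0) + value[-4:]


def hash_token(value):
    """Substitute a short, repeatable token derived from a SHA-256 digest."""
    return hashlib.sha256(value.encode("utf-8")).hexdigest()[:12]


# Hypothetical rule table: one masking function per data class.
MASKING_RULES = {
    "EMAIL": hash_token,
    "CREDIT_CARD": keep_last4,
    "NAME": redact,
}


def remediate(hits, rules=MASKING_RULES, default=redact):
    """Apply each class's assigned function to (class_name, value) search hits."""
    return [(cls, rules.get(cls, default)(value)) for cls, value in hits]


if __name__ == "__main__":
    found = [("CREDIT_CARD", "4111111111111111"), ("NAME", "Ann")]
    print(remediate(found))
```

Because the rule is attached to the class rather than to each file or field, any value later classified as, say, CREDIT_CARD receives the same treatment without further specification.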

The same data classes you define for DarkShield jobs can also be used in DB, flat-file and Excel PII search and mask operations in IRI FieldShield and CellShield. Eventually, they will also be supported in IRI Voracity data management operations.

All these activities -- from defining, saving, re-using, and applying data classes to data masking, data quality, and data integration -- use the same pane of glass. All IRI data 'shield' and data management tools are supported in the Voracity platform, and share its free graphical IDE built on Eclipse®, IRI Workbench.

Simultaneous searching and remediation

Saves time and reduces the number of passes through your data by combining these processes in the same operation.

Separate searching and remediation

Allows you to just find, report on, and/or use the data you're searching for, without necessarily masking it. This saves on time and the storage space needed for (potentially multiple) masked copies of the data.

Serialized searching and remediation

Supports external and automated (scheduled) runs of either combined or separated data searching and masking jobs. Also makes manual or graphical modification of these jobs possible, as your job specifications are saved in a single, self-documenting, and easily modifiable XML script.

Multiple, interoperable masking functions

Gives you a range of options to use on each class of data based on your business needs and the ability of the file format to support it. You can also use other IRI shield tools to mask or reveal the data if it comes from or moves into a more structured environment.

For example, you can export a MongoDB column containing floating PII values out to a JSON file via Voracity, FieldShield, or NextForm so that DarkShield can mask that PII (and you can re-import it). Conversely, you could use DarkShield to find and encrypt names in unstructured documents, then extract and structure those masked names in a delimited file. You can then import that file into an Excel spreadsheet, where an authorized CellShield user can later decrypt the names.

Text file (TXT) search results

Exports and structures your search results and selected file-related metadata into a delimited file that you can use for auditing, analytic, and data delivery purposes. This feature also enables compliance with the GDPR data portability requirement by allowing you to provide the information you have on a requester in any format required.

If you use Voracity, all of this, including the format and disposition of the report, can be built into a visual, automated workflow.

Executive results dashboard

DarkShield's dynamic, exportable HTML report incorporates the TXT results into visualizations of the number and types of files containing PII, and which of the file types containing sensitive data were also masked.

IRI Data Definition File (DDF) metadata creation

Creates DDF metadata for the .TXT results file for use in IRI Voracity ETL or CDC jobs, CoSort or BIRT reporting, NextForm data migration/replication operations, or FieldShield structured data masking jobs.

Excel Interchange File (EIF) metadata creation

Creates an Excel-compatible, CellShield EE-ready import file containing the results of .XLS and .XLSX file PII searches for spreadsheet recording, and localized sheet-level or bulk remediation operations, respectively.

