Take a Closer Look at DarkShield



Define, Discover, and De-Identify PII in Dark Data

With IRI DarkShield, you can classify, find, and erase or otherwise mask sensitive information in structured, semi-structured, and unstructured sources, including text, PDF, and MS Office documents; Parquet and image files; and relational database tables and NoSQL collections.

DarkShield uses shared data classes, custom search combinations, and consistent masking functions across on-premises and cloud sources. You can also extract, share, and display job results (and attendant file metadata) in its Eclipse environment or in your SIEM environment.

With DarkShield, you can comply with Right to Be Forgotten requests, deliver specific data extracts to those requesting record portability, and facilitate data quality in data rectification requests. You can save the masked output over the original files, or to same-named target files and folders on your network or in the cloud.

Examine the functions and formats available in DarkShield on this site. Then arrange a free online demo to see how DarkShield works and to get answers to your questions.

How DarkShield Works

DarkShield leverages data classification dialogs and dark data discovery wizards in the free IRI Workbench IDE, built on Eclipse™, to catalog the data you care about, and to configure search and masking specifications in metadata files that are easy to share, secure, and modify. At runtime, the saved configurations can be launched from Workbench, or from any application via the CLI or RPC API.

You can run DarkShield just to search for hidden values and report on their locations and attendant file metadata. Or you can run it with remediation enabled, to obfuscate personally identifiable information (PII) for compliance with data privacy laws using a variety of masking functions. Your search and mask operations can run separately or simultaneously.

Large jobs can be load balanced through the API using an NGINX reverse proxy. Images can be pre-processed to improve scanning accuracy.

For optimal security and control, DarkShield runs on-premise by default (though it can be installed in containers or cloud VMs that you control; we do not receive or host any data). You can also use NGINX to authenticate users, and a key vault such as Azure Key Vault to manage encryption keys, so that different users can have different access to data restoration.

The sections below describe each operation in turn: Search, Extract, Mask, and Audit.


Run multi-threaded searches through dark data repositories system- or LAN-wide (via SMB), and in Amazon S3, GCP, Azure BLOB, and SharePoint stores, to find the data you're concerned about or the specific values you're looking for. Many other cloud, application, and proprietary platform connectors (e.g., Kafka, Facebook, Google, MINA, and JPA) are or can be supported.

Define your data classes and masking rules, and match them with six different search techniques:

  1. CSV, DB, JSON, XML, or Excel column/path filters
  2. RegEx pattern matching (with off-the-shelf or customizable computational validation)
  3. Exact or fuzzy matches to values in dictionary / lookup (set) files
  4. ML-facilitated named entity recognition (NER) models using OpenNLP, TensorFlow, or PyTorch
  5. Bounding boxes drawn around fixed areas of images
  6. Facial detection and recognition (module on request)
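
To make techniques 2 and 3 concrete, here is a minimal Python sketch. It is not DarkShield code and does not use its API. The card-number pattern, the choice of the Luhn checksum as the "computational validation," the function names, and the 0.8 similarity cutoff are all assumptions made for illustration.

```python
import re
from difflib import get_close_matches

# Hypothetical pattern: 13-19 digits, optionally separated by spaces or hyphens.
CARD_PATTERN = re.compile(r"\b\d(?:[ -]?\d){12,18}\b")


def luhn_valid(digits: str) -> bool:
    """Return True if the digit string passes the Luhn checksum."""
    total = 0
    for i, ch in enumerate(reversed(digits)):
        d = int(ch)
        if i % 2 == 1:  # double every second digit, counting from the right
            d *= 2
            if d > 9:
                d -= 9
        total += d
    return total % 10 == 0


def find_card_numbers(text: str) -> list[str]:
    """The regex proposes candidates; the checksum discards look-alikes."""
    hits = []
    for match in CARD_PATTERN.finditer(text):
        digits = re.sub(r"[ -]", "", match.group())
        if luhn_valid(digits):
            hits.append(match.group())
    return hits


def fuzzy_lookup(token: str, dictionary: list[str], cutoff: float = 0.8) -> list[str]:
    """Return dictionary (set file) entries that closely resemble the token."""
    return get_close_matches(token, dictionary, n=3, cutoff=cutoff)
```

Pairing a pattern with a validation step reduces false positives: a 16-digit order number matches the pattern but usually fails the checksum. The fuzzy lookup catches misspelled values that an exact dictionary match would miss.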

You can reuse and share your data classes, search criteria, set files, and rule matchers in project or cloud repositories. And, because DarkShield runs in IRI Workbench alongside other IRI and Eclipse tools, you can do many other things with your DarkShield search results; see Extract next.

Dark Data Discovery Wizard


Generate the results of your search in a flat file that also contains forensically useful metadata for each file containing the values you searched for. The search report can be used for e-discovery, for delivery to EU citizens requesting "data portability," or as proof of deletion when you grant their "right to be forgotten" from these repositories.
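
As a rough picture of what such a flat-file report might contain, the Python sketch below scans text files for email addresses and writes one CSV row per match, together with basic file metadata. It does not reproduce DarkShield's report layout. The column names, the email pattern, and the function name `write_search_report` are hypothetical.

```python
import csv
import os
import re
from datetime import datetime, timezone

# Hypothetical search pattern: a simple email matcher.
EMAIL = re.compile(r"[A-Za-z0-9._%+-]+@[A-Za-z0-9.-]+\.[A-Za-z]{2,}")


def write_search_report(root: str, report_path: str) -> int:
    """Scan text files under root; write one CSV row per match. Returns the row count."""
    rows = 0
    with open(report_path, "w", newline="", encoding="utf-8") as out:
        writer = csv.writer(out)
        # Assumed columns, not DarkShield's actual report schema.
        writer.writerow(["file", "size_bytes", "modified_utc", "offset", "match"])
        for folder, _dirs, files in os.walk(root):
            for name in sorted(files):
                path = os.path.join(folder, name)
                if os.path.abspath(path) == os.path.abspath(report_path):
                    continue  # never scan the report itself
                info = os.stat(path)
                modified = datetime.fromtimestamp(info.st_mtime, tz=timezone.utc)
                with open(path, encoding="utf-8", errors="ignore") as src:
                    text = src.read()
                for m in EMAIL.finditer(text):
                    writer.writerow(
                        [path, info.st_size, modified.isoformat(), m.start(), m.group()]
                    )
                    rows += 1
    return rows
```

Recording the file path, size, timestamp, and match offset alongside each value is what lets a report double as evidence for portability or deletion requests.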

If you license DarkShield as part of an IRI Voracity data management platform subscription, you can further manipulate and manage this data in ETL, analytic, and notification workflows, typically without seeing the PII result (shown optionally below):

DarkShield Extraction


Apply width-preserving or other static data masking functions to de-identify sensitive information and comply with data privacy laws:

  1. Encryption (format-preserving or not)
  2. Lookup pseudonymization
  3. Redaction / obfuscation
  4. String manipulation
  5. Randomization
  6. Bit scrambling
  7. Synthesis
  8. Encoding
  9. Deletion
  10. Hashing
  11. Blurring

The masked files are visually identical to their unmasked counterparts, except for the masked strings. You can also write output to same-named files in cloned directory trees to ease the reconciliation process.
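
For a sense of how three of these function types behave, here is a minimal Python sketch of width-preserving redaction, keyed hashing, and lookup pseudonymization. These are generic illustrations, not DarkShield's implementations. The function names, the HMAC-SHA-256 construction, and the replacement list are assumptions, and this pseudonymization is one-way, which may differ from the product's.

```python
import hashlib
import hmac


def redact(value: str, mask_char: str = "*", keep_last: int = 0) -> str:
    """Width-preserving redaction: mask characters but keep the original length."""
    keep = min(max(keep_last, 0), len(value))
    hidden = len(value) - keep
    return mask_char * hidden + value[hidden:]


def hash_value(value: str, key: bytes) -> str:
    """Keyed one-way hash (HMAC-SHA-256) of a sensitive value, as hex."""
    return hmac.new(key, value.encode("utf-8"), hashlib.sha256).hexdigest()


def pseudonymize(value: str, replacements: list[str], key: bytes) -> str:
    """Lookup pseudonymization: the same input always maps to the same substitute."""
    digest = hmac.new(key, value.encode("utf-8"), hashlib.sha256).digest()
    index = int.from_bytes(digest[:8], "big") % len(replacements)
    return replacements[index]
```

Because each function is deterministic for a given key, the same source value is masked the same way wherever it appears. Consistent masking of this kind keeps values aligned across files and databases.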

Masking jobs are easy to modify and schedule. Subsequent search/mask operations will automatically cover new files in the source folders as well as those updated since the last search.
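
One simple way to picture this incremental coverage is a filter on file modification time, as in the Python sketch below. How DarkShield actually detects new and updated files is not stated here. The modification-time comparison and the function name `files_to_scan` are assumptions.

```python
import os


def files_to_scan(root: str, last_run: float) -> list[str]:
    """Return paths under root created or modified after last_run (epoch seconds)."""
    selected = []
    for folder, _dirs, files in os.walk(root):
        for name in files:
            path = os.path.join(folder, name)
            # Assumption: a newer modification time marks a new or updated file.
            if os.stat(path).st_mtime > last_run:
                selected.append(path)
    return sorted(selected)
```

A scheduler would store the start time of each run and pass it as `last_run` on the next one, so unchanged files are skipped.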

DarkShield redaction samples


As DarkShield runs, it reports overall job status in a real-time progress bar. When each job completes, DarkShield generates a report of the values it found, along with the accompanying file metadata you wanted to see.

If you told DarkShield to mask, it also reports on the files that were masked and those that were not completely masked. Of course, all the search and masking job configuration details, including data classes and rule matchers, are saved and available for inspection locally or in secure repositories.

Easily query, analyze, and format the results of your search and mask operations through built-in reporting and visualization functionality. After DarkShield runs, right-click on the results file to display information about the searching and masking operations. Where data could not be masked after an earlier search, you'll know, and can look at the DarkShield error log and data model to learn why and solve the problem.

DarkShield redaction report

Alternatively, you can send DarkShield log data directly into:

  1. A SIEM/SOC tool (see the Splunk ES example below) for custom display or alert requirements
  2. Custom 2D reports built with the CoSort SortCL data mapping program in Voracity; DarkShield creates metadata for SortCL use in custom log query and reporting operations
  3. Another cloud dashboard for BI needs, or KNIME for analytic needs, both in the same Eclipse UI

Splunk DarkShield redaction report

What DarkShield Supports

File Formats

Text              Documents           Images
.asc              .doc/x              .bmp
.eml & .html      .ppt/x              .gif
.hl7 & .x12       .xls/x              .jpg/x/2
.json & .xml      .pdf                .png
.txt              .rtf (scan only)    .tif/f
.log              Parquet             DICOM

Data Silos & Databases*

LAN, Related             Amazon        More Clouds/Apps       Additional Sources
Local & SMB              CloudWatch    Box & Salesforce       Couchbase, Redis, Solr
FTP/HTTP/MINA            DynamoDB      Elasticsearch          Cassandra, CosmosDB, MongoDB
Azure BLOB               RDS           Facebook & LinkedIn    Google BigTable & HBASE
GCP Storage              Redshift      Google Apps            JDBC (RDBs) & JPA
SharePoint / OneDrive    S3 Buckets    jclouds                Kafka & MQTT

* DarkShield supports files accessible directly in local or SMB-compatible LAN systems; cloud-mounted drives like Dropbox and OneDrive; Google and Azure cloud storage and Amazon S3 buckets; plus RDB tables and NoSQL collections / clusters. The other connection protocols listed above, along with several others, can be developed.

Please email darkshield@iri.com about your use case, or complete the information request form on this page.
