Define, Discover, and De-Identify PII in Dark Data
With IRI DarkShield, you can classify, find, and erase or otherwise mask sensitive information in multiple structured, semi-structured, and unstructured sources, including text, PDF, and MS Office documents; Parquet and image files; relational tables and NoSQL collections; and even faces.
DarkShield uses shared data classes, custom search combinations, and consistent masking functions across on-premise and cloud sources. With DarkShield, you can also extract, share, and display job results (and attendant file metadata) in its Eclipse environment or in your SIEM environment.
With DarkShield, you can comply with Right to Be Forgotten requests, deliver specific data extracts to those requesting record portability, and facilitate data quality in data rectification requests. You can also save the masked output either to the original files or to same-named target files and folders on your network or in the cloud.
Examine the functions and formats available in DarkShield on this site. Then, arrange a free online demo to see how DarkShield can work in your environment, and to get answers to your questions.
How DarkShield Works
DarkShield leverages data classification dialogues and dark data discovery wizards in the free IRI Workbench IDE, built on Eclipse™, to catalog the data you care about, and to configure search and masking specifications in metadata files that are easy to share, secure, and modify. At runtime, the saved configurations can be launched from Workbench or from any application via CLI or RPC API.
You can run DarkShield just to search for hidden values and report on their locations and attendant file metadata. Or you can run it with remediation enabled, to obfuscate personally identifiable information (PII) for compliance with data privacy laws using a variety of masking functions. Your search and mask operations can run separately or simultaneously.
You can also define data classes for facial detection (e.g., blur all faces) or for facial recognition, to obfuscate only specific faces in your model library.
For optimal security and control, DarkShield runs on-premise, regardless of where the data lives. Each operation is described below.
Run multi-threaded searches through dark data repositories system- or LAN-wide (via SMB), and in Amazon S3, to ensure that data you're concerned about, or values you're specifically looking for, are found. Many other cloud, application, and proprietary platform connectors (e.g., Kafka, Facebook, Google, MINA, JPA, and SharePoint) are or can be supported.
Define your data classes and masking rules, and match them with six different search techniques:
- CSV, DB, JSON, XML, or Excel column/path filters
- RegEx pattern matching (with custom validators)
- Matches to values in a dictionary / lookup (set) file
- Named-entity recognition (NER) NLP models
- Bounding boxes drawn around fixed areas of images
- Facial detection and recognition (module on request)
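As a rough illustration of two of the techniques listed above (pattern matching with a custom validator, and dictionary lookup), here is a minimal Python sketch. It is not DarkShield's implementation or configuration syntax. The card-number pattern, the use of a Luhn checksum as the validator, and the names CARD_PATTERN, luhn_valid, find_cards, and find_dictionary_matches are all assumptions made for the example.

```python
import re

# Hypothetical pattern: 13 to 16 digits, optionally separated by spaces or hyphens.
CARD_PATTERN = re.compile(r"\b\d(?:[ -]?\d){12,15}\b")


def luhn_valid(candidate):
    """Custom validator: True if the digits in candidate pass the Luhn checksum."""
    digits = [int(ch) for ch in candidate if ch.isdigit()]
    total = 0
    for i, d in enumerate(reversed(digits)):
        if i % 2 == 1:  # double every second digit, counting from the right
            d *= 2
            if d > 9:
                d -= 9
        total += d
    return total % 10 == 0


def find_cards(text):
    """Return pattern matches that also pass the validator."""
    return [m.group() for m in CARD_PATTERN.finditer(text) if luhn_valid(m.group())]


def find_dictionary_matches(text, lookup_set):
    """Return words in text whose lowercase form appears in the lookup set."""
    words = re.findall(r"[A-Za-z']+", text)
    return [w for w in words if w.lower() in lookup_set]
```

The validator step is what separates a candidate match from a confirmed one: a string can look like a card number and still fail the checksum, which reduces false positives compared with the pattern alone.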
You can reuse and share your data classes, search criteria, set files, and rule matchers in project or cloud repositories. And, because DarkShield runs in IRI Workbench alongside other IRI and Eclipse tools, you can do many other things with your DarkShield search results; see Extract next.
Generate the results of your search in a flat file that also contains forensically useful metadata attendant to each file containing the values you searched for. The search report can be used for e-discovery, for delivery to EU citizens requesting "data portability," or as proof of deletion when you grant their "right to be forgotten" from these repositories.
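To make the idea of a flat-file search report with per-file metadata concrete, here is a small Python sketch. The column names, the CSV layout, and the functions build_report_row and write_report are hypothetical and chosen only for illustration; they do not describe the actual DarkShield report format.

```python
import csv
import os
from datetime import datetime, timezone

# Hypothetical report columns; a real report may carry different metadata.
REPORT_FIELDS = ["path", "size_bytes", "modified_utc", "data_class", "match_count"]


def build_report_row(path, data_class, match_count):
    """Combine a search result with file system metadata for one file."""
    st = os.stat(path)
    modified = datetime.fromtimestamp(st.st_mtime, tz=timezone.utc)
    return {
        "path": os.path.abspath(path),
        "size_bytes": st.st_size,
        "modified_utc": modified.isoformat(),
        "data_class": data_class,
        "match_count": match_count,
    }


def write_report(rows, stream):
    """Write the rows as a delimited flat file with a header line."""
    writer = csv.DictWriter(stream, fieldnames=REPORT_FIELDS)
    writer.writeheader()
    writer.writerows(rows)
```

A flat, delimited layout like this is easy to hand to downstream tools, which is one reason a search report of this kind can feed e-discovery or portability requests.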
If you license DarkShield as part of an IRI Voracity data management platform subscription, you can further manipulate and manage this data in ETL, analytic, and notification workflows.
To de-identify sensitive information and comply with data privacy laws, apply width-preserving or other static data masking functions, including:
- Encryption (format-preserving or not)
- Lookup pseudonymization
- Redaction / obfuscation
- String manipulation
- Bit scrambling
The masked files are visually identical to their unmasked counterparts, except for the masked strings. You can also write output to same-named files in cloned directory trees to ease the reconciliation process.
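Two of the masking functions named above, redaction and lookup pseudonymization, can be sketched generically in Python as follows. This is an illustration under stated assumptions, not DarkShield's code: the function names redact and pseudonymize, the keyed-hash approach to choosing a replacement, and the sample replacement list are all hypothetical. Format-preserving encryption is deliberately left out because a correct version needs a vetted cryptographic library.

```python
import hashlib
import hmac


def redact(value, keep_last=0, mask_char="*"):
    """Width-preserving redaction: mask every character except the last keep_last."""
    if keep_last <= 0:
        return mask_char * len(value)
    masked_len = max(len(value) - keep_last, 0)
    return mask_char * masked_len + value[-keep_last:]


def pseudonymize(value, replacements, key):
    """Pick a replacement deterministically, so the same input always maps
    to the same pseudonym (assumption: consistency matters more than
    uniqueness here, so two inputs may share a pseudonym)."""
    digest = hmac.new(key, value.encode("utf-8"), hashlib.sha256).digest()
    index = int.from_bytes(digest[:8], "big") % len(replacements)
    return replacements[index]
```

For example, `redact("123-45-6789", keep_last=4)` keeps the string width and exposes only the final four characters, while `pseudonymize` returns the same replacement for a given value and key on every run, which keeps masked values consistent across sources.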
Masking jobs are easy to modify and schedule. Subsequent search/mask operations will automatically cover new files in the source folders as well as those updated since the last search.
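One simple way to picture how a repeat run could limit itself to new and updated files is a modification-time filter, sketched below in Python. How DarkShield actually tracks changes is not stated here; the mtime comparison and the function name files_changed_since are assumptions for illustration only.

```python
import os


def files_changed_since(root, last_run_epoch):
    """Return files under root whose modification time is later than the
    previous run (given as seconds since the epoch). Files created after
    that run also qualify, since their mtime is newer."""
    changed = []
    for dirpath, _dirnames, filenames in os.walk(root):
        for name in filenames:
            path = os.path.join(dirpath, name)
            if os.stat(path).st_mtime > last_run_epoch:
                changed.append(path)
    return sorted(changed)
```

A scheduler would record the start time of each job and pass it as `last_run_epoch` on the next run, so untouched files are skipped.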
As DarkShield runs, it reports overall job status in a real-time progress bar. When each job completes, DarkShield generates a report of the values it found, along with the accompanying file metadata you wanted to see.
If you told DarkShield to mask, it will also report on the files that were masked, and those that were not completely masked. Of course, all the search and masking job configuration details, including data classes and rule matchers, are saved and available for inspection locally or in secure repositories.
Easily query, analyze, and format the results of your search and mask operations through built-in reporting and visualization functionality. After DarkShield runs, right-click on the results file to display information about the searching and masking operations. Where data could not be masked after an earlier search, you'll know, and can look at the DarkShield error log and data model to learn why and solve the problem.
Alternatively, you can send DarkShield log data directly to:
- A SIEM/SOC tool (see the Splunk ES example below) for custom display or alert requirements
- The CoSort SortCL data mapping program in Voracity, for custom 2D reports; DarkShield creates metadata for SortCL use in custom log query and reporting operations
- Another cloud dashboard or KNIME, both in the same Eclipse UI, for BI or analytic needs, respectively
What DarkShield Supports
|.eml & .html||.ppt/x||.gif|
|.hl7 & .x12||.xls/x||.jpg/x/2|
|.json & .xml||.png|
|.txt||.rtf (scan only)||.tif/f|
|LAN, Related||Amazon||More Clouds/Apps||Additional Sources|
|Local & SMB||CloudWatch||Box & SalesForce||Apache CXF & Ignite|
|FTP/HTTP/MINA||Dynamo||Elasticsearch||Cassandra, Couch, MongoDB|
|Dropbox||EC2 & S3||Facebook & LinkedIn||HBASE & HDFS|
|Google Drive||SES & SNS||Google Apps||JDBC (RDBs) & JPA|
|Sharepoint||SQS & SWF||jclouds||Kafka & MQTT|
As of Version 3, DarkShield supports files accessible directly in local or SMB-compatible LAN systems, cloud-mounted drives like Dropbox and OneDrive, and Amazon S3 buckets, plus RDB tables and NoSQL collections / clusters. However, the other connection protocols listed in italics above, along with several others, can be developed. If you need the faces module, let us know.
Please email firstname.lastname@example.org about your use case, or complete the information request form below.