Finding & Masking PII in Files

 


The New File Search/Mask Job wizard in IRI Workbench is the first option in the DarkShield V5 toolbar menu. As its name implies, it takes you through the steps to configure PII search and/or masking operations for files on-premise or in the cloud. It uses the search methods and masking functions you specified previously through data classification.

At runtime, the DarkShield file wizard in Workbench leverages the back-end DarkShield API for Files (introduced in DarkShield V4) to find and protect data in:

  • Text files (raw or structured)
  • JSON and XML files
  • HL7, X12 and FHIR EDI formats
  • MS Word (.DOC/X) and .PDF documents
  • MS Excel (.XLS/X) and PowerPoint (.PPT/X)
  • Parquet files
  • .BMP, DICOM, .GIF, .JPG, .PNG, and .TIF/F images, standalone or embedded in the above

You can specify the location of your file sources and masked targets in any combination of SMB LAN (local network) drives and accessible folders, OneDrive and SharePoint Online, as well as buckets in Amazon S3, Azure Blob, and Google Cloud Platform storage.



The DarkShield file wizard also supports options for filtering searches to save time, and for configuring advanced masking settings for specific file types.


Frequently Asked Questions (FAQs)

1. What is the DarkShield File Masking Job Wizard?
The DarkShield File Masking Job Wizard is a guided tool in IRI Workbench that walks you through the process of configuring PII search and masking operations for files stored locally, across networks, or in the cloud. It uses the search methods and masking rules defined during data classification to ensure consistent remediation.
2. How does the File Masking Wizard simplify data protection?
The wizard automates job creation by letting you select file sources, specify target locations, and apply predefined masking functions without manual scripting. This reduces setup time and helps ensure accuracy in large-scale or repetitive masking projects.
3. What file types can DarkShield search and mask?
DarkShield supports a wide range of file formats, including raw text and log files, JSON, XML, HL7, X12, and FHIR EDI, Word documents, PDFs, Excel sheets, PowerPoint slides, .SQL, audio and Parquet files, and image formats such as BMP, DICOM, GIF, JPG, PNG, and TIF/F.
4. Can DarkShield process both on-premise and cloud files?
Yes. You can configure the wizard to search and mask files on SMB LAN drives, local directories, OneDrive, SharePoint Online, and/or major cloud storage platforms including Amazon S3, Azure Blob, and Google Cloud buckets.
5. How does DarkShield maintain consistency across multiple files and formats?
DarkShield uses the data classes and associated deterministic data masking functions you defined earlier to ensure the same value is masked consistently across all files, regardless of format or location. This approach preserves referential integrity across enterprise data sources: files and databases, on-premise and in the cloud.
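To illustrate why a deterministic function keeps values consistent across formats, here is a minimal Python sketch. It is not DarkShield's implementation or API; the `pseudonymize` function, the `SECRET_KEY` value, and the `EMAIL` data class label are hypothetical names chosen for this example, and keyed hashing (HMAC-SHA256) stands in for whatever deterministic masking function you assigned to the data class.

```python
import hashlib
import hmac

# Hypothetical key for illustration only; a real key would come from a
# secrets store, not from source code.
SECRET_KEY = b"example-key-change-me"


def pseudonymize(value: str, data_class: str, key: bytes = SECRET_KEY) -> str:
    """Return a stable token for `value` within the given data class.

    The same (key, data_class, value) always yields the same token,
    which is the property that preserves referential integrity.
    """
    message = f"{data_class}:{value}".encode("utf-8")
    digest = hmac.new(key, message, hashlib.sha256).hexdigest()
    return f"{data_class}_{digest[:12]}"


# The same email appears in a JSON record and in a CSV row.
json_record = {"email": "jane.doe@example.com"}
csv_row = ["1001", "jane.doe@example.com"]

masked_json = pseudonymize(json_record["email"], "EMAIL")
masked_csv = pseudonymize(csv_row[1], "EMAIL")

# Both sources receive the identical replacement value.
assert masked_json == masked_csv
```

Because the output depends only on the key, the data class, and the input value, a join on the masked email still matches across the two sources.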
6. Can I filter searches to improve performance?
Yes. The DarkShield File Masking Wizard allows you to set filters that limit searches to specific file types, directories, or content patterns, which speeds up job execution and reduces unnecessary processing.
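The effect of such filters can be sketched in a few lines of Python. This is only an illustration of the idea, not how the wizard or the DarkShield API applies filters; `select_files` and `SSN_LIKE` are hypothetical names, and the pattern is a simplified stand-in for a real search matcher.

```python
import re
from pathlib import Path
from typing import Iterable, Iterator, Optional

# Simplified, illustrative pattern for values shaped like a US SSN.
SSN_LIKE = re.compile(r"\b\d{3}-\d{2}-\d{4}\b")


def select_files(
    root: Path,
    extensions: Iterable[str],
    content_pattern: Optional[re.Pattern] = None,
) -> Iterator[Path]:
    """Yield files under `root` that pass the extension and content filters."""
    wanted = {ext.lower() for ext in extensions}
    for path in sorted(root.rglob("*")):
        # Cheap checks first: skip directories and unwanted file types.
        if not path.is_file() or path.suffix.lower() not in wanted:
            continue
        # Optional content filter: only keep files that contain a match.
        if content_pattern is not None:
            text = path.read_text(encoding="utf-8", errors="ignore")
            if not content_pattern.search(text):
                continue
        yield path
```

For example, `list(select_files(Path("exports"), [".txt", ".log"], SSN_LIKE))` would return only the text and log files under a hypothetical `exports` folder that contain an SSN-shaped value, so later search and mask steps touch fewer files.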
7. How does DarkShield handle images and embedded PII?
DarkShield can find and mask PII within standalone image files or images embedded in documents. This includes support for formats like DICOM used in healthcare, making it a strong solution for compliance with HIPAA and other regulations.
8. Can I customize masking behavior for different file types?
Yes. The wizard supports advanced configuration options for file-type-specific masking (e.g., PDFs and JSON), allowing you to control how PII is obfuscated depending on the format or compliance requirements.
9. How do search and mask operations run after configuration?
Once configured, jobs created in the wizard can be run immediately, scheduled for later execution, or automated within CI/CD pipelines. The jobs leverage the DarkShield API for Files to execute efficiently at runtime.
10. Can results from file masking jobs be audited?
Yes. The wizard’s jobs produce logs that capture which files were scanned, which PII values were found, and what masking actions were applied. These logs can be used to verify compliance and support audit requirements.