PII Classification in DarkShield

 

The integrated data classification infrastructure in IRI Workbench unifies the definition, discovery, and masking of specific kinds, or classes, of data regardless of their location and formatting. Specifying masking jobs for DarkShield, FieldShield, and CellShield Enterprise Edition in Workbench begins with this classification step.

Classes of sensitive data (PII, PHI, PI, CSI, etc.) that may need to be found and masked include:

  • email, street and IP addresses
  • credit card, ID, VIN, and account numbers
  • home, office, or cell phone numbers
  • first, middle and/or last names
  • company names and/or locations
  • birth, death, admit, discharge, or service dates
  • medical conditions or treatments

Values for these data classes can reside anywhere in the file, document, image, and database (relational and NoSQL) sources that DarkShield supports, across multiple local and cloud silos. As you name and describe these data classes for centralized reference, you also associate each class with one or more search methods and a masking rule (function), so that its values can be found anywhere and remediated consistently.
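
The association described above (one named class, one or more search methods, one masking rule) can be pictured with a small Python sketch. This is only an illustration of the concept, not DarkShield's data class format or API. The names `SensitiveDataClass`, `redact`, and `mask_text`, the email pattern, and the sample values are all assumptions made for the example.

```python
import re
from dataclasses import dataclass, field
from typing import Callable, List, Pattern, Set


def redact(value: str) -> str:
    """Hypothetical default masking rule: replace every character with '*'."""
    return "*" * len(value)


@dataclass
class SensitiveDataClass:
    """Illustrative stand-in for a data class: name, search methods, masking rule."""
    name: str
    patterns: List[Pattern[str]] = field(default_factory=list)  # pattern matchers
    lookup_values: Set[str] = field(default_factory=set)        # set-file style literals
    mask: Callable[[str], str] = redact                         # masking rule (function)

    def find(self, text: str) -> List[str]:
        """Return every value in text matched by a pattern or a lookup value."""
        hits = [m.group(0) for p in self.patterns for m in p.finditer(text)]
        hits += [v for v in sorted(self.lookup_values) if v in text]
        return hits


def mask_text(text: str, classes: List[SensitiveDataClass]) -> str:
    """Apply each class's own masking rule to every value that class finds."""
    for data_class in classes:
        for value in data_class.find(text):
            text = text.replace(value, data_class.mask(value))
    return text


# Two example classes: one found by pattern, one found by lookup values.
EMAIL = SensitiveDataClass(
    name="EMAIL",
    patterns=[re.compile(r"[\w.+-]+@[\w-]+(?:\.[\w-]+)+")],
)
NAMES = SensitiveDataClass(name="NAMES", lookup_values={"John Smith"})

masked = mask_text("John Smith <jsmith@example.com>", [EMAIL, NAMES])
```

Because the search methods and the masking rule travel with the class definition, the same `EMAIL` and `NAMES` objects could be reused against any number of texts without being redefined.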

During data discovery (covered under PII Discovery), DarkShield searches for every instance of your defined data class values and logs their locations. You can report on that location information or use it in simultaneous or subsequent data masking operations.
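
As a rough picture of how logged locations can drive a later masking pass, the sketch below records each match as a source name plus character offsets, then masks from those records alone. It assumes plain-text sources and regex matchers only. The `Finding` record, the function names, and the `PHONE_US` pattern are hypothetical and do not reflect DarkShield's actual log layout.

```python
import re
from typing import Dict, List, NamedTuple, Pattern


class Finding(NamedTuple):
    """One logged match: where it was found and which class it belongs to."""
    source: str
    data_class: str
    start: int
    end: int
    value: str


def discover(sources: Dict[str, str], matchers: Dict[str, Pattern[str]]) -> List[Finding]:
    """Search every source with every matcher and record each match location."""
    findings = []
    for source, text in sources.items():
        for class_name, pattern in matchers.items():
            for m in pattern.finditer(text):
                findings.append(Finding(source, class_name, m.start(), m.end(), m.group(0)))
    return findings


def mask_from_findings(text: str, findings: List[Finding], mask_char: str = "*") -> str:
    """Mask a text using previously logged offsets instead of searching again."""
    # Work right to left so that earlier offsets remain valid after each edit.
    for f in sorted(findings, key=lambda f: f.start, reverse=True):
        text = text[:f.start] + mask_char * (f.end - f.start) + text[f.end:]
    return text


sources = {"note.txt": "Call 390-551-2389 today"}
matchers = {"PHONE_US": re.compile(r"\b\d{3}-\d{3}-\d{4}\b")}

findings = discover(sources, matchers)
masked_note = mask_from_findings(sources["note.txt"], findings)
```

Separating `discover` from `mask_from_findings` mirrors the option of reviewing or reporting on the findings first and masking afterward.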

If you apply a deterministic masking function, such as encryption (including format-preserving encryption) or recoverable pseudonymization, to a data class, its values are masked the same way in every source in which they are discovered. For example, every instance of the phone number 390-551-2389 would be encrypted to 108-462-3417, and every occurrence of John Smith would be pseudonymized to Harv Jones, in every target. This preserves data and referential integrity enterprise-wide.
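
The property that matters here is determinism: the same input always yields the same output, whichever source it came from. The sketch below illustrates that property with a keyed hash (HMAC-SHA256) from the Python standard library. It is not format-preserving encryption and, unlike the recoverable functions mentioned above, it cannot be reversed. The key, the alias list, and the function names are assumptions made for the example.

```python
import hashlib
import hmac

SECRET_KEY = b"demo-key-not-for-production"  # assumption: one key shared by all jobs
ALIASES = ["Harv Jones", "Ana Ruiz", "Lee Wong", "Kofi Mensah"]  # hypothetical alias set


def _digest(value: str) -> bytes:
    """Keyed hash of the input; identical inputs give identical digests."""
    return hmac.new(SECRET_KEY, value.encode("utf-8"), hashlib.sha256).digest()


def pseudonymize_name(name: str) -> str:
    """Map a name to one alias, chosen deterministically from the keyed hash."""
    index = int.from_bytes(_digest(name)[:8], "big") % len(ALIASES)
    return ALIASES[index]


def mask_digits(value: str) -> str:
    """Replace each ASCII digit with a hash-derived digit; keep all other characters."""
    stream = _digest(value)
    out = []
    used = 0
    for ch in value:
        if ch.isascii() and ch.isdigit():
            out.append(str(stream[used % len(stream)] % 10))
            used += 1
        else:
            out.append(ch)
    return "".join(out)


# The same value masked in two different "sources" produces the same result.
phone_in_file = mask_digits("390-551-2389")
phone_in_table = mask_digits("390-551-2389")
alias = pseudonymize_name("John Smith")
```

Because both functions depend only on the input value and the key, a join between a masked file and a masked table on the phone number would still match, which is the referential integrity the paragraph above describes.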

Once you name and associate these data classes with search methods and masking functions (rules), you do not have to do so again. IRI ships DarkShield with several of these classes predefined (with default patterns and set file values to match against), but you can modify them or add your own data class definitions and search/mask associations.

Frequently Asked Questions (FAQs)

1. What is PII classification in IRI DarkShield?
PII classification in DarkShield is the process of defining specific categories of sensitive data such as names, addresses, credit card numbers, medical details, and other personally identifiable information. These classes are then associated with search methods and masking functions to ensure they can be discovered and consistently protected across all sources.
2. How does data classification improve data masking?
By defining data classes and linking them to masking functions, DarkShield applies consistent remediation across all files, databases, and cloud storage locations. When the masking function is deterministic, sensitive values like phone numbers or IDs are always masked in the same way, preserving data integrity and supporting compliance.
3. What kinds of data can be classified in DarkShield?
You can classify a wide range of sensitive data, including email addresses, physical and IP addresses, credit card and ID numbers, phone numbers, personal names, company names, dates of birth or admission, and medical conditions or treatments. These classes apply to structured, semi-structured, and unstructured data sources.
4. How does DarkShield find and log classified data?
Once data classes are defined, DarkShield searches for every instance of those values using pattern matching, lookup sets, AI models, and other search matchers. It logs their locations in detail, which you can review, report on, or use for simultaneous or subsequent masking jobs.
5. How does deterministic masking work in DarkShield?
Deterministic data masking ensures that the same input value is always replaced with the same masked output value across all sources. For example, every instance of a phone number can be encrypted to a consistent format-preserved result, and every occurrence of a name can be pseudonymized to the same alias, maintaining referential integrity.
6. Can I customize data classes in DarkShield?
Yes. While DarkShield includes pre-defined data classes with default patterns and lookup values, you can edit them or create entirely new ones. This allows you to adapt the classification process to meet the unique data privacy and compliance needs of your organization.
7. How do data classes simplify future discovery and masking operations?
Once you define and associate data classes with search and masking rules, they are saved for reuse in future jobs. Data class and rule library information is stored in files that you can also secure and share, for example through Git. This eliminates the need to reconfigure search criteria for every new project and ensures consistency across ongoing data governance efforts.
8. Can DarkShield apply different masking rules to different data classes?
Yes. Each data class can be assigned its own masking function based on your security and compliance requirements. For example, you may encrypt credit card numbers, redact addresses, and pseudonymize names while leaving other non-sensitive fields clear.
9. How does centralized classification support enterprise-wide compliance?
Centralized classification enables uniform policies across multiple systems and silos. This ensures that PII is discovered and remediated consistently regardless of where it resides, making compliance audits more reliable and easier to document.
10. What about data class groups, privacy laws and sensitivity levels?
The IRI Data Class and Rules Library lets you organize one or more data classes into data class groups, and associate both classes and groups with predefined (out-of-the-box) or custom categories tied to data privacy laws like the GDPR and HIPAA. You can also tag data class groups with sensitivity levels (e.g., public, restricted), so that when the same data class is defined in more than one group, the higher level receives the stronger masking function.
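
The sensitivity-level behavior in the last answer can be sketched as a simple resolution step: when a data class appears in more than one group, the rule from the most sensitive group is used. The Python below is a minimal illustration under stated assumptions, not the library's actual resolution logic. The `Group` record, the numeric level ordering, and the group, class, and rule names are all hypothetical.

```python
from typing import Dict, List, NamedTuple

# Assumed ordering of the two example levels named in the text.
LEVELS = {"public": 0, "restricted": 1}


class Group(NamedTuple):
    """Illustrative data class group with a sensitivity tag and a masking rule name."""
    name: str
    sensitivity: str
    rule: str
    classes: List[str]


def resolve_rules(groups: List[Group]) -> Dict[str, str]:
    """For each data class, keep the rule of the most sensitive group containing it."""
    chosen: Dict[str, Group] = {}
    for group in groups:
        for data_class in group.classes:
            current = chosen.get(data_class)
            if current is None or LEVELS[group.sensitivity] > LEVELS[current.sensitivity]:
                chosen[data_class] = group
    return {data_class: group.rule for data_class, group in chosen.items()}


groups = [
    Group("MARKETING", "public", "partial_redaction", ["EMAIL", "NAMES"]),
    Group("HIPAA_PHI", "restricted", "encryption", ["NAMES", "DIAGNOSIS"]),
]

rules = resolve_rules(groups)
```

Here `NAMES` belongs to both groups, so it takes the rule of the restricted group, while `EMAIL` keeps the rule of the only group that contains it.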