PII Redaction Tools

 


Challenges


Personally identifiable information (PII) that is not needed for downstream use or recovery should be redacted (covered over with masking characters, or removed) to prevent misuse. Meanwhile, other data elements or selected parts of the PII value itself may still need to be exposed.

At the same time, the masking should preserve the original storage format and overall field appearance so that the platform (e.g., DB table) structure or application need not be altered. Masking all but the last four digits of a credit card or social security number with asterisks is a common data redaction requirement.
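The last-four requirement described above can be sketched in a few lines of Python. This is only an illustration of the technique, not IRI code: the function name `mask_all_but_last` and its parameters are hypothetical. It masks letters and digits but leaves separators such as dashes in place, so the field keeps its original length and appearance.

```python
def mask_all_but_last(value: str, keep: int = 4, mask_char: str = "*") -> str:
    """Mask every letter or digit except the last `keep`, leaving separators alone."""
    # Count the characters eligible for masking (letters and digits only).
    total = sum(ch.isalnum() for ch in value)
    to_mask = max(total - keep, 0)

    out = []
    for ch in value:
        if ch.isalnum() and to_mask > 0:
            out.append(mask_char)   # cover this character
            to_mask -= 1
        else:
            out.append(ch)          # keep separators and the trailing characters
    return "".join(out)


print(mask_all_but_last("4111-1111-1111-1234"))  # ****-****-****-1234
print(mask_all_but_last("123-45-6789"))          # ***-**-6789
```

Because the output has the same length and separator layout as the input, a column defined for the original format can store the redacted value without a schema change.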

While DBAs can remove columns in tables, there are few alternatives for masking data in different ways across multiple databases and file sources.

Solutions


To redact data at rest, use the built-in redaction functions in PII masking software from IRI, such as FieldShield, CellShield, DarkShield, or Voracity.

To redact data in motion (dynamic data masking), use the replace_char API function in the FieldShield SDK.

In each case, you can use an IRI data masking tool to classify (locate) and redact a specified number of bytes, defined (sub-)strings, entire fields, or one or more rows. Choose the type and start/stop locations of the masking characters. Determine whether to apply the redaction based on column names or data classes, pattern matches, field value conditions, NER models, etc.
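To make those choices concrete, here is a minimal Python sketch of position-based redaction triggered by a pattern match. It does not use or depict any IRI API. The names `SSN_PATTERN`, `redact_span`, and `redact_matches` are hypothetical, and the regular expression is a simplified assumption about what a US SSN looks like.

```python
import re

# Hypothetical, simplified pattern for SSN-like values (assumption for illustration).
SSN_PATTERN = re.compile(r"\b\d{3}-\d{2}-\d{4}\b")


def redact_span(value: str, start: int, stop: int, mask_char: str = "*") -> str:
    """Replace the characters in value[start:stop] with mask_char, keeping the length."""
    start = max(start, 0)
    stop = min(stop, len(value))
    if start >= stop:
        return value  # nothing to redact in this range
    return value[:start] + mask_char * (stop - start) + value[stop:]


def redact_matches(text: str, pattern: re.Pattern, start: int, stop: int,
                   mask_char: str = "*") -> str:
    """Apply redact_span to every substring of text that matches pattern."""
    return pattern.sub(
        lambda m: redact_span(m.group(0), start, stop, mask_char), text
    )


print(redact_matches("SSN 123-45-6789 on file", SSN_PATTERN, 0, 6))
# SSN ******-6789 on file
```

Here the pattern decides whether redaction applies, while `start`, `stop`, and `mask_char` decide where the mask goes and what it looks like. A condition based on column names or field values could call `redact_span` the same way.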

Character masking is just one of the non-reversible protection functions provided in IRI software. Others include randomization, external source pseudonymization, and possibly hashing. Reversible protections include encryption, encoding, a proprietary ASCII de-ID function, same-source pseudonymization, expression logic, and string functions.
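The difference between non-reversible and reversible protection can be seen in a small, generic Python sketch. It uses only the standard library and says nothing about how IRI implements these functions. The sample value is made up.

```python
import base64
import hashlib

value = "123-45-6789"  # made-up sample value

# Non-reversible: a SHA-256 digest cannot be inverted to recover the input.
# (Short, guessable inputs can still be brute-forced if no salt or key is used.)
digest = hashlib.sha256(value.encode("utf-8")).hexdigest()

# Reversible: Base64 encoding can be decoded by anyone, with no key required.
encoded = base64.b64encode(value.encode("utf-8")).decode("ascii")
decoded = base64.b64decode(encoded).decode("utf-8")

print(len(digest))        # 64 hex characters
print(decoded == value)   # True
```

Character masking belongs in the first group: once the covered characters are overwritten, nothing in the stored value allows them to be recovered.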


SortCL users have the additional option to transform and report on data as they redact it in whole or part.

All IRI software is supported in a free, familiar GUI built on Eclipse called IRI Workbench, which exposes and profiles multiple data sources, helps configure masking jobs, and manages the projects for team-sharing and compliance verification.

Frequently Asked Questions (FAQs)

1. What is data redaction and how does it differ from other masking methods?
Data redaction involves masking or removing parts of personally identifiable information (PII) that are not required for downstream processing. Unlike encryption or pseudonymization, redaction is typically irreversible and intended to permanently hide sensitive components.
2. How can I redact only part of a sensitive value, like a credit card or SSN?
IRI tools let you define custom redaction rules that preserve certain digits, such as the last four digits of a credit card, while masking the rest with asterisks or other characters. This keeps the data useful while protecting privacy.
3. What data sources can I redact using IRI software?
You can redact PII in relational databases, flat files, Excel spreadsheets, ASN.1 CDR files, structured and unstructured documents (like .pdf, .docx, .txt), and even big data sources such as HDFS and NoSQL using IRI FieldShield, CellShield, DarkShield, and Voracity.
4. How does IRI handle redaction for unstructured data?
IRI DarkShield uses pattern matching, NLP models, and dictionary lookups to identify and redact sensitive data in unstructured files such as PDFs, emails, Word documents, and HTML files.
5. What customization options are available for redaction?
IRI software allows users to define the type of masking character, the range of characters to redact, and conditions under which redaction should occur. You can apply rules based on column names, patterns, NER tags, or even specific value matches.
6. Can I apply redaction rules across multiple databases or file formats?
Yes. IRI supports consistent rule application across different sources using reusable job scripts or rule files, helping enforce uniform redaction policies across the organization.
7. What are the advantages of redacting data instead of deleting it?
Redaction preserves the format and structure of the data, which maintains compatibility with database schemas and applications. It also allows non-sensitive parts of the data to remain useful for analytics or reporting.
8. Can I combine redaction with other data masking methods?
Yes. Redaction can be used alongside encryption, pseudonymization, or other obfuscation techniques within the same IRI job, giving you flexible and layered data protection.
9. How does IRI support redaction in Excel?
IRI CellShield provides built-in redaction functions for Microsoft Excel that allow users to mask full or partial values directly within their spreadsheets, using simple dialog-based controls.
10. Can I redact data during transformation or migration processes?
Yes. With IRI Voracity or the SortCL program, you can redact fields as part of ETL, data migration, or reporting jobs, with no separate masking step required.
11. What options exist for redacting national ID numbers or SSNs?
IRI FieldShield provides specific masking dialogs for SSNs, national ID numbers, and other common formats. These allow you to apply standard or custom redaction schemes easily across different datasets.
12. How does redaction help with compliance?
By irreversibly masking PII, redaction helps meet the data protection requirements of regulations and standards such as GDPR, HIPAA, and PCI DSS, especially when data is shared externally or used for reporting.