Challenges

While masking data, or producing useful test data, you need output values that look real, but do not reveal personally identifiable information (PII). This is particularly true with the names of people, places, and things.

Encryption, scrambling, redaction, hashing, and many other data obfuscation functions protect data at risk, but do not provide the level of realism certain recipients require. You need an easier way to change the individualizing characteristics of data by substituting a different, but realistic, output value. This is also referred to as data shuffling.

You must also ensure that the real name cannot be readily discovered through reversal or guesswork. And if you want to provide replacement names, or pseudonyms, for people in production or test data environments, the replacements need to remain consistent for referential integrity, and the values need to stay updated as original names come and go.

Solutions

If you work with PII in tables or flat files, use IRI FieldShield -- or the SortCL program in the IRI CoSort product or IRI Voracity platform -- to replace that data with safe but realistic values drawn from DB tables or from external data sets called set files. If you need to do the same with ranges in Excel, use IRI CellShield or IRI DarkShield. They support:

Recoverable Pseudonymization	Specify a lookup set where real and fake names are either pre-associated, or automatically associated at random. Use the restore set to recover the original names.
Unrecoverable Pseudonymization	Randomly select substitute names for the original value from a set file containing real or fake names. This way the original name value has no automatic basis for restoration.
Consistent, Self-Updating Pseudonymization	Choose a hash set rule or palette item in IRI Workbench to keep pseudonyms consistent and current while preserving uniqueness and referential integrity.
Deterministic Pseudonymization	Derive non-unique substitute values from an original value, and fabricate associated PII, using this rule in IRI FieldShield or DarkShield. The function creates a unique composite key, which allows for natural determinism.
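
To make the difference between the recoverable and unrecoverable methods concrete, here is a minimal sketch in plain Python. It is not FieldShield, SortCL, or any IRI syntax; the function names and sample values are hypothetical, and the pseudonym pool is assumed to contain no duplicates so that the lookup can be inverted.

```python
import random

def build_lookup(originals, pseudonym_pool, seed=None):
    """Randomly associate each distinct original name with a distinct pseudonym."""
    rng = random.Random(seed)
    distinct = sorted(set(originals))
    if len(pseudonym_pool) < len(distinct):
        raise ValueError("pseudonym pool is smaller than the number of distinct names")
    chosen = rng.sample(pseudonym_pool, len(distinct))  # no pseudonym is reused
    return dict(zip(distinct, chosen))

def pseudonymize_recoverable(names, lookup):
    """Replace each name through the lookup; the same name always maps the same way."""
    return [lookup[name] for name in names]

def restore(pseudonyms, lookup):
    """Invert the lookup (the 'restore set') to recover the original names."""
    reverse = {fake: real for real, fake in lookup.items()}
    return [reverse[p] for p in pseudonyms]

def pseudonymize_unrecoverable(names, pseudonym_pool, rng=None):
    """Pick a random substitute per value and keep no mapping, so nothing can be restored."""
    rng = rng or random.Random()
    return [rng.choice(pseudonym_pool) for _ in names]
```

In the recoverable case, whoever holds the lookup can reverse the masking, so the lookup itself needs protecting. In the unrecoverable case, repeated occurrences of the same name may receive different substitutes, which is why it does not preserve referential integrity.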

Specify the pseudonym method used in your output fields in simple 4GL job scripts, or use the pseudonymization dialog in the masking rules for FieldShield and DarkShield, in the same Eclipse™ IDE, or in CellShield, which also supports pseudonymous lookup replacements of values in Excel.

Pseudonymization is only one method you can use to shuffle the contents and thereby de-identify information in a record. You can also combine pseudonyms with other field-level data security functions.

Need Test Names?

In addition to pseudonymizing and otherwise masking production data, there is a standalone solution for producing safe, but realistic first and last names of either gender (or other nouns). IRI RowGen uses the same metadata as FieldShield (via CoSort SortCL) to create and format pseudonyms for use as test data values (or in formatted test data targets).

RowGen is especially helpful for providing anonymous, but real-looking, test data when production data is unavailable or insufficient. RowGen builds structurally and referentially correct test data into database, file, and report targets. RowGen is also included in Voracity.

Frequently Asked Questions (FAQs)

1. What is data pseudonymization?

Data pseudonymization is a data masking technique that replaces original values – usually personally identifiable information (PII) such as names or places – with realistic substitutes that retain data utility but prevent identification of the original subject.

2. How does pseudonymization differ from encryption or redaction?

Unlike encryption or redaction, pseudonymization replaces the original value with a consistent, human-readable substitute that looks real. It protects identity without rendering the data unreadable or unusable.

3. What are the main types of pseudonymization supported by IRI?

IRI data masking tools support four types of pseudonymization:

Recoverable – using lookup sets to restore original values if needed
Unrecoverable – values are replaced without a way to reverse them
Self-updating/consistent – maintains referential integrity across datasets over time
Deterministic – uses a unique composite key to fabricate associated data, rather than relying on unique substitute values

4. How can pseudonymization help with data privacy compliance?

Pseudonymization reduces the risk of re-identification, helping organizations meet GDPR, HIPAA, and other privacy regulations by protecting indirect identifiers while preserving data usefulness for testing or analytics.

5. Can I apply pseudonymization to Excel or unstructured data?

Yes. IRI CellShield supports pseudonymization in Excel ranges, while IRI DarkShield supports pseudonymization in Excel, as well as in structured, semi-structured, and unstructured data sources on premises and in the cloud.

6. How do I ensure referential integrity when pseudonymizing data?

Use a consistent IRI pseudonymization method that maps original values to stable replacements using hash sets or palette rules. This ensures that the same input always yields the same pseudonym across records.
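
As a rough illustration of why a hash-based mapping stays consistent, the sketch below derives an index into a pseudonym pool from a keyed hash of the original value. This is a generic technique written in Python, not the internals of IRI's hash set rule; the function name, the key, and the pool are assumptions made for the example.

```python
import hashlib
import hmac

def consistent_pseudonym(value, pool, key):
    """Map a value to a pool entry using a keyed hash, so equal inputs get equal outputs."""
    digest = hmac.new(key, value.encode("utf-8"), hashlib.sha256).digest()
    index = int.from_bytes(digest[:8], "big") % len(pool)
    return pool[index]

# Hypothetical pool and key; a real job would use a much larger set of names.
pool = ["Pat Doe", "Sam Roe", "Lee Poe", "Kim Yoe"]
key = b"demo-key-only"

customers = ["Alice Smith", "Bob Jones"]
orders = ["Bob Jones", "Alice Smith", "Alice Smith"]

masked_customers = [consistent_pseudonym(n, pool, key) for n in customers]
masked_orders = [consistent_pseudonym(n, pool, key) for n in orders]
```

Because no mapping table is stored, new original names are handled automatically, and joins between the two masked tables still line up. Note that two different originals can land on the same pool entry, so this sketch alone does not guarantee unique pseudonyms.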

7. What is a set file in IRI pseudonymization?

A set file in IRI is a list of predefined values—such as names or places—used to substitute real data during pseudonymization. You can use static sets or allow random associations depending on the method chosen.
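
The idea of a set can be sketched in Python as follows. The layouts here are assumptions for illustration only (one value per line for a plain set, and a tab-separated original/replacement pair per line for a lookup set); consult the IRI documentation for the actual set file format. The function names are hypothetical.

```python
import io

def load_set(stream):
    """Read a single-column set: one candidate value per non-empty line."""
    return [line.strip() for line in stream if line.strip()]

def load_lookup_set(stream, delimiter="\t"):
    """Read a two-column set: original value, delimiter, replacement value."""
    lookup = {}
    for line in stream:
        line = line.rstrip("\r\n")
        if not line:
            continue  # skip blank lines
        # A line without the delimiter raises ValueError here.
        original, replacement = line.split(delimiter, 1)
        lookup[original] = replacement
    return lookup

# In-memory stand-ins for files on disk.
names = load_set(io.StringIO("Pat Doe\nSam Roe\n"))
pairs = load_lookup_set(io.StringIO("Alice Smith\tPat Doe\nBob Jones\tSam Roe\n"))
```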

8. Can I generate realistic pseudonyms for testing without using production data?

Yes. Use IRI RowGen to randomly select from sets of realistic first and last names (or other nouns) for testing purposes, eliminating dependency on live data.
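
Random selection from name sets can be pictured with the short Python sketch below. It is not RowGen syntax; the function name and the tiny sample sets are hypothetical, and a real job would draw from much larger sets.

```python
import random

FIRST_NAMES = ["Pat", "Sam", "Lee", "Kim"]   # hypothetical first-name set
LAST_NAMES = ["Doe", "Roe", "Poe", "Yoe"]    # hypothetical last-name set

def generate_test_names(count, first_names=FIRST_NAMES, last_names=LAST_NAMES, seed=None):
    """Build `count` full names by picking a first and last name at random."""
    rng = random.Random(seed)  # a fixed seed makes the test data reproducible
    return [f"{rng.choice(first_names)} {rng.choice(last_names)}" for _ in range(count)]

test_names = generate_test_names(5, seed=42)
```

No production value is read at any point, so the output cannot leak a real person's name unless the sets themselves contain one.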

9. What industries benefit most from pseudonymization?

Any industry that handles sensitive data—such as healthcare, finance, education, and government—can benefit from pseudonymization, especially when sharing data for development, QA, or analytics.

10. Can I combine pseudonymization with other masking methods?

Yes. Pseudonymization can be layered with encryption, redaction, or other field-level data protection techniques within the same job script or GUI wizard, enhancing security while preserving utility.
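
One way to picture layering is a per-field rule table, sketched below in Python. This is not an IRI job script; the rule names, field names, key, and name pool are all hypothetical, and the pseudonym function is a generic keyed-hash lookup rather than any IRI function.

```python
import hashlib
import hmac

NAME_POOL = ["Pat Doe", "Sam Roe", "Lee Poe", "Kim Yoe"]  # hypothetical set values
SECRET_KEY = b"demo-key-only"                             # placeholder, not a real key

def pseudonymize(value, pool=NAME_POOL, key=SECRET_KEY):
    """Pick a consistent, realistic substitute from the pool via a keyed hash."""
    digest = hmac.new(key, value.encode("utf-8"), hashlib.sha256).digest()
    return pool[int.from_bytes(digest[:8], "big") % len(pool)]

def redact(value, keep_last=4, mask_char="*"):
    """Mask every character except the last `keep_last`."""
    cut = max(len(value) - keep_last, 0)
    return mask_char * cut + value[cut:]

# Each field gets its own protection; fields without a rule pass through unchanged.
FIELD_RULES = {"name": pseudonymize, "ssn": redact}

def mask_record(record, rules=FIELD_RULES):
    return {field: rules.get(field, lambda v: v)(value) for field, value in record.items()}

masked = mask_record({"name": "Alice Smith", "ssn": "123-45-6789", "city": "Austin"})
```

Here the name stays readable and realistic while the identifier is partially masked, which mirrors the idea of choosing a different function for each field according to its risk and its use downstream.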

Data Pseudonymization

De-Identifying & Shuffling Names or Nouns with Realism
