IRI/CoSort: Fast Data Transformation, PII Search Mask & Test

Blurring and Bucketing

Indirect identifiers, or "quasi-identifying values" like age and date of birth, as well as descriptors like occupation and marital status, can all be used to re-identify people if there are enough of these attributes in the data set and/or they can be joined to a superset population with similar values.

For this reason, your jobs in the IRI FieldShield or IRI DarkShield data masking tool (or both within the IRI Voracity data management platform) can apply one or more techniques to anonymize these values while still keeping them realistic and accurate enough for research or marketing purposes. This is also akin to differential privacy because you are replacing individual (direct) identifiers with more general anonymizing demographic information or approximate values (indirect identifiers).

By way of specific IRI-supported techniques, you can perform:

Numeric blurring functions that add random noise to values within specified age and date ranges.
Bucketing functions that anonymize quasi-identifiers by generalizing their values into broader categories.

In the example job specification shown below, specific ages are bucketed into decade groups, multiple marital status attributes are combined into two broader categories in a defined condition, educational attainments are simplified through a new set lookup file, and all occupations are explicitly redacted in place.

[Figure: data blurring capabilities in IRI Workbench]

These job specifications can be generated automatically in fit-for-purpose graphical wizards and function-specific dialogs. The new result set can now be re-run through the risk scoring wizard to produce another determination of re-identification risk based on now less distinct quasi-identifying attributes.

Frequently Asked Questions (FAQs)

1. What is data blurring in data anonymization?

Data blurring is the process of adding random noise to data values, such as age or date of birth, to reduce the risk of re-identifying individuals while keeping the data useful for analysis.
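
As a rough illustration of the idea, the sketch below adds bounded random noise to an age and to a date of birth. It is generic Python, not IRI FieldShield syntax; the function names `blur_age` and `blur_date` and the default noise ranges are assumptions made for this example.

```python
import random
from datetime import date, timedelta

def blur_age(age, max_delta=3, rng=random):
    """Shift an age by a random whole number of years in [-max_delta, +max_delta].

    The result is clamped at 0 so the blurred age is never negative.
    """
    return max(0, age + rng.randint(-max_delta, max_delta))

def blur_date(d, max_days=30, rng=random):
    """Shift a date by a random number of days in [-max_days, +max_days]."""
    return d + timedelta(days=rng.randint(-max_days, max_days))

if __name__ == "__main__":
    # A seeded generator makes the noise repeatable for testing;
    # the seed value itself is arbitrary.
    rng = random.Random(42)
    print(blur_age(62, rng=rng))
    print(blur_date(date(1961, 5, 17), rng=rng))
```

A wider noise range lowers re-identification risk but also reduces analytic accuracy, so the range is a trade-off to tune for each data set.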

2. How does bucketing help anonymize quasi-identifiers?

Bucketing, also known as binning or generalization, turns specific data values into broader attributes. For example, a specific age can be replaced with its assigned bucket value, so that an actual age column value of 62, 73, or 91 in a record becomes "senior". Generalized demographic labels, or tiers, help protect identity by keeping attributes true but less specific.
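
A minimal sketch of tiered bucketing follows, written in generic Python rather than any IRI job syntax. The cutoffs in `AGE_BOUNDS` and the labels other than "senior" are hypothetical; they were chosen only so that ages 62, 73, and 91 all land in the "senior" tier, as in the example above.

```python
from bisect import bisect_right

# Hypothetical tier boundaries: ages below 18 are "minor",
# 18 through 59 are "adult", and 60 or above are "senior".
AGE_BOUNDS = [18, 60]
AGE_LABELS = ["minor", "adult", "senior"]

def bucket_age(age):
    """Replace a specific age with the label of the tier it falls into."""
    return AGE_LABELS[bisect_right(AGE_BOUNDS, age)]

if __name__ == "__main__":
    for age in (17, 18, 62, 73, 91):
        print(age, "->", bucket_age(age))
```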

3. What types of data can be anonymized using blurring and bucketing?

Quasi-identifiers such as age, date of birth, marital status, occupation, diagnosis, drug therapy, and education level can be anonymized using these techniques to lower re-identification risk.

4. How do IRI FieldShield and Voracity implement data anonymization?

Both tools offer graphical wizards that let you apply blurring and bucketing techniques using predefined rules, lookup tables, or conditional logic in a no-code or low-code environment.

5. Can anonymized data still be used for research or marketing?

Yes. By generalizing data instead of removing it, anonymization retains enough value for statistical analysis, segmentation, and non-personalized marketing purposes.

6. How is this approach related to differential privacy?

This method is conceptually similar to differential privacy in that it reduces data precision to prevent re-identification, while still allowing useful patterns to emerge from the dataset.

7. What is the difference between blurring and redaction?

Blurring adds randomized variation or generalization to keep data analyzable, while redaction removes or obfuscates data entirely to make it hidden or unreadable.

8. How do you validate that data is sufficiently anonymized?

After applying blurring or bucketing, you can re-run IRI’s re-ID Risk Scoring wizard to evaluate the statistical likelihood of re-identification and ensure compliance with data privacy laws like HIPAA and the GDPR.
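
To give a feel for what such a check measures, the sketch below counts how many records share each combination of quasi-identifier values and reports the smallest group, a simple k-anonymity-style measure. This is an illustrative assumption, not a description of how IRI's risk scoring wizard computes its results; the function name and the sample records are hypothetical.

```python
from collections import Counter

def smallest_group_size(records, quasi_identifiers):
    """Return the size of the smallest group of records that share
    the same values for all of the given quasi-identifier fields.

    A result of 1 means at least one record is unique on those fields.
    Returns 0 for an empty record list.
    """
    counts = Counter(
        tuple(record[field] for field in quasi_identifiers)
        for record in records
    )
    return min(counts.values(), default=0)

if __name__ == "__main__":
    raw = [
        {"age": 31, "marital": "Married"},
        {"age": 38, "marital": "Married"},
        {"age": 62, "marital": "Widowed"},
        {"age": 67, "marital": "Divorced"},
    ]
    bucketed = [
        {"age": "30-39", "marital": "Married"},
        {"age": "30-39", "marital": "Married"},
        {"age": "60-69", "marital": "Not Married"},
        {"age": "60-69", "marital": "Not Married"},
    ]
    print(smallest_group_size(raw, ["age", "marital"]))
    print(smallest_group_size(bucketed, ["age", "marital"]))
```

In this toy data every raw record is unique on age and marital status, while each bucketed record shares its values with one other record.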

9. What are some real-world examples of bucketing?

Examples include grouping specific ages like 31 or 38 into ranges like 30–39 (or "adult"), categorizing marital statuses into "Married" or "Not Married" (rather than divorced, widowed, etc.), and simplifying education levels like AA or BS into "College" and MS, MBA, or JD into "Graduate."
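
These three examples can be sketched in generic Python as a range calculation, a condition, and a lookup table. The code is not IRI FieldShield syntax. The function names are hypothetical, and the "Other" fallback for degrees missing from the lookup is an assumption added for this example.

```python
def decade_range(age):
    """Generalize an age to its decade range, e.g. 31 -> '30-39'."""
    low = (age // 10) * 10
    return f"{low}-{low + 9}"

def marital_bucket(status):
    """Collapse marital statuses into two categories with one condition."""
    if status.strip().lower() == "married":
        return "Married"
    return "Not Married"

# Lookup table standing in for a set lookup file.
EDUCATION_LOOKUP = {
    "AA": "College",
    "BS": "College",
    "MS": "Graduate",
    "MBA": "Graduate",
    "JD": "Graduate",
}

def education_bucket(degree):
    """Map a degree to its broader level; unknown degrees become 'Other'
    (an assumed fallback, not taken from the source)."""
    return EDUCATION_LOOKUP.get(degree.strip().upper(), "Other")

if __name__ == "__main__":
    print(decade_range(31), decade_range(38))
    print(marital_bucket("Divorced"), marital_bucket("Married"))
    print(education_bucket("BS"), education_bucket("MBA"))
```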

10. Can I customize my own anonymization rules?

Yes. IRI FieldShield allows you to define custom rules using lookup files, conditional logic, or external procedures to tailor blurring and bucketing strategies to your specific needs.


Data Anonymization

Adding Noise or Generality to PHI/PII
