
Applying Field Rules Using Classification
The IRI Voracity data management platform (and the IRI FieldShield data masking product within it) now allows you to auto-define data classes and groups based on your business glossaries or domain ontologies, and to apply transformation rules to those classes across multiple data sources and fields. In this article, I demonstrate how to apply field-level protection rules against a data class library.
We will use the data class library created in my first article on Data Classification in IRI Workbench for Voracity and FieldShield. Here is the data class library that will be used:
The library draws on one CSV file and two Oracle tables. In this rules example, I will transform data only in the two tables.
Using the FieldShield Multi Table Protection Job Wizard, I select ODBC as the extractor, no loader (so the output will be a flat file), and the two tables referenced above. On the Field Modification Rules page, I click Create to add a new Masking Function rule for my SSN field as follows:
I then add a rule matcher using the PIN_US data class that I have in my library:
I can add as many matchers as I like, combining them with AND/OR logic. Note that AND takes precedence over OR, and that the operator on the last rule matcher is not used in the logic because no matcher follows it.
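To make the precedence rule concrete, here is a minimal Python sketch of how a chain of matchers could be combined when AND binds tighter than OR and the last operator is ignored. It is an illustration only, not FieldShield's implementation; the function name `evaluate_matchers` and the (result, operator) tuple layout are assumptions made for this example.

```python
from typing import List, Tuple


def evaluate_matchers(matchers: List[Tuple[bool, str]]) -> bool:
    """Combine matcher results where AND binds tighter than OR.

    Each entry is (result, operator); the operator joins that matcher to
    the NEXT one, so the operator on the last entry is never consulted.
    This layout is a hypothetical one chosen for illustration.
    """
    if not matchers:
        return False

    group_results = []   # one result per run of AND-ed matchers
    current = True       # running AND of the current group
    last_index = len(matchers) - 1

    for i, (result, operator) in enumerate(matchers):
        current = current and result
        # An OR (or the end of the chain) closes the current AND group.
        if i == last_index or operator.upper() == "OR":
            group_results.append(current)
            current = True

    # The AND groups are then OR-ed together.
    return any(group_results)


if __name__ == "__main__":
    # True OR (True AND False): the trailing "OR" is ignored.
    print(evaluate_matchers([(True, "OR"), (True, "AND"), (False, "OR")]))
```

Evaluated strictly left to right, the same chain would give (True OR True) AND False, which is False; with AND taking precedence it is True.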
I create another masking rule using the pre-defined Whole Field option and the data group NAMES as a matcher. Clicking the Test button shows that it found three field matches. Because the NAMES data class group contains the FIRSTNAME, LAST_NAME, and FULL_NAME data classes, this is the correct output based on the data class library above: the library contains three maps whose data class is one of these name types. Classes and groups are differentiated with icons in the matcher details dialog and preferences pages.
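The way a group matcher fans out to its member classes can be sketched in a few lines of Python. The group membership below comes from the text; the table and column names in `FIELD_CLASSES` are hypothetical placeholders, since the actual library contents are shown only in the screenshot, and `matching_fields` is an illustrative helper rather than part of any IRI API.

```python
from typing import Dict, List, Set

# Group membership as described in the article.
DATA_CLASS_GROUPS: Dict[str, Set[str]] = {
    "NAMES": {"FIRSTNAME", "LAST_NAME", "FULL_NAME"},
}

# Hypothetical field-to-class map; table and column names are invented.
FIELD_CLASSES: Dict[str, str] = {
    "TABLE_A.FIRST_NAME": "FIRSTNAME",
    "TABLE_A.LAST_NAME": "LAST_NAME",
    "TABLE_A.SSN": "PIN_US",
    "TABLE_B.FULL_NAME": "FULL_NAME",
}


def matching_fields(
    matcher: str,
    field_classes: Dict[str, str],
    groups: Dict[str, Set[str]],
) -> List[str]:
    """Return the fields whose data class satisfies the matcher.

    A group name expands to all of its member classes; any other name
    is treated as a single data class.
    """
    targets = groups.get(matcher, {matcher})
    return [field for field, cls in field_classes.items() if cls in targets]


if __name__ == "__main__":
    print(matching_fields("NAMES", FIELD_CLASSES, DATA_CLASS_GROUPS))
    print(matching_fields("PIN_US", FIELD_CLASSES, DATA_CLASS_GROUPS))
```

With this placeholder map, the NAMES matcher selects three fields and the PIN_US matcher selects one, which mirrors the counts reported in the walkthrough.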
Clicking Next displays the summary screen, which lists the fields that will have a rule applied.
Clicking Finish creates a folder with the job results included.
Here are the two job scripts (one for each table) showing the applied rules in the output sections. Four fields were masked in two different ways: names are wholly masked, and SSNs have only the first five digits masked, skipping the dashes.
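For readers without the product at hand, the effect of the two rules can be approximated in plain Python. This is a sketch of the described behavior, not the FieldShield masking functions themselves; the helper names and the `*` mask character are assumptions, as the article does not state which character the jobs use.

```python
def mask_leading_digits(value: str, count: int = 5, mask_char: str = "*") -> str:
    """Mask the first `count` digits of `value`, leaving dashes and
    any other non-digit characters in place."""
    remaining = count
    masked = []
    for ch in value:
        if ch.isdigit() and remaining > 0:
            masked.append(mask_char)
            remaining -= 1
        else:
            masked.append(ch)
    return "".join(masked)


def mask_whole_field(value: str, mask_char: str = "*") -> str:
    """Replace every character of `value` with the mask character."""
    return mask_char * len(value)


if __name__ == "__main__":
    print(mask_leading_digits("123-45-6789"))  # prints ***-**-6789
    print(mask_whole_field("Smith"))           # prints *****
```

Counting digits rather than character positions is what lets the dashes pass through untouched while still masking exactly five digits.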
When these jobs are run, either on their own or as part of a larger job, they produce these results:
Being able to use data classes as rule matchers lets you select a greater number of fields with fewer steps. In this example, I masked four fields in two tables with only two rules.
For more information about data classification or the application of rules, or to offer feedback on either, please contact voracity@iri.com.