IRI Blog Articles

Diving Deeper into Data Management



Post image for How to Mask Tables and Preserve Referential Integrity

How to Mask Tables and Preserve Referential Integrity

by Claudia Irvine

The Multi Table Protection Job wizard in IRI FieldShield can mask personally identifiable information (PII) in database columns that are part of a foreign key relationship, and thus preserve referential integrity between the tables. This ensures that the records remain linked even after they are de-identified.

FieldShield users achieve this in the wizard by defining field-level masking rules that are automatically, and consistently, applied to like columns. Any of the 12 categories of data masking functions  available to FieldShield users — including encryption, pseudonymization, and redaction — can be applied on the basis of such rules.

This example uses three Oracle tables – Departments, Employees, and Job_History. When the tables were originally designed, the Employee’s Social Security Number was used for their ID. This creates a security risk when running any reports displaying the ID field.


The above E-R diagram for those tables, and the SQL query and its results below are all shown  in various IRI Workbench GUI for FieldShield views. See this article re: ERD creation in IRI Workbench. The query joined information about employees, managers, and departments but exposes social security number (SSN) values in the Employee SID and Manager SID columns. See this article on coding and running SQL jobs in IRI Workbench.

Using the FieldShield Multi Table Protection Job wizard, these fields can be encrypted (or otherwise de-identified) so that the real SSNs are masked in the tables and subsequent queries. Referential integrity is retained because the same encryption is applied to all tables using one rule.

On the setup page of the wizard, ODBC is selected as the loader. The three aforementioned tables are selected on the Data Extraction page. The next page is the Field Modification Rules page. On this page, the rules to be applied to all the selected extracted tables can be designed.

Clicking Create will open the Field Rule Matcher page. This is where the details of the matcher are entered. Start by entering a matcher name.  

After clicking Create next to Rule Name, the New Protection Field Rule Wizard Selection page is displayed. Select Encryption or Decryption Functions. This selection ensures that the same protection algorithm applies to all data, ensuring referential integrity.

The next page is where the type of encryption is selected. In this case, enc_fp_aes256_ascii is used. This format-preserving encryption algorithm uses the ASCII character set to replace the real data. It is used in this demonstration so that the encryption is noticeable in the output. A more realistic choice would normally be the alphanum version, which would also preserve the actual appearance of SSNs (9 numbers in this case).

Although this example uses an embedded passphrase, a password file can also be used for the encryption key, as can an environment variable.

Clicking Finish will enter this rule into the matcher. Next, the matcher itself must be created. Click Add in the Matchers section. This will open the Field Rule Matcher Details page. Here either a pattern or a data class can be used. See the article Applying Field Rules Using Classification for details on the second option.

In this example, Pattern is selected and .*SID is entered in the details. This regex will match against all column names ending in SID.

The matcher finishes with the details displayed below. The Test button can be used to test the matchers to make sure they are matching to all of the intended columns. Multiple matcher details can be entered and AND/OR logic can be utilized to produce fine-grained matchers. For example, there is an additional column named VP_SSN. The same matcher above could be used with another matcher with a pattern of .*SSN and the operator AND to match on this additional column but with the same rule.

Clicking OK here returns to the Rules page where each rule matcher is displayed. Different matchers can be used for different fields so that only one transformation pass is necessary even if the rules were to mask different columns in different ways.

Clicking Next will display the Data Loading Stage page. Here is where the output table and options are selected. In this example, the same tables as the input tables are selected. Additionally, output mode is changed to Create to truncate the tables before loading so that unique keys are not violated.

After clicking Finish, a folder is created with multiple scripts that will be executed with the included batch file.

To see how the rule will transform the field, and give us a chance to hand-modify things, the SCOTT_EMPLOYEES.fcl script can be reviewed in the Workbench editor. In the output, both the EMPLOYEE_SID and the MANAGER_SID show the encryption algorithm applied.

After executing the batch file, the same SQL query shows the same joined results, but with the Employee_SID and Manager_SID now encrypted. Furthermore, referential integrity is preserved. The original employee-manager relationships are kept and the IDs for the manager on line 2 and the employee on line 26 are the same.

This example demonstrates how one encryption rule can be used across multiple columns in  multiple tables while retaining referential integrity. Any rules created during the job wizards are saved into a rule library. This allows them to be reused and even shared with colleagues so that the same results are ensured on the same data.

If you have any questions about FieldShield data masking rules, or need help using the wizard, contact your IRI representative.

Print Friendly

{ 0 comments… add one now }

Leave a Comment

Previous post:

Next post: