Selecting Personally Identifiable Information for Secure Queries (FieldShield Filters)

by David Friedland

In the course of protecting personally identifiable information (PII) moving into and out of databases, FieldShield and CoSort typically externalize protection of the full table(s) they connect to. That approach does not tax DB performance or security, and is especially useful when a separate, safe table or protected subset of rows or columns needs to be created. But how can these tools be used to protect only updated rows and secure queries?

The following are two major ways IRI tools can fine-tune your data protections. Consider which of these approaches works best for your environment, and ask IRI how you can integrate these capabilities into your database application:

Only Protect (Updated) Rows that Need Protecting (Selection):

Table Pre-Selection
RDBMS table data to be protected in FieldShield operations can be filtered during extraction through SQL WHERE clauses that include only those rows meeting specific delivery criteria.
“Dumb” (Unconditional) Phase-Level Filtering
Both FieldShield and CoSort’s Sort Control Language (SortCL) programs have the ability to skip or collect a specified number of rows at the input, action, and/or output phases of any job script.
“Smart” (Conditional) Selection
Rows can be included in or omitted from input, processing, and/or output based on a change in a column value, or on whether that column satisfies a Boolean test using relational operators, expressed in SQL (/QUERY) or /INCLUDE (or /OMIT) syntax inside the scripts.
Conditional Field Logic
The protection function assigned to any given field can be applied conditionally based on the ‘smart’ criteria above; i.e. only protect the field if its value passes the protection test your business rules dictate.
Application-Level Filtering
Programs polling or populating tables or files for data can direct only those rows meeting their protection criteria into an API or system call to FieldShield or CoSort’s SortCL program. Conversely, data protected either en masse or conditionally by IRI software can be filtered and redirected by the calling application according to targeting requirements; i.e. which targets require which data, and in which form (e.g. protected or unprotected).
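
As a rough illustration of table pre-selection combined with conditional field logic, the Python sketch below pulls only rows updated after a given date and masks a field only when a business rule applies. It is not FieldShield or SortCL syntax. The customers table, its columns, the mask_ssn rule, and the "US only" condition are all hypothetical names assumed for the example.

```python
import sqlite3


def mask_ssn(ssn: str) -> str:
    """Hypothetical masking rule: hide all but the last four characters."""
    return "*" * (len(ssn) - 4) + ssn[-4:]


def protect_updated_rows(conn: sqlite3.Connection, since: str):
    """Select only rows changed after `since`, then mask conditionally."""
    # Table pre-selection: the WHERE clause keeps unchanged rows out of the job.
    # ISO-8601 date strings compare correctly as plain text.
    rows = conn.execute(
        "SELECT id, country, ssn FROM customers "
        "WHERE updated_at > ? ORDER BY id",
        (since,),
    ).fetchall()

    protected = []
    for row_id, country, ssn in rows:
        # Conditional field logic: protect the field only if the
        # (assumed) business rule says this value needs it.
        if country == "US":
            ssn = mask_ssn(ssn)
        protected.append((row_id, country, ssn))
    return protected


def build_demo_db() -> sqlite3.Connection:
    """Create a small in-memory table so the sketch can run on its own."""
    conn = sqlite3.connect(":memory:")
    conn.execute(
        "CREATE TABLE customers "
        "(id INTEGER, country TEXT, ssn TEXT, updated_at TEXT)"
    )
    conn.executemany(
        "INSERT INTO customers VALUES (?, ?, ?, ?)",
        [
            (1, "US", "123-45-6789", "2024-01-05"),
            (2, "US", "987-65-4321", "2024-03-10"),
            (3, "CA", "046-454-286", "2024-03-12"),
        ],
    )
    return conn
```

Calling protect_updated_rows(build_demo_db(), "2024-03-01") skips row 1 entirely, masks the value in row 2, and passes row 3 through unchanged, so only the data that was both updated and subject to the rule incurs any protection work.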

Only Protect Columns Containing PII (Projection):

Column Pre-Selection
RDBMS table data to be protected can be vertically filtered during extraction through SQL SELECT statements whose column lists extract and pass in only those columns requiring FieldShield or SortCL protection.
Target-Phase Specification
FieldShield and SortCL programs can specify the protection and thus output of only those columns that are to be protected, so there would be no need to process, write out, or import the other columns.
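
To make column pre-selection concrete, here is a minimal Python sketch that extracts only a key and the PII columns, protects them, and never touches the remaining columns. It is not FieldShield or SortCL syntax. The orders table, the column names, and the pseudonymize function (a truncated, unsalted SHA-256 digest used purely as a stand-in for a real protection function) are assumptions made for the example, and the sketch assumes the PII values are non-null text.

```python
import hashlib
import sqlite3

PII_COLUMNS = ("email", "phone")  # hypothetical PII column names


def pseudonymize(value: str) -> str:
    """Stand-in protection function; not a recommendation for production."""
    return hashlib.sha256(value.encode("utf-8")).hexdigest()[:16]


def project_and_protect(conn, table, key, pii_columns):
    """Extract only the key and PII columns, then protect the PII values."""
    # Identifiers cannot be bound as SQL parameters, so validate them
    # before building the statement.
    for name in (table, key, *pii_columns):
        if not name.isidentifier():
            raise ValueError(f"unsafe identifier: {name!r}")

    # Column pre-selection: non-PII columns never leave the database.
    column_list = ", ".join((key, *pii_columns))
    cursor = conn.execute(f"SELECT {column_list} FROM {table} ORDER BY {key}")

    return [
        (row[0], *(pseudonymize(value) for value in row[1:]))
        for row in cursor
    ]


def build_orders_db() -> sqlite3.Connection:
    """Create a small in-memory table so the sketch can run on its own."""
    conn = sqlite3.connect(":memory:")
    conn.execute(
        "CREATE TABLE orders "
        "(id INTEGER, email TEXT, phone TEXT, notes TEXT, amount REAL)"
    )
    conn.executemany(
        "INSERT INTO orders VALUES (?, ?, ?, ?, ?)",
        [
            (1, "ann@example.com", "555-0100", "gift wrap", 19.99),
            (2, "bob@example.com", "555-0101", "none", 5.00),
        ],
    )
    return conn
```

Each returned tuple holds the key plus one protected value per PII column. The notes and amount columns are never read, which is where the CPU and I/O savings of projection come from.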

In either scenario, if you can identify and isolate only the data that needs to be protected, you can reduce or eliminate the unnecessary CPU and I/O overhead involved with processing and moving non-PII data.* IRI can also develop custom, in-situ encryption solutions, or otherwise provide protection libraries you can integrate directly into your SQL procedures.

* Protecting only the columns that need protecting is also an inherent computational efficiency advantage in securing personally identifiable information (PII). This data-centric approach taken by FieldShield or CoSort’s SortCL means that CPU cycles are not consumed in the protection of non-sensitive columns, or any other data outside the specified source(s). Just as protecting only fields is a performance benefit, so too is protecting only updated rows. Either way, you can save time encrypting, masking or otherwise de-identifying PII by reducing the amount of data feeding or leaving FieldShield or SortCL operations.
