There are times when you need to test with, or share, data that contains personally identifiable information (PII). To comply with data privacy laws and prevent a data breach, you may need to provide data that reflects, and sometimes imparts, critical information while still protecting the PII.
This article focuses on IRI Workbench execution options for scripts based on the SortCL program language, which covers IRI Voracity ETL, CDC, SDC, pivoting and subsetting jobs, as well as its constituent product jobs; i.e.,
This article looks at sets from an information processing perspective: what they are, how they are constructed, and the distinct ways in which data can be drawn from sets within IRI software products using the SortCL data definition and processing program; i.e.,
Update: Q2’16: In addition to the database profiling wizard in the data discovery menu group in IRI Workbench described below, IRI has introduced robust data classification that enables the application of field rules for multi-source data transformation and protection through data class libraries.
Data architects and data scientists, as well as DBAs and governance teams, may need to use or migrate data in legacy file formats and databases. The ability to mash up those sources with newer file and database repositories is also important in data integration (ETL) and analytic projects, as well as in data profiling for data loss prevention and privacy law compliance.