The Schema Pattern Search in IRI Workbench (WB) can be used to retrieve data in an entire schema matching specific patterns. The search process compares the patterns to all the data in every column of the selected data types in every table in every schema selected. This processes is helpful in complying with PCI storage retention requirements to find all locations of credit card numbers, or with the GDPR right to be forgotten by finding every database record associated with the requestor’s name.
Unlike the Database Profile wizard in IRI Workbench which can search one table at a time, or the Flat-File Profile wizard which searches only through a selected file, the Schema Pattern Search wizard searches entire schemas for patterns. If only selected tables or structured files need to be searched, use the Data Class Library editor for faster results.
This example searches a single Oracle schema using a credit card regular expression pattern. To use the wizard, select the schema or schemas to search. Start by selecting the scan option. The choices are full scan to scan all rows and output the data matching. To limit the number of row to scan, select the limit option and enter the number of rows. If the matched data is not important, only the columns where results are found, select to not include the data results.
On the next page, select the schema or schemas to search.
Select the data types of the columns that contain the data in question. If the suspect data is only in text type fields, selecting only those types of columns will reduce the amount of data that the underlying search engine has to process.
Enter the pattern by browsing for existing patterns or creating a new one. Multiple patterns can be added. The AND and OR operators are available to fine tune the patterns to use.
Clicking Finish will start the search/report process. Depending on how much data is being scanned, the wizard may run for a significant amount of time. Because the process may run for a long time, a file called tableSearchResults is created and records every table that has been fully processed. In case of a failure during the search, this file will show the last table that was successfully searched.
When the search reaches the end of a column, if any matches were found, a file named columnSearchResults is appended with the name of the column. At the same time, the schema_pattern_search file (or custom named file on the setup page) is appended with the results of the search and contains both the column name and the data matched.
The data is recorded in a tab-delimited file and can by used by other processes as needed.
The columnSearchResults.csr file can be used by the Schema Pattern Search to Data Class Association wizard in WB that takes the column results and assigns data classes to them. This can speed the process of data protection during other IRI jobs.