This is the second of a two-part blog series detailing data class validation in IRI Workbench. The first article, here, provided an overview of the validation scripts and how to use them in a data discovery or classification job. This article shows how to create a custom validation script for a special data class or group.
In this article, we will create and format a credit card validation script for use in a custom data class. Note that IRI Workbench already provides a credit card data class and validation script for your convenience.
I will be using Visual Studio Code, an open source IDE from Microsoft, for this section of the tutorial. Although I won’t go into detail on how to set up Visual Studio Code, you can find more information about the setup process here and here.
How does IRI Workbench interpret and use the code?
Limitations of the Java ScriptingEngine API
- The engine only implements the ECMAScript 5.1 Specification. ES6 syntax is not supported.
- The Nashorn engine does not have a console object. Running a script with console.log("Hello World") will throw an error. Use the Nashorn print function instead. For example, print("Hello", "World") will print its arguments to standard output.
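To make these two limitations concrete, here is a small sketch written in ES5.1 style. The `log` wrapper is a hypothetical convenience, not part of IRI Workbench: it uses the Nashorn print function when one exists and falls back to console.log so the same file can also be tried under Node.js.

```javascript
// Use Nashorn's print() when it exists; otherwise fall back to console.log (Node.js).
// "log" is a hypothetical helper name used only for this illustration.
var log = (typeof print === "function")
    ? print
    : function () { console.log.apply(console, arguments); };

// ES5.1 style: var and function expressions instead of let/const and arrow functions.
var digits = ["4", "1", "1", "1"];
var total = 0;
digits.forEach(function (d) {
    total += parseInt(d, 10);
});

log("Digit sum:", total);
```

Writing `let total = 0` or `digits.forEach(d => ...)` here would be ES6 syntax, which the limitation above rules out.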
Step 1: Create the File
Step 2: Define the Validate Method
This is the most important function in the script because it is the one IRI Workbench invokes. All validation logic should therefore be contained in, or called from, this function.
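As a rough sketch, the entry point could start out like this. The function name `validate` and the single `input` parameter are inferred from this step's title and the notes later in the article, and the non-empty check is only a placeholder rule for illustration.

```javascript
// Entry point assumed to be called once per candidate value.
// The name "validate" and its single parameter are assumptions based on this article.
function validate(input) {
    // Coerce to a JavaScript string in case the engine passes a Java String.
    var value = String(input);

    // Placeholder rule: accept any non-empty value.
    // Real validation logic replaces this line.
    return value.length > 0;
}
```

Whatever logic ends up inside, the function should always return a boolean.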
Step 3: Write the Logic
Logic will vary depending on the data you’re working with. For credit cards, the only validation logic needed is a simple checksum using the Luhn algorithm.
I won’t go into detail on how to implement this algorithm, but a good example can be found here. In the image below, you can see that I implemented the validation logic using a helper function.
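For readers who prefer text to a screenshot, a minimal sketch of that structure might look like the following. It is not the script from the image: the helper name `passesLuhn` is hypothetical, and stripping spaces and dashes before the check is an assumption about how the card numbers are formatted.

```javascript
// Hypothetical helper: returns true when a string of digits passes the Luhn checksum.
function passesLuhn(digits) {
    var sum = 0;
    var doubleIt = false; // the rightmost digit is not doubled

    // Walk from right to left, doubling every second digit.
    for (var i = digits.length - 1; i >= 0; i--) {
        var d = digits.charCodeAt(i) - 48; // "0" has character code 48
        if (doubleIt) {
            d *= 2;
            if (d > 9) {
                d -= 9; // same as adding the two digits of the product
            }
        }
        sum += d;
        doubleIt = !doubleIt;
    }
    return sum % 10 === 0;
}

// Entry point: takes the matched text and returns true or false.
function validate(input) {
    // Assumption: separators are spaces or dashes and can be ignored.
    var digits = String(input).replace(/[\s-]/g, "");

    // Anything that is not purely digits cannot be a card number.
    if (!/^\d+$/.test(digits)) {
        return false;
    }
    return passesLuhn(digits);
}
```

With this sketch, the common test number 4111111111111111 passes, while 4111111111111112 fails because its check digit no longer matches.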
A few things to note:
- The input argument will always be a String.
- The return value must be either true or false.
You may be wondering why the function contains no pattern matching. That’s because IRI Workbench has a separate field for supplying patterns (more on this in the next section). It will run your provided pattern first and then run the validation script.
Adding a Validation Script to a Custom Data Class
This section uses some elements of Data Classification, an integrated data cataloging paradigm for defining the search methods used for finding PII independently from the source of the data. While this section provides a small introduction to Data Classification, you may find it useful to read this article that explores the topic in depth.
Now that the validation script is finished, let’s create a new data class so we can add the script to the IRI Workbench. To get started, open up the IRI preferences screen. Select the IRI Menu dropdown and select IRI Preferences. Then select the dropdown for IRI (within the preferences window) and select Data Classes and Groups.
Select Add to open the window shown below.
Fill in the relevant fields and select Add in the Matchers section.
In the Data Class Matcher window, I added a regular expression pattern to the Details field. This will check that the credit card number matches a specified pattern.
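The pattern I used is not reproduced in the text, so the one below is purely hypothetical: it accepts 16 digits in four groups, optionally separated by a space or a dash. It is written here as a JavaScript regular expression only so it can be tried quickly; check which regex syntax the Details field expects before reusing it.

```javascript
// Hypothetical pattern: four groups of four digits, optional space or dash between groups.
var cardPattern = /^\d{4}[ -]?\d{4}[ -]?\d{4}[ -]?\d{4}$/;

var spaced = cardPattern.test("4111 1111 1111 1111");   // true: correct shape
var badChecksum = cardPattern.test("4111111111111112"); // true: correct shape, wrong check digit
var tooShort = cardPattern.test("4111-1111-1111");      // false: only 12 digits
```

The second case shows why the pattern alone is not enough: a number can have the right shape and still fail the checksum, which is the job of the validation script.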
In the Validator Script field, I added the file path to our validation script created in the previous section. Select OK and then Apply And Close to save the new data class into the IRI Workbench.
Doing so creates a new Data Class that can be used for any future data classification or data discovery job. If you have any questions about how to classify data for IRI Workbench-supported software like FieldShield, DarkShield or Voracity, email email@example.com.