{"id":16157,"date":"2022-07-13T17:23:15","date_gmt":"2022-07-13T21:23:15","guid":{"rendered":"https:\/\/www.iri.com\/blog\/?p=16157"},"modified":"2022-10-12T15:05:00","modified_gmt":"2022-10-12T19:05:00","slug":"creating-set-files-in-iri-workbench","status":"publish","type":"post","link":"https:\/\/www.iri.com\/blog\/iri\/iri-workbench\/creating-set-files-in-iri-workbench\/","title":{"rendered":"Creating Set Files in IRI Workbench"},"content":{"rendered":"<p><span style=\"font-weight: 400;\">As discussed in <\/span><a href=\"https:\/\/www.iri.com\/blog\/test-data\/all-about-iri-set-files-a-primer\/\"><span style=\"font-weight: 400;\">our primer article<\/span><\/a><span style=\"font-weight: 400;\">, set files are used to furnish data for a variety of IRI Voracity-compatible applications, like CoSort, FieldShield, DarkShield, NextForm, and RowGen.<\/span><\/p>\n<p><span style=\"font-weight: 400;\">Set files are usually text files with rows of single- or multi-byte characters or numeric values in one or more tab-separated columns, where each row is separated by a newline character.\u00a0<\/span><\/p>\n<p><span style=\"font-weight: 400;\">Alternatively, set files can contain one or more literal ranges of values (e.g., [100-999]). Values are selected from a set file randomly by default; other selection methods, such as ALL, ONCE, or PERMUTE, are optional. If the value selected represents a range of numbers or date\/time values, a final value within the designated range will be drawn randomly.\u00a0<\/span><\/p>\n<p><span style=\"font-weight: 400;\">For the purposes of this article, we will discuss the more conventional set file with a list of values, and named with a<\/span><i><span style=\"font-weight: 400;\"> .set<\/span><\/i><span style=\"font-weight: 400;\"> extension. 
The IRI Workbench image below shows a two-column set file opened in the default editor:<\/span><\/p>\n<p><img loading=\"lazy\" decoding=\"async\" class=\" wp-image-16166 aligncenter\" src=\"\/blog\/wp-content\/uploads\/2022\/07\/two-column-set-file-300x96.png\" alt=\"\" width=\"559\" height=\"179\" srcset=\"https:\/\/www.iri.com\/blog\/wp-content\/uploads\/2022\/07\/two-column-set-file-300x96.png 300w, https:\/\/www.iri.com\/blog\/wp-content\/uploads\/2022\/07\/two-column-set-file.png 583w\" sizes=\"(max-width: 559px) 100vw, 559px\" \/><\/p>\n<p><span style=\"font-weight: 400;\">The wizards provided, and described in this article, are: <\/span><i><span style=\"font-weight: 400;\">Bucketing Values, Compound Data values, Date Range Generator, Email Generator, Pseudo Hash Set, Pseudo Set, Pseudo Set from Column, Range or Literal Values, <\/span><\/i><span style=\"font-weight: 400;\">and <\/span><i><span style=\"font-weight: 400;\">Set from Column<\/span><\/i><span style=\"font-weight: 400;\">:<\/span><\/p>\n<p><img loading=\"lazy\" decoding=\"async\" class=\"alignnone wp-image-16168 aligncenter\" src=\"\/blog\/wp-content\/uploads\/2022\/07\/NewSetFileWizardselection-300x167.png\" alt=\"\" width=\"471\" height=\"262\" srcset=\"https:\/\/www.iri.com\/blog\/wp-content\/uploads\/2022\/07\/NewSetFileWizardselection-300x167.png 300w, https:\/\/www.iri.com\/blog\/wp-content\/uploads\/2022\/07\/NewSetFileWizardselection.png 507w\" sizes=\"(max-width: 471px) 100vw, 471px\" \/><\/p>\n<h5><strong>Bucketing Values<\/strong><\/h5>\n<p><span style=\"font-weight: 400;\">The Bucketing Values wizard creates a set file that associates real values with more generalized replacement values that are still real enough, but safe for testing. 
This type of trait \u2018binning\u2019 helps reduce the risk of identifying people from the indirect or quasi-identifying (demographic) information in the same record.<\/span><\/p>\n<p><span style=\"font-weight: 400;\">To use this feature, select <\/span><i><span style=\"font-weight: 400;\">Bucketing Values<\/span><\/i><span style=\"font-weight: 400;\"> from the <\/span><i><span style=\"font-weight: 400;\">New Set File \u2026<\/span><\/i><span style=\"font-weight: 400;\"> wizard and click <\/span><i><span style=\"font-weight: 400;\">Next &gt;<\/span><\/i><span style=\"font-weight: 400;\"> to open the <\/span><i><span style=\"font-weight: 400;\">Define Destination<\/span><\/i><span style=\"font-weight: 400;\"> screen to hold the new set file you will be creating. After giving the path and file name, click <\/span><i><span style=\"font-weight: 400;\">Next &gt;<\/span><\/i><span style=\"font-weight: 400;\"> to go to the Data Source screen.\u00a0<\/span><\/p>\n<p><span style=\"font-weight: 400;\">In this example, I am using a JSON input file where one of the keys pertains to educational levels for which I want to create generalized replacements. 
Note that this part of the wizard also supports sources in other formats, including CSV, Delimited, Fixed, LDIF, MongoDB, XLS, XLSX, and XML files.<\/span><\/p>\n<p style=\"text-align: center;\"><img loading=\"lazy\" decoding=\"async\" class=\"alignnone wp-image-16169\" src=\"\/blog\/wp-content\/uploads\/2022\/07\/NewBucketSetJob-300x176.png\" alt=\"\" width=\"537\" height=\"315\" srcset=\"https:\/\/www.iri.com\/blog\/wp-content\/uploads\/2022\/07\/NewBucketSetJob-300x176.png 300w, https:\/\/www.iri.com\/blog\/wp-content\/uploads\/2022\/07\/NewBucketSetJob.png 511w\" sizes=\"(max-width: 537px) 100vw, 537px\" \/><\/p>\n<p><span style=\"font-weight: 400;\">On the Data Source screen (above), click <\/span><i><span style=\"font-weight: 400;\">Browse \u2026<\/span><\/i><span style=\"font-weight: 400;\"> to select the (input) file that contains the original data values to be bucketed into less specific categories. In this case, it\u2019s my JSON file with individualized key values for <\/span><i><span style=\"font-weight: 400;\">Age, Race, Marital_Status, Education, Native Country, WorkClass, Occupation, Salary, <\/span><\/i><span style=\"font-weight: 400;\">and <\/span><i><span style=\"font-weight: 400;\">ID<\/span><\/i><span style=\"font-weight: 400;\">:<\/span><\/p>\n<p style=\"text-align: left;\"><span style=\"font-weight: 400;\"><img loading=\"lazy\" decoding=\"async\" class=\"wp-image-16171 aligncenter\" src=\"\/blog\/wp-content\/uploads\/2022\/07\/JsonFileKeyValues.png\" alt=\"\" width=\"323\" height=\"195\" \/><\/span><\/p>\n<p style=\"text-align: left;\"><span style=\"font-weight: 400;\">A SortCL-compatible Data Definition Format (DDF) file containing the \/FIELD layout (<\/span><a href=\"https:\/\/www.iri.com\/products\/cosort\/sortcl-metadata\"><span style=\"font-weight: 400;\">metadata<\/span><\/a><span style=\"font-weight: 400;\">) for this data source must be specified. 
The DDF can either be imported through the <\/span><i><span style=\"font-weight: 400;\">Browse \u2026 <\/span><\/i><span style=\"font-weight: 400;\">button if it exists, or created through the <\/span><i><span style=\"font-weight: 400;\">Discover \u2026<\/span><\/i><span style=\"font-weight: 400;\"> option. The latter runs the <\/span><a href=\"https:\/\/www.iri.com\/blog\/data-transformation2\/using-the-metadata-discovery-wizard\/\"><span style=\"font-weight: 400;\">Metadata Discovery Wizard<\/span><\/a><span style=\"font-weight: 400;\"> to build the DDF file.<\/span><\/p>\n<p><span style=\"font-weight: 400;\">Here is that DDF file:<\/span><\/p>\n<p style=\"text-align: center;\"><img loading=\"lazy\" decoding=\"async\" class=\"alignnone wp-image-16173\" src=\"\/blog\/wp-content\/uploads\/2022\/07\/DDF_file-300x133.png\" alt=\"\" width=\"692\" height=\"307\" srcset=\"https:\/\/www.iri.com\/blog\/wp-content\/uploads\/2022\/07\/DDF_file-300x133.png 300w, https:\/\/www.iri.com\/blog\/wp-content\/uploads\/2022\/07\/DDF_file.png 657w\" sizes=\"(max-width: 692px) 100vw, 692px\" \/><\/p>\n<p><span style=\"font-weight: 400;\">Note that DDFs auto-created for JSON fields automatically append an ordinal number to each field to make sure field names stay unique (as JSON key\/item names may repeat).<\/span><\/p>\n<p><span style=\"font-weight: 400;\">At this point, I could select the <\/span><b><i>Education5 <\/i><\/b><span style=\"font-weight: 400;\">field from the dropdown for FIELD, which was presented from my DDF for the JSON file. 
I then clicked <\/span><i><span style=\"font-weight: 400;\">Next &gt;<\/span><\/i><span style=\"font-weight: 400;\"> to take me to the <\/span><b>Options <\/b><span style=\"font-weight: 400;\">page:<\/span><\/p>\n<p><img loading=\"lazy\" decoding=\"async\" class=\" wp-image-16175 aligncenter\" src=\"\/blog\/wp-content\/uploads\/2022\/07\/OptionNewBucketSetJob-300x269.png\" alt=\"\" width=\"526\" height=\"472\" srcset=\"https:\/\/www.iri.com\/blog\/wp-content\/uploads\/2022\/07\/OptionNewBucketSetJob-300x269.png 300w, https:\/\/www.iri.com\/blog\/wp-content\/uploads\/2022\/07\/OptionNewBucketSetJob.png 657w\" sizes=\"(max-width: 526px) 100vw, 526px\" \/><\/p>\n<p><span style=\"font-weight: 400;\">Here I select the type of bucket set to create. I chose <\/span><b><i>Use set file as a group<\/i><\/b><span style=\"font-weight: 400;\"> in order to replace the original discrete values with more general values; for example, 9th, 10th, 11th, and 12th grade students are more anonymous in a general high school bucket.\u00a0<\/span><\/p>\n<p><span style=\"font-weight: 400;\">In this example I will create a few categories to generalize my original set file values; i.e., <\/span><i><span style=\"font-weight: 400;\">Not a High School Graduate, High School Graduate, Some College, Associate Degree, University Degree.<\/span><\/i><span style=\"font-weight: 400;\"> The source values are extracted from the input file.<\/span><\/p>\n<p><span style=\"font-weight: 400;\">Drag and drop the Source values into the group values box and name the group in the Replacement Field. Then add a Group result. Once done with all source Values, click <\/span><i><span style=\"font-weight: 400;\">Finish<\/span><\/i><span style=\"font-weight: 400;\">.<\/span><\/p>\n<p><span style=\"font-weight: 400;\">Once I name the Replacement field, I get Bachelors, Masters, and Doctorate students in the same College Graduate category (bucket), for example. I can also add additional replacement values. 
In the example above, I entered \u201cMBA\u201d as a <\/span><i><span style=\"font-weight: 400;\">Manual value<\/span><\/i><span style=\"font-weight: 400;\">: and clicked <\/span><i><span style=\"font-weight: 400;\">Add value<\/span><\/i><span style=\"font-weight: 400;\">: to append it to the list of <\/span><i><span style=\"font-weight: 400;\">Values<\/span><\/i><span style=\"font-weight: 400;\">.<\/span><\/p>\n<p><span style=\"font-weight: 400;\">Now if I click <\/span><i><span style=\"font-weight: 400;\">Add group to results<\/span><\/i><span style=\"font-weight: 400;\">,\u00a0 I will see College Graduate in the Replacement area of the Results section and the values Bachelors, Masters, Doctorate, and MBA in the values section. After finishing with this wizard, the set file is built, and I can double-click on it in the file explorer to see these <\/span><i><span style=\"font-weight: 400;\">(<\/span><\/i><span style=\"font-weight: 400;\">automatically sorted) values:<\/span><\/p>\n<p style=\"text-align: center;\"><span style=\"font-weight: 400;\">\u00a0 \u00a0 \u00a0 \u00a0 \u00a0 \u00a0 \u00a0 \u00a0 <img loading=\"lazy\" decoding=\"async\" class=\"alignnone wp-image-16176\" src=\"\/blog\/wp-content\/uploads\/2022\/07\/GroupResult-300x255.png\" alt=\"\" width=\"288\" height=\"245\" srcset=\"https:\/\/www.iri.com\/blog\/wp-content\/uploads\/2022\/07\/GroupResult-300x255.png 300w, https:\/\/www.iri.com\/blog\/wp-content\/uploads\/2022\/07\/GroupResult.png 306w\" sizes=\"(max-width: 288px) 100vw, 288px\" \/><\/span><\/p>\n<p><span style=\"font-weight: 400;\">This two-column lookup set file can now be used in FieldShield pseudonymization or RowGen valid pairs selections to produce realistic, but more anonymous test values than my JSON file.<\/span><\/p>\n<p><span style=\"font-weight: 400;\">For the second type of bucket set (<\/span><b><i>Use set file as range<\/i><\/b><span style=\"font-weight: 400;\">), I will use an ODBC-connected Oracle table called CLAIMS3. 
Below is its DDF layout; I want to bucket the values in the AGE column.\u00a0<\/span><\/p>\n<p style=\"text-align: center;\"><img loading=\"lazy\" decoding=\"async\" class=\"alignnone wp-image-16177\" src=\"\/blog\/wp-content\/uploads\/2022\/07\/AgeColumn-300x146.png\" alt=\"\" width=\"734\" height=\"357\" srcset=\"https:\/\/www.iri.com\/blog\/wp-content\/uploads\/2022\/07\/AgeColumn-300x146.png 300w, https:\/\/www.iri.com\/blog\/wp-content\/uploads\/2022\/07\/AgeColumn.png 731w\" sizes=\"(max-width: 734px) 100vw, 734px\" \/><\/p>\n<p><span style=\"font-weight: 400;\">As you can see in the Data Source dialog below, I selected CLAIMS3, added the above metadata, chose the AGE field, and clicked Next &gt; to bring up the options screen:<\/span><\/p>\n<p style=\"text-align: center;\"><img loading=\"lazy\" decoding=\"async\" class=\"alignnone wp-image-16178\" src=\"\/blog\/wp-content\/uploads\/2022\/07\/DataSourceClaim3-300x133.png\" alt=\"\" width=\"514\" height=\"228\" srcset=\"https:\/\/www.iri.com\/blog\/wp-content\/uploads\/2022\/07\/DataSourceClaim3-300x133.png 300w, https:\/\/www.iri.com\/blog\/wp-content\/uploads\/2022\/07\/DataSourceClaim3.png 613w\" sizes=\"(max-width: 514px) 100vw, 514px\" \/><\/p>\n<p><span style=\"font-weight: 400;\">The Options screen below displays my source values from the CLAIMS3 table:<\/span><\/p>\n<p style=\"text-align: center;\"><img loading=\"lazy\" decoding=\"async\" class=\"alignnone wp-image-16179\" src=\"\/blog\/wp-content\/uploads\/2022\/07\/OptionClaim3-300x247.png\" alt=\"\" width=\"562\" height=\"463\" srcset=\"https:\/\/www.iri.com\/blog\/wp-content\/uploads\/2022\/07\/OptionClaim3-300x247.png 300w, https:\/\/www.iri.com\/blog\/wp-content\/uploads\/2022\/07\/OptionClaim3.png 716w\" sizes=\"(max-width: 562px) 100vw, 562px\" \/><\/p>\n<p><span style=\"font-weight: 400;\">To assign general age-range terms to my values, I need to specify a minimum value for each, along with the age category in the replacement text. 
In this case, my age categories are <\/span><i><span style=\"font-weight: 400;\">Child, Teen, Young Adult, Adult, Middle Age,<\/span><\/i><span style=\"font-weight: 400;\"> and <\/span><i><span style=\"font-weight: 400;\">Senior<\/span><\/i><span style=\"font-weight: 400;\">, each with a manually specified lowest age (minimum limit value):<\/span><\/p>\n<p style=\"text-align: center;\"><img loading=\"lazy\" decoding=\"async\" class=\"alignnone wp-image-16180\" src=\"\/blog\/wp-content\/uploads\/2022\/07\/age-range.png\" alt=\"\" width=\"209\" height=\"113\" \/><\/p>\n<h5><b>Compound Data Values<\/b><\/h5>\n<p><span style=\"font-weight: 400;\">The Compound Data Values wizard synthesizes values in custom-defined formats using a combination of literal strings and generated values in specified data types. In this example, I am using the wizard to generate a list of realistic phone numbers in standard US format.<\/span><\/p>\n<p><span style=\"font-weight: 400;\">The first step is to define the path and name of the set file I am creating. 
On this first page, I can also define the number of rows my set will contain and whether the values in it should be sorted:<\/span><\/p>\n<p style=\"text-align: center;\"><img loading=\"lazy\" decoding=\"async\" class=\"alignnone wp-image-16181\" src=\"\/blog\/wp-content\/uploads\/2022\/07\/CompoundDataOutput-300x260.png\" alt=\"\" width=\"449\" height=\"389\" srcset=\"https:\/\/www.iri.com\/blog\/wp-content\/uploads\/2022\/07\/CompoundDataOutput-300x260.png 300w, https:\/\/www.iri.com\/blog\/wp-content\/uploads\/2022\/07\/CompoundDataOutput.png 513w\" sizes=\"(max-width: 449px) 100vw, 449px\" \/><\/p>\n<p><span style=\"font-weight: 400;\">Once I have selected these options, I click <\/span><i><span style=\"font-weight: 400;\">Next &gt; <\/span><\/i><span style=\"font-weight: 400;\">to open the Compound Data Definition page of the wizard and begin to define the name of each part of the custom value I am designing.\u00a0<\/span><\/p>\n<p><span style=\"font-weight: 400;\">To define the first component, I clicked <\/span><i><span style=\"font-weight: 400;\">Add \u2026<\/span><\/i><span style=\"font-weight: 400;\"> to define a literal \u2013 here a fixed (321) area code. After that, to create the dash before the prefix, I added the literal \u201c-\u201d.<\/span><\/p>\n<p><span style=\"font-weight: 400;\">To specify a three-digit phone number PREfix, I clicked Add to generate a numeric value of size 3 and precision 0 with a range between 111 and 999. During that process, I can preview values.<\/span><\/p>\n<p><span style=\"font-weight: 400;\">I then added another \u201c-\u201d literal and a randomly generated numeric called sub for the 4-digit remainder of the number. 
The following wizard pages show the steps I took:<\/span><\/p>\n<p style=\"text-align: center;\"><img loading=\"lazy\" decoding=\"async\" class=\"wp-image-16182 aligncenter\" src=\"\/blog\/wp-content\/uploads\/2022\/07\/CompoundDataDef-300x262.png\" alt=\"\" width=\"464\" height=\"406\" srcset=\"https:\/\/www.iri.com\/blog\/wp-content\/uploads\/2022\/07\/CompoundDataDef-300x262.png 300w, https:\/\/www.iri.com\/blog\/wp-content\/uploads\/2022\/07\/CompoundDataDef.png 515w\" sizes=\"(max-width: 464px) 100vw, 464px\" \/><span style=\"font-weight: 400;\">\u00a0 \u00a0 \u00a0 \u00a0 \u00a0 \u00a0 <\/span><span style=\"font-weight: 400;\">\u00a0 \u00a0 \u00a0 \u00a0 <img loading=\"lazy\" decoding=\"async\" class=\"wp-image-16184 aligncenter\" src=\"\/blog\/wp-content\/uploads\/2022\/07\/RandomGenerationAttributes-300x149.png\" alt=\"\" width=\"465\" height=\"231\" srcset=\"https:\/\/www.iri.com\/blog\/wp-content\/uploads\/2022\/07\/RandomGenerationAttributes-300x149.png 300w, https:\/\/www.iri.com\/blog\/wp-content\/uploads\/2022\/07\/RandomGenerationAttributes.png 502w\" sizes=\"(max-width: 465px) 100vw, 465px\" \/><br \/>\n<\/span><img loading=\"lazy\" decoding=\"async\" class=\"alignnone wp-image-16186\" style=\"text-align: start;\" src=\"\/blog\/wp-content\/uploads\/2022\/07\/RangeSel-300x122.png\" alt=\"\" width=\"465\" height=\"189\" srcset=\"https:\/\/www.iri.com\/blog\/wp-content\/uploads\/2022\/07\/RangeSel-300x122.png 300w, https:\/\/www.iri.com\/blog\/wp-content\/uploads\/2022\/07\/RangeSel.png 500w\" sizes=\"(max-width: 465px) 100vw, 465px\" \/><\/p>\n<p><img loading=\"lazy\" decoding=\"async\" class=\"wp-image-16187 aligncenter\" src=\"\/blog\/wp-content\/uploads\/2022\/07\/CompDataGenrtCompnt-300x261.png\" alt=\"\" width=\"461\" height=\"401\" srcset=\"https:\/\/www.iri.com\/blog\/wp-content\/uploads\/2022\/07\/CompDataGenrtCompnt-300x261.png 300w, https:\/\/www.iri.com\/blog\/wp-content\/uploads\/2022\/07\/CompDataGenrtCompnt.png 512w\" 
sizes=\"(max-width: 461px) 100vw, 461px\" \/><\/p>\n<p><img loading=\"lazy\" decoding=\"async\" class=\"wp-image-16188 aligncenter\" src=\"\/blog\/wp-content\/uploads\/2022\/07\/RndmGenAttrb-300x128.png\" alt=\"\" width=\"464\" height=\"198\" srcset=\"https:\/\/www.iri.com\/blog\/wp-content\/uploads\/2022\/07\/RndmGenAttrb-300x128.png 300w, https:\/\/www.iri.com\/blog\/wp-content\/uploads\/2022\/07\/RndmGenAttrb.png 509w\" sizes=\"(max-width: 464px) 100vw, 464px\" \/><\/p>\n<p><img loading=\"lazy\" decoding=\"async\" class=\"wp-image-16189 aligncenter\" src=\"\/blog\/wp-content\/uploads\/2022\/07\/CompdDataDef-300x263.png\" alt=\"\" width=\"464\" height=\"407\" srcset=\"https:\/\/www.iri.com\/blog\/wp-content\/uploads\/2022\/07\/CompdDataDef-300x263.png 300w, https:\/\/www.iri.com\/blog\/wp-content\/uploads\/2022\/07\/CompdDataDef.png 513w\" sizes=\"(max-width: 464px) 100vw, 464px\" \/><\/p>\n<p><span style=\"font-weight: 400;\">When I click <\/span><i><span style=\"font-weight: 400;\">Finish<\/span><\/i><span style=\"font-weight: 400;\">, the RowGen\/SortCL code produced to generate the set file looks like this:<\/span><\/p>\n<p style=\"text-align: center;\"><img loading=\"lazy\" decoding=\"async\" class=\"alignnone wp-image-16192\" src=\"\/blog\/wp-content\/uploads\/2022\/07\/RowGenSortCLSetFile-300x85.png\" alt=\"\" width=\"717\" height=\"203\" srcset=\"https:\/\/www.iri.com\/blog\/wp-content\/uploads\/2022\/07\/RowGenSortCLSetFile-300x85.png 300w, https:\/\/www.iri.com\/blog\/wp-content\/uploads\/2022\/07\/RowGenSortCLSetFile.png 713w\" sizes=\"(max-width: 717px) 100vw, 717px\" \/><\/p>\n<p><span style=\"font-weight: 400;\">When I run the job to generate this set, I get these phone numbers:<\/span><\/p>\n<p style=\"text-align: center;\"><img loading=\"lazy\" decoding=\"async\" class=\"alignnone wp-image-16193\" src=\"\/blog\/wp-content\/uploads\/2022\/07\/phone-98x300.png\" alt=\"\" width=\"108\" height=\"331\" 
srcset=\"https:\/\/www.iri.com\/blog\/wp-content\/uploads\/2022\/07\/phone-98x300.png 98w, https:\/\/www.iri.com\/blog\/wp-content\/uploads\/2022\/07\/phone.png 103w\" sizes=\"(max-width: 108px) 100vw, 108px\" \/><\/p>\n<p><span style=\"font-weight: 400;\">which can now be used in other RowGen jobs to provide randomly selected test values in a phone number column, in FieldShield for pseudonymous phone number replacements, etc.<\/span><\/p>\n<h5><b>Date Range Generator<\/b><\/h5>\n<p><span style=\"font-weight: 400;\">The Date Range Generator wizard creates a set file containing a range of date or timestamp values in one of several default formats.\u00a0<\/span><\/p>\n<p><span style=\"font-weight: 400;\">The Date Range Generator wizard dialog has three options: the date or timestamp format, and the minimum and maximum values of the date range, which you can set between 1900 and 2199. Note that you decide whether the minimum or maximum value will be included as a selectable value.<\/span><\/p>\n<p><span style=\"font-weight: 400;\">When you click Finish, a ranged set file will be produced reflecting your specifications. For example, these settings:<\/span><\/p>\n<p style=\"text-align: center;\"><img loading=\"lazy\" decoding=\"async\" class=\"alignnone wp-image-16194\" src=\"\/blog\/wp-content\/uploads\/2022\/07\/DataRangeGen-300x99.png\" alt=\"\" width=\"513\" height=\"169\" srcset=\"https:\/\/www.iri.com\/blog\/wp-content\/uploads\/2022\/07\/DataRangeGen-300x99.png 300w, https:\/\/www.iri.com\/blog\/wp-content\/uploads\/2022\/07\/DataRangeGen.png 637w\" sizes=\"(max-width: 513px) 100vw, 513px\" \/><\/p>\n<p><span style=\"font-weight: 400;\">will generate the following single-line ranged set in a named .set file:<\/span><\/p>\n<p style=\"text-align: center;\"><span style=\"font-weight: 400;\">[01\/01\/1900 12:00:00 AM,12\/31\/2199 11:59:59 PM]<\/span><\/p>\n<p><span style=\"font-weight: 400;\">At runtime, values in this format will be drawn from within this range, either at random, or randomly within 
specified Distribution frequencies, per <\/span><a href=\"https:\/\/www.iri.com\/blog\/iri\/iri-workbench\/data-generation-rules-workbench\/\"><span style=\"font-weight: 400;\">this article<\/span><\/a><span style=\"font-weight: 400;\"> on data generation rules.<\/span><\/p>\n<h5><b>Pseudo Hash Set<\/b><\/h5>\n<p style=\"text-align: center;\"><img loading=\"lazy\" decoding=\"async\" class=\"alignnone wp-image-16195\" src=\"\/blog\/wp-content\/uploads\/2022\/07\/PseudoHashSet-300x195.png\" alt=\"\" width=\"471\" height=\"306\" srcset=\"https:\/\/www.iri.com\/blog\/wp-content\/uploads\/2022\/07\/PseudoHashSet-300x195.png 300w, https:\/\/www.iri.com\/blog\/wp-content\/uploads\/2022\/07\/PseudoHashSet.png 509w\" sizes=\"(max-width: 471px) 100vw, 471px\" \/><\/p>\n<p><span style=\"font-weight: 400;\">The Pseudonym Hash Set Creation Wizard creates a specially formatted pseudonym replacement set file where the first column contains a lookup list that has been hashed. This specially formatted pseudo set file is meant to be used in conjunction with a Pseudonym Hash Replacement Rule.<\/span><\/p>\n<p><span style=\"font-weight: 400;\">Using a pseudo hash set file with a pseudo hash replacement rule offers two advantages. First, because the lookup list is hashed, the lookup values do not need to be stored securely.\u00a0<\/span><\/p>\n<p><span style=\"font-weight: 400;\">Second, a pseudonym hash replacement rule is more flexible than regular pseudonym replacement because it can return not only an exact match but also the closest possible match. 
This frees users from having to continually update and expand their lookup list.<\/span><\/p>\n<p style=\"text-align: center;\"><span style=\"font-weight: 400;\">\u00a0<img loading=\"lazy\" decoding=\"async\" class=\"alignnone wp-image-16196\" src=\"\/blog\/wp-content\/uploads\/2022\/07\/PseudoHashPseudo-230x300.png\" alt=\"\" width=\"319\" height=\"416\" srcset=\"https:\/\/www.iri.com\/blog\/wp-content\/uploads\/2022\/07\/PseudoHashPseudo-230x300.png 230w, https:\/\/www.iri.com\/blog\/wp-content\/uploads\/2022\/07\/PseudoHashPseudo.png 357w\" sizes=\"(max-width: 319px) 100vw, 319px\" \/><\/span><\/p>\n<p><i><span style=\"font-weight: 400;\">For more information and an example of this type of set file, see <a href=\"https:\/\/www.iri.com\/blog\/data-protection\/pseudonym-hash-set-file-creation-wizard\/\">this article<\/a>.<\/span><\/i><\/p>\n<h5><b>Email Generator<\/b><\/h5>\n<p><span style=\"font-weight: 400;\">The Email Generator wizard synthesizes random email address values with custom-defined size ranges and domains. The emails can be used to populate test target columns or replace real email addresses during masking jobs.<\/span><\/p>\n<p style=\"text-align: center;\"><img loading=\"lazy\" decoding=\"async\" class=\"alignnone wp-image-16197\" src=\"\/blog\/wp-content\/uploads\/2022\/07\/EmailGen-300x219.png\" alt=\"\" width=\"509\" height=\"371\" srcset=\"https:\/\/www.iri.com\/blog\/wp-content\/uploads\/2022\/07\/EmailGen-300x219.png 300w, https:\/\/www.iri.com\/blog\/wp-content\/uploads\/2022\/07\/EmailGen.png 608w\" sizes=\"(max-width: 509px) 100vw, 509px\" \/><\/p>\n<p><span style=\"font-weight: 400;\">Set Size defines the number of records to be generated in a set file. Field size minimum and maximum define the inclusive range for the number of characters in each generated email address.\u00a0<\/span><\/p>\n<p><span style=\"font-weight: 400;\">There are several options for the mail server, including Gmail and ATT. 
For the final part of the Domain, you can select .com, .edu, .net, etc. A country domain can also be added; e.g., .in, .ae.<\/span><\/p>\n<p><span style=\"font-weight: 400;\">To preview the address, click <\/span><i><span style=\"font-weight: 400;\">Generate<\/span><\/i><span style=\"font-weight: 400;\">. Click <\/span><i><span style=\"font-weight: 400;\">Finish <\/span><\/i><span style=\"font-weight: 400;\">to build an output (set) file like this:<\/span><\/p>\n<p style=\"text-align: center;\"><img loading=\"lazy\" decoding=\"async\" class=\"alignnone wp-image-16198\" src=\"\/blog\/wp-content\/uploads\/2022\/07\/NewEmail_set-206x300.png\" alt=\"\" width=\"313\" height=\"456\" srcset=\"https:\/\/www.iri.com\/blog\/wp-content\/uploads\/2022\/07\/NewEmail_set-206x300.png 206w, https:\/\/www.iri.com\/blog\/wp-content\/uploads\/2022\/07\/NewEmail_set.png 361w\" sizes=\"(max-width: 313px) 100vw, 313px\" \/><\/p>\n<h5><b>Pseudo Set<\/b><\/h5>\n<p><span style=\"font-weight: 400;\">The Pseudo Set wizard creates a two-column set file that can be used for reversible pseudonymization. The data in the lookup column serves as the original value, and the data in the results column is its consistent replacement in a static pseudonymization job.<\/span><\/p>\n<p><span style=\"font-weight: 400;\">The sources of data for each column can be different. 
For both the Lookups and the Results Column, you can <\/span><i><span style=\"font-weight: 400;\">Add \u2026<\/span><\/i><span style=\"font-weight: 400;\"> values from flat files, ODBC-connected database tables, or data streamed from a supported URL connection like a file in HDFS or a cloud bucket, MongoDB, HTTPS, FTP, etc.<\/span><\/p>\n<p style=\"text-align: center;\"><img loading=\"lazy\" decoding=\"async\" class=\"alignnone wp-image-16199\" src=\"\/blog\/wp-content\/uploads\/2022\/07\/TypeURL-300x67.png\" alt=\"\" width=\"443\" height=\"99\" srcset=\"https:\/\/www.iri.com\/blog\/wp-content\/uploads\/2022\/07\/TypeURL-300x67.png 300w, https:\/\/www.iri.com\/blog\/wp-content\/uploads\/2022\/07\/TypeURL.png 510w\" sizes=\"(max-width: 443px) 100vw, 443px\" \/><\/p>\n<p><span style=\"font-weight: 400;\">A default value for a lookup can be specified. The default value is produced if the field value has no match in the lookup column.<\/span><\/p>\n<p style=\"text-align: center;\"><img loading=\"lazy\" decoding=\"async\" class=\"alignnone wp-image-16201\" src=\"\/blog\/wp-content\/uploads\/2022\/07\/SourceLocation-300x298.png\" alt=\"\" width=\"460\" height=\"457\" srcset=\"https:\/\/www.iri.com\/blog\/wp-content\/uploads\/2022\/07\/SourceLocation-300x298.png 300w, https:\/\/www.iri.com\/blog\/wp-content\/uploads\/2022\/07\/SourceLocation-150x150.png 150w, https:\/\/www.iri.com\/blog\/wp-content\/uploads\/2022\/07\/SourceLocation-70x70.png 70w, https:\/\/www.iri.com\/blog\/wp-content\/uploads\/2022\/07\/SourceLocation.png 532w\" sizes=\"(max-width: 460px) 100vw, 460px\" \/><\/p>\n<p><span style=\"font-weight: 400;\">In this example, my source values will come from the PATIENT_FIRST_NAME column in an Oracle table called CLAIMS3 (with a predefined DDF file).\u00a0<\/span><\/p>\n<p style=\"text-align: center;\"><img loading=\"lazy\" decoding=\"async\" class=\"alignnone wp-image-16202\" src=\"\/blog\/wp-content\/uploads\/2022\/07\/PatientFirstName-300x286.png\" alt=\"\" 
width=\"460\" height=\"438\" srcset=\"https:\/\/www.iri.com\/blog\/wp-content\/uploads\/2022\/07\/PatientFirstName-300x286.png 300w, https:\/\/www.iri.com\/blog\/wp-content\/uploads\/2022\/07\/PatientFirstName.png 518w\" sizes=\"(max-width: 460px) 100vw, 460px\" \/><\/p>\n<p><span style=\"font-weight: 400;\">The pseudonyms in the second column will be drawn from the MEMBER_FULL_NAME column in that same table.\u00a0<\/span><\/p>\n<p style=\"text-align: center;\"><img loading=\"lazy\" decoding=\"async\" class=\"alignnone wp-image-16203\" src=\"\/blog\/wp-content\/uploads\/2022\/07\/MemberFullName-274x300.png\" alt=\"\" width=\"469\" height=\"514\" srcset=\"https:\/\/www.iri.com\/blog\/wp-content\/uploads\/2022\/07\/MemberFullName-274x300.png 274w, https:\/\/www.iri.com\/blog\/wp-content\/uploads\/2022\/07\/MemberFullName.png 493w\" sizes=\"(max-width: 469px) 100vw, 469px\" \/><\/p>\n<p style=\"text-align: left;\"><span style=\"font-weight: 400;\">When I click <\/span><i><span style=\"font-weight: 400;\">Finish<\/span><\/i><span style=\"font-weight: 400;\">, the wizard builds and runs a batch file with three task scripts that create the pseudo set file. 
My workspace below shows the workflow diagram of the batch file, one of the scripts and its corresponding mapping diagram, and the resulting lookup set file.\u00a0<\/span><\/p>\n<p style=\"text-align: center;\"><img loading=\"lazy\" decoding=\"async\" class=\"alignnone wp-image-16204\" src=\"\/blog\/wp-content\/uploads\/2022\/07\/workflow-diagrambatch-file-300x163.png\" alt=\"\" width=\"804\" height=\"437\" srcset=\"https:\/\/www.iri.com\/blog\/wp-content\/uploads\/2022\/07\/workflow-diagrambatch-file-300x163.png 300w, https:\/\/www.iri.com\/blog\/wp-content\/uploads\/2022\/07\/workflow-diagrambatch-file-1024x555.png 1024w, https:\/\/www.iri.com\/blog\/wp-content\/uploads\/2022\/07\/workflow-diagrambatch-file-768x416.png 768w, https:\/\/www.iri.com\/blog\/wp-content\/uploads\/2022\/07\/workflow-diagrambatch-file-1536x832.png 1536w, https:\/\/www.iri.com\/blog\/wp-content\/uploads\/2022\/07\/workflow-diagrambatch-file.png 1600w\" sizes=\"(max-width: 804px) 100vw, 804px\" \/><\/p>\n<p><span style=\"font-weight: 400;\">Note that the saved batch file could be incorporated into a masking workflow too, such that the set file is recreated just before a masking job runs so that there would be no missing values. However, this will change the random scrambling of the values in the replacement column, so it would not be <\/span><a href=\"https:\/\/www.iri.com\/blog\/data-protection\/consistent-cross-table-data-pseudonymization\/\"><span style=\"font-weight: 400;\">consistent from job to job<\/span><\/a><span style=\"font-weight: 400;\">.<\/span><\/p>\n<h5><b><\/b><b>Pseudo Set from Column<\/b><\/h5>\n<p><span style=\"font-weight: 400;\">The Pseudo Set from Column wizard<\/span> <span style=\"font-weight: 400;\">is very similar to the Pseudo set wizard, in that it also produces a two-column set file suitable for pseudonymization. 
However, this wizard is purpose-built for creating a pseudo set from data in one or more source DB columns, and makes that task easier.<\/span><\/p>\n<p><span style=\"font-weight: 400;\">This wizard builds the set by reading and sorting table data for the first (lookup) set file column, and then scrambling those same values for use as pseudonyms in the second column. It also allows you to add data from other tables (which have more of the same kind of values, like last names) and scramble those into the pseudonym column, too.<\/span><\/p>\n<p><span style=\"font-weight: 400;\">On the Define Destination screen, type in the name for the job script that will create the set file when run. Check <\/span><i><span style=\"font-weight: 400;\">Save script to generate the set file <\/span><\/i><span style=\"font-weight: 400;\">and <\/span><i><span style=\"font-weight: 400;\">Execute script on finish <\/span><\/i><span style=\"font-weight: 400;\">to actually produce that file. Saving the scripts that generate the set file makes it easy to include set file creation in a data masking Flow later.<\/span><\/p>\n<p><span style=\"font-weight: 400;\">Because this wizard removes all duplicates from the selected columns, each original value maps to a unique pseudonym, so the replacements are suitable for restoration later. 
Checking the box to <\/span><i><span style=\"font-weight: 400;\">Create restore set also<\/span><\/i><span style=\"font-weight: 400;\"> creates an inverse restore set that can be used to recover the original names after pseudonymization.\u00a0<\/span><\/p>\n<p style=\"text-align: center;\"><img loading=\"lazy\" decoding=\"async\" class=\"alignnone wp-image-16205\" src=\"\/blog\/wp-content\/uploads\/2022\/07\/DefineDestination-300x209.png\" alt=\"\" width=\"588\" height=\"410\" srcset=\"https:\/\/www.iri.com\/blog\/wp-content\/uploads\/2022\/07\/DefineDestination-300x209.png 300w, https:\/\/www.iri.com\/blog\/wp-content\/uploads\/2022\/07\/DefineDestination.png 745w\" sizes=\"(max-width: 588px) 100vw, 588px\" \/><\/p>\n<p><span style=\"font-weight: 400;\">Click Next to go to the Pseudo Column Selection page. Select a Connection Profile (in my case DB2), and then one or more tables with columns to use in your set file. In this case, I chose the <\/span><i><span style=\"font-weight: 400;\">President<\/span><\/i><span style=\"font-weight: 400;\"> column from the <\/span><i><span style=\"font-weight: 400;\">Dems<\/span><\/i><span style=\"font-weight: 400;\"> table and <\/span><i><span style=\"font-weight: 400;\">Name <\/span><\/i><span style=\"font-weight: 400;\">from the <\/span><i><span style=\"font-weight: 400;\">Customers_Flow<\/span><\/i><span style=\"font-weight: 400;\"> table:<\/span><\/p>\n<p style=\"text-align: center;\"><img loading=\"lazy\" decoding=\"async\" class=\"alignnone wp-image-16206\" src=\"\/blog\/wp-content\/uploads\/2022\/07\/PseudoColumnSel-300x208.png\" alt=\"\" width=\"580\" height=\"402\" srcset=\"https:\/\/www.iri.com\/blog\/wp-content\/uploads\/2022\/07\/PseudoColumnSel-300x208.png 300w, https:\/\/www.iri.com\/blog\/wp-content\/uploads\/2022\/07\/PseudoColumnSel.png 746w\" sizes=\"(max-width: 580px) 100vw, 580px\" \/><\/p>\n<p><span style=\"font-weight: 400;\">When you click <\/span><i><span style=\"font-weight: 400;\">Finish 
<\/span><\/i><span style=\"font-weight: 400;\">a batch file is created. When run, it creates several artifacts: the job scripts: newPseudoSet_Lookup.scl and newPseudoSet_Pseudo.scl, a newPseudoSet_Pseudo.set, newPseudoset.bat, and new PseudoSet.flow.<\/span><\/p>\n<p><span style=\"font-weight: 400;\">This Workbench screenshot shows all the job aspects, including the set file:<\/span><\/p>\n<p style=\"text-align: center;\"><img loading=\"lazy\" decoding=\"async\" class=\"alignnone wp-image-16207\" src=\"\/blog\/wp-content\/uploads\/2022\/07\/WorkbchScrnshot-300x166.png\" alt=\"\" width=\"853\" height=\"472\" srcset=\"https:\/\/www.iri.com\/blog\/wp-content\/uploads\/2022\/07\/WorkbchScrnshot-300x166.png 300w, https:\/\/www.iri.com\/blog\/wp-content\/uploads\/2022\/07\/WorkbchScrnshot-1024x568.png 1024w, https:\/\/www.iri.com\/blog\/wp-content\/uploads\/2022\/07\/WorkbchScrnshot-768x426.png 768w, https:\/\/www.iri.com\/blog\/wp-content\/uploads\/2022\/07\/WorkbchScrnshot-1536x852.png 1536w, https:\/\/www.iri.com\/blog\/wp-content\/uploads\/2022\/07\/WorkbchScrnshot.png 1110w\" sizes=\"(max-width: 853px) 100vw, 853px\" \/><\/p>\n<h5><b>Set Range or Literal<\/b><\/h5>\n<p><span style=\"font-weight: 400;\">This wizard creates a set file denoting a range of numeric values to select from, or a list of literally specified ASCII values. Either way, a test data synthesis job in RowGen would draw values at random when this set file is specified within an input \/FIELD.\u00a0<\/span><\/p>\n<p><span style=\"font-weight: 400;\">To include the endpoint values in a range of randomly generated numbers, check the option <\/span><i><span style=\"font-weight: 400;\">Including this value<\/span><\/i><span style=\"font-weight: 400;\">. 
For example:<\/span><\/p>\n<p style=\"text-align: center;\"><img loading=\"lazy\" decoding=\"async\" class=\"alignnone wp-image-16208\" src=\"\/blog\/wp-content\/uploads\/2022\/07\/SetFileEntries-300x43.png\" alt=\"\" width=\"474\" height=\"68\" srcset=\"https:\/\/www.iri.com\/blog\/wp-content\/uploads\/2022\/07\/SetFileEntries-300x43.png 300w, https:\/\/www.iri.com\/blog\/wp-content\/uploads\/2022\/07\/SetFileEntries.png 613w\" sizes=\"(max-width: 474px) 100vw, 474px\" \/><\/p>\n<p><span style=\"font-weight: 400;\">The set file built in this case will contain just one item: [1,1000]. You can also specify a range between 1 and 1,000 that does not include the endpoint values: (1,1000). A square bracket indicates <\/span><i><span style=\"font-weight: 400;\">inclusive<\/span><\/i><span style=\"font-weight: 400;\"> while a parenthesis indicates <\/span><i><span style=\"font-weight: 400;\">exclusive<\/span><\/i><span style=\"font-weight: 400;\">.<\/span><\/p>\n<p><span style=\"font-weight: 400;\">More fully shown, a numeric set file can be specified in the wizard this way:<\/span><\/p>\n<p style=\"text-align: center;\"><img loading=\"lazy\" decoding=\"async\" class=\"alignnone wp-image-16209\" src=\"\/blog\/wp-content\/uploads\/2022\/07\/CrtRangeLitVal-300x225.png\" alt=\"\" width=\"498\" height=\"374\" srcset=\"https:\/\/www.iri.com\/blog\/wp-content\/uploads\/2022\/07\/CrtRangeLitVal-300x225.png 300w, https:\/\/www.iri.com\/blog\/wp-content\/uploads\/2022\/07\/CrtRangeLitVal.png 611w\" sizes=\"(max-width: 498px) 100vw, 498px\" \/><\/p>\n<p><span style=\"font-weight: 400;\">An example of a numeric set file is below. 
It illustrates a requirement where you might want numbers in a range of one to twenty, with most of the values clustered around ten.<\/span><\/p>\n<p style=\"text-align: center;\"><img loading=\"lazy\" decoding=\"async\" class=\"alignnone wp-image-16210\" src=\"\/blog\/wp-content\/uploads\/2022\/07\/newIntRangset.png\" alt=\"\" width=\"146\" height=\"146\" srcset=\"https:\/\/www.iri.com\/blog\/wp-content\/uploads\/2022\/07\/newIntRangset.png 128w, https:\/\/www.iri.com\/blog\/wp-content\/uploads\/2022\/07\/newIntRangset-70x70.png 70w\" sizes=\"(max-width: 146px) 100vw, 146px\" \/><\/p>\n<p><span style=\"font-weight: 400;\">For each generated row of data, one of the five set file rows will be selected. If the value of that row is a single number, then that number is used. If the one row with the range is selected, a random value between one and twenty, inclusive, will be selected.<\/span><\/p>\n<p><span style=\"font-weight: 400;\">An example with specified literal values, on the other hand, is shown here:<\/span><\/p>\n<p style=\"text-align: center;\"><img loading=\"lazy\" decoding=\"async\" class=\"alignnone wp-image-16211\" src=\"\/blog\/wp-content\/uploads\/2022\/07\/RangeFileAscii-300x225.png\" alt=\"\" width=\"507\" height=\"380\" srcset=\"https:\/\/www.iri.com\/blog\/wp-content\/uploads\/2022\/07\/RangeFileAscii-300x225.png 300w, https:\/\/www.iri.com\/blog\/wp-content\/uploads\/2022\/07\/RangeFileAscii.png 611w\" sizes=\"(max-width: 507px) 100vw, 507px\" \/><\/p>\n<p><span style=\"font-weight: 400;\">The example below shows a set file with ASCII, or string, entries. 
With ASCII, each row is a single value; there are no ranges.<\/span><\/p>\n<p style=\"text-align: center;\"><img loading=\"lazy\" decoding=\"async\" class=\"alignnone wp-image-16212\" src=\"\/blog\/wp-content\/uploads\/2022\/07\/setfileAscii.png\" alt=\"\" width=\"160\" height=\"162\" srcset=\"https:\/\/www.iri.com\/blog\/wp-content\/uploads\/2022\/07\/setfileAscii.png 133w, https:\/\/www.iri.com\/blog\/wp-content\/uploads\/2022\/07\/setfileAscii-70x70.png 70w\" sizes=\"(max-width: 160px) 100vw, 160px\" \/><\/p>\n<h5 style=\"text-align: left;\"><b>Set from Column<\/b><\/h5>\n<p><span style=\"font-weight: 400;\">This wizard can create a set file using the values in one or more database columns:<\/span><\/p>\n<p style=\"text-align: center;\"><img loading=\"lazy\" decoding=\"async\" class=\"alignnone wp-image-16213\" src=\"\/blog\/wp-content\/uploads\/2022\/07\/SetFileNewSet-300x279.png\" alt=\"\" width=\"475\" height=\"442\" srcset=\"https:\/\/www.iri.com\/blog\/wp-content\/uploads\/2022\/07\/SetFileNewSet-300x279.png 300w, https:\/\/www.iri.com\/blog\/wp-content\/uploads\/2022\/07\/SetFileNewSet.png 612w\" sizes=\"(max-width: 475px) 100vw, 475px\" \/><\/p>\n<p><span style=\"font-weight: 400;\">After specifying the set file folder and name you are building, click <\/span><i><span style=\"font-weight: 400;\">Next <\/span><\/i><span style=\"font-weight: 400;\">&gt; to specify the source table and column for the data set. 
In this case, I want the values from four columns in my (tab-delimited) set file to be drawn from the CLAIMS3 table in my Oracle Database; i.e., <\/span><i><span style=\"font-weight: 400;\">Claim_Number<\/span><\/i><span style=\"font-weight: 400;\">,<\/span><i><span style=\"font-weight: 400;\"> Patient_First_Name<\/span><\/i><span style=\"font-weight: 400;\">,<\/span><i><span style=\"font-weight: 400;\"> Patient_Last_Name<\/span><\/i><span style=\"font-weight: 400;\">,<\/span> <span style=\"font-weight: 400;\">and <\/span><i><span style=\"font-weight: 400;\">Date_Of_Service<\/span><\/i><span style=\"font-weight: 400;\">.<\/span><\/p>\n<p style=\"text-align: center;\"><img loading=\"lazy\" decoding=\"async\" class=\"alignnone wp-image-16214\" src=\"\/blog\/wp-content\/uploads\/2022\/07\/SetFilefromColumnt-300x279.png\" alt=\"\" width=\"502\" height=\"467\" srcset=\"https:\/\/www.iri.com\/blog\/wp-content\/uploads\/2022\/07\/SetFilefromColumnt-300x279.png 300w, https:\/\/www.iri.com\/blog\/wp-content\/uploads\/2022\/07\/SetFilefromColumnt.png 629w\" sizes=\"(max-width: 502px) 100vw, 502px\" \/><\/p>\n<p><span style=\"font-weight: 400;\">The connection profile dropdown menu displays active databases in the Data Connection Registry in Workbench Preferences. From there, the available tables will be shown, and then the columns in that table.<\/span><\/p>\n<p><span style=\"font-weight: 400;\">In this case, I sequentially added columns from the CLAIMS3 table and see the data in them previewed below. 
The Row Limit option refers to the number of rows to select from the table.<\/span><\/p>\n<p><span style=\"font-weight: 400;\">On <\/span><i><span style=\"font-weight: 400;\">Finish<\/span><\/i><span style=\"font-weight: 400;\">, a new set file like this one is built:<\/span><\/p>\n<p style=\"text-align: center;\"><img loading=\"lazy\" decoding=\"async\" class=\"alignnone wp-image-16215\" src=\"\/blog\/wp-content\/uploads\/2022\/07\/NewSetFileFinish-300x291.png\" alt=\"\" width=\"564\" height=\"547\" srcset=\"https:\/\/www.iri.com\/blog\/wp-content\/uploads\/2022\/07\/NewSetFileFinish-300x291.png 300w, https:\/\/www.iri.com\/blog\/wp-content\/uploads\/2022\/07\/NewSetFileFinish-768x745.png 768w, https:\/\/www.iri.com\/blog\/wp-content\/uploads\/2022\/07\/NewSetFileFinish.png 813w\" sizes=\"(max-width: 564px) 100vw, 564px\" \/><\/p>\n<p><span style=\"font-weight: 400;\">If you have any questions or need help building or using set files with IRI Voracity-compatible software, please contact <\/span><a href=\"mailto:support@iri.com\"><span style=\"font-weight: 400;\">support@iri.com<\/span><\/a><span style=\"font-weight: 400;\">.<\/span><\/p>\n","protected":false},"excerpt":{"rendered":"<p>As discussed in our primer article, set files are used to furnish data for a variety of IRI Voracity-compatible applications, like CoSort, FieldShield, DarkShield, NextForm, and RowGen. 
Set files are usually text files with rows of single- or multi-byte characters or numeric values in one or more tab-separated columns, where each row is separated by<\/p>\n<div><a class=\"btn-filled btn\" href=\"https:\/\/www.iri.com\/blog\/iri\/iri-workbench\/creating-set-files-in-iri-workbench\/\" title=\"Creating Set Files in IRI Workbench\">Read More<\/a><\/div>\n","protected":false},"author":53,"featured_media":16165,"comment_status":"open","ping_status":"open","sticky":false,"template":"","format":"standard","meta":{"_acf_changed":false,"_exactmetrics_skip_tracking":false,"_exactmetrics_sitenote_active":false,"_exactmetrics_sitenote_note":"","_exactmetrics_sitenote_category":0,"footnotes":""},"categories":[8,1,91,29],"tags":[850,785],"class_list":["post-16157","post","type-post","status-publish","format-standard","has-post-thumbnail","hentry","category-data-protection","category-data-transformation2","category-iri-workbench","category-test-data","tag-iri-workbench","tag-set-files"],"acf":[],"yoast_head":"<!-- This site is optimized with the Yoast SEO Premium plugin v23.4 (Yoast SEO v23.4) - https:\/\/yoast.com\/wordpress\/plugins\/seo\/ -->\n<title>Creating Set Files in IRI Workbench - IRI<\/title>\n<meta name=\"description\" content=\"As discussed in our primer article, set files are used to furnish data for a variety of IRI Voracity-compatible applications, like CoSort, FieldShield, DarkShield, NextForm, and RowGen. Set files are usually text files with rows of single- or multi-byte characters or numeric values in one or more tab-separated columns, where each row is separated by a new line character.\u00a0Alternatively, set files can contain one more literal range of values (e.g., [100-999]). 
For the purposes of this article, we will discuss the more conventional set file with a list of values, and named with a .set extension.\" \/>\n<meta name=\"robots\" content=\"index, follow, max-snippet:-1, max-image-preview:large, max-video-preview:-1\" \/>\n<link rel=\"canonical\" href=\"https:\/\/www.iri.com\/blog\/data-transformation2\/creating-set-files-in-iri-workbench\/\" \/>\n<meta property=\"og:locale\" content=\"en_US\" \/>\n<meta property=\"og:type\" content=\"article\" \/>\n<meta property=\"og:title\" content=\"Creating Set Files in IRI Workbench\" \/>\n<meta property=\"og:description\" content=\"As discussed in our primer article, set files are used to furnish data for a variety of IRI Voracity-compatible applications, like CoSort, FieldShield, DarkShield, NextForm, and RowGen. Set files are usually text files with rows of single- or multi-byte characters or numeric values in one or more tab-separated columns, where each row is separated by a new line character.\u00a0Alternatively, set files can contain one more literal range of values (e.g., [100-999]). 
For the purposes of this article, we will discuss the more conventional set file with a list of values, and named with a .set extension.\" \/>\n<meta property=\"og:url\" content=\"https:\/\/www.iri.com\/blog\/data-transformation2\/creating-set-files-in-iri-workbench\/\" \/>\n<meta property=\"og:site_name\" content=\"IRI\" \/>\n<meta property=\"article:published_time\" content=\"2022-07-13T21:23:15+00:00\" \/>\n<meta property=\"article:modified_time\" content=\"2022-10-12T19:05:00+00:00\" \/>\n<meta property=\"og:image\" content=\"https:\/\/www.iri.com\/blog\/wp-content\/uploads\/2022\/07\/FtrImgCrtSetFile.png\" \/>\n\t<meta property=\"og:image:width\" content=\"1238\" \/>\n\t<meta property=\"og:image:height\" content=\"594\" \/>\n\t<meta property=\"og:image:type\" content=\"image\/png\" \/>\n<meta name=\"author\" content=\"Chaitali Mitra\" \/>\n<meta name=\"twitter:card\" content=\"summary_large_image\" \/>\n<meta name=\"twitter:label1\" content=\"Written by\" \/>\n\t<meta name=\"twitter:data1\" content=\"Chaitali Mitra\" \/>\n\t<meta name=\"twitter:label2\" content=\"Est. 
reading time\" \/>\n\t<meta name=\"twitter:data2\" content=\"20 minutes\" \/>\n<script type=\"application\/ld+json\" class=\"yoast-schema-graph\">{\"@context\":\"https:\/\/schema.org\",\"@graph\":[{\"@type\":\"Article\",\"@id\":\"https:\/\/www.iri.com\/blog\/data-transformation2\/creating-set-files-in-iri-workbench\/#article\",\"isPartOf\":{\"@id\":\"https:\/\/www.iri.com\/blog\/data-transformation2\/creating-set-files-in-iri-workbench\/\"},\"author\":{\"name\":\"Chaitali Mitra\",\"@id\":\"https:\/\/www.iri.com\/blog\/#\/schema\/person\/9bae14a309616863b027c2d56f532caf\"},\"headline\":\"Creating Set Files in IRI Workbench\",\"datePublished\":\"2022-07-13T21:23:15+00:00\",\"dateModified\":\"2022-10-12T19:05:00+00:00\",\"mainEntityOfPage\":{\"@id\":\"https:\/\/www.iri.com\/blog\/data-transformation2\/creating-set-files-in-iri-workbench\/\"},\"wordCount\":2607,\"commentCount\":0,\"publisher\":{\"@id\":\"https:\/\/www.iri.com\/blog\/#organization\"},\"image\":{\"@id\":\"https:\/\/www.iri.com\/blog\/data-transformation2\/creating-set-files-in-iri-workbench\/#primaryimage\"},\"thumbnailUrl\":\"https:\/\/www.iri.com\/blog\/wp-content\/uploads\/2022\/07\/FtrImgCrtSetFile.png\",\"keywords\":[\"IRI Workbench\",\"set files\"],\"articleSection\":[\"Data Masking\/Protection\",\"Data Transformation\",\"IRI Workbench\",\"Test Data\"],\"inLanguage\":\"en-US\",\"potentialAction\":[{\"@type\":\"CommentAction\",\"name\":\"Comment\",\"target\":[\"https:\/\/www.iri.com\/blog\/data-transformation2\/creating-set-files-in-iri-workbench\/#respond\"]}]},{\"@type\":\"WebPage\",\"@id\":\"https:\/\/www.iri.com\/blog\/data-transformation2\/creating-set-files-in-iri-workbench\/\",\"url\":\"https:\/\/www.iri.com\/blog\/data-transformation2\/creating-set-files-in-iri-workbench\/\",\"name\":\"Creating Set Files in IRI Workbench - 
IRI\",\"isPartOf\":{\"@id\":\"https:\/\/www.iri.com\/blog\/#website\"},\"primaryImageOfPage\":{\"@id\":\"https:\/\/www.iri.com\/blog\/data-transformation2\/creating-set-files-in-iri-workbench\/#primaryimage\"},\"image\":{\"@id\":\"https:\/\/www.iri.com\/blog\/data-transformation2\/creating-set-files-in-iri-workbench\/#primaryimage\"},\"thumbnailUrl\":\"https:\/\/www.iri.com\/blog\/wp-content\/uploads\/2022\/07\/FtrImgCrtSetFile.png\",\"datePublished\":\"2022-07-13T21:23:15+00:00\",\"dateModified\":\"2022-10-12T19:05:00+00:00\",\"description\":\"As discussed in our primer article, set files are used to furnish data for a variety of IRI Voracity-compatible applications, like CoSort, FieldShield, DarkShield, NextForm, and RowGen. Set files are usually text files with rows of single- or multi-byte characters or numeric values in one or more tab-separated columns, where each row is separated by a new line character.\u00a0Alternatively, set files can contain one more literal range of values (e.g., [100-999]). 
For the purposes of this article, we will discuss the more conventional set file with a list of values, and named with a .set extension.\",\"breadcrumb\":{\"@id\":\"https:\/\/www.iri.com\/blog\/data-transformation2\/creating-set-files-in-iri-workbench\/#breadcrumb\"},\"inLanguage\":\"en-US\",\"potentialAction\":[{\"@type\":\"ReadAction\",\"target\":[\"https:\/\/www.iri.com\/blog\/data-transformation2\/creating-set-files-in-iri-workbench\/\"]}]},{\"@type\":\"ImageObject\",\"inLanguage\":\"en-US\",\"@id\":\"https:\/\/www.iri.com\/blog\/data-transformation2\/creating-set-files-in-iri-workbench\/#primaryimage\",\"url\":\"https:\/\/www.iri.com\/blog\/wp-content\/uploads\/2022\/07\/FtrImgCrtSetFile.png\",\"contentUrl\":\"https:\/\/www.iri.com\/blog\/wp-content\/uploads\/2022\/07\/FtrImgCrtSetFile.png\",\"width\":1238,\"height\":594},{\"@type\":\"BreadcrumbList\",\"@id\":\"https:\/\/www.iri.com\/blog\/data-transformation2\/creating-set-files-in-iri-workbench\/#breadcrumb\",\"itemListElement\":[{\"@type\":\"ListItem\",\"position\":1,\"name\":\"Home\",\"item\":\"https:\/\/www.iri.com\/blog\/\"},{\"@type\":\"ListItem\",\"position\":2,\"name\":\"Creating Set Files in IRI Workbench\"}]},{\"@type\":\"WebSite\",\"@id\":\"https:\/\/www.iri.com\/blog\/#website\",\"url\":\"https:\/\/www.iri.com\/blog\/\",\"name\":\"IRI\",\"description\":\"Total Data Management 
Blog\",\"publisher\":{\"@id\":\"https:\/\/www.iri.com\/blog\/#organization\"},\"potentialAction\":[{\"@type\":\"SearchAction\",\"target\":{\"@type\":\"EntryPoint\",\"urlTemplate\":\"https:\/\/www.iri.com\/blog\/?s={search_term_string}\"},\"query-input\":{\"@type\":\"PropertyValueSpecification\",\"valueRequired\":true,\"valueName\":\"search_term_string\"}}],\"inLanguage\":\"en-US\"},{\"@type\":\"Organization\",\"@id\":\"https:\/\/www.iri.com\/blog\/#organization\",\"name\":\"IRI\",\"url\":\"https:\/\/www.iri.com\/blog\/\",\"logo\":{\"@type\":\"ImageObject\",\"inLanguage\":\"en-US\",\"@id\":\"https:\/\/www.iri.com\/blog\/#\/schema\/logo\/image\/\",\"url\":\"https:\/\/www.iri.com\/blog\/wp-content\/uploads\/2019\/02\/iri-logo-total-data-management-small-1.png\",\"contentUrl\":\"https:\/\/www.iri.com\/blog\/wp-content\/uploads\/2019\/02\/iri-logo-total-data-management-small-1.png\",\"width\":750,\"height\":206,\"caption\":\"IRI\"},\"image\":{\"@id\":\"https:\/\/www.iri.com\/blog\/#\/schema\/logo\/image\/\"}},{\"@type\":\"Person\",\"@id\":\"https:\/\/www.iri.com\/blog\/#\/schema\/person\/9bae14a309616863b027c2d56f532caf\",\"name\":\"Chaitali Mitra\",\"image\":{\"@type\":\"ImageObject\",\"inLanguage\":\"en-US\",\"@id\":\"https:\/\/www.iri.com\/blog\/#\/schema\/person\/image\/\",\"url\":\"https:\/\/secure.gravatar.com\/avatar\/95a11f3d0b709c00df3262bab0152f3a?s=96&d=blank&r=g\",\"contentUrl\":\"https:\/\/secure.gravatar.com\/avatar\/95a11f3d0b709c00df3262bab0152f3a?s=96&d=blank&r=g\",\"caption\":\"Chaitali Mitra\"},\"sameAs\":[\"http:\/\/www.iri.com\"],\"url\":\"https:\/\/www.iri.com\/blog\/author\/chaitalim\/\"}]}<\/script>\n<!-- \/ Yoast SEO Premium plugin. -->","yoast_head_json":{"title":"Creating Set Files in IRI Workbench - IRI","description":"As discussed in our primer article, set files are used to furnish data for a variety of IRI Voracity-compatible applications, like CoSort, FieldShield, DarkShield, NextForm, and RowGen. 
Set files are usually text files with rows of single- or multi-byte characters or numeric values in one or more tab-separated columns, where each row is separated by a new line character.\u00a0Alternatively, set files can contain one more literal range of values (e.g., [100-999]). For the purposes of this article, we will discuss the more conventional set file with a list of values, and named with a .set extension.","robots":{"index":"index","follow":"follow","max-snippet":"max-snippet:-1","max-image-preview":"max-image-preview:large","max-video-preview":"max-video-preview:-1"},"canonical":"https:\/\/www.iri.com\/blog\/data-transformation2\/creating-set-files-in-iri-workbench\/","og_locale":"en_US","og_type":"article","og_title":"Creating Set Files in IRI Workbench","og_description":"As discussed in our primer article, set files are used to furnish data for a variety of IRI Voracity-compatible applications, like CoSort, FieldShield, DarkShield, NextForm, and RowGen. Set files are usually text files with rows of single- or multi-byte characters or numeric values in one or more tab-separated columns, where each row is separated by a new line character.\u00a0Alternatively, set files can contain one more literal range of values (e.g., [100-999]). For the purposes of this article, we will discuss the more conventional set file with a list of values, and named with a .set extension.","og_url":"https:\/\/www.iri.com\/blog\/data-transformation2\/creating-set-files-in-iri-workbench\/","og_site_name":"IRI","article_published_time":"2022-07-13T21:23:15+00:00","article_modified_time":"2022-10-12T19:05:00+00:00","og_image":[{"width":1238,"height":594,"url":"https:\/\/www.iri.com\/blog\/wp-content\/uploads\/2022\/07\/FtrImgCrtSetFile.png","type":"image\/png"}],"author":"Chaitali Mitra","twitter_card":"summary_large_image","twitter_misc":{"Written by":"Chaitali Mitra","Est. 
reading time":"20 minutes"},"schema":{"@context":"https:\/\/schema.org","@graph":[{"@type":"Article","@id":"https:\/\/www.iri.com\/blog\/data-transformation2\/creating-set-files-in-iri-workbench\/#article","isPartOf":{"@id":"https:\/\/www.iri.com\/blog\/data-transformation2\/creating-set-files-in-iri-workbench\/"},"author":{"name":"Chaitali Mitra","@id":"https:\/\/www.iri.com\/blog\/#\/schema\/person\/9bae14a309616863b027c2d56f532caf"},"headline":"Creating Set Files in IRI Workbench","datePublished":"2022-07-13T21:23:15+00:00","dateModified":"2022-10-12T19:05:00+00:00","mainEntityOfPage":{"@id":"https:\/\/www.iri.com\/blog\/data-transformation2\/creating-set-files-in-iri-workbench\/"},"wordCount":2607,"commentCount":0,"publisher":{"@id":"https:\/\/www.iri.com\/blog\/#organization"},"image":{"@id":"https:\/\/www.iri.com\/blog\/data-transformation2\/creating-set-files-in-iri-workbench\/#primaryimage"},"thumbnailUrl":"https:\/\/www.iri.com\/blog\/wp-content\/uploads\/2022\/07\/FtrImgCrtSetFile.png","keywords":["IRI Workbench","set files"],"articleSection":["Data Masking\/Protection","Data Transformation","IRI Workbench","Test Data"],"inLanguage":"en-US","potentialAction":[{"@type":"CommentAction","name":"Comment","target":["https:\/\/www.iri.com\/blog\/data-transformation2\/creating-set-files-in-iri-workbench\/#respond"]}]},{"@type":"WebPage","@id":"https:\/\/www.iri.com\/blog\/data-transformation2\/creating-set-files-in-iri-workbench\/","url":"https:\/\/www.iri.com\/blog\/data-transformation2\/creating-set-files-in-iri-workbench\/","name":"Creating Set Files in IRI Workbench - 
IRI","isPartOf":{"@id":"https:\/\/www.iri.com\/blog\/#website"},"primaryImageOfPage":{"@id":"https:\/\/www.iri.com\/blog\/data-transformation2\/creating-set-files-in-iri-workbench\/#primaryimage"},"image":{"@id":"https:\/\/www.iri.com\/blog\/data-transformation2\/creating-set-files-in-iri-workbench\/#primaryimage"},"thumbnailUrl":"https:\/\/www.iri.com\/blog\/wp-content\/uploads\/2022\/07\/FtrImgCrtSetFile.png","datePublished":"2022-07-13T21:23:15+00:00","dateModified":"2022-10-12T19:05:00+00:00","description":"As discussed in our primer article, set files are used to furnish data for a variety of IRI Voracity-compatible applications, like CoSort, FieldShield, DarkShield, NextForm, and RowGen. Set files are usually text files with rows of single- or multi-byte characters or numeric values in one or more tab-separated columns, where each row is separated by a new line character.\u00a0Alternatively, set files can contain one more literal range of values (e.g., [100-999]). For the purposes of this article, we will discuss the more conventional set file with a list of values, and named with a .set 
extension.","breadcrumb":{"@id":"https:\/\/www.iri.com\/blog\/data-transformation2\/creating-set-files-in-iri-workbench\/#breadcrumb"},"inLanguage":"en-US","potentialAction":[{"@type":"ReadAction","target":["https:\/\/www.iri.com\/blog\/data-transformation2\/creating-set-files-in-iri-workbench\/"]}]},{"@type":"ImageObject","inLanguage":"en-US","@id":"https:\/\/www.iri.com\/blog\/data-transformation2\/creating-set-files-in-iri-workbench\/#primaryimage","url":"https:\/\/www.iri.com\/blog\/wp-content\/uploads\/2022\/07\/FtrImgCrtSetFile.png","contentUrl":"https:\/\/www.iri.com\/blog\/wp-content\/uploads\/2022\/07\/FtrImgCrtSetFile.png","width":1238,"height":594},{"@type":"BreadcrumbList","@id":"https:\/\/www.iri.com\/blog\/data-transformation2\/creating-set-files-in-iri-workbench\/#breadcrumb","itemListElement":[{"@type":"ListItem","position":1,"name":"Home","item":"https:\/\/www.iri.com\/blog\/"},{"@type":"ListItem","position":2,"name":"Creating Set Files in IRI Workbench"}]},{"@type":"WebSite","@id":"https:\/\/www.iri.com\/blog\/#website","url":"https:\/\/www.iri.com\/blog\/","name":"IRI","description":"Total Data Management 
Blog","publisher":{"@id":"https:\/\/www.iri.com\/blog\/#organization"},"potentialAction":[{"@type":"SearchAction","target":{"@type":"EntryPoint","urlTemplate":"https:\/\/www.iri.com\/blog\/?s={search_term_string}"},"query-input":{"@type":"PropertyValueSpecification","valueRequired":true,"valueName":"search_term_string"}}],"inLanguage":"en-US"},{"@type":"Organization","@id":"https:\/\/www.iri.com\/blog\/#organization","name":"IRI","url":"https:\/\/www.iri.com\/blog\/","logo":{"@type":"ImageObject","inLanguage":"en-US","@id":"https:\/\/www.iri.com\/blog\/#\/schema\/logo\/image\/","url":"https:\/\/www.iri.com\/blog\/wp-content\/uploads\/2019\/02\/iri-logo-total-data-management-small-1.png","contentUrl":"https:\/\/www.iri.com\/blog\/wp-content\/uploads\/2019\/02\/iri-logo-total-data-management-small-1.png","width":750,"height":206,"caption":"IRI"},"image":{"@id":"https:\/\/www.iri.com\/blog\/#\/schema\/logo\/image\/"}},{"@type":"Person","@id":"https:\/\/www.iri.com\/blog\/#\/schema\/person\/9bae14a309616863b027c2d56f532caf","name":"Chaitali Mitra","image":{"@type":"ImageObject","inLanguage":"en-US","@id":"https:\/\/www.iri.com\/blog\/#\/schema\/person\/image\/","url":"https:\/\/secure.gravatar.com\/avatar\/95a11f3d0b709c00df3262bab0152f3a?s=96&d=blank&r=g","contentUrl":"https:\/\/secure.gravatar.com\/avatar\/95a11f3d0b709c00df3262bab0152f3a?s=96&d=blank&r=g","caption":"Chaitali 
Mitra"},"sameAs":["http:\/\/www.iri.com"],"url":"https:\/\/www.iri.com\/blog\/author\/chaitalim\/"}]}},"jetpack_featured_media_url":"https:\/\/www.iri.com\/blog\/wp-content\/uploads\/2022\/07\/FtrImgCrtSetFile.png","_links":{"self":[{"href":"https:\/\/www.iri.com\/blog\/wp-json\/wp\/v2\/posts\/16157"}],"collection":[{"href":"https:\/\/www.iri.com\/blog\/wp-json\/wp\/v2\/posts"}],"about":[{"href":"https:\/\/www.iri.com\/blog\/wp-json\/wp\/v2\/types\/post"}],"author":[{"embeddable":true,"href":"https:\/\/www.iri.com\/blog\/wp-json\/wp\/v2\/users\/53"}],"replies":[{"embeddable":true,"href":"https:\/\/www.iri.com\/blog\/wp-json\/wp\/v2\/comments?post=16157"}],"version-history":[{"count":18,"href":"https:\/\/www.iri.com\/blog\/wp-json\/wp\/v2\/posts\/16157\/revisions"}],"predecessor-version":[{"id":17859,"href":"https:\/\/www.iri.com\/blog\/wp-json\/wp\/v2\/posts\/16157\/revisions\/17859"}],"wp:featuredmedia":[{"embeddable":true,"href":"https:\/\/www.iri.com\/blog\/wp-json\/wp\/v2\/media\/16165"}],"wp:attachment":[{"href":"https:\/\/www.iri.com\/blog\/wp-json\/wp\/v2\/media?parent=16157"}],"wp:term":[{"taxonomy":"category","embeddable":true,"href":"https:\/\/www.iri.com\/blog\/wp-json\/wp\/v2\/categories?post=16157"},{"taxonomy":"post_tag","embeddable":true,"href":"https:\/\/www.iri.com\/blog\/wp-json\/wp\/v2\/tags?post=16157"}],"curies":[{"name":"wp","href":"https:\/\/api.w.org\/{rel}","templated":true}]}}