{"id":14403,"date":"2021-06-18T16:45:39","date_gmt":"2021-06-18T20:45:39","guid":{"rendered":"http:\/\/www.iri.com\/blog\/?p=14403"},"modified":"2026-02-23T18:04:09","modified_gmt":"2026-02-23T23:04:09","slug":"pseudonymize-new-values-and-minimize-re-id-risk","status":"publish","type":"post","link":"https:\/\/www.iri.com\/blog\/data-protection\/pseudonymize-new-values-and-minimize-re-id-risk\/","title":{"rendered":"How to Pseudonymize New Values and Minimize Re-ID Risk"},"content":{"rendered":"<p><b><i>Abstract<\/i><\/b><i>: It is common practice to mask sensitive production data for non-production purposes. Creating realistic values presents two challenges: handling new values introduced into production data sources, and the risk of re-identification. In this post, we will look at a method available to IRI FieldShield, DarkShield, and CellShield users to address both issues at once.<\/i><\/p>\n<p>When masking production data for non-production use (e.g., testing), it is desirable for the data to be fake but realistic. One of the most common methods of masking data for this purpose is pseudonymization.<\/p>\n<p>Pseudonymization is a form of data substitution where data may or may not be restored to its original value. For more information on IRI methods of pseudonymization, please see <a href=\"https:\/\/www.iri.com\/solutions\/data-masking\/static-data-masking\/pseudonymize\">this page<\/a>.<\/p>\n<p>The masking approach covered in this post is based on a common pseudonymization method, but does not guarantee that data can be restored to its original values. 
This may seem a little confusing, but the reasoning will become clearer later on.<\/p>\n<p>While the approach described below can be leveraged for a variety of data fields, this post focuses on the masking of last names.<\/p>\n<h5><b>Challenges to Address<\/b><\/h5>\n<p>Pseudonymizing data into realistic values brings two challenges: masking new values introduced into production, and the risk that the original value can be derived from the masked value (re-identification). Before we discuss a method for addressing these issues, let\u2019s look at a common approach for performing data pseudonymization.<\/p>\n<h5><b>Basic Pseudonymization<\/b><\/h5>\n<p>One of the easiest ways to pseudonymize data is to use a substitution value contained in a lookup file or a database table with two columns like the <em>maskedLastMap.txt<\/em> file shown below.\u00a0 Masking names in this case involves finding a matching value in the first column, and using the corresponding value in the second column as the output value.<\/p>\n<p><a href=\"\/blog\/wp-content\/uploads\/2021\/06\/pseudonymize-and-re-id-1.png\"><img loading=\"lazy\" decoding=\"async\" class=\"size-full wp-image-14405 aligncenter\" src=\"\/blog\/wp-content\/uploads\/2021\/06\/pseudonymize-and-re-id-1.png\" alt=\"\" width=\"218\" height=\"268\" \/><\/a><\/p>\n<p>The ability to do this type of search and replace is built into IRI tooling, which will be covered in more detail later in this post.<\/p>\n<p>With basic pseudonymization, what happens if there is a new input value from production that is not in the file above?\u00a0 A maintenance process could be built to update the file, but that will not work in a dynamic environment where masking capabilities are needed on demand.<\/p>\n<p>What about the risk of determining the original value from a masked value? The file above provides a direct one-to-one mapping back to the production value. 
You could secure the file, but properly securing this type of file may be easier said than done.<\/p>\n<h5><b>Reducing Re-identification Risk<\/b><\/h5>\n<p>So, how can we reduce re-identification risk? One possibility is to use a file like the LastName.set file below instead of a file that has direct mapping like the one illustrated in <i>Basic Pseudonymization<\/i> above.<\/p>\n<p><a href=\"\/blog\/wp-content\/uploads\/2021\/06\/pseudonymize-and-re-id-2.png\"><img loading=\"lazy\" decoding=\"async\" class=\"size-full wp-image-14406 aligncenter\" src=\"\/blog\/wp-content\/uploads\/2021\/06\/pseudonymize-and-re-id-2.png\" alt=\"\" width=\"534\" height=\"319\" srcset=\"https:\/\/www.iri.com\/blog\/wp-content\/uploads\/2021\/06\/pseudonymize-and-re-id-2.png 534w, https:\/\/www.iri.com\/blog\/wp-content\/uploads\/2021\/06\/pseudonymize-and-re-id-2-300x179.png 300w\" sizes=\"(max-width: 534px) 100vw, 534px\" \/><\/a><\/p>\n<p>The first thing you will notice is that the first column in the file appears to be an arbitrary string of characters, but is in fact a hash value derived from actual names.\u00a0 Since the hash value is not reversible, mapping masked dataset values back to their production values is not possible.<\/p>\n<p>With this file, the input name must be hashed prior to searching the first column for a match.<\/p>\n<p>Introducing hashing into the process may seem to make things much more complicated, but don\u2019t worry, the process is pretty simple with <a href=\"https:\/\/www.iri.com\/products\/fieldshield\">IRI FieldShield<\/a> and is covered in <i>Pseudonymization with FieldShield<\/i> below. 
And in case you are wondering, the process for setting up the LastName.set file is covered at the end of the post.<\/p>\n<h5><b>Handling New Values<\/b><b><br \/>\n<\/b><\/h5>\n<p>Over time, new values will likely be introduced into production and will need to be masked.\u00a0 As we saw earlier, a standard search-and-replace approach will not work if the input value doesn\u2019t exist in the file.<\/p>\n<p>One method for resolving this issue, using the LastName.set file from above, is to use the first row containing a value less than the hash value of the input name.<\/p>\n<p>For example, the hash value of Wonderkin is not contained in the LastName.set file, so using the first row with a lesser value results in CAMPER being the masked output value.<\/p>\n<p><a href=\"\/blog\/wp-content\/uploads\/2021\/06\/pseudonymize-and-re-id-3.png\"><img loading=\"lazy\" decoding=\"async\" class=\"size-full wp-image-14407 aligncenter\" src=\"\/blog\/wp-content\/uploads\/2021\/06\/pseudonymize-and-re-id-3.png\" alt=\"\" width=\"976\" height=\"349\" srcset=\"https:\/\/www.iri.com\/blog\/wp-content\/uploads\/2021\/06\/pseudonymize-and-re-id-3.png 976w, https:\/\/www.iri.com\/blog\/wp-content\/uploads\/2021\/06\/pseudonymize-and-re-id-3-300x107.png 300w, https:\/\/www.iri.com\/blog\/wp-content\/uploads\/2021\/06\/pseudonymize-and-re-id-3-768x275.png 768w\" sizes=\"(max-width: 976px) 100vw, 976px\" \/><\/a><\/p>\n<p>As illustrated above, using the less-than method can result in multiple input names mapping to the same output name and will therefore not allow the data to be restored to its original value.<\/p>\n<p>Generally, multiple input names mapping to a single name for non-production testing should not be an issue.\u00a0 The frequency of this occurring can be reduced by using a file with a large number of names.<\/p>\n<h5><b>Pseudonymization with FieldShield<\/b><b><br \/>\n<\/b><\/h5>\n<p><a href=\"https:\/\/www.iri.com\/products\/fieldshield\">IRI 
FieldShield<\/a> is a software product for masking personally identifiable information (PII) using many different <a href=\"https:\/\/www.iri.com\/solutions\/data-masking\/static-data-masking\">methods<\/a>, including pseudonymization. How FieldShield reads and protects structured RDBs and flat files is specified in a scripting language called the FieldShield Control Language (FCL), which is based on the antecedent, broader Sort Control Language program, aka <a href=\"https:\/\/www.iri.com\/products\/cosort\/sortcl\">SortCL<\/a>; SortCL can thus run .scl or .fcl job scripts.<\/p>\n<p>Among other methods of pseudonymization, FieldShield supports the implementation of find and replace using what is referred to as a Set file and SEARCH function. The LastName.set file discussed earlier is an example of a Set file and will be used to demonstrate pseudonymization of last names with FieldShield.<\/p>\n<h5><b>Using a Set File<\/b><\/h5>\n<p>Using the Set file for search and replace is as simple as defining the name of a Set file and a search operation in a \/FIELD statement.<\/p>\n<p>Masking the last name is a three-step process:<\/p>\n<p><a href=\"\/blog\/wp-content\/uploads\/2021\/06\/pseudonymize-and-re-id-4.png\"><img loading=\"lazy\" decoding=\"async\" class=\"size-full wp-image-14408 aligncenter\" src=\"\/blog\/wp-content\/uploads\/2021\/06\/pseudonymize-and-re-id-4.png\" alt=\"\" width=\"718\" height=\"140\" srcset=\"https:\/\/www.iri.com\/blog\/wp-content\/uploads\/2021\/06\/pseudonymize-and-re-id-4.png 718w, https:\/\/www.iri.com\/blog\/wp-content\/uploads\/2021\/06\/pseudonymize-and-re-id-4-300x58.png 300w\" sizes=\"(max-width: 718px) 100vw, 718px\" \/><\/a><\/p>\n<p><b>Step 1:<\/b> The input name is converted to uppercase so that differences in letter case do not change the search result.\u00a0 For example, both Smith and SMITH become SMITH.<\/p>\n<p><b>Step 2:<\/b> The uppercase name is hashed using a built-in SHA-256 
hashing algorithm.<\/p>\n<p><b>Step 3:<\/b> The Set file is searched for a value less than or equal to (LE) the value of the hashed name.\u00a0 Note: If the name\u2019s hash value is less than the value in the first record of the Set file, the masked last name will be assigned a value of \u201cNONAME\u201d.<\/p>\n<h5><b>Addressing the Challenges<\/b><\/h5>\n<p>As mentioned at the beginning of this post, new values and the risk of re-identification are two issues that need to be addressed when pseudonymizing names (or other data) for non-production uses.<\/p>\n<p>The use of LE for the \/FIELD statement search option addresses any new values that may not exist in the Set file.\u00a0 As mentioned earlier, any less-than match can result in multiple input names being masked to the same name, but this generally should not affect the usefulness of the masked data.<\/p>\n<p>Using hash values in the masking process greatly reduces the risk of re-identification.\u00a0 For someone to map a masked name back to its original production value, they would need execute authority for the FieldShield software and access to the encryption passphrase used to hash the names.<\/p>\n<h5><b>Complete Pseudonymization Script<\/b><\/h5>\n<p>Below is a complete example of a FieldShield job script that demonstrates the concepts discussed in this post to address new values and reduce the risk of re-identification. 
The script defines a source file (\/INFILE statement), performs the specialized pseudonymization work in the \/INREC section, and creates the target file with pseudonymized last names (\/OUTFILE statement).<\/p>\n<p><a href=\"\/blog\/wp-content\/uploads\/2021\/06\/pseudonymize-and-re-id-5.png\"><img loading=\"lazy\" decoding=\"async\" class=\"aligncenter wp-image-14409 size-full\" src=\"http:\/\/www.iri.com\/blog\/wp-content\/uploads\/2021\/06\/pseudonymize-and-re-id-5.png\" alt=\"\" width=\"739\" height=\"478\" srcset=\"https:\/\/www.iri.com\/blog\/wp-content\/uploads\/2021\/06\/pseudonymize-and-re-id-5.png 739w, https:\/\/www.iri.com\/blog\/wp-content\/uploads\/2021\/06\/pseudonymize-and-re-id-5-300x194.png 300w\" sizes=\"(max-width: 739px) 100vw, 739px\" \/><\/a><\/p>\n<h6 style=\"text-align: center;\">Note again that the input sources and\/or output targets can be RDB tables as well as files.<\/h6>\n<h5><b>Creating the LastName.set File<\/b><\/h5>\n<p>To minimize the many-to-one mappings that can occur using the method described earlier, it is important that the LastName.set file contains a large number of names.<\/p>\n<p>For purposes of this post, a Set file of names provided as part of the IRI Workbench install and a 2010 US Census file of last names were used.\u00a0 Some suggested additional sources of last names would be your company\u2019s master person data repository as well as numerous resources that can be found on the internet.<\/p>\n<p>The LastName.set file discussed earlier was created by a three-step process depicted in the flowchart below.\u00a0 A detailed description follows.<\/p>\n<p><a href=\"\/blog\/wp-content\/uploads\/2021\/06\/pseudonymize-and-re-id-6.png\"><img loading=\"lazy\" decoding=\"async\" class=\" wp-image-14410 aligncenter\" src=\"\/blog\/wp-content\/uploads\/2021\/06\/pseudonymize-and-re-id-6.png\" alt=\"\" width=\"443\" height=\"701\" 
srcset=\"https:\/\/www.iri.com\/blog\/wp-content\/uploads\/2021\/06\/pseudonymize-and-re-id-6.png 481w, https:\/\/www.iri.com\/blog\/wp-content\/uploads\/2021\/06\/pseudonymize-and-re-id-6-190x300.png 190w\" sizes=\"(max-width: 443px) 100vw, 443px\" \/><\/a><\/p>\n<h5><b>Step 1<\/b><\/h5>\n<p>The first step in creating the LastName.set file is to combine the IRI example last names Set file and a download of last names from the 2010 US Census.\u00a0 The names are converted to uppercase and sorted to order the records by last name and ensure any duplicates are removed.<\/p>\n<p>Step 1 Script:<\/p>\n<p><a href=\"\/blog\/wp-content\/uploads\/2021\/06\/pseudonymize-and-re-id-7.png\"><img loading=\"lazy\" decoding=\"async\" class=\"size-full wp-image-14411 aligncenter\" src=\"\/blog\/wp-content\/uploads\/2021\/06\/pseudonymize-and-re-id-7.png\" alt=\"\" width=\"633\" height=\"511\" srcset=\"https:\/\/www.iri.com\/blog\/wp-content\/uploads\/2021\/06\/pseudonymize-and-re-id-7.png 633w, https:\/\/www.iri.com\/blog\/wp-content\/uploads\/2021\/06\/pseudonymize-and-re-id-7-300x242.png 300w\" sizes=\"(max-width: 633px) 100vw, 633px\" \/><\/a><\/p>\n<p>Step 1 Input Sample:<\/p>\n<p><a href=\"http:\/\/www.iri.com\/blog\/wp-content\/uploads\/2021\/06\/pseudonymize-and-re-id-8.png\"><img loading=\"lazy\" decoding=\"async\" class=\"wp-image-14412 aligncenter\" src=\"http:\/\/www.iri.com\/blog\/wp-content\/uploads\/2021\/06\/pseudonymize-and-re-id-8.png\" alt=\"\" width=\"645\" height=\"210\" srcset=\"https:\/\/www.iri.com\/blog\/wp-content\/uploads\/2021\/06\/pseudonymize-and-re-id-8.png 538w, https:\/\/www.iri.com\/blog\/wp-content\/uploads\/2021\/06\/pseudonymize-and-re-id-8-300x98.png 300w\" sizes=\"(max-width: 645px) 100vw, 645px\" \/><\/a><\/p>\n<p>Step 1 Output Sample:<\/p>\n<p><a href=\"\/blog\/wp-content\/uploads\/2021\/06\/pseudonymize-and-re-id-9.png\"><img loading=\"lazy\" decoding=\"async\" class=\" wp-image-14413 aligncenter\" 
src=\"\/blog\/wp-content\/uploads\/2021\/06\/pseudonymize-and-re-id-9.png\" alt=\"\" width=\"652\" height=\"178\" srcset=\"https:\/\/www.iri.com\/blog\/wp-content\/uploads\/2021\/06\/pseudonymize-and-re-id-9.png 652w, https:\/\/www.iri.com\/blog\/wp-content\/uploads\/2021\/06\/pseudonymize-and-re-id-9-300x82.png 300w\" sizes=\"(max-width: 652px) 100vw, 652px\" \/><\/a><\/p>\n<h5><b>Step 2<\/b><\/h5>\n<p>In this step, the hashed last names are sorted by hash value and any duplicates are removed.\u00a0 While duplicate hash values are possible in principle, none occurred in the testing performed using the combined US Census and IRI files with more than 173,000 records.<\/p>\n<p><a href=\"\/blog\/wp-content\/uploads\/2021\/06\/pseudonymize-and-re-id-10.png\"><img loading=\"lazy\" decoding=\"async\" class=\"size-full wp-image-14414 aligncenter\" src=\"\/blog\/wp-content\/uploads\/2021\/06\/pseudonymize-and-re-id-10.png\" alt=\"\" width=\"668\" height=\"301\" srcset=\"https:\/\/www.iri.com\/blog\/wp-content\/uploads\/2021\/06\/pseudonymize-and-re-id-10.png 668w, https:\/\/www.iri.com\/blog\/wp-content\/uploads\/2021\/06\/pseudonymize-and-re-id-10-300x135.png 300w\" sizes=\"(max-width: 668px) 100vw, 668px\" \/><\/a><\/p>\n<p>Step 2 Output Sample:<\/p>\n<p><a href=\"\/blog\/wp-content\/uploads\/2021\/06\/pseudonymize-and-re-id-11.png\"><img loading=\"lazy\" decoding=\"async\" class=\" wp-image-14415 aligncenter\" src=\"\/blog\/wp-content\/uploads\/2021\/06\/pseudonymize-and-re-id-11.png\" alt=\"\" width=\"502\" height=\"203\" srcset=\"https:\/\/www.iri.com\/blog\/wp-content\/uploads\/2021\/06\/pseudonymize-and-re-id-11.png 440w, https:\/\/www.iri.com\/blog\/wp-content\/uploads\/2021\/06\/pseudonymize-and-re-id-11-300x121.png 300w\" sizes=\"(max-width: 502px) 100vw, 502px\" \/><\/a><\/p>\n<h5><b>Step 3<\/b><\/h5>\n<p>In the final step of creating the LastName.set file, the unique last names file from step 1 and the sorted hash file from step 2 are joined by 
sequence number. The resulting Set file contains a hash value in the first column and a last name in the second column, and is now ready to be used in a script.<\/p>\n<p><a href=\"\/blog\/wp-content\/uploads\/2021\/06\/pseudonymize-and-re-id-12.png\"><img loading=\"lazy\" decoding=\"async\" class=\"size-full wp-image-14416 aligncenter\" src=\"\/blog\/wp-content\/uploads\/2021\/06\/pseudonymize-and-re-id-12.png\" alt=\"\" width=\"554\" height=\"406\" srcset=\"https:\/\/www.iri.com\/blog\/wp-content\/uploads\/2021\/06\/pseudonymize-and-re-id-12.png 554w, https:\/\/www.iri.com\/blog\/wp-content\/uploads\/2021\/06\/pseudonymize-and-re-id-12-300x220.png 300w\" sizes=\"(max-width: 554px) 100vw, 554px\" \/><\/a><\/p>\n<p>Step 3 Output Sample:<\/p>\n<p><a href=\"\/blog\/wp-content\/uploads\/2021\/06\/pseudonymize-and-re-id-13.png\"><img loading=\"lazy\" decoding=\"async\" class=\" wp-image-14417 aligncenter\" src=\"\/blog\/wp-content\/uploads\/2021\/06\/pseudonymize-and-re-id-13.png\" alt=\"\" width=\"509\" height=\"189\" srcset=\"https:\/\/www.iri.com\/blog\/wp-content\/uploads\/2021\/06\/pseudonymize-and-re-id-13.png 482w, https:\/\/www.iri.com\/blog\/wp-content\/uploads\/2021\/06\/pseudonymize-and-re-id-13-300x111.png 300w\" sizes=\"(max-width: 509px) 100vw, 509px\" \/><\/a><\/p>\n<h5><b>Summary<\/b><\/h5>\n<p>In this post, we discussed two challenges of masking production data through pseudonymization for non-production use: handling new values added to production, and reducing the risk of re-identification. We conceptually addressed these issues using hashed name values stored in a .set file and demonstrated the solution using <a href=\"https:\/\/www.iri.com\/products\/fieldshield\">IRI FieldShield<\/a>. 
Following this article and interest in the subject, IRI developed a fit-for-purpose <a href=\"https:\/\/www.iri.com\/blog\/data-protection\/consistent-self-updating-and-secure-pseudonymization\/\">hash-based pseudonymization rule<\/a>, and <a href=\"https:\/\/www.iri.com\/blog\/data-protection\/pseudonym-hash-set-file-creation-wizard\/\">hashed set file creation wizard<\/a>, both now available in IRI Workbench for producing consistent, self-updating pseudonyms.<\/p>\n","protected":false},"excerpt":{"rendered":"<p>Abstract: It is common practice to mask sensitive production data for non-production purposes. Creating realistic values presents the challenge of handling new values introduced into production data sources and the risk of re-identification. In this post, we will look at a method available to IRI FieldShield, DarkShield, and CellShield users to address both issues at<\/p>\n<div><a class=\"btn-filled btn\" href=\"https:\/\/www.iri.com\/blog\/data-protection\/pseudonymize-new-values-and-minimize-re-id-risk\/\" title=\"How to Pseudonymize New Values and Minimize Re-ID Risk\">Read More<\/a><\/div>\n","protected":false},"author":151,"featured_media":14410,"comment_status":"open","ping_status":"open","sticky":false,"template":"","format":"standard","meta":{"_acf_changed":false,"_exactmetrics_skip_tracking":false,"_exactmetrics_sitenote_active":false,"_exactmetrics_sitenote_note":"","_exactmetrics_sitenote_category":0,"footnotes":""},"categories":[8,29,2255],"tags":[596,1388,520,850,1709,22,1345],"class_list":["post-14403","post","type-post","status-publish","format-standard","has-post-thumbnail","hentry","category-data-protection","category-test-data","category-archived-articles","tag-iri-cellshield","tag-iri-darkshield","tag-iri-fieldshield","tag-iri-workbench","tag-pseudonymisation","tag-pseudonymization","tag-re-id-risk-scoring"],"acf":[],"yoast_head":"<!-- This site is optimized with the Yoast SEO Premium plugin v23.4 (Yoast SEO v23.4) - 
https:\/\/yoast.com\/wordpress\/plugins\/seo\/ -->\n<title>How to Pseudonymize New Values and Minimize Re-ID Risk - IRI<\/title>\n<meta name=\"description\" content=\"Discover how pseudonymization can mask sensitive production data for non-production purposes, while ensuring realistic and fake values.\" \/>\n<meta name=\"robots\" content=\"index, follow, max-snippet:-1, max-image-preview:large, max-video-preview:-1\" \/>\n<link rel=\"canonical\" href=\"https:\/\/www.iri.com\/blog\/data-protection\/pseudonymize-new-values-and-minimize-re-id-risk\/\" \/>\n<meta property=\"og:locale\" content=\"en_US\" \/>\n<meta property=\"og:type\" content=\"article\" \/>\n<meta property=\"og:title\" content=\"How to Pseudonymize New Values and Minimize Re-ID Risk\" \/>\n<meta property=\"og:description\" content=\"Discover how pseudonymization can mask sensitive production data for non-production purposes, while ensuring realistic and fake values.\" \/>\n<meta property=\"og:url\" content=\"https:\/\/www.iri.com\/blog\/data-protection\/pseudonymize-new-values-and-minimize-re-id-risk\/\" \/>\n<meta property=\"og:site_name\" content=\"IRI\" \/>\n<meta property=\"article:published_time\" content=\"2021-06-18T20:45:39+00:00\" \/>\n<meta property=\"article:modified_time\" content=\"2026-02-23T23:04:09+00:00\" \/>\n<meta property=\"og:image\" content=\"https:\/\/www.iri.com\/blog\/wp-content\/uploads\/2021\/06\/pseudonymize-and-re-id-6.png\" \/>\n\t<meta property=\"og:image:width\" content=\"481\" \/>\n\t<meta property=\"og:image:height\" content=\"761\" \/>\n\t<meta property=\"og:image:type\" content=\"image\/png\" \/>\n<meta name=\"author\" content=\"Wade Donahue\" \/>\n<meta name=\"twitter:card\" content=\"summary_large_image\" \/>\n<meta name=\"twitter:label1\" content=\"Written by\" \/>\n\t<meta name=\"twitter:data1\" content=\"Wade Donahue\" \/>\n\t<meta name=\"twitter:label2\" content=\"Est. 
reading time\" \/>\n\t<meta name=\"twitter:data2\" content=\"10 minutes\" \/>\n<script type=\"application\/ld+json\" class=\"yoast-schema-graph\">{\"@context\":\"https:\/\/schema.org\",\"@graph\":[{\"@type\":\"Article\",\"@id\":\"https:\/\/www.iri.com\/blog\/data-protection\/pseudonymize-new-values-and-minimize-re-id-risk\/#article\",\"isPartOf\":{\"@id\":\"https:\/\/www.iri.com\/blog\/data-protection\/pseudonymize-new-values-and-minimize-re-id-risk\/\"},\"author\":{\"name\":\"Wade Donahue\",\"@id\":\"https:\/\/www.iri.com\/blog\/#\/schema\/person\/3c88af09f1d4fcdef7370a7abe64b732\"},\"headline\":\"How to Pseudonymize New Values and Minimize Re-ID Risk\",\"datePublished\":\"2021-06-18T20:45:39+00:00\",\"dateModified\":\"2026-02-23T23:04:09+00:00\",\"mainEntityOfPage\":{\"@id\":\"https:\/\/www.iri.com\/blog\/data-protection\/pseudonymize-new-values-and-minimize-re-id-risk\/\"},\"wordCount\":1605,\"commentCount\":0,\"publisher\":{\"@id\":\"https:\/\/www.iri.com\/blog\/#organization\"},\"image\":{\"@id\":\"https:\/\/www.iri.com\/blog\/data-protection\/pseudonymize-new-values-and-minimize-re-id-risk\/#primaryimage\"},\"thumbnailUrl\":\"https:\/\/www.iri.com\/blog\/wp-content\/uploads\/2021\/06\/pseudonymize-and-re-id-6.png\",\"keywords\":[\"IRI CellShield\",\"IRI DarkShield\",\"IRI FieldShield\",\"IRI Workbench\",\"pseudonymisation\",\"pseudonymization\",\"re-id risk scoring\"],\"articleSection\":[\"Data Masking\/Protection\",\"Test Data\",\"Archived Articles\"],\"inLanguage\":\"en-US\",\"potentialAction\":[{\"@type\":\"CommentAction\",\"name\":\"Comment\",\"target\":[\"https:\/\/www.iri.com\/blog\/data-protection\/pseudonymize-new-values-and-minimize-re-id-risk\/#respond\"]}]},{\"@type\":\"WebPage\",\"@id\":\"https:\/\/www.iri.com\/blog\/data-protection\/pseudonymize-new-values-and-minimize-re-id-risk\/\",\"url\":\"https:\/\/www.iri.com\/blog\/data-protection\/pseudonymize-new-values-and-minimize-re-id-risk\/\",\"name\":\"How to Pseudonymize New Values and Minimize 
Re-ID Risk - IRI\",\"isPartOf\":{\"@id\":\"https:\/\/www.iri.com\/blog\/#website\"},\"primaryImageOfPage\":{\"@id\":\"https:\/\/www.iri.com\/blog\/data-protection\/pseudonymize-new-values-and-minimize-re-id-risk\/#primaryimage\"},\"image\":{\"@id\":\"https:\/\/www.iri.com\/blog\/data-protection\/pseudonymize-new-values-and-minimize-re-id-risk\/#primaryimage\"},\"thumbnailUrl\":\"https:\/\/www.iri.com\/blog\/wp-content\/uploads\/2021\/06\/pseudonymize-and-re-id-6.png\",\"datePublished\":\"2021-06-18T20:45:39+00:00\",\"dateModified\":\"2026-02-23T23:04:09+00:00\",\"description\":\"Discover how pseudonymization can mask sensitive production data for non-production purposes, while ensuring realistic and fake values.\",\"breadcrumb\":{\"@id\":\"https:\/\/www.iri.com\/blog\/data-protection\/pseudonymize-new-values-and-minimize-re-id-risk\/#breadcrumb\"},\"inLanguage\":\"en-US\",\"potentialAction\":[{\"@type\":\"ReadAction\",\"target\":[\"https:\/\/www.iri.com\/blog\/data-protection\/pseudonymize-new-values-and-minimize-re-id-risk\/\"]}]},{\"@type\":\"ImageObject\",\"inLanguage\":\"en-US\",\"@id\":\"https:\/\/www.iri.com\/blog\/data-protection\/pseudonymize-new-values-and-minimize-re-id-risk\/#primaryimage\",\"url\":\"https:\/\/www.iri.com\/blog\/wp-content\/uploads\/2021\/06\/pseudonymize-and-re-id-6.png\",\"contentUrl\":\"https:\/\/www.iri.com\/blog\/wp-content\/uploads\/2021\/06\/pseudonymize-and-re-id-6.png\",\"width\":481,\"height\":761},{\"@type\":\"BreadcrumbList\",\"@id\":\"https:\/\/www.iri.com\/blog\/data-protection\/pseudonymize-new-values-and-minimize-re-id-risk\/#breadcrumb\",\"itemListElement\":[{\"@type\":\"ListItem\",\"position\":1,\"name\":\"Home\",\"item\":\"https:\/\/www.iri.com\/blog\/\"},{\"@type\":\"ListItem\",\"position\":2,\"name\":\"How to Pseudonymize New Values and Minimize Re-ID Risk\"}]},{\"@type\":\"WebSite\",\"@id\":\"https:\/\/www.iri.com\/blog\/#website\",\"url\":\"https:\/\/www.iri.com\/blog\/\",\"name\":\"IRI\",\"description\":\"Total 
Data Management Blog\",\"publisher\":{\"@id\":\"https:\/\/www.iri.com\/blog\/#organization\"},\"potentialAction\":[{\"@type\":\"SearchAction\",\"target\":{\"@type\":\"EntryPoint\",\"urlTemplate\":\"https:\/\/www.iri.com\/blog\/?s={search_term_string}\"},\"query-input\":{\"@type\":\"PropertyValueSpecification\",\"valueRequired\":true,\"valueName\":\"search_term_string\"}}],\"inLanguage\":\"en-US\"},{\"@type\":\"Organization\",\"@id\":\"https:\/\/www.iri.com\/blog\/#organization\",\"name\":\"IRI\",\"url\":\"https:\/\/www.iri.com\/blog\/\",\"logo\":{\"@type\":\"ImageObject\",\"inLanguage\":\"en-US\",\"@id\":\"https:\/\/www.iri.com\/blog\/#\/schema\/logo\/image\/\",\"url\":\"https:\/\/www.iri.com\/blog\/wp-content\/uploads\/2019\/02\/iri-logo-total-data-management-small-1.png\",\"contentUrl\":\"https:\/\/www.iri.com\/blog\/wp-content\/uploads\/2019\/02\/iri-logo-total-data-management-small-1.png\",\"width\":750,\"height\":206,\"caption\":\"IRI\"},\"image\":{\"@id\":\"https:\/\/www.iri.com\/blog\/#\/schema\/logo\/image\/\"}},{\"@type\":\"Person\",\"@id\":\"https:\/\/www.iri.com\/blog\/#\/schema\/person\/3c88af09f1d4fcdef7370a7abe64b732\",\"name\":\"Wade Donahue\",\"image\":{\"@type\":\"ImageObject\",\"inLanguage\":\"en-US\",\"@id\":\"https:\/\/www.iri.com\/blog\/#\/schema\/person\/image\/\",\"url\":\"https:\/\/secure.gravatar.com\/avatar\/9cf07e37c128f0334168629cb154a3f8?s=96&d=blank&r=g\",\"contentUrl\":\"https:\/\/secure.gravatar.com\/avatar\/9cf07e37c128f0334168629cb154a3f8?s=96&d=blank&r=g\",\"caption\":\"Wade Donahue\"},\"url\":\"https:\/\/www.iri.com\/blog\/author\/waded\/\"}]}<\/script>\n<!-- \/ Yoast SEO Premium plugin. 
-->","yoast_head_json":{"title":"How to Pseudonymize New Values and Minimize Re-ID Risk - IRI","description":"Discover how pseudonymization can mask sensitive production data for non-production purposes, while ensuring realistic and fake values.","robots":{"index":"index","follow":"follow","max-snippet":"max-snippet:-1","max-image-preview":"max-image-preview:large","max-video-preview":"max-video-preview:-1"},"canonical":"https:\/\/www.iri.com\/blog\/data-protection\/pseudonymize-new-values-and-minimize-re-id-risk\/","og_locale":"en_US","og_type":"article","og_title":"How to Pseudonymize New Values and Minimize Re-ID Risk","og_description":"Discover how pseudonymization can mask sensitive production data for non-production purposes, while ensuring realistic and fake values.","og_url":"https:\/\/www.iri.com\/blog\/data-protection\/pseudonymize-new-values-and-minimize-re-id-risk\/","og_site_name":"IRI","article_published_time":"2021-06-18T20:45:39+00:00","article_modified_time":"2026-02-23T23:04:09+00:00","og_image":[{"width":481,"height":761,"url":"https:\/\/www.iri.com\/blog\/wp-content\/uploads\/2021\/06\/pseudonymize-and-re-id-6.png","type":"image\/png"}],"author":"Wade Donahue","twitter_card":"summary_large_image","twitter_misc":{"Written by":"Wade Donahue","Est. 
reading time":"10 minutes"},"schema":{"@context":"https:\/\/schema.org","@graph":[{"@type":"Article","@id":"https:\/\/www.iri.com\/blog\/data-protection\/pseudonymize-new-values-and-minimize-re-id-risk\/#article","isPartOf":{"@id":"https:\/\/www.iri.com\/blog\/data-protection\/pseudonymize-new-values-and-minimize-re-id-risk\/"},"author":{"name":"Wade Donahue","@id":"https:\/\/www.iri.com\/blog\/#\/schema\/person\/3c88af09f1d4fcdef7370a7abe64b732"},"headline":"How to Pseudonymize New Values and Minimize Re-ID Risk","datePublished":"2021-06-18T20:45:39+00:00","dateModified":"2026-02-23T23:04:09+00:00","mainEntityOfPage":{"@id":"https:\/\/www.iri.com\/blog\/data-protection\/pseudonymize-new-values-and-minimize-re-id-risk\/"},"wordCount":1605,"commentCount":0,"publisher":{"@id":"https:\/\/www.iri.com\/blog\/#organization"},"image":{"@id":"https:\/\/www.iri.com\/blog\/data-protection\/pseudonymize-new-values-and-minimize-re-id-risk\/#primaryimage"},"thumbnailUrl":"https:\/\/www.iri.com\/blog\/wp-content\/uploads\/2021\/06\/pseudonymize-and-re-id-6.png","keywords":["IRI CellShield","IRI DarkShield","IRI FieldShield","IRI Workbench","pseudonymisation","pseudonymization","re-id risk scoring"],"articleSection":["Data Masking\/Protection","Test Data","Archived Articles"],"inLanguage":"en-US","potentialAction":[{"@type":"CommentAction","name":"Comment","target":["https:\/\/www.iri.com\/blog\/data-protection\/pseudonymize-new-values-and-minimize-re-id-risk\/#respond"]}]},{"@type":"WebPage","@id":"https:\/\/www.iri.com\/blog\/data-protection\/pseudonymize-new-values-and-minimize-re-id-risk\/","url":"https:\/\/www.iri.com\/blog\/data-protection\/pseudonymize-new-values-and-minimize-re-id-risk\/","name":"How to Pseudonymize New Values and Minimize Re-ID Risk - 
IRI","isPartOf":{"@id":"https:\/\/www.iri.com\/blog\/#website"},"primaryImageOfPage":{"@id":"https:\/\/www.iri.com\/blog\/data-protection\/pseudonymize-new-values-and-minimize-re-id-risk\/#primaryimage"},"image":{"@id":"https:\/\/www.iri.com\/blog\/data-protection\/pseudonymize-new-values-and-minimize-re-id-risk\/#primaryimage"},"thumbnailUrl":"https:\/\/www.iri.com\/blog\/wp-content\/uploads\/2021\/06\/pseudonymize-and-re-id-6.png","datePublished":"2021-06-18T20:45:39+00:00","dateModified":"2026-02-23T23:04:09+00:00","description":"Discover how pseudonymization can mask sensitive production data for non-production purposes, while ensuring realistic and fake values.","breadcrumb":{"@id":"https:\/\/www.iri.com\/blog\/data-protection\/pseudonymize-new-values-and-minimize-re-id-risk\/#breadcrumb"},"inLanguage":"en-US","potentialAction":[{"@type":"ReadAction","target":["https:\/\/www.iri.com\/blog\/data-protection\/pseudonymize-new-values-and-minimize-re-id-risk\/"]}]},{"@type":"ImageObject","inLanguage":"en-US","@id":"https:\/\/www.iri.com\/blog\/data-protection\/pseudonymize-new-values-and-minimize-re-id-risk\/#primaryimage","url":"https:\/\/www.iri.com\/blog\/wp-content\/uploads\/2021\/06\/pseudonymize-and-re-id-6.png","contentUrl":"https:\/\/www.iri.com\/blog\/wp-content\/uploads\/2021\/06\/pseudonymize-and-re-id-6.png","width":481,"height":761},{"@type":"BreadcrumbList","@id":"https:\/\/www.iri.com\/blog\/data-protection\/pseudonymize-new-values-and-minimize-re-id-risk\/#breadcrumb","itemListElement":[{"@type":"ListItem","position":1,"name":"Home","item":"https:\/\/www.iri.com\/blog\/"},{"@type":"ListItem","position":2,"name":"How to Pseudonymize New Values and Minimize Re-ID Risk"}]},{"@type":"WebSite","@id":"https:\/\/www.iri.com\/blog\/#website","url":"https:\/\/www.iri.com\/blog\/","name":"IRI","description":"Total Data Management 
Blog","publisher":{"@id":"https:\/\/www.iri.com\/blog\/#organization"},"potentialAction":[{"@type":"SearchAction","target":{"@type":"EntryPoint","urlTemplate":"https:\/\/www.iri.com\/blog\/?s={search_term_string}"},"query-input":{"@type":"PropertyValueSpecification","valueRequired":true,"valueName":"search_term_string"}}],"inLanguage":"en-US"},{"@type":"Organization","@id":"https:\/\/www.iri.com\/blog\/#organization","name":"IRI","url":"https:\/\/www.iri.com\/blog\/","logo":{"@type":"ImageObject","inLanguage":"en-US","@id":"https:\/\/www.iri.com\/blog\/#\/schema\/logo\/image\/","url":"https:\/\/www.iri.com\/blog\/wp-content\/uploads\/2019\/02\/iri-logo-total-data-management-small-1.png","contentUrl":"https:\/\/www.iri.com\/blog\/wp-content\/uploads\/2019\/02\/iri-logo-total-data-management-small-1.png","width":750,"height":206,"caption":"IRI"},"image":{"@id":"https:\/\/www.iri.com\/blog\/#\/schema\/logo\/image\/"}},{"@type":"Person","@id":"https:\/\/www.iri.com\/blog\/#\/schema\/person\/3c88af09f1d4fcdef7370a7abe64b732","name":"Wade Donahue","image":{"@type":"ImageObject","inLanguage":"en-US","@id":"https:\/\/www.iri.com\/blog\/#\/schema\/person\/image\/","url":"https:\/\/secure.gravatar.com\/avatar\/9cf07e37c128f0334168629cb154a3f8?s=96&d=blank&r=g","contentUrl":"https:\/\/secure.gravatar.com\/avatar\/9cf07e37c128f0334168629cb154a3f8?s=96&d=blank&r=g","caption":"Wade 
Donahue"},"url":"https:\/\/www.iri.com\/blog\/author\/waded\/"}]}},"jetpack_featured_media_url":"https:\/\/www.iri.com\/blog\/wp-content\/uploads\/2021\/06\/pseudonymize-and-re-id-6.png","_links":{"self":[{"href":"https:\/\/www.iri.com\/blog\/wp-json\/wp\/v2\/posts\/14403"}],"collection":[{"href":"https:\/\/www.iri.com\/blog\/wp-json\/wp\/v2\/posts"}],"about":[{"href":"https:\/\/www.iri.com\/blog\/wp-json\/wp\/v2\/types\/post"}],"author":[{"embeddable":true,"href":"https:\/\/www.iri.com\/blog\/wp-json\/wp\/v2\/users\/151"}],"replies":[{"embeddable":true,"href":"https:\/\/www.iri.com\/blog\/wp-json\/wp\/v2\/comments?post=14403"}],"version-history":[{"count":8,"href":"https:\/\/www.iri.com\/blog\/wp-json\/wp\/v2\/posts\/14403\/revisions"}],"predecessor-version":[{"id":18105,"href":"https:\/\/www.iri.com\/blog\/wp-json\/wp\/v2\/posts\/14403\/revisions\/18105"}],"wp:featuredmedia":[{"embeddable":true,"href":"https:\/\/www.iri.com\/blog\/wp-json\/wp\/v2\/media\/14410"}],"wp:attachment":[{"href":"https:\/\/www.iri.com\/blog\/wp-json\/wp\/v2\/media?parent=14403"}],"wp:term":[{"taxonomy":"category","embeddable":true,"href":"https:\/\/www.iri.com\/blog\/wp-json\/wp\/v2\/categories?post=14403"},{"taxonomy":"post_tag","embeddable":true,"href":"https:\/\/www.iri.com\/blog\/wp-json\/wp\/v2\/tags?post=14403"}],"curies":[{"name":"wp","href":"https:\/\/api.w.org\/{rel}","templated":true}]}}