{"id":1649,"date":"2012-07-26T18:08:08","date_gmt":"2012-07-26T22:08:08","guid":{"rendered":"http:\/\/www.iri.com\/blog\/?p=1649"},"modified":"2026-01-08T15:28:27","modified_gmt":"2026-01-08T20:28:27","slug":"the-case-for-test-data","status":"publish","type":"post","link":"https:\/\/www.iri.com\/blog\/test-data\/the-case-for-test-data\/","title":{"rendered":"The Case for Safe, Intelligent Database Test Data"},"content":{"rendered":"<p><strong>Database Test Data Usage<\/strong> &#8211; This blog caught my eye\u00a0because of its title, <strong><a title=\"Do the Right Thing When Testing With Production Data from Security Active Blog\" href=\"http:\/\/blog.securityactive.co.uk\/2010\/02\/01\/do-the-right-think-when-testing-with-production-data\/\" target=\"_blank\" rel=\"noopener\">Do the right thing when testing with production data<\/a>. \u00a0<\/strong>It struck me as oxymoronic, since we know production data should not be used for testing at all &#8230;<\/p>\n<p>Of course we know how tempting it is to use production data for testing\u00a0applications, simulating databases, prototyping ETL operations, and just about anything else that needs to work with the real thing. However, it is not worth the legal risk of exposing that data to breaches. And, even if you were certain that you could lock that data down somehow, have you considered its potential inadequacy for testing purposes? If you only develop and test with data that&#8217;s real today, what happens when your platform has to work with different data tomorrow? 
Ideally, you should be able to stress-test new platforms, which means accounting for future changes in data values, formats, ranges, and\/or volumes.<\/p>\n<p>Security Active&#8217;s article did, however, acknowledge the data breach risk, and suggested that the right thing to do was to obfuscate the production data with appropriate protection tools and scripts &#8212; like <a title=\"FieldShield Product Page\" href=\"http:\/\/www.iri.com\/products\/FieldShield\" target=\"_blank\" rel=\"noopener\">IRI FieldShield<\/a> &#8212; which would encrypt, mask, pseudonymize, or otherwise de-identify the production data to make it safe for testing.<\/p>\n<p>That is certainly a popular approach, though I submit not necessarily the best one. What happens if a security method is compromised (e.g., the decryption key is learned or the permutation algorithm reversed)? Or what if the test data is safe, but ends up looking unrealistic or being referentially incorrect? And how would morphing 100,000 social security numbers in one fact table, for example, help you test an enterprise data warehouse that will need to handle 50,000,000 numbers distributed and linked across multiple tables?<\/p>\n<p>It would be better to generate file, report, or <a title=\"Database Testing Information Wiki\" href=\"http:\/\/en.wikipedia.org\/wiki\/Database_testing\" target=\"_blank\" rel=\"noopener\">database test data<\/a> without using production data at all, provided that the test data offers:<\/p>\n<ul>\n<li>realistic (but still not real) values in the right data types and layouts<\/li>\n<li>referential integrity, in that it preserves primary-foreign key associations<\/li>\n<li>the ability to scale to any volume and satisfy any data range requirement<\/li>\n<li>automatic, pre-sorted loads to multiple table or file targets simultaneously<\/li>\n<\/ul>\n<p><a href=\"http:\/\/www.iri.com\/blog\/wp-content\/uploads\/2012\/07\/testdata_types.jpg\"><img loading=\"lazy\" 
decoding=\"async\" class=\"aligncenter size-full wp-image-2390\" title=\"testdata_types\" src=\"http:\/\/www.iri.com\/blog\/wp-content\/uploads\/2012\/07\/testdata_types.jpg\" alt=\"\" width=\"685\" height=\"351\" srcset=\"https:\/\/www.iri.com\/blog\/wp-content\/uploads\/2012\/07\/testdata_types.jpg 685w, https:\/\/www.iri.com\/blog\/wp-content\/uploads\/2012\/07\/testdata_types-300x153.jpg 300w\" sizes=\"(max-width: 685px) 100vw, 685px\" \/><\/a><\/p>\n<p>Within the <a href=\"http:\/\/www.iri.com\/products\/workbench\/rowgen-gui\" target=\"_blank\" rel=\"noopener\">IRI Workbench<\/a>\u00a0GUI, built on Eclipse\u2122,\u00a0<a title=\"RowGen Product page\" href=\"http:\/\/www.iri.com\/products\/RowGen\" target=\"_blank\" rel=\"noopener\">IRI RowGen <\/a>combines\u00a0data model parsing, script generation, and target table loading of big, safe, intelligent test data that is also structurally and referentially correct. This\u00a0means that you can create test data that conforms to metadata specifications, privacy regulations, business rules, and stress-testing requirements.<\/p>\n<p>Watch this space for updates on\u00a0RowGen, and in the meantime,<\/p>\n<p>Do not use production data for testing.<\/p>\n<p>Do not use production data for testing.<\/p>\n<p>Do not use production data for testing.<\/p>\n<p>There, I said it three times.<\/p>\n<p style=\"text-align: center;\"><strong>Click to see an overview of IRI RowGen:<\/strong><\/p>\n<p style=\"text-align: center;\"><iframe loading=\"lazy\" width=\"1140\" height=\"641\" src=\"https:\/\/www.youtube.com\/embed\/9uu97KkxO4c?feature=oembed\" frameborder=\"0\" gesture=\"media\" allowfullscreen><\/iframe><\/p>\n","protected":false},"excerpt":{"rendered":"<p>Database Test Data Usage &#8211; This blog caught my eye\u00a0because of its title, Do the right thing when testing with production data. 
\u00a0It struck me as oxymoronic, since we know production data should not be used for testing at all &#8230; Of course we know how tempting it is to use production data for testing\u00a0applications,<\/p>\n<div><a class=\"btn-filled btn\" href=\"https:\/\/www.iri.com\/blog\/test-data\/the-case-for-test-data\/\" title=\"The Case for Safe, Intelligent Database Test Data\">Read More<\/a><\/div>\n","protected":false},"author":5,"featured_media":2390,"comment_status":"open","ping_status":"closed","sticky":false,"template":"","format":"standard","meta":{"_acf_changed":false,"_exactmetrics_skip_tracking":false,"_exactmetrics_sitenote_active":false,"_exactmetrics_sitenote_note":"","_exactmetrics_sitenote_category":0,"footnotes":""},"categories":[29,1030],"tags":[10,28,89,16,71,46,9,49,88],"class_list":["post-1649","post","type-post","status-publish","format-standard","has-post-thumbnail","hentry","category-test-data","category-vlog","tag-data-encryption","tag-data-warehousing","tag-database-test-data","tag-de-identify-data","tag-eclipse","tag-etl-tools","tag-fieldshield","tag-rowgen","tag-test-data-2"],"acf":[],"yoast_head":"<!-- This site is optimized with the Yoast SEO Premium plugin v23.4 (Yoast SEO v23.4) - https:\/\/yoast.com\/wordpress\/plugins\/seo\/ -->\n<title>The Case for Safe, Intelligent Database Test Data - IRI<\/title>\n<meta name=\"description\" content=\"Avoid using production data for testing; explore why test data is essential for secure and effective application development.\" \/>\n<meta name=\"robots\" content=\"index, follow, max-snippet:-1, max-image-preview:large, max-video-preview:-1\" \/>\n<link rel=\"canonical\" href=\"https:\/\/www.iri.com\/blog\/test-data\/the-case-for-test-data\/\" \/>\n<meta property=\"og:locale\" content=\"en_US\" \/>\n<meta property=\"og:type\" content=\"article\" \/>\n<meta property=\"og:title\" content=\"The Case for Safe, Intelligent Database Test Data\" \/>\n<meta property=\"og:description\" content=\"Avoid using 
production data for testing; explore why test data is essential for secure and effective application development.\" \/>\n<meta property=\"og:url\" content=\"https:\/\/www.iri.com\/blog\/test-data\/the-case-for-test-data\/\" \/>\n<meta property=\"og:site_name\" content=\"IRI\" \/>\n<meta property=\"article:published_time\" content=\"2012-07-26T22:08:08+00:00\" \/>\n<meta property=\"article:modified_time\" content=\"2026-01-08T20:28:27+00:00\" \/>\n<meta property=\"og:image\" content=\"https:\/\/www.iri.com\/blog\/wp-content\/uploads\/2012\/07\/testdata_types.jpg\" \/>\n\t<meta property=\"og:image:width\" content=\"685\" \/>\n\t<meta property=\"og:image:height\" content=\"351\" \/>\n\t<meta property=\"og:image:type\" content=\"image\/jpeg\" \/>\n<meta name=\"author\" content=\"Jason Koivu\" \/>\n<meta name=\"twitter:card\" content=\"summary_large_image\" \/>\n<meta name=\"twitter:label1\" content=\"Written by\" \/>\n\t<meta name=\"twitter:data1\" content=\"Jason Koivu\" \/>\n\t<meta name=\"twitter:label2\" content=\"Est. 
reading time\" \/>\n\t<meta name=\"twitter:data2\" content=\"3 minutes\" \/>\n<script type=\"application\/ld+json\" class=\"yoast-schema-graph\">{\"@context\":\"https:\/\/schema.org\",\"@graph\":[{\"@type\":\"Article\",\"@id\":\"https:\/\/www.iri.com\/blog\/test-data\/the-case-for-test-data\/#article\",\"isPartOf\":{\"@id\":\"https:\/\/www.iri.com\/blog\/test-data\/the-case-for-test-data\/\"},\"author\":{\"name\":\"Jason Koivu\",\"@id\":\"https:\/\/www.iri.com\/blog\/#\/schema\/person\/c60bc4ff5919427034376979fb2cc8df\"},\"headline\":\"The Case for Safe, Intelligent Database Test Data\",\"datePublished\":\"2012-07-26T22:08:08+00:00\",\"dateModified\":\"2026-01-08T20:28:27+00:00\",\"mainEntityOfPage\":{\"@id\":\"https:\/\/www.iri.com\/blog\/test-data\/the-case-for-test-data\/\"},\"wordCount\":484,\"commentCount\":0,\"publisher\":{\"@id\":\"https:\/\/www.iri.com\/blog\/#organization\"},\"image\":{\"@id\":\"https:\/\/www.iri.com\/blog\/test-data\/the-case-for-test-data\/#primaryimage\"},\"thumbnailUrl\":\"https:\/\/www.iri.com\/blog\/wp-content\/uploads\/2012\/07\/testdata_types.jpg\",\"keywords\":[\"data encryption\",\"Data Warehousing\",\"database test data\",\"de-identify data\",\"Eclipse\",\"ETL tools\",\"FieldShield\",\"RowGen\",\"test data\"],\"articleSection\":[\"Test Data\",\"VLOG\"],\"inLanguage\":\"en-US\",\"potentialAction\":[{\"@type\":\"CommentAction\",\"name\":\"Comment\",\"target\":[\"https:\/\/www.iri.com\/blog\/test-data\/the-case-for-test-data\/#respond\"]}]},{\"@type\":\"WebPage\",\"@id\":\"https:\/\/www.iri.com\/blog\/test-data\/the-case-for-test-data\/\",\"url\":\"https:\/\/www.iri.com\/blog\/test-data\/the-case-for-test-data\/\",\"name\":\"The Case for Safe, Intelligent Database Test Data - 
IRI\",\"isPartOf\":{\"@id\":\"https:\/\/www.iri.com\/blog\/#website\"},\"primaryImageOfPage\":{\"@id\":\"https:\/\/www.iri.com\/blog\/test-data\/the-case-for-test-data\/#primaryimage\"},\"image\":{\"@id\":\"https:\/\/www.iri.com\/blog\/test-data\/the-case-for-test-data\/#primaryimage\"},\"thumbnailUrl\":\"https:\/\/www.iri.com\/blog\/wp-content\/uploads\/2012\/07\/testdata_types.jpg\",\"datePublished\":\"2012-07-26T22:08:08+00:00\",\"dateModified\":\"2026-01-08T20:28:27+00:00\",\"description\":\"Avoid using production data for testing; explore why test data is essential for secure and effective application development.\",\"breadcrumb\":{\"@id\":\"https:\/\/www.iri.com\/blog\/test-data\/the-case-for-test-data\/#breadcrumb\"},\"inLanguage\":\"en-US\",\"potentialAction\":[{\"@type\":\"ReadAction\",\"target\":[\"https:\/\/www.iri.com\/blog\/test-data\/the-case-for-test-data\/\"]}]},{\"@type\":\"ImageObject\",\"inLanguage\":\"en-US\",\"@id\":\"https:\/\/www.iri.com\/blog\/test-data\/the-case-for-test-data\/#primaryimage\",\"url\":\"https:\/\/www.iri.com\/blog\/wp-content\/uploads\/2012\/07\/testdata_types.jpg\",\"contentUrl\":\"https:\/\/www.iri.com\/blog\/wp-content\/uploads\/2012\/07\/testdata_types.jpg\",\"width\":\"685\",\"height\":\"351\"},{\"@type\":\"BreadcrumbList\",\"@id\":\"https:\/\/www.iri.com\/blog\/test-data\/the-case-for-test-data\/#breadcrumb\",\"itemListElement\":[{\"@type\":\"ListItem\",\"position\":1,\"name\":\"Home\",\"item\":\"https:\/\/www.iri.com\/blog\/\"},{\"@type\":\"ListItem\",\"position\":2,\"name\":\"The Case for Safe, Intelligent Database Test Data\"}]},{\"@type\":\"WebSite\",\"@id\":\"https:\/\/www.iri.com\/blog\/#website\",\"url\":\"https:\/\/www.iri.com\/blog\/\",\"name\":\"IRI\",\"description\":\"Total Data Management 
Blog\",\"publisher\":{\"@id\":\"https:\/\/www.iri.com\/blog\/#organization\"},\"potentialAction\":[{\"@type\":\"SearchAction\",\"target\":{\"@type\":\"EntryPoint\",\"urlTemplate\":\"https:\/\/www.iri.com\/blog\/?s={search_term_string}\"},\"query-input\":{\"@type\":\"PropertyValueSpecification\",\"valueRequired\":true,\"valueName\":\"search_term_string\"}}],\"inLanguage\":\"en-US\"},{\"@type\":\"Organization\",\"@id\":\"https:\/\/www.iri.com\/blog\/#organization\",\"name\":\"IRI\",\"url\":\"https:\/\/www.iri.com\/blog\/\",\"logo\":{\"@type\":\"ImageObject\",\"inLanguage\":\"en-US\",\"@id\":\"https:\/\/www.iri.com\/blog\/#\/schema\/logo\/image\/\",\"url\":\"https:\/\/www.iri.com\/blog\/wp-content\/uploads\/2019\/02\/iri-logo-total-data-management-small-1.png\",\"contentUrl\":\"https:\/\/www.iri.com\/blog\/wp-content\/uploads\/2019\/02\/iri-logo-total-data-management-small-1.png\",\"width\":750,\"height\":206,\"caption\":\"IRI\"},\"image\":{\"@id\":\"https:\/\/www.iri.com\/blog\/#\/schema\/logo\/image\/\"}},{\"@type\":\"Person\",\"@id\":\"https:\/\/www.iri.com\/blog\/#\/schema\/person\/c60bc4ff5919427034376979fb2cc8df\",\"name\":\"Jason Koivu\",\"image\":{\"@type\":\"ImageObject\",\"inLanguage\":\"en-US\",\"@id\":\"https:\/\/www.iri.com\/blog\/#\/schema\/person\/image\/\",\"url\":\"https:\/\/secure.gravatar.com\/avatar\/01e97234ff964558ca620a43a0506ef0?s=96&d=blank&r=g\",\"contentUrl\":\"https:\/\/secure.gravatar.com\/avatar\/01e97234ff964558ca620a43a0506ef0?s=96&d=blank&r=g\",\"caption\":\"Jason Koivu\"},\"url\":\"https:\/\/www.iri.com\/blog\/author\/jasonk\/\"}]}<\/script>\n<!-- \/ Yoast SEO Premium plugin. 
-->","yoast_head_json":{"title":"The Case for Safe, Intelligent Database Test Data - IRI","description":"Avoid using production data for testing; explore why test data is essential for secure and effective application development.","robots":{"index":"index","follow":"follow","max-snippet":"max-snippet:-1","max-image-preview":"max-image-preview:large","max-video-preview":"max-video-preview:-1"},"canonical":"https:\/\/www.iri.com\/blog\/test-data\/the-case-for-test-data\/","og_locale":"en_US","og_type":"article","og_title":"The Case for Safe, Intelligent Database Test Data","og_description":"Avoid using production data for testing; explore why test data is essential for secure and effective application development.","og_url":"https:\/\/www.iri.com\/blog\/test-data\/the-case-for-test-data\/","og_site_name":"IRI","article_published_time":"2012-07-26T22:08:08+00:00","article_modified_time":"2026-01-08T20:28:27+00:00","og_image":[{"width":685,"height":351,"url":"https:\/\/www.iri.com\/blog\/wp-content\/uploads\/2012\/07\/testdata_types.jpg","type":"image\/jpeg"}],"author":"Jason Koivu","twitter_card":"summary_large_image","twitter_misc":{"Written by":"Jason Koivu","Est. 
reading time":"3 minutes"},"schema":{"@context":"https:\/\/schema.org","@graph":[{"@type":"Article","@id":"https:\/\/www.iri.com\/blog\/test-data\/the-case-for-test-data\/#article","isPartOf":{"@id":"https:\/\/www.iri.com\/blog\/test-data\/the-case-for-test-data\/"},"author":{"name":"Jason Koivu","@id":"https:\/\/www.iri.com\/blog\/#\/schema\/person\/c60bc4ff5919427034376979fb2cc8df"},"headline":"The Case for Safe, Intelligent Database Test Data","datePublished":"2012-07-26T22:08:08+00:00","dateModified":"2026-01-08T20:28:27+00:00","mainEntityOfPage":{"@id":"https:\/\/www.iri.com\/blog\/test-data\/the-case-for-test-data\/"},"wordCount":484,"commentCount":0,"publisher":{"@id":"https:\/\/www.iri.com\/blog\/#organization"},"image":{"@id":"https:\/\/www.iri.com\/blog\/test-data\/the-case-for-test-data\/#primaryimage"},"thumbnailUrl":"https:\/\/www.iri.com\/blog\/wp-content\/uploads\/2012\/07\/testdata_types.jpg","keywords":["data encryption","Data Warehousing","database test data","de-identify data","Eclipse","ETL tools","FieldShield","RowGen","test data"],"articleSection":["Test Data","VLOG"],"inLanguage":"en-US","potentialAction":[{"@type":"CommentAction","name":"Comment","target":["https:\/\/www.iri.com\/blog\/test-data\/the-case-for-test-data\/#respond"]}]},{"@type":"WebPage","@id":"https:\/\/www.iri.com\/blog\/test-data\/the-case-for-test-data\/","url":"https:\/\/www.iri.com\/blog\/test-data\/the-case-for-test-data\/","name":"The Case for Safe, Intelligent Database Test Data - IRI","isPartOf":{"@id":"https:\/\/www.iri.com\/blog\/#website"},"primaryImageOfPage":{"@id":"https:\/\/www.iri.com\/blog\/test-data\/the-case-for-test-data\/#primaryimage"},"image":{"@id":"https:\/\/www.iri.com\/blog\/test-data\/the-case-for-test-data\/#primaryimage"},"thumbnailUrl":"https:\/\/www.iri.com\/blog\/wp-content\/uploads\/2012\/07\/testdata_types.jpg","datePublished":"2012-07-26T22:08:08+00:00","dateModified":"2026-01-08T20:28:27+00:00","description":"Avoid using production data 
for testing; explore why test data is essential for secure and effective application development.","breadcrumb":{"@id":"https:\/\/www.iri.com\/blog\/test-data\/the-case-for-test-data\/#breadcrumb"},"inLanguage":"en-US","potentialAction":[{"@type":"ReadAction","target":["https:\/\/www.iri.com\/blog\/test-data\/the-case-for-test-data\/"]}]},{"@type":"ImageObject","inLanguage":"en-US","@id":"https:\/\/www.iri.com\/blog\/test-data\/the-case-for-test-data\/#primaryimage","url":"https:\/\/www.iri.com\/blog\/wp-content\/uploads\/2012\/07\/testdata_types.jpg","contentUrl":"https:\/\/www.iri.com\/blog\/wp-content\/uploads\/2012\/07\/testdata_types.jpg","width":"685","height":"351"},{"@type":"BreadcrumbList","@id":"https:\/\/www.iri.com\/blog\/test-data\/the-case-for-test-data\/#breadcrumb","itemListElement":[{"@type":"ListItem","position":1,"name":"Home","item":"https:\/\/www.iri.com\/blog\/"},{"@type":"ListItem","position":2,"name":"The Case for Safe, Intelligent Database Test Data"}]},{"@type":"WebSite","@id":"https:\/\/www.iri.com\/blog\/#website","url":"https:\/\/www.iri.com\/blog\/","name":"IRI","description":"Total Data Management 
Blog","publisher":{"@id":"https:\/\/www.iri.com\/blog\/#organization"},"potentialAction":[{"@type":"SearchAction","target":{"@type":"EntryPoint","urlTemplate":"https:\/\/www.iri.com\/blog\/?s={search_term_string}"},"query-input":{"@type":"PropertyValueSpecification","valueRequired":true,"valueName":"search_term_string"}}],"inLanguage":"en-US"},{"@type":"Organization","@id":"https:\/\/www.iri.com\/blog\/#organization","name":"IRI","url":"https:\/\/www.iri.com\/blog\/","logo":{"@type":"ImageObject","inLanguage":"en-US","@id":"https:\/\/www.iri.com\/blog\/#\/schema\/logo\/image\/","url":"https:\/\/www.iri.com\/blog\/wp-content\/uploads\/2019\/02\/iri-logo-total-data-management-small-1.png","contentUrl":"https:\/\/www.iri.com\/blog\/wp-content\/uploads\/2019\/02\/iri-logo-total-data-management-small-1.png","width":750,"height":206,"caption":"IRI"},"image":{"@id":"https:\/\/www.iri.com\/blog\/#\/schema\/logo\/image\/"}},{"@type":"Person","@id":"https:\/\/www.iri.com\/blog\/#\/schema\/person\/c60bc4ff5919427034376979fb2cc8df","name":"Jason Koivu","image":{"@type":"ImageObject","inLanguage":"en-US","@id":"https:\/\/www.iri.com\/blog\/#\/schema\/person\/image\/","url":"https:\/\/secure.gravatar.com\/avatar\/01e97234ff964558ca620a43a0506ef0?s=96&d=blank&r=g","contentUrl":"https:\/\/secure.gravatar.com\/avatar\/01e97234ff964558ca620a43a0506ef0?s=96&d=blank&r=g","caption":"Jason 
Koivu"},"url":"https:\/\/www.iri.com\/blog\/author\/jasonk\/"}]}},"jetpack_featured_media_url":"https:\/\/www.iri.com\/blog\/wp-content\/uploads\/2012\/07\/testdata_types.jpg","_links":{"self":[{"href":"https:\/\/www.iri.com\/blog\/wp-json\/wp\/v2\/posts\/1649"}],"collection":[{"href":"https:\/\/www.iri.com\/blog\/wp-json\/wp\/v2\/posts"}],"about":[{"href":"https:\/\/www.iri.com\/blog\/wp-json\/wp\/v2\/types\/post"}],"author":[{"embeddable":true,"href":"https:\/\/www.iri.com\/blog\/wp-json\/wp\/v2\/users\/5"}],"replies":[{"embeddable":true,"href":"https:\/\/www.iri.com\/blog\/wp-json\/wp\/v2\/comments?post=1649"}],"version-history":[{"count":98,"href":"https:\/\/www.iri.com\/blog\/wp-json\/wp\/v2\/posts\/1649\/revisions"}],"predecessor-version":[{"id":18984,"href":"https:\/\/www.iri.com\/blog\/wp-json\/wp\/v2\/posts\/1649\/revisions\/18984"}],"wp:featuredmedia":[{"embeddable":true,"href":"https:\/\/www.iri.com\/blog\/wp-json\/wp\/v2\/media\/2390"}],"wp:attachment":[{"href":"https:\/\/www.iri.com\/blog\/wp-json\/wp\/v2\/media?parent=1649"}],"wp:term":[{"taxonomy":"category","embeddable":true,"href":"https:\/\/www.iri.com\/blog\/wp-json\/wp\/v2\/categories?post=1649"},{"taxonomy":"post_tag","embeddable":true,"href":"https:\/\/www.iri.com\/blog\/wp-json\/wp\/v2\/tags?post=1649"}],"curies":[{"name":"wp","href":"https:\/\/api.w.org\/{rel}","templated":true}]}}