{"id":3284,"date":"2013-01-21T21:21:26","date_gmt":"2013-01-22T02:21:26","guid":{"rendered":"http:\/\/www.iri.com\/blog\/?p=3284"},"modified":"2017-11-06T08:32:05","modified_gmt":"2017-11-06T13:32:05","slug":"selectfilter-to-reduce-big-data-bulk","status":"publish","type":"post","link":"https:\/\/www.iri.com\/blog\/data-transformation2\/selectfilter-to-reduce-big-data-bulk\/","title":{"rendered":"Using Selection to Reduce Data Bulk (and Improve Data Quality)"},"content":{"rendered":"<p>One of the best ways to speed up big data processing operations is to not process so much data in the first place; i.e. to eliminate unnecessary data ahead of time. Data can be culled en masse by specifying the collection, processing, or output of only a certain number of rows, or more intelligently with selection criteria in which certain conditions must be met for the data to be included in, or omitted from, processing.<\/p>\n<p>In <a href=\"http:\/\/www.iri.com\/products\/cosort\" target=\"_blank\" rel=\"noopener\">IRI CoSort<\/a> (data transformation and reporting), <a href=\"http:\/\/www.iri.com\/products\/nextform\" target=\"_blank\" rel=\"noopener\">IRI NextForm<\/a> (data, file and DB migration), <a href=\"http:\/\/www.iri.com\/products\/fieldshield\" target=\"_blank\" rel=\"noopener\">IRI FieldShield<\/a> (data masking), and even <a href=\"http:\/\/www.iri.com\/products\/rowgen\" target=\"_blank\" rel=\"noopener\">IRI RowGen<\/a> (test data generation) job scripts, you can apply SQL-like WHERE filters at the input, action, and output phases of each job. These filters are defined, stored, and effected in the CoSort Sort Control Language (<a title=\"SortCL Program Product Page\" href=\"http:\/\/www.iri.com\/products\/cosort\/sortcl\" target=\"_blank\" rel=\"noopener\">SortCL<\/a>) program common to all IRI tools to avoid the reading, processing, and\/or targeting of data you simply do not need at each logical phase of a job.<\/p>\n<p>First let&#8217;s cover de-duplication. 
If you use CoSort or the <a href=\"http:\/\/www.iri.com\/products\/voracity\">IRI Voracity<\/a> platform that includes it, SortCL&#8217;s \/NODUPLICATES command will remove the second and successive rows in any job that sorts when their key values match; i.e., only the key fields (not the entire records) must compare equal for a row to be removed. Conversely, you can specify \/DUPLICATESONLY so that only records containing key fields that compare equal are processed and output. You can see how either function can save time in processing large data sets.<\/p>\n<p>Conditions for the \/INCLUDE and \/OMIT statements in IRI software can be simple or elaborate, based on unary changes in data, or complex boolean logic. These conditions can be defined at the field (column) or record (row) level, and expressed either in SortCL text scripts or in the IRI Workbench GUI which creates and runs those scripts. In the GUI, built on Eclipse, you can either hand-write job scripts or create them through a wizard or dialog editors that reflect your business rules for filtering data.<\/p>\n<p>The example input file below contains a list of transactions over a given year:<\/p>\n<p><a href=\"http:\/\/www.iri.com\/blog\/wp-content\/uploads\/2013\/01\/in_dat-A1.jpg\"><img loading=\"lazy\" decoding=\"async\" class=\"aligncenter size-full wp-image-3467\" title=\"in_dat-A\" src=\"http:\/\/www.iri.com\/blog\/wp-content\/uploads\/2013\/01\/in_dat-A1.jpg\" alt=\"\" width=\"459\" height=\"325\" srcset=\"https:\/\/www.iri.com\/blog\/wp-content\/uploads\/2013\/01\/in_dat-A1.jpg 459w, https:\/\/www.iri.com\/blog\/wp-content\/uploads\/2013\/01\/in_dat-A1-300x212.jpg 300w\" sizes=\"(max-width: 459px) 100vw, 459px\" \/><\/a><\/p>\n<p>Conditions can be expressed explicitly in job scripts (for example, \/INCLUDE WHERE AGE GT 55 AND GENDER EQ &#8220;MALE&#8221;) or implicitly through defined and named conditions like this one:<\/p>\n<p><a 
href=\"http:\/\/www.iri.com\/blog\/wp-content\/uploads\/2013\/01\/condition-A.jpg\"><img loading=\"lazy\" decoding=\"async\" class=\"aligncenter size-full wp-image-3469\" title=\"condition-A\" src=\"http:\/\/www.iri.com\/blog\/wp-content\/uploads\/2013\/01\/condition-A.jpg\" alt=\"\" width=\"468\" height=\"483\" srcset=\"https:\/\/www.iri.com\/blog\/wp-content\/uploads\/2013\/01\/condition-A.jpg 468w, https:\/\/www.iri.com\/blog\/wp-content\/uploads\/2013\/01\/condition-A-290x300.jpg 290w\" sizes=\"(max-width: 468px) 100vw, 468px\" \/><\/a><\/p>\n<p>The particular filter in this case is an include action where the records meeting the condition &#8216;validrec&#8217; are selected for processing.<\/p>\n<p><a href=\"http:\/\/www.iri.com\/blog\/wp-content\/uploads\/2013\/01\/include-A.jpg\"><img loading=\"lazy\" decoding=\"async\" class=\"aligncenter\" title=\"include-A\" src=\"http:\/\/www.iri.com\/blog\/wp-content\/uploads\/2013\/01\/include-A.jpg\" alt=\"\" width=\"390\" height=\"310\" \/><\/a><\/p>\n<p>The result of this filter transformation is automatically reflected in the overall job script shown below, and supported by its dynamic outline to the right:<\/p>\n<p style=\"text-align: center;\"><a href=\"http:\/\/www.iri.com\/blog\/wp-content\/uploads\/2013\/01\/entireWB-A1.jpg\"><img loading=\"lazy\" decoding=\"async\" class=\"aligncenter wp-image-3472\" title=\"entireWB-A\" src=\"http:\/\/www.iri.com\/blog\/wp-content\/uploads\/2013\/01\/entireWB-A1.jpg\" alt=\"\" width=\"738\" height=\"468\" srcset=\"https:\/\/www.iri.com\/blog\/wp-content\/uploads\/2013\/01\/entireWB-A1.jpg 1110w, https:\/\/www.iri.com\/blog\/wp-content\/uploads\/2013\/01\/entireWB-A1-300x189.jpg 300w, https:\/\/www.iri.com\/blog\/wp-content\/uploads\/2013\/01\/entireWB-A1-1024x648.jpg 1024w\" sizes=\"(max-width: 738px) 100vw, 738px\" \/><\/a><\/p>\n<p>Only those records meeting the &#8216;validrec&#8217; condition are input to the sort and displayed on output:<\/p>\n<p><a 
href=\"http:\/\/www.iri.com\/blog\/wp-content\/uploads\/2013\/01\/out_dat-A.jpg\"><img loading=\"lazy\" decoding=\"async\" class=\"aligncenter size-full wp-image-3471\" title=\"out_dat-A\" src=\"http:\/\/www.iri.com\/blog\/wp-content\/uploads\/2013\/01\/out_dat-A.jpg\" alt=\"\" width=\"459\" height=\"325\" srcset=\"https:\/\/www.iri.com\/blog\/wp-content\/uploads\/2013\/01\/out_dat-A.jpg 459w, https:\/\/www.iri.com\/blog\/wp-content\/uploads\/2013\/01\/out_dat-A-300x212.jpg 300w\" sizes=\"(max-width: 459px) 100vw, 459px\" \/><\/a><\/p>\n<p>The other conditions associated with that particular data source were not used in subsequent filtering, formatting, or aggregation statements in this job, but could have been.<\/p>\n<p>See also:<\/p>\n<p><a title=\"Select-Filter Data Transformation Solution Page\" href=\"http:\/\/www.iri.com\/solutions\/data-transformation\/select-filter\" target=\"_blank\" rel=\"noopener\">http:\/\/www.iri.com\/solutions\/data-transformation\/select-filter<\/a><br \/>\n<a title=\"Blog Article - Selecting PII for secure queries-fieldshield-filters\" href=\"http:\/\/www.iri.com\/blog\/data-protection\/selecting-pii-for-secure-queries-fieldshield-filters\/\" target=\"_blank\" rel=\"noopener\">http:\/\/www.iri.com\/blog\/data-protection\/selecting-pii-for-secure-queries-fieldshield-filters\/<\/a><\/p>\n","protected":false},"excerpt":{"rendered":"<p>One of the best ways to speed up big data processing operations is to not process so much data in the first place; i.e. to eliminate unnecessary data ahead of time. 
Data can be culled en masse by specifying the collection, processing, or output of only a certain number of rows, or more intelligently with<\/p>\n<div><a class=\"btn-filled btn\" href=\"https:\/\/www.iri.com\/blog\/data-transformation2\/selectfilter-to-reduce-big-data-bulk\/\" title=\"Using Selection to Reduce Data Bulk (and Improve Data Quality)\">Read More<\/a><\/div>\n","protected":false},"author":7,"featured_media":3468,"comment_status":"open","ping_status":"open","sticky":false,"template":"","format":"standard","meta":{"_acf_changed":false,"_exactmetrics_skip_tracking":false,"_exactmetrics_sitenote_active":false,"_exactmetrics_sitenote_note":"","_exactmetrics_sitenote_category":0,"footnotes":""},"categories":[108,31,363,1,91],"tags":[25,44,14,77,45,5,9,141,130,850,94,76,49,68],"class_list":["post-3284","post","type-post","status-publish","format-standard","has-post-thumbnail","hentry","category-big-data-2","category-data-migration","category-data-quality","category-data-transformation2","category-iri-workbench","tag-big-data","tag-cosort","tag-data-masking","tag-data-migration-2","tag-data-processing","tag-data-transformation","tag-fieldshield","tag-filter-data","tag-includeomit-statements","tag-iri-workbench","tag-job-scripts","tag-nextform","tag-rowgen","tag-sortcl"],"acf":[],"yoast_head":"<!-- This site is optimized with the Yoast SEO Premium plugin v23.4 (Yoast SEO v23.4) - https:\/\/yoast.com\/wordpress\/plugins\/seo\/ -->\n<title>Using Selection to Reduce Data Bulk (and Improve Data Quality) - IRI<\/title>\n<meta name=\"robots\" content=\"index, follow, max-snippet:-1, max-image-preview:large, max-video-preview:-1\" \/>\n<link rel=\"canonical\" href=\"https:\/\/www.iri.com\/blog\/data-transformation2\/selectfilter-to-reduce-big-data-bulk\/\" \/>\n<meta property=\"og:locale\" content=\"en_US\" \/>\n<meta property=\"og:type\" content=\"article\" \/>\n<meta property=\"og:title\" content=\"Using Selection to Reduce Data Bulk (and Improve Data Quality)\" 
\/>\n<meta property=\"og:description\" content=\"One of the best ways to speed up big data processing operations is to not process so much data in the first place; i.e. to eliminate unnecessary data ahead of time. Data can be culled en masse by specifying the collection, processing, or output of only a certain number of rows, or more intelligently withRead More\" \/>\n<meta property=\"og:url\" content=\"https:\/\/www.iri.com\/blog\/data-transformation2\/selectfilter-to-reduce-big-data-bulk\/\" \/>\n<meta property=\"og:site_name\" content=\"IRI\" \/>\n<meta property=\"article:published_time\" content=\"2013-01-22T02:21:26+00:00\" \/>\n<meta property=\"article:modified_time\" content=\"2017-11-06T13:32:05+00:00\" \/>\n<meta property=\"og:image\" content=\"https:\/\/www.iri.com\/blog\/wp-content\/uploads\/2013\/01\/include-A.jpg\" \/>\n\t<meta property=\"og:image:width\" content=\"390\" \/>\n\t<meta property=\"og:image:height\" content=\"310\" \/>\n\t<meta property=\"og:image:type\" content=\"image\/jpeg\" \/>\n<meta name=\"author\" content=\"Sharon Hewitt\" \/>\n<meta name=\"twitter:card\" content=\"summary_large_image\" \/>\n<meta name=\"twitter:label1\" content=\"Written by\" \/>\n\t<meta name=\"twitter:data1\" content=\"Sharon Hewitt\" \/>\n\t<meta name=\"twitter:label2\" content=\"Est. 
reading time\" \/>\n\t<meta name=\"twitter:data2\" content=\"2 minutes\" \/>\n<script type=\"application\/ld+json\" class=\"yoast-schema-graph\">{\"@context\":\"https:\/\/schema.org\",\"@graph\":[{\"@type\":\"Article\",\"@id\":\"https:\/\/www.iri.com\/blog\/data-transformation2\/selectfilter-to-reduce-big-data-bulk\/#article\",\"isPartOf\":{\"@id\":\"https:\/\/www.iri.com\/blog\/data-transformation2\/selectfilter-to-reduce-big-data-bulk\/\"},\"author\":{\"name\":\"Sharon Hewitt\",\"@id\":\"https:\/\/www.iri.com\/blog\/#\/schema\/person\/18c4f34270c95345ba1274daad4ed795\"},\"headline\":\"Using Selection to Reduce Data Bulk (and Improve Data Quality)\",\"datePublished\":\"2013-01-22T02:21:26+00:00\",\"dateModified\":\"2017-11-06T13:32:05+00:00\",\"mainEntityOfPage\":{\"@id\":\"https:\/\/www.iri.com\/blog\/data-transformation2\/selectfilter-to-reduce-big-data-bulk\/\"},\"wordCount\":485,\"commentCount\":0,\"publisher\":{\"@id\":\"https:\/\/www.iri.com\/blog\/#organization\"},\"image\":{\"@id\":\"https:\/\/www.iri.com\/blog\/data-transformation2\/selectfilter-to-reduce-big-data-bulk\/#primaryimage\"},\"thumbnailUrl\":\"https:\/\/www.iri.com\/blog\/wp-content\/uploads\/2013\/01\/include-A.jpg\",\"keywords\":[\"big data\",\"CoSort\",\"data masking\",\"data migration\",\"data processing\",\"data transformation\",\"FieldShield\",\"filter data\",\"include\/omit statements\",\"IRI Workbench\",\"job scripts\",\"NextForm\",\"RowGen\",\"SortCL\"],\"articleSection\":[\"Big Data\",\"Data Migration\",\"Data Quality (DQ&#041;\",\"Data Transformation\",\"IRI 
Workbench\"],\"inLanguage\":\"en-US\",\"potentialAction\":[{\"@type\":\"CommentAction\",\"name\":\"Comment\",\"target\":[\"https:\/\/www.iri.com\/blog\/data-transformation2\/selectfilter-to-reduce-big-data-bulk\/#respond\"]}]},{\"@type\":\"WebPage\",\"@id\":\"https:\/\/www.iri.com\/blog\/data-transformation2\/selectfilter-to-reduce-big-data-bulk\/\",\"url\":\"https:\/\/www.iri.com\/blog\/data-transformation2\/selectfilter-to-reduce-big-data-bulk\/\",\"name\":\"Using Selection to Reduce Data Bulk (and Improve Data Quality) - IRI\",\"isPartOf\":{\"@id\":\"https:\/\/www.iri.com\/blog\/#website\"},\"primaryImageOfPage\":{\"@id\":\"https:\/\/www.iri.com\/blog\/data-transformation2\/selectfilter-to-reduce-big-data-bulk\/#primaryimage\"},\"image\":{\"@id\":\"https:\/\/www.iri.com\/blog\/data-transformation2\/selectfilter-to-reduce-big-data-bulk\/#primaryimage\"},\"thumbnailUrl\":\"https:\/\/www.iri.com\/blog\/wp-content\/uploads\/2013\/01\/include-A.jpg\",\"datePublished\":\"2013-01-22T02:21:26+00:00\",\"dateModified\":\"2017-11-06T13:32:05+00:00\",\"breadcrumb\":{\"@id\":\"https:\/\/www.iri.com\/blog\/data-transformation2\/selectfilter-to-reduce-big-data-bulk\/#breadcrumb\"},\"inLanguage\":\"en-US\",\"potentialAction\":[{\"@type\":\"ReadAction\",\"target\":[\"https:\/\/www.iri.com\/blog\/data-transformation2\/selectfilter-to-reduce-big-data-bulk\/\"]}]},{\"@type\":\"ImageObject\",\"inLanguage\":\"en-US\",\"@id\":\"https:\/\/www.iri.com\/blog\/data-transformation2\/selectfilter-to-reduce-big-data-bulk\/#primaryimage\",\"url\":\"https:\/\/www.iri.com\/blog\/wp-content\/uploads\/2013\/01\/include-A.jpg\",\"contentUrl\":\"https:\/\/www.iri.com\/blog\/wp-content\/uploads\/2013\/01\/include-A.jpg\",\"width\":\"390\",\"height\":\"310\"},{\"@type\":\"BreadcrumbList\",\"@id\":\"https:\/\/www.iri.com\/blog\/data-transformation2\/selectfilter-to-reduce-big-data-bulk\/#breadcrumb\",\"itemListElement\":[{\"@type\":\"ListItem\",\"position\":1,\"name\":\"Home\",\"item\":\"https:\/\/www.
iri.com\/blog\/\"},{\"@type\":\"ListItem\",\"position\":2,\"name\":\"Using Selection to Reduce Data Bulk (and Improve Data Quality)\"}]},{\"@type\":\"WebSite\",\"@id\":\"https:\/\/www.iri.com\/blog\/#website\",\"url\":\"https:\/\/www.iri.com\/blog\/\",\"name\":\"IRI\",\"description\":\"Total Data Management Blog\",\"publisher\":{\"@id\":\"https:\/\/www.iri.com\/blog\/#organization\"},\"potentialAction\":[{\"@type\":\"SearchAction\",\"target\":{\"@type\":\"EntryPoint\",\"urlTemplate\":\"https:\/\/www.iri.com\/blog\/?s={search_term_string}\"},\"query-input\":{\"@type\":\"PropertyValueSpecification\",\"valueRequired\":true,\"valueName\":\"search_term_string\"}}],\"inLanguage\":\"en-US\"},{\"@type\":\"Organization\",\"@id\":\"https:\/\/www.iri.com\/blog\/#organization\",\"name\":\"IRI\",\"url\":\"https:\/\/www.iri.com\/blog\/\",\"logo\":{\"@type\":\"ImageObject\",\"inLanguage\":\"en-US\",\"@id\":\"https:\/\/www.iri.com\/blog\/#\/schema\/logo\/image\/\",\"url\":\"https:\/\/www.iri.com\/blog\/wp-content\/uploads\/2019\/02\/iri-logo-total-data-management-small-1.png\",\"contentUrl\":\"https:\/\/www.iri.com\/blog\/wp-content\/uploads\/2019\/02\/iri-logo-total-data-management-small-1.png\",\"width\":750,\"height\":206,\"caption\":\"IRI\"},\"image\":{\"@id\":\"https:\/\/www.iri.com\/blog\/#\/schema\/logo\/image\/\"}},{\"@type\":\"Person\",\"@id\":\"https:\/\/www.iri.com\/blog\/#\/schema\/person\/18c4f34270c95345ba1274daad4ed795\",\"name\":\"Sharon Hewitt\",\"image\":{\"@type\":\"ImageObject\",\"inLanguage\":\"en-US\",\"@id\":\"https:\/\/www.iri.com\/blog\/#\/schema\/person\/image\/\",\"url\":\"https:\/\/secure.gravatar.com\/avatar\/bd823330fbdcccbe30b856710edc3f94?s=96&d=blank&r=g\",\"contentUrl\":\"https:\/\/secure.gravatar.com\/avatar\/bd823330fbdcccbe30b856710edc3f94?s=96&d=blank&r=g\",\"caption\":\"Sharon Hewitt\"},\"url\":\"https:\/\/www.iri.com\/blog\/author\/sharonh\/\"}]}<\/script>\n<!-- \/ Yoast SEO Premium plugin. 
-->","yoast_head_json":{"title":"Using Selection to Reduce Data Bulk (and Improve Data Quality) - IRI","robots":{"index":"index","follow":"follow","max-snippet":"max-snippet:-1","max-image-preview":"max-image-preview:large","max-video-preview":"max-video-preview:-1"},"canonical":"https:\/\/www.iri.com\/blog\/data-transformation2\/selectfilter-to-reduce-big-data-bulk\/","og_locale":"en_US","og_type":"article","og_title":"Using Selection to Reduce Data Bulk (and Improve Data Quality)","og_description":"One of the best ways to speed up big data processing operations is to not process so much data in the first place; i.e. to eliminate unnecessary data ahead of time. Data can be culled en masse by specifying the collection, processing, or output of only a certain number of rows, or more intelligently withRead More","og_url":"https:\/\/www.iri.com\/blog\/data-transformation2\/selectfilter-to-reduce-big-data-bulk\/","og_site_name":"IRI","article_published_time":"2013-01-22T02:21:26+00:00","article_modified_time":"2017-11-06T13:32:05+00:00","og_image":[{"width":390,"height":310,"url":"https:\/\/www.iri.com\/blog\/wp-content\/uploads\/2013\/01\/include-A.jpg","type":"image\/jpeg"}],"author":"Sharon Hewitt","twitter_card":"summary_large_image","twitter_misc":{"Written by":"Sharon Hewitt","Est. 
reading time":"2 minutes"},"schema":{"@context":"https:\/\/schema.org","@graph":[{"@type":"Article","@id":"https:\/\/www.iri.com\/blog\/data-transformation2\/selectfilter-to-reduce-big-data-bulk\/#article","isPartOf":{"@id":"https:\/\/www.iri.com\/blog\/data-transformation2\/selectfilter-to-reduce-big-data-bulk\/"},"author":{"name":"Sharon Hewitt","@id":"https:\/\/www.iri.com\/blog\/#\/schema\/person\/18c4f34270c95345ba1274daad4ed795"},"headline":"Using Selection to Reduce Data Bulk (and Improve Data Quality)","datePublished":"2013-01-22T02:21:26+00:00","dateModified":"2017-11-06T13:32:05+00:00","mainEntityOfPage":{"@id":"https:\/\/www.iri.com\/blog\/data-transformation2\/selectfilter-to-reduce-big-data-bulk\/"},"wordCount":485,"commentCount":0,"publisher":{"@id":"https:\/\/www.iri.com\/blog\/#organization"},"image":{"@id":"https:\/\/www.iri.com\/blog\/data-transformation2\/selectfilter-to-reduce-big-data-bulk\/#primaryimage"},"thumbnailUrl":"https:\/\/www.iri.com\/blog\/wp-content\/uploads\/2013\/01\/include-A.jpg","keywords":["big data","CoSort","data masking","data migration","data processing","data transformation","FieldShield","filter data","include\/omit statements","IRI Workbench","job scripts","NextForm","RowGen","SortCL"],"articleSection":["Big Data","Data Migration","Data Quality (DQ&#041;","Data Transformation","IRI Workbench"],"inLanguage":"en-US","potentialAction":[{"@type":"CommentAction","name":"Comment","target":["https:\/\/www.iri.com\/blog\/data-transformation2\/selectfilter-to-reduce-big-data-bulk\/#respond"]}]},{"@type":"WebPage","@id":"https:\/\/www.iri.com\/blog\/data-transformation2\/selectfilter-to-reduce-big-data-bulk\/","url":"https:\/\/www.iri.com\/blog\/data-transformation2\/selectfilter-to-reduce-big-data-bulk\/","name":"Using Selection to Reduce Data Bulk (and Improve Data Quality) - 
IRI","isPartOf":{"@id":"https:\/\/www.iri.com\/blog\/#website"},"primaryImageOfPage":{"@id":"https:\/\/www.iri.com\/blog\/data-transformation2\/selectfilter-to-reduce-big-data-bulk\/#primaryimage"},"image":{"@id":"https:\/\/www.iri.com\/blog\/data-transformation2\/selectfilter-to-reduce-big-data-bulk\/#primaryimage"},"thumbnailUrl":"https:\/\/www.iri.com\/blog\/wp-content\/uploads\/2013\/01\/include-A.jpg","datePublished":"2013-01-22T02:21:26+00:00","dateModified":"2017-11-06T13:32:05+00:00","breadcrumb":{"@id":"https:\/\/www.iri.com\/blog\/data-transformation2\/selectfilter-to-reduce-big-data-bulk\/#breadcrumb"},"inLanguage":"en-US","potentialAction":[{"@type":"ReadAction","target":["https:\/\/www.iri.com\/blog\/data-transformation2\/selectfilter-to-reduce-big-data-bulk\/"]}]},{"@type":"ImageObject","inLanguage":"en-US","@id":"https:\/\/www.iri.com\/blog\/data-transformation2\/selectfilter-to-reduce-big-data-bulk\/#primaryimage","url":"https:\/\/www.iri.com\/blog\/wp-content\/uploads\/2013\/01\/include-A.jpg","contentUrl":"https:\/\/www.iri.com\/blog\/wp-content\/uploads\/2013\/01\/include-A.jpg","width":"390","height":"310"},{"@type":"BreadcrumbList","@id":"https:\/\/www.iri.com\/blog\/data-transformation2\/selectfilter-to-reduce-big-data-bulk\/#breadcrumb","itemListElement":[{"@type":"ListItem","position":1,"name":"Home","item":"https:\/\/www.iri.com\/blog\/"},{"@type":"ListItem","position":2,"name":"Using Selection to Reduce Data Bulk (and Improve Data Quality)"}]},{"@type":"WebSite","@id":"https:\/\/www.iri.com\/blog\/#website","url":"https:\/\/www.iri.com\/blog\/","name":"IRI","description":"Total Data Management 
Blog","publisher":{"@id":"https:\/\/www.iri.com\/blog\/#organization"},"potentialAction":[{"@type":"SearchAction","target":{"@type":"EntryPoint","urlTemplate":"https:\/\/www.iri.com\/blog\/?s={search_term_string}"},"query-input":{"@type":"PropertyValueSpecification","valueRequired":true,"valueName":"search_term_string"}}],"inLanguage":"en-US"},{"@type":"Organization","@id":"https:\/\/www.iri.com\/blog\/#organization","name":"IRI","url":"https:\/\/www.iri.com\/blog\/","logo":{"@type":"ImageObject","inLanguage":"en-US","@id":"https:\/\/www.iri.com\/blog\/#\/schema\/logo\/image\/","url":"https:\/\/www.iri.com\/blog\/wp-content\/uploads\/2019\/02\/iri-logo-total-data-management-small-1.png","contentUrl":"https:\/\/www.iri.com\/blog\/wp-content\/uploads\/2019\/02\/iri-logo-total-data-management-small-1.png","width":750,"height":206,"caption":"IRI"},"image":{"@id":"https:\/\/www.iri.com\/blog\/#\/schema\/logo\/image\/"}},{"@type":"Person","@id":"https:\/\/www.iri.com\/blog\/#\/schema\/person\/18c4f34270c95345ba1274daad4ed795","name":"Sharon Hewitt","image":{"@type":"ImageObject","inLanguage":"en-US","@id":"https:\/\/www.iri.com\/blog\/#\/schema\/person\/image\/","url":"https:\/\/secure.gravatar.com\/avatar\/bd823330fbdcccbe30b856710edc3f94?s=96&d=blank&r=g","contentUrl":"https:\/\/secure.gravatar.com\/avatar\/bd823330fbdcccbe30b856710edc3f94?s=96&d=blank&r=g","caption":"Sharon 
Hewitt"},"url":"https:\/\/www.iri.com\/blog\/author\/sharonh\/"}]}},"jetpack_featured_media_url":"https:\/\/www.iri.com\/blog\/wp-content\/uploads\/2013\/01\/include-A.jpg","_links":{"self":[{"href":"https:\/\/www.iri.com\/blog\/wp-json\/wp\/v2\/posts\/3284"}],"collection":[{"href":"https:\/\/www.iri.com\/blog\/wp-json\/wp\/v2\/posts"}],"about":[{"href":"https:\/\/www.iri.com\/blog\/wp-json\/wp\/v2\/types\/post"}],"author":[{"embeddable":true,"href":"https:\/\/www.iri.com\/blog\/wp-json\/wp\/v2\/users\/7"}],"replies":[{"embeddable":true,"href":"https:\/\/www.iri.com\/blog\/wp-json\/wp\/v2\/comments?post=3284"}],"version-history":[{"count":72,"href":"https:\/\/www.iri.com\/blog\/wp-json\/wp\/v2\/posts\/3284\/revisions"}],"predecessor-version":[{"id":11657,"href":"https:\/\/www.iri.com\/blog\/wp-json\/wp\/v2\/posts\/3284\/revisions\/11657"}],"wp:featuredmedia":[{"embeddable":true,"href":"https:\/\/www.iri.com\/blog\/wp-json\/wp\/v2\/media\/3468"}],"wp:attachment":[{"href":"https:\/\/www.iri.com\/blog\/wp-json\/wp\/v2\/media?parent=3284"}],"wp:term":[{"taxonomy":"category","embeddable":true,"href":"https:\/\/www.iri.com\/blog\/wp-json\/wp\/v2\/categories?post=3284"},{"taxonomy":"post_tag","embeddable":true,"href":"https:\/\/www.iri.com\/blog\/wp-json\/wp\/v2\/tags?post=3284"}],"curies":[{"name":"wp","href":"https:\/\/api.w.org\/{rel}","templated":true}]}}