{"id":12994,"date":"2019-07-16T17:38:57","date_gmt":"2019-07-16T21:38:57","guid":{"rendered":"http:\/\/www.iri.com\/blog\/?p=12994"},"modified":"2019-07-17T12:58:34","modified_gmt":"2019-07-17T16:58:34","slug":"data-profiling-splunk","status":"publish","type":"post","link":"https:\/\/www.iri.com\/blog\/business-intelligence\/data-profiling-splunk\/","title":{"rendered":"Revealing Data Profiling Secrets in Splunk"},"content":{"rendered":"<h4><b>What Is Data Profiling?<\/b><\/h4>\n<p><span style=\"font-weight: 400;\">Before you can make use of the data you have and trust its value for, analytic, testing, and other production jobs, you have to know enough <\/span><i><span style=\"font-weight: 400;\">about <\/span><\/i><span style=\"font-weight: 400;\">that data. Data that is incomplete, improperly linked, or improperly defined is an impediment to project success.<\/span><\/p>\n<p><span style=\"font-weight: 400;\">To address this issue, data profiling technology exists to parse data sources and report on their content, layout, and relationships.\u00a0<\/span><\/p>\n<h4><b>How Do You Profile Data?<\/b><\/h4>\n<p><span style=\"font-weight: 400;\">The <\/span><a href=\"https:\/\/www.iri.com\/products\/voracity\"><span style=\"font-weight: 400;\">IRI Voracity<\/span><\/a><span style=\"font-weight: 400;\"> data management platform &#8212; or its subset products like <\/span><a href=\"https:\/\/www.iri.com\/products\/cosort\"><span style=\"font-weight: 400;\">CoSort<\/span><\/a><span style=\"font-weight: 400;\">, <\/span><a href=\"https:\/\/www.iri.com\/products\/nextform\"><span style=\"font-weight: 400;\">NextForm<\/span><\/a><span style=\"font-weight: 400;\">, <\/span><a href=\"https:\/\/www.iri.com\/products\/fieldshield\"><span style=\"font-weight: 400;\">FieldShield <\/span><\/a><span style=\"font-weight: 400;\">or <\/span><a href=\"https:\/\/www.iri.com\/products\/rowgen\"><span style=\"font-weight: 400;\">RowGen <\/span><\/a><span style=\"font-weight: 
400;\">&#8212;\u00a0 has functionality to statistically profile, model, and search flat files and database tables via data profiling wizards.<\/span><\/p>\n<p><span style=\"font-weight: 400;\">The wizards are accessed from a drop-down menu indicated by the microscope icon in the top toolbar of <\/span><a href=\"https:\/\/www.iri.com\/products\/workbench\"><span style=\"font-weight: 400;\">IRI Workbench<\/span><\/a><span style=\"font-weight: 400;\">, the graphical IDE built on Eclipse that serves as the front end for Voracity and its component products:<\/span><\/p>\n<p><a href=\"http:\/\/www.iri.com\/blog\/wp-content\/uploads\/2019\/07\/splunk-profile-1.png\"><img loading=\"lazy\" decoding=\"async\" class=\" wp-image-12998 aligncenter\" src=\"\/blog\/wp-content\/uploads\/2019\/07\/splunk-profile-1-1024x546.png\" alt=\"\" width=\"839\" height=\"447\" srcset=\"https:\/\/www.iri.com\/blog\/wp-content\/uploads\/2019\/07\/splunk-profile-1-1024x546.png 1024w, https:\/\/www.iri.com\/blog\/wp-content\/uploads\/2019\/07\/splunk-profile-1-300x160.png 300w, https:\/\/www.iri.com\/blog\/wp-content\/uploads\/2019\/07\/splunk-profile-1-768x410.png 768w, https:\/\/www.iri.com\/blog\/wp-content\/uploads\/2019\/07\/splunk-profile-1.png 1352w\" sizes=\"(max-width: 839px) 100vw, 839px\" \/><\/a><\/p>\n<p><span style=\"font-weight: 400;\">Once you know which files or DB schemas you want to profile, you can use either the <\/span><a href=\"https:\/\/www.iri.com\/blog\/iri\/iri-workbench\/flat-file-profiling\/\"><span style=\"font-weight: 400;\">flat-file profiling<\/span><\/a><span style=\"font-weight: 400;\"> wizard or the <\/span><a href=\"https:\/\/www.iri.com\/blog\/data-transformation2\/database-profiling-in-iri-workbench\/\"><span style=\"font-weight: 400;\">database profiling wizard<\/span><\/a><span style=\"font-weight: 400;\"> to automatically detect the metadata and the format of the sources, and to choose which portions (or all) of the data to profile statistically.<\/span><\/p>\n<p><span style=\"font-weight: 
400;\">In addition to those reports, the wizards also have options to search for values within the chosen repositories that match specified regular expressions or strings:\u00a0<\/span><span style=\"font-weight: 400;\">\u00a0<\/span><\/p>\n<ul>\n<li style=\"font-weight: 400;\"><span style=\"font-weight: 400;\">Pattern Search &#8211; finds and counts values that match a Java regular expression.<\/span><\/li>\n<li style=\"font-weight: 400;\"><span style=\"font-weight: 400;\">Fuzzy Match &#8211; searches for strings <\/span><i><span style=\"font-weight: 400;\">similar <\/span><\/i><span style=\"font-weight: 400;\">to those you enter, and lets you select or specify search conditions using different algorithms and probabilities.<\/span><\/li>\n<li style=\"font-weight: 400;\"><span style=\"font-weight: 400;\">Value Lookup &#8211; compares values in the data source(s) to every string in a set file, showing and counting the matches.<\/span><\/li>\n<\/ul>\n<p><span style=\"font-weight: 400;\">The database profiling wizard also runs referential integrity checks. A separate wizard creates entity-relationship (ER) models and diagrams to illustrate the structure and relationships of tables in any RDB schema you want to examine.<\/span><\/p>\n<h4><b>Now What? 
Splunk of Course.\u00a0<\/b><\/h4>\n<p><span style=\"font-weight: 400;\">All of these data profiling statistics and values are output to a structured file like this:\u00a0<\/span><\/p>\n<p><a href=\"http:\/\/www.iri.com\/blog\/wp-content\/uploads\/2019\/07\/splunk-profile-2.png\"><img loading=\"lazy\" decoding=\"async\" class=\" wp-image-12999 aligncenter\" src=\"\/blog\/wp-content\/uploads\/2019\/07\/splunk-profile-2-1024x718.png\" alt=\"\" width=\"839\" height=\"588\" srcset=\"https:\/\/www.iri.com\/blog\/wp-content\/uploads\/2019\/07\/splunk-profile-2-1024x718.png 1024w, https:\/\/www.iri.com\/blog\/wp-content\/uploads\/2019\/07\/splunk-profile-2-300x210.png 300w, https:\/\/www.iri.com\/blog\/wp-content\/uploads\/2019\/07\/splunk-profile-2-768x539.png 768w, https:\/\/www.iri.com\/blog\/wp-content\/uploads\/2019\/07\/splunk-profile-2.png 1195w\" sizes=\"(max-width: 839px) 100vw, 839px\" \/><\/a><\/p>\n<p><span style=\"font-weight: 400;\">These flat files can be easily loaded into tools like Splunk for a more visual analysis and for possible adaptive responses (actions to take). To make that happen automatically, check out <\/span><a href=\"https:\/\/www.iri.com\/blog\/business-intelligence\/forward-voracity-data-into-splunk\/\"><span style=\"font-weight: 400;\">this article<\/span><\/a><span style=\"font-weight: 400;\"> on utilizing the Splunk Universal Forwarder with IRI software output.<\/span><\/p>\n<h4><b>Indexing those Results in Splunk<\/b><\/h4>\n<p><span style=\"font-weight: 400;\">I used the Universal Forwarder to push the tab-separated values file created by the database profiling wizard in IRI Workbench to Splunk Enterprise from my local computer.\u00a0<\/span><\/p>\n<p><span style=\"font-weight: 400;\">Once the file is indexed in Splunk, the data can be searched and used to make visualizations. Go to the Search and Reporting app in Splunk Enterprise, and find the data source(s) that have been indexed. 
Once the source is selected, a screen like this should appear:<\/span><\/p>\n<p><a href=\"http:\/\/www.iri.com\/blog\/wp-content\/uploads\/2019\/07\/splunk-profile-3.png\"><img loading=\"lazy\" decoding=\"async\" class=\" wp-image-13000 aligncenter\" src=\"\/blog\/wp-content\/uploads\/2019\/07\/splunk-profile-3-1024x555.png\" alt=\"\" width=\"839\" height=\"455\" srcset=\"https:\/\/www.iri.com\/blog\/wp-content\/uploads\/2019\/07\/splunk-profile-3-1024x555.png 1024w, https:\/\/www.iri.com\/blog\/wp-content\/uploads\/2019\/07\/splunk-profile-3-300x163.png 300w, https:\/\/www.iri.com\/blog\/wp-content\/uploads\/2019\/07\/splunk-profile-3-768x416.png 768w, https:\/\/www.iri.com\/blog\/wp-content\/uploads\/2019\/07\/splunk-profile-3.png 1600w\" sizes=\"(max-width: 839px) 100vw, 839px\" \/><\/a><\/p>\n<p><span style=\"font-weight: 400;\">This gives you a look at the entries of the data profiling results. If you are dealing with many database tables or schemas, creating dashboards will give you a faster way to visualize the relevant results of the data profiling operation.<\/span><\/p>\n<h4><b>Seeing Key Data Profiling Details<\/b><\/h4>\n<p><span style=\"font-weight: 400;\">Using Splunk commands, I was able to construct a dashboard that lends more graphical insight into the data profiling results. Some of the important concepts of data profiling are structure discovery, content discovery, and relationship discovery.\u00a0<\/span><\/p>\n<p><span style=\"font-weight: 400;\">In this dashboard, I can see that the main data types are varchar and numeric, which seem to be consistent with the many min and max values presented in the charts below. 
So, the structure of this data set appears to be fine.\u00a0<\/span><\/p>\n<p><a href=\"http:\/\/www.iri.com\/blog\/wp-content\/uploads\/2019\/07\/splunk-profile-4.png\"><img loading=\"lazy\" decoding=\"async\" class=\"aligncenter wp-image-13001 \" src=\"http:\/\/www.iri.com\/blog\/wp-content\/uploads\/2019\/07\/splunk-profile-4-e1563381537698-1024x507.png\" alt=\"\" width=\"840\" height=\"416\" srcset=\"https:\/\/www.iri.com\/blog\/wp-content\/uploads\/2019\/07\/splunk-profile-4-e1563381537698-1024x507.png 1024w, https:\/\/www.iri.com\/blog\/wp-content\/uploads\/2019\/07\/splunk-profile-4-e1563381537698-300x149.png 300w, https:\/\/www.iri.com\/blog\/wp-content\/uploads\/2019\/07\/splunk-profile-4-e1563381537698-768x380.png 768w, https:\/\/www.iri.com\/blog\/wp-content\/uploads\/2019\/07\/splunk-profile-4-e1563381537698.png 1600w\" sizes=\"(max-width: 840px) 100vw, 840px\" \/><\/a><\/p>\n<p><span style=\"font-weight: 400;\">In terms of content discovery, we find out that the top column name of all of the databases included in this data profile is State, with a share of 3.3 percent of columns. 
However, we also find that the median and max lengths of the data values are frequently 0, meaning that there are many null or empty values.\u00a0<\/span><\/p>\n<p><span style=\"font-weight: 400;\">Finally, we see from the visualizations that most of the data from different databases is unrelated, which is true in this example.<\/span><\/p>\n<p><a href=\"http:\/\/www.iri.com\/blog\/wp-content\/uploads\/2019\/07\/splunk-profile-5.png\"><img loading=\"lazy\" decoding=\"async\" class=\"aligncenter wp-image-13002 \" src=\"http:\/\/www.iri.com\/blog\/wp-content\/uploads\/2019\/07\/splunk-profile-5-e1563381772818-1024x325.png\" alt=\"\" width=\"838\" height=\"266\" srcset=\"https:\/\/www.iri.com\/blog\/wp-content\/uploads\/2019\/07\/splunk-profile-5-e1563381772818-1024x325.png 1024w, https:\/\/www.iri.com\/blog\/wp-content\/uploads\/2019\/07\/splunk-profile-5-e1563381772818-300x95.png 300w, https:\/\/www.iri.com\/blog\/wp-content\/uploads\/2019\/07\/splunk-profile-5-e1563381772818-768x244.png 768w, https:\/\/www.iri.com\/blog\/wp-content\/uploads\/2019\/07\/splunk-profile-5-e1563381772818.png 1600w\" sizes=\"(max-width: 838px) 100vw, 838px\" \/><\/a><\/p>\n<p><span style=\"font-weight: 400;\">Based on the database profiling results in this example, the tables in these databases should be populated more fully if analyzing their data is your objective. Also, data from different databases should be indexed separately into Splunk since the data is unrelated.\u00a0<\/span><\/p>\n<h4><b>Bottom Line<\/b><\/h4>\n<p><span style=\"font-weight: 400;\">Data profiling helps you understand the data you have, make sure it is healthy, and find ideas for making the best use of it. 
Ultimately, data profiling improves the productivity and effectiveness of data governance and analytics.<\/span><\/p>\n<p><span style=\"font-weight: 400;\">The use of data profiling tools in <\/span><a href=\"https:\/\/www.iri.com\/products\/workbench\"><span style=\"font-weight: 400;\">IRI Workbench<\/span><\/a><span style=\"font-weight: 400;\"> can be combined with analysis and visualization tools like Splunk to improve insights into the makeup of your data. If you need our help using the profilers, or getting their results into Splunk, email <\/span><a href=\"mailto:voracity@iri.com\"><span style=\"font-weight: 400;\">voracity@iri.com<\/span><\/a><span style=\"font-weight: 400;\">.<\/span><\/p>\n","protected":false},"excerpt":{"rendered":"<p>What Is Data Profiling? Before you can make use of the data you have and trust its value for, analytic, testing, and other production jobs, you have to know enough about that data. Data that is incomplete, improperly linked, or improperly defined is an impediment to project success. 
To address this issue, data profiling technology<\/p>\n<div><a class=\"btn-filled btn\" href=\"https:\/\/www.iri.com\/blog\/business-intelligence\/data-profiling-splunk\/\" title=\"Revealing Data Profiling Secrets in Splunk\">Read More<\/a><\/div>\n","protected":false},"author":119,"featured_media":13001,"comment_status":"open","ping_status":"open","sticky":false,"template":"","format":"standard","meta":{"_acf_changed":false,"_exactmetrics_skip_tracking":false,"_exactmetrics_sitenote_active":false,"_exactmetrics_sitenote_note":"","_exactmetrics_sitenote_category":0,"footnotes":""},"categories":[32,776,91],"tags":[278,100,789,850,574],"class_list":["post-12994","post","type-post","status-publish","format-standard","has-post-thumbnail","hentry","category-business-intelligence","category-etl","category-iri-workbench","tag-data-profiling","tag-etl","tag-iri-voracity","tag-iri-workbench","tag-splunk"],"acf":[],"yoast_head":"<!-- This site is optimized with the Yoast SEO Premium plugin v23.4 (Yoast SEO v23.4) - https:\/\/yoast.com\/wordpress\/plugins\/seo\/ -->\n<title>Revealing Data Profiling Secrets in Splunk - IRI<\/title>\n<meta name=\"description\" content=\"Before you can make use of the data you have and trust its value for, analytic, testing, and other production jobs, you have to know enough about that data.\" \/>\n<meta name=\"robots\" content=\"index, follow, max-snippet:-1, max-image-preview:large, max-video-preview:-1\" \/>\n<link rel=\"canonical\" href=\"https:\/\/www.iri.com\/blog\/business-intelligence\/data-profiling-splunk\/\" \/>\n<meta property=\"og:locale\" content=\"en_US\" \/>\n<meta property=\"og:type\" content=\"article\" \/>\n<meta property=\"og:title\" content=\"Revealing Data Profiling Secrets in Splunk\" \/>\n<meta property=\"og:description\" content=\"Before you can make use of the data you have and trust its value for, analytic, testing, and other production jobs, you have to know enough about that data.\" \/>\n<meta property=\"og:url\" 
content=\"https:\/\/www.iri.com\/blog\/business-intelligence\/data-profiling-splunk\/\" \/>\n<meta property=\"og:site_name\" content=\"IRI\" \/>\n<meta property=\"article:published_time\" content=\"2019-07-16T21:38:57+00:00\" \/>\n<meta property=\"article:modified_time\" content=\"2019-07-17T16:58:34+00:00\" \/>\n<meta property=\"og:image\" content=\"https:\/\/www.iri.com\/blog\/wp-content\/uploads\/2019\/07\/splunk-profile-4-e1563381537698.png\" \/>\n\t<meta property=\"og:image:width\" content=\"1600\" \/>\n\t<meta property=\"og:image:height\" content=\"792\" \/>\n\t<meta property=\"og:image:type\" content=\"image\/png\" \/>\n<meta name=\"author\" content=\"Devon Kozenieski\" \/>\n<meta name=\"twitter:card\" content=\"summary_large_image\" \/>\n<meta name=\"twitter:label1\" content=\"Written by\" \/>\n\t<meta name=\"twitter:data1\" content=\"Devon Kozenieski\" \/>\n\t<meta name=\"twitter:label2\" content=\"Est. reading time\" \/>\n\t<meta name=\"twitter:data2\" content=\"4 minutes\" \/>\n<script type=\"application\/ld+json\" class=\"yoast-schema-graph\">{\"@context\":\"https:\/\/schema.org\",\"@graph\":[{\"@type\":\"Article\",\"@id\":\"https:\/\/www.iri.com\/blog\/business-intelligence\/data-profiling-splunk\/#article\",\"isPartOf\":{\"@id\":\"https:\/\/www.iri.com\/blog\/business-intelligence\/data-profiling-splunk\/\"},\"author\":{\"name\":\"Devon Kozenieski\",\"@id\":\"https:\/\/www.iri.com\/blog\/#\/schema\/person\/de972c035aaeecfc40a3ae2ea5ff7ba1\"},\"headline\":\"Revealing Data Profiling Secrets in 
Splunk\",\"datePublished\":\"2019-07-16T21:38:57+00:00\",\"dateModified\":\"2019-07-17T16:58:34+00:00\",\"mainEntityOfPage\":{\"@id\":\"https:\/\/www.iri.com\/blog\/business-intelligence\/data-profiling-splunk\/\"},\"wordCount\":806,\"commentCount\":0,\"publisher\":{\"@id\":\"https:\/\/www.iri.com\/blog\/#organization\"},\"image\":{\"@id\":\"https:\/\/www.iri.com\/blog\/business-intelligence\/data-profiling-splunk\/#primaryimage\"},\"thumbnailUrl\":\"https:\/\/www.iri.com\/blog\/wp-content\/uploads\/2019\/07\/splunk-profile-4-e1563381537698.png\",\"keywords\":[\"data profiling\",\"ETL\",\"IRI Voracity\",\"IRI Workbench\",\"Splunk\"],\"articleSection\":[\"Business Intelligence (BI&#041;\",\"ETL\",\"IRI Workbench\"],\"inLanguage\":\"en-US\",\"potentialAction\":[{\"@type\":\"CommentAction\",\"name\":\"Comment\",\"target\":[\"https:\/\/www.iri.com\/blog\/business-intelligence\/data-profiling-splunk\/#respond\"]}]},{\"@type\":\"WebPage\",\"@id\":\"https:\/\/www.iri.com\/blog\/business-intelligence\/data-profiling-splunk\/\",\"url\":\"https:\/\/www.iri.com\/blog\/business-intelligence\/data-profiling-splunk\/\",\"name\":\"Revealing Data Profiling Secrets in Splunk - IRI\",\"isPartOf\":{\"@id\":\"https:\/\/www.iri.com\/blog\/#website\"},\"primaryImageOfPage\":{\"@id\":\"https:\/\/www.iri.com\/blog\/business-intelligence\/data-profiling-splunk\/#primaryimage\"},\"image\":{\"@id\":\"https:\/\/www.iri.com\/blog\/business-intelligence\/data-profiling-splunk\/#primaryimage\"},\"thumbnailUrl\":\"https:\/\/www.iri.com\/blog\/wp-content\/uploads\/2019\/07\/splunk-profile-4-e1563381537698.png\",\"datePublished\":\"2019-07-16T21:38:57+00:00\",\"dateModified\":\"2019-07-17T16:58:34+00:00\",\"description\":\"Before you can make use of the data you have and trust its value for, analytic, testing, and other production jobs, you have to know enough about that 
data.\",\"breadcrumb\":{\"@id\":\"https:\/\/www.iri.com\/blog\/business-intelligence\/data-profiling-splunk\/#breadcrumb\"},\"inLanguage\":\"en-US\",\"potentialAction\":[{\"@type\":\"ReadAction\",\"target\":[\"https:\/\/www.iri.com\/blog\/business-intelligence\/data-profiling-splunk\/\"]}]},{\"@type\":\"ImageObject\",\"inLanguage\":\"en-US\",\"@id\":\"https:\/\/www.iri.com\/blog\/business-intelligence\/data-profiling-splunk\/#primaryimage\",\"url\":\"https:\/\/www.iri.com\/blog\/wp-content\/uploads\/2019\/07\/splunk-profile-4-e1563381537698.png\",\"contentUrl\":\"https:\/\/www.iri.com\/blog\/wp-content\/uploads\/2019\/07\/splunk-profile-4-e1563381537698.png\",\"width\":1600,\"height\":792},{\"@type\":\"BreadcrumbList\",\"@id\":\"https:\/\/www.iri.com\/blog\/business-intelligence\/data-profiling-splunk\/#breadcrumb\",\"itemListElement\":[{\"@type\":\"ListItem\",\"position\":1,\"name\":\"Home\",\"item\":\"https:\/\/www.iri.com\/blog\/\"},{\"@type\":\"ListItem\",\"position\":2,\"name\":\"Revealing Data Profiling Secrets in Splunk\"}]},{\"@type\":\"WebSite\",\"@id\":\"https:\/\/www.iri.com\/blog\/#website\",\"url\":\"https:\/\/www.iri.com\/blog\/\",\"name\":\"IRI\",\"description\":\"Total Data Management 
Blog\",\"publisher\":{\"@id\":\"https:\/\/www.iri.com\/blog\/#organization\"},\"potentialAction\":[{\"@type\":\"SearchAction\",\"target\":{\"@type\":\"EntryPoint\",\"urlTemplate\":\"https:\/\/www.iri.com\/blog\/?s={search_term_string}\"},\"query-input\":{\"@type\":\"PropertyValueSpecification\",\"valueRequired\":true,\"valueName\":\"search_term_string\"}}],\"inLanguage\":\"en-US\"},{\"@type\":\"Organization\",\"@id\":\"https:\/\/www.iri.com\/blog\/#organization\",\"name\":\"IRI\",\"url\":\"https:\/\/www.iri.com\/blog\/\",\"logo\":{\"@type\":\"ImageObject\",\"inLanguage\":\"en-US\",\"@id\":\"https:\/\/www.iri.com\/blog\/#\/schema\/logo\/image\/\",\"url\":\"https:\/\/www.iri.com\/blog\/wp-content\/uploads\/2019\/02\/iri-logo-total-data-management-small-1.png\",\"contentUrl\":\"https:\/\/www.iri.com\/blog\/wp-content\/uploads\/2019\/02\/iri-logo-total-data-management-small-1.png\",\"width\":750,\"height\":206,\"caption\":\"IRI\"},\"image\":{\"@id\":\"https:\/\/www.iri.com\/blog\/#\/schema\/logo\/image\/\"}},{\"@type\":\"Person\",\"@id\":\"https:\/\/www.iri.com\/blog\/#\/schema\/person\/de972c035aaeecfc40a3ae2ea5ff7ba1\",\"name\":\"Devon Kozenieski\",\"image\":{\"@type\":\"ImageObject\",\"inLanguage\":\"en-US\",\"@id\":\"https:\/\/www.iri.com\/blog\/#\/schema\/person\/image\/\",\"url\":\"https:\/\/secure.gravatar.com\/avatar\/e4c421588c1a85dd9a76146fe15528f7?s=96&d=blank&r=g\",\"contentUrl\":\"https:\/\/secure.gravatar.com\/avatar\/e4c421588c1a85dd9a76146fe15528f7?s=96&d=blank&r=g\",\"caption\":\"Devon Kozenieski\"},\"url\":\"https:\/\/www.iri.com\/blog\/author\/devonk\/\"}]}<\/script>\n<!-- \/ Yoast SEO Premium plugin. 
-->","yoast_head_json":{"title":"Revealing Data Profiling Secrets in Splunk - IRI","description":"Before you can make use of the data you have and trust its value for, analytic, testing, and other production jobs, you have to know enough about that data.","robots":{"index":"index","follow":"follow","max-snippet":"max-snippet:-1","max-image-preview":"max-image-preview:large","max-video-preview":"max-video-preview:-1"},"canonical":"https:\/\/www.iri.com\/blog\/business-intelligence\/data-profiling-splunk\/","og_locale":"en_US","og_type":"article","og_title":"Revealing Data Profiling Secrets in Splunk","og_description":"Before you can make use of the data you have and trust its value for, analytic, testing, and other production jobs, you have to know enough about that data.","og_url":"https:\/\/www.iri.com\/blog\/business-intelligence\/data-profiling-splunk\/","og_site_name":"IRI","article_published_time":"2019-07-16T21:38:57+00:00","article_modified_time":"2019-07-17T16:58:34+00:00","og_image":[{"width":1600,"height":792,"url":"https:\/\/www.iri.com\/blog\/wp-content\/uploads\/2019\/07\/splunk-profile-4-e1563381537698.png","type":"image\/png"}],"author":"Devon Kozenieski","twitter_card":"summary_large_image","twitter_misc":{"Written by":"Devon Kozenieski","Est. 
reading time":"4 minutes"},"schema":{"@context":"https:\/\/schema.org","@graph":[{"@type":"Article","@id":"https:\/\/www.iri.com\/blog\/business-intelligence\/data-profiling-splunk\/#article","isPartOf":{"@id":"https:\/\/www.iri.com\/blog\/business-intelligence\/data-profiling-splunk\/"},"author":{"name":"Devon Kozenieski","@id":"https:\/\/www.iri.com\/blog\/#\/schema\/person\/de972c035aaeecfc40a3ae2ea5ff7ba1"},"headline":"Revealing Data Profiling Secrets in Splunk","datePublished":"2019-07-16T21:38:57+00:00","dateModified":"2019-07-17T16:58:34+00:00","mainEntityOfPage":{"@id":"https:\/\/www.iri.com\/blog\/business-intelligence\/data-profiling-splunk\/"},"wordCount":806,"commentCount":0,"publisher":{"@id":"https:\/\/www.iri.com\/blog\/#organization"},"image":{"@id":"https:\/\/www.iri.com\/blog\/business-intelligence\/data-profiling-splunk\/#primaryimage"},"thumbnailUrl":"https:\/\/www.iri.com\/blog\/wp-content\/uploads\/2019\/07\/splunk-profile-4-e1563381537698.png","keywords":["data profiling","ETL","IRI Voracity","IRI Workbench","Splunk"],"articleSection":["Business Intelligence (BI&#041;","ETL","IRI Workbench"],"inLanguage":"en-US","potentialAction":[{"@type":"CommentAction","name":"Comment","target":["https:\/\/www.iri.com\/blog\/business-intelligence\/data-profiling-splunk\/#respond"]}]},{"@type":"WebPage","@id":"https:\/\/www.iri.com\/blog\/business-intelligence\/data-profiling-splunk\/","url":"https:\/\/www.iri.com\/blog\/business-intelligence\/data-profiling-splunk\/","name":"Revealing Data Profiling Secrets in Splunk - 
IRI","isPartOf":{"@id":"https:\/\/www.iri.com\/blog\/#website"},"primaryImageOfPage":{"@id":"https:\/\/www.iri.com\/blog\/business-intelligence\/data-profiling-splunk\/#primaryimage"},"image":{"@id":"https:\/\/www.iri.com\/blog\/business-intelligence\/data-profiling-splunk\/#primaryimage"},"thumbnailUrl":"https:\/\/www.iri.com\/blog\/wp-content\/uploads\/2019\/07\/splunk-profile-4-e1563381537698.png","datePublished":"2019-07-16T21:38:57+00:00","dateModified":"2019-07-17T16:58:34+00:00","description":"Before you can make use of the data you have and trust its value for, analytic, testing, and other production jobs, you have to know enough about that data.","breadcrumb":{"@id":"https:\/\/www.iri.com\/blog\/business-intelligence\/data-profiling-splunk\/#breadcrumb"},"inLanguage":"en-US","potentialAction":[{"@type":"ReadAction","target":["https:\/\/www.iri.com\/blog\/business-intelligence\/data-profiling-splunk\/"]}]},{"@type":"ImageObject","inLanguage":"en-US","@id":"https:\/\/www.iri.com\/blog\/business-intelligence\/data-profiling-splunk\/#primaryimage","url":"https:\/\/www.iri.com\/blog\/wp-content\/uploads\/2019\/07\/splunk-profile-4-e1563381537698.png","contentUrl":"https:\/\/www.iri.com\/blog\/wp-content\/uploads\/2019\/07\/splunk-profile-4-e1563381537698.png","width":1600,"height":792},{"@type":"BreadcrumbList","@id":"https:\/\/www.iri.com\/blog\/business-intelligence\/data-profiling-splunk\/#breadcrumb","itemListElement":[{"@type":"ListItem","position":1,"name":"Home","item":"https:\/\/www.iri.com\/blog\/"},{"@type":"ListItem","position":2,"name":"Revealing Data Profiling Secrets in Splunk"}]},{"@type":"WebSite","@id":"https:\/\/www.iri.com\/blog\/#website","url":"https:\/\/www.iri.com\/blog\/","name":"IRI","description":"Total Data Management 
Blog","publisher":{"@id":"https:\/\/www.iri.com\/blog\/#organization"},"potentialAction":[{"@type":"SearchAction","target":{"@type":"EntryPoint","urlTemplate":"https:\/\/www.iri.com\/blog\/?s={search_term_string}"},"query-input":{"@type":"PropertyValueSpecification","valueRequired":true,"valueName":"search_term_string"}}],"inLanguage":"en-US"},{"@type":"Organization","@id":"https:\/\/www.iri.com\/blog\/#organization","name":"IRI","url":"https:\/\/www.iri.com\/blog\/","logo":{"@type":"ImageObject","inLanguage":"en-US","@id":"https:\/\/www.iri.com\/blog\/#\/schema\/logo\/image\/","url":"https:\/\/www.iri.com\/blog\/wp-content\/uploads\/2019\/02\/iri-logo-total-data-management-small-1.png","contentUrl":"https:\/\/www.iri.com\/blog\/wp-content\/uploads\/2019\/02\/iri-logo-total-data-management-small-1.png","width":750,"height":206,"caption":"IRI"},"image":{"@id":"https:\/\/www.iri.com\/blog\/#\/schema\/logo\/image\/"}},{"@type":"Person","@id":"https:\/\/www.iri.com\/blog\/#\/schema\/person\/de972c035aaeecfc40a3ae2ea5ff7ba1","name":"Devon Kozenieski","image":{"@type":"ImageObject","inLanguage":"en-US","@id":"https:\/\/www.iri.com\/blog\/#\/schema\/person\/image\/","url":"https:\/\/secure.gravatar.com\/avatar\/e4c421588c1a85dd9a76146fe15528f7?s=96&d=blank&r=g","contentUrl":"https:\/\/secure.gravatar.com\/avatar\/e4c421588c1a85dd9a76146fe15528f7?s=96&d=blank&r=g","caption":"Devon 
Kozenieski"},"url":"https:\/\/www.iri.com\/blog\/author\/devonk\/"}]}},"jetpack_featured_media_url":"https:\/\/www.iri.com\/blog\/wp-content\/uploads\/2019\/07\/splunk-profile-4-e1563381537698.png","_links":{"self":[{"href":"https:\/\/www.iri.com\/blog\/wp-json\/wp\/v2\/posts\/12994"}],"collection":[{"href":"https:\/\/www.iri.com\/blog\/wp-json\/wp\/v2\/posts"}],"about":[{"href":"https:\/\/www.iri.com\/blog\/wp-json\/wp\/v2\/types\/post"}],"author":[{"embeddable":true,"href":"https:\/\/www.iri.com\/blog\/wp-json\/wp\/v2\/users\/119"}],"replies":[{"embeddable":true,"href":"https:\/\/www.iri.com\/blog\/wp-json\/wp\/v2\/comments?post=12994"}],"version-history":[{"count":8,"href":"https:\/\/www.iri.com\/blog\/wp-json\/wp\/v2\/posts\/12994\/revisions"}],"predecessor-version":[{"id":13008,"href":"https:\/\/www.iri.com\/blog\/wp-json\/wp\/v2\/posts\/12994\/revisions\/13008"}],"wp:featuredmedia":[{"embeddable":true,"href":"https:\/\/www.iri.com\/blog\/wp-json\/wp\/v2\/media\/13001"}],"wp:attachment":[{"href":"https:\/\/www.iri.com\/blog\/wp-json\/wp\/v2\/media?parent=12994"}],"wp:term":[{"taxonomy":"category","embeddable":true,"href":"https:\/\/www.iri.com\/blog\/wp-json\/wp\/v2\/categories?post=12994"},{"taxonomy":"post_tag","embeddable":true,"href":"https:\/\/www.iri.com\/blog\/wp-json\/wp\/v2\/tags?post=12994"}],"curies":[{"name":"wp","href":"https:\/\/api.w.org\/{rel}","templated":true}]}}