{"id":1130,"date":"2012-06-26T15:07:15","date_gmt":"2012-06-26T15:07:15","guid":{"rendered":"http:\/\/www.iri.com\/blog\/?p=1130"},"modified":"2026-01-09T16:33:07","modified_gmt":"2026-01-09T21:33:07","slug":"what-is-hadoop","status":"publish","type":"post","link":"https:\/\/www.iri.com\/blog\/iri\/business\/what-is-hadoop\/","title":{"rendered":"What is Hadoop?"},"content":{"rendered":"<p><a href=\"http:\/\/www.iri.com\/blog\/wp-content\/uploads\/2012\/06\/hadoop1.png\"><img loading=\"lazy\" decoding=\"async\" class=\"alignleft size-full wp-image-1554\" title=\"hadoop\" src=\"http:\/\/www.iri.com\/blog\/wp-content\/uploads\/2012\/06\/hadoop1.png\" alt=\"\" width=\"345\" height=\"264\" srcset=\"https:\/\/www.iri.com\/blog\/wp-content\/uploads\/2012\/06\/hadoop1.png 345w, https:\/\/www.iri.com\/blog\/wp-content\/uploads\/2012\/06\/hadoop1-300x229.png 300w\" sizes=\"(max-width: 345px) 100vw, 345px\" \/><\/a>Hadoop is an increasingly popular computing environment for distributed processing that businesses can use to store and analyze huge amounts of data. Some of the world&#8217;s largest and most data-intensive corporate users deploy Hadoop to consolidate, combine and analyze <a href=\"http:\/\/www.iri.com\/solutions\/big-data\">big data<\/a> from both structured and complex sources.<\/p>\n<p>With <a title=\"Apache Hadoop Information Page Wiki\" href=\"http:\/\/en.wikipedia.org\/wiki\/Apache_Hadoop\" target=\"_blank\" rel=\"noopener\">Hadoop<\/a> and its MapReduce programming model (and later engines like Spark, Storm, and Tez), high-volume data processing operations can scale from running on one server to several thousand machines at once, harnessing the computing power of a managed grid.<\/p>\n<p>Today, companies like Google, Yahoo, Facebook, eBay and LinkedIn use Hadoop. 
That is why major industry vendors like IBM, Oracle, Informatica and Microsoft are positioning themselves on Hadoop, and why long-time competing innovators like IRI (The CoSort Company) have as well. Both sides recognize that Hadoop is becoming a cost-effective way to work with petabytes of data.<\/p>\n<p>What makes Hadoop more powerful than previous distributed processing technologies is that it can run on a large number of machines that do not share memory or disks. Hadoop breaks the data into smaller pieces, distributes those pieces across the grid, and merges the results automatically on the desired target platform. In addition, it has the intelligence to balance workloads and to recover from individual node failures through redundancy.<\/p>\n<div>\n<p><a title=\"IRI, The CoSort Company HomePage\" href=\"http:\/\/www.iri.com\">IRI<\/a> has always been a big data vendor, processing data outside databases to improve performance and leverage standard file systems. The Hadoop Distributed File System (HDFS) is the applicable equivalent in this case. IRI began working with Hadoop innovators in S.E. Asia (Solusi247) in 2014 to distribute and optimize CoSort-compatible <a href=\"http:\/\/www.iri.com\/solutions\/big-data\/big-data-packaging\">transformations<\/a> and FieldShield data <a href=\"http:\/\/www.iri.com\/solutions\/big-data\/big-data-protection\">masking<\/a> functions across large grids. RowGen-compatible test data generation is next.<\/p>\n<p>By 2017, IRI&#8217;s modern platform for &#8220;total data management&#8221;, called <a href=\"http:\/\/www.iri.com\/products\/voracity\">Voracity<\/a>, began running the above jobs either via the default <a href=\"http:\/\/www.iri.com\/products\/cosort\/sorrtcl\">SortCL<\/a> engine, <a href=\"http:\/\/www.iri.com\/solutions\/big-data\/hadoop-optional\">or seamlessly<\/a> in MapReduce 2, Spark, Spark Stream, and Tez. 
Support is also available for streaming data through sources like Kafka, for compressed formats like Parquet, and for both SQL and NoSQL databases compatible with Hadoop.<\/p>\n<p>The results of IRI&#8217;s map-once-deploy-anywhere options are significant price-performance gains for big data integration (ETL) architects and data scientists, as well as data governance officers dealing with PII in JSON and other <a href=\"http:\/\/www.iri.com\/products\/workbench\/data-sources\">sources<\/a>. That is not only because of the relatively low cost of Voracity subscriptions, but also because there is no need to learn to program in any language to get work done. The free IRI Workbench GUI, built on Eclipse, makes <a href=\"http:\/\/www.iri.com\/products\/workbench\/voracity-gui\/design\">job design<\/a> a graphical affair, and coding in Hadoop moot.<\/p>\n<p>See <a href=\"http:\/\/www.iri.com\/blog\/data-transformation2\/when-to-use-hadoop\/\">this article<\/a> to help you decide when Hadoop should be used, and <a href=\"http:\/\/www.iri.com\/blog\/data-transformation2\/running-voracity-jobs-in-hadoop\/\">this article<\/a> for how to connect to HDFS and run jobs seamlessly in Voracity.<\/p>\n<\/div>\n","protected":false},"excerpt":{"rendered":"<p>Hadoop is an increasingly popular computing\u00a0 environment\u00a0for distributed processing that business can use to analyze and store huge amounts of data. Some of the world&#8217;s largest and most data-intensive corporate users deploy Hadoop to consolidate, combine and analyze big data in both structured and complex sources. 
With Hadoop, and its MapReduce programming language (and later<\/p>\n<div><a class=\"btn-filled btn\" href=\"https:\/\/www.iri.com\/blog\/iri\/business\/what-is-hadoop\/\" title=\"What is Hadoop?\">Read More<\/a><\/div>\n","protected":false},"author":6,"featured_media":1554,"comment_status":"open","ping_status":"closed","sticky":false,"template":"","format":"standard","meta":{"_acf_changed":false,"_exactmetrics_skip_tracking":false,"_exactmetrics_sitenote_active":false,"_exactmetrics_sitenote_note":"","_exactmetrics_sitenote_category":0,"footnotes":""},"categories":[108,34,2255],"tags":[25,44,81,270,174],"class_list":["post-1130","post","type-post","status-publish","format-standard","has-post-thumbnail","hentry","category-big-data-2","category-business","category-archived-articles","tag-big-data","tag-cosort","tag-hadoop","tag-iri-2","tag-warehouse"],"acf":[],"yoast_head":"<!-- This site is optimized with the Yoast SEO Premium plugin v23.4 (Yoast SEO v23.4) - https:\/\/yoast.com\/wordpress\/plugins\/seo\/ -->\n<title>What is Hadoop? - IRI<\/title>\n<meta name=\"robots\" content=\"index, follow, max-snippet:-1, max-image-preview:large, max-video-preview:-1\" \/>\n<link rel=\"canonical\" href=\"https:\/\/www.iri.com\/blog\/iri\/business\/what-is-hadoop\/\" \/>\n<meta property=\"og:locale\" content=\"en_US\" \/>\n<meta property=\"og:type\" content=\"article\" \/>\n<meta property=\"og:title\" content=\"What is Hadoop?\" \/>\n<meta property=\"og:description\" content=\"Hadoop is an increasingly popular computing\u00a0 environment\u00a0for distributed processing that business can use to analyze and store huge amounts of data. Some of the world&#8217;s largest and most data-intensive corporate users deploy Hadoop to consolidate, combine and analyze big data in both structured and complex sources. 
With Hadoop, and its MapReduce programming language (and laterRead More\" \/>\n<meta property=\"og:url\" content=\"https:\/\/www.iri.com\/blog\/iri\/business\/what-is-hadoop\/\" \/>\n<meta property=\"og:site_name\" content=\"IRI\" \/>\n<meta property=\"article:published_time\" content=\"2012-06-26T15:07:15+00:00\" \/>\n<meta property=\"article:modified_time\" content=\"2026-01-09T21:33:07+00:00\" \/>\n<meta property=\"og:image\" content=\"https:\/\/www.iri.com\/blog\/wp-content\/uploads\/2012\/06\/hadoop1.png\" \/>\n\t<meta property=\"og:image:width\" content=\"345\" \/>\n\t<meta property=\"og:image:height\" content=\"264\" \/>\n\t<meta property=\"og:image:type\" content=\"image\/png\" \/>\n<meta name=\"author\" content=\"Jeff Simpson\" \/>\n<meta name=\"twitter:card\" content=\"summary_large_image\" \/>\n<meta name=\"twitter:label1\" content=\"Written by\" \/>\n\t<meta name=\"twitter:data1\" content=\"Jeff Simpson\" \/>\n\t<meta name=\"twitter:label2\" content=\"Est. reading time\" \/>\n\t<meta name=\"twitter:data2\" content=\"3 minutes\" \/>\n<script type=\"application\/ld+json\" class=\"yoast-schema-graph\">{\"@context\":\"https:\/\/schema.org\",\"@graph\":[{\"@type\":\"Article\",\"@id\":\"https:\/\/www.iri.com\/blog\/iri\/business\/what-is-hadoop\/#article\",\"isPartOf\":{\"@id\":\"https:\/\/www.iri.com\/blog\/iri\/business\/what-is-hadoop\/\"},\"author\":{\"name\":\"Jeff Simpson\",\"@id\":\"https:\/\/www.iri.com\/blog\/#\/schema\/person\/9c7e21f2a369c971287d5030b7c408e6\"},\"headline\":\"What is 
Hadoop?\",\"datePublished\":\"2012-06-26T15:07:15+00:00\",\"dateModified\":\"2026-01-09T21:33:07+00:00\",\"mainEntityOfPage\":{\"@id\":\"https:\/\/www.iri.com\/blog\/iri\/business\/what-is-hadoop\/\"},\"wordCount\":452,\"commentCount\":3,\"publisher\":{\"@id\":\"https:\/\/www.iri.com\/blog\/#organization\"},\"image\":{\"@id\":\"https:\/\/www.iri.com\/blog\/iri\/business\/what-is-hadoop\/#primaryimage\"},\"thumbnailUrl\":\"https:\/\/www.iri.com\/blog\/wp-content\/uploads\/2012\/06\/hadoop1.png\",\"keywords\":[\"big data\",\"CoSort\",\"hadoop\",\"iri\",\"warehouse\"],\"articleSection\":[\"Big Data\",\"IRI Business\",\"Archived Articles\"],\"inLanguage\":\"en-US\",\"potentialAction\":[{\"@type\":\"CommentAction\",\"name\":\"Comment\",\"target\":[\"https:\/\/www.iri.com\/blog\/iri\/business\/what-is-hadoop\/#respond\"]}]},{\"@type\":\"WebPage\",\"@id\":\"https:\/\/www.iri.com\/blog\/iri\/business\/what-is-hadoop\/\",\"url\":\"https:\/\/www.iri.com\/blog\/iri\/business\/what-is-hadoop\/\",\"name\":\"What is Hadoop? 
- IRI\",\"isPartOf\":{\"@id\":\"https:\/\/www.iri.com\/blog\/#website\"},\"primaryImageOfPage\":{\"@id\":\"https:\/\/www.iri.com\/blog\/iri\/business\/what-is-hadoop\/#primaryimage\"},\"image\":{\"@id\":\"https:\/\/www.iri.com\/blog\/iri\/business\/what-is-hadoop\/#primaryimage\"},\"thumbnailUrl\":\"https:\/\/www.iri.com\/blog\/wp-content\/uploads\/2012\/06\/hadoop1.png\",\"datePublished\":\"2012-06-26T15:07:15+00:00\",\"dateModified\":\"2026-01-09T21:33:07+00:00\",\"breadcrumb\":{\"@id\":\"https:\/\/www.iri.com\/blog\/iri\/business\/what-is-hadoop\/#breadcrumb\"},\"inLanguage\":\"en-US\",\"potentialAction\":[{\"@type\":\"ReadAction\",\"target\":[\"https:\/\/www.iri.com\/blog\/iri\/business\/what-is-hadoop\/\"]}]},{\"@type\":\"ImageObject\",\"inLanguage\":\"en-US\",\"@id\":\"https:\/\/www.iri.com\/blog\/iri\/business\/what-is-hadoop\/#primaryimage\",\"url\":\"https:\/\/www.iri.com\/blog\/wp-content\/uploads\/2012\/06\/hadoop1.png\",\"contentUrl\":\"https:\/\/www.iri.com\/blog\/wp-content\/uploads\/2012\/06\/hadoop1.png\",\"width\":\"345\",\"height\":\"264\"},{\"@type\":\"BreadcrumbList\",\"@id\":\"https:\/\/www.iri.com\/blog\/iri\/business\/what-is-hadoop\/#breadcrumb\",\"itemListElement\":[{\"@type\":\"ListItem\",\"position\":1,\"name\":\"Home\",\"item\":\"https:\/\/www.iri.com\/blog\/\"},{\"@type\":\"ListItem\",\"position\":2,\"name\":\"What is Hadoop?\"}]},{\"@type\":\"WebSite\",\"@id\":\"https:\/\/www.iri.com\/blog\/#website\",\"url\":\"https:\/\/www.iri.com\/blog\/\",\"name\":\"IRI\",\"description\":\"Total Data Management 
Blog\",\"publisher\":{\"@id\":\"https:\/\/www.iri.com\/blog\/#organization\"},\"potentialAction\":[{\"@type\":\"SearchAction\",\"target\":{\"@type\":\"EntryPoint\",\"urlTemplate\":\"https:\/\/www.iri.com\/blog\/?s={search_term_string}\"},\"query-input\":{\"@type\":\"PropertyValueSpecification\",\"valueRequired\":true,\"valueName\":\"search_term_string\"}}],\"inLanguage\":\"en-US\"},{\"@type\":\"Organization\",\"@id\":\"https:\/\/www.iri.com\/blog\/#organization\",\"name\":\"IRI\",\"url\":\"https:\/\/www.iri.com\/blog\/\",\"logo\":{\"@type\":\"ImageObject\",\"inLanguage\":\"en-US\",\"@id\":\"https:\/\/www.iri.com\/blog\/#\/schema\/logo\/image\/\",\"url\":\"https:\/\/www.iri.com\/blog\/wp-content\/uploads\/2019\/02\/iri-logo-total-data-management-small-1.png\",\"contentUrl\":\"https:\/\/www.iri.com\/blog\/wp-content\/uploads\/2019\/02\/iri-logo-total-data-management-small-1.png\",\"width\":750,\"height\":206,\"caption\":\"IRI\"},\"image\":{\"@id\":\"https:\/\/www.iri.com\/blog\/#\/schema\/logo\/image\/\"}},{\"@type\":\"Person\",\"@id\":\"https:\/\/www.iri.com\/blog\/#\/schema\/person\/9c7e21f2a369c971287d5030b7c408e6\",\"name\":\"Jeff Simpson\",\"image\":{\"@type\":\"ImageObject\",\"inLanguage\":\"en-US\",\"@id\":\"https:\/\/www.iri.com\/blog\/#\/schema\/person\/image\/\",\"url\":\"https:\/\/secure.gravatar.com\/avatar\/c636dae166fc9fecf2b62e88fe135187?s=96&d=blank&r=g\",\"contentUrl\":\"https:\/\/secure.gravatar.com\/avatar\/c636dae166fc9fecf2b62e88fe135187?s=96&d=blank&r=g\",\"caption\":\"Jeff Simpson\"},\"url\":\"https:\/\/www.iri.com\/blog\/author\/jeffs\/\"}]}<\/script>\n<!-- \/ Yoast SEO Premium plugin. -->","yoast_head_json":{"title":"What is Hadoop? 
- IRI","robots":{"index":"index","follow":"follow","max-snippet":"max-snippet:-1","max-image-preview":"max-image-preview:large","max-video-preview":"max-video-preview:-1"},"canonical":"https:\/\/www.iri.com\/blog\/iri\/business\/what-is-hadoop\/","og_locale":"en_US","og_type":"article","og_title":"What is Hadoop?","og_description":"Hadoop is an increasingly popular computing\u00a0 environment\u00a0for distributed processing that business can use to analyze and store huge amounts of data. Some of the world&#8217;s largest and most data-intensive corporate users deploy Hadoop to consolidate, combine and analyze big data in both structured and complex sources. With Hadoop, and its MapReduce programming language (and laterRead More","og_url":"https:\/\/www.iri.com\/blog\/iri\/business\/what-is-hadoop\/","og_site_name":"IRI","article_published_time":"2012-06-26T15:07:15+00:00","article_modified_time":"2026-01-09T21:33:07+00:00","og_image":[{"width":345,"height":264,"url":"https:\/\/www.iri.com\/blog\/wp-content\/uploads\/2012\/06\/hadoop1.png","type":"image\/png"}],"author":"Jeff Simpson","twitter_card":"summary_large_image","twitter_misc":{"Written by":"Jeff Simpson","Est. 
reading time":"3 minutes"},"schema":{"@context":"https:\/\/schema.org","@graph":[{"@type":"Article","@id":"https:\/\/www.iri.com\/blog\/iri\/business\/what-is-hadoop\/#article","isPartOf":{"@id":"https:\/\/www.iri.com\/blog\/iri\/business\/what-is-hadoop\/"},"author":{"name":"Jeff Simpson","@id":"https:\/\/www.iri.com\/blog\/#\/schema\/person\/9c7e21f2a369c971287d5030b7c408e6"},"headline":"What is Hadoop?","datePublished":"2012-06-26T15:07:15+00:00","dateModified":"2026-01-09T21:33:07+00:00","mainEntityOfPage":{"@id":"https:\/\/www.iri.com\/blog\/iri\/business\/what-is-hadoop\/"},"wordCount":452,"commentCount":3,"publisher":{"@id":"https:\/\/www.iri.com\/blog\/#organization"},"image":{"@id":"https:\/\/www.iri.com\/blog\/iri\/business\/what-is-hadoop\/#primaryimage"},"thumbnailUrl":"https:\/\/www.iri.com\/blog\/wp-content\/uploads\/2012\/06\/hadoop1.png","keywords":["big data","CoSort","hadoop","iri","warehouse"],"articleSection":["Big Data","IRI Business","Archived Articles"],"inLanguage":"en-US","potentialAction":[{"@type":"CommentAction","name":"Comment","target":["https:\/\/www.iri.com\/blog\/iri\/business\/what-is-hadoop\/#respond"]}]},{"@type":"WebPage","@id":"https:\/\/www.iri.com\/blog\/iri\/business\/what-is-hadoop\/","url":"https:\/\/www.iri.com\/blog\/iri\/business\/what-is-hadoop\/","name":"What is Hadoop? 
- IRI","isPartOf":{"@id":"https:\/\/www.iri.com\/blog\/#website"},"primaryImageOfPage":{"@id":"https:\/\/www.iri.com\/blog\/iri\/business\/what-is-hadoop\/#primaryimage"},"image":{"@id":"https:\/\/www.iri.com\/blog\/iri\/business\/what-is-hadoop\/#primaryimage"},"thumbnailUrl":"https:\/\/www.iri.com\/blog\/wp-content\/uploads\/2012\/06\/hadoop1.png","datePublished":"2012-06-26T15:07:15+00:00","dateModified":"2026-01-09T21:33:07+00:00","breadcrumb":{"@id":"https:\/\/www.iri.com\/blog\/iri\/business\/what-is-hadoop\/#breadcrumb"},"inLanguage":"en-US","potentialAction":[{"@type":"ReadAction","target":["https:\/\/www.iri.com\/blog\/iri\/business\/what-is-hadoop\/"]}]},{"@type":"ImageObject","inLanguage":"en-US","@id":"https:\/\/www.iri.com\/blog\/iri\/business\/what-is-hadoop\/#primaryimage","url":"https:\/\/www.iri.com\/blog\/wp-content\/uploads\/2012\/06\/hadoop1.png","contentUrl":"https:\/\/www.iri.com\/blog\/wp-content\/uploads\/2012\/06\/hadoop1.png","width":"345","height":"264"},{"@type":"BreadcrumbList","@id":"https:\/\/www.iri.com\/blog\/iri\/business\/what-is-hadoop\/#breadcrumb","itemListElement":[{"@type":"ListItem","position":1,"name":"Home","item":"https:\/\/www.iri.com\/blog\/"},{"@type":"ListItem","position":2,"name":"What is Hadoop?"}]},{"@type":"WebSite","@id":"https:\/\/www.iri.com\/blog\/#website","url":"https:\/\/www.iri.com\/blog\/","name":"IRI","description":"Total Data Management 
Blog","publisher":{"@id":"https:\/\/www.iri.com\/blog\/#organization"},"potentialAction":[{"@type":"SearchAction","target":{"@type":"EntryPoint","urlTemplate":"https:\/\/www.iri.com\/blog\/?s={search_term_string}"},"query-input":{"@type":"PropertyValueSpecification","valueRequired":true,"valueName":"search_term_string"}}],"inLanguage":"en-US"},{"@type":"Organization","@id":"https:\/\/www.iri.com\/blog\/#organization","name":"IRI","url":"https:\/\/www.iri.com\/blog\/","logo":{"@type":"ImageObject","inLanguage":"en-US","@id":"https:\/\/www.iri.com\/blog\/#\/schema\/logo\/image\/","url":"https:\/\/www.iri.com\/blog\/wp-content\/uploads\/2019\/02\/iri-logo-total-data-management-small-1.png","contentUrl":"https:\/\/www.iri.com\/blog\/wp-content\/uploads\/2019\/02\/iri-logo-total-data-management-small-1.png","width":750,"height":206,"caption":"IRI"},"image":{"@id":"https:\/\/www.iri.com\/blog\/#\/schema\/logo\/image\/"}},{"@type":"Person","@id":"https:\/\/www.iri.com\/blog\/#\/schema\/person\/9c7e21f2a369c971287d5030b7c408e6","name":"Jeff Simpson","image":{"@type":"ImageObject","inLanguage":"en-US","@id":"https:\/\/www.iri.com\/blog\/#\/schema\/person\/image\/","url":"https:\/\/secure.gravatar.com\/avatar\/c636dae166fc9fecf2b62e88fe135187?s=96&d=blank&r=g","contentUrl":"https:\/\/secure.gravatar.com\/avatar\/c636dae166fc9fecf2b62e88fe135187?s=96&d=blank&r=g","caption":"Jeff 
Simpson"},"url":"https:\/\/www.iri.com\/blog\/author\/jeffs\/"}]}},"jetpack_featured_media_url":"https:\/\/www.iri.com\/blog\/wp-content\/uploads\/2012\/06\/hadoop1.png","_links":{"self":[{"href":"https:\/\/www.iri.com\/blog\/wp-json\/wp\/v2\/posts\/1130"}],"collection":[{"href":"https:\/\/www.iri.com\/blog\/wp-json\/wp\/v2\/posts"}],"about":[{"href":"https:\/\/www.iri.com\/blog\/wp-json\/wp\/v2\/types\/post"}],"author":[{"embeddable":true,"href":"https:\/\/www.iri.com\/blog\/wp-json\/wp\/v2\/users\/6"}],"replies":[{"embeddable":true,"href":"https:\/\/www.iri.com\/blog\/wp-json\/wp\/v2\/comments?post=1130"}],"version-history":[{"count":49,"href":"https:\/\/www.iri.com\/blog\/wp-json\/wp\/v2\/posts\/1130\/revisions"}],"predecessor-version":[{"id":11292,"href":"https:\/\/www.iri.com\/blog\/wp-json\/wp\/v2\/posts\/1130\/revisions\/11292"}],"wp:featuredmedia":[{"embeddable":true,"href":"https:\/\/www.iri.com\/blog\/wp-json\/wp\/v2\/media\/1554"}],"wp:attachment":[{"href":"https:\/\/www.iri.com\/blog\/wp-json\/wp\/v2\/media?parent=1130"}],"wp:term":[{"taxonomy":"category","embeddable":true,"href":"https:\/\/www.iri.com\/blog\/wp-json\/wp\/v2\/categories?post=1130"},{"taxonomy":"post_tag","embeddable":true,"href":"https:\/\/www.iri.com\/blog\/wp-json\/wp\/v2\/tags?post=1130"}],"curies":[{"name":"wp","href":"https:\/\/api.w.org\/{rel}","templated":true}]}}