"More Hadoop projects will be swept under the rug as businesses devote major resources to their big data projects before doing their due diligence, which results in a costly, disillusioning project failure."
Gary Nakamura, CEO of Concurrent
"Spend (on big data projects) wisely. Follow a CRAWL - WALK - RUN strategy."
Dr. Peter Aiken, Data Blueprint
Are you sure you need another: IT fabric like Hadoop or Teradata, an in-memory or columnar database like HANA or Vertica, a DB or ELT appliance like Exadata or Netezza, or a complex ETL tool like Informatica or Ab Initio? And do you have the time, money, and expertise to take that on?
If you answered no, see below, because 5-figure IRI software integrates and transforms big data in the file systems you already have. IRI CoSort is extremely fast, proven, and has a low learning curve.
If you answered yes, because you want to leverage the power of Hadoop on commodity hardware, look into the new IRI Voracity platform. Providing data discovery, integration, migration, governance, and analytic solutions, Voracity gives you big data manipulation and masking options not only through CoSort, but also through MR2, Spark, Storm and Tez.
Either way, remember that Hadoop distributions and specialty software will not access or handle all the data you need, mash it, or prepare it thoroughly enough (cleansing, masking, reformatting). To mine big data, you must smelt it first. And only IRI does all of this in one Eclipse GUI.
For more than three dozen years, IRI has been the proven performer for preparing and manipulating data from multiple sources across industries, geographies, and Unix/Windows platforms. Find out why you may only need:
- one affordable product, either CoSort or Voracity, both of which use:
- one simple place, an explicit 4GL job script (supported in a free Eclipse GUI), and ...
- one I/O pass, that combines data transformation, conversion, protection, and reporting.
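To make the "one I/O pass" idea concrete, here is a minimal Python sketch. It is not SortCL syntax and not how CoSort is implemented; it only illustrates reading each record once while converting, transforming, protecting, and summarizing it. The field names (customer_id, region, amount_cents), the salt, and the sample data are all hypothetical.

```python
import csv
import hashlib
import io
from collections import defaultdict

# Hypothetical sample input; a real job would read from files or tables.
SAMPLE = """customer_id,region,amount_cents
C001, east ,1250
C002,west,300
C001,east,750
"""


def run_single_pass(source, target, salt="demo-salt"):
    """Read each record once; convert, transform, mask, and summarize it."""
    reader = csv.DictReader(source)
    writer = csv.DictWriter(
        target, fieldnames=["customer_id", "region", "amount_usd"]
    )
    writer.writeheader()
    totals = defaultdict(float)  # running per-region totals for the report

    for row in reader:
        # Conversion: cents (integer text) to dollars.
        amount = int(row["amount_cents"]) / 100
        # Transformation: normalize the region code.
        region = row["region"].strip().upper()
        # Protection: replace the ID with a salted, truncated SHA-256 digest.
        digest = hashlib.sha256((salt + row["customer_id"]).encode("utf-8"))
        pseudonym = digest.hexdigest()[:12]

        writer.writerow(
            {"customer_id": pseudonym, "region": region, "amount_usd": amount}
        )
        # Reporting: accumulate the summary during the same pass.
        totals[region] += amount

    return dict(totals)


if __name__ == "__main__":
    out = io.StringIO()
    report = run_single_pass(io.StringIO(SAMPLE), out)
    print(out.getvalue())
    print(report)
```

The point of the sketch is the shape of the loop: the masked output and the summary report both come from a single read of the source, rather than from separate transform, mask, and report jobs that each rescan the data.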
Here's what you can do with IRI:
Big Data Protection - mask, encrypt, pseudonymize, de-identify, hash, or tokenize data as you transform and provision it.
Big Data Provisioning - bulk load DBs with pre-sorted files, create replicas and federated views, franchise (munge, subset, wrangle) for BI/analytic tools, generate reports (via CoSort job scripts or ODA into BIRT), and create test data.
CoSort and its spin-offs use a simple, self-documenting 4GL program called SortCL for data definition, manipulation, masking and reporting; i.e. metadata for data sources, targets, jobs, and performance controls.
SortCL jobs are designed and managed through a choice of UIs all supported in a single Eclipse IDE. Share, version-control, secure, and run your jobs from the GUI, or build them into batch scripts, applications, or distributed computing environments like Hadoop in Voracity for even more speed.
Browse this section and its links for more details, or request a free trial.
Did You Know?
CoSort, typically used for data transformation, staging and reporting, can also do what its spin-offs do; i.e. data migration (IRI NextForm), data masking (IRI FieldShield), and test data generation (IRI RowGen).
IRI Voracity uses the same metadata and Eclipse GUI as CoSort and its spin-offs, but also lets you design and schedule jobs with state-of-the-art ETL workflow and built-in automation tools, and run them with CoSort or Hadoop engines.