Big Data Integration


Problem #1: Speed

Most data integration jobs are performed in legacy ETL or ELT tools that rely on compiled Java programs or inefficient in-DB transforms. Job design and execution times suffer. So do all the downstream queries and applications that depend on those jobs. ETL via Hadoop or in-memory DBs may not spin up quickly enough or run fast enough.


Problems #2 & 3: Cost & Complexity

Beyond speed and scalability are affordability and ease-of-use. Hundreds of thousands of dollars and many months are spent building and supporting jobs in legacy ETL suites. Long consulting engagements are needed to manage complex data architectures and ETL workflows. Simpler tools claiming fast on-boarding are limited in scalability and capability.

The ways those platforms discover, track, govern, and persist data -- and manage data and master data -- are inadequate, cryptic, or inefficient.

How do People Respond?

  • Procrastination - chancing SLA-restricted operations on shrinking production windows
  • Betting - adopting complex tech like Hadoop, in-memory DBs, or proprietary ELT appliances
  • Partitioning - transforming data in multiple chunks and stages instead of in a single step
  • Open-Source - needing more hardware and consulting to overcome inherently slow engines
  • The Cloud - adding security and bandwidth concerns to existing functional challenges

 Real World Solutions

You can now bend the cost curve of legacy ETL vendors and multi-tool complexity with a modern, all-in-one platform for data discovery, integration, migration, governance and analytics. IRI Voracity is a proven high-performance platform for building new data integration environments, accelerating your current ETL tool, or automating your move away from it.



job design in a modern ETL platform.



Speed your legacy ETL tool without replacing it.



Leave your overpriced ETL vendor, automatically!

Voracity is not only ideal for fast, affordable ETL operations. It is a future-proof solution stack for big data discovery, integration, migration, governance, and analytics on structured, semi-structured, and unstructured sources.

Voracity uniquely combines the proven power of IRI CoSort or Hadoop with seamless metadata design and deployment options in Eclipse. In fact, Voracity has more job design, deployment, and licensing options than any other data integration tool.

Learn how Voracity will help you build or improve your data integration paradigm

This diagram "reflects" how Voracity combines the best attributes of the large legacy ETL vendor tools on the market:

Finally: a single, simple, affordable place to move and use big data rapidly.
What is Voracity?

Voracity is an affordable one-stop-shop and GUI for almost every conceivable data management requirement. It's also a standalone ETL and data life cycle management 'platform product'.

Voracity saves money on software, hardware and consulting resources, while expanding your enterprise information management (EIM) capabilities in support of digital business initiatives -- all from one pane of glass.

What Does Voracity Include?

All the features/functions listed below are supported in the IRI Voracity data management platform and constituent IRI Data Manager suite products.

GUI refers to the IRI Workbench Graphical User Interface. IRI Workbench is a free Integrated Development Environment (IDE), built on Eclipse™, for integrating and transforming data with the SortCL program in Voracity, IRI CoSort, and all other IRI software.

DTP refers to the Data Tools Plugin (and Data Source Explorer) in the IRI Workbench. DDF refers to Data Definition Files, the metadata for source and target data layouts.

Operation Description
Discover data in pattern searches through DBs, files, and "dark data" documents. Perform traditional DB profiling and E-R diagramming on connected tables.
Create and modify jobs in multiple ways: a Sirius visual workflow palette, end-to-end wizards, GUI dialogs, and batchable 4GL scripts that are modeled and outlined in the GUI's syntax-aware editor and compatible with any external text editor.
All structured data assets (including RDBs, LDIF, CSV, XML, COBOL, and other sequential files) are available in the project explorer, data source explorer, and remote systems explorer. Support is also available for mainframe index files, unstructured data file formats, ASN.1-compatible CDRs, multiple legacy/proprietary formats, and soon, big data and cloud/SaaS platforms; see the complete list here.
Job Wizards
Specify extract, transform, load, reorg, report, and test data generation jobs.
High-performance, standalone and integrated ETL steps through IRI FACT, CoSort, and bulk DB loaders in Voracity (as well as more real-time operations via ODBC select/update). Seamless Voracity options for Hadoop transformation processing via MapReduce, Spark, Storm, and Tez are coming.
High performance "E" and pre-sorted "L". Design/manage "T" in DTP SQL editor
Generate detail and summary reports in the same pass, or hand off data to BIRT, et al.
Encrypt, mask, de-ID, encode, hash, randomize, pseudonymize, tokenize, blur, or redact PII.
Improve data quality with a variety of data scrubbing and standardization techniques.
Acquire, filter, subset, re-map and/or copy data from old to new data stores.
Version & Compare
Update, check in, manage, and share metadata and jobs in Git or another source code control system.
For DDF metadata, master data formats, set (lookup) files, rules, flow, and job scripts
Data Views
Support for multiple editors and cell display formats for tables, files, and reports
Use static or create dynamic schemas via target mapping and table creation options.
Data mapping and search functions support manual data lineage and impact analysis. Track and compare metadata and other resources (scripts, rules, templates) in version control hubs.
Job Fragments
Save, reference, and re-use job and metadata subsets in standalone, portable .DDF files.
Compare files or tables to identify, report on, and feed updates for smaller, real-time ETL.
Report on values from "fuzzy" lookup logic where they satisfy 'other than equal' criteria.
Windowed Aggregates
Perform aggregation within specified row ranges for fair cost accounting and other apps.
Define, store, and re-use field-level business rules for data transformation, protection, and test data generation.
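To make the "Windowed Aggregates" entry above more concrete, here is a minimal Python sketch of aggregation within a specified row range. It is not Voracity or SortCL syntax; the function name `windowed_sum`, the `window` parameter, and the sample cost figures are hypothetical and chosen only for illustration.

```python
from collections import deque
from typing import Iterable, Iterator


def windowed_sum(values: Iterable[float], window: int) -> Iterator[float]:
    """Yield, for each row, the sum of that row and up to `window - 1` preceding rows."""
    if window < 1:
        raise ValueError("window must be at least 1")
    recent = deque()   # rows currently inside the window
    running = 0.0      # sum of the rows in `recent`
    for value in values:
        recent.append(value)
        running += value
        if len(recent) > window:
            # Drop the oldest row once the window is full.
            running -= recent.popleft()
        yield running


if __name__ == "__main__":
    # Hypothetical daily cost figures, aggregated over a 2-row trailing window.
    daily_costs = [10, 20, 30, 40]
    print(list(windowed_sum(daily_costs, window=2)))
```

A trailing window like this touches each row once, so the aggregate can be produced in the same pass as other transforms rather than in a separate step.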
We already spent a fortune. Can you help us just run these ETL jobs faster?

We understand that, and have been accelerating ETL tools (especially Informatica and DataStage transforms) for years.

To accelerate third-party ETL and BI/analytic tools, as well as DB operations, use IRI's scriptable, batchable transform engine(s) alongside -- and amplify the return on your investment in -- these platforms:

ETL Tools

ETI Solution
IBM DataStage
Informatica PowerCenter
Microsoft SSIS
Oracle Data Integrator
Pentaho Data Integrator

BI Tools


Analytic Tools



SQL Server

Run Voracity jobs from your tool's command-line (shell) option to prepare big data faster, and populate the DB tables or file formats your tool can directly ingest.
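As a rough sketch of that hand-off, the Python wrapper below runs a preparation job as a shell command and stops if the job fails, so the calling tool does not ingest incomplete data. The function name `run_prep_job` is hypothetical, and the commented `sortcl` invocation (executable name, `/SPEC=` option, and script path) is an assumption to verify against your IRI documentation; a stand-in command is used so the sketch runs anywhere.

```python
import subprocess
import sys
from typing import Sequence


def run_prep_job(command: Sequence[str], timeout_s: float = 3600.0) -> str:
    """Run a data-preparation command and return its standard output.

    Raises subprocess.CalledProcessError if the command exits non-zero.
    """
    result = subprocess.run(
        list(command),
        capture_output=True,  # collect stdout/stderr instead of printing them
        text=True,            # decode output as text
        timeout=timeout_s,
        check=True,           # fail loudly on a non-zero exit code
    )
    return result.stdout


if __name__ == "__main__":
    # Assumed form of a real job call (verify before use):
    # job = ["sortcl", "/SPEC=prepare_sales.scl"]
    # Stand-in command so this sketch is runnable without IRI software:
    job = [sys.executable, "-c", "print('job finished')"]
    print(run_prep_job(job))
```

Most ETL and BI tools that offer a command-line or shell step can call a wrapper like this, or the job command directly, before their own load or query stage.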

You'll be using the same IRI software engines Voracity does: IRI FACT for extraction, IRI CoSort (or Hadoop) for sort/join/aggregate transformation, IRI FieldShield for data masking, and/or IRI RowGen for synthetic test data.

Can we replace our legacy ETL tool automatically?

Well, now you can. Voracity is API-integrated with AnalytiX DS' metadata hub technology so you can convert from legacy ETL products more or less automatically.

Contact your IRI or ADS representative and ask about CatFX templates for Voracity from your current ETL tool, along with any LiteSpeed Conversion services you need to help port and test the more complex mappings.

Whether you're switching ETL platforms or just starting out in data integration, use Voracity to shrink time to deployment and time to information delivery.
