Challenges in Data Integration

Most data integration (DI) and staging jobs rely on inefficient in-DB transforms or compiled Java programs. ETL performance suffers, and time-to-solution from data segmentation, DB queries, business intelligence (BI), and dependent downstream applications grows. Speedier, more scalable solutions in Hadoop or in-memory platforms are too complex, too costly, or both.


How do People Respond?

  • Procrastination - chancing SLA-restricted operations on shrinking production windows
  • Betting - adopting complex tech like Hadoop, in-memory DBs, or proprietary ELT appliances
  • Partitioning - transforming data in multiple chunks and stages instead of in a single step
  • Open-Source - needing more hardware and consulting to overcome inherently slow engines
  • The Cloud - adding security and bandwidth concerns to existing functional challenges

Other Concerns

Beyond speed and scalability are affordability and ease-of-use. Hundreds of thousands of dollars and many months are spent getting up to speed in mega-vendor suites. Long consulting engagements are needed to manage complex data architectures and ETL workflows. Simpler tools claiming fast on-boarding are limited in performance and capability.

The ways those platforms discover, track, govern, and persist data -- and manage data and master data -- are inadequate, cryptic, or inefficient.


Solutions

  • Conceive - Simplify with a modern DI tool
  • Speed - Supercharge a slow DI tool
  • Leave - Re-Platform an overpriced DI tool

The IRI Voracity platform for data discovery, integration, migration, governance, and analytics is a robust ETL platform for structured, semi-structured, and unstructured data that combines the proven power of CoSort and Hadoop with multiple metadata design and management options in Eclipse.

This diagram "reflects" how Voracity combines the best attributes of the large legacy ETL vendor tools on the market:

Voracity addresses the functional and performance aspects of big data DI and ETL, without the complexity or costs typical of the alternatives. Finally: a single, simple, affordable place to move and use big data rapidly.

What is Voracity?

Voracity is an affordable one-stop shop and GUI for almost every conceivable data management requirement. It's also a standalone ETL and data life cycle management 'platform product'.

Voracity saves money on software, hardware and consulting resources, while expanding your enterprise information management (EIM) capabilities in support of digital business initiatives -- all from one pane of glass.

What Does Voracity Include?

All the features/functions listed below are supported in the IRI Voracity data management platform and constituent IRI Data Manager suite products.

GUI refers to the IRI Workbench Graphical User Interface. IRI Workbench is a free Integrated Development Environment (IDE), built on Eclipse™, for integrating and transforming data with the SortCL program in Voracity, IRI CoSort, and all other IRI software.

DTP refers to the Data Tools Plugin (and Data Source Explorer) in the IRI Workbench. DDF refers to Data Definition Files, the metadata for source and target data layouts.

Operation - Description

  • Discover data in pattern searches through DBs, files, and "dark data" documents. Perform traditional DB profiling and E-R diagramming on connected tables.
  • Create and modify jobs in multiple ways: a Sirius visual workflow palette, end-to-end wizards, GUI dialogs, and batchable 4GL scripts that are modeled and outlined in the GUI's syntax-aware editor and compatible with any external text editor.
  • Connectors - All structured data assets (including RDBs, LDIF, CSV, XML, COBOL, and other sequential files) are available in the project explorer, data source explorer, and remote systems explorer. Support is also available for mainframe index files, unstructured data file formats, ASN.1-compatible CDRs, multiple legacy/proprietary formats, and soon, big data and cloud/SaaS platforms; see the complete list here.
  • Job Wizards - Specify extract, transform, load, reorg, report, and test data generation jobs.
  • High-performance, standalone and integrated ETL steps through IRI FACT, CoSort, and bulk DB loaders in Voracity (as well as more real-time via ODBC select/update). Coming are seamless Voracity options for Hadoop transformation processing via MapReduce, Spark, Storm, and Tez.
  • High-performance "E" and pre-sorted "L". Design/manage "T" in the DTP SQL editor.
  • Generate detail and summary reports in the same pass, or hand off data to BIRT, et al.
  • Encrypt, mask, de-ID, encode, hash, randomize, pseudonymize, tokenize, blur, or redact PII.
  • Improve data quality with a variety of data scrubbing and standardization techniques.
  • Acquire, filter, subset, re-map and/or copy data from old to new data stores.
  • Version & Compare - Update, check in, manage, and share metadata and jobs in Git or other SCCS.
  • Repositories - For DDF metadata, master data formats, set (lookup) files, rules, flow, and job scripts.
  • Data Views - Support for multiple editors and cell display formats for tables, files, and report formats.
  • Schema - Use static or create dynamic schemas via target mapping and table creation options.
  • Lineage - Data mapping and search functions support manual data lineage and impact analysis. Track and compare metadata and other resources (scripts, rules, templates) in version control hubs.
  • Job Fragments - Save, reference, and re-use job and metadata subsets in standalone, portable .DDF files.
  • Compare files or tables to identify, report on, and feed updates for smaller, real-time ETL.
  • Report on values from "fuzzy" lookup logic where they satisfy 'other than equal' criteria.
  • Windowed Aggregates - Perform aggregation within specified row ranges for fair cost accounting and other apps.
  • Rules - Define, store, and re-use field-level business rules for data transformation, protection, and test data generation.

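The "compare files or tables to identify updates" operation listed above can be pictured with a small, generic sketch. It is plain Python, not IRI's implementation or syntax. The function names, the `id` key column, and the sample snapshots are all hypothetical, chosen only to show how two versions of a file yield a small set of inserts, deletes, and updates to feed downstream.

```python
import csv
import io


def load_by_key(csv_text, key):
    """Index the rows of a CSV document by the value of one key column."""
    reader = csv.DictReader(io.StringIO(csv_text))
    return {row[key]: row for row in reader}


def diff_snapshots(old_text, new_text, key):
    """Classify rows as inserted, deleted, or updated between two snapshots."""
    old_rows = load_by_key(old_text, key)
    new_rows = load_by_key(new_text, key)
    inserted = [new_rows[k] for k in new_rows if k not in old_rows]
    deleted = [old_rows[k] for k in old_rows if k not in new_rows]
    updated = [new_rows[k] for k in new_rows
               if k in old_rows and new_rows[k] != old_rows[k]]
    return {"inserted": inserted, "deleted": deleted, "updated": updated}


# Hypothetical sample data: yesterday's and today's extract of one table.
OLD = "id,name,city\n1,Ann,Oslo\n2,Bob,Rome\n3,Cy,Lima\n"
NEW = "id,name,city\n1,Ann,Oslo\n2,Bob,Paris\n4,Di,Kyiv\n"

if __name__ == "__main__":
    changes = diff_snapshots(OLD, NEW, key="id")
    for kind, rows in changes.items():
        print(kind, [row["id"] for row in rows])
```

Only the changed rows need to move through the rest of the pipeline, which is what keeps the follow-on ETL step small.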
We already spent a fortune. Can you help us just run these ETL jobs faster?

We understand that, and have been accelerating ETL tools (especially Informatica and DataStage transforms) for years.

To accelerate third-party ETL and BI/analytic tools, as well as DB operations, use IRI's scriptable, batchable transform engine(s) alongside -- and amplify the return on your investment in -- these platforms:

ETL Tools

ETI Solution
IBM DataStage
Informatica PowerCenter
Microsoft SSIS
Oracle Data Integrator
Pentaho Data Integrator
Talend

BI Tools

BIRT
BOBJ
Cognos
Excel
MicroStrategy
QlikView
OBIEE

Analytic Tools

JupiterOne
R
SAS
Splunk
Spotfire
Tableau

Databases

DB2
Greenplum
MySQL
Oracle
SQL Server
Sybase
Teradata

Run Voracity jobs from your tool's command-line (shell) option to prepare big data faster, and populate the DB tables or file formats your tool can directly ingest.
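As a rough illustration of calling a job from a tool's command-line hook, the sketch below wraps the call in Python. It assumes the engine is started as `sortcl /SPECIFICATION=<script>`; confirm the exact syntax in the IRI documentation for your platform. The helper names and the script name `prepare_sales.scl` are hypothetical. By default it only returns the command it would run, so it can be tried without the software installed.

```python
import shlex
import subprocess


def build_job_command(script_path, executable="sortcl"):
    """Return the argument list for running one job script.

    Assumption: the engine accepts /SPECIFICATION=<script> on its command line.
    """
    return [executable, f"/SPECIFICATION={script_path}"]


def run_job(script_path, executable="sortcl", dry_run=True):
    """Run the job, or only return the shell-quoted command when dry_run is set."""
    command = build_job_command(script_path, executable)
    if not dry_run:
        # check=True raises CalledProcessError if the job exits non-zero,
        # so a calling ETL or BI tool can treat the step as failed.
        subprocess.run(command, check=True)
    return shlex.join(command)


if __name__ == "__main__":
    # Hypothetical job script that prepares data for a downstream tool.
    print(run_job("prepare_sales.scl"))
```

Passing the arguments as a list rather than a single shell string avoids quoting problems when script paths contain spaces.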

You'll be using the same IRI software engines Voracity does: IRI FACT for extraction, IRI CoSort (or Hadoop) for sort/join/aggregate transformation, IRI FieldShield for data masking, and/or IRI RowGen for synthetic test data.

Can we replace our legacy ETL tool automatically?

Well, now you can. Voracity is API-integrated with AnalytiX DS' metadata hub technology so you can convert from legacy ETL products more or less automatically.

Contact your IRI or ADS representative and ask about CatFX templates for Voracity from your current ETL tool, along with any LiteSpeed Conversion services you need to help port and test the more complex mappings.

Whether you're switching ETL platforms or just starting out in data integration, use Voracity to shrink time to deployment and time to information delivery.
