Data Integration Solutions

 


ETL & Beyond

Does your data integration platform do everything you need it to do, and does it work seamlessly with other critical data management activities?

  • If so, and you're not happy with its performance, price, or complexity
  • If not, and you need to modernize your data integration, governance, or wrangling strategy

then you may wish to reconsider your data integration solutions strategy, and examine the big data ETL tools in the IRI Voracity platform in particular.

All the features/functions listed below are supported in the Voracity data management platform and its included IRI Data Manager and IRI Data Protector suite products.

GUI refers to the IRI Workbench Graphical User Interface for Voracity. IRI Workbench is the widely adopted, user-friendly Integrated Development Environment (IDE) built on Eclipse™ for integrating and transforming data in Voracity.

DTP refers to the Data Tools Plugin (and Data Source Explorer) in the IRI Workbench. DDF refers to Data Definition Files, the simple, open metadata for source and target data layouts.

Operation Description
Discover data in pattern, fuzzy, and dictionary searches through DBs, files, or "dark data" documents. Perform traditional DB profiling and E-R diagramming on connected tables. Auto-classify data into groups and match them to transformation, protection, and other rules.
Create and modify jobs in multiple ways: a visual workflow palette, end-to-end wizards, GUI dialogs, batchable 4GL scripts, and even a metadata API, all of which are modeled and outlined in the GUI's syntax-aware editor. You can also work on your flow and task scripts in any external text editor.
Connectors
Manage your data assets (including RDBs, LDIF, CSV, XML, COBOL, and other files) from the Eclipse project explorer, data source explorer, and remote systems explorer. Support is also available for mainframe index files, unstructured data file formats, ASN.1-compatible CDRs, multiple legacy/proprietary formats, and big data and cloud/SaaS platforms; see the complete list here.
Job Wizards
Automate the generation of your ETL or standalone unload, transform, or load jobs, plus slowly changing dimension, change data capture, pivoting, subsetting, data masking, data migration / replication, and test data generation / population jobs.
High performance, standalone or combined ETL operations in Voracity, i.e.,
  1. ODBC (surgical) or IRI FACT (parallel bulk) extracts
  2. Optimized & combined IRI CoSort data transforms
  3. ODBC (surgical) or DB utility (pre-sorted bulk) loads
If you have single sources >10TB, Voracity can also run many CoSort (SortCL) transformation, reformatting, and masking jobs seamlessly in Hadoop MapReduce2, Spark, Spark Stream, Storm, or Tez through the VGrid gateway to your (Cloudera, HortonWorks, MapR, or generic Apache) distribution.
High performance "E" and pre-sorted "L". Design/manage "T" in Voracity (above) or integrated SQL operations
Leverage embedded BI or data wrangling options in Voracity; i.e.:
  1. generate detail and summary reports in the same pass as ETL and other tasks
  2. feed KNIME via our Voracity data source node, or index Splunk via the Voracity app, add-on, or Universal Forwarder
  3. hand off display-ready subsets to another visualization tool like Cognos, Power BI, Qlik, R, Spotfire, or Tableau to speed time-to-insight in that platform, since Voracity has already done the heavy lifting.

Learn why DW industry guru Dr. Barry Devlin named Voracity a Production Analytic Platform

Encrypt, redact, pseudonymize, hash, randomize, tokenize, or otherwise de-identify PII seamlessly; i.e., data masking on the fly in the same job script and I/O pass with all the ETL, cleansing, migration, and analytic / reporting functions listed on this page. The true 'magic' and value of Voracity is this very kind of task consolidation.
Improve data quality with a variety of data scrubbing and standardization techniques
Acquire, filter, subset, re-map and/or copy data from old to new data stores
Team Share
Update, check-in, manage, and share metadata and jobs in GIT, CVS, SVN, DataSwitch, MIMB, Quest (Erwin / AnalytiX DS) Mapping Manager, etc.
Repositories
Save, share, and re-use DDF metadata, master data dictionaries, business glossaries, set (lookup) files, rules, flow, and job scripts
Data Views
See and work directly with your source and target data in files and tables in custom editor and cell displays
Schema
Use static, create dynamic, or convert schemas via target mapping and table creation options
Lineage
Free Eclipse plug-ins support manual data lineage and impact analysis, and Erwin Mapping Manager supports the visual kind. Track and compare metadata and other resources (scripts, rules, templates) in version control hubs.
Job Fragments
Save, reference, and re-use job and metadata subsets in standalone, portable .DDF files, rule libraries, and other open artifacts
Transpose rows to columns and columns to rows to de-normalize or normalize your data efficiently through an easy wizard
Compare files or tables to identify, report on, and feed updates for smaller, real-time ETL using an intuitive job wizard
Report on values from "fuzzy" lookup logic where they satisfy 'other than equal' criteria in all the common types from one wizard
Windowed Aggregates
Perform aggregation within specified row ranges for fair cost accounting and other apps
Rules
Define, store, and re-use field-level business rules for data transformation, protection, and test data generation
Prototype & Test
Generate and load safe, realistic, and referentially correct test data in file or table targets -- without real data -- for an entire EDW in Voracity's built-in IRI RowGen wizard(s). Or, use Voracity's built-in DB subsetting wizard to filter and mask referentially correct DB test sets. Or, preview the output of ETL and other workflow tasks with real data, or immediately simulated test data in the same format.

Frequently Asked Questions (FAQs)

1. What is data integration and why is it important?
Data integration is the process of combining data from different sources into a unified view to support analytics, reporting, and business operations. It’s critical for eliminating silos, improving data quality, and enabling faster, more informed decisions.
2. How does IRI Voracity differ from traditional ETL tools?
IRI Voracity combines high-speed ETL, data masking, data cleansing, migration, test data generation, and reporting in the same job and I/O pass. This consolidation reduces complexity, boosts performance, and lowers costs compared to multi-tool ETL environments.
3. What ETL operations are supported in Voracity?
Voracity supports ODBC and high-speed FACT-based extraction, CoSort-powered transformations, and ODBC or DB-native bulk loads. It also supports textual ETL using DarkShield searches through unstructured files, and Hadoop and Spark environments through a Voracity option called VGrid for horizontally scaled ETL operations.
4. How can IRI Voracity help with ELT strategies?
Voracity can be used for extract and load (E & L) tasks, while managing or designing transformations (T) through embedded logic or external SQL. Its flexibility supports both ETL and ELT workflows based on organizational needs.
5. Can Voracity perform data masking during integration?
Yes. Voracity supports seamless data masking—such as encryption, pseudonymization, redaction, and tokenization—during ETL jobs. You can apply data protection rules on-the-fly as part of the same job script and processing pipeline.
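To make the "same job, same pass" idea concrete, here is a minimal Python sketch of masking applied while rows stream through a transform. It is not Voracity or SortCL syntax. The field names (`name`, `ssn`, `amount`), the demo key, and the helper functions are hypothetical, chosen only to illustrate keyed pseudonymization and partial redaction inside one loop.

```python
import hashlib
import hmac

# Hypothetical demo key; a real job would load a managed secret instead.
SECRET_KEY = b"demo-key-not-for-production"


def pseudonymize(value: str, key: bytes = SECRET_KEY) -> str:
    """Keyed hash: the same input always yields the same 16-character token."""
    digest = hmac.new(key, value.encode("utf-8"), hashlib.sha256).hexdigest()
    return digest[:16]


def redact_ssn(ssn: str) -> str:
    """Mask every digit except those in the last four characters."""
    head, tail = ssn[:-4], ssn[-4:]
    return "".join("*" if ch.isdigit() else ch for ch in head) + tail


def transform_and_mask(rows):
    """Single pass over the input: filter, derive a field, and mask PII."""
    for row in rows:
        if row["amount"] <= 0:  # filter step
            continue
        out = dict(row)
        out["name"] = pseudonymize(row["name"])  # masking step
        out["ssn"] = redact_ssn(row["ssn"])  # masking step
        out["amount_cents"] = round(row["amount"] * 100)  # transform step
        yield out
```

Because the masking functions run inside the same generator as the filter and the derived field, the source rows are read only once, which is the consolidation the answer above describes.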
6. How does Voracity handle data quality improvement?
Voracity supports robust data cleansing capabilities, including validation, enrichment, formatting, standardization, and de-duplication, all within its data transformation workflow.
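As a rough illustration of standardization, validation, and de-duplication in one workflow, consider the Python sketch below. It is not Voracity syntax. The record layout, the simple email pattern, and the 10-digit phone rule are assumptions made for the example.

```python
import re

# Deliberately simple pattern for illustration; not a full email validator.
EMAIL_RE = re.compile(r"^[^@\s]+@[^@\s]+\.[^@\s]+$")


def standardize(record: dict) -> dict:
    """Normalize case, whitespace, and phone punctuation."""
    clean = dict(record)
    clean["email"] = record["email"].strip().lower()
    clean["name"] = " ".join(record["name"].split()).title()
    clean["phone"] = re.sub(r"\D", "", record["phone"])  # keep digits only
    return clean


def is_valid(record: dict) -> bool:
    """Assumed rules: email matches the pattern and phone has 10 digits."""
    return bool(EMAIL_RE.match(record["email"])) and len(record["phone"]) == 10


def cleanse(records):
    """Return (kept, rejected); duplicates by standardized email are dropped."""
    seen = set()
    kept, rejected = [], []
    for raw in records:
        rec = standardize(raw)
        if not is_valid(rec):
            rejected.append(rec)
            continue
        if rec["email"] in seen:
            continue
        seen.add(rec["email"])
        kept.append(rec)
    return kept, rejected
```

Standardizing before de-duplicating matters here: two rows that differ only in case or spacing collapse to the same key and are caught as duplicates.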
7. What data sources and formats are supported by Voracity?
Voracity supports a wide range of structured and semi-structured formats including RDBs, LDIF, COBOL, CSV, XML, JSON, and proprietary formats. It also connects to cloud, SaaS, mainframe, legacy, and big data systems. See https://www.iri.com/products/workbench/data-sources for more information.
8. What is IRI Workbench and how does it assist data integration?
IRI Workbench is the Eclipse-based GUI for Voracity that provides visual workflows, job wizards, and a syntax-aware editor for job design, data profiling, transformation, masking, and analytics—all in one unified environment.
9. How does Voracity support change data capture (CDC)?
Voracity includes CDC job wizards that compare files or tables to detect and report changes. These can feed incremental updates into downstream systems or drive smaller, faster ETL workflows.
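The compare-based approach to CDC can be sketched in a few lines of Python. This is an illustration of the general technique, not the logic the Voracity wizard generates. The `detect_changes` function and the `id` key column are hypothetical.

```python
def detect_changes(old_rows, new_rows, key="id"):
    """Compare two snapshots and classify rows as inserts, updates, or deletes."""
    old = {row[key]: row for row in old_rows}
    new = {row[key]: row for row in new_rows}

    inserts = [new[k] for k in new if k not in old]
    deletes = [old[k] for k in old if k not in new]
    updates = [new[k] for k in new if k in old and new[k] != old[k]]

    return {"insert": inserts, "update": updates, "delete": deletes}
```

Only the rows in the returned delta need to flow downstream, which is what keeps the follow-on ETL job small.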
10. Can Voracity help with slowly changing dimensions (SCD)?
Yes. Voracity provides a wizard to handle all common types of slowly changing dimensions using fuzzy lookup logic and supports historical tracking of dimensional data changes.
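For readers new to SCDs, the following Python sketch shows one common variant, Type 2, where a changed attribute expires the current row and adds a new version. It uses exact comparison rather than fuzzy lookup, and it is not what the Voracity wizard emits. The function name, the `start_date` and `end_date` columns, and the sample keys are assumptions.

```python
from datetime import date


def apply_scd2(dimension, incoming, key, tracked, today):
    """Return a new dimension list with Type 2 history applied.

    A row with end_date None is the current version for its key.
    """
    result = [dict(row) for row in dimension]  # copy so the input is untouched
    current = {row[key]: row for row in result if row["end_date"] is None}

    for rec in incoming:
        existing = current.get(rec[key])
        if existing is not None and all(existing[c] == rec[c] for c in tracked):
            continue  # nothing changed, keep the current version
        if existing is not None:
            existing["end_date"] = today  # expire the old version
        new_row = {key: rec[key], "start_date": today, "end_date": None}
        new_row.update({c: rec[c] for c in tracked})
        result.append(new_row)
        current[rec[key]] = new_row

    return result
```

Other SCD types differ mainly in what happens at the "expire" step: Type 1 overwrites the tracked columns in place instead of adding a row.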
11. What types of analytics can be performed within Voracity?
Voracity supports embedded BI functions like reporting and summarization, feeds to platforms like KNIME and Splunk, and output formatting for tools like Tableau, Power BI, and Qlik. It enables faster time-to-insight by performing prep and transformation ahead of external analytics tools.
12. How does IRI Voracity manage metadata and data lineage?
Voracity manages metadata using open DDF formats, supports version control via Git, SVN, and other tools, and provides lineage tracking through free Eclipse plug-ins or integration with Erwin Mapping Manager.
13. What is the role of job wizards in Voracity?
Job wizards in Voracity help automate the creation of common workflows—ETL, data masking, test data generation, change data capture, subsetting, pivoting, and more—reducing development time and user errors.
14. Can Voracity generate synthetic test data?
Yes. Voracity includes built-in wizards from IRI RowGen that generate safe, realistic, and referentially correct test data from scratch. It also supports masking and subsetting for test data from existing databases.
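"Referentially correct" means every foreign key in a generated child table points at a row that exists in the generated parent table. The Python sketch below shows that property on a tiny scale. It is not RowGen syntax, and the table layouts, name lists, and `generate_test_data` function are hypothetical.

```python
import random


def generate_test_data(n_customers, n_orders, seed=42):
    """Build synthetic customers and orders with no real data involved.

    Every order's customer_id is drawn from the generated customer ids,
    so the parent/child relationship always holds.
    """
    rng = random.Random(seed)  # seeded for repeatable test sets
    first = ["Ada", "Grace", "Alan", "Edsger"]
    last = ["Stone", "Rivers", "Field", "Woods"]

    customers = [
        {"customer_id": i, "name": f"{rng.choice(first)} {rng.choice(last)}"}
        for i in range(1, n_customers + 1)
    ]
    ids = [c["customer_id"] for c in customers]

    orders = [
        {
            "order_id": j,
            "customer_id": rng.choice(ids),
            "amount": round(rng.uniform(5, 500), 2),
        }
        for j in range(1, n_orders + 1)
    ]
    return customers, orders
```

Seeding the generator keeps runs repeatable, so a failing test can be reproduced against the identical data set.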
15. What types of organizations benefit most from Voracity?
Voracity is ideal for organizations seeking to modernize or consolidate ETL, data masking, quality, and analytics into one platform—especially where performance, affordability, and governance (compliance) are priorities.