
In ETL (extract, transform, load) operations, data are extracted from different sources, transformed separately, and loaded to a data warehouse (DW) database and possibly other targets:

http://www.iri.com/blog/wp-content/uploads/2012/11/ETL.jpg

ETL tools and technologies today are relatively mature and predictable. IRI's ETL approach is special for a number of reasons:

Speed

Simplicity

Versatility

Ergonomics

Extensibility

Economics

Speed

IRI FACT (Fast Extract) uses native drivers to unload huge tables in parallel to flat files or pipes.

IRI CoSort takes the output of FACT from a file or in-memory stream (pipe) and does the heavy lifting of data transformation, load pre-sort, and reporting all in the same job script and I/O pass.
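The idea of feeding an extract stream straight into one combined transform, sort, and report step can be sketched in plain Python. This is only an illustration of the single-pass pattern, not FACT or CoSort code; the function names, the `region` and `amount` fields, and the sample data are all hypothetical.

```python
import csv
import io
from itertools import groupby
from operator import itemgetter


def extract(stream):
    """Yield rows lazily from a delimited stream (a file or a pipe)."""
    yield from csv.DictReader(stream)


def transform(rows):
    """Filter and convert fields as rows flow past; nothing is staged on disk."""
    for row in rows:
        amount = float(row["amount"])
        if amount > 0:
            yield {"region": row["region"], "amount": amount}


def sort_and_summarize(rows):
    """Sort once on the load key, then derive per-key totals from the same data."""
    ordered = sorted(rows, key=itemgetter("region"))
    totals = {
        region: sum(r["amount"] for r in group)
        for region, group in groupby(ordered, key=itemgetter("region"))
    }
    return ordered, totals


# Hypothetical sample data standing in for an unloaded table.
source = io.StringIO("region,amount\nwest,10.5\neast,4\nwest,-1\neast,6\n")
detail, summary = sort_and_summarize(transform(extract(source)))
```

Here `detail` is the pre-sorted set ready for a bulk load and `summary` is the report, both produced from one read of the source stream.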

The IRI Voracity total data management platform combines FACT, CoSort, and bulk DB load utilities in a visualized, scheduled ETL workflow that does not require compilation or partitioning. It can even seamlessly run CoSort jobs in MapReduce, Spark, Storm, or Tez instead.

Compare all this to slower, more verbose SQL and 3GL programs, and to costlier, more complex ETL and ELT platforms, not to mention the onboarding delays of disjointed Apache projects.

Simplicity

ETL metadata and job definition are automated in the IRI Workbench GUI for Voracity, built on Eclipse™. Data discovery and new-job wizards, along with a number of visual ETL job design options, quickly build reusable repositories and scripts without requiring an education in new syntax.

Nevertheless, Voracity metadata is the easiest in the IT industry to learn and use. It uses the same human-readable 4GL as CoSort, called SortCL, which leverages familiar data layout syntax, SQL manipulation concepts, and shared metadata repositories. Many users still prefer to code and tweak these simple scripts directly.

Versatility

Beyond extremely fast extract/load, and one-pass, no-partitioning-needed data transformations, the Voracity ETL environment includes:

  • Change Data Capture
  • Dark Data Search/Extract/Structure
  • Database and File Profiling
  • Data Masking, Encryption, etc.
  • Data Migration and Replication
  • Data and Metadata Discovery
  • Detail and Summary Reporting
  • Master Data Management
  • Metadata Management & Lineage
  • Offline Reorgs
  • Slowly Changing Dimensions
  • Test Data Generation

Voracity supports these activities on a very broad range of structured, legacy, big data, cloud and SaaS data sources.
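As one example from the list above, change data capture can be pictured as a comparison of two snapshots of the same source. The sketch below is a generic, simplified illustration in Python and does not represent how Voracity implements it; the `capture_changes` function, the `id` key, and the sample rows are hypothetical.

```python
def capture_changes(old_rows, new_rows, key):
    """Classify rows as inserted, updated, or deleted between two snapshots."""
    old = {row[key]: row for row in old_rows}
    new = {row[key]: row for row in new_rows}
    inserts = [new[k] for k in new if k not in old]
    deletes = [old[k] for k in old if k not in new]
    updates = [new[k] for k in new if k in old and new[k] != old[k]]
    return {"insert": inserts, "update": updates, "delete": deletes}


# Hypothetical snapshots of the same table taken at two points in time.
before = [{"id": 1, "city": "Austin"}, {"id": 2, "city": "Boston"}]
after = [{"id": 2, "city": "Denver"}, {"id": 3, "city": "Miami"}]
changes = capture_changes(before, after, key="id")
```

Only the rows in `changes` would need to be applied to the target, rather than reloading the whole table.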

Ergonomics

Create all the E, T, and L jobs in the IRI Workbench GUI for Voracity, built on Eclipse™. Edit the jobs or workflow in palettes, GUI dialogs, syntax-aware script editors (or any text editor you prefer), or the AnalytiX DS Mapping Manager. You have the ergonomic flexibility of working with the data definitions and manipulations visually or through scripting; anything done in one feeds the other.

Test or run jobs individually or together in the GUI flow, or later in a (scheduled) batch operation. You have that execution flexibility because the job scripts are portable. You can run any of the pieces, or the whole project, on any platform where the engine(s) are licensed. Call them from the command line or any application.
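Calling a job from an application usually amounts to launching a command and checking its exit status. The following Python sketch shows that general pattern only. The engine's actual executable name and arguments are not given here, so a trivial stand-in command is used, and `run_job` is a hypothetical helper.

```python
import subprocess
import sys


def run_job(command, timeout=60):
    """Run one job as a child process and return (exit_code, stdout)."""
    result = subprocess.run(command, capture_output=True, text=True, timeout=timeout)
    return result.returncode, result.stdout


# Stand-in command: the real engine executable and its arguments are not
# given in this article, so a trivial Python one-liner is used instead.
placeholder_job = [sys.executable, "-c", "print('job finished')"]
exit_code, output = run_job(placeholder_job)
```

A scheduler or wrapper application could apply the same check to each step and stop the batch when a non-zero exit code is returned.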

Extensibility

The IRI Workbench GUI for Voracity delivers the visual metadata creation, conversion, and discovery tools you need to generate, deploy, and manage the job scripts, data definition files (DDF), and XML workflows common to all IRI software.

In the same place, you can also design and run COBOL, C/C++, Hive, Impala, Java, Perl, Python, R, SQL, and other programs supported in Eclipse, and sometimes incorporate them as steps in your Voracity workflow.

You can also use CoSort's SortCL program in Voracity to optimize transforms for other ETL tools like Informatica and DataStage.

Economics

Voracity is far more than an ETL tool, yet is priced below most of them. Even if you don't use it for ETL, its SortCL program can join across many sources and query data in flat files, so Voracity continues in the CoSort tradition as one of the least expensive change data capture and NoSQL query paradigms available.

For serious ETL architects, however, Voracity's consolidation and multi-processing of transformations in the file system or Hadoop make it the most cost-effective big data processing alternative to DB appliances, Ab Initio, SyncSort, Teradata, and in-memory DBs.

Finally, with its freemium editions, low-cost OpEx subscription tiers, and relative simplicity, Voracity is the most affordable data management platform to onboard and maintain.
