In ETL (extract, transform, load) operations, data are extracted from different sources, transformed separately, and loaded into a data warehouse (DW) database and possibly other targets.
ETL tools and technologies today are relatively mature and predictable. IRI's ETL approach is special for a number of reasons:
|Speed in Volume||Fast Extract (FACT) uses native database drivers to unload huge transaction tables in parallel to flat files, where the fastest transformations occur. FACT speeds extraction from Oracle, DB2, SQL Server, MySQL, Altibase, and Tibero. This data can flow through a pipe into CoSort, which does the heavy lifting of data transformation.
Specifically, the CoSort Sort Control Language (SortCL) program runs multi-threaded sorting, joining, aggregation, and all other transforms in one job script and I/O pass. In that same job, generate custom reports and hand-offs. Use the pre-CoSorted output files (or named pipes) to feed direct path loads. This is the fastest possible way to bulk-load a relational target.
Compare all this to slower, more verbose SQL and 3GL programs, and to costlier, more complex ETL tools.
|Flexibility||Create all the E, T, and L jobs in the IRI Workbench GUI, built on Eclipse. You can edit the jobs or workflow in GUI dialogs or syntax-aware script editors (or any text editor you prefer). You have the ergonomic flexibility of working with the data definitions and manipulations visually or through scripting; anything done in one feeds the other.
Test or run jobs individually or together in the GUI flow, or later in a (scheduled) batch operation. You have that execution flexibility because the job scripts are portable. You can run any of the pieces, or the whole project, on any platform where the engine(s) are licensed. Call them from the command line or any application.
|Versatility||The options available at each E, T, and L step address more requirements than most people have. Beyond extremely fast unload/load, and one-pass, no-partitioning-needed data transformations, SortCL also handles:
SortCL can also optimize transformations for other ETL tools like Informatica. By consolidating and multi-threading transformations in the file system, SortCL is a cost-effective alternative to Hadoop and in-memory DBs. Because SortCL can also join across many sources and query data in flat files, CoSort can be thought of as the first NoSQL platform. It remains among the fastest, simplest, and cheapest storage and retrieval paradigms available.
Metadata and job definition are automated in the GUI. Data discovery and new job wizards build reusable repositories and scripts without requiring an education in new syntax. Plus, that syntax is the easiest in the IT industry to learn and use. SortCL uses a human-readable 4GL that leverages familiar data layout syntax, SQL manipulation concepts, and centralized metadata repositories. Many users prefer to code and tweak scripts directly.
SortCL scripts all follow the same logical process flow (input, action, output), are self-documenting, and are more concise than competing alternatives. SortCL is faster and easier to maintain than proprietary programs, SQL, shell scripts, and the compiled projects other ETL tools require. At runtime, jobs run in the file system, where performance is easier to control. IRI ETL users modify resource use through explicit control scripts and familiar system commands.
|Price||Assuming you could reach similar levels of performance and capability with your current ETL or ELT solution, what does it cost? Consider not only software, but the hardware and services needed to achieve it. With IRI's smaller overhead (no investors), you win here, too. Compare the ROI of IRI solutions to what you may be using or considering now.|
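The unload, transform, and bulk-load flow described under Speed in Volume can be illustrated with a small generic sketch. It is not SortCL or FACT, and it does not use any IRI API: Python's standard library stands in for each stage, and the table names (`sales`, `sales_by_region`), the sample rows, and the function names are all hypothetical. It only shows the shape of the idea, which is to unload a table to a flat file, sort and aggregate that file in a single read, and hand the pre-sorted result to a bulk insert.

```python
import csv
import io
import sqlite3
from itertools import groupby
from operator import itemgetter


def extract(conn, out):
    """Unload the (hypothetical) source table to a delimited flat file."""
    writer = csv.writer(out)
    for row in conn.execute("SELECT region, amount FROM sales"):
        writer.writerow(row)


def transform(src):
    """Sort the flat file by region, then aggregate the sorted rows."""
    rows = sorted(csv.reader(src), key=itemgetter(0))
    for region, group in groupby(rows, key=itemgetter(0)):
        yield region, sum(float(amount) for _, amount in group)


def load(conn, rows):
    """Bulk-insert the pre-sorted, aggregated rows into the target table."""
    conn.executemany("INSERT INTO sales_by_region VALUES (?, ?)", rows)
    conn.commit()


def run_demo():
    # In-memory databases stand in for the real source and target systems.
    source = sqlite3.connect(":memory:")
    source.execute("CREATE TABLE sales (region TEXT, amount REAL)")
    source.executemany(
        "INSERT INTO sales VALUES (?, ?)",
        [("west", 10.0), ("east", 5.0), ("west", 2.5), ("east", 1.0)],
    )
    target = sqlite3.connect(":memory:")
    target.execute("CREATE TABLE sales_by_region (region TEXT, total REAL)")

    # An in-memory buffer stands in for the flat file or named pipe.
    flat_file = io.StringIO()
    extract(source, flat_file)
    flat_file.seek(0)
    load(target, transform(flat_file))
    return target.execute(
        "SELECT region, total FROM sales_by_region ORDER BY rowid"
    ).fetchall()


if __name__ == "__main__":
    print(run_demo())
```

Because the rows reach the loader already sorted and aggregated, the target receives them in key order, which is the property the pre-sorted load files described above rely on. A real job would replace the in-memory buffer with a file or named pipe and the `executemany` call with the target database's own bulk loader.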