Many big data integration (DI) and staging jobs run too slowly and bog down databases (DBs) with internal transformations. Query performance suffers, and time-to-solution grows for data segmentation, business intelligence (BI), and dependent downstream applications.
And what are the most common responses?
- Procrastination - chancing SLA-restricted operations on shrinking production windows
- Early betting - on complex tech like Hadoop, in-memory DBs, or proprietary ELT appliances
- Partitioning - transforming data in multiple chunks and stages instead of in a single step
- Open source ETL - requiring more hardware, consulting, and inefficient in-DB transforms
- The Cloud - which adds security and bandwidth concerns to existing functional challenges
Beyond speed and scalability concerns lie cost and ease of use. Getting up to speed with mega-vendor platforms takes hundreds of thousands of dollars (or more), plus months of effort. Lengthy consulting engagements are typically required to master complex data architectures and ETL tools. Simpler ETL tools that claim fast on-boarding are limited in performance and capability.
And in the end, the ways those platforms discover, track, govern and persist data -- and manage data and master data -- are typically inadequate, cryptic, and/or inefficient.
The IRI Voracity platform for data discovery, integration, migration, governance, and analytics is a new ETL platform for structured, semi-structured, and unstructured data. It combines the power of CoSort and Hadoop with multiple metadata design and management options in Eclipse.
Voracity is an affordable, one-stop-shop and GUI for almost every conceivable data management requirement. It's also a standalone ETL and data life cycle management 'platform product'.
Voracity saves money on software, hardware and consulting resources, while expanding your enterprise information management (EIM) capabilities in support of digital business initiatives -- all from one pane of glass.
Voracity integrates and stages data in your existing file system with the IRI CoSort engine, or in HDFS with Hadoop MR2, Spark, Storm, or Tez. Either way, mapping and formatting are defined in the IRI Workbench GUI, and serialized in XML metadata and batch scripts that run on Unix, Linux, and Windows systems, or on Hadoop clusters.
Voracity workflows contain (at a minimum) simple, wizard- or diagram-built SortCL jobs to define data and its mapping. SortCL and its surrounding ecosystem in Eclipse (IRI Workbench) deliver fast, affordable ETL, and the ability to:
- Discover data in multiple sources with profiling and metadata definition tools
- Blend disparate data sources and stage them into multiple targets at once
- Transform, pivot, map, report, and protect data in the same I/O pass
- Choose the best performance (CoSort or Hadoop) strategy for each workflow
- Capture, act on, and report on changed data, and handle slowly changing dimensions
- Filter, de-duplicate, cleanse, validate, and otherwise improve data quality
- Migrate data types, files, and databases
- Federate, replicate, and subset data
- Assess, unify, and create composite or master data values and formats
- Create test database and ETL/ELT test data
- Follow business rules and data privacy laws (mask PII)
- Optimize database unloads, loads, and reorgs
- Design basic or advanced reports, or feed and speed other BI tools
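The idea of transforming, protecting, and feeding multiple targets "in the same I/O pass" can be illustrated with a small, generic Python sketch. This is not SortCL syntax or IRI code; the sample data, the `mask` helper, and the `single_pass` function are hypothetical names chosen for illustration. It only shows the general pattern of reading a source once while filtering, masking, and aggregating.

```python
import csv
import hashlib
import io
from collections import defaultdict

# Hypothetical sample input; a real job would read a file or table extract.
SAMPLE = """id,region,email,amount
1,east,ann@example.com,10.50
2,west,bob@example.com,20.00
3,east,cy@example.com,-5.00
4,east,ann@example.com,4.50
"""


def mask(value: str) -> str:
    """Replace a PII value with a short, deterministic SHA-256 digest."""
    return hashlib.sha256(value.encode("utf-8")).hexdigest()[:12]


def single_pass(source):
    """Filter, mask, and aggregate rows while reading the source only once."""
    detail_rows = []             # target 1: cleansed, masked detail rows
    totals = defaultdict(float)  # target 2: amount summed per region
    for row in csv.DictReader(source):
        amount = float(row["amount"])
        if amount <= 0:          # filter: drop non-positive amounts
            continue
        row["email"] = mask(row["email"])  # protect: mask the PII column
        row["amount"] = amount
        detail_rows.append(row)
        totals[row["region"]] += amount    # aggregate: running sum per region
    return detail_rows, dict(totals)


detail, totals = single_pass(io.StringIO(SAMPLE))
print(totals)
```

Because every step happens inside one loop over the input, the source is never re-read or staged between steps, which is the property the list above describes.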
If you already rely on another ETL tool and are not ready to replace it, we understand. IRI has been accelerating ETL tools (especially Informatica and DataStage operations) for years.
| ETL Tools | BI Tools | Analytic Tools | Databases |
|---|---|---|---|
| Oracle Data Integrator | MicroStrategy | Splunk | SQL Server |
| Pentaho Data Integrator | QlikView | Spotfire | Sybase |
Run Voracity jobs from your tool's command-line (shell) option to prepare big data faster, and populate the DB tables or file formats your tool can directly ingest.
You'll be using the same IRI software engines Voracity does: IRI FACT for extraction, IRI CoSort (or Hadoop) for sort/join/aggregate transformation, IRI FieldShield for data masking, and/or IRI RowGen for synthetic test data.
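As a rough sketch of calling a job from another tool's shell or scripting step, the Python wrapper below builds and runs a command line. It assumes the engine is invoked as `sortcl /SPEC=<script>`; treat that syntax, the function names `build_sortcl_command` and `run_job`, and the script name `stage_orders.scl` as assumptions to check against your own installation and its documentation.

```python
import shlex
import subprocess


def build_sortcl_command(script_path: str, executable: str = "sortcl") -> list[str]:
    """Build the argument list for one job script.

    Assumption: the engine is invoked as `sortcl /SPEC=<script>`; verify
    the exact syntax against your installed version's documentation.
    """
    return [executable, f"/SPEC={script_path}"]


def run_job(script_path: str, runner=subprocess.run) -> int:
    """Run a job and return its exit code; `runner` is injectable for testing."""
    completed = runner(build_sortcl_command(script_path), check=False)
    return completed.returncode


if __name__ == "__main__":
    # Hypothetical script name; the command is printed rather than executed.
    print(shlex.join(build_sortcl_command("stage_orders.scl")))
```

Returning the exit code lets the calling tool decide whether to continue with its own load or reporting step after the job finishes.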
If you have wanted to move off a legacy ETL product, now you can. Voracity metadata is API-integrated with AnalytixDS' metadata hub technology, so you can convert from legacy ETL products automatically.
Contact your IRI or ADS representative and ask about CatFX templates for your ETL tool, along with available LiteSpeed Conversion services to port and test the more complex mappings.
Whether you're switching ETL platforms or just starting out in data integration, use Voracity to shrink time to deployment and time to delivery.