How do you currently manipulate VLDB table and flat-file data? ETL tools, Perl scripts, custom programs, and SQL procedures can be expensive and hard to maintain. Complex GUIs and coding syntax make specification difficult, and runtime performance may be lacking.
Can you leverage multiple CPUs and cores for big jobs, run several tasks in the same I/O, and dynamically allocate resources to optimize performance? Are your data and job definitions easy enough for a non-expert to safely modify?
The SortCL program in IRI's CoSort package (supported in Eclipse) does the heavy lifting of data transformation in the world's largest data warehouses, operational data stores and clickstream data webhouses.
Transform big data with speed and simplicity that no other tool or method can match. Transform without burdening the database ... without adding hardware ... without Hadoop or other new paradigms plagued by failure risks, high costs, and steep learning curves.
Optimize ETL operations as you combine sorts, joins, and aggregates in a single job script, partition, and I/O pass. At the same time, de-duplicate and filter, convert and re-map, lookup, rank, (de-)normalize, calculate, shift, encrypt to protect, and mask to re-cast.
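To make the idea of combining several transforms in one job concrete, here is a minimal Python sketch. It is not SortCL syntax and does not reflect how CoSort is implemented; the function name, field names (`region`, `order_id`, `amount`), and sample rows are all hypothetical. It only illustrates filtering, sorting, de-duplicating, and aggregating over a single sorted stream instead of running four separate steps.

```python
from itertools import groupby
from operator import itemgetter


def transform(rows, min_amount=0.0):
    """Filter, sort, de-duplicate, and aggregate rows in one sorted stream."""
    # Filter: keep only rows at or above the threshold.
    kept = (r for r in rows if r["amount"] >= min_amount)
    # Sort once on (region, order_id); every later step reuses this order.
    ordered = sorted(kept, key=itemgetter("region", "order_id"))

    summary = []
    for region, group in groupby(ordered, key=itemgetter("region")):
        total, count, last_id = 0.0, 0, None
        for row in group:
            if row["order_id"] == last_id:
                continue  # de-duplicate: skip a repeated order id
            last_id = row["order_id"]
            total += row["amount"]
            count += 1
        summary.append({"region": region, "orders": count, "total": total})
    return summary


# Hypothetical sample data.
orders = [
    {"region": "west", "order_id": 2, "amount": 40.0},
    {"region": "east", "order_id": 1, "amount": 25.0},
    {"region": "west", "order_id": 2, "amount": 40.0},  # duplicate
    {"region": "east", "order_id": 3, "amount": 5.0},   # below threshold
    {"region": "west", "order_id": 4, "amount": 10.0},
]
result = transform(orders, min_amount=10.0)
print(result)
```

Because the rows are sorted once, the duplicate check and the per-region totals both fall out of the same ordered pass, which is the general reason sort-based tools can fold these steps together.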
Transform large volumes of data in many different table and file sources together. Discover, define, and expose your data and manipulation definitions in simple text file metadata repositories you can manage in the free IRI Workbench GUI, built on Eclipse. Use named fields for the mappings, as you:
- Map sources to targets
- Reduce script sizes and creation times
- Facilitate reorg and ETL operations
- Produce load and file-compare metadata
Perform all the same transforms that slower and more complex SQL procedures or ETL tools do, including:
- change data capture
- row-column rotation (pivot/unpivot)
- slowly changing dimension reporting
- star (or snowflake) schema targeting
- static, structured, running, and windowed aggregates
- discrete and operative value lookups
- format mass and other value modifications
- data cleansing
- data protection (masking, encryption, pseudonymization, etc.)
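As one illustration of the operations listed above, change data capture can be thought of as comparing an old and a new snapshot on a key and reporting what was inserted, deleted, or updated. The Python sketch below shows that comparison under simple assumptions: both snapshots fit in memory, and the key column `id` is hypothetical. It is not SortCL code and says nothing about how SortCL performs the task.

```python
def capture_changes(old_rows, new_rows, key="id"):
    """Compare two snapshots and return (action, row) pairs in key order."""
    old = {r[key]: r for r in old_rows}
    new = {r[key]: r for r in new_rows}

    changes = []
    for k in sorted(old.keys() | new.keys()):
        if k not in old:
            changes.append(("insert", new[k]))   # present only in new snapshot
        elif k not in new:
            changes.append(("delete", old[k]))   # present only in old snapshot
        elif old[k] != new[k]:
            changes.append(("update", new[k]))   # same key, different values
    return changes


# Hypothetical snapshots.
yesterday = [
    {"id": 1, "city": "Austin"},
    {"id": 2, "city": "Boston"},
    {"id": 3, "city": "Chicago"},
]
today = [
    {"id": 1, "city": "Austin"},
    {"id": 2, "city": "Boulder"},
    {"id": 4, "city": "Denver"},
]
changes = capture_changes(yesterday, today)
for action, row in changes:
    print(action, row)
```

Unchanged rows (here `id` 1) produce no output, so the result contains only the delta between the two snapshots.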
Use Existing Metadata
SortCL and related facilities in the CoSort package accept many third-party data layouts, e.g., DB DDL, COBOL copybooks, CSV, LDIF and XML files, CLF and ELF web logs, and SQL*Loader Control File metadata. SortCL job scripts contain SQL-familiar commands that use and/or reference the layouts.
Meta Integration Technology, Inc. (MITI) also has a metadata model bridge (MIMB) spoke to SortCL's data definition file format. If you have file layouts already defined for popular ETL tools like Informatica or DataStage, MIMB can automatically produce the equivalent layouts for use in SortCL. Accelerate those tools without having to manually redefine your metadata.
Interoperate & Accelerate
SortCL transformations work hand-in-hand with data extraction and loading utilities. SortCL can take piped data from IRI's Fast Extract (FACT) tool, and pipe it pre-sorted into database load utilities like SQL*Loader. SortCL can also connect through ODBC to other databases and Excel to acquire and deliver data.
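The extract, sort, load pipeline described above can be pictured as a stage that reads rows from one stream and writes them pre-sorted to the next. The Python sketch below is only a stand-in for such a stage: it is not FACT, SortCL, or SQL*Loader, the function name and the `id` key field are hypothetical, and it sorts values as text in memory rather than handling large volumes.

```python
import csv
import io


def sort_stream(src, dst, key_field, delimiter=","):
    """Read delimited rows with a header from src, write them sorted to dst."""
    reader = csv.DictReader(src, delimiter=delimiter)
    if reader.fieldnames is None:
        return 0  # empty input: nothing to pass downstream
    # Values are compared as strings; a real job would apply typed keys.
    rows = sorted(reader, key=lambda r: r[key_field])
    writer = csv.DictWriter(
        dst, fieldnames=reader.fieldnames,
        delimiter=delimiter, lineterminator="\n",
    )
    writer.writeheader()
    writer.writerows(rows)
    return len(rows)


# In-memory streams stand in for the upstream extract and downstream load.
extracted = io.StringIO("id,name\nb,Bravo\nc,Charlie\na,Alpha\n")
to_loader = io.StringIO()
row_count = sort_stream(extracted, to_loader, key_field="id")
print(to_loader.getvalue())
```

In a shell pipeline the same function could be called with `sys.stdin` and `sys.stdout`, so that the loader receives rows already in key order.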
SortCL transforms can run alongside ETL tools like Informatica and DataStage, to optimize their performance. SortCL jobs run on the command line, in batch scripts, from 3GL programs, via API calls, or in the IRI Workbench GUI, built on Eclipse. Easily embed these transforms to accelerate your applications.
SortCL exploits CoSort's granular performance tuning and flexible CPU licensing. IRI's continuing innovation in parallel data movement, I/O and memory management, and data manipulation functionality and consolidation, along with our industry partnerships, keeps you at the leading edge of big data transformation.