Matching data between large tables for query, reporting, or virtualization purposes can take a long time. SQL joins are typically inefficient in large-scale data integration (unification) operations. Custom programs designed to bring unlinked items together may also be slow or difficult to maintain.
You may also need a fast and easy way to compare two or more files on one or more fields. How do you do that and identify the changes (inserts, updates, deletes) between two files, especially when the data are in different file formats or in tables across different databases?
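To make the comparison problem concrete, here is a minimal sketch in generic Python (not SortCL syntax) of a keyed file compare that classifies inserts, updates, and deletes. The sample data, the `id` key field, and the helper names `load_keyed` and `diff_rows` are assumptions made up for illustration.

```python
import csv
import io

def load_keyed(text, key):
    """Read CSV text into a dict mapping the key field to the full row."""
    return {row[key]: row for row in csv.DictReader(io.StringIO(text))}

def diff_rows(old, new):
    """Classify changes between two keyed row sets."""
    inserts = sorted(k for k in new if k not in old)   # keys only in the new file
    deletes = sorted(k for k in old if k not in new)   # keys only in the old file
    updates = sorted(k for k in new if k in old and new[k] != old[k])  # same key, changed fields
    return inserts, updates, deletes

# Hypothetical sample data: two versions of the same file.
OLD = "id,name,city\n1,Ann,Austin\n2,Bob,Boston\n3,Cy,Chicago\n"
NEW = "id,name,city\n1,Ann,Austin\n2,Bob,Denver\n4,Di,Dallas\n"

inserts, updates, deletes = diff_rows(load_keyed(OLD, "id"), load_keyed(NEW, "id"))
print(inserts, updates, deletes)  # ['4'] ['2'] ['3']
```

This in-memory approach only suits files that fit in RAM, which is exactly the limitation that motivates an external, sort-based tool for large inputs.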
The Sort Control Language (SortCL) program in the IRI CoSort data transformation package and IRI Voracity data management (ETL) platform can filter, sort, join, aggregate, and reformat multiple table and file sources at once.
SortCL uses simple, explicit 4GL text files to define data sources, targets, and transformations. Automatic script creation, cross-platform execution, modification, and management are supported in the free Eclipse GUI, IRI Workbench.
SortCL supports inner and outer joins to produce combined outputs and file comparisons based on specified conditions. Input, join, and output one or more pre-sorted or unsorted tables and/or files.
Eliminate inner join results from an outer join to isolate the unmatched records. Eliminate and reformat null records.
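The join variants described above can be sketched in generic Python (again, not SortCL syntax). This is an illustrative hash join under assumed names: the `join` function, its `how` values, and the sample `customers` and `orders` records are all hypothetical.

```python
def join(left, right, key, how="inner"):
    """Join two lists of dicts on `key`.

    how="inner"     -> matched rows only
    how="left"      -> matched rows plus unmatched left rows (left outer join)
    how="left_only" -> unmatched left rows only (outer join minus inner results)
    """
    index = {}
    for r in right:
        index.setdefault(r[key], []).append(r)
    # Fields that an unmatched left row must be padded with.
    right_fields = sorted({f for r in right for f in r if f != key})
    out = []
    for l in left:
        matches = index.get(l[key], [])
        if matches and how != "left_only":
            out.extend({**l, **m} for m in matches)
        elif not matches and how in ("left", "left_only"):
            out.append({**l, **{f: None for f in right_fields}})
    return out

# Hypothetical sample data.
customers = [{"id": 1, "name": "Ann"}, {"id": 2, "name": "Bob"}, {"id": 3, "name": "Cy"}]
orders = [{"id": 1, "total": 50}, {"id": 1, "total": 25}, {"id": 3, "total": 10}]

print(len(join(customers, orders, "id", "inner")))   # 3
print(len(join(customers, orders, "id", "left")))    # 4
print(join(customers, orders, "id", "left_only"))    # [{'id': 2, 'name': 'Bob', 'total': None}]
```

The `None` padding in unmatched rows stands in for the null records mentioned above, which a downstream step could drop or replace with a default value.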
In the same simple job script and I/O pass, cross-calculate and derive new values from matched results. Add field-level data masking functions to sensitive fields. For output, custom-define multiple detail and summary report targets, and hand off selected information in different formats for data visualization tools.
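As a rough illustration of doing this work in a single pass, the following generic Python sketch (not SortCL syntax) derives a value from each joined row, masks a sensitive field, and builds both a detail and a summary target in one loop. The field names, the `mask` and `process` helpers, and the sample rows are assumptions for illustration only.

```python
from collections import defaultdict

def mask(value, keep=4):
    """Replace all but the last `keep` characters with asterisks."""
    cut = max(len(value) - keep, 0)
    return "*" * cut + value[cut:]

def process(joined):
    """Single pass over joined rows: derive, mask, and aggregate."""
    detail, totals = [], defaultdict(float)
    for row in joined:
        amount = row["qty"] * row["unit_price"]      # derived value
        detail.append({
            "customer": row["customer"],
            "card": mask(row["card"]),               # field-level masking
            "amount": amount,
        })
        totals[row["customer"]] += amount            # summary target
    return detail, dict(totals)

# Hypothetical joined rows.
joined = [
    {"customer": "Ann", "card": "4111111111111111", "qty": 2, "unit_price": 10.0},
    {"customer": "Ann", "card": "4111111111111111", "qty": 1, "unit_price": 5.0},
    {"customer": "Cy", "card": "5500000000000004", "qty": 3, "unit_price": 4.0},
]

detail, summary = process(joined)
print(detail[0]["card"])  # ************1111
print(summary)            # {'Ann': 25.0, 'Cy': 12.0}
```

Producing both targets from one loop mirrors the idea of a single I/O pass: the input is read once, and each output is written from the same stream of matched records.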
The bottom line? Joining big data in SortCL allows you to compare files and table data externally, capture changed data, produce business intelligence from it, and reduce database query and refresh overhead.
Did you know that IRI CoSort was the first data management product to join flat files?
IRI introduced joins in SortCL in 1999.