High-Volume Data Transformation 
Run Multiple Manipulations in One Fast Pass

How do you handle high-volume data transformation today? ETL tools, Perl scripts, custom programs, and PL/SQL procedures can be expensive and hard to maintain, and design, coding, and production can be slow. Instead, you want to exploit a specified number of CPUs for big jobs, run several tasks in the same I/O pass, and dynamically allocate resources to optimize performance. Does your software help you do that?

The popular SortCL tool within IRI's CoSort package does the heavy lifting in the world's largest data warehouses, operational data stores and clickstream data webhouses. SortCL sorts, joins, and aggregates massive files, speeding data warehouse operations, database reorgs, ranking, searching, and matching (via joins, lookups and PCREs). At the same time, SortCL can create custom-formatted output reports and hand off pre-processed subsets that your data marts and BI tools can handle.

You can also use the CoSort SortCL tool for high-volume transformations (and loads) alongside ETL tools like DataStage and Informatica. The CoSort package also provides exclusive 'plug-in' technology to replace the native sorts in those tools with powerful CoSort V9 engines on Unix and Windows.

Designed for high-volume data integration, staging, and reporting on files and other sequential data sources, SortCL performs, combines, and accelerates these transformations:

Select/Filter
Sort/Merge
Match/Join
Aggregate
Cross-Calculate
Re-Map/Reformat
Scrub/Cleanse
Substrings
Table Lookups
Type-Convert
Encrypt/De-ID
User Functions

-- all in a single pass through the data, and often in just one SortCL job script!
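
To make the single-pass idea concrete, here is a minimal Python sketch. It is not SortCL syntax and does not use any CoSort API. The function name, field names (`region`, `amount`), and sample data are all hypothetical. It only shows how filtering, type conversion, cleansing, reformatting, and aggregation can share one read of the input instead of running as separate jobs.

```python
import csv
import io
from collections import defaultdict


def transform_in_one_pass(lines, min_amount=0.0):
    """Filter, convert, cleanse, reformat, and aggregate in one read of the input.

    Illustrative only: the field names 'region' and 'amount' are assumptions.
    """
    totals = defaultdict(float)  # aggregate: running sum per region
    detail = []                  # reformatted detail records
    for row in csv.DictReader(lines):
        amount = float(row["amount"])            # type-convert
        if amount < min_amount:                  # select/filter
            continue
        region = row["region"].strip().upper()   # scrub/cleanse
        totals[region] += amount                 # aggregate
        detail.append(f"{region:<6}{amount:>10.2f}")  # re-map/reformat
    return detail, dict(totals)


# Hypothetical sample input standing in for a large flat file.
sample = io.StringIO(
    "region,amount\n east ,10.50\nwest,3.00\neast,4.50\nwest,0.25\n"
)
detail, totals = transform_in_one_pass(sample, min_amount=1.0)
```

Each record is touched once, so the cost of reading the data is paid a single time no matter how many manipulations are applied to it.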

SortCL processes large volumes of data in many different files and formats at once. It uses the common field names between your flat and index file sources for the mappings, exposing your data and manipulation definitions in simple text file metadata repositories.

In addition, SortCL simultaneously supports a range of data security functions for protecting data in motion, including: field-level encryption, de-identification, and pseudonymization. For prototyping data and applications, SortCL also randomly generates or randomly selects values from set files to produce safe test data in production file or report formats. SortCL can run these protection and prototyping functions in the same job script and I/O pass with the above transformations!
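
As a rough illustration of protection and prototyping in the same pass, the Python sketch below pseudonymizes one field with a keyed hash and replaces another with a random pick from a set of safe values. It is a generic example under stated assumptions, not SortCL's own functions or algorithms. The function names, the record layout (`customer_id`, `name`, `amount`), the key, and the name pool are hypothetical.

```python
import hashlib
import hmac
import random


def pseudonymize(value, key):
    """Keyed hash: the same input and key always yield the same token."""
    digest = hmac.new(key, value.encode("utf-8"), hashlib.sha256).hexdigest()
    return digest[:16]  # shortened for readability; an illustrative choice


def protect_records(records, key, name_pool, rng):
    """Apply field-level protection while passing other fields through."""
    protected = []
    for rec in records:
        protected.append({
            "customer_id": pseudonymize(rec["customer_id"], key),
            "name": rng.choice(name_pool),  # random selection from a set of safe values
            "amount": rec["amount"],        # non-sensitive field is left unchanged
        })
    return protected


# Hypothetical inputs for demonstration only.
records = [
    {"customer_id": "C-1001", "name": "Alice Smith", "amount": "19.99"},
    {"customer_id": "C-1001", "name": "Alice Smith", "amount": "5.00"},
]
safe_names = ["Test User A", "Test User B", "Test User C"]
result = protect_records(records, b"demo-key", safe_names, random.Random(0))
```

Because the keyed hash is deterministic, repeated customer IDs map to the same token, so joins and aggregates on the protected field still line up.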
