Run Multiple Manipulations in One Fast Pass
How do you transform big data today? ETL tools, Perl scripts, custom programs, and PL/SQL procedures can be expensive and hard to maintain. Design, coding, and production can be slow. Instead, you want to put a chosen number of CPUs to work on big jobs, run several tasks in the same I/O pass, and dynamically allocate resources to optimize performance. Does your software help you do that?
The popular SortCL tool within IRI's CoSort package does the heavy lifting in the world's largest data warehouses, operational data stores, and clickstream data webhouses. SortCL sorts, joins, and aggregates massive files, speeding data warehouse operations, database reorgs, ranking, searching, and matching (via joins, lookups, and Perl-compatible regular expressions, or PCREs). At the same time, SortCL can create custom-formatted output reports and hand off pre-processed subsets in CSV and XML formats that your data marts and BI tools can handle.
You can also use the CoSort SortCL tool for high-volume transformations (and loads) alongside ETL tools like DataStage and Informatica. The CoSort package also provides exclusive 'plug-in' technology that replaces the native sorts in those tools with powerful CoSort V9 engines on Unix and Windows.
Designed for big data integration, staging, and reporting on files and other sequential data sources, SortCL performs, combines, and accelerates:
• Table Lookups
• User Functions
All of this happens in a single pass through the data, and often in just one SortCL job script!
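To make the single-pass idea concrete, here is a minimal Python sketch. It is not SortCL syntax and does not show how SortCL works internally; it only illustrates reading each record once while filtering, looking up, and aggregating at the same time. The field names, the REGION_NAMES table, and the sample data are all hypothetical.

```python
import csv
import io
from collections import defaultdict

# Hypothetical lookup table: region code -> region name.
REGION_NAMES = {"E": "East", "W": "West"}

def single_pass(rows):
    """Read each record once; filter, look up, and aggregate as it streams by."""
    totals = defaultdict(float)   # aggregate target: amount per region
    detail = []                   # detail target: filtered, enriched rows
    for row in rows:
        amount = float(row["amount"])
        if amount <= 0:           # filter: drop non-positive amounts
            continue
        region = REGION_NAMES.get(row["region"], "Unknown")  # table lookup
        totals[region] += amount                             # aggregation
        detail.append({"id": row["id"], "region": region, "amount": amount})
    return detail, dict(totals)

# Hypothetical input; a real job would stream from a file instead.
SAMPLE = "id,region,amount\n1,E,10.5\n2,W,4.0\n3,E,-2.0\n4,X,1.5\n"
detail, totals = single_pass(csv.DictReader(io.StringIO(SAMPLE)))
print(totals)
```

The point of the sketch is that the detail target and the summary target are both produced from one read of the source, rather than from one read per task.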
SortCL can process large volumes of data in many different tables and flat files together. It uses the common field names in your sources for the mappings, exposing your data and manipulation definitions in simple text file metadata repositories.
Some of the more specialized functions SortCL users can perform using these transformations include:
- change data capture
- row-column pivoting
- slowly changing dimension reporting
- star (or snowflake) schema targeting
- windowed aggregates
- discrete and operative value lookups
- data cleansing.
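As an illustration of the first item in this list, change data capture can be thought of as comparing two snapshots of the same keyed data and classifying the differences. The Python sketch below is a generic, simplified version of that idea under assumed inputs (two in-memory dictionaries keyed by a hypothetical primary key); it is not SortCL code and says nothing about how SortCL implements the feature.

```python
def capture_changes(old, new):
    """Compare two snapshots keyed by primary key and classify each difference."""
    # Keys present only in the new snapshot are inserts.
    inserts = {k: new[k] for k in new.keys() - old.keys()}
    # Keys present only in the old snapshot are deletes.
    deletes = {k: old[k] for k in old.keys() - new.keys()}
    # Keys in both snapshots whose values differ are updates.
    updates = {k: new[k] for k in new.keys() & old.keys() if new[k] != old[k]}
    return inserts, deletes, updates

# Hypothetical snapshots: customer id -> name.
old = {1: "Ann", 2: "Bob", 3: "Cy"}
new = {1: "Ann", 2: "Rob", 4: "Dee"}
inserts, deletes, updates = capture_changes(old, new)
print(inserts, deletes, updates)
```

For files too large to hold in memory, the same classification is commonly done by sorting both snapshots on the key and merging them, which is where a fast sort and join become relevant.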
In addition, SortCL simultaneously supports a range of data security functions for protecting data in motion, including: field-level encryption, de-identification, and pseudonymization. For prototyping data and applications, SortCL also randomly generates or randomly selects values from set files to produce safe test data in production file or report formats. SortCL can run these protection and prototyping functions in the same job script and I/O pass with the above transformations!
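For readers unfamiliar with field-level pseudonymization, the following Python sketch shows one common generic approach: replacing a sensitive field with a keyed hash so that the same input always maps to the same token. This is an assumption chosen for illustration, not a description of SortCL's own protection functions; the key, field names, and token length are hypothetical.

```python
import hashlib
import hmac

# Hypothetical demo key; a real key would be stored and managed securely.
SECRET_KEY = b"demo-key-not-for-production"

def pseudonymize(value, key=SECRET_KEY, length=12):
    """Return a stable token for a value using HMAC-SHA256, truncated for readability."""
    digest = hmac.new(key, value.encode("utf-8"), hashlib.sha256).hexdigest()
    return digest[:length]

def protect(record, fields):
    """Pseudonymize only the named fields; pass all other fields through unchanged."""
    return {k: pseudonymize(v) if k in fields else v for k, v in record.items()}

# Hypothetical record with one sensitive field.
record = {"name": "Ann Smith", "city": "Melbourne"}
masked = protect(record, fields={"name"})
print(masked)
```

Because the token is deterministic for a given key, pseudonymized fields can still be used as join or grouping keys across files.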