"The use of pre-sorted summaries (aggregates) is the single most effective tool the data warehouse designer has to control performance."
"The Data Warehouse Toolkit" - Dr. Ralph Kimball
The problem is speed in volume. SQL "group by" aggregations run in databases for star schemas and ETL operations drag as data volumes increase.
The SortCL program in the IRI CoSort package and IRI Voracity platform computes huge fact-table, drill-down, and roll-up aggregates with extraordinary ease and efficiency. SortCL combines parallel summarization with high-volume sort, join, and report operations in the same job script and I/O pass.
Use SortCL to rapidly produce output values derived from accumulated detail records -- at multiple break levels (including the final aggregate). The related functions are:
- Summarization (Totals)
- Standard Deviation
Display final values at the end of a file; i.e. roll-up aggregations, and exploit SortCL wherever large, simultaneous sorting and grouping work is required.
Group data based on inter- and intra-record Boolean break conditions for EIS summary, or drill-down analyses on many types of numeric data. This is a great feature for writing detail and summary reports as well as aggregate fact table records.
Reflect transaction aggregates based on various field combinations (e.g., total sales by SKU, employee, and location) by spinning off individual sales reports.
Format summary records differently at each level. Write each level to a separate target table or file, and merge the sub-levels into a structured report.
Other options include running (accumulating) aggregates, Sybase-style windowed aggregates, and aggregates on cross-calculated values. Use these features for ad hoc presentations and complex trend analyses, with (or without) all of the other transformations and formatting options.