The native Unix sort command is functionally limited and does not scale well as input volumes increase. Operating system sort commands cannot:
- sort the largest files efficiently
- handle many data types
- filter, reformat or otherwise transform
- replace legacy sort functions or tools
- meet DW sort, aggregation or join needs
Nevertheless, you're familiar with Unix sort syntax, and may have an investment in jobs using /bin/sort commands. You need a more robust sort engine under the hood on Unix, or the same functionality on Windows.
IRI CoSort packages for Unix, Linux, and Windows include a faster, more robust drop-in replacement for the Unix /bin/sort program.
Use the same Unix sort syntax (but with the CoSort engine) on the command line, or in batch jobs, to reorder huge files in parallel. The CoSort engine outperforms the Unix system sort by several orders of magnitude, scales linearly in volume, and does not fail.
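As a minimal sketch of the kind of standard sort syntax referred to here, the following uses only POSIX sort options (-t, -k, -o). The file names and sample records are hypothetical, and the command runs whichever sort program is first on the PATH. Whether a given replacement accepts every one of these options is something to confirm against its own documentation.

```shell
# Build a small pipe-delimited sample file (hypothetical data for illustration).
printf '%s\n' \
  'widget|east|250' \
  'gadget|west|75' \
  'widget|west|1200' \
  'gizmo|east|75' > sales.txt

# Sort on field 3 as a number, descending, then on field 1 ascending to break ties.
# -t sets the field delimiter, -k selects a key (n = numeric, r = reverse),
# and -o names the output file. LC_ALL=C keeps the byte ordering predictable.
LC_ALL=C sort -t '|' -k3,3nr -k1,1 -o sales_sorted.txt sales.txt

cat sales_sorted.txt
```

Because the invocation lives in a single line of an ordinary script, a batch job written this way would in principle only need the sort program on its PATH to change, not the job itself.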
After seeing dramatic improvements in large file sorting speed, CoSort users move on to a more powerful interface - the Sort Control Language (SortCL) program. SortCL combines sorting operations with data:
- Transformation (scrub, sort, join, group, etc.)
- Conversion (data types, record layouts, files)
- Protection (field-level encryption, de-ID, etc.)
- Reporting (custom detail, delta and summaries)
SortCL also supports legacy sort migrations, Oracle unloading, ETL acceleration, and other related solutions outlined throughout this site, such as data validation and scrubbing, pattern matching, and complex transforms.