SortCL metadata begins with data definition files (DDF) that specify the structure and appearance of disparate, structured data sources and targets -- usually files or database tables.
Create .DDF files manually, or generate them automatically with the metadata discovery wizard in the IRI Workbench GUI. Convert third-party file layouts to SortCL DDF using command-line utilities in the CoSort package, or from the GUI.
Simple Data Definition
A .DDF file for each data source contains a list of column or field specifications, for example:
/FIELD=(Lastname, POSITION=1, SIZE=14, TYPE=EBCDIC)
Such straightforward, self-documenting syntax is easy to learn, use, and audit. These same "/FIELD=" statements identify the following column attributes:
- Names or aliases
- Sizes or ranges
- Positions or delimiters
- Data types, substrings, and composite structures
- Conditions, expressions, and lookups
- Aggregation, cleansing, security, and other functions
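To make the POSITION, SIZE, and TYPE attributes in the /FIELD example above concrete, the following Python sketch parses that one statement and applies it to a fixed-position record. It is not SortCL code or part of any IRI tool: the helpers `parse_field` and `extract` are hypothetical, the parser handles only simple KEY=VALUE attributes, and code page 037 is assumed as a stand-in for TYPE=EBCDIC.

```python
import re


def parse_field(statement):
    """Parse a '/FIELD=(Name, KEY=VALUE, ...)' line into a dict (simplified)."""
    match = re.fullmatch(r"/FIELD=\((.*)\)", statement.strip())
    if match is None:
        raise ValueError(f"not a /FIELD statement: {statement!r}")
    name, *attributes = [part.strip() for part in match.group(1).split(",")]
    spec = {"NAME": name}
    for attribute in attributes:
        key, _, value = attribute.partition("=")
        spec[key.strip().upper()] = value.strip()
    return spec


def extract(record, spec):
    """Slice one fixed-position field out of a record of raw bytes."""
    start = int(spec["POSITION"]) - 1  # POSITION is treated as 1-based
    raw = record[start:start + int(spec["SIZE"])]
    # Assumption: code page 037 stands in for TYPE=EBCDIC in this sketch.
    encoding = "cp037" if spec.get("TYPE") == "EBCDIC" else "ascii"
    return raw.decode(encoding).rstrip()


spec = parse_field("/FIELD=(Lastname, POSITION=1, SIZE=14, TYPE=EBCDIC)")

# Hypothetical 24-byte record: a 14-byte last name followed by a 10-byte first name.
record = "Lincoln".ljust(14).encode("cp037") + "Abraham".ljust(10).encode("cp037")
lastname = extract(record, spec)
```

Here `lastname` holds the first 14 bytes of the record, decoded and stripped of trailing padding.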
SortCL job specification files (.SCL), or scripts, define all the mappings and manipulations of the fields. Their field (column) layouts are typically stored in reusable DDF repositories. You can also specify /FIELD layouts directly in the input (source), inrec (pre-action layout), or output (target) phase of any IRI job script.
SortCL jobs use source field names as the symbolic references needed to remap fields and modify their attributes, for example:
/JOIN Left Right WHERE Left.Lastname=Right.Lname
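The matching condition in the /JOIN statement above can be illustrated outside SortCL. The Python sketch below pairs rows from two sources where `Left.Lastname` equals `Right.Lname`. It assumes inner-join semantics and a hash-based lookup, which is not necessarily how SortCL matches records. The sample rows and the `inner_join` helper are hypothetical.

```python
def inner_join(left_rows, right_rows, left_key, right_key):
    """Pair every left row with each right row whose key value matches."""
    # Index the right source by its key so each left row needs one lookup.
    index = {}
    for row in right_rows:
        index.setdefault(row[right_key], []).append(row)

    joined = []
    for left_row in left_rows:
        for right_row in index.get(left_row[left_key], []):
            combined = {f"Left.{name}": value for name, value in left_row.items()}
            combined.update({f"Right.{name}": value for name, value in right_row.items()})
            joined.append(combined)
    return joined


# Hypothetical sample data for the two sources named Left and Right.
left = [
    {"Lastname": "Lincoln", "Party": "Republican"},
    {"Lastname": "Grant", "Party": "Republican"},
]
right = [
    {"Lname": "Lincoln", "State": "Illinois"},
    {"Lname": "Adams", "State": "Massachusetts"},
]

joined = inner_join(left, right, "Lastname", "Lname")
```

Prefixing each output column with its source name mirrors the way the symbolic field names qualify which source a field comes from.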
Data & Application Independence
SortCL metadata also supports data and application independence because the job commands can be kept separate from the centralized, reusable DDF field statements. Nested specifications and references allow multiple levels of data and application independence.
Multiple target layouts within the same job can be specified to reflect, and output, multiple views of the data.
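The idea of keeping layouts separate from job logic, and of writing several views of the same data in one pass, can be sketched in Python. This is an illustration under assumptions, not SortCL syntax: the layout dictionaries `FIXED_VIEW` and `CSV_VIEW` and the `render` function are hypothetical stand-ins for target field layouts and the job that applies them.

```python
# Layout definitions live apart from the job logic, much like reusable metadata.
# Each field is (name, size); size is ignored for delimited output.
FIXED_VIEW = {"fields": [("Lastname", 14), ("State", 10)], "separator": None}
CSV_VIEW = {"fields": [("State", None), ("Lastname", None)], "separator": ","}


def render(row, layout):
    """Format one row according to a layout; the logic knows no field names."""
    if layout["separator"] is None:
        # Fixed-position output: pad or truncate every field to its size.
        return "".join(
            str(row[name]).ljust(size)[:size] for name, size in layout["fields"]
        )
    return layout["separator"].join(str(row[name]) for name, _ in layout["fields"])


# Hypothetical source rows; both views are produced in a single pass over them.
rows = [{"Lastname": "Lincoln", "State": "Illinois", "Party": "Republican"}]

fixed_lines = []
csv_lines = []
for row in rows:
    fixed_lines.append(render(row, FIXED_VIEW))
    csv_lines.append(render(row, CSV_VIEW))
```

Changing a view means editing only its layout definition. The `render` logic stays the same, which is the kind of independence described above.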
Data architects can share, version-control, secure, and track the lineage of this metadata in distributed EGit or similar repositories when those repositories are linked through IRI Workbench, the Eclipse GUI that manages SortCL and other IRI jobs.
SortCL metadata also supports master data definition and management. Existing or custom-defined master data formats held in files or tables can be specified in .DDF repositories, and then used in SortCL applications.
From a compliance standpoint, SortCL metadata supports field-level protection functions like masking and encryption while you are preparing data for analysis. SortCL metadata, and its inclusion in audit logs, allow you to map these processes, and help satisfy the requirements of your risk and controls framework.
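As a rough illustration of field-level protection, the Python sketch below masks one named field while passing the others through unchanged. It does not use or reproduce SortCL's own masking or encryption functions: `mask_field`, `protect`, the rule table, and the sample row are all hypothetical.

```python
def mask_field(value, visible=4, mask_char="*"):
    """Replace all but the last `visible` characters with `mask_char`."""
    hidden = max(len(value) - visible, 0)
    return mask_char * hidden + value[hidden:]


def protect(row, rules):
    """Apply a protection function to each field that has a rule."""
    return {
        name: rules[name](value) if name in rules else value
        for name, value in row.items()
    }


# Hypothetical record and rule table: only the SSN field is masked.
row = {"Lastname": "Lincoln", "SSN": "123-45-6789"}
masked = protect(row, {"SSN": mask_field})
```

Because the rules are keyed by field name, the protection applied to each column stays visible in one place, which is what makes such processing easy to map and audit.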
SortCL's metadata and operational infrastructure allow you to replace multiple processes with one product, job script, and I/O pass. Simplicity, transparency, compliance, and auditability go along for the ride.