Value-Level Transformation Solutions for Format Masking, Surrogate Keys, Constants, etc.

Home » Solutions » Data Transformation » Formats/Keys

Quick Links

Overview Aggregate Cross-Calculate Custom Transforms Formats/Keys Lookup Match/Join Pivot/Unpivot Data Remapping Scrub/Cleanse Select/Filter Sort/Merge Substrings Type-Convert

Challenges

There are many lower-level, data-value-related problems that ETL architects face, such as:

Issue	Examples or Details
Applying masks to data formats to represent or re-interpret values	ID'inggender, 'F' or '2' or masking dates: mm/dd/yy to dd-mm-yyyy
Homogenizing inconsistent production data keys	assigning unique numeric surrogates for recognition and faster indexing
Pivoting and unpivoting	transposing columns to rows and vice versa without duplicates
Inserting constants or literal strings	replacing or adding substitute values to protect or pre/re-label data
Setting or labeling values	evaluating a condition to determine a default label like 'NULL'
Validating values	determining if the inbound field is printable, or of a certain data type
Re-casting values to conform to reference standards	unifying different data representations (like country names)
Slowly changing dimensions in source tables	propagating new values (like a price change) in warehouse tables

Solutions

The IRI Voracity ETL/ data management platform and the IRI CoSort tool powering Voracity address all the above data handling issues, plus:

key and other data transformations
data type and data format conversion
basic and advanced report generation
PII masking and custom test data synthesis

through their common 4GL data manipulation program called SortCL , which is also supported in their Eclipse GUI called IRI Workbench.

With SortCL, you can:

Remap dates and other data formats using new composite values, and re-usable data types
Re-assign source key names to adhere to new ontologies
Pivot (denormalize) for efficient queries, and unpivot(normalize) before loading data to the warehouse
Use in-line 'data' statements to add strings to any location, and with custom repetition counts
Use conditional data or field logic to label qualifying values, like inserts vs. deletes vs. updates, etc.
Use supplied data validation functions or third-party libraries to assess data characteristics or quality
Use integrated Perl Compatible Regular Expression (PCRE) logic to match and re-assert data patterns
Handle and report on multiple types of slowly changing dimensions and other changed data

IRI data manipulation tools like CoSortand RowGen (for synthetic test data generation) create unique initial or surrogate primary keys with features like:

SEQUENCER, a specially named index field with custom starting and increment values
random data generation, to create random numbers or strings, which can also be sorted and de-duplicated
ROWID, a field with any name but specified as this attribute with features beyond SEQUENCER
UUID (and GUID), a field-level transformation/generation function that can be used in any part of a job

Share this page

Request More Information

Live Chat

* indicates a required field.
IRI does NOT share your information.

Addressing Value-Level Issues

Data Format Conversion, Key Transformation, Constants, etc.

Quick Links

Request More Information

Solutions

Products

Customers

Services

Company

Support

News

Partners

Addressing Value-Level Issues

Data Format Conversion, Key Transformation, Constants, etc.

Quick Links

Request More Information

Follow us on

Get the IRI Newsletter