IRI Blog Articles

Diving Deeper into Data Management

 

 

RowGen v3’s Custom Test Data Wizard

by Rob Howard

A test data generator is an important part of the setup process when database architects are creating database and data warehouse operations, prototyping applications, benchmarking different platforms, and outsourcing work formats.  These architects should never use real production data in the testing process because that runs the risk of exposing personally identifiable information, proprietary information, or may not reflect the types of real data that can be encountered in the future.

IRI RowGen allows DB architects to build test data for an entire database in a single operation that supports referential integrity, applies custom formats, and, is safe. The populated test data reflects the kind of production values users would encounter in normal database or ETL operations.

In the latest version of RowGen, there are useful wizards that have been built in to the IRI Workbench GUI (built on Eclipse™) to simplify and streamline test data generation. One of the most useful wizards is the New Custom Data Job wizard. It allows the user to build an individual RowGen job with one or more custom data layouts, and to produce one or more targets all in the same pass  (while sorting on key fields). Users can build their own field/column layouts piece-by-piece, or they can incorporate existing metadata that already describes all or most of the test data structure.

RowGen makes the data as realistic as possible by invoking lists of real values from existing set files (such as from an existing database column of values in a text file), or by using any of the available mini-wizards for creating custom set files on the fly. Set files can consist of any combination of literal values and/or ranges. Values can be selected at random, or the user can ensure that all unique values in a set file are selected from top to bottom (to ensure uniqueness).

RowGen users can also select from a multi-column set file or database table to include valid pairs in the test data. The most basic form for values that the test data generator creates consists of random characters that conform to the declared data type (this does not require set file selection).

In the RowGen test data generation GUI, the vehicle for specifying new fields and their attributes, and managing existing fields, is found in the Layout Fields dialog which is shown below:

Here the user can see basic information about the custom layout, re-position or clear fields, and optionally declare a field separator and frame character. When an existing field is double-clicked, the Test Field Data Attributes dialog opens (see below) where the user can specify attributes for that field, including:  name, position within the record, data type, min- and max-size, upper- vs. lower-case letters, left vs. right alignment, etc. The dialog offers additional options for enhancing field-level realism such as value endianness, and the ability to specify a percentage of null values to be generated among the full set of values.

 

And from within the Test Field Data Attributes dialog, users can click ‘define’ to select an existing, or define a new, set file if they do not want random generation of values based solely on the declared data type.

In the New Custom Data Job wizard, there are several ways for the test data generator to add data realism to a field, including the following set file creation mini-wizards:

  • Range or Literal Values – Allows users to create their own literal and range entries, and save the custom list of entries to a set file. These can take the form of ASCII, NUMERIC, DATE or TIMESTAMP values.
  • Compound Data Values – Allows users to create their own composite values to be saved to a set file. It is a ‘builder’ that allows you to choose from all available data types, existing set files, and other field attribute options to create a customized list of entries in the format that conforms to your business rules, e.g.: part numbers:
1-5000-AQ5|123-45-6789
 B3-4956-ZV3|432-75-60893
 C9-4020-BG8|543-23-6644
  • Set from Column – Allows users to create a set file by selecting one or more columns from a database table/view that they select. Selection of multiple columns for inclusion in the set file is useful for valid pairs (threes, fours, etc.)

By default, all set files created with the above mini-wizards are invoked for the field currently being defined, in addition to being saved for future use, or by other applications in the test data generator.

A special field option called ROWID is also available if the user wants to provide a row number within each record, or some other steadily incrementing or decrementing value such as a customer ID number that could later be used as a key value during the optional sort phase. You can also select from the provided weighted distribution options, or apply a custom algorithm to any field to produce a range of values that best reflects the spread that would be encountered with real data.

Sorting and the Target Field Layout Editor

By default, the test data generator creates unsorted data. However, in RowGen, you can opt to continue to the sort phase while using the New Custom Test Data Job wizard, and declare one or more test data fields as sort keys as demonstrated below:

By default, a single target file is produced to contain the test data, and there is no need to make specifications in the OUTFILE section of a RowGen job script because the layout the users specify in the INFILE section is copied to the OUTFILE section by default. However, you can make changes in the OUTFILE section to one or more fields, such as applying protections, performing cross-calculations, using IF THEN else logic, etc., as shown here:

Functionality featured in the IRI CoSort and IRI FieldShield products is available if the user wishes to modify the OUTFILE section of a RowGen job. Additional targets, each designed for a specific recipient (based on business rules) with different layouts, or with certain field values, can be generated. These multiple targets (based on one or more custom layouts) are produced in the same pass during test data generation.

This and other functionality (including multiple set file creation dialogs), are available in IRI RowGen. For more information, see www.iri.com/products/rowgen.

Print Friendly

{ 0 comments… add one now }

Leave a Comment

Previous post:

Next post: