realistic test data Archives

ETL

Smart, Safe Test Data for DevOps, MLOps and DataOps

by David Friedland

Data flowing through application development, machine learning, and analytic pipelines must address several needs common to all three, including:

- Realism, to reflect production data characteristics and application test requirements;
- Compliance, with business and data privacy rules, plus DB and analytic models;
- Availability, or security, of the data (depending on your perspective); and
- Auditability, for lineage and accountability.

Read More

Data Transformation

ETL Task Testing with the IRI Voracity Preview Feature

by Cyrena Pritchard

During the design of IRI Voracity workflows in the IRI Workbench (Eclipse) GUI, you can preview the results of one or more transforms before saving or running the project. Read More

Test Data

Test Phone Number Generation: Using RowGen’s Compound Data Value…

by Chaitali Mitra

One of the ways IRI RowGen builds realistic test data is through the formation and population of custom field values, such as phone numbers. In this article, we explain how to use the Compound Data Value (CDV) wizard in the RowGen GUI to build a set file containing real-looking, US phone numbers based on the North American Numbering Plan (NANP). Read More
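As a rough illustration of the idea (not the CDV wizard itself, which is a GUI feature), the sketch below builds NANP-shaped numbers in Python and writes them to a text file with one value per line. The function names, the output layout, and the one-value-per-line file format are assumptions made for this example. The only numbering rules it applies are that area codes and exchange codes start with a digit from 2 to 9 and do not end in "11".

```python
import random

def random_nanp_number(rng: random.Random) -> str:
    """Return a NANP-shaped test value such as '(415) 736-0142'."""
    while True:
        area = rng.randint(200, 999)       # area code: first digit 2-9
        if area % 100 != 11:               # skip N11 service codes (411, 911, ...)
            break
    while True:
        exchange = rng.randint(200, 999)   # exchange code: first digit 2-9
        if exchange % 100 != 11:           # N11 is not used as an exchange
            break
    line = rng.randint(0, 9999)            # line number: 0000-9999
    return f"({area:03d}) {exchange:03d}-{line:04d}"

def write_phone_set_file(path: str, count: int, seed: int = 0) -> None:
    """Write `count` generated numbers to `path`, one per line (assumed layout)."""
    rng = random.Random(seed)              # seeded so test runs are repeatable
    with open(path, "w", encoding="utf-8") as fh:
        for _ in range(count):
            fh.write(random_nanp_number(rng) + "\n")
```

Calling `write_phone_set_file("phones.set", 1000)` would produce a repeatable file of 1,000 values. Some generated numbers may coincide with numbers in real use, so a stricter variant could restrict the exchange and line to the 555-0100 through 555-0199 block reserved for fictional use.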

Big Data

CLF and ELF Web Log Processing

by Chaitali Mitra

This article is the second in a 3-part series on CLF and ELF web log data. We previously explained the CLF and ELF web log formats, and now introduce IRI solutions for manipulating and using web log data. Read More

IRI Workbench

Populating Teradata with Realistic Test Data De Novo

by Claudia Irvine

There are a variety of testing requirements for any data warehouse and database — and especially dual platforms like Teradata — where ETL and BI prototypes, application stress testing, and performance benchmarking are essential. Read More

Test Data

Creating Multi-byte Test Data

by Paul Friedland

RowGen produces test values in target database tables, flat files, and custom reports through either random data generation (based on the defined data type) or random selection (using different random pull methods) from data in “set files.” Read More
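The difference between the two approaches can be sketched in a few lines of Python. This is an illustration only, not RowGen syntax: the function names, the in-memory list standing in for a set file, and the two "pull methods" shown (with and without replacement) are assumptions chosen for the example.

```python
import random

# Hypothetical stand-in for a set file: one candidate value per entry.
SET_VALUES = ["東京", "大阪", "München", "São Paulo"]

def generate_random(rng: random.Random, length: int = 6) -> str:
    """Random generation: synthesize a value from a character range.

    Here the 'data type' is assumed to be hiragana text (U+3041..U+3096),
    so every character occupies three bytes in UTF-8.
    """
    return "".join(chr(rng.randint(0x3041, 0x3096)) for _ in range(length))

def select_with_replacement(rng: random.Random, values: list, count: int) -> list:
    """Random selection: the same value may be pulled more than once."""
    return [rng.choice(values) for _ in range(count)]

def select_without_replacement(rng: random.Random, values: list, count: int) -> list:
    """Random selection: each value is pulled at most once (count <= len(values))."""
    return rng.sample(values, count)
```

Generation gives unlimited variety but no guarantee of meaningful values, while selection from a curated list keeps values realistic at the cost of a fixed vocabulary.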

Test Data

RowGen v3 Automates Database Test Data Generation

by Andrew Allen

The value of good test data to DBAs is well known:

“Testing of database-intensive applications has unique challenges that stem from hidden dependencies, subtle differences in data semantics, target database schemes, and implicit business rules. Read More

IRI Business

Test Data Management: Test Data Needs Assessment (Step 2…

by David Friedland

This article is part of a 4-step series introduced here. Navigation between articles is below.

Step 2: Test Data Needs Assessment

Once the questions of who needs test data for what, and who will be dealing with it along its lifecycle, are answered (see Step 1), a deeper dive is needed into the specific technical aspects of the data itself. Read More

Test Data

Test Data Management: Goal Setting & Team Building (Step…

by David Friedland

This article is part of a 4-step series introduced here. Navigation between articles is below.

Step 1: Goal Setting & Team Building

Someone needs test data to do something, like:

- stress-testing the functions and performance of applications
- prototyping database load/query and DW ETL/ELT operations
- benchmarking prospective new hardware or software
- outsourcing development or proofs of concept
- demonstrating systems with real-looking, but not real, sample data

In all these cases, the most realistic data possible is needed, but it should also be safe and de-personalized. Read More

Test Data

Test Data Management: A Primer

by David Friedland

Welcome to IRI’s primer on test data management. This is the opening article, which is followed by a 4-step series.

Introduction

As anyone familiar with the challenges of healthcare.gov Read More

