One of the ways IRI RowGen builds realistic test data is by forming and populating custom field values, such as phone numbers. In this article, we explain how to use the Compound Data Value (CDV) wizard in the RowGen GUI to build a set file containing real-looking US phone numbers based on the North American Numbering Plan (NANP).
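To make the idea of a compound value concrete, here is a minimal Python sketch of how an NANP-shaped number can be assembled from three parts: area code, central office code, and line number. It is not the CDV wizard or any RowGen API. The function names, the `(NPA) NXX-XXXX` output format, and the one-value-per-line file layout are assumptions made for illustration. The rules applied are the basic NANP digit constraints: area and central office codes start with 2-9, N11 service codes are skipped, and the middle digit of the area code is kept to 0-8.

```python
import random


def random_nanp_number(rng):
    """Return one synthetic phone number shaped like (NPA) NXX-XXXX."""
    while True:
        # Area code (NPA): first digit 2-9, middle digit 0-8, last digit 0-9.
        npa = f"{rng.randint(2, 9)}{rng.randint(0, 8)}{rng.randint(0, 9)}"
        if npa[1:] != "11":  # skip N11 service codes such as 411 or 911
            break
    while True:
        # Central office code (NXX): first digit 2-9, other digits 0-9.
        nxx = f"{rng.randint(2, 9)}{rng.randint(0, 9)}{rng.randint(0, 9)}"
        if nxx[1:] != "11":  # N11 is not used as a central office code
            break
    # Line number: any four digits, zero-padded.
    line = f"{rng.randint(0, 9999):04d}"
    return f"({npa}) {nxx}-{line}"


def write_set_file(path, count, seed=None):
    """Write `count` numbers to `path`, one per line (assumed layout)."""
    rng = random.Random(seed)  # a seed makes the output repeatable
    with open(path, "w", encoding="utf-8") as fh:
        for _ in range(count):
            fh.write(random_nanp_number(rng) + "\n")
```

Calling `write_set_file("phones.set", 1000, seed=42)` (a hypothetical file name) would produce a repeatable list of 1,000 values. The sketch only checks that numbers are well formed. It does not check whether an area code is actually assigned, and a generated value could coincide with a real subscriber's number.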
There are a variety of testing requirements for any data warehouse and database — and especially dual platforms like Teradata — where ETL and BI prototypes, application stress testing, and performance benchmarking are essential.
The value of good test data to DBAs is well known:
“Testing of database-intensive applications has unique challenges that stem from hidden dependencies, subtle differences in data semantics, target database schemes, and implicit business rules.
This article is part of a 4-step series introduced here. Navigation between articles is below.
Step 2: Test Data Needs Assessment
Once the questions of who needs test data for what, and who will deal with it along its lifecycle, are answered (see Step 1), a deeper dive is needed into the specific technical aspects of the data itself.
Step 1: Goal Setting & Team Building
Someone needs test data to do something, like: stress-testing the functions and performance of applications; prototyping database load/query and DW ETL/ELT operations; benchmarking prospective new hardware or software; outsourcing development or proofs of concept; or demonstrating systems with real-looking, but not real, sample data.
In all these cases, the most realistic data possible is needed, but it should also be safe and de-personalized.
Database and solution architects depend on realistic test data to: help create new databases; prototype ETL jobs or applications; benchmark performance in new or existing platforms; stress-test systems; and protect confidential information in existing systems if database work is outsourced or used for demonstrations.
Production data risks exposing personally identifiable information (PII) or proprietary information, and it may not reflect the types or volume of real data that can be encountered in the future.