Test Data Automation

 


Challenges


Applications and databases have their own logic and unique properties. To have realistic test data for software testing, you must make sure it reflects characteristics of production data, such as:

  • selection conditions (business rules)
  • column attributes and transformations
  • inter-field/key relationships (referential integrity)
  • value ranges and inter-column calculations

Production-quality test data must also have these attributes:

  • type - correct column / field values and formats
  • width - values with current (and future) ranges
  • frequency - realistic value occurrence patterns
  • depth - volumes that address scalability concerns
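The four attributes above can be checked mechanically before a test set is handed to an application. The following is a minimal Python sketch, not IRI syntax; the sample rows, the column names, and the `profile_column` helper are hypothetical and exist only for illustration.

```python
from collections import Counter

# Hypothetical sample rows; in practice these would come from a generated file or table.
rows = [
    {"id": 1, "state": "FL", "amount": 120.5},
    {"id": 2, "state": "FL", "amount": 75.0},
    {"id": 3, "state": "NY", "amount": 980.25},
    {"id": 4, "state": "CA", "amount": 15.0},
]


def profile_column(rows, column, expected_type, low, high):
    """Summarize one column by type, width (range), frequency, and depth (row count)."""
    values = [row[column] for row in rows]
    return {
        "type_ok": all(isinstance(v, expected_type) for v in values),
        "width_ok": all(low <= v <= high for v in values),
        "frequency": Counter(values),
        "depth": len(values),
    }


amount_report = profile_column(rows, "amount", float, 0.0, 1000.0)
state_report = profile_column(rows, "state", str, "AA", "ZZ")
print(amount_report["type_ok"], amount_report["width_ok"], amount_report["depth"])
print(state_report["frequency"].most_common(1))
```

A report like this can be compared against the same profile taken from production metadata to see where a test set falls short.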

In fast-paced DevOps and Continuous Integration (CI) / Continuous Deployment (CD) environments, generating consistent, realistic test sets on demand, in widely varying formats and volumes, can be a tall order that distracts programmers facing tight delivery timelines. And of the many DevOps test data management tools available, few are sufficiently robust, ergonomic, or affordable.

Solutions


Applications developed with realistic data formats and volumes are more likely to succeed in production. The IRI RowGen test data generation tool uses production metadata to synthesize custom test sets with randomly generated data, randomly selected data from production sources, or both. The IRI FieldShield and IRI DarkShield data masking tools can also be used to find and mask PII and other sensitive data in production, and to create secure targets in lower environments for testing.

To generate the right values and value ranges, RowGen uses conditional selection and formatting parameters. RowGen further enhances test data realism through referential integrity, frequency distributions, and built-in transformation and formatting functions. For example, you can randomly select values from pools of real data, and specify ranges of weighted numbers.
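To make these ideas concrete, here is a generic Python sketch of the same techniques: selection from a pool of real values, weighted frequency distribution, a numeric range, and referential integrity between a parent and a child table. It is not RowGen job syntax; the table layouts, the value pool, the weights, and the `generate_test_set` function are all assumptions made for illustration.

```python
import random


def generate_test_set(seed=42, n_customers=5, n_orders=20):
    """Build two related tables; a fixed seed keeps the output repeatable across runs."""
    rng = random.Random(seed)

    # Stands in for a pool ("set file") of real values to select from.
    name_pool = ["Ana", "Ben", "Chen", "Dara"]

    # Weighted selection: roughly 70% basic, 25% plus, 5% premium.
    tiers = ["basic", "plus", "premium"]
    tier_weights = [70, 25, 5]

    customers = [
        {
            "customer_id": i,
            "name": rng.choice(name_pool),
            "tier": rng.choices(tiers, weights=tier_weights, k=1)[0],
        }
        for i in range(1, n_customers + 1)
    ]

    # Referential integrity: every order points at an existing customer key.
    customer_ids = [c["customer_id"] for c in customers]
    orders = [
        {
            "order_id": j,
            "customer_id": rng.choice(customer_ids),
            "amount": round(rng.uniform(5.0, 500.0), 2),  # value range
        }
        for j in range(1, n_orders + 1)
    ]
    return customers, orders


customers, orders = generate_test_set()
print(len(customers), len(orders))
```

Seeding the generator is a design choice that suits CI: the same seed yields the same rows, so a failing test can be reproduced exactly.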

For test data automation in DevOps environments, also known as Continuous Integration and Continuous Deployment or Delivery (CI/CD) environments, RowGen can synthesize test data at any step in a development process without depending on data being made available from another step. See linked examples of automated test data provisioning for DevOps at the bottom of this page.

Moreover, and uniquely, embedded data transformation, validation, and formatting functions can run simultaneously against the generated data in the same script. This can facilitate incremental application tests that assure backwards compatibility and forward compliance with your production releases. See this use case.
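The general pattern of generating, transforming, formatting, and validating each record in a single pass can be sketched in Python as follows. This only illustrates the idea and does not reproduce how RowGen scripts are written; the field names, formats, and the `build_rows` function are hypothetical.

```python
import random
import re

ZIP_PATTERN = re.compile(r"^\d{5}$")


def build_rows(count, seed=7):
    """Generate, format, and validate each row in the same loop."""
    rng = random.Random(seed)
    rows = []
    for i in range(1, count + 1):
        # Step 1: generate raw values.
        raw_zip = rng.randint(501, 99950)
        raw_balance = rng.uniform(-100.0, 100.0)

        # Step 2: transform and format for the target layout.
        row = {
            "id": i,
            "zip": f"{raw_zip:05d}",          # zero-padded to five digits
            "balance": f"{raw_balance:.2f}",  # fixed two decimal places
        }

        # Step 3: validate before the row is emitted.
        if not ZIP_PATTERN.match(row["zip"]):
            raise ValueError(f"row {i}: bad zip {row['zip']!r}")
        if not -100.0 <= float(row["balance"]) <= 100.0:
            raise ValueError(f"row {i}: balance out of range")

        rows.append(row)
    return rows


rows = build_rows(10)
print(rows[0])
```

Running the same validation rules against each release's layout is one way to catch a format change before it reaches production.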

Combine random generation and set-file selection, field-level conditions and manipulations, and custom layout features. Rapidly build the intelligent data you need to stress-test and vet your applications. Improve the quality and reliability of your deliverables. Schedule jobs to repeat generation and testing operations in IRI Workbench (or your own CLI-supporting automation tool) to smooth out CI and enable CD processes.

If you still prefer to test with real data (from production DBs or files), you can also use RowGen to quickly subset it, or you can mask an even broader range of sources with IRI FieldShield and IRI DarkShield. If you need multiple capabilities, or need to virtualize test data from a variety of static or streaming sources, check out the IRI Voracity data management platform and partners like ValueLabs, whose Voracity-supporting Test Data Hub provides test data on demand.


And if you use an existing CI/CD pipeline, you can call IRI software directly to feed masked, subsetted, or synthesized data into it. See the examples of test data generation for DevOps done in Amazon CodePipeline, Azure DevOps, GitLab, and Jenkins. If you use another test data automation framework, ask us about it!
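A pipeline step that calls a command-line job and fails the build when the job fails might look like the Python sketch below. The `run_provisioning_step` helper is hypothetical, and the command shown is only a placeholder that prints a message; substitute whatever job invocation your own pipeline uses, since no IRI command names or flags are assumed here.

```python
import subprocess
import sys


def run_provisioning_step(command, timeout_seconds=600):
    """Run a test data job as a subprocess and raise if it exits with an error."""
    result = subprocess.run(
        command,
        capture_output=True,
        text=True,
        timeout=timeout_seconds,
    )
    if result.returncode != 0:
        # Raising here makes the CI stage fail instead of testing against stale data.
        raise RuntimeError(
            f"test data step failed (exit {result.returncode}): {result.stderr.strip()}"
        )
    return result.stdout


# Placeholder command: replace with the real job invocation for your environment.
placeholder_command = [sys.executable, "-c", "print('test data ready')"]
output = run_provisioning_step(placeholder_command)
print(output.strip())
```

Most CI systems treat an uncaught exception (a non-zero exit from the script) as a failed stage, so downstream tests do not run against missing data.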

Frequently Asked Questions (FAQs)

1. What is test data automation in DevOps?
Test data automation in DevOps refers to the process of automatically generating or provisioning test data within Continuous Integration and Continuous Deployment (CI/CD) pipelines. It ensures that developers and testers have access to realistic, secure, and up-to-date data at every stage of development.
2. How does RowGen support DevOps test data generation?
IRI RowGen creates realistic test data using metadata, value ranges, frequency patterns, and referential integrity rules. It works in automated scripts or jobs that can be integrated directly into DevOps pipelines like Jenkins, GitLab, Azure DevOps, or Amazon CodePipeline.
3. What makes test data “production-quality”?
Production-quality test data must match the structure and behavior of live data. This includes correct data types, realistic value ranges, referential integrity, transformation logic, formatting, frequency distributions, and scalability in volume.
4. Can IRI tools be used to mask sensitive data for DevOps testing?
Yes. IRI FieldShield and DarkShield can discover and mask PII and other sensitive data in structured, semi-structured, and unstructured sources. These masked datasets can then be used in lower environments safely.
5. How can I generate test data on-demand in CI/CD pipelines?
With IRI RowGen, you can synthesize test data dynamically and automatically within DevOps environments. With IRI FieldShield or DarkShield you can also mask production data in databases, files, documents, and images. Either way, jobs can be scheduled, triggered by commits or builds, or run via command-line scripts integrated into your automation tools.
6. What types of test data formats are supported?
IRI supports a wide range of data formats including relational and NoSQL databases, flat files, JSON, XML, Excel, PDFs, and message queue topics like MQTT and Kafka. The generated test data can match custom layouts required by your applications.
7. Can I combine synthetic and masked test data?
Yes. You can blend synthetically generated data with masked subsets of real data using the IRI Voracity platform, which includes RowGen and FieldShield (plus DarkShield, etc.).
8. How do I handle test data reuse in DevOps?
IRI jobs can be saved, modified, and reused across test cycles. You can schedule recurring jobs or embed scripts into your CI/CD process to provision test data consistently.
9. What if my test data needs to simulate transformations?
RowGen supports built-in transformations, formatting, and validation in the same job that creates the data. This helps you simulate real-world logic and test application behavior more thoroughly.
10. Can I virtualize test data instead of copying it?
Yes. Through Voracity integrations and partner tools like Windocks and Commvault, you can virtualize access to test data across environments, enabling on-demand provisioning without duplicating large datasets.
11. How does RowGen ensure compatibility across application versions?
RowGen can generate test data that meets the constraints and logic of both legacy and current versions of your schema. This allows you to test backwards compatibility and forward compliance during iterative releases.
12. Can IRI tools support testing from streaming data sources?
Yes. RowGen and Voracity support integration with APIs and streaming sources like Kafka and MQTT, allowing you to simulate and test real-time data workflows in DevOps environments.