Applications and databases have their own logic and unique properties. For test data to be useful in these contexts, it must reflect production characteristics such as:
- selection conditions (business rules)
- column attributes and transformations
- inter-field/key relationships (referential integrity)
- value ranges and inter-column calculations
Production-quality test data must also have these attributes:
- type - correct column / field values and formats
- width - values with current (and future) ranges
- frequency - realistic value occurrence patterns
- depth - volumes that address scalability concerns
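The four attributes above can be illustrated with a minimal Python sketch. It is not RowGen syntax. The column names, value ranges, and weights are hypothetical and chosen only to show how type, width, frequency, and depth each map to a generation parameter.

```python
import random

# Hypothetical frequency distribution for one column (illustrative values only).
STATUS_WEIGHTS = {"active": 0.7, "suspended": 0.2, "closed": 0.1}

# Hypothetical value range covering current and anticipated future balances.
BALANCE_RANGE = (0.0, 250_000.0)


def make_rows(count, seed=42):
    """Generate `count` synthetic account rows.

    type      -> account_id is an 8-digit zero-padded string, balance has 2 decimals
    width     -> balance is drawn from BALANCE_RANGE
    frequency -> status values occur according to STATUS_WEIGHTS
    depth     -> `count` sets the volume
    """
    rng = random.Random(seed)  # seeded so repeated runs give the same set
    statuses = list(STATUS_WEIGHTS)
    weights = list(STATUS_WEIGHTS.values())
    rows = []
    for i in range(1, count + 1):
        rows.append({
            "account_id": f"{i:08d}",
            "balance": round(rng.uniform(*BALANCE_RANGE), 2),
            "status": rng.choices(statuses, weights=weights, k=1)[0],
        })
    return rows
```

Calling `make_rows(10)` gives a small set for unit tests, while `make_rows(1_000_000)` gives a volume suited to scalability checks. The fixed seed keeps the output consistent between runs.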
In fast-paced DevOps and Continuous Integration (CI) / Continuous Deployment (CD) environments, generating consistent, realistic test sets on demand, in widely varying formats and volumes, can be a tall order that distracts programmers who already face tight delivery timelines.
Applications developed with realistic data formats and volumes are more likely to succeed in production. The IRI RowGen test data creation package uses production metadata to build custom, synthetic test sets with randomly generated data, randomly selected data from production sources, or both. IRI FieldShield and IRI DarkShield data masking software can also be used to find and mask sensitive data in production and move bespoke targets into lower environments for testing.
To produce the right volumes and value ranges, RowGen uses conditional selection and formatting parameters. RowGen further enhances test data realism through referential integrity, frequency distributions, and built-in transformation and formatting functions. For example, you can randomly select values from pools of real data, and specify ranges using weighted numbers.
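As a rough illustration of referential integrity combined with a weighted distribution, the Python sketch below builds a parent and a child table. It does not use RowGen syntax. The table names, fields, and weighting scheme are assumptions made for the example.

```python
import random


def make_customers(n, rng):
    """Parent table: each row has a unique customer_id (hypothetical schema)."""
    regions = ["EMEA", "APAC", "AMER"]
    return [
        {"customer_id": 1000 + i, "region": rng.choice(regions)}
        for i in range(n)
    ]


def make_orders(n, customers, rng):
    """Child table: every customer_id is drawn from the parent keys."""
    parent_keys = [c["customer_id"] for c in customers]
    # Skewed weights so that a few customers account for most orders.
    weights = [1.0 / (rank + 1) for rank in range(len(parent_keys))]
    orders = []
    for i in range(n):
        quantity = rng.randint(1, 20)
        unit_price = round(rng.uniform(5.0, 500.0), 2)
        orders.append({
            "order_id": i + 1,
            "customer_id": rng.choices(parent_keys, weights=weights, k=1)[0],
            "quantity": quantity,
            "unit_price": unit_price,
            # Inter-column calculation: total is derived from the other fields.
            "total": round(quantity * unit_price, 2),
        })
    return orders


def orphaned_orders(orders, customers):
    """Return child rows whose foreign key has no matching parent row."""
    known = {c["customer_id"] for c in customers}
    return [o for o in orders if o["customer_id"] not in known]
```

Because child keys are only ever selected from the parent key pool, `orphaned_orders` should return an empty list for any generated set, which makes it a convenient sanity check before loading the data.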
For Continuous Integration and Continuous Deployment or Delivery (CI/CD) environments, RowGen can synthesize test data at any step in a development process without depending on data being made available from another step. In addition, embedded data transformation, validation, and formatting functions can run against the generated data in the same script, at the same time. This can facilitate incremental application tests that assure backwards compatibility and forward compliance with your production releases. See this use case.
Combine random generation and set-file selection, field-level conditions and manipulations, and custom layout features. Rapidly build the intelligent data you need to stress-test and vet your applications. Improve the quality and reliability of your deliverables. Schedule jobs in IRI Workbench (or your own CLI-supporting automation tool) to repeat generation and testing operations, smoothing out CI processes and enabling CD.
If you still prefer to test with data sitting in production DBs, you can also use RowGen to quickly subset and mask it. Or, you can mask an even broader range of sources with IRI FieldShield and IRI DarkShield. If you need multiple capabilities, or need to virtualize test data from a variety of static or streaming sources, check out the IRI Voracity data management platform and partners like Value Labs, whose Voracity-supporting Test Data Hub provides test data on demand.
And if you use existing CI/CD pipelines and tools like Jenkins, Azure DevOps, Git, Maven, etc., you can invoke IRI software within them to provide the test data you need. This is possible because FieldShield and RowGen share a structured data processing engine in IRI Voracity, called SortCL. The static data masking, subsetting, and synthesis activities it supports (as well as other data transformation, integration, cleansing, and reformatting) run as command-line task scripts or batch programs invoked as utility operations.
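Since these jobs run from the command line, a pipeline step can treat them like any other executable. The Python sketch below shows one way a wrapper might chain a generation step and a test step, stopping at the first failure. The actual SortCL command line is not given here, so the commands in the example are placeholders that you would replace with your own job invocation.

```python
import subprocess
import sys


def run_step(command, timeout_seconds=600):
    """Run one command-line job and return (exit_code, stdout, stderr)."""
    completed = subprocess.run(
        command,
        capture_output=True,
        text=True,
        timeout=timeout_seconds,
        check=False,  # inspect the exit code ourselves instead of raising
    )
    return completed.returncode, completed.stdout, completed.stderr


def run_pipeline(steps):
    """Run (name, command) steps in order and stop at the first failure.

    Returns a list of (name, exit_code) for the steps that actually ran.
    """
    results = []
    for name, command in steps:
        code, _out, err = run_step(command)
        results.append((name, code))
        if code != 0:
            print(f"step {name!r} failed with exit code {code}: {err.strip()}",
                  file=sys.stderr)
            break
    return results


if __name__ == "__main__":
    # Placeholder commands only: substitute your real generation job and test suite.
    steps = [
        ("generate-test-data", [sys.executable, "-c", "print('generated')"]),
        ("run-tests", [sys.executable, "-c", "print('tests passed')"]),
    ]
    for step_name, exit_code in run_pipeline(steps):
        print(step_name, exit_code)
```

A CI server such as Jenkins could call a script like this as a single build step and rely on the reported exit codes to decide whether later stages should run.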