Test Data Generation Solutions

Synthesize Smart Test Data & Provision It Your Way

Do you need a test data management solution that can:

produce and populate realistic test data for databases with referential integrity
generate smart test data in text files, documents, reports or images
enhance application quality through stress testing and automation
produce the volume needed for hardware and software benchmarking
preview ETL mappings and prototype Data Vault models with test data
put anonymized datasets online for offshore developers
integrate directly with database cloning, virtualization and DevOps pipelines

using just data models or metadata, but not actual production data?

If so, you need a robust test data generation tool. To be useful in testing, table views, index orders, key relationships, and file and report contents must reflect the characteristics of production data. Generating realistic values and formats with synthetic data in ideal ranges and frequencies, and populating large targets, can take a long time with other test data generation tools or programs.

With the IRI RowGen test data synthesis tool or the IRI Voracity test data management (TDM) platform that embeds RowGen, you can generate multiple, intelligent test data targets -- for test databases, file structures, and custom report formats -- from scratch, all without access to real data. Or if you want to use and anonymize, subset, or otherwise mask real data from production, you can do that with IRI data masking and test data provisioning tools in Voracity, too.

IRI test data software gives you four ways to produce anonymous, but intelligent, test data in referentially correct database, flat-file, semi-structured file, formatted report, and even unstructured file targets:

DB or file synthesis (via random data generation or selection) in IRI RowGen
Production or test data masking in IRI FieldShield, CellShield EE, or DarkShield
RDB table subsetting and masking using RowGen or FieldShield
Any combination of the above in IRI Voracity (which includes it all)

Data Synthesis Capabilities

In addition to structurally and referentially correct synthetic test data for every popular RDBMS with defined constraints, RowGen can also create smart synthetic data for software testing. RowGen can seed randomly generated or selected values into custom detail and summary report layouts, document and image files, and popular file/feed formats like these:

Record, line, or variable sequential
ASN.1 CDRs
COBOL index (MF ISAM, Vision)
CSV, LDIF, JSON, and XML
Excel (XLS/X)
FHIR, HL7, and X12 EDI
Fixed position text and mainframe blocked
HDFS
Image files and PDFs (using DarkShield with RowGen)
MQTT and Kafka topics
KNIME (analytic & visualization nodes) in Eclipse

RowGen randomly generates field values in more than 100 data types. It can also randomly select data from set files at the field level. That, along with custom/compound data values, value ranges, and distributions, improves test data realism.
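As a rough illustration of those ideas, the sketch below combines random generation within ranges, random selection from a value set, and a weighted distribution. It is generic Python, not RowGen job syntax; the names `FIRST_NAMES`, `STATUS_WEIGHTS`, `make_row`, and `make_rows` are hypothetical, and the in-memory list only stands in for a set file.

```python
import random

# Hypothetical stand-in for a "set file": a list of candidate values.
FIRST_NAMES = ["Ana", "Ben", "Chloe", "Dev", "Elena"]

# Hypothetical value distribution: most accounts are active.
STATUS_WEIGHTS = {"active": 0.80, "suspended": 0.15, "closed": 0.05}


def make_row(rng: random.Random) -> dict:
    """Build one synthetic record from random generation and selection."""
    return {
        "name": rng.choice(FIRST_NAMES),              # selected from a set
        "age": rng.randint(18, 90),                   # generated in a range
        "balance": round(rng.uniform(0, 10_000), 2),  # generated in a range
        "status": rng.choices(                        # weighted distribution
            list(STATUS_WEIGHTS), weights=list(STATUS_WEIGHTS.values())
        )[0],
    }


def make_rows(count: int, seed: int = 42) -> list:
    """Generate `count` rows; a fixed seed keeps runs reproducible."""
    rng = random.Random(seed)
    return [make_row(rng) for _ in range(count)]
```

Calling `make_rows(1000)` returns 1,000 dictionaries. Seeding the generator is a design choice that makes a test run repeatable, which helps when a failing test needs to be reproduced against the same data.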

Support for standard and complex data transformations, set files, and conditional selection also contributes to RowGen's value in simulating production table and file formats for a variety of applications.

For database users, RowGen leverages the DDL information for Oracle, DB2 UDB, SQL Server, Sybase, Teradata, and other platforms to create realistic tables with structural and referential integrity. Use RowGen to populate an entire test enterprise data warehouse (EDW) or Data Vault 2.0 environment.

Data Masking Capabilities

Use any of the static data masking tools available in the IRI Data Protector Suite, or included free in the IRI Voracity platform:

IRI FieldShield for structured files and databases
IRI DarkShield for structured sources as well as many semi-structured and unstructured data sources
IRI CellShield for Excel spreadsheets

to discover (profile, search, and classify) and de-identify (encrypt, pseudonymize, blur, redact, etc.) data in production systems, and then replicate it in anonymized form to lower dev, test, and QA environments.
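To make three of the de-identification functions named above concrete, here is a minimal sketch of pseudonymization, redaction, and date blurring. It is generic Python using only the standard library and does not represent how FieldShield, DarkShield, or CellShield implement these functions. `SECRET_KEY` and the function names are hypothetical.

```python
import hashlib
import hmac
from datetime import date, timedelta

SECRET_KEY = b"demo-key-change-me"  # hypothetical; keep real keys out of source


def pseudonymize(value: str, prefix: str = "user_") -> str:
    """Map a value to a stable token; the same input gives the same output."""
    digest = hmac.new(SECRET_KEY, value.encode("utf-8"), hashlib.sha256)
    return prefix + digest.hexdigest()[:12]


def redact(value: str, keep_last: int = 4, mask_char: str = "*") -> str:
    """Hide all but the last `keep_last` characters."""
    hidden = max(len(value) - max(keep_last, 0), 0)
    return mask_char * hidden + value[hidden:]


def blur_date(value: date, max_days: int = 30) -> date:
    """Shift a date by a deterministic offset within +/- max_days."""
    digest = hmac.new(SECRET_KEY, value.isoformat().encode("utf-8"), hashlib.sha256)
    span = 2 * max_days + 1
    offset = int.from_bytes(digest.digest()[:4], "big") % span - max_days
    return value + timedelta(days=offset)
```

For example, `redact("4111111111111111")` returns `"************1111"`. Note that this simple `redact` leaves values of four characters or fewer unmasked, so a real job would need a rule for short values. Keyed, deterministic functions are used here so that the same source value masks the same way in every table, which is one common way to keep joins working after masking.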

If you use IRI Voracity, you can use its included RowGen synthesis and FieldShield data masking capabilities to find, classify, subset, and mask data, and integrate that data for static development use in lower environments or virtual use in live testing environments.

Consider our test data management advice as you plan your strategy.

Test Data Provisioning for DevOps & Virtualization

Speed up your CI/CD pipelines with seamless test data provisioning. IRI tools integrate directly with database cloning and virtualization platforms to deliver masked or synthetic datasets on demand. This ensures that every developer has a "fresh" copy of test data without the overhead of massive storage requirements.

Frequently Asked Questions (FAQs)

1. What is test data management?

Test data management (TDM) refers to the process of creating, provisioning, masking, and maintaining data used for development, testing, QA, and benchmarking. It ensures that non-production environments have reliable, compliant, and realistic data for various test scenarios.

2. How can I generate realistic test data without using production data?

You can use IRI RowGen to synthesize structurally and referentially correct test data purely from metadata, like DDL, without requiring real data. RowGen creates randomized or patterned values in realistic formats, ideal for secure application development and performance testing. Because no production values are read, this approach keeps real data out of test environments entirely.

3. What are the main ways IRI supports test data generation?

IRI supports test data creation through:
• Synthetic data generation with RowGen
• Static data masking with FieldShield, DarkShield, or CellShield
• Database subsetting and masking
• Integrated workflows in the Voracity platform

4. Can I create test data for structured, semi-structured, and unstructured environments?

Yes. IRI RowGen and DarkShield can generate or mask test data for structured databases, semi-structured files (like JSON or XML), and unstructured sources (like PDFs and image files) to support diverse testing needs.

5. How does IRI RowGen maintain referential integrity in test databases?

IRI RowGen reads metadata like DDL from platforms such as Oracle, SQL Server, and DB2. It then generates test data that adheres to primary/foreign key relationships, ensuring referential integrity across tables.
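The general principle can be sketched in a few lines: load parent rows first, then draw each child's foreign key only from the parent keys that exist. The example below is a simplified illustration in Python with SQLite, not RowGen's implementation; the `customers` and `orders` schema and the `populate` function are hypothetical.

```python
import random
import sqlite3

# Hypothetical two-table schema; a real tool would read this from DDL.
SCHEMA = """
CREATE TABLE customers (
    id   INTEGER PRIMARY KEY,
    name TEXT NOT NULL
);
CREATE TABLE orders (
    id          INTEGER PRIMARY KEY,
    customer_id INTEGER NOT NULL REFERENCES customers(id),
    amount      REAL NOT NULL
);
"""


def populate(conn: sqlite3.Connection, customers: int, orders: int, seed: int = 7) -> None:
    """Load parents first, then children whose keys are drawn from the parents."""
    rng = random.Random(seed)
    conn.execute("PRAGMA foreign_keys = ON")  # make SQLite enforce the constraint
    conn.executescript(SCHEMA)

    customer_ids = list(range(1, customers + 1))
    conn.executemany(
        "INSERT INTO customers (id, name) VALUES (?, ?)",
        [(cid, f"Customer {cid}") for cid in customer_ids],
    )
    conn.executemany(
        "INSERT INTO orders (id, customer_id, amount) VALUES (?, ?, ?)",
        [
            (oid, rng.choice(customer_ids), round(rng.uniform(5, 500), 2))
            for oid in range(1, orders + 1)
        ],
    )
    conn.commit()
```

After `populate(sqlite3.connect(":memory:"), customers=10, orders=100)`, every order references an existing customer, and the database itself rejects any orphan row because foreign key enforcement is switched on.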

6. What types of file formats can IRI RowGen generate for test data?

RowGen populates realistic test data into fixed and delimited files, Excel, ASN.1-encoded CDRs, LDIF, COBOL, and flat XML and JSON files directly. By using RowGen-synthesized data as replacement lookup values within DarkShield jobs, you can also put test data into PDFs, Office documents, HL7, FHIR, X12, images, DICOM, Parquet, and raw text/log files.

7. Can IRI tools help with test data masking for compliance?

Yes. The IRI RowGen test data synthesis tool, as well as the IRI FieldShield, CellShield, and DarkShield data masking tools, can produce test data that supports compliance with privacy laws and standards like HIPAA, the GDPR, and PCI DSS.

8. What is the difference between data synthesis and data masking?

Data synthesis creates artificial, but realistic, test data from scratch using metadata or models. Data masking transforms real data to de-identify it through masking or anonymization functions.

9. Can I subset production data for testing with IRI tools?

Yes. Both RowGen and FieldShield support data subsetting based on rules or queries. You can extract meaningful portions of production data, mask it, and use it safely in test environments.
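A rule-based subset followed by masking can be sketched as below. This is a generic Python illustration over in-memory records, not RowGen or FieldShield syntax; the field names (`region`, `email`, `customer_id`) and the function `subset_and_mask` are hypothetical.

```python
def subset_and_mask(customers: list, orders: list, region: str) -> tuple:
    """Keep customers in one region plus only their orders, then mask emails."""
    # Subset rule: select the parent rows of interest.
    kept = [c for c in customers if c["region"] == region]
    kept_ids = {c["id"] for c in kept}

    # Follow the key relationship so no child row is left without a parent.
    kept_orders = [o for o in orders if o["customer_id"] in kept_ids]

    # Mask: replace each real email with a value derived from the surrogate key.
    masked = [{**c, "email": f"user{c['id']}@example.com"} for c in kept]
    return masked, kept_orders
```

The function returns new records and leaves its inputs unchanged, so the original rows are never modified in place. Selecting children through the kept parent keys is what keeps the subset referentially consistent.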

10. What is TDaaS (Test Data as a Service) in the IRI context?

TDaaS refers to IRI’s ability to hand-deliver ready-to-use, masked or synthetic test data on demand, integrated with DevOps pipelines, virtualization platforms, or cloud-based testing workflows. This is a professional services offering, not software as a service.

11. How does IRI support DevOps and CI/CD with test data?

IRI tools integrate with CI/CD workflows by automating the generation, masking, or provisioning of test data. You can use them in scripts or pipelines to provide up-to-date, compliant datasets at each stage of development. Integrations have been built with Azure DevOps, AWS CodePipeline, Git, Jenkins, and others.

12. Can I use IRI test data tools for benchmarking?

Yes. RowGen can create large volumes of realistic, scalable data for stress testing software, validating database performance, and benchmarking hardware or systems under load.

13. Why is synthetic data generation better than subsetting for security?

When test data is synthesized purely from metadata, there is no real record to re-identify, because no real-world sensitive information was ever used. It is the gold standard for "privacy by design."

14. Does IRI support referential integrity in test data across different platforms?

Yes. IRI RowGen can maintain referential integrity in test data even if the parent and child tables reside in different database types (e.g., Oracle to Snowflake).

15. How does automated test data generation save time in the SDLC?

Automated test data generation removes the manual effort of scripting data inserts, allowing QA teams to refresh test environments in minutes rather than days.

16. Can I use these tools for AI and Machine Learning training?

Absolutely. IRI’s enterprise data anonymization and synthesis capabilities provide the high-volume, statistically accurate datasets required to train AI models without compromising privacy.

17. What is the role of sensitive data discovery in TDM?

Before masking or subsetting, you must identify what is sensitive. Our tools include sensitive data discovery to ensure no PII is accidentally moved to a test environment.
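In its simplest form, discovery means scanning values against known patterns and recording which columns match. The sketch below shows that idea in generic Python; it does not represent IRI's discovery methods, and the two patterns in `PATTERNS` are a hypothetical, deliberately small rule set.

```python
import re

# Hypothetical pattern set; real classifiers use many more rules plus validation.
PATTERNS = {
    "email": re.compile(r"[A-Za-z0-9._%+-]+@[A-Za-z0-9.-]+\.[A-Za-z]{2,}"),
    "us_ssn": re.compile(r"\b\d{3}-\d{2}-\d{4}\b"),
}


def classify_columns(rows: list) -> dict:
    """Return {column: {matched data classes}} for columns holding matches."""
    findings: dict = {}
    for row in rows:
        for column, value in row.items():
            for label, pattern in PATTERNS.items():
                if pattern.search(str(value)):
                    findings.setdefault(column, set()).add(label)
    return findings
```

Columns that appear in the result are candidates for masking before any data moves to a lower environment. Pattern matching alone produces false positives and misses, so it is usually combined with other checks such as column-name matching or lookups against known values.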

18. How does IRI compare to other test data management vendors?

IRI offers a unique "fit-for-purpose" approach. Whether you need a standalone tool like FieldShield or a full platform like Voracity, we provide higher speed and lower cost than legacy TDM suites.

19. Can I generate test data directly into cloud file stores?

Yes. Our tools support test data provisioning to S3 buckets, Azure Blob Storage, GCP, OneDrive, and most cloud-based databases.

20. Is it possible to generate data that follows specific business logic?

Yes. You can define compound data values and conditional logic to ensure your test data mirrors complex business scenarios, not just random characters.
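The sketch below illustrates what compound values and conditional logic can look like in generated data. It is generic Python rather than IRI job syntax, and the catalog in `PRICES`, the freight threshold, and the `make_order` function are all hypothetical.

```python
import random
from datetime import date, timedelta

PRICES = {"basic": 9.99, "plus": 24.50, "pro": 199.00}  # hypothetical catalog


def make_order(rng: random.Random, order_no: int) -> dict:
    """Build one order whose fields obey simple business rules."""
    plan = rng.choice(list(PRICES))
    quantity = rng.randint(1, 20)
    total = round(quantity * PRICES[plan], 2)
    order_date = date(2024, 1, 1) + timedelta(days=rng.randrange(366))

    # Conditional logic: large orders ship by freight and take longer.
    if total >= 1000:
        ship_method, lead_days = "freight", rng.randint(5, 10)
    else:
        ship_method, lead_days = "parcel", rng.randint(1, 4)

    return {
        # Compound value: a code assembled from several parts.
        "order_code": f"ORD-{order_date:%Y%m}-{order_no:06d}",
        "plan": plan,
        "quantity": quantity,
        "total": total,
        "order_date": order_date,
        "ship_method": ship_method,
        "ship_date": order_date + timedelta(days=lead_days),  # never before order
    }
```

Each field is derived from the others rather than drawn independently, so totals always agree with quantity and price, and ship dates always follow order dates. That internal consistency is what lets the data exercise application rules instead of tripping input validation.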
