Profiling data, finding matching patterns or related values, and rapidly identifying the locations and lineage attributes of disparate data sources are all ways to reveal the content of data and how it was created, deleted, or modified. Most tools available for these purposes are expensive and designed for a specific data source (e.g., one database).
After data is discovered and transformed, application audit trails need comprehensive information about the target layouts and job runs. The details must be readily available and secure. Logs should also track sensitive data protections and enable user accountability, job replication, parameter modification, and issue analysis.
For example, data processing forensics may reveal whether a record count or value range has changed beyond an established threshold; such a change could indicate data loss or fraud. Few data management software platforms or fit-for-purpose applications support this.
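The threshold idea above can be sketched in a few lines of Python. This is only an illustration under stated assumptions, not how any IRI product implements it: the `JobStats` record, the `check_drift` function, the 10% count tolerance, and the value bounds are all hypothetical names and numbers chosen for the example.

```python
from dataclasses import dataclass


@dataclass
class JobStats:
    """Hypothetical summary of one job run (not an IRI log format)."""
    job_name: str
    record_count: int
    min_value: float
    max_value: float


def check_drift(baseline, current, max_count_change=0.10,
                value_bounds=(0.0, 10000.0)):
    """Return a list of warnings when a run drifts past the thresholds.

    max_count_change is the allowed relative change in record count;
    value_bounds is the (low, high) range that values are expected to stay in.
    Both defaults are assumptions made for this sketch.
    """
    warnings = []

    # Compare record counts relative to the baseline run.
    if baseline.record_count > 0:
        change = abs(current.record_count - baseline.record_count)
        relative = change / baseline.record_count
        if relative > max_count_change:
            warnings.append(
                f"{current.job_name}: record count changed by {relative:.0%}"
            )

    # Flag values that fall outside the expected range.
    low, high = value_bounds
    if current.min_value < low or current.max_value > high:
        warnings.append(
            f"{current.job_name}: values outside expected range {low}..{high}"
        )

    return warnings


if __name__ == "__main__":
    baseline = JobStats("nightly_load", 1000, 5.0, 900.0)
    latest = JobStats("nightly_load", 850, 5.0, 12000.0)
    for message in check_drift(baseline, latest):
        print(message)
```

In practice the baseline would come from earlier job statistics rather than hard-coded numbers, and the tolerances would be set per job.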
Using state-of-the-art database, file, and dark data discovery tools in the free IRI Workbench GUI, you can find the locations of exact and fuzzy-matching data patterns, and automatically discover source-specific metadata that helps you analyze file authorship and other attributes. For example, as you locate PII values within spreadsheets and email repositories, you can also automatically display location, ownership, security, and other properties of those files.
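To make the idea of pattern-based discovery concrete, here is a minimal Python sketch that searches text files for two PII-like patterns and reports basic file properties alongside each hit. It is not the IRI Workbench search logic; the `PATTERNS` table, the `scan_file` and `scan_tree` functions, and the simplified regular expressions are assumptions made for illustration, and only exact (not fuzzy) matching is shown.

```python
import os
import re
from pathlib import Path

# Simplified, illustrative patterns; real PII detection needs stricter rules.
PATTERNS = {
    "ssn": re.compile(r"\b\d{3}-\d{2}-\d{4}\b"),
    "email": re.compile(r"\b[\w.+-]+@[\w-]+\.[\w.-]+\b"),
}


def scan_file(path):
    """Return one dict per pattern match, with basic file properties."""
    text = Path(path).read_text(encoding="utf-8", errors="ignore")
    info = os.stat(path)
    hits = []
    for label, pattern in PATTERNS.items():
        for match in pattern.finditer(text):
            hits.append({
                "file": str(path),
                "pattern": label,
                "value": match.group(),
                "offset": match.start(),       # character offset in the file
                "size_bytes": info.st_size,    # file size from the filesystem
                "modified": info.st_mtime,     # last-modified timestamp
            })
    return hits


def scan_tree(root):
    """Scan every .txt file under root and collect all hits."""
    results = []
    for path in sorted(Path(root).rglob("*.txt")):
        results.extend(scan_file(path))
    return results
```

Calling `scan_tree` on a directory returns a flat list that can be sorted or grouped by file, pattern, or modification time.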
Job scripts, statistical reports, and audit logs in the IRI Voracity data management platform and its constituent IRI CoSort (SortCL) data transformation and IRI FieldShield data masking programs contain your data layout specifications, query syntax, and manipulation details.
The XML audit log from IRI jobs provides details for each input, inrec (virtual), and output definition, including which field attributes and protection techniques were specified. Phase-specific record counts (including the number of records accepted, rejected, and processed) and job tuning details are available in the statistical logs.
The entire job script, along with user, runtime, and environment variable information, is also recorded in the audit trail. You can query and report on the logs using your preferred XML parsing tool or SortCL (through supplied data definition files for the logs). For example, you can query on file and field names, run dates, and job duration, so you can quickly examine specific jobs without manually reviewing a giant audit trail.
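As a rough illustration of querying an XML audit trail with a general-purpose parser, the Python sketch below filters jobs by field name and by duration. The XML layout in `SAMPLE_LOG` is entirely hypothetical: the `auditlog`, `job`, and `field` element names and their attributes are assumptions made for this example and are not the actual IRI audit log schema, so a real query would need to follow the structure of the logs you have.

```python
import xml.etree.ElementTree as ET
from datetime import datetime

# Hypothetical log layout, invented for this sketch only.
SAMPLE_LOG = """<auditlog>
  <job name="mask_customers" user="alice"
       start="2023-01-05T10:00:00" end="2023-01-05T10:02:30">
    <field name="ssn" protection="encrypt"/>
    <field name="email" protection="hash"/>
  </job>
  <job name="sort_orders" user="bob"
       start="2023-01-06T09:00:00" end="2023-01-06T09:00:20">
    <field name="order_id" protection="none"/>
  </job>
</auditlog>"""


def job_duration_seconds(job):
    """Compute run time from the (assumed) ISO-format start/end attributes."""
    start = datetime.fromisoformat(job.get("start"))
    end = datetime.fromisoformat(job.get("end"))
    return (end - start).total_seconds()


def find_jobs(xml_text, field_name=None, min_seconds=0.0):
    """Return jobs that touch field_name and ran at least min_seconds."""
    root = ET.fromstring(xml_text)
    matches = []
    for job in root.iter("job"):
        fields = {f.get("name") for f in job.iter("field")}
        if field_name is not None and field_name not in fields:
            continue
        duration = job_duration_seconds(job)
        if duration < min_seconds:
            continue
        matches.append({
            "name": job.get("name"),
            "user": job.get("user"),
            "seconds": duration,
        })
    return matches


if __name__ == "__main__":
    for job in find_jobs(SAMPLE_LOG, field_name="ssn"):
        print(job)
```

The same filtering approach extends to run dates or file names by adding further conditions inside the loop.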
Free lineage capabilities are also available in the IRI Workbench, through search tools and hubs like EGit for sharing and securing master data and metadata in the cloud. Advanced data lineage capabilities in the IRI Voracity data management platform are available through an optional upgrade to the AnalytiX DS Mapping Manager plug-in.