Voracity for MDM, an Integrated Source of Truth for Malaysian Citizens

Background
When government agencies collect data on their citizens, concern is understandable. Stories about data breaches and misuse outnumber stories about benevolent, helpful uses. That, together with a lack of citizen control and government accountability over big data, feeds mistrust and calls to limit data collection. But what if those agencies could share citizen data in responsible, beneficial ways?

Voracity Data Integration and Governance Use Case: Inter-Agency Data Exchange

For example, many countries do not know how or why many of their students go on to college, choose certain careers, or remain unemployed. That is despite the fact that information on those students, each with a unique national ID, was collected in primary and secondary schools (run by one or more agencies), colleges or training centers (reporting to other agencies), and military, healthcare, and penal institutions (still more agencies).

Correlating this siloed data could reveal which students from specific areas or backgrounds achieved specific outcomes. This in turn supports hypothesis testing, factor identification, and intervention strategies to improve those outcomes. And that is only one example.
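As a rough illustration of what correlating siloed data on a shared national ID can look like, here is a minimal Python sketch. It is not how MyGDX or Voracity performs the join; the agency extracts, field names, and values are all invented for the example.

```python
from collections import Counter, defaultdict

# Hypothetical extracts from three agencies, each keyed by a national ID.
# All field names and values are invented for illustration.
schools = [
    {"national_id": "A001", "district": "North"},
    {"national_id": "A002", "district": "North"},
    {"national_id": "A003", "district": "South"},
]
colleges = [{"national_id": "A001", "enrolled": True}]
employment = [
    {"national_id": "A002", "status": "employed"},
    {"national_id": "A003", "status": "unemployed"},
]


def index_by_id(records):
    """Build a lookup from national ID to record."""
    return {r["national_id"]: r for r in records}


def outcomes_by_district(schools, colleges, employment):
    """Join the three silos on national ID and count outcomes per district."""
    college_idx = index_by_id(colleges)
    employment_idx = index_by_id(employment)
    counts = defaultdict(Counter)
    for student in schools:
        sid = student["national_id"]
        if college_idx.get(sid, {}).get("enrolled"):
            outcome = "college"
        else:
            # Fall back to the employment record, or "unknown" if none exists.
            outcome = employment_idx.get(sid, {}).get("status", "unknown")
        counts[student["district"]][outcome] += 1
    return {district: dict(c) for district, c in counts.items()}


result = outcomes_by_district(schools, colleges, employment)
print(result)
```

Even this toy version shows why a common key matters: without the national ID present in every silo, the per-district outcome counts could not be assembled at all.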

Enter an integrated master data management (MDM) system through which government agencies can exchange data with each other for the public good: one that leverages best-in-class data storage, integration, governance, and tracking technologies to ensure that heterogeneous agency data is unified, de-identified, and shared properly.
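One generic way to keep data both unified and de-identified is keyed pseudonymization: each national ID is replaced with an HMAC token, so records from different agencies still line up on the token without exposing the ID. The Python sketch below assumes that technique purely for illustration; the source does not say which masking functions MyGDX applies, and the key, field names, and helper names here are hypothetical.

```python
import hashlib
import hmac

# Assumption: a secret key held only by the exchange operator.
# This value is a placeholder for illustration, not a real key.
SECRET_KEY = b"example-key-for-illustration-only"


def pseudonymize(national_id: str, key: bytes = SECRET_KEY) -> str:
    """Return a keyed, non-reversible token for a national ID."""
    digest = hmac.new(key, national_id.encode("utf-8"), hashlib.sha256)
    return digest.hexdigest()


def de_identify(record: dict, id_field: str = "national_id",
                drop_fields: tuple = ("name",)) -> dict:
    """Replace the ID with a token and drop direct identifiers."""
    cleaned = {k: v for k, v in record.items()
               if k != id_field and k not in drop_fields}
    cleaned["token"] = pseudonymize(record[id_field])
    return cleaned


# The same person appears in two hypothetical agency extracts.
school_row = de_identify(
    {"national_id": "A001", "name": "Aisha", "district": "North"})
clinic_row = de_identify(
    {"national_id": "A001", "name": "Aisha", "visits": 2})

# Both rows carry the same token, so they can still be joined.
print(school_row["token"] == clinic_row["token"])
```

Because the token is deterministic for a given key, the same citizen maps to the same token in every agency's data, while anyone without the key cannot recompute or reverse it.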

MyGDX & Voracity

The Malaysian Government Data Exchange (MyGDX) is a state-of-the-art web portal supporting the upload, integration, download, and access logging of de-identified data from multiple government agency silos. MyGDX holds the promise of producing more “unified views of the citizen” that could not only improve early intervention strategies, but also help spot disease clusters, prevent crime, and better target public and private services.

MyGDX data types schematic

Some of the education-related data that Voracity consolidates in MyGDX from multiple ministries and agencies to produce an 'integrated source of truth' of lifetime education information on Malaysian residents

MyGDX is front-ended by a modern web portal that delivers the available data in de-identified form, tracks every request, and could eventually support the monetization of public data for private use.
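To make the idea of tracking all requests concrete, the following Python sketch logs every data request, whether or not it is granted. It is a simplified illustration only; the `DataPortal` class, its methods, and the dataset names are hypothetical and are not part of MyGDX or Voracity.

```python
from datetime import datetime, timezone


class DataPortal:
    """Toy portal that serves de-identified datasets and logs every request."""

    def __init__(self, datasets):
        self._datasets = datasets
        self.access_log = []

    def request(self, requester, dataset_name):
        """Return the dataset if it exists; record the request either way."""
        granted = dataset_name in self._datasets
        self.access_log.append({
            "time": datetime.now(timezone.utc).isoformat(),
            "requester": requester,
            "dataset": dataset_name,
            "granted": granted,
        })
        return self._datasets[dataset_name] if granted else None


portal = DataPortal({"education_summary": [{"district": "North", "college": 1}]})
rows = portal.request("agency_a", "education_summary")
missing = portal.request("agency_b", "health_summary")

for entry in portal.access_log:
    print(entry["requester"], entry["dataset"], entry["granted"])
```

Logging refused requests alongside granted ones is a deliberate choice in this sketch: an audit trail that only records successes cannot show who tried to reach data they were not given.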

To meet the system's quality, privacy, and analytic goals, that data must be rapidly integrated and reliably governed at the same time. To do that, MyGDX uses the combined data integration (ETL), unification, cleansing, and masking functionality of the IRI Voracity data management platform:

A Voracity workflow created for MyGDX

Architectural diagram of governed ETL and MDM in MyGDX via IRI Voracity

Voracity jobs are powered by the long-proven IRI CoSort data acquisition, transformation, cleansing, and masking engine. Those jobs are built and managed in IRI Workbench, a rich and familiar IDE built on Eclipse™ that supports job design and workflow management in multiple modes.

MAMPU and other typical users of MyGDX, which includes Voracity in its applications

Voracity workflow diagram showing some of the mapping logic for integrating and governing MyGDX Data

Combining Data Discovery, Integration, Migration, Governance, and Analytics

Data integration and governance have traditionally been separate disciplines, performed in separate silos and products and managed by people with different agendas and skills. Think about all the separate, and costly, data profiling, ETL, data cleansing, data masking, master data management, and IAM/auditing technologies that your company may be considering.

Voracity is the rare platform product that seamlessly combines key data lifecycle management activities in the same pane of glass and I/O pass:

The capabilities of Voracity

Voracity schematic revealing data source-action-target flow, plus job design and deployment options

Inside IRI Workbench are multiple fit-for-purpose wizards that build and/or execute portable jobs to search and classify data, and to apply consistent mapping, masking, and test data generation rules. Also included, and seamlessly interlinked, are data profiling and classification, single-pass ETL, data quality, PII masking, metadata management, analytic options, and interchangeable Hadoop runtime engines to handle on-premises and cloud data sources, big and small.
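The idea of classifying data first and then applying one masking rule per class can be sketched in a few lines of Python. This is only a conceptual illustration and does not reproduce the IRI Workbench wizards or any Voracity job syntax; the data classes, regular expressions, and masking rules below are invented assumptions.

```python
import re

# Hypothetical data classes and patterns; real classification rules
# would be far richer than these two examples.
PATTERNS = {
    "EMAIL": re.compile(r"^[^@\s]+@[^@\s]+\.[^@\s]+$"),
    "PHONE": re.compile(r"^\+?\d[\d\- ]{6,14}\d$"),
}


def classify_column(values):
    """Return the first data class whose pattern matches every non-empty value."""
    non_empty = [v for v in values if v]
    for label, pattern in PATTERNS.items():
        if non_empty and all(pattern.match(v) for v in non_empty):
            return label
    return None


def mask_value(label, value):
    """Apply the single masking rule assigned to a data class."""
    if label == "EMAIL":
        local, _, domain = value.partition("@")
        return local[0] + "***@" + domain
    if label == "PHONE":
        return "*" * (len(value) - 4) + value[-4:]
    return value  # Unclassified values pass through unchanged.


table = {
    "contact": ["aisha@example.my", "farid@example.my"],
    "mobile": ["+60 12-345 6789"],
    "district": ["North", "South"],
}

# Classify each column once, then apply that class's rule to every value.
masked = {
    column: [mask_value(classify_column(values), v) for v in values]
    for column, values in table.items()
}
print(masked)
```

Tying the rule to the data class rather than to a column name is what keeps the masking consistent: any column classified as an email address gets the same treatment, whichever source it came from.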