
Finding and Masking PII in XML and JSON Files…
Personally Identifiable Information (PII), such as names, Social Security numbers, and home addresses, is stored in multiple sources and silos, including semi-structured files in JSON and XML format.
IRI FieldShield users can reach and mask personally identifiable information (PII) in Amazon Simple Storage Service (Amazon S3) buckets, and IRI Voracity platform users can integrate and govern structured files there. IRI DarkShield users can now likewise find and mask PII in unstructured files stored in S3.
Article 17 of the General Data Protection Regulation (GDPR) stipulates the Right to Erasure, often referred to as the Right to be Forgotten. While the regulation specifies some requirements as to what controllers must do with data requested to be “erased”, it does not expressly define what the term erasure means.
This article demonstrates the manipulation of a CSV file using an IRI Workbench wizard. In fact, this example shows how PII can be masked from almost any IRI job wizard, though CSV file masking is most often performed from a single or multi-file IRI FieldShield job menu.
According to Simson L. Garfinkel at the NIST Information Access Division’s Information Technology Laboratory, “De-identification is not a single technique, but a collection of approaches, algorithms, and tools that can be applied to different kinds of data with differing levels of effectiveness.”