Do you need to alter the identifying characteristics of your data so that a personal, or other, data item stored in the original field cannot be identified, yet each value still remains distinct? That way, database table and flat-file data could flow safely, de-identified, through different departments and, where necessary, be re-identified.
Sensitive data must also be replaced with altered (but reversible) output that maintains the characteristics of the original field (such as its length, data type, or even a real-looking but different value). Very often, field-level encryption does not accomplish this goal because the ciphertext is usually longer than the original field and is not recognizable as the same kind of value.
However, format-preserving encryption can be a desirable option when you need both de-identification and re-identification.
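To make the idea of format preservation concrete, here is a minimal Python sketch of a reversible, keyed transformation that keeps a field's length, digit-only data type, and punctuation. It is an illustration only: it is not the algorithm used by FieldShield or SortCL, and it is not the NIST FF1 or FF3-1 standard. It is a toy Feistel construction over decimal digits, and the names protect, _encrypt_digits, _decrypt_digits, and _round_value are hypothetical. Short fields have a small value space, so a sketch like this should not be relied on for real protection.

```python
import hashlib
import hmac

DIGITS = "0123456789"
ROUNDS = 8  # even, so the left/right split is the same before and after


def _round_value(key: bytes, round_no: int, half: str, modulus: int) -> int:
    """Keyed pseudo-random number in [0, modulus) derived from one half."""
    msg = f"{round_no}:{half}".encode()
    digest = hmac.new(key, msg, hashlib.sha256).digest()
    return int.from_bytes(digest, "big") % modulus


def _encrypt_digits(key: bytes, digits: str) -> str:
    u = len(digits) // 2
    v = len(digits) - u
    a, b = digits[:u], digits[u:]
    for i in range(ROUNDS):
        m = u if i % 2 == 0 else v  # width of the half being rewritten
        c = (int(a) + _round_value(key, i, b, 10 ** m)) % 10 ** m
        a, b = b, str(c).zfill(m)
    return a + b


def _decrypt_digits(key: bytes, digits: str) -> str:
    u = len(digits) // 2
    v = len(digits) - u
    a, b = digits[:u], digits[u:]
    for i in reversed(range(ROUNDS)):
        m = u if i % 2 == 0 else v
        c = (int(b) - _round_value(key, i, a, 10 ** m)) % 10 ** m
        a, b = str(c).zfill(m), a
    return a + b


def protect(key: bytes, value: str, decrypt: bool = False) -> str:
    """Transform only the digits of value; keep separators where they are."""
    digits = "".join(ch for ch in value if ch in DIGITS)
    if len(digits) < 2:
        raise ValueError("need at least two digits to transform")
    transform = _decrypt_digits if decrypt else _encrypt_digits
    out = iter(transform(key, digits))
    return "".join(next(out) if ch in DIGITS else ch for ch in value)


if __name__ == "__main__":
    key = b"demo-key-not-for-production"  # assumption: key handling is out of scope
    masked = protect(key, "123-45-6789")
    print(masked)                              # same NNN-NN-NNNN layout
    print(protect(key, masked, decrypt=True))  # recovers the original value
```

Because the output has the same length and character class as the input, it can be stored in the original column without a schema change, which is the property that conventional ciphertext usually lacks.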
Solutions:
Both the FieldShield package and the CoSort product's SortCL tool provide several ways to de-identify personal health information (PHI) and other sensitive field data (like a social security or cell phone number) in your databases and files. At the same time, the de-identified field can look more realistic and still be recovered.
You can de-identify private fields in these ways:
Apply a built-in or custom format-preserving encryption function to the sensitive fields, and use the matching decryption function and key to reveal them
Use the built-in ASCII de-identification function to encode fields with random ASCII characters
Specify lookup (set) files to substitute sensitive field values with a pseudonym
Transform the field with one or more data manipulations
De-identify or encode the data with your own field-level transformation function
Example of the function-based de-identification option offered in CoSort's Graphical User Interface-to-Sort Control Language (gui2scl) Java client application. In this example, the SSN field in a file is de-identified.
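As a tool-independent illustration of the lookup (set) file option listed above, the following Python sketch substitutes a sensitive field with a pseudonym and reverses the substitution later. It does not use SortCL syntax or any FieldShield API. The two-column, tab-delimited layout of the set file, the sample values, and the names load_set and substitute are all assumptions made for this example.

```python
import csv
import io

# Hypothetical set file contents: original value <TAB> pseudonym.
SET_FILE = "Alice Smith\tPatient-0001\nBob Jones\tPatient-0002\n"


def load_set(text: str):
    """Build forward (de-identify) and reverse (re-identify) lookup tables."""
    forward = {}
    for original, pseudonym in csv.reader(io.StringIO(text), delimiter="\t"):
        forward[original] = pseudonym
    reverse = {pseudonym: original for original, pseudonym in forward.items()}
    if len(reverse) != len(forward):
        # Two originals sharing one pseudonym could not be told apart later.
        raise ValueError("pseudonyms must be unique to allow re-identification")
    return forward, reverse


def substitute(rows, field, table):
    """Return copies of rows with one field replaced via the lookup table.

    A value missing from the table raises KeyError, so nothing passes
    through unmasked by accident.
    """
    return [{**row, field: table[row[field]]} for row in rows]


if __name__ == "__main__":
    forward, reverse = load_set(SET_FILE)
    records = [{"name": "Bob Jones", "diagnosis": "J45"}]
    masked = substitute(records, "name", forward)
    print(masked)                               # name replaced, diagnosis untouched
    print(substitute(masked, "name", reverse))  # original record restored
```

Only the named field changes, so the non-sensitive columns stay available for further processing. Re-identification is possible only for someone who holds the set file.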
These field-level de-identification methods can help you comply with certain privacy regulations, leave your non-sensitive data and files available for further processing, and produce an XML audit trail to help you verify compliance.
If you are interested in de-identifying real data through the generation of test data instead, please look into the metadata-compatible RowGen tool for test database, file, and report targets.