Snowflake ETL and PII Masking

 


Challenges

You may face one or more of these time-consuming issues working with Snowflake:

  • Data searches, profiling, and/or classification
  • Integrating or wrangling data for DW/BI ops
  • Data movement/migration to/from tables
  • Transforming or loading large tables
  • Change data capture or replication
  • Clustering or query performance
  • Generating smart, safe test data
  • Masking sensitive data

Diagnosing and tuning specific performance problems also takes time and may affect other users. Finally, stored SQL procedures may be programmed inefficiently and, even after optimization, still take too long to run.

Solutions

To:

Use:

Keep Snowflake Data in Order & Externalize SQL Transforms

IRI CoSort to pre-sort flat files for bulk loads and inserts, and to bypass slower in-database transformations like sorting, joining, and filtering by running the external CoSort SortCL data processing program against Snowflake data. This offloads that work from Snowflake when it needs to be done, improving the performance of clustering and commonly performed queries.
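As a rough illustration of the pre-sorting idea only, the following Python sketch orders the rows of a delimited flat file on a key column before it would be handed to a bulk loader. It is not SortCL syntax and says nothing about how CoSort works internally; the function name `presort_csv` and the sample column names are hypothetical.

```python
import csv
import io


def presort_csv(text, key_column):
    """Return CSV text with the header kept first and data rows sorted by key_column.

    Values are compared as strings, so numeric keys would need a cast
    in the sort key. Assumes the input has a header row.
    """
    reader = csv.DictReader(io.StringIO(text))
    rows = sorted(reader, key=lambda row: row[key_column])

    out = io.StringIO()
    writer = csv.DictWriter(out, fieldnames=reader.fieldnames, lineterminator="\n")
    writer.writeheader()
    writer.writerows(rows)
    return out.getvalue()


if __name__ == "__main__":
    # Hypothetical sample extract; a real load file would be read from disk.
    sample = "order_id,region\n3,EU\n1,US\n2,APAC\n"
    print(presort_csv(sample, "order_id"))
```

Loading rows that already arrive in key order is the general motivation here; an in-memory sort like this one only suits files that fit in RAM.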

Integrate and Wrangle Data for DWH & Analytics

IRI Voracity to leverage the multi-threaded, memory-optimized, and task-consolidating power of CoSort to perform ETL operations, and to act as a production analytics platform that prepares, packages, and even reports on data simultaneously. For more information, see the tabs under https://www.iri.com/solutions/data-integration/implement.

Migrate and Replicate Snowflake Databases

IRI NextForm Database Edition to acquire, re-map, re-format, and build/populate new tables during migrations to and from Snowflake. You can also use NextForm or the SortCL program in CoSort or Voracity to refresh, re-map, and convert data in Snowflake, and to produce custom reports, copies, and federated views of data.

Mask Data in Snowflake Columns

IRI FieldShield to mask sensitive data in Snowflake, like personally identifiable information (PII) or protected health information (PHI). FieldShield applies masking, encryption, and other de-identifying functions to one or more columns at a time. Use FieldShield to comply with privacy laws like HIPAA, PCI DSS, FERPA, and GDPR.
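To make column-level masking more concrete, here is a minimal Python sketch of two common de-identification functions applied per column: partial redaction and keyed pseudonymization with HMAC-SHA256. It is a generic illustration, not FieldShield's syntax or algorithms; the function names, column names, and the secret key are all hypothetical.

```python
import hashlib
import hmac


def redact_keep_last(value, keep=4, mask_char="*"):
    """Replace all but the last `keep` characters with mask_char."""
    if len(value) <= keep:
        return mask_char * len(value)
    return mask_char * (len(value) - keep) + value[-keep:]


def pseudonymize(value, secret_key, length=16):
    """Map a value to a stable hex token using HMAC-SHA256.

    The same input and key always give the same token, so masked
    columns can still be joined on; the token cannot be reversed
    without guessing inputs and knowing the key.
    """
    digest = hmac.new(secret_key, value.encode("utf-8"), hashlib.sha256).hexdigest()
    return digest[:length]


def mask_rows(rows, rules):
    """Apply a per-column masking function to each row; other columns pass through."""
    return [
        {col: rules[col](val) if col in rules else val for col, val in row.items()}
        for row in rows
    ]


if __name__ == "__main__":
    key = b"demo-only-key"  # hypothetical; a real key belongs in a secrets manager
    rows = [{"name": "Ada Lovelace", "ssn": "123-45-6789", "city": "London"}]
    rules = {
        "name": lambda v: pseudonymize(v, key),
        "ssn": redact_keep_last,
    }
    print(mask_rows(rows, rules))
```

Which function suits which column depends on the use case: redaction keeps a recognizable suffix for support staff, while keyed pseudonymization preserves joinability across tables.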

Generate Snowflake Test Data

IRI RowGen to populate Snowflake operations rapidly with safe test data. RowGen uses your data models to generate the test data automatically for an entire database with referential integrity. IRI RowGen, FieldShield, and subsetting operations are also tightly integrated with the ValueLabs Test Data Hub for test data management (TDM) in Snowflake.

Learn more about all of these mapping and masking options in the IRI Voracity data management platform, which includes these components.
