Pump Up Pentaho

 

Challenges

While Pentaho Data Integration (PDI) is a powerful tool for preparing and integrating data, it also has some shortcomings:

Slow Transforms

  • Native sorts and other transforms may not run fast enough at high data volumes

Limited De-ID Features

  • Cannot mask or encrypt data flowing through Kettle

Limited Test Data

  • Cannot prototype ETL jobs without using production data

Solutions

PDI workflows can run system commands, so data can be processed by external tools without disrupting existing jobs. IRI Voracity or its component software can help Pentaho users in the following ways:

Speed Transforms

  • Use PDI's Shell step to call an IRI CoSort job (e.g., a SortCL script) to dramatically reduce sort, join, and aggregation times
  • Run multiple jobs in one batch file
  • Get results 14-16 times faster than Pentaho alone

Blog: Using CoSort to Speed up the Sort Process in Pentaho
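
As a rough illustration of running several external jobs as one batch from a single Shell step, the Python sketch below runs a list of commands in order and stops at the first non-zero exit code, so the caller can detect a failure. The `run_batch` helper and the `.scl` file names are hypothetical, and the `sortcl /SPEC=...` command form is an assumption here; check the CoSort documentation for the exact invocation on your system.

```python
import subprocess
import sys
from typing import List, Sequence


def run_batch(commands: Sequence[Sequence[str]]) -> int:
    """Run each command in order; stop at the first failure and return its exit code."""
    for cmd in commands:
        result = subprocess.run(list(cmd))
        if result.returncode != 0:
            # Report which job failed so the calling step's log shows the cause.
            print(f"job failed ({result.returncode}): {' '.join(cmd)}", file=sys.stderr)
            return result.returncode
    return 0


# Hypothetical job list: the script names and the /SPEC= form are assumptions.
JOBS: List[List[str]] = [
    ["sortcl", "/SPEC=sort_orders.scl"],
    ["sortcl", "/SPEC=join_customers.scl"],
]
```

Ending the script with `sys.exit(run_batch(JOBS))` would pass the first failing exit code back to whatever launched it, which is how a calling step would typically notice the failure.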


Mask Your Data

  • Run IRI FieldShield jobs from the Shell step in Pentaho to protect data at rest
  • Mask, encrypt, encode, or otherwise protect data in the format you need
  • Secure data at the field level

Blog: Masking Data in Pentaho
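
To make "field-level" protection concrete, here is a generic Python sketch that masks two columns of each record and leaves the rest untouched. It is not FieldShield code and does not use FieldShield functions; the column names `ssn` and `email`, the helper names, and the choice of redaction plus a keyed SHA-256 hash are all assumptions made for illustration.

```python
import hashlib
import hmac
from typing import Dict, Iterable, Iterator


def redact(value: str, keep_last: int = 4, fill: str = "*") -> str:
    """Replace all but the last `keep_last` characters with `fill`."""
    if keep_last <= 0:
        return fill * len(value)
    visible = value[-keep_last:]
    return fill * (len(value) - len(visible)) + visible


def pseudonymize(value: str, key: bytes, length: int = 16) -> str:
    """Map a value to a stable token: the same input and key give the same token."""
    digest = hmac.new(key, value.encode("utf-8"), hashlib.sha256).hexdigest()
    return digest[:length]


def mask_rows(rows: Iterable[Dict[str, str]], key: bytes) -> Iterator[Dict[str, str]]:
    """Mask only the sensitive fields (hypothetical 'ssn' and 'email' columns)."""
    for row in rows:
        masked = dict(row)  # copy so the input record is not modified
        masked["ssn"] = redact(row["ssn"])
        masked["email"] = pseudonymize(row["email"], key)
        yield masked
```

Because the keyed hash is deterministic, the same email maps to the same token in every file, so masked columns can still be used as join keys.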


Test Your Apps

  • Run IRI RowGen to populate tables, files, and reports with synthetic test data that mimics production data
  • Generate structurally and referentially correct DB test data for an entire enterprise data warehouse (EDW)
  • Keep production data safe

Blog: Creating Test Data for Pentaho
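
"Referentially correct" means that every foreign key in the generated data points at a row that actually exists. The Python sketch below shows that idea in miniature with two hypothetical tables, customers and orders. It is a generic illustration, not RowGen syntax, and all table and column names are assumptions.

```python
import random
from typing import Dict, List, Tuple

Row = Dict[str, object]


def generate_test_data(
    n_customers: int, n_orders: int, seed: int = 0
) -> Tuple[List[Row], List[Row]]:
    """Build synthetic parent and child rows where every order references a real customer."""
    if n_customers < 1:
        raise ValueError("need at least one customer to reference")
    rng = random.Random(seed)  # seeded so repeated runs produce the same data
    customers: List[Row] = [
        {"customer_id": i, "name": f"Customer {i:04d}"}
        for i in range(1, n_customers + 1)
    ]
    orders: List[Row] = [
        {
            "order_id": j,
            # Draw the foreign key from the generated parents only.
            "customer_id": rng.choice(customers)["customer_id"],
            "amount": round(rng.uniform(5.0, 500.0), 2),
        }
        for j in range(1, n_orders + 1)
    ]
    return customers, orders
```

No production values are read at any point, which is what keeps production data out of the test environment.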
