Sorting & Transformation, Data Masking & Testing
Challenges:
PowerCenter transforms of very large data volumes can run slower than desired, even after consulting and tuning are employed. Bottlenecks may occur during large sort, join, aggregation, load, or unload operations.
"Pushdown optimization" to other tools may not be much faster and, at a minimum, shifts the burden onto a database or a more expensive and complex platform.
Another serious need is the protection of sensitive production data moving through Informatica data warehouse ETL or data mart operations. You may need to apply role-based data protections, or generate large volumes of realistic, referentially correct test data to prototype and populate certain applications and targets.
Solutions:
1) Faster Sorts
With CoSort, you can dramatically speed up sorting directly within Informatica using CoSort's unique plug-and-play Sorter TX AEP (for PowerCenter 7) or CT (for v8). This seamless CoSort component has improved PowerCenter sort performance by up to 10X with no interface changes. Subsequent join, aggregation, and load runtimes should also benefit.
CoSort vs. Informatica Sort Benchmarks
Fixed-key, ASCII Sorting on 4-CPU IBM p650
| Input >>      | 26.7MB | 267MB  | 2.67GB  |
| Sorter Tx     | 8s     | 1m 48s | 20m 35s |
| CoSort CT     | 3s     | 16s    | 2m 1s   |
| CoSort SortCL | 1s     | 7s     | 1m 19s  |
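To put the benchmark figures in perspective, the short Python sketch below converts the published timings to seconds and computes how many times faster each CoSort option ran than the native Sorter. The timings are copied from the table; the helper names (`to_seconds`, `speedups`) are illustrative only and are not part of any CoSort or Informatica interface.

```python
# Timings transcribed from the benchmark table (fixed-key ASCII sort, 4-CPU IBM p650).
SIZES = ["26.7MB", "267MB", "2.67GB"]
TIMINGS = {
    "Sorter Tx": ["8s", "1m 48s", "20m 35s"],
    "CoSort CT": ["3s", "16s", "2m 1s"],
    "CoSort SortCL": ["1s", "7s", "1m 19s"],
}


def to_seconds(text):
    """Convert a duration such as '20m 35s' or '8s' into whole seconds."""
    total = 0
    for part in text.split():
        if part.endswith("m"):
            total += int(part[:-1]) * 60
        elif part.endswith("s"):
            total += int(part[:-1])
        else:
            raise ValueError(f"unrecognized duration part: {part!r}")
    return total


def speedups(baseline, candidate):
    """Return baseline/candidate runtime ratios, one per input size."""
    return [
        to_seconds(b) / to_seconds(c)
        for b, c in zip(TIMINGS[baseline], TIMINGS[candidate])
    ]


for name in ("CoSort CT", "CoSort SortCL"):
    for size, ratio in zip(SIZES, speedups("Sorter Tx", name)):
        print(f"{name:<14} {size:>7}: {ratio:.1f}x faster than Sorter Tx")
```

On the largest input, the CoSort CT ratio is 1235 seconds against 121 seconds, or roughly 10 times faster, which is consistent with the "up to 10X" figure quoted above.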
2) Push Out Optimization
To speed up transforms, reports, and field-level protections in general, consider running CoSort Sort Control Language (SortCL) programs alongside your PowerCenter or PowerMart operations. The American Stock Exchange uses CoSort as a "push out optimization" solution to triple runtime performance.
With CoSort, you can easily run large sorts, joins, aggregations, and loads in the file system, where they can run much faster. Plus, CoSort allows you to convert file and data types, protect at-risk fields with encryption, etc., and generate custom reports, all at the same time (in the same job script and I/O pass).
3) Data Masking
Data at rest in tables and flat files that PowerCenter and PowerMart work with can be sensitive, containing personally-identifying information that is subject to confidentiality restrictions and data privacy laws. IRI's data-centric security product, FieldShield, can protect fields in structured datasets in any ODBC-connected database or supported file format.
Your business rules dictate the protection you apply to each column, for example format-preserving AES-256, OpenSSL, or GPG encryption, lookup-value substitution (pseudonymization), character masking, custom expression logic, or a user field function.
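To make the idea of per-column rules concrete, here is a minimal Python sketch of two of the techniques named above: character masking and lookup-value substitution. It is not FieldShield syntax or its API; every name in it (`mask_characters`, `Pseudonymizer`, `protect`, and the sample columns) is hypothetical and chosen only for illustration.

```python
def mask_characters(value, keep_last=4, mask_char="*"):
    """Character masking: hide all but the last `keep_last` characters."""
    if keep_last <= 0:
        return mask_char * len(value)
    visible = value[-keep_last:]
    return mask_char * (len(value) - len(visible)) + visible


class Pseudonymizer:
    """Lookup-value substitution: each original value is consistently
    replaced by the same substitute drawn from a fixed pool."""

    def __init__(self, substitutes):
        self._pool = list(substitutes)
        self._lookup = {}

    def substitute(self, value):
        if value not in self._lookup:
            if len(self._lookup) >= len(self._pool):
                raise ValueError("substitute pool exhausted")
            # Assign the next unused substitute to this new value.
            self._lookup[value] = self._pool[len(self._lookup)]
        return self._lookup[value]


def protect(row, rules):
    """Apply the rule registered for each column; leave other columns as-is."""
    return {col: rules[col](val) if col in rules else val for col, val in row.items()}


names = Pseudonymizer(["Alex Doe", "Sam Roe", "Pat Poe"])
rules = {"ssn": mask_characters, "name": names.substitute}

row = {"name": "Jane Smith", "ssn": "123-45-6789", "state": "FL"}
print(protect(row, rules))
```

Keeping the rules in a column-to-function mapping mirrors the point made above: the business rule, not the data flow, decides which protection each field receives. Because the substitution is consistent, the same original name maps to the same pseudonym wherever it appears.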
4) Test Data Generation
Do you need test data for Informatica ETL prototyping? Consider IRI's test data package called RowGen. With RowGen, you can build realistic, referentially correct test data to populate target tables, data marts, flat files, and production reports, while leveraging your database data model (.DDL) files and Informatica metadata.
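As a rough illustration of what "referentially correct" means, the Python sketch below generates a parent table and a child table in which every foreign key points at an existing parent row. It is not RowGen and does not read DDL or Informatica metadata; the table and column names (`customers`, `orders`, `customer_id`) and the function `generate_test_data` are assumptions made up for this example.

```python
import random


def generate_test_data(n_customers, n_orders, seed=0):
    """Build synthetic customers and orders with valid foreign keys."""
    rng = random.Random(seed)  # seeded so runs are repeatable

    # Parent table: one row per customer with a unique primary key.
    customers = [
        {"customer_id": i, "name": f"Customer {i:04d}"}
        for i in range(1, n_customers + 1)
    ]
    customer_ids = [c["customer_id"] for c in customers]

    # Child table: each order references a customer_id that exists above.
    orders = [
        {
            "order_id": j,
            "customer_id": rng.choice(customer_ids),
            "amount": round(rng.uniform(5.0, 500.0), 2),
        }
        for j in range(1, n_orders + 1)
    ]
    return customers, orders


customers, orders = generate_test_data(n_customers=5, n_orders=12)

# Referential integrity check: no order may point at a missing customer.
known_ids = {c["customer_id"] for c in customers}
orphans = [o for o in orders if o["customer_id"] not in known_ids]
print(f"{len(customers)} customers, {len(orders)} orders, {len(orphans)} orphans")
```

Drawing child keys only from the keys already generated for the parent is what keeps joins and lookups in a prototype ETL flow behaving as they would against production data.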
Informatica XML Metadata Compatibility!
Through tools like RapidACE and the Meta Integration Model Bridge (MIMB), you can reuse the .xml data layouts you already have in Informatica for CoSort (transformation), FieldShield (data masking), and RowGen (test data) operations!
See also:
FAQ > Informatica
Solutions > Data Transformation
Solutions > Field Protection
Solutions > Business Intelligence
Products > CoSort > Sort PlugIns
Products > CoSort > SortCL
Products > FieldShield (Data Masking)
Products > RowGen (Test Data)
CoSort Brochure for Informatica Users