Self-Directed Learning Site for IRI Software
This page links you to free, self-help content on the many data management objectives that IRI software products can help you achieve.
Please note that:
- All IRI software users, from freemium to licensed subscribers, can access this material. Only supported users, however, can get help from IRI beyond this content.
- Your IRI software product or licensing agreement may or may not support a given function, or entitle you to support for it, even if the feature itself is enabled in your product. For more information, please refer to this product-feature matrix.
- Additional training materials, custom-designed for your company's requirements and for certification-oriented learning, are available on request. Please describe your material requirements below and an IRI services representative will contact you.
IRI Product Installation
- All IRI Workbench & SortCL-Compatible Software
- CoSort v10 Upgrade Advice
- IRI CellShield PE & EE V2
- SQL# & JDBC SQL Trail Proxy DDM (pending)
- Updating IRI Workbench
- IRI Workbench Demo Projects in Git
- Connecting to RDBs (via O/JDBC)
- Connection Registry
- DSN Files
- Multi-Table Filtering
- Connecting to NoSQL DBs
- Cassandra
- using Flat (CSV) File Import/Export
- using Native Driver, All Collections
- Elasticsearch
- MarkLogic
- MongoDB
- for FieldShield (Flat Collections)
- for DarkShield (All Collections)
- Cassandra
IRI Workbench - General
- Getting Started
- Job Design Methods
- New Job Wizards (see Welcome > First Steps)
- Dialogs (use ? for in-context Help)
- Script Editor & Outline (see product manuals for syntax)
- Mapping Diagrams (see Work Flow from the Palette below)
- erwin Mapping Manager
- Java apps via Gulfstream API
- Flow Design (how-to articles for Voracity ETL and other batch jobs)
- Job Deployment
- Command Line / Wicked Shell
- Single Job Execution Options
- Batch Job (Work Flow) Execution Options
- Remote Connection/Executions
- Running Voracity Jobs in Hadoop
- Running Voracity Jobs with KNIME
- Cloud Considerations (pending)
- Amazon EC2
- Oracle Cloud Infrastructure
- MS Azure
- Google Cloud
- Job Scheduling
Data Discovery & Classification
- DB Profiling (includes 1+ table searching)
- Flat-File Profiling
- NoSQL DB Data Class Search (and Mask)
- Structured Metadata Discovery
- IRI Metadata Search (via Git)
- Structured (DB & Flat-File) Data Classification
- Directory Data Class Search (structured files)
- Schema Data Class Search (all tables in 1 or more RDB schemas)
- Schema-wide Pattern Search (all tables in an RDB schema)
- Unstructured File Search, Extract, Structure & Profile (text, documents, images, faces)
Data Integration
- Data Integration Architectures (and Voracity DI paradigms)
- Enterprise Data Warehouse
- Logical Data Warehouse
- Operational Data Store / Enterprise Data Hub
- Production Analytic Platform (4-part series)
- Data Lake
- ETL vs. ELT
- ETL Job Design (see IRI Workbench > Flow Design above)
- ETL Execution (see IRI Workbench > Job Deployment above)
- Faster Extraction
- Data Transformation
- Date Masking
- Filtering
- Video: Sort/Join/Aggregate
- Video: Inner Join
- Video: Left Outer Join
- Video: Sort
- z/OS MVS & VSE JCL (ICEMAN) conversions
- Links to other plug'n'play sort replacements
- CoSort v10 Best Practices
- Video: Summary Aggregation
- Row-Column Pivoting
- Set Lookups
- Fast Loading
- Change Data Capture
- Slowly Changing Dimensions
- DB-Specific Optimization (see tabs)
- Video: Voracity ETL Workflow (wizard mode; see Flow Design above)
- Voracity ETL Job Preview (via live or test data)
- Legacy ETL Optimization
- Legacy ETL Tool Migration (via erwin Smart Connector)
Data & Database Migration
- File-Format Conversion
- Database Migration
- Database Subsetting
- Basic Data Replication
- Incremental Data Replication
- Data Federation
- Schema (Relational to Star) Migration
- Vision File Conversion
- XML (Complex) Parse/Process
Data Governance
- Data Masking
- Dynamic Data Masking (DDM)
- Static Data Masking (SDM)
- Which IRI Data Masking Product Should I Use?
- Getting Started with FieldShield
- Which Data Masking Function Should I Use?
- Multi-Table RDB Masking (with Referential Integrity)
- Rule-based using like-named columns only
- Rule-based using data classes (better)
- Video Series: Multi-table masking tutorials (2020)
- Data Classification & Discovery (see links above)
- Applying masking rules to classified data
- Data Class DB Masking Wizard
- Video Series: Multi-table masking tutorials (2020)
- Multi-Flat-File Masking (in 1 or more directories)
- Applying Field Rules Using Classification
- Structured & Semi-Structured Data Sources
- Amazon S3 file buckets via DarkShield
- Apache Cassandra
- CSV Files
- Dates & Ages
- Elasticsearch via DarkShield
- Excel Spreadsheets
- via CellShield Personal Edition (PE)
- via CellShield Enterprise Edition (EE)
- See v2 Release Information above
- Video: via DarkShield
- via FieldShield (pending)
- HL7 (or see X12 via DarkShield, below)
- JSON & XML Files via DarkShield path filtering
- Live Feeds
- MongoDB
- NIDs & SSNs
- NRIC
- Oracle & other RDBs
- via FieldShield (for structured/1NF columns only, with classification & search)
- via DarkShield (with structured & unstructured - C/BLOB, XML, text - columns)
- Pentaho
- PostgreSQL
- SAP HANA (subset and mask)
- Salesforce
- Snowflake DB
- Splunk
- Web Logs
- Video: X12 via DarkShield
- Re-ID Risk Determination (for HIPAA Expert Determination Method security rule)
- Unstructured (Dark) Data Sources
- Getting Started with DarkShield (GUI)
- Finding & Masking PII in Text/EDI files, MS Office, etc.
- DarkShield CLI SDK
- DarkShield RPC API - Plankton Web Services Framework
- DarkShield Base API (for all sources, silos, and feeds)
- DarkShield Files API (supplement for text, PDFs, images, etc.)
- Analyzing DarkShield Search/Mask Results in Splunk
- Universal Forwarder (sending search/mask logs to Splunk)
- Adaptive Response to log events in Splunk ES
- Invoking DarkShield from a Splunk Phantom Playbook
- Data Quality
- Data Quality Rule Wizards
- Filter & De-Duplication
- Fuzzy Searching
- Data Validation
- Data Unification (Homogenization, Reconciliation)
- Finding Business Rule Violations
- Finding Format Errors
- Master Data Management (MDM)
- MD Consolidation
- MDM Registry (pending)
- Alternative (via Git)
- Metadata Management
- Teamwork (via Git)
- Version Control
- Lineage (via Git)
- Lineage & Impact Analysis (via erwin/ADS)
- Data & Metadata Lineage (built-in, pending)
- Asset Security (via Git)
- Teamwork (via Git)
- Role Based Access Controls (RBAC)
- AD-compatible IAM (pending)
- Test Data Management (TDM)
- Test (Synthetic) Data Generation, for:
- RDBs via data masking (see SDM above)
- RDBs via auto-parse/populate/generate (synthesis)
- RDBs via masked table subsets (DB subsetting)
- Single files via the job wizard
- Specific formats or data classes
- COBOL files
- Customer (transaction) data
- Personally Identifiable Information (PII) (fake PII for DevOps)
- Credit Card Numbers
- NIDs (US & Korea SSNs, Italy CF, Netherlands BSN)
- UUIDs/GUIDs
- Cassandra
- MongoDB
- Teradata
- Weighted distributions
- Java & Hadoop (API)
Analytics & BI
- Production Analytic Platform (4-part series)
- Embedded BI
- see report examples in CoSort manual (SortCL program)
- IoT: Aggregation on the Edge
- Change Data Capture
- Slowly Changing Dimensions
- Predictive Analytics
- Clickstream Analytics
- BIRT integration
- Datadog Integration
- Splunk Integrations
- Data Wrangling for Other BI & Analytic Tools
- for Cubeware (pending)
- for DWDigest
- for IBM Cognos
- for Microsoft Power BI
- for Oracle Visualization Desktop
- for KNIME