There are a number of business intelligence tools available today that can transform raw data into meaningful information. Because this process can be complex and involve large volumes of data, however, it makes sense to use the right technologies at each step: tools and techniques that combine well to deliver the fastest, most accurate results for business decision making, and that make metadata management and report design simpler and more efficient.
Founders of modern data warehousing like Ralph Kimball have long recommended preparing big transaction data outside the database and ETL tool layer, in sequential (flat) files. This approach removes a regular, resource-draining overhead from systems designed for queries and integrations, and provides a central, platform-independent way to exchange and manage data. The same approach for business intelligence (BI) tools, called data franchising, has been espoused for years by experts like Richard Sherman for logically related reasons (see "Data Franchising" on Information Management). In addition to better performance, Sherman points out that pre-staging the data to be visualized avoids the inherent metadata complexity, redundancy and synchronization problems of performing transformations in the BI layer.
One of the best examples of BI optimization through technology combination is the use of IRI CoSort to franchise, or prepare, high volumes of transactional data for data visualization outside Cognos. IBM’s Cognos tool delivers visual information on cumulative financial performance, strategy management and business intelligence, typically for analytic purposes. CoSort’s Sort Control Language (SortCL) data transformation program simultaneously filters, sorts, joins, aggregates, segments and reformats data into subset CSV and other popular ‘feed’ files outside the BI layer.
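To make the idea of franchising concrete, here is a minimal Python sketch of the same kind of pre-staging: filtering, joining, aggregating and sorting two flat files into a small CSV feed before any BI tool touches the data. It is not SortCL and does not reflect CoSort's syntax or performance; the field names (cust_id, amount, region), the sample rows and the function names are all hypothetical, chosen only for illustration.

```python
import csv
import io
from collections import defaultdict


def stage_feed(transactions, customers, min_amount=0.0):
    """Filter, join, aggregate and sort rows into a small feed table.

    All field names here are hypothetical, not taken from the benchmark.
    """
    # Lookup table for the join: customer id -> region.
    region_by_id = {row["cust_id"]: row["region"] for row in customers}

    totals = defaultdict(float)
    for row in transactions:
        amount = float(row["amount"])
        if amount < min_amount:
            continue  # filter: skip rows below the threshold
        region = region_by_id.get(row["cust_id"])
        if region is None:
            continue  # inner join: drop transactions with no matching customer
        totals[region] += amount  # aggregate: sum per region

    # Sort by region and format the totals for the feed file.
    return [{"region": r, "total": f"{t:.2f}"} for r, t in sorted(totals.items())]


def write_feed(rows, out):
    """Write the staged rows as CSV to any file-like object."""
    writer = csv.DictWriter(out, fieldnames=["region", "total"], lineterminator="\n")
    writer.writeheader()
    writer.writerows(rows)


# Tiny in-memory stand-ins for the two source flat files.
TX = "cust_id,amount\n1,10.00\n2,5.50\n1,2.50\n3,7.00\n"
CU = "cust_id,region\n1,East\n2,West\n"

feed = stage_feed(csv.DictReader(io.StringIO(TX)), csv.DictReader(io.StringIO(CU)))
buf = io.StringIO()
write_feed(feed, buf)
print(buf.getvalue())
```

The point of the sketch is the division of labor: the reporting tool would read only the small, already-aggregated feed, rather than re-running the sort, join and aggregation each time a report is built.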
As you would expect, staging data in flat files with CoSort avoids the data integrity and reconciliation issues Sherman describes, especially within a modern, metadata-managing, data-federating GUI like the IRI Workbench, built on Eclipse. Moreover, when high volumes of data are prepared in advance with CoSort, Cognos users can get their information in half the time, or less, than it would take using Cognos alone.
A simple laptop-based benchmark running Cognos under Windows XP SP3 on an Intel® Core™ CPU M370 @2.4GHz with 3 GB of DDR3 memory bears this out. When the necessary data transformations of sorting, joining, and aggregation were performed in Cognos to create a report based on source data in 345 MB and 89.2 MB flat files, the entire job took 158 seconds.
Accelerating Cognos with SortCL is straightforward, especially if you are familiar with Eclipse, or at least with the structure of your source data. The IRI Workbench GUI for CoSort users is a convenient, graphical environment for discovering, defining, managing, and transforming data in flat files and/or RDBMS tables with SortCL. But whether you define the SortCL data definitions or job specifications in text files or with the GUI, the self-documenting nature of the 4GL makes it easy to create and maintain both forms of metadata. The free GUI merely automates many of these tasks.
The same transforms, run in a single CoSort Sort Control Language (SortCL) program, took 21 seconds on the same laptop, with the results available to multiple applications. In this case, the Cognos user who created the same report (which took 158 seconds in Cognos alone) needed only another 58 seconds to run it against the pre-staged data, meaning that the start-to-finish time using CoSort was only 79 seconds.
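The arithmetic behind that comparison can be checked in a few lines of Python. The three timings are the ones reported above; the variable and function names are only illustrative.

```python
# Timings reported in the benchmark, in seconds.
COGNOS_ONLY_S = 158    # sort, join and aggregate inside Cognos, plus the report
SORTCL_PREP_S = 21     # the same transforms in one SortCL job
COGNOS_REPORT_S = 58   # Cognos report run against the pre-staged data


def staged_total(prep_s, report_s):
    """Start-to-finish time when the data is prepared before reporting."""
    return prep_s + report_s


total_s = staged_total(SORTCL_PREP_S, COGNOS_REPORT_S)
saved_s = COGNOS_ONLY_S - total_s
speedup = COGNOS_ONLY_S / total_s

print(f"staged total: {total_s} s, saved: {saved_s} s, speedup: {speedup:.1f}x")
```

Because the staged file is available to multiple applications, the 21-second preparation step would presumably be paid once rather than once per report, so the per-report saving could be larger whenever several reports draw on the same feed.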
For more information on using CoSort for standalone or accelerated business intelligence, please see the Business Intelligence Solutions page on IRI.com (http://www.iri.com/solutions/business-intelligence) and the other articles in the BI section of the IRI blog site.