India's Leading Telco Relies on CoSort to Meet Billing SLAs
Arun Verma, Decision Support System Analyst for Reliance Communications.
Established in 2004 as a subsidiary of the Reliance Group, Reliance Communications is an Indian broadband and telecommunications company headquartered in Navi Mumbai, India. With over 150 million subscribers, Reliance is India’s second-largest telecom operator, and the 15th-largest in the world. Reliance has several major business segments, including wireless, broadband, national and international long distance, wholesale operations of its subsidiaries, investment activities of group companies, customer care activities, and direct-to-home (DTH) activities. To meet the requirements of its service level agreements (SLAs) in billing and analytics for the wireless (mobile) and global (landline) segments alone, Reliance Communications’ Decision Support team must process and report on hundreds of millions of call detail records (CDRs) within a narrow daily batch window.
Reliance maintains a large data warehouse environment to handle ETL, ELT, and reporting processes around IBM DataStage (for ETL), Oracle (the main RDB), and Business Objects (primary BI). These systems, as well as CoSort, all run on large, 64-bit Solaris servers.
The daily CDR volumes from Reliance subscribers originate as binary telco switch data, which Intec software ‘mediates’ into hundreds of flat files. CoSort transforms these files alongside DataStage, before subsequent ETL processing in DataStage and reporting in Business Objects. Prior to the implementation of CoSort more than 10 years ago, however, pilot programs to integrate and stage all the source data were not successful. Slow and inaccurate results meant that Reliance could not meet its SLAs, especially as CDR volumes grew. But once Reliance began to use CoSort’s Sort Control Language (SortCL) program scripts to filter, sort, join, and aggregate flat files in the 30-60GB range, the processing bottleneck disappeared, and the results of the file transforms were always accurate. Downstream ETL and BI processes now succeed, and SLA requirements are easily met.
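For readers unfamiliar with this kind of staging job, the filter, sort, join, and aggregate steps can be illustrated with a minimal Python sketch. This is not SortCL and not Reliance's actual job logic; the field names (caller, duration_sec, status, segment) and the sample records are hypothetical, and an in-memory approach like this would not scale to files in the 30-60GB range.

```python
import csv
import io
from collections import defaultdict

# Hypothetical flat-file contents; the real mediated CDR layouts are not described in the source.
CDR_CSV = """caller,duration_sec,status
9003,120,OK
9001,60,OK
9002,30,DROPPED
9001,45,OK
"""

SUBSCRIBER_CSV = """caller,segment
9001,wireless
9002,wireless
9003,global
"""


def transform(cdr_text, subscriber_text):
    """Filter, sort, join, and aggregate CDR rows; return total seconds per segment."""
    cdrs = list(csv.DictReader(io.StringIO(cdr_text)))
    segments = {
        row["caller"]: row["segment"]
        for row in csv.DictReader(io.StringIO(subscriber_text))
    }

    # Filter: keep only completed calls.
    completed = [row for row in cdrs if row["status"] == "OK"]

    # Sort: order the surviving rows by caller.
    completed.sort(key=lambda row: row["caller"])

    # Join and aggregate: look up each caller's segment, then sum durations per segment.
    totals = defaultdict(int)
    for row in completed:
        segment = segments.get(row["caller"])
        if segment is None:
            continue  # unmatched callers are dropped, as in an inner join
        totals[segment] += int(row["duration_sec"])
    return dict(totals)


if __name__ == "__main__":
    print(transform(CDR_CSV, SUBSCRIBER_CSV))
```

With the sample data above, the dropped call is filtered out and the remaining durations are summed by segment.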
CoSort is tremendously powerful when it comes to high-volume data transformation. It is simply the only way we can do the ‘heavy lift’ of big file sorts, joins, and aggregations within our batch windows, without throwing increasingly large amounts of hardware at the problem. CoSort’s SortCL (4GL) program is also comparatively simple, and it is much easier to code and maintain than equivalent SQL and ETL programs.
Error messaging in CoSort 8.1.3 is not as effective as we’d like; for example, codes relating to out-of-workspace conditions were not being returned to shell programs calling SortCL programs. We understand the more current 9.5.2 release offers enhanced error traps and reporting, options for work space re-allocation, and more runtime monitoring and logging.
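One generic way to guard a calling program against this kind of gap is to check free work space before launching a batch job and to pass the job's exit code back to the caller. The sketch below assumes nothing about SortCL itself: the command is supplied by the caller, and the function name run_job, the min_free_bytes threshold, and the exit code 75 are hypothetical choices made for illustration.

```python
import shutil
import subprocess
import sys

# Hypothetical exit code used by this sketch to signal "not enough work space".
INSUFFICIENT_SPACE = 75


def run_job(command, work_dir=".", min_free_bytes=0):
    """Check free space in work_dir, run the command, and return its exit code."""
    free = shutil.disk_usage(work_dir).free
    if free < min_free_bytes:
        print(
            f"insufficient work space in {work_dir}: {free} bytes free",
            file=sys.stderr,
        )
        return INSUFFICIENT_SPACE

    # Run the batch command; a non-zero return code is reported and passed through.
    result = subprocess.run(command)
    if result.returncode != 0:
        print(f"job failed with exit code {result.returncode}", file=sys.stderr)
    return result.returncode


if __name__ == "__main__":
    # Usage: python run_job.py <command> [args...]
    sys.exit(run_job(sys.argv[1:]))
```

A scheduler or shell script calling this wrapper can then branch on the returned code rather than relying on the job's own messages.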
Reliance needed a tool to manipulate the sheer volumes of CDR data faster, and only CoSort could do it in time to meet our SLA commitments. We scoured the market for bulk data transformation engines and, to this day, we find that CoSort still provides the best price/performance available for file-based integration and staging tasks in the data warehouse.
Reliance uses CoSort’s SortCL program and has more than 100 job scripts in production. As we upgrade to the next release, we will look at the free IRI Workbench GUI (built on Eclipse) for the purpose of managing and expanding our use of SortCL programs. We are also excited by some of the newer features in the 9.5 release, including the ability to mask and encrypt data, and produce referentially correct test data for prototyping and benchmarking.
Since 2003, Reliance has enjoyed an unusually good relationship with this vendor. Specifically, IRI has been very responsive to our technical support and license migration requests. We appreciate that IRI’s support has remained as consistent and reliable through the years as our CoSort operations.
There are dozens of commands and hundreds of options in SortCL which are explained in a lengthy manual we refer to, but do not need to study. The self-documenting nature of the SortCL language is such that we can easily understand the job flows, metadata, manipulation functions, and target mappings, simply by looking at the scripts.