Innovative Routines International (IRI), Inc. (The CoSORT Company) respects your time and privacy. You can stop or start future quarterly mailings at any time using the contact details at the bottom of this message. Please forward this newsletter to anyone interested in IRI, our high-performance data manipulation and management tools, or open systems data processing technology.


The CoSORT Journal: Data Sorting and ETL News
Quarter 4, 2006
In this issue:

Zune Drawing via Short Survey
Multi-Core & VPAR Licensing
CoSORT Opens in Thailand
Tech Tip: Protecting Data at Risk

Short Industry Survey - Please Take a Minute

In an ongoing effort to provide fast, inter-connected data processing solutions, Innovative Routines International (IRI) periodically surveys current and prospective industry partners to stay up-to-date on current application environments and future data processing requirements.

The survey data may be aggregated for trend analysis, but individual responses are always considered (and never forwarded outside IRI). Please follow the link above to complete this very quick survey and enter the raffle for a 30GB Microsoft Zune player. You need not be a current IRI customer, and your feedback is valuable. Thank you!

Multi-Core & Virtual Partition Licensing

IRI's main software products - CoSORT, FACT and RowGen - are licensed for perpetual use at one-time prices based on the performance and capacity of the computer nodes on which they run.

Just as CoSORT can directly exploit multiple CPUs to improve high-volume data transformation performance, it can distribute jobs across multiple cores on a single CPU. The licensing mechanism does not distinguish between cores and CPUs, so each site still decides how many cores and CPUs to license, based on benchmark tests run during installation and proof of concept. Multi-core discounts may be available based on volume and performance, and you should discuss configuration particulars and test results with your IRI agent.

Similarly, with virtual partitions, the licensing mechanism cannot distinguish between physical and logical nodes, and IRI's applications are often required on several partitions for different purposes (e.g., production, development, test, QA, and fail-over). For these platforms, base license fees are already lowered to reflect the likelihood of multi-domain or LPAR licensing. Multi-partition discounts apply in simultaneous procurements. You should discuss your configuration particulars and node usage with your IRI agent.

The next challenge will be systems that can dynamically allocate resources. Keep reading these newsletters.

IRI-CoSORT Now Open in Thailand

Suvitech is an IT services company headquartered in Bangkok with operations in the Asia Pacific region. The company is certified by the Board of Investment under the Thai Ministry of Industry, and is a registered consultant with the Ministry of Finance. Suvitech provides specific business solutions, systems integration and consulting services, especially for telco and finance companies. Suvitech can integrate and resell CoSORT alongside its telecom traffic analytic and electronic billing applications.

About IRI, Inc. and The CoSORT Journal
  • CoSORT solutions serve data and data warehouse (ETL) architects, very large database (VLDB) administrators, mainframe sort migrators, and independent software vendors (ISVs) building faster sorting and data transformation into their applications.
  • CoSORT delivers the IT industry's fastest parallel UNIX sort engine and one of its most powerful flat-file manipulation and reporting programs, SortCL, which combines: row filtering and conditional selection, sort/merge and joins, drill-down aggregation and cross-row calculation, conversion and collation of more than 100 data types, database sequencing, and multi-target, multi-level output reformatting for reports, hand-offs, and DB load utilities.
  • Other special CoSORT features include: coroutine sort architecture; fully tunable and scalable parallel sort performance on all multi-CPU UNIX and Windows servers; cross-calculation on aggregated values and aggregation on cross-calculated values; cross-table joins (matching) integrated with data conversion and expression logic; multinational date and timestamp support; cross-platform Java GUI; and e-commerce reporting via CSV/CLF and IP address manipulation, plus ELF input and HTML output.
  • CoSORT also has plug-n-play replacements or parameter converters for sorting in: ACUCOBOL-GT; Amdocs Ensemble (telecom billing); Ascential DataStage; Informatica PowerCenter and PowerMart; Cincom Supra; IBM's DB2 loader and MVS/VSE sorts; MF COBOL Workbench, Net and Server Express; SAS System; Software AG Natural; Sun MRP; and UNIX SVR4 (/bin/sort), et al.
  • IRI has begun to offer other data manipulation and management solutions, such as: FACT for fast unloads from Oracle; RowGen for custom data generation and format simulations; netCONVERT for mainframe tape data conversion and reformatting; x-PRESS for fast and secure data compression and decompression; Logon for controlling and auditing access to UNIX systems; and Permitas for licensing and activating software applications.
  • The CoSORT Journal is a quarterly Email newsletter designed to keep subscribers updated on salient news and events at IRI, Inc. Past editions are archived here.

To remove or add an Email address in future CoSORT Journal mailings, please email news@iri.com. To contact an IRI agent, call 1-800-333-SORT or email info@iri.com.

Copyright © 2006 Innovative Routines International (IRI), Inc.
2194 Highway A1A, Suite 303
Melbourne, FL 32937-4932 USA
All rights reserved.

Ripped from the Headlines - Private Data at Risk

Did you know that more than 100 million records containing the personal information of U.S. residents have been exposed due to security breaches since just February of 2005? This information is tallied and posted online by the Privacy Rights Clearinghouse.

Many breaches are reported each month. For example:

02'05 - DB Company - 163K records - identity thieves
03'05 - University - 120K records - hacking
04'05 - State Agency - 464K records - bad insider
05'05 - Cable Company - 600K records - lost backup
06'05 - Credit Card Co. - 40M records - hacking
07'05 - University - 49K records - hacking
08'05 - University - 100K records - hacking
09'05 - School Loan Agency - 165K records - lost CD
10'05 - Hospital - 130K records - lost backup
11'05 - Conglomerate - 161K records - stolen laptop
12'05 - Automobile Co. - 70K records - stolen computer
01'06 - Health Care Co. - 365K records - backup theft
02'06 - Federal Agency - 38K records - stolen laptop
03'06 - Credit Card Co. - 17.7M+ records - insider or malware
04'06 - University - 300K records - hacking
05'06 - Federal Agency - 30M records - laptop (recovered)
06'06 - Insurance Co. - 930K records - stolen computer
07'06 - Federal Agency - 100K records - posted
08'06 - Energy Co. - number unknown - stolen laptop
09'06 - Conglomerate - 50K records - stolen laptop
10'06 - City Agency - 1.35M records - hacking
11'06 - Beverage Co. - 60K records - misplaced laptops
12'06 - University - 800K records - database hacked

The above incidents are only a fraction of what is posted at the Privacy Rights Clearinghouse web site. Among the personal data elements exposed are names, home and email addresses, telephone, credit card, and/or social security numbers.

Personal information can be pre-screened for sensitivity through a reasonable data governance effort. Private files and fields can be de-identified so that future security breaches cannot compromise personal identities.

IRI software can assist in this task in several ways. Consider, for example, that RowGen users can generate safe test data in the same field forms and file formats as real data. This data is safe for outsourcing, and ideal for application prototyping, stress-testing and benchmarking.

If you must use and distribute real, but confidential, production data, you can still easily de-identify it. See the "Tech Tip" below for three simple ways you can protect data at the field level, while at the same time transforming and reporting on it, in one I/O pass and one CoSORT Sort Control Language (SortCL) job script.

 
Tech Tip: 3 Ways to Mask Sensitive Field Data

CoSORT SortCL users formatting flat files or loading database tables can protect sensitive fields in their production data while simultaneously processing (transforming) and presenting (reporting).

In the sample input below, there are 5 data fields:

ID SSN       Last Name  Salary St
01 330170363 Clay       56,650 CT
02 421901269 Guerrero   15,000 MD
03 529433545 Caldwell   41,100 NY
04 129737773 Puckett    44,550 NY
05 594521240 Lindsey    55,800 TX
06 796569799 Lindsey    98,525 TX

The objective is to make the file safe for compliance with industry regulations and company privacy policies so the data can travel or be outsourced without risk. The desired output preserves the ID numbers, but de-identifies each record by 1) transforming the social security number, 2) filtering out the last name, and 3) masking the salary field.

ID SSN       Salary St
01 359790672 ****** CT
02 443252484 ****** MD
04 158913886 ****** NY
03 558317036 ****** NY
06 748284900 ****** TX
05 547262426 ****** TX

These objectives were accomplished simultaneously by running the simple job script below.

/INFILE=$private_input.dat 
/FIELD=(index,POS=1,SIZE=2, ASCII)
# Break the SSN into parts to obscure it

/FIELD=(ssno_part1,POS=4,SIZE=1)
/FIELD=(ssno_part2,POS=5,SIZE=4,NUMERIC)
/FIELD=(ssno_part3,POS=9,SIZE=4,NUMERIC)
/FIELD=(lname,POS=14,SIZE=10)
/FIELD=(salary,POS=25,SIZE=6)
/FIELD=(state,POS=32,SIZE=2)
/SORT # Optional high-performance CoSORT job
/KEY=state # Only field name is needed
/OUTFILE=secure # One of many possible re-formatted targets
/FIELD=(index,POS=1,SIZE=2)
# Keep the first digit of the SSN the same:
/FIELD=(ssno_part1,POS=4,SIZE=1)
# Obscure the next 4 digits by dividing by 2 if digits > 4500
# otherwise multiply by 2 and then subtract 55

/FIELD=(ssno_part2_new,POS=5,SIZE=4.0,FILL='0',NUMERIC,\
IF ssno_part2 GT 4500\
THEN ssno_part2 / 2\
ELSE 2 * ssno_part2 - 55)
# Obscure the final 4 digits by dividing by 2 if digits > 4500
# otherwise multiply by 2 and then subtract 54

/FIELD=(ssno_part3_new,POS=9,SIZE=4.0,FILL='0',NUMERIC,\
IF ssno_part3 GT 4500 \
THEN ssno_part3 / 2 \
ELSE 2 * ssno_part3 - 54)
/DATA=" "
# Leave out the lname field (just don't specify it)
# Use a masking character where the salary field would be

/DATA={6}"*" # Display 6 asterisks
/FIELD=(state,POS=21,SIZE=2)
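
For readers without SortCL at hand, the field arithmetic in the job script above can be approximated in Python. This is only an illustrative sketch, not IRI code: the function names mask_part and mask_record are hypothetical, the slice offsets are derived from the POS and SIZE values in the /FIELD statements, and the rounding rule (round half to even) is an assumption inferred from the sample output, since the script does not say how a halved odd number is fitted into a SIZE=4.0 field.

```python
def mask_part(value: int, offset: int) -> int:
    """Halve values above 4500; otherwise double them and subtract the offset."""
    if value > 4500:
        # Assumption: round-half-to-even (Python's round) reproduces the
        # newsletter's sample output; SortCL's actual rule is not stated.
        return round(value / 2)
    return 2 * value - offset


def mask_record(line: str) -> str:
    """De-identify one fixed-width record (1-based POS becomes 0-based slices)."""
    index = line[0:2]              # POS=1,  SIZE=2
    ssno_part1 = line[3:4]         # POS=4,  SIZE=1, kept unchanged
    ssno_part2 = int(line[4:8])    # POS=5,  SIZE=4
    ssno_part3 = int(line[8:12])   # POS=9,  SIZE=4
    state = line[31:33]            # POS=32, SIZE=2
    # lname (POS=14) and salary (POS=25) are never copied to the output.
    new_ssn = (
        f"{ssno_part1}"
        f"{mask_part(ssno_part2, 55):04d}"
        f"{mask_part(ssno_part3, 54):04d}"
    )
    return f"{index} {new_ssn} {'*' * 6} {state}"


# Sample input rows from the newsletter (header line omitted).
records = [
    "01 330170363 Clay       56,650 CT",
    "02 421901269 Guerrero   15,000 MD",
    "03 529433545 Caldwell   41,100 NY",
    "04 129737773 Puckett    44,550 NY",
    "05 594521240 Lindsey    55,800 TX",
    "06 796569799 Lindsey    98,525 TX",
]

# Sort on the state code, which sits at 0-based columns 20-21 of each output row.
masked = sorted((mask_record(r) for r in records), key=lambda row: row[20:22])

for row in masked:
    print(row)
```

Python's sort is stable, so rows that share a state keep their input order here, whereas the sample output above lists record 04 before 03 and 06 before 05. Note also that this arithmetic is not one-to-one (for example, 4502 and 1153 both become 2251 in the middle field), so it obscures values rather than encrypting them.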

As always, email questions to support@iri.com.


CoSORT®, SortCL, RowGen and Permitas are trademarks of IRI. FACT is a trademark of IDS Ltd. (CoSORT Korea). Other product or brand names mentioned herein may be (registered) trademarks of their respective owners. For example, Zune is a trademark of Microsoft Corporation.