
R is a free programming language and software environment that statisticians and data miners use for analysis and prediction, and it has become known as a 'big data' visualization tool. However, because R holds all of its objects in memory, it cannot work effectively with very large data sets.

The SortCL program in the IRI Voracity platform or standalone IRI CoSort package is a fast, simple, and inexpensive way to prepare big data for R efficiently -- both in terms of job design and runtime performance. See this section to understand why.

When SortCL sorts, joins, and aggregates raw data sets in a single job and I/O pass ahead of R, time-to-visualization in tools like ggplot or qplot is cut in half:

[Figure: performance chart]

Without SortCL, R can only work on multiple small chunks of data, and requires multiple code files to produce the same result as a single SortCL script. Hadoop is, of course, another way to rapidly prepare big data sets for R, and Voracity users can run SortCL jobs seamlessly in MapReduce 2, Spark, Storm, or Tez without additional coding.
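To illustrate what working on "multiple small chunks" involves, here is a minimal sketch of chunk-wise aggregation. It is written in Python purely for illustration; it is neither SortCL syntax nor R code. The function name `chunked_group_sum`, the column names, and the sample data are all hypothetical.

```python
import csv
import io
from collections import defaultdict
from itertools import islice


def chunked_group_sum(rows, key, value, chunk_size=2):
    """Sum `value` per `key`, holding at most `chunk_size` rows in memory at once."""
    totals = defaultdict(float)
    rows = iter(rows)
    while True:
        # Pull the next chunk; an in-memory tool would repeat this step per file.
        chunk = list(islice(rows, chunk_size))
        if not chunk:
            break
        # Aggregate the chunk on its own ...
        partial = defaultdict(float)
        for row in chunk:
            partial[row[key]] += float(row[value])
        # ... then merge the partial result into the running totals.
        for group, subtotal in partial.items():
            totals[group] += subtotal
    return dict(totals)


# Hypothetical sample data standing in for a file too large to load whole.
SAMPLE = "region,sales\neast,10\nwest,5\neast,2.5\nwest,1\nnorth,4\n"

reader = csv.DictReader(io.StringIO(SAMPLE))
result = chunked_group_sum(reader, "region", "sales", chunk_size=2)
print(result)
```

The extra bookkeeping (splitting, aggregating each piece, merging the partial results) is the kind of work that a single-pass preparation job ahead of the analysis tool is meant to remove.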

For more information on the benchmark and how SortCL can prepare data in the same Eclipse environment (via the StatET for R plug-in for the IRI Workbench), see:

Blog > Business Intelligence > Easier Big Data Prep for R
