For the last 30 or so years, the precursor to most large-scale business intelligence (BI) environments has been the Enterprise Data Warehouse (EDW). A data warehouse (DW) is typically a central database used for reporting, planning, and analysis, holding summarized, subject-oriented data integrated from disparate historical transaction sources.
Big data integration activities can happen outside the database in an extract, transform, load (ETL) environment, or inside the database in an extract, load, transform (ELT) process:
One example of an ELT operation is Informatica’s Pushdown Optimization option, which pushes the transformation work into a relational database such as Oracle or Teradata.
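The ETL/ELT distinction above comes down to where the transformation runs. The sketch below illustrates both paths against an in-memory SQLite database standing in for the target warehouse; the table names, column names, and sample rows are hypothetical, chosen only to make the contrast concrete.

```python
# Contrast ETL (transform outside the database) with ELT (transform inside it).
# Uses an in-memory SQLite database as a stand-in warehouse; schema is illustrative.
import sqlite3

source_rows = [("alice", "250.00"), ("bob", "99.50")]  # mock extracted records

con = sqlite3.connect(":memory:")
con.execute("CREATE TABLE sales (customer TEXT, amount REAL)")

# --- ETL: transform in the application tier, then load the finished rows ---
transformed = [(name.upper(), float(amt)) for name, amt in source_rows]
con.executemany("INSERT INTO sales VALUES (?, ?)", transformed)

# --- ELT: load the raw data first, then transform with SQL inside the database ---
con.execute("CREATE TABLE staging (customer TEXT, amount TEXT)")
con.executemany("INSERT INTO staging VALUES (?, ?)", source_rows)
con.execute(
    "INSERT INTO sales SELECT upper(customer), CAST(amount AS REAL) FROM staging"
)

# Both paths produce identically shaped rows in the target table.
row_count = con.execute("SELECT count(*) FROM sales").fetchone()[0]
total = con.execute("SELECT round(sum(amount), 2) FROM sales").fetchone()[0]
```

In practice the trade-off is about where the compute happens: ETL keeps transformation load off the database server, while ELT (as in pushdown optimization) exploits the database engine's own processing power.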
The IRI data management platform Voracity, as well as its constituent tools, can perform and accelerate big data warehouse ETL operations, delaying the need for new hardware or expensive proprietary appliances: http://www.iri.com/blog/data-transformation2/a-big-data-quandary-hardware-or-software-appliances-or-cosort/
In 1992, Digital Equipment Corporation (DEC, long since acquired) asked IRI to develop a 4GL interface to CoSort in the syntax of the VAX VMS sort/merge utility.
Big Data Problem – Big data volumes are growing exponentially. This growth has been under way for years, but its pace has accelerated dramatically since 2012. See the CSC blog entitled Big Data Just Beginning to Explode for a similar viewpoint on the emergence of big data and the challenges it presents.
Database Test Data Usage – This blog caught my eye because of its title, Do the right thing when testing with production data. It struck me as oxymoronic, since we know production data should not be used for testing at all …
Of course, we know how tempting it is to use production data for testing applications, simulating databases, prototyping ETL operations, and just about anything else that needs to work with the real thing.
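One safer alternative to copying production data is generating synthetic rows that mimic the shape of real records without containing any of them. The sketch below is a minimal illustration of that idea; the field names, value ranges, and record count are assumptions for demonstration, not drawn from any real schema or tool.

```python
# Minimal sketch: synthesize fake customer records for testing instead of
# copying production data. All field names and ranges are illustrative.
import random
import string

random.seed(42)  # fixed seed so test runs are repeatable

def synthetic_customer():
    """Return one fake record with realistic shape but no real data."""
    name = "".join(random.choices(string.ascii_lowercase, k=8))
    balance = round(random.uniform(0, 10000), 2)
    return {"name": name, "email": f"{name}@example.com", "balance": balance}

# A small synthetic dataset a test database could be loaded with.
test_rows = [synthetic_customer() for _ in range(100)]
```

Purpose-built test data generators add referential integrity, realistic distributions, and format preservation on top of this basic pattern.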