Operating a data warehouse is expensive. To drive costs down, many companies are offloading part of their historical data onto cheaper Hadoop-based stores, such as those built on the Cloudera and Hortonworks distributions. However, introducing Hadoop alongside an existing data warehouse creates new problems with no simple way to bridge the gap: there are now two data stores with very different modes of access, protocols, data formats, and performance and security capabilities. These data silos present several challenges, such as creating a unified report that combines data from the two systems. IT teams may respond by physically integrating the data, but such approaches are time-consuming, resource-intensive, and expensive.

Data virtualization is an agile data integration approach that addresses this problem by creating a virtual data layer on top of both the data warehouse and the Hadoop store. The layer abstracts access to both systems and combines the disparate data into a unified view. Because the data does not need to be physically moved, data virtualization enables a logical data warehouse architecture that can be implemented rapidly, with fewer resources and less specialized skills. Companies can then enjoy two benefits: lower data warehousing operational costs from using Hadoop as a cold data store, and a virtual unified view of data that transparently spans both systems.
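The idea of a virtual layer over two stores can be illustrated with a minimal sketch. It is not taken from the guide and does not use any data virtualization product: two in-memory SQLite databases stand in for the data warehouse and the Hadoop cold store, and the table name `sales`, the constant `CUTOFF_YEAR`, and the function `unified_sales_by_year` are all hypothetical names chosen for illustration. The point is only that the report is assembled at query time, without copying rows from one system into the other.

```python
import sqlite3

# Two independent in-memory databases stand in for the two systems
# (an assumption for illustration; real sources would be a warehouse and Hadoop).
warehouse = sqlite3.connect(":memory:")  # hot, recent data
hadoop = sqlite3.connect(":memory:")     # cold, offloaded historical data

for conn in (warehouse, hadoop):
    conn.execute("CREATE TABLE sales (year INTEGER, region TEXT, amount REAL)")

warehouse.executemany(
    "INSERT INTO sales VALUES (?, ?, ?)",
    [(2023, "EMEA", 120.0), (2024, "EMEA", 150.0)],
)
hadoop.executemany(
    "INSERT INTO sales VALUES (?, ?, ?)",
    [(2019, "EMEA", 80.0), (2020, "EMEA", 95.0)],
)

# Hypothetical offloading rule: rows older than this year live in the cold store.
CUTOFF_YEAR = 2022


def unified_sales_by_year(start_year, end_year):
    """Return (year, total) rows across both stores without moving any data."""
    query = (
        "SELECT year, SUM(amount) FROM sales "
        "WHERE year BETWEEN ? AND ? GROUP BY year"
    )
    rows = []
    # Only touch a source when the requested range can overlap its data.
    if start_year < CUTOFF_YEAR:
        rows += hadoop.execute(query, (start_year, end_year)).fetchall()
    if end_year >= CUTOFF_YEAR:
        rows += warehouse.execute(query, (start_year, end_year)).fetchall()
    return sorted(rows)


report = unified_sales_by_year(2020, 2024)
print(report)
```

The caller asks for a year range and never needs to know which system holds which rows. A real virtualization layer would also handle differing protocols, formats, and security models, which this sketch leaves out.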

Managing Data Warehouse Offloading with Big Data is the definitive guide to implementing a data warehouse offloading pattern with data virtualization.

Download the guide and:

  • Learn how to implement a successful data warehouse offloading project that integrates data across systems without the need for physical movement of data.
  • Delve into the details of implementation with examples, code, and sample data sets, so you can test-drive this architecture.
  • Discover the pattern and the rationale for using data warehouse offloading in the context of a logical data warehouse: when to use it and when not to use it.
  • Understand detailed business use cases that explain how a company would decide whether or not to implement data warehouse offloading.