Data Virtualization for Big Data

The Denodo Platform supports many Big Data patterns, or use cases, whether with Hadoop distributions (Cloudera, Hortonworks, Amazon Elastic MapReduce on EC2, etc.) or NoSQL data stores such as MongoDB, Cassandra, Neo4j, and Aerospike. These newer data stores typically do not provide a standard JDBC/ODBC-based SQL interface, which makes them difficult to use for BI and reporting tools and for data analysts who are familiar with SQL queries. The Denodo Platform provides an abstraction and federation layer that hides the complexities of the Big Data stores and makes it easy to integrate data from these stores with other data within the enterprise.

The Denodo Platform supports a number of Big Data patterns, including:

Hybrid Data Warehouse

Also called ‘Data Warehouse Offloading’ or ‘Horizontal Partitioning’, this pattern offloads older, less frequently accessed data from the data warehouse into cheaper commodity storage, such as Hadoop HDFS. The Denodo Platform sits on top of both data stores and federates queries across them so that the hybrid data warehouse appears as a single data store to users. The Denodo Platform also optimizes each query by skipping branches that cannot contribute to the result (for example, a query that only touches recent data does not need to go to Hadoop), which improves performance and helps you meet your SLAs.
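The branch-skipping idea described above can be illustrated with a minimal Python sketch. This is not Denodo code or the Denodo API: the cutoff date, the two in-memory row lists, and the function names `branches_for` and `query_orders` are all hypothetical, chosen only to show how a federation layer might decide which store a date-range query needs to touch.

```python
from datetime import date

# Hypothetical cutoff: rows dated before it were offloaded to Hadoop,
# rows on or after it remain in the data warehouse.
CUTOFF = date(2015, 1, 1)

# In-memory stand-ins for the two physical stores (illustrative data only).
WAREHOUSE_ROWS = [
    {"order_id": 3, "order_date": date(2015, 6, 1), "amount": 120.0},
    {"order_id": 4, "order_date": date(2016, 2, 9), "amount": 75.5},
]
HADOOP_ROWS = [
    {"order_id": 1, "order_date": date(2012, 3, 14), "amount": 40.0},
    {"order_id": 2, "order_date": date(2014, 11, 30), "amount": 310.0},
]


def branches_for(start, end):
    """Return the branches whose date range overlaps [start, end]."""
    branches = []
    if start < CUTOFF:
        branches.append("hadoop")
    if end >= CUTOFF:
        branches.append("warehouse")
    return branches


def query_orders(start, end):
    """Query the single logical 'orders' view, touching only the needed branches."""
    sources = {"hadoop": HADOOP_ROWS, "warehouse": WAREHOUSE_ROWS}
    used = branches_for(start, end)
    rows = [
        row
        for name in used
        for row in sources[name]
        if start <= row["order_date"] <= end
    ]
    return used, sorted(rows, key=lambda row: row["order_date"])


if __name__ == "__main__":
    used, rows = query_orders(date(2015, 3, 1), date(2016, 12, 31))
    print(used, [row["order_id"] for row in rows])
```

In this sketch, a query restricted to dates on or after the cutoff never reads the Hadoop stand-in, while a query spanning the cutoff reads both stores and merges the rows into one result.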

Hybrid Data Warehouse

Data Lakes and Enterprise Data Hubs

Both of these patterns involve storing data in a central Hadoop data store. This makes it cheaper to store massive amounts of data in a central repository and makes that data available throughout the organization.

However, the users still need to access the data with their existing tools – whether using JDBC/ODBC for SQL queries with BI and reporting tools, SOAP/XML Web Services for ESB/BPM systems, RESTful Web Services for mobile or web applications, and so on.

The Denodo Platform provides a standardized access layer for data stored in a data lake or enterprise data hub. It abstracts the complexities of the data lake or enterprise data hub interfaces and exposes the data through standards-based interfaces such as SQL (JDBC, ODBC, and ADO.NET), Web Services (SOAP/XML and REST), and Web Parts for Microsoft SharePoint integration. The Denodo Platform also makes it easy to integrate data in the data lake with data from other data sources, such as external Web services, Web data, operational databases, and enterprise applications.

Analytical Data Integration Diagram

Analytical Data Integration

Hadoop is well suited to cost-effective, timely analysis of huge data sets, such as extracting critical insights from sensor data, clickstream data, and mobile data. However, performing the analysis to get these insights is only the first step in taking advantage of Hadoop and Big Data generally.

Once you have performed the analysis and obtained the results you were looking for, you need to turn them into actionable information. Typically, this means integrating the analytics results with other information within your enterprise, such as data from your CRM system or your Order Management System.

This is where the Denodo Platform comes into play. The Denodo Platform makes it quick and easy to integrate the data from Hadoop with data from more traditional data sources, such as data warehouses, operational databases, and enterprise applications (on-premises or Cloud-based). The Denodo Platform exposes this integrated data as services to consuming applications (BI reporting tools, dashboards, Excel, web applications, mobile applications, etc.), making it easy to turn the analytics results into actionable information that business stakeholders can use.
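As a rough illustration of combining analytics results with operational data, the following Python sketch joins a hypothetical set of scores (standing in for the output of a Hadoop job) with customer records held in an in-memory SQLite database (standing in for a CRM system). The churn scores, table layout, customer names, and the function names `make_crm` and `at_risk_customers` are assumptions made for the example; none of this represents the Denodo Platform's own interfaces.

```python
import sqlite3

# Stand-in for analytics output produced by a Hadoop job
# (hypothetical churn scores keyed by customer ID).
HADOOP_RESULTS = [
    {"customer_id": 101, "churn_score": 0.91},
    {"customer_id": 102, "churn_score": 0.12},
    {"customer_id": 103, "churn_score": 0.78},
]


def make_crm():
    """Build an in-memory SQLite database standing in for a CRM system."""
    conn = sqlite3.connect(":memory:")
    conn.execute(
        "CREATE TABLE customers ("
        "customer_id INTEGER PRIMARY KEY, name TEXT, account_manager TEXT)"
    )
    conn.executemany(
        "INSERT INTO customers VALUES (?, ?, ?)",
        [
            (101, "Acme Corp", "Dana"),
            (102, "Globex", "Lee"),
            (103, "Initech", "Dana"),
        ],
    )
    return conn


def at_risk_customers(conn, threshold=0.75):
    """Join analytics scores with CRM records for customers at or above the threshold."""
    scores = {
        row["customer_id"]: row["churn_score"]
        for row in HADOOP_RESULTS
        if row["churn_score"] >= threshold
    }
    if not scores:
        return []
    placeholders = ",".join("?" for _ in scores)
    rows = conn.execute(
        "SELECT customer_id, name, account_manager FROM customers "
        f"WHERE customer_id IN ({placeholders}) ORDER BY customer_id",
        list(scores),
    ).fetchall()
    return [
        {
            "customer_id": cid,
            "name": name,
            "account_manager": manager,
            "churn_score": scores[cid],
        }
        for cid, name, manager in rows
    ]


if __name__ == "__main__":
    for record in at_risk_customers(make_crm()):
        print(record)
```

The combined records carry both the analytics score and the CRM context (customer name and account manager), which is the kind of integrated result a consuming application such as a dashboard would need in order to act on the analysis.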

Analytical Data Integration Diagram

Big Data As an Analytics Sandbox

Your data warehouse is a critical part of your data architecture. It is designed to provide timely responses to queries for analysis and reporting. The last thing you want is a business analyst unleashing complex, resource-intensive queries while they ‘play’ with the data. This sort of activity can disrupt the smooth running of the data warehouse and threaten your SLAs.

You need to move the business analyst to a sandbox where they can play with all the data without affecting the performance and stability of your data warehouse operations. This is where Hadoop and the Denodo Platform come in. The required data is offloaded from the data warehouse into low-cost storage in Hadoop, sparing the data warehouse additional unplanned workload. The Denodo Platform provides a standards-based interface to the offloaded data in Hadoop, allowing the business analyst to use their usual tools to analyze the data. The Denodo Platform supports JDBC and ODBC APIs, so the analyst can use the usual reporting and visualization tools, together with programming languages such as R, to develop new analytical applications.

Analytics Sandbox Diagram
