A frequently asked question is “When should I use data virtualization and when should I use ETL tools?” Variants of this question include “Does data virtualization replace ETL?” and “I’ve already got ETL, why do I need data virtualization?” This Denodo Technologies architecture brief answers these questions.

Extract, Transform, and Load (ETL) is a good solution for physical data consolidation projects that duplicate data from the original data sources into an enterprise data warehouse (EDW) or a new database.
Typical uses include:

  • Bulk copying very large data sets, comprising millions of rows, from large structured data sources.
  • Creating historical records of data, e.g. snapshots at a particular time, to analyze how the data set changes over time.
  • Performing complex, multi-pass data transformation and cleansing operations, and bulk loading the data into a target data store.
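
As a rough illustration of the ETL pattern described above, the following sketch extracts rows from a source database, cleanses them, and bulk loads them into a target store as a dated snapshot. It is a minimal sketch only: the in-memory SQLite databases, the `orders` and `orders_snapshot` tables, and the cleansing rules are all assumptions made for the example and do not represent any particular ETL tool.

```python
import sqlite3

def extract(source):
    """Extract: read every row from the (hypothetical) source table."""
    return source.execute("SELECT id, customer, amount FROM orders").fetchall()

def transform(rows):
    """Transform: drop rows with no amount, tidy names, convert amounts to numbers."""
    cleaned = []
    for order_id, customer, amount in rows:
        if amount is None:
            continue  # cleansing rule assumed for this example
        cleaned.append((order_id, customer.strip().title(), round(float(amount), 2)))
    return cleaned

def load(warehouse, rows, snapshot_date):
    """Load: bulk insert a physical copy of the data, stamped with the snapshot date."""
    warehouse.execute(
        "CREATE TABLE IF NOT EXISTS orders_snapshot "
        "(snapshot_date TEXT, id INTEGER, customer TEXT, amount REAL)"
    )
    warehouse.executemany(
        "INSERT INTO orders_snapshot VALUES (?, ?, ?, ?)",
        [(snapshot_date, *row) for row in rows],
    )
    warehouse.commit()

# Hypothetical source system with messy data.
source = sqlite3.connect(":memory:")
source.execute("CREATE TABLE orders (id INTEGER, customer TEXT, amount TEXT)")
source.executemany(
    "INSERT INTO orders VALUES (?, ?, ?)",
    [(1, "  alice ", "10.5"), (2, "BOB", None), (3, "carol", "7")],
)

# Hypothetical target warehouse holding the duplicated, cleansed data.
warehouse = sqlite3.connect(":memory:")
load(warehouse, transform(extract(source)), "2024-01-01")

snapshot = warehouse.execute(
    "SELECT snapshot_date, id, customer, amount FROM orders_snapshot ORDER BY id"
).fetchall()
```

Running the load again with a later snapshot date would add a second dated copy alongside the first, which is what makes analysis of changes over time possible.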


The reality is that, while the two solutions are different, data virtualization and ETL are often complementary technologies. Data virtualization can extend and enhance ETL/EDW deployments in many ways, for example:

  • Extending existing data warehouses with new data sources.
  • Federating multiple data warehouses.
  • Acting as a virtual data source to augment an ETL process.
  • Isolating applications from changes to the underlying data sources (e.g. migrating a data warehouse).
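
To make the federation and isolation points above more concrete, here is a minimal sketch of a virtual layer that answers queries by reading from its sources at query time rather than copying their data. The `VirtualLayer` class, the `customers` view, and the in-memory lists standing in for data warehouses are all hypothetical names invented for this illustration; this is not the Denodo API.

```python
from typing import Callable, Dict, Iterable, List

Row = Dict[str, object]
Source = Callable[[], Iterable[Row]]

class VirtualLayer:
    """Hypothetical virtual layer: maps a view name to the sources behind it."""

    def __init__(self) -> None:
        self._views: Dict[str, List[Source]] = {}

    def register(self, view: str, *sources: Source) -> None:
        # Re-registering a view swaps its sources without changing its name.
        self._views[view] = list(sources)

    def query(self, view: str) -> List[Row]:
        # Data is fetched from every source at query time; nothing is stored here.
        rows: List[Row] = []
        for fetch in self._views[view]:
            rows.extend(fetch())
        return rows

# Two stand-in "data warehouses" (plain lists, assumed for this example).
warehouse_eu = [{"id": 1, "region": "EU"}]
warehouse_us = [{"id": 2, "region": "US"}]

layer = VirtualLayer()
layer.register("customers", lambda: warehouse_eu, lambda: warehouse_us)
federated = layer.query("customers")

# A new row in a source is visible on the next query, with no reload step.
warehouse_us.append({"id": 3, "region": "US"})
refreshed = layer.query("customers")

# Migrating the EU warehouse: the application still queries "customers".
migrated_eu = [{"id": 1, "region": "EU"}]
layer.register("customers", lambda: migrated_eu, lambda: warehouse_us)
after_migration = layer.query("customers")
```

The application only ever refers to the view name, so replacing a source behind that name does not require any change on the application side.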