Different sources call Reverse ETL – a necessary element of a modern data stack for data-driven businesses. So, what is reverse ETL, and why do you need it? Let’s explain.
Demystifying Reverse ETL
First, we need to understand what ETL is, and then we explain how Reverse ETL differs from other ETL processes.
What Is ETL?
ETL (which stands for Extract Transform Load) is a set of processes that transfer data from one or more sources to a destination in three distinct steps: extracting data from a source, transforming data, and loading it into destination. While this is not mandatory, in “traditional” ETL, the destination is often assumed to be a database or data warehouse.
Traditional ETL vs Reverse ETL
Traditional ETL usually loads data from different on-premises and cloud business apps into a database or data warehouse. Reverse ETL does exactly the opposite – it extracts data from the databases and data warehouses and puts it into operational data sources.
ETL Step | Traditional ETL | Reverse ETL |
---|---|---|
Extract | Traditional ETL connects to different business apps via their APIs and extracts data from them. | Reverse ETL extracts data from databases and data warehouses. When selecting a tool to perform Reverse ETL, it is important to select one that supports your data warehouse as a source because data warehouses are often supported as destinations only. |
Transform | Transformations are used for data cleansing, deduplication, etc. Cleansed data is brought to the format and schema, used in the destination warehouse. | Reverse ETL maps data stored in the warehouse so that it fit the schema of the corresponding operational data sources. |
Load | Data is loaded to the centralized data storage – a database or cloud data warehouse. | Data is loaded into different cloud and on-premises apps. |
Reverse ETL Evolution
How did we come to Reverse ETL? The development of Reverse ETL is probably caused by the appearance of data warehouses and later – cloud data warehouse services, like Amazon Redshift, Google BigQuery, and Snowflake. Businesses started using them as a central storage for their data and for data analysis. They collected all the data from their operational systems in warehouses for data archiving and analysis purposes.
However, such a centralized storage could become a data silo. It could contain data necessary for sales and marketing teams or other company divisions but not available to them. There was a need to get the data out of warehouses and put them back into operational systems.
When the demand for Reverse ETL grew, a number of tools and solutions appeared. Now there are both universal ETL solutions that can do both classic and Reverse ETL, like Skyvia, Integrate.io, and Hevo Data, and tools designed specifically for the purpose of Reverse ETL, like Hightouch and Grouparoo, on the market.
Role of Customer Data Platforms (CDP)
Another important step in Reverse ETL evolution is the appearance of Customer Data Platforms.
What Is Customer Data Platform
Customer Data Platform (CDP) is a data platform that collects all the data about customers from different sources, cleanses, combines, and stores them in a centralized database, providing a 360-degree view of the customer. The collected data includes both customer records from different systems like CRMs, marketing and support tools, etc. as well as the company’s website browsing data, social network interactions, etc.
CDPs have a built-in identity resolution, so they can link all the ingested data to correct customer profiles using cookies, personal identifying information, mobile advertising IDs, etc.
CDPs help businesses in many different scenarios. They help segment customers for marketing campaigns, perform marketing via multiple different channels, analyze customer behavior, understand customer traits, etc.
CDP and Reverse ETL
It is important not only to collect and analyze all this data – it is also important to "activate" the enriched, cleansed, and segmented data, to give them back to marketing, sales, and support specialists. Thus, Reverse ETL becomes required to "operationalize" the CDP data and make full use of them.
Because of the Reverse ETL necessity, some CDP systems, like Segment, have built-in Reverse ETL capabilities. Others may revert to partner integration solutions or just widely used Reverse ETL tools.
Use Cases and Benefits of Reverse ETL
So, what can Reverse ETL do for you, and why is it considered necessary by many businesses? Actually, a lot. When a data warehouse becomes a data silo, Reverse ETL tools can help you break this silo, returning the necessary data into action. Reverse ETL tools bridge the gap between data warehouses and business apps and make data from the data warehouse available for business teams that need these data to work.
Central data warehouses usually contain lots of data collected from different sources by different teams. Each team usually sees only a part of these data stored in their operational systems. Reverse ETL can enrich data in the team operational systems with data from the warehouse collected from other sources. This empowers business teams with better customer view and helps them in their work.
Another use case for Reverse ETL is restoring archived data. Data warehouses are used not only for data analysis but also for data archiving. Businesses often move outdated data from operational systems to data warehouses to decrease the cost of data storage and reduce operational system load. However, sometimes, you may need to get the archived data back into your operational systems. In this case, a Reverse ETL tool can help you move data.
Example of Reverse ETL Use
Let’s provide a real-life example of using Reverse ETL. GOintegro stored all its product data in a MySQL database and used Salesforce as a solution for client management and communication. They needed to provide reports on their data to their clients and to enrich clients’ data in Salesforce with their data in MySQL.
GOintegro was in a hurry, so they looked for a simple solution with flexible pricing. They compared several competitors on the market, including Zapier, and decided to go with Skyvia as it provided the most cost-efficient and easy-to-use solution. They used Skyvia to reverse-ETL their MySQL data to Salesforce as well as to automate their export to Google Drive to provide reports to their clients.
Automation of processes allowed GOintegro to save operating costs and improve the information delivery time, and enriching client information in Salesforce improved communication with clients and reduced the margin of error.
Conclusion
In this article, we have explained what Reverse ETL is, how it differs from classic ETL, and its place in modern data stack. As we noted, it’s a process for getting data from a data warehouse and putting them into operational systems. It’s quite useful for empowering operational teams with enriched data collected from different systems.
Hope it helps you understand the meaning of Reverse ETL and its role and decide whether you need to use the Reverse ETL approach in your business.