Change data capture (CDC) is an approach to data integration that helps firms get greater value from their data by letting them integrate and analyze it faster while using fewer system resources. CDC is a highly efficient mechanism for limiting the impact on the source system when new data is loaded into operational data stores and data warehouses, and it complements ETL and enterprise information integration tools.
CDC eliminates the need for bulk load updating and inconvenient batch windows by enabling incremental loading or real-time streaming of data changes into your data warehouse. It can also be used for populating real-time business intelligence dashboards, synchronizing data across geographically distributed systems, and facilitating zero-downtime database migrations.
For those wondering "what is change data capture?" or confused by the differing explanations provided by database vendors, here is a simple definition. Change data capture refers to the process or technology for identifying and capturing changes made to a database. Those changes can then be applied to another data repository or made available in a format consumable by ETL, EAI (enterprise application integration), or other types of data integration tools. By allowing you to detect, capture, and deliver changed data, CDC reduces the time and resource costs of data warehousing while enabling continuous data integration.
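To make the "capture, then apply" idea concrete, here is a minimal sketch in Python of how captured changes might be applied to another data repository. The `ChangeEvent` shape, the `apply_changes` helper, and the sample rows are assumptions made for illustration only; they do not reflect the format used by any particular database or CDC product.

```python
from dataclasses import dataclass
from typing import Any, Dict, Iterable, Optional


@dataclass(frozen=True)
class ChangeEvent:
    """One captured change (hypothetical shape, not a vendor format)."""
    op: str                        # "insert", "update", or "delete"
    key: int                       # primary key of the affected row
    row: Optional[Dict[str, Any]]  # new column values; None for deletes


def apply_changes(target: Dict[int, Dict[str, Any]],
                  events: Iterable[ChangeEvent]) -> None:
    """Apply captured changes to a target copy, in the order given."""
    for event in events:
        if event.op in ("insert", "update"):
            # Insert and update both leave the row in its new state.
            target[event.key] = dict(event.row or {})
        elif event.op == "delete":
            target.pop(event.key, None)
        else:
            raise ValueError(f"unknown operation: {event.op!r}")


# The target starts as a copy of the source taken before the changes.
replica = {1: {"name": "Ada"}, 2: {"name": "Bob"}}

# Only the changes travel to the target, not the whole table.
captured = [
    ChangeEvent("update", 1, {"name": "Ada L."}),
    ChangeEvent("delete", 2, None),
    ChangeEvent("insert", 3, {"name": "Cy"}),
]

apply_changes(replica, captured)
print(replica)  # {1: {'name': 'Ada L.'}, 3: {'name': 'Cy'}}
```

The target is brought up to date by processing three small events rather than by reloading the full table, which is where the time and resource savings come from.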
Before databases such as Oracle and SQL Server introduced built-in change data capture features, developers and DBAs used techniques such as table differencing, change-value selection, and database triggers to capture changes made to a database. These methods, however, can be inefficient or intrusive and tend to place substantial overhead on source servers. This is why DBAs quickly embraced embedded, log-based CDC features such as Oracle Change Data Capture (Oracle 10g). Because these features use a background process to scan database transaction logs for changed data, transactions are unaffected and the performance impact on source servers is minimized.
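As a rough illustration of the trigger-based technique and why it is considered intrusive, the sketch below uses Python's built-in `sqlite3` module. The `customers` table, the `customers_changes` change table, and the trigger names are all hypothetical, and SQLite stands in for a production database purely for convenience.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE customers (id INTEGER PRIMARY KEY, name TEXT NOT NULL);

-- Change table that the triggers write to (hypothetical layout).
CREATE TABLE customers_changes (
    seq  INTEGER PRIMARY KEY AUTOINCREMENT,
    op   TEXT NOT NULL,
    id   INTEGER NOT NULL,
    name TEXT
);

CREATE TRIGGER customers_ins AFTER INSERT ON customers BEGIN
    INSERT INTO customers_changes (op, id, name)
    VALUES ('I', NEW.id, NEW.name);
END;

CREATE TRIGGER customers_upd AFTER UPDATE ON customers BEGIN
    INSERT INTO customers_changes (op, id, name)
    VALUES ('U', NEW.id, NEW.name);
END;

CREATE TRIGGER customers_del AFTER DELETE ON customers BEGIN
    INSERT INTO customers_changes (op, id, name)
    VALUES ('D', OLD.id, NULL);
END;
""")

# Ordinary application writes; each one also fires a trigger.
conn.execute("INSERT INTO customers (id, name) VALUES (1, 'Ada')")
conn.execute("UPDATE customers SET name = 'Ada L.' WHERE id = 1")
conn.execute("DELETE FROM customers WHERE id = 1")
conn.commit()

# Read the captured changes in the order they were recorded.
changes = conn.execute(
    "SELECT op, id, name FROM customers_changes ORDER BY seq"
).fetchall()
print(changes)  # [('I', 1, 'Ada'), ('U', 1, 'Ada L.'), ('D', 1, None)]
```

Every write to `customers` now performs a second write to `customers_changes` inside the same transaction, which is the kind of overhead on the source server that a log-based approach avoids by reading the transaction log in the background instead.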
Unfortunately, few database vendors provide embedded CDC technology. Even when they do, the technology is generally not suitable for capturing data changes from other types of source systems. IT teams must therefore learn, configure, and monitor separate CDC tools for each type of database system in use at their organization, unless they have Qlik Replicate (formerly Attunity Replicate).
Qlik Replicate (formerly Attunity Replicate) is a high-performance data replication and data ingestion tool that works seamlessly with a wide variety of source and target systems and most ETL tools. Our next-generation log-based CDC is fully integrated, empowering you to capture data changes easily and efficiently from your various enterprise data sources. With support for CDC for Oracle, CDC for SQL Server, CDC for mainframes, and more, your teams can leverage one tool for all of your real-time data integration and data warehousing needs.