Sunday, August 2, 2020

Change data capture - CDC

Recently I attended a Red Hat webinar, Change Data Capture with Debezium and Apache Kafka. CDC was a new term to me, so I did a quick search on it and on transaction logs. They are actually not the same thing.

CDC is an approach to capturing the changes made to a data source. It records insert, update and delete activities.
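To make the idea concrete, here is a minimal sketch in Python that derives insert, update and delete events by comparing two snapshots of a table. The function name `capture_changes` and the event fields (`op`, `key`, `before`, `after`) are my own assumptions for illustration. Tools such as Debezium read the database's log instead of diffing snapshots, so treat this as a toy model of the output, not of how such tools work internally.

```python
from typing import Any, Dict, List

Row = Dict[str, Any]


def capture_changes(before: Dict[int, Row], after: Dict[int, Row]) -> List[Dict[str, Any]]:
    """Compare two snapshots keyed by primary key and return change events.

    Hypothetical event shape: op, key, before image, after image.
    """
    events: List[Dict[str, Any]] = []

    # Keys present only in the new snapshot were inserted.
    for key in sorted(after.keys() - before.keys()):
        events.append({"op": "insert", "key": key, "before": None, "after": after[key]})

    # Keys present in both snapshots were updated if the row content differs.
    for key in sorted(before.keys() & after.keys()):
        if before[key] != after[key]:
            events.append({"op": "update", "key": key, "before": before[key], "after": after[key]})

    # Keys present only in the old snapshot were deleted.
    for key in sorted(before.keys() - after.keys()):
        events.append({"op": "delete", "key": key, "before": before[key], "after": None})

    return events


if __name__ == "__main__":
    old = {1: {"name": "alice"}, 2: {"name": "bob"}}
    new = {2: {"name": "bobby"}, 3: {"name": "carol"}}
    for event in capture_changes(old, new):
        print(event)
```

Keeping both the before and after image in each event is a design choice in this sketch: it lets a consumer see what a row looked like prior to an update or delete.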

The captured changes can be fed to an ETL (Extract, Transform and Load) application for data transformation and then sent to target applications, for example a telemetry dashboard, a replica in a different database, an ODS (Operational Data Store) or a data lake.
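The flow described above can be sketched roughly as follows. Everything here is a simplified assumption: the event shape (`op`, `key`, `before`, `after`), the `transform` rule, and the three in-memory targets standing in for a dashboard, a replica and a data lake. A real pipeline would use actual systems for each of these.

```python
from collections import Counter
from typing import Any, Dict, List, Tuple

Event = Dict[str, Any]


def transform(event: Event) -> Event:
    """Transform step (hypothetical rule): normalise the op name and keep one row image."""
    # For a delete, only the "before" image exists; otherwise use the "after" image.
    row = event["before"] if event["op"] == "delete" else event["after"]
    return {"op": event["op"].upper(), "key": event["key"], "row": row}


def run_pipeline(events: List[Event]) -> Tuple[Counter, Dict[int, Any], List[Event]]:
    """Send each transformed change event to three stand-in targets."""
    dashboard: Counter = Counter()   # telemetry: number of changes per operation
    replica: Dict[int, Any] = {}     # copy of the table in a "different database"
    data_lake: List[Event] = []      # append-only history of every change

    for event in events:
        record = transform(event)
        dashboard[record["op"]] += 1
        if record["op"] == "DELETE":
            replica.pop(record["key"], None)
        else:
            replica[record["key"]] = record["row"]
        data_lake.append(record)

    return dashboard, replica, data_lake


if __name__ == "__main__":
    changes = [
        {"op": "insert", "key": 1, "before": None, "after": {"name": "alice"}},
        {"op": "update", "key": 1, "before": {"name": "alice"}, "after": {"name": "alicia"}},
        {"op": "insert", "key": 2, "before": None, "after": {"name": "bob"}},
        {"op": "delete", "key": 2, "before": {"name": "bob"}, "after": None},
    ]
    dashboard, replica, data_lake = run_pipeline(changes)
    print(dashboard)
    print(replica)
    print(len(data_lake))
```

Note how the replica ends up holding only the current state, while the data lake keeps every change. That difference is one reason the same change stream is useful to more than one target.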

Some notable articles for further reading.

