Wednesday, 10 March 2021

Change Data Capture

Introduction
CDC aims to carry changes made in one database over to another database. An explanation is as follows:
Change Data Capture (CDC) is a design pattern focused on data integration. The CDC design pattern is quite simple: take every data change committed to a data source and publish it as an event.
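As a rough illustration of the pattern in the quote above, the following sketch models a data source that records each committed change in a log, plus a separate capture step that turns log entries into events. The names `ToyDatabase` and `capture_changes` and the event fields are hypothetical; this is not the API of any real CDC tool.

```python
from typing import Any, Callable, Dict, List

Row = Dict[str, Any]


class ToyDatabase:
    """In-memory stand-in for a database with a transaction log (hypothetical)."""

    def __init__(self) -> None:
        self.rows: Dict[int, Row] = {}
        self.log: List[Dict[str, Any]] = []  # append-only record of committed changes

    def upsert(self, key: int, row: Row) -> None:
        before = self.rows.get(key)
        self.rows[key] = dict(row)
        self.log.append({"key": key, "before": before, "after": dict(row)})

    def delete(self, key: int) -> None:
        before = self.rows.pop(key)
        self.log.append({"key": key, "before": before, "after": None})


def capture_changes(
    db: ToyDatabase, offset: int, publish: Callable[[Dict[str, Any]], None]
) -> int:
    """Publish every log entry from `offset` onward and return the new offset."""
    for entry in db.log[offset:]:
        publish(entry)
    return len(db.log)


db = ToyDatabase()
events: List[Dict[str, Any]] = []

db.upsert(1, {"name": "Anne"})
db.delete(1)

# The application code above never publishes anything itself; the capture
# step reads the log after the fact and emits one event per change.
offset = capture_changes(db, 0, events.append)
```

The point of the design is that publishing is driven by the log, so the code that writes the data needs no knowledge of the consumers.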
CDC is useful in the following architectures

Criticisms
An explanation is as follows:
... Even though CDC is a good start to moving toward event-driven architecture, it can not be a final stop. There are major challenges with it.

- CDC ties together the application database schema with the rest of the consumers. Application databases developers should be free to modify their databases without impacting any downstream consumers. CDC blocks that in a major way. When we do not allow other microservices to directly connect to databases but through APIs, how can we think the change in the stream of databases is OK for others to depend on?

- It can create major data quality issues because producers can make backward-incompatible changes without notifying consumers.

- Data domain could be very different from what you publish for the rest of the company to consume. CDC ties all downstream users to that schema 
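The coupling described in these points can be illustrated with a small sketch. The consumer below reads a column name directly from a CDC-style payload; once the producer renames that column, the consumer fails even though nobody touched its code. The `handle_change` function and the renamed `given_name` column are hypothetical.

```python
from typing import Any, Dict


def handle_change(event: Dict[str, Any]) -> str:
    """Downstream consumer that depends on the producer's column names."""
    return event["after"]["first_name"].upper()


# Event emitted while the source table still has a `first_name` column.
old_event = {"op": "c", "after": {"id": 1004, "first_name": "Anne Marie"}}

# Event emitted after the producer renames the column (hypothetical rename).
new_event = {"op": "c", "after": {"id": 1004, "given_name": "Anne Marie"}}

result = handle_change(old_event)

try:
    handle_change(new_event)
    broke = False
except KeyError:
    broke = True  # the schema change leaked straight into the consumer
```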

Change Data Capture vs Event Carried State Transfer
An explanation of Event Carried State Transfer follows. The goal is actually the same: when a change happens in one database, reflect it in the other. The method is what differs. With CDC, a tool captures the change and writes it to a broker such as Kafka. With Event Carried State Transfer, the microservice code does this job itself.
The idea behind Event Carried State Transfer pattern is — when a Microservice inserts/modifies/deletes data, it raises an event along with data. So the interested Microservices should consume the event and update their own copy of data accordingly.
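A minimal sketch of this idea, assuming an in-process stand-in for the broker: the owning service publishes an event carrying the data each time it changes something, and another service keeps its own copy up to date purely from those events. `ToyBroker`, `CustomerService`, `OrderService`, and the event type names are all hypothetical.

```python
from typing import Any, Callable, Dict, List

Event = Dict[str, Any]


class ToyBroker:
    """Minimal in-process publish/subscribe stand-in for a broker such as Kafka."""

    def __init__(self) -> None:
        self.subscribers: List[Callable[[Event], None]] = []

    def subscribe(self, handler: Callable[[Event], None]) -> None:
        self.subscribers.append(handler)

    def publish(self, event: Event) -> None:
        for handler in self.subscribers:
            handler(event)


class CustomerService:
    """Owns customer data and raises an event, with the data, on every change."""

    def __init__(self, broker: ToyBroker) -> None:
        self.broker = broker
        self.customers: Dict[int, Dict[str, Any]] = {}

    def save(self, customer_id: int, data: Dict[str, Any]) -> None:
        self.customers[customer_id] = dict(data)
        self.broker.publish(
            {"type": "CustomerSaved", "id": customer_id, "data": dict(data)}
        )

    def remove(self, customer_id: int) -> None:
        self.customers.pop(customer_id, None)
        self.broker.publish({"type": "CustomerRemoved", "id": customer_id})


class OrderService:
    """Keeps its own copy of customer data, updated only from events."""

    def __init__(self, broker: ToyBroker) -> None:
        self.customer_copy: Dict[int, Dict[str, Any]] = {}
        broker.subscribe(self.on_event)

    def on_event(self, event: Event) -> None:
        if event["type"] == "CustomerSaved":
            self.customer_copy[event["id"]] = event["data"]
        elif event["type"] == "CustomerRemoved":
            self.customer_copy.pop(event["id"], None)


broker = ToyBroker()
orders = OrderService(broker)
customers = CustomerService(broker)

customers.save(1, {"email": "a@example.org"})
copy_after_save = dict(orders.customer_copy)
customers.remove(1)
```

Unlike CDC, the publishing call lives inside `save` and `remove`, so the service decides what the event looks like instead of exposing its table layout.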

CDC Event Structure
Example
An event can look like this. The before and after states of the data are visible here.
{
  "schema": {...},
  "payload": {
    "before": {  
      "id": 1004,
      "first_name": "Anne Marie",
      "last_name": "Kretchmar",
      "email": "annek@noanswer.org"
    },
    "after": null,  
    "source": {  
      "version": "1.4.1.Final",
      "name": "dbserver1",
      "server_id": 223344,
      "ts_sec": 1486501558,
      "gtid": null,
      "file": "mysql-bin.000003",
      "pos": 725,
      "row": 0,
      "snapshot": null,
      "thread": 3,
      "db": "inventory",
      "table": "customers"
    },
    "op": "d",  
    "ts_ms": 1486501558315  
  }
}
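To show how a consumer might interpret an event of this shape, here is a minimal sketch that applies the `payload` to an in-memory replica keyed by `id`. The `apply_change` function and the replica dictionary are hypothetical. The operation codes `c`, `u`, `d`, and `r` (snapshot read) follow Debezium's convention, and the elided `schema` part is ignored.

```python
import json
from typing import Any, Dict

Replica = Dict[int, Dict[str, Any]]


def apply_change(replica: Replica, payload: Dict[str, Any]) -> None:
    """Apply one change event payload to a local copy of the table."""
    op = payload["op"]
    if op in ("c", "u", "r"):
        # Create, update, and snapshot read all carry the new row in "after".
        row = payload["after"]
        replica[row["id"]] = row
    elif op == "d":
        # A delete has "after": null, so the key comes from "before".
        row = payload["before"]
        replica.pop(row["id"], None)
    else:
        raise ValueError(f"unsupported operation: {op!r}")


# Trimmed version of the delete event shown above (schema and source omitted).
raw_event = """
{
  "payload": {
    "before": {"id": 1004, "first_name": "Anne Marie", "last_name": "Kretchmar",
               "email": "annek@noanswer.org"},
    "after": null,
    "op": "d",
    "ts_ms": 1486501558315
  }
}
"""

replica: Replica = {1004: {"id": 1004, "first_name": "Anne Marie"}}
apply_change(replica, json.loads(raw_event)["payload"])
```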
CDC Tools
An explanation is as follows. The important tools are Debezium and Kafka Connect.
Database vendors provide a log scrapping system for popular databases. These are combined with the CDC tools to enable source and destination connectors. Two popular choices are Debezium and Kafka Connect. Kafka Connect provides support for large number of “source” and “sink” connectors.

Cloud databases provide this pattern as a managed cloud-native service. For example AWS Dynamo Streams ( Supported by AWS Kinesis Data Stream).

This pattern is also captured in the microservices Transaction Outbox pattern & Transaction Log Tailing pattern.

Frameworks like Axon & Eventuate provide support for this pattern.
Debezium
I moved this section to the Debezium post.
