Yazılım Çorbası: Apache Kafka Replication

What Is Replication?

The explanation is as follows:

But what if a broker managing a single partition stops responding or goes down? How would you read the messages written to that partition?
Replication, or data redundancy, comes to the rescue here. Every partition has a replica stored on a different broker. Messages are synced, that is copied, to the replica. If the primary partition fails, the messages can be read from the replica.

Broker managing the primary partition is known as Leader for that partition. Other brokers which handle the replica for the same partition are followers.

A broker can be a leader for a given partition and it can function as a follower for other partitions.

Producers always write the data to the Leader. Once the Leader has appended the data to its log, followers fetch from the Leader to bring their replicas in sync with it.

Schematically, it works as follows. Writes to a partition go through the Leader Partition. To guard against data loss, the Leader Partition's data is also distributed to Follower Partitions running on different brokers within the same cluster.
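The flow described above can be sketched as a toy model in plain Python. This is not Kafka client code: the `Replica` class, its `fetch_from` method, and the broker ids are hypothetical names chosen for illustration, and an in-memory list stands in for the partition log.

```python
from dataclasses import dataclass, field


@dataclass
class Replica:
    """One copy of a partition, hosted on a single broker (toy model)."""
    broker_id: int
    log: list = field(default_factory=list)

    def fetch_from(self, leader: "Replica") -> int:
        """Pull the records this replica is missing; return how many were copied."""
        missing = leader.log[len(self.log):]
        self.log.extend(missing)
        return len(missing)


# Three copies of one partition: a leader and two followers on other brokers.
leader = Replica(broker_id=1)
followers = [Replica(broker_id=2), Replica(broker_id=3)]

# Producers write only to the leader.
leader.log.extend(["m0", "m1", "m2"])

# Followers pull from the leader to catch up.
copied = [follower.fetch_from(leader) for follower in followers]

# If broker 1 fails, the same messages can still be read from a follower.
survivor = followers[0]
print(copied, survivor.log)
```

The followers pull rather than being pushed to, which mirrors the quoted description of followers polling the leader.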

Leader Partition

The explanation is as follows. The leader partition performs the write, and the data is then replicated to the follower partitions.

Among the multiple copies of a partition, there is one `leader` and the remaining ones are `replicas/followers` that serve as backups. By default, Kafka consumers read only from the leader partition (since Kafka 2.4, consumers can optionally be configured to fetch from a follower). A leader and a follower of the same partition never reside on the same broker, because losing that broker would take both copies down at once. Followers continuously try to stay in sync with the leader. When a leader goes down, a new leader is chosen from among the in-sync followers. A topic is distributed across the brokers of a cluster, since the partitions of the topic reside on different brokers in the cluster.
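A minimal sketch of the two rules in this passage: replicas of one partition land on distinct brokers, and a new leader is picked from the in-sync followers. The function names `assign_replicas` and `elect_leader` are hypothetical, and the round-robin placement is a deliberate simplification, not Kafka's actual assignment algorithm.

```python
def assign_replicas(brokers, partitions, replication_factor):
    """Place each partition's replicas on distinct brokers, round-robin.

    The first broker in each list acts as the initial leader.
    """
    if replication_factor > len(brokers):
        raise ValueError("replication factor cannot exceed the number of brokers")
    return {
        p: [brokers[(p + i) % len(brokers)] for i in range(replication_factor)]
        for p in range(partitions)
    }


def elect_leader(replicas, isr, failed_broker):
    """Pick the first surviving in-sync replica, in assignment order."""
    for broker in replicas:
        if broker != failed_broker and broker in isr:
            return broker
    return None  # no eligible in-sync replica is left


assignment = assign_replicas(brokers=[1, 2, 3], partitions=3, replication_factor=2)

# Broker 1 leads partition 0; when it fails, in-sync follower 2 takes over.
new_leader = elect_leader(assignment[0], isr={1, 2}, failed_broker=1)
print(assignment, new_leader)
```

In this toy assignment broker 1 leads partition 0 and follows for partition 2, which illustrates that one broker can be a leader for some partitions and a follower for others.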

The explanation is as follows. There is a trade-off here between durability, meaning the persistence of the data, and availability, meaning continuity of service.

1. For each partition, there exists one leader broker and N follower brokers. The config which controls how many such brokers (1 + N) exist is replication.factor.
2. An in-sync replica (ISR) is a broker that has the latest data for a given partition. A leader is always an in-sync replica. A follower is an in-sync replica only if it has fully caught up to the partition leader it’s following.
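3. How long a follower may fall behind before it is dropped from the ISR is configured via replica.lag.time.max.ms. If a broker goes down or has network issues, it cannot keep up with the leader, and once that period elapses (10 seconds in this example; the default has been 30 seconds since Kafka 2.5) the broker is removed from the ISR.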
4. Producer clients only write to the leader broker; the followers asynchronously replicate the data.
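5. replication.factor is chosen per topic when the topic is created (brokers supply a default, default.replication.factor, for automatically created topics). It denotes the total number of times the data inside a single partition is replicated across the cluster.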
6. acks is a producer configuration which specifies the number of brokers that must receive the record before it is considered a successful write.
7. When a producer is set to acks=0, the write is considered successful the moment the request is sent out, without waiting for the broker to send an acknowledgement.
8. When a producer is set to acks=1, the write is considered successful when the message is acknowledged by only the partition leader.
9. When a producer is set to acks=all or acks=-1, the write is considered successful once all in-sync replicas have the record, and min.insync.replicas specifies the minimum number of in-sync replicas that must exist in the ISR list for the request to be processed.
10. The acks and min.insync.replicas settings together form a good way to configure the preferred trade-off between durability guarantees and availability.
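The interplay of items 3 and 6 through 9 can be sketched as two small functions. This is a simplified model, not broker code: `shrink_isr`, `write_result`, the timestamp dictionary, and the returned strings are all hypothetical, and only the setting names (acks, min.insync.replicas, replica.lag.time.max.ms) come from the list above.

```python
def shrink_isr(last_caught_up_ms, leader, now_ms, replica_lag_time_max_ms):
    """Return the brokers still in sync: the leader plus recently caught-up followers."""
    return {
        broker
        for broker, seen_ms in last_caught_up_ms.items()
        if broker == leader or now_ms - seen_ms <= replica_lag_time_max_ms
    }


def write_result(acks, isr_size, min_insync_replicas):
    """Describe when a produce request counts as successful (simplified)."""
    if acks == "0":
        return "success on send, no acknowledgement awaited"
    if acks == "1":
        return "success once the leader acknowledges"
    if acks in ("all", "-1"):
        if isr_size < min_insync_replicas:
            return "rejected: not enough in-sync replicas"
        return "success once all in-sync replicas have the record"
    raise ValueError(f"unsupported acks value: {acks!r}")


# Broker 3 last caught up 15 s ago, which exceeds the 10 s limit used here.
last_caught_up_ms = {1: 50_000, 2: 49_000, 3: 35_000}
isr = shrink_isr(last_caught_up_ms, leader=1, now_ms=50_000,
                 replica_lag_time_max_ms=10_000)

relaxed = write_result("all", len(isr), min_insync_replicas=2)
strict = write_result("all", len(isr), min_insync_replicas=3)
print(sorted(isr), relaxed, strict, sep="\n")
```

With two brokers left in the ISR, min.insync.replicas=2 keeps the partition writable while min.insync.replicas=3 refuses writes: the stricter setting favors durability, the looser one favors availability.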


Wednesday, 20 January 2021
