Set replication factor for kafka stability #1606

Open
wants to merge 1 commit into
base: develop

Conversation

fedeabih

Description

This change resolves the error "Failed to get watermark offsets: Local: Unknown partition". The root cause was the Kafka replication configuration. Setting replicationFactor to 3, which matches the number of Kafka brokers/controllers, makes retrieval of high watermark offsets consistent. The issue was reported in sentry-kubernetes/charts#1458.

Technical Explanation

The issue arises because replicationFactor was previously set to 1, so each partition had only a single replica. In this configuration, the high watermark offset (a key value in Kafka that marks the maximum offset successfully replicated to all in-sync replicas, or ISRs) depends entirely on that one replica and becomes unreliable.
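To make the definition concrete, here is a minimal sketch of how a high watermark can be derived from replica state. It is a simplified model, not Kafka's implementation: the `Replica` class and the `high_watermark` function are hypothetical names introduced only for illustration.

```python
from dataclasses import dataclass
from typing import List, Optional


@dataclass
class Replica:
    """Hypothetical, simplified view of one partition replica."""
    broker_id: int
    log_end_offset: int  # next offset this replica would write
    in_sync: bool = True  # whether the replica is currently in the ISR


def high_watermark(replicas: List[Replica]) -> Optional[int]:
    """Return the smallest log end offset among in-sync replicas.

    Returns None when no replica is in sync, i.e. there is nothing
    from which a high watermark could be derived in this toy model.
    """
    isr_offsets = [r.log_end_offset for r in replicas if r.in_sync]
    return min(isr_offsets) if isr_offsets else None
```

With a single replica, the result depends on that one entry alone. With three replicas, the value can still be derived when one of them drops out of the ISR.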

Without sufficient replication, the loss or temporary unavailability of a single broker can leave Kafka unable to compute or provide the high watermark for the affected partitions. This leads to the error:

Failed to get watermark offsets: Local: Unknown partition

By increasing the replicationFactor from 1 to 3, each partition is replicated across all three brokers/controllers. This keeps the high watermark offset consistently available, even if a broker becomes unavailable or experiences minor instability. The increased replication also improves fault tolerance and the overall availability of partition data across the cluster.
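The effect of the change can be illustrated with a toy model of a three-broker cluster. This is a sketch under stated assumptions: `assign_replicas` uses a simple round-robin placement that is not Kafka's actual assignment algorithm, and both function names are hypothetical.

```python
from typing import List, Set


def assign_replicas(partition: int, brokers: List[int], replication_factor: int) -> List[int]:
    """Place replicas round-robin across brokers (illustrative only)."""
    if replication_factor > len(brokers):
        # Kafka likewise rejects a replication factor larger than the broker count.
        raise ValueError("replication factor cannot exceed the number of brokers")
    start = partition % len(brokers)
    return [brokers[(start + i) % len(brokers)] for i in range(replication_factor)]


def available_partitions(num_partitions: int, brokers: List[int],
                         replication_factor: int, down: Set[int]) -> List[int]:
    """Return the partitions that still have at least one replica on a live broker."""
    return [
        p for p in range(num_partitions)
        if any(b not in down for b in assign_replicas(p, brokers, replication_factor))
    ]


# Broker 1 is down in both cases.
print(available_partitions(3, [0, 1, 2], replication_factor=1, down={1}))
print(available_partitions(3, [0, 1, 2], replication_factor=3, down={1}))
```

In this model, a replication factor of 1 leaves the partition hosted on the failed broker with no live replica, while a replication factor of 3 keeps every partition reachable.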

For more details on how Kafka replication works and the role of the high watermark, refer to the official documentation: Replication in Apache Kafka.

@fedeabih mentioned this pull request on Nov 22, 2024
@patsevanton
Contributor

@Mokto Please review the changes.

Contributor

@Mokto left a comment

Is it backward compatible?

3 participants