Kafka: Emit production rate #17491

arunramani · 2024-11-19T16:29:41Z

Description

Add a metric to track the Kafka message production rate, which can be useful for correlating with Kafka lag. For each collection period, the latest Kafka broker offset is compared with the offset recorded in the previous minute to calculate the production rate.

One interesting use case is using this metric to roughly express lag as time. This can be done by finding the time t such that the sum of productionRate from t to now equals the lag, i.e. how many minutes of production equal the current lag.
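A minimal sketch of that calculation, assuming the production metric is available as a list of per-minute message counts (oldest first) and the lag is a message count. The function name `lag_as_minutes` and the linear interpolation within the final minute are illustrative assumptions, not part of this PR.

```python
from typing import Optional, Sequence


def lag_as_minutes(production_per_minute: Sequence[float], lag: float) -> Optional[float]:
    """Estimate how many minutes of recent production add up to `lag`.

    `production_per_minute` is ordered oldest first, so the last entry is
    the most recent minute. Returns None when the available history does
    not cover the whole lag.
    """
    if lag <= 0:
        return 0.0
    remaining = lag
    minutes = 0.0
    # Walk backwards from "now", consuming one minute of production at a time.
    for produced in reversed(production_per_minute):
        if produced >= remaining:
            # The lag is covered partway through this minute; interpolate
            # (assumes production was uniform within the minute).
            return minutes + remaining / produced
        remaining -= produced
        minutes += 1.0
    return None
```

For example, `lag_as_minutes([100, 200, 400], 500)` returns `1.5`: the most recent minute accounts for 400 messages, and half of the minute before it accounts for the remaining 100.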

Release note

Added a new streaming ingestion metric, ingest/kafka/partitionProduction.

Key changed/added classes in this PR

SeekableStreamingSupervisor
KafkaSupervisor

This PR has:

been self-reviewed.
added documentation for new or modified features or behaviors.
a release note entry in the PR description.
added Javadocs for most classes and all non-trivial methods. Linked related entities via Javadoc links.
added comments explaining the "why" and the intent of the code wherever would not be obvious for an unfamiliar reader.
added unit tests or modified existing tests to cover new code paths, ensuring the threshold for code coverage is met.
added integration tests.
been tested in a test Druid cluster.

kfaraz · 2024-11-20T06:59:55Z

@arunramani, could you please share some details on how this metric would provide additional insight that is not already covered by metrics like Kafka lag, partition lag and/or message gap?

kfaraz

@arunramani, there are some problems with the current approach.
There is no guarantee that metrics would be emitted strictly every minute. This could lead to incorrect rate values (and thus an incorrect calculation of time lag downstream), since we are really just emitting the diff between the latestSequence at the last metric emission time and the latestSequence at the current time.
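To illustrate the concern, here is a minimal sketch of how a raw offset diff differs from a rate normalized by the actual elapsed time between emissions. The function name and its parameters are hypothetical and do not correspond to code in this PR.

```python
from typing import Optional


def production_rate_per_minute(
    prev_offset: int,
    prev_millis: int,
    curr_offset: int,
    curr_millis: int,
) -> Optional[float]:
    """Messages produced per minute between two offset observations.

    Dividing by the real elapsed time keeps the rate meaningful even when
    the two observations are not exactly one minute apart.
    """
    elapsed_millis = curr_millis - prev_millis
    if elapsed_millis <= 0:
        # No time has passed (or the clock went backwards); no rate to report.
        return None
    return (curr_offset - prev_offset) * 60_000 / elapsed_millis
```

If the two emissions are 90 seconds apart and the offset advanced by 900, the raw diff reports 900 while `production_rate_per_minute(1000, 0, 1900, 90_000)` reports `600.0` messages per minute.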

IIUC, we are trying to calculate the lag in terms of "how many minutes worth of records is the supervisor yet to process".
This value is fairly similar to the message gap but not quite the same.
The message gap is the "difference between the current timestamp of the system and the timestamp of the latest ingested record".
Whereas here, I think we want the "difference between the timestamp of the latest ingested record and the timestamp of the latest record in the stream".

If that is the case, I wonder if we even need the production rate.
Instead, do you think it would make more sense to simply calculate this timestamp difference and emit it?
Or are there challenges with that approach?
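The two quantities being contrasted can be sketched as follows. This is an illustration only: the function names are hypothetical, timestamps are assumed to be epoch milliseconds, and clamping the time lag at zero is an added assumption to handle out-of-order record timestamps.

```python
def message_gap_millis(now_millis: int, latest_ingested_ts: int) -> int:
    """Message gap: current system time minus the timestamp of the
    latest ingested record."""
    return now_millis - latest_ingested_ts


def time_lag_millis(latest_stream_ts: int, latest_ingested_ts: int) -> int:
    """Proposed time lag: timestamp of the latest record in the stream
    minus the timestamp of the latest ingested record.

    Clamped at zero (an assumption), since record timestamps in a stream
    are not guaranteed to be monotonic.
    """
    return max(0, latest_stream_ts - latest_ingested_ts)
```

With a system time of 10,000 ms, a latest stream record at 9,000 ms and a latest ingested record at 4,000 ms, the message gap is 6,000 ms while the time lag is 5,000 ms. The two diverge further when the stream itself is idle, because the message gap keeps growing while the time lag does not.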

Emit production rate

43145a4

github-actions bot added the Area - Streaming Ingestion and Area - Ingestion labels Nov 19, 2024

Add some docs

fe72dc2

github-actions bot added the Area - Documentation label Nov 22, 2024

kfaraz reviewed Dec 5, 2024
