
feat(kafka): tenant topics #15977

Merged: 17 commits merged into grafana:main on Jan 30, 2025

Conversation

@owen-d (Member) commented Jan 28, 2025

Configurable Topic Partitioning Strategies for Kafka

This PR introduces configurable topic partitioning strategies for the Kafka tenant topic writer. Users can now choose between two strategies:

Simple Strategy (Default)

  • Creates one topic per tenant with multiple partitions
  • Uses hash-based partitioning for log distribution
  • Suitable for basic use cases where tenant data volume is predictable

Automatic Strategy

  • Creates single-partition topics in the format <prefix>.<tenant>.<shard>
  • Dynamically scales by creating new shards as needed
  • Allows both scaling up and down (by stopping writes to higher-numbered shards)
  • Better suited for tenants with varying data volumes
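The automatic strategy's topic naming can be sketched as a simple string format. The `<prefix>.<tenant>.<shard>` layout is from the PR description; the helper name below is hypothetical, not the PR's actual code:

```go
package main

import "fmt"

// shardTopic sketches how the automatic strategy derives a topic name.
// The "<prefix>.<tenant>.<shard>" format is from the PR description; this
// function name and signature are illustrative assumptions.
func shardTopic(prefix, tenant string, shard int) string {
	return fmt.Sprintf("%s.%s.%d", prefix, tenant, shard)
}

func main() {
	// With the default prefix "loki.tenant", shard 0 for tenant "team-a":
	fmt.Println(shardTopic("loki.tenant", "team-a", 0)) // loki.tenant.team-a.0
}
```

Because each such topic has exactly one partition, scaling down is just a matter of no longer producing to higher-numbered shards; retention eventually cleans them up.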

Configuration

The strategy can be configured via YAML or flags:

tenant_topic:
  enabled: true
  strategy: "simple" # or "automatic"
  # ... other config options

@github-actions bot added the type/docs label on Jan 28, 2025
@owen-d owen-d marked this pull request as ready for review January 29, 2025 20:43
@owen-d owen-d requested a review from a team as a code owner January 29, 2025 20:43
@@ -2368,6 +2368,31 @@ otlp_config:
# Enable writes to Ingesters during Push requests. Defaults to true.
# CLI flag: -distributor.ingester-writes-enabled
[ingester_writes_enabled: <boolean> | default = true]

tenant_topic:
# Enable the tenant topic tee
Contributor:
You should probably define what a "tee" is here.


# Maximum size of a single Kafka record in bytes
# CLI flag: -distributor.tenant-topic-tee.max-record-size-bytes
[maxrecordsizebytes: <int> | default = 15MiB249KiB]
Contributor:
Is the default a typo? Or is this some developer magic I just haven't encountered before?

@owen-d (Member, Author) replied Jan 29, 2025:

This is a derived value based on some existing constants in the code, which unfortunately creates a less than pleasant reading experience for the default.

	// ProducerBatchMaxBytes is the max allowed size of a batch of Kafka records.
	ProducerBatchMaxBytes = 16_000_000

	// MaxProducerRecordDataBytesLimit is the max allowed size of a single record data. Given we have a limit
	// on the max batch size (ProducerBatchMaxBytes), a Kafka record data can't be bigger than the batch size
	// minus some overhead required to serialise the batch and the record itself. We use 16KB as such overhead
	// in the worst case scenario, which is expected to be way above the actual one.
	MaxProducerRecordDataBytesLimit = ProducerBatchMaxBytes - 16384
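The odd-looking `15MiB249KiB` default checks out arithmetically: 16,000,000 − 16,384 = 15,983,616 bytes, which is exactly 15 MiB + 249 KiB. A small sketch (the helper name is illustrative, not from the PR):

```go
package main

import "fmt"

// Constants from the discussion above: a 16 MB batch cap minus 16 KiB of
// assumed serialization overhead.
const producerBatchMaxBytes = 16_000_000
const maxRecordDataBytes = producerBatchMaxBytes - 16384 // 15,983,616 bytes

// recordLimitString is a hypothetical helper that renders a byte count as
// whole MiB plus whole KiB, the way the config docs display the default.
func recordLimitString(n int) string {
	mib := n / (1024 * 1024)
	kib := (n % (1024 * 1024)) / 1024
	return fmt.Sprintf("%dMiB%dKiB", mib, kib)
}

func main() {
	// 15,983,616 = 15*1,048,576 + 249*1,024 with no remainder.
	fmt.Println(recordLimitString(maxRecordDataBytes)) // prints 15MiB249KiB
}
```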

Comment on lines 2378 to 2388
# Prefix to prepend to tenant IDs to form the final Kafka topic name
# CLI flag: -distributor.tenant-topic-tee.topic-prefix
[topicprefix: <string> | default = "loki.tenant"]

# Maximum number of bytes that can be buffered before producing to Kafka
# CLI flag: -distributor.tenant-topic-tee.max-buffered-bytes
[maxbufferedbytes: <int> | default = 100MiB]

# Maximum size of a single Kafka record in bytes
# CLI flag: -distributor.tenant-topic-tee.max-record-size-bytes
[maxrecordsizebytes: <int> | default = 15MiB249KiB]
Member:
nit: should these names be snake_cased?

Comment on lines +43 to +53
// ParseStrategy converts a string to a Strategy
func ParseStrategy(s string) (Strategy, error) {
switch s {
case "simple":
return SimpleStrategy, nil
case "automatic":
return AutomaticStrategy, nil
default:
return SimpleStrategy, fmt.Errorf("invalid strategy %q, must be either 'simple' or 'automatic'", s)
}
}
Member:
nit: Would implementing encoding.TextUnmarshaler be more idiomatic?
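For context, the reviewer's suggestion could look like the sketch below: implementing `encoding.TextUnmarshaler` lets YAML decoders and flag parsers convert the string directly, without an explicit `ParseStrategy` call at every site. The `Strategy` type and constant names mirror the diff; this `UnmarshalText` body is only an illustration of the idea, not code from the PR:

```go
package main

import "fmt"

// Strategy mirrors the type in the diff; values are illustrative.
type Strategy int

const (
	SimpleStrategy Strategy = iota
	AutomaticStrategy
)

// UnmarshalText implements encoding.TextUnmarshaler, so config decoders can
// populate a Strategy field from "simple" or "automatic" automatically.
func (s *Strategy) UnmarshalText(text []byte) error {
	switch string(text) {
	case "simple":
		*s = SimpleStrategy
	case "automatic":
		*s = AutomaticStrategy
	default:
		return fmt.Errorf("invalid strategy %q, must be either 'simple' or 'automatic'", text)
	}
	return nil
}

func main() {
	var s Strategy
	if err := s.UnmarshalText([]byte("automatic")); err != nil {
		panic(err)
	}
	fmt.Println(s == AutomaticStrategy) // true
}
```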

Comment on lines +329 to +331
if len(streams) == 0 {
return
}
Member:
tiniest nit: this condition could be moved to TenantTopicWrite.Duplicate to avoid spawning a goroutine when there's no work to do

Comment on lines +129 to +136
// 1. Dynamic Scaling:
// - Topics are created in the form "<prefix>.<tenant>.<shard>"
// - Each topic has exactly one partition, with shards serving the same purpose
// as traditional partitions
// - Unlike traditional partitions which can only be increased but never decreased,
// this approach allows for both scaling up and down
// - When volume decreases, we can stop writing to higher-numbered shards
// - Old shards will be automatically cleaned up through normal retention policies
Member:
Using the topic name for sharding is neat, I'm interested in seeing how it goes.

If we ever need to, another solution would be to support "scale down" by adding a generation number to the topic, and writing to the newest generation (with some background discovery on a timer).

That would come with its own set of problems, probably, but it would allow us to rely on Kafka's builtin partitioning more heavily.

Either way, I don't think something like that needs to be implemented right now and I just wanted to openly share the idea. This looks good 👍
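The generation idea floated above could be sketched as a naming variant. Nothing below exists in the PR; the format and helper are purely hypothetical illustrations of the reviewer's suggestion:

```go
package main

import "fmt"

// generationTopic sketches the alternative scale-down scheme: encode a
// generation number in the topic name and always write to the newest
// generation, letting Kafka's builtin partitioning work within it.
// Entirely hypothetical; not part of this PR.
func generationTopic(prefix, tenant string, generation int) string {
	return fmt.Sprintf("%s.%s.g%d", prefix, tenant, generation)
}

func main() {
	fmt.Println(generationTopic("loki.tenant", "team-a", 2)) // loki.tenant.team-a.g2
}
```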

@owen-d owen-d enabled auto-merge (squash) January 30, 2025 17:40
@owen-d owen-d merged commit c258419 into grafana:main Jan 30, 2025
60 checks passed
Labels: size/L, type/docs