Minor improvements, notes reg. cortex-gateway
ptodev committed Jan 12, 2023
1 parent bc7f5fb commit f79d69d
Showing 1 changed file with 17 additions and 2 deletions.
19 changes: 17 additions & 2 deletions docs/rfcs/0007-multi-tenant-remote-write-in-flow.md
@@ -65,7 +65,7 @@ There are various places where the label can be set:
* the contents of another label
* The label might already exist on the scraped metric itself.
* The label might already be set when converting from an OpenTelemetry metric.
* In prometheus.remotewrite through an external_label field.
* In prometheus.remote_write through an external_labels field.

#### Step 2 - Remote write using the tenant HTTP header

@@ -130,6 +130,7 @@ Above, ```__tenant__``` is a label which could be created by relabeling another
* The value for the HTTP header could be a label value (e.g. a ```__tenant__``` label used for sharding), but how do we represent this in River config? At the time of creating the config we know only the label name, not the label value.
* It’s a very generic feature and might complicate the code as the feature gets extended. We could try to manage this risk by having few and simple sharding policies.
* We need to come up with a cleanup policy for unnecessary shards. Potentially this should be configurable.
* Caching of series refs is per WAL, so if tenants have a lot of overlap with series, this can lead to ballooning memory usage.
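
The memory concern in the last point can be illustrated with a minimal sketch. It assumes one shard (and therefore one WAL-style series-ref cache) per tenant; the `shard` and `sharder` types and the `appendSample` method are hypothetical and are not the Agent's actual implementation.

```go
package main

import "fmt"

// shard is a hypothetical stand-in for one per-tenant WAL.
// Each shard keeps its own series-ref cache, as the con above describes.
type shard struct {
	refs    map[string]uint64 // series identity -> cached ref
	nextRef uint64
}

// refFor returns the cached ref for a series, allocating one on first sight.
func (s *shard) refFor(series string) uint64 {
	if ref, ok := s.refs[series]; ok {
		return ref
	}
	s.nextRef++
	s.refs[series] = s.nextRef
	return s.nextRef
}

// sharder routes samples to a shard keyed by tenant (hypothetical policy).
type sharder struct {
	shards map[string]*shard
}

func newSharder() *sharder {
	return &sharder{shards: map[string]*shard{}}
}

// appendSample looks up (or creates) the tenant's shard and caches the series ref there.
func (s *sharder) appendSample(tenant, series string) uint64 {
	sh, ok := s.shards[tenant]
	if !ok {
		sh = &shard{refs: map[string]uint64{}}
		s.shards[tenant] = sh
	}
	return sh.refFor(series)
}

// cachedRefs counts cache entries across all shards.
func (s *sharder) cachedRefs() int {
	total := 0
	for _, sh := range s.shards {
		total += len(sh.refs)
	}
	return total
}

func main() {
	s := newSharder()
	series := []string{`up{job="a"}`, `up{job="b"}`}
	for _, tenant := range []string{"team-a", "team-b", "team-c"} {
		for _, name := range series {
			s.appendSample(tenant, name)
		}
	}
	// The same two series are cached once per tenant shard.
	fmt.Println(s.cachedRefs())
}
```

In this sketch the cache grows with tenants × overlapping series rather than with the number of distinct series, which is why a cleanup policy for idle shards matters.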

### Solution 3 - A non-generic, tenant-specific configuration for the Agent

@@ -178,12 +179,26 @@ Refer to the Prometheus documentation [here](https://prometheus.io/docs/promethe
#### Pros

* Grafana customers who use Prometheus instead of the Agent would be able to fulfill their multi-tenancy needs too.
* Little work required on the Agent side
* Little work required on the Agent side.

#### Cons

* It would be slow to merge such a major feature into Prometheus

### Solution 4 - Use the "cortex-tenant" product

There is a third-party product called [cortex-tenant](https://github.com/blind-oracle/cortex-tenant), which acts as a very simple gateway between the Agent and the DB. It already exists and has been the de facto solution to this problem for the last few years for some customers.

#### Pros

* The software already exists.
* No additional Agent work is required.

#### Cons

* No WAL, so the resiliency is limited.
* Third party product with no official Grafana support.

## Edge cases to keep in mind

### What if tenants get renamed?