Tracking issue for updating timeseries schema #5941

bnaecker · 2024-06-24T23:53:23Z

This issue tracks the work required to support updating timeseries schema.

Background

The current implementation of oximeter makes it impossible to update timeseries schema. As oximeter collects samples from producers, it derives a schema for them. It then checks for an existing schema in the ClickHouse database (based on the timeseries name only). If one does not exist, the schema is inserted. If one does exist, every sample thereafter must match that inserted schema.
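To make the constraint concrete, here is a minimal sketch of the check described above. It is not oximeter's actual code: `Schema`, `derive_schema`, `check_sample`, and the dict standing in for the ClickHouse schema table are all hypothetical names chosen for illustration.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Schema:
    """Hypothetical stand-in for a schema derived from a sample."""
    timeseries_name: str
    fields: frozenset  # set of (field_name, field_type) pairs
    datum_type: str


class SchemaMismatch(Exception):
    """Raised when a sample does not match the stored schema."""


def derive_schema(sample: dict) -> Schema:
    """Derive a schema from the shape of a single sample."""
    return Schema(
        timeseries_name=sample["timeseries_name"],
        fields=frozenset(
            (name, type(value).__name__)
            for name, value in sample["fields"].items()
        ),
        datum_type=type(sample["datum"]).__name__,
    )


def check_sample(db: dict, sample: dict) -> Schema:
    """Insert the schema if the name is new, else require an exact match."""
    derived = derive_schema(sample)
    # The lookup uses the timeseries name only, so there is no room
    # for a second, updated schema under the same name.
    existing = db.get(derived.timeseries_name)
    if existing is None:
        db[derived.timeseries_name] = derived
        return derived
    if existing != derived:
        raise SchemaMismatch(derived.timeseries_name)
    return existing
```

In this sketch, once a sample for a given name has been seen, any later sample that adds, removes, or retypes a field raises `SchemaMismatch`, which is the behavior that blocks schema updates today.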

RFD 467 (internal-only) discusses two options for making schema updates possible. A few outstanding questions remain, but a lot of the work can be done now. Here are the main pieces:

Describe timeseries schema in text files, rather than code. This lets us describe updates and attach much more metadata to the schema, independent of the stream of samples oximeter collects. The most important metadata is the version of each timeseries, though more will be included.
Track timeseries schema in CockroachDB. They will likely remain in ClickHouse to make querying via oxql easier, but the schema will be sent to oximeter from nexus, rather than derived from each sample. The simplest approach here would be to load these when nexus starts up, though we may also want individual producers to send them at registration time.
Update the ClickHouse database to include timeseries version information. This essentially means attaching a version column to every table, in addition to the current timeseries_name and timeseries_key columns. This will require changing the sorting key and timeseries keys for all samples, meaning we unfortunately need to drop all the historical data.
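As a rough illustration of the last item, the sketch below orders rows by a key that includes a version. The key layout is an assumption for illustration only: this issue does not specify where the version lands in the real sorting key, and `TimeseriesId` and `sort_key` are hypothetical names.

```python
from typing import NamedTuple


class TimeseriesId(NamedTuple):
    """Hypothetical identifier: timeseries name plus schema version."""
    timeseries_name: str
    timeseries_version: int


def sort_key(ts: TimeseriesId, timeseries_key: int, timestamp: float) -> tuple:
    """Assumed row ordering with the version as part of the key."""
    return (ts.timeseries_name, ts.timeseries_version, timeseries_key, timestamp)


# Rows as (identifier, timeseries_key, timestamp); values are made up.
rows = [
    (TimeseriesId("t:m", 2), 7, 10.0),
    (TimeseriesId("t:m", 1), 7, 20.0),
    (TimeseriesId("t:m", 1), 7, 5.0),
]

# Samples from different schema versions no longer interleave.
ordered = sorted(rows, key=lambda row: sort_key(*row))
```

Because the version participates in the ordering, rows written under the old name-only key are not laid out this way, which is one way to see why the historical data cannot simply be kept.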

This is broken down into individual work items below.

Tickets

Define timeseries schema in TOML #5889. This defines timeseries schema in text files, rather than code. The equivalent code is instead generated from the text files, which also contain a good bit more metadata than we currently maintain about the schema.
Move timeseries schema to CockroachDB #1294
Populate ClickHouse timeseries schema from TOML definitions #5942. The schema need to be in ClickHouse to support oxql queries and, more generally, to make the data understandable in a self-contained way when one has only the ClickHouse database. As part of this, we'll update the oximeter database schema to include the new metadata. This will unfortunately require a full drop of the DB and all historical data, since we're changing the sorting key of the tables and the timeseries keys on all the samples.
Move existing timeseries schema into the central library. This is really a bunch of smaller issues in each producer repo.
Enable actual timeseries schema updates #5943

There are a few things that will need to be fleshed out as we implement this. These are left as open questions on RFD 467. They include:

Do we have exactly one current version of a timeseries schema at any one time? The alternative would be supporting more than one, such as by returning data from schema versions consistent with a query; or allowing / requiring that users query a specific version of a timeseries. If we do have one version, we need to do some kind of data migration in ClickHouse, such as adding or removing fields on old samples.
How do we support backwards-incompatible changes?
How do schema get into CockroachDB? One approach would be to load them on startup from Nexus, similar to how we load other fixed data like VPC Firewall rules. An alternative is having producers include them at registration time. I'm not sure what the right approach is.
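For the first question, the "one current version" option implies rewriting old samples. A minimal sketch of what such a migration could look like follows, assuming samples are plain dicts and that each added field has a default value. `migrate_sample` and its parameters are hypothetical, and the real work would happen inside ClickHouse rather than in application code.

```python
def migrate_sample(
    sample: dict,
    added_fields: dict,
    removed_fields: set,
    to_version: int,
) -> dict:
    """Return a copy of `sample` reshaped to match a newer schema version."""
    # Drop fields that the newer schema no longer has.
    fields = {
        name: value
        for name, value in sample["fields"].items()
        if name not in removed_fields
    }
    # Fill in fields the newer schema added, using the supplied defaults.
    for name, default in added_fields.items():
        fields.setdefault(name, default)
    return {**sample, "fields": fields, "version": to_version}
```

For example, migrating a version 1 sample with fields `{"id": 1, "legacy": "x"}` using `added_fields={"rack": "unknown"}` and `removed_fields={"legacy"}` yields fields `{"id": 1, "rack": "unknown"}` at version 2. A change with no sensible default is exactly the backwards-incompatible case raised in the second question.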

More details

In addition to RFD 467, I wrote up this draft issue (internal-only).


bnaecker added the nexus (Related to nexus) and oximeter (Metrics) labels Jun 24, 2024

bnaecker self-assigned this Jun 24, 2024

bnaecker mentioned this issue Jun 24, 2024

Populate ClickHouse timeseries schema from TOML definitions #5942

Open

This was referenced Jul 8, 2024

Move transaction retry timeseries to TOML #6008

Merged

Remove base_route from oximeter_collector timeseries #6028

Open

Moves the BGP session timeseries to TOML #6038

Merged

Move switch table timeseries to TOML #6047

Merged
