Define switch data link timeseries in TOML #6120

bnaecker · 2024-07-18T20:31:33Z

Note that this renames and reorganizes the timeseries a good deal. We want to include more of the switch / sled identifiers, and make the names of the timeseries more consistent with the existing sled data link and planned instance data link timeseries.

bnaecker · 2024-07-18T21:12:20Z

Here's a bit more context. This completely abandons the existing timeseries like data_link:rx_bytes. Instead, it creates a new timeseries switch_data_link:bytes_received, to mirror the planned timeseries sled_data_link:bytes_received and guest_data_link:bytes_received. We don't yet have the ability to update timeseries in place (that's in progress), but we can pretty easily abandon old ones and define new ones. I've taken that path here.

Once we're all happy with the actual new definition (fields, names, descriptions, etc), I have some other WIP in Dendrite to fetch the needed identifiers (for example, switch serial numbers), and populate the new timeseries definitions with them. At some point after that, I will explicitly expunge the existing data_link:* timeseries.

The other big change here is that I've removed all the per-FSM-state timeseries and merged them into a single timeseries that keeps the state itself as a field. That's because I expect we'll want to aggregate across these states more easily. That would be tricky in OxQL now, because we don't currently support aggregating across timeseries, only within a single timeseries. For example, we can still do this to get the same data as before:

> get switch_data_link:link_fsm | filter state == "ber_check_done"

That would be equivalent to this today:

> get data_link:b_e_r_check_done

We can then also do aggregations, say if we wanted to count all the transitions of any kind, not just in a particular state:

> get switch_data_link:link_fsm | align mean_within(1m) | group_by [state]
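To make the "state as a field" idea concrete, here is a minimal Rust sketch. It is only an illustration under stated assumptions, not code from Omicron, Dendrite, or the OxQL engine: the `FsmSample` struct, its `link_id` and `transitions` fields, and the `hypothetical_state` value are all invented for the example. Only the timeseries name and the `ber_check_done` state come from this thread. It shows that once the state is a field of one timeseries, selecting one state and aggregating over all states are both simple operations on the same set of samples.

```rust
use std::collections::BTreeMap;

/// One sample from a simplified, hypothetical `switch_data_link:link_fsm`
/// timeseries. The real schema has more fields; `link_id` and `transitions`
/// are assumptions made for this sketch.
#[derive(Debug, Clone)]
struct FsmSample {
    link_id: u32,
    state: &'static str,
    transitions: u64,
}

/// Keep only the samples whose `state` field matches, which is roughly what
/// filtering on the field does within a single timeseries.
fn filter_state<'a>(samples: &'a [FsmSample], state: &str) -> Vec<&'a FsmSample> {
    samples.iter().filter(|s| s.state == state).collect()
}

/// Sum transitions per state, across all links.
fn total_by_state(samples: &[FsmSample]) -> BTreeMap<&'static str, u64> {
    let mut totals = BTreeMap::new();
    for s in samples {
        *totals.entry(s.state).or_insert(0) += s.transitions;
    }
    totals
}

/// Sum transitions across every state and link.
fn total_transitions(samples: &[FsmSample]) -> u64 {
    samples.iter().map(|s| s.transitions).sum()
}

/// Made-up data: two links reporting `ber_check_done`, one reporting a
/// placeholder state that does not correspond to a real FSM state.
fn example_samples() -> Vec<FsmSample> {
    vec![
        FsmSample { link_id: 0, state: "ber_check_done", transitions: 3 },
        FsmSample { link_id: 1, state: "ber_check_done", transitions: 2 },
        FsmSample { link_id: 0, state: "hypothetical_state", transitions: 5 },
    ]
}

fn main() {
    let samples = example_samples();
    for s in filter_state(&samples, "ber_check_done") {
        println!("link {}: {} transitions", s.link_id, s.transitions);
    }
    println!("per state: {:?}", total_by_state(&samples));
    println!("all states: {}", total_transitions(&samples));
}
```

With one timeseries per state, the last two functions would need to combine data from several differently named timeseries, which is the cross-timeseries aggregation that is not supported today.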

Anyway, this is all a bit of an experiment, but hopefully that provides some more insight into where we want to go and why this is the next step along the way.

bnaecker · 2024-07-23T16:36:23Z

Friendly ping on this one folks

Nieuwejaar

It's obvious enough that I probably don't even need to mention it, but... it sure would be nice if there were an automated way to ensure that this definition stays in sync with the upstream data source. I did a side-by-side visual comparison, but that's not the most confidence-inspiring mechanism.

bnaecker · 2024-07-23T17:07:07Z

@Nieuwejaar What is the "upstream data source" in your comment?

Nieuwejaar · 2024-07-23T17:32:09Z

@Nieuwejaar What is the "upstream data source" in your comment?

In this case, Dendrite. Since our counters are defined by the ASIC, they are unlikely to change (until/unless we switch ASICs), but I was thinking of the more general problem of keeping these schemas in sync with those used by the producers.

bnaecker · 2024-07-23T17:38:34Z

Ok, I think I understand. One of the main goals of all the work to move the timeseries definitions into one central location is to ensure consistency. I have a follow-up PR in the works which will remove the existing timeseries definitions (e.g., the ones like make_cumulative!(Capacity);) and replace them with the corresponding timeseries from the new TOML definitions.

So if we decided to publish a new ASIC counter, we would add that to the TOML definitions, and then consume it in Dendrite and start publishing that when Oximeter comes knocking. Hopefully that assuages the concern about keeping things in sync: the text-file definitions in Omicron will be the single source of truth.
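As a rough illustration of the kind of automated check a single source of truth makes possible, here is a hedged Rust sketch. It is hypothetical and does not reflect how Omicron or Dendrite actually validate anything: the function name `undefined_timeseries` and the idea of representing both sides as plain sets of timeseries names are assumptions for the example. It reports any name a producer publishes that has no central definition.

```rust
use std::collections::BTreeSet;

/// Return the timeseries names a producer publishes that are missing from
/// the centrally defined set. An empty result means the producer is in sync.
/// (Hypothetical helper; not an Omicron or Dendrite API.)
fn undefined_timeseries<'a>(
    defined: &BTreeSet<&'a str>,
    published: &BTreeSet<&'a str>,
) -> Vec<&'a str> {
    published.difference(defined).copied().collect()
}

fn main() {
    // Names assumed to come from the central definitions.
    let defined: BTreeSet<&str> = [
        "switch_data_link:bytes_received",
        "switch_data_link:link_fsm",
    ]
    .into_iter()
    .collect();

    // Names a producer currently emits, including one abandoned name.
    let published: BTreeSet<&str> = [
        "switch_data_link:bytes_received",
        "data_link:rx_bytes",
    ]
    .into_iter()
    .collect();

    let stale = undefined_timeseries(&defined, &published);
    if stale.is_empty() {
        println!("producer matches the central definitions");
    } else {
        println!("published but not defined: {:?}", stale);
    }
}
```

A check like this only compares names. Comparing field names, field types, and units would need the full definitions on both sides.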

Nieuwejaar · 2024-07-23T17:44:32Z

Hopefully that assuages the concern about keeping things in sync: the text-file definitions in Omicron will be the single source of truth.

Very much so. Thanks.

zeeshanlakhani · 2024-07-24T19:48:42Z

guest_data_link:bytes_received

This is what we called instance previously?

zeeshanlakhani · 2024-07-24T19:49:56Z

The other big change here is that I've removed all the per-FSM-state timeseries and made them all one, that keeps the state itself as a field. That's because I kind of expect we'll want to more easily aggregate across these states. That would be kind of tricky in OxQL now, because we don't currently support aggregating across timeseries, but you can aggregate within a timeseries. For example, we can still do this to get the same data as before:

Should we update/collate a readme to update OxQL queries here to match the changes? I had planned to do this at some point too.

zeeshanlakhani

@benjaminleonard This essentially maps what we have today and streamlines how we want to capture all these across the system, so 👍🏽. I know you mentioned that you're working on schema updates, as we'll probably want to update fields over time as we add more.

zeeshanlakhani · 2024-07-24T20:02:36Z

Approved here, but had one question on documentation really.

bnaecker · 2024-07-24T20:16:22Z

guest_data_link:bytes_received

This is what we called instance previously?

This timeseries does not exist yet. We'll need to create it soon, and probably publish those samples to oximeter from the Propolis zone. I made this change here so the name and fields match up more directly with that to-be-created timeseries.

Should we update/collate a readme to update OxQL queries here to match the changes? I had planned to do this at some point too.

The current place to do that is RFD 463, which I still owe an update on. Part of the motivation of the TOML-formatted timeseries definitions is so that we can auto-generate documentation too. Maybe it would be worth coordinating with @benjaminleonard and / or @david-crespo on how to create those and get them into our documentation pages and / or the console itself. That would be really sweet.

zeeshanlakhani · 2024-07-24T20:25:56Z

This timeseries does not exist yet. We'll need to create it soon, and probably publish those samples to oximeter from the Propolis zone. I made this change here so the name and fields match up more directly with that to-be-created timeseries.

Oh, I know that it didn't. I meant what I called it in https://github.com/orgs/oxidecomputer/projects/55/views/1?filterQuery=&pane=issue&itemId=68336554, maybe?

bnaecker · 2024-07-24T21:10:38Z

Oh, I see. Yeah, instance_data_link:* might be better; I was going off memory.

bnaecker requested review from Nieuwejaar and zeeshanlakhani July 18, 2024 20:55

bnaecker added 2 commits July 18, 2024 21:37

Add Units::None, remove duplicated timeseries

7ebafce

update openapi spec

ea8e041

Nieuwejaar approved these changes Jul 23, 2024

zeeshanlakhani approved these changes Jul 24, 2024

bnaecker merged commit 18f2520 into main Jul 24, 2024
23 checks passed

bnaecker deleted the move-switch-data-link-timeseries-to-toml branch July 24, 2024 21:13
