-
Notifications
You must be signed in to change notification settings - Fork 40
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Define timeseries schema in TOML #5889
Conversation
eff0a37
to
13796f2
Compare
Note to self: make issues for sequence of work, and refer to them in this PR instead of |
0933ece
to
3199953
Compare
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Sorry for the many nits and questions / thanks for your patience! I haven't done a lot with oximeter so I hope these aren't way off base.
/// | ||
/// As developers define new timeseries, those should be added to this test for | ||
/// checking. To define metrics in their own crates, developers use the | ||
/// proc-macro `oximeter::use_timeseries!()` |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
We chatted about this quite a lot offline. Trying to summarize the highlights: I'd vote to centralize the TOML files and definitions of all the timeseries types, either into its own crate in omicron
or its own crate in a separate repo altogether. Some pros:
- Needing to add newly added timeseries types to this test seems really easy to forget to do, especially if the timeseries is defined outside omicron entirely. If all of them are centralized, presumably that crate can ensure that any tests it wants to run includes all of them.
- We don't yet know exactly how we're going to support live upgrades where we might have different versions of Nexus and/or different versions of metrics producers concurrently operating within one rack, but we've talked about some ideas for how to handle that that involve tooling to check for compatibility across versions on the OpenAPI side. If we go that route and want to write similar tooling for Oximeter, my gut feeling is that will be easier to do if all the TOMLs and types are in one place.
- Similarly, if we decide to make changes to the TOML format itself, that's much easier if they're all living in a centralized place.
The biggest downside (we think?) to centralizing them is for timeseries that are defined outside omicron; updating or adding such will require
- A PR to omicron to update the TOML and generate new types
- A PR to the outside-omicron repo to pull in the new omicron
- Another PR to omicron to pull in the new outside-omicron bits from 2
(This is a really unfortunate cross-repo cycle already, but I think that's the world we're living in now regardless.) If this is a major pain point but we prefer the centralized timeseries/TOMLs otherwise, that might be an argument for moving them to their own repo that both omicron and other timeseries-users could depend on.
Ok, I believe I've addressed most of @jgallagher's comments. The big outstanding chunk I'd like to do is bite the bullet and centralize everything in one crate, probably |
Ok @jgallagher I've also centralized the timeseries definitions into the |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
LGTM! Just one note / question about how strict we want to be with schema location.
Thanks for the review @jgallagher! All the comments were really helpful. I'm going to give Adam a chance to take a look, and then I'll fix up the conflicts on the way in. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
maybe I found a bug... or I'm confused
Aight, I'm going to start rebasing on main to resolve the conflicts before smashing that merge button. |
3dddfc3
to
a675cf4
Compare
b1be993
to
1fa8020
Compare
This is the first commit in a thread of work to support updates to timeseries schema. In our first step, we're moving the definition of timeseries from the Rust structs we use today, to TOML text files. Each file describes one target and all the metrics associated with it, from which we generate exactly the same Rust code that it replaces. This opens the door to a lot of downstream work, including well-defined updates to schema; detection of conflicting schema; improved metadata for timeseries; and generation of documentation. As an example, this moves exactly one (1) timeseries into the new format, the physical datalink statistics tracked through illumos kstats. - Move the `oximeter` crate into a private implementation crate, and re-export it at its original path - Add intermediate representation of timeseries schema, and code for deserializing TOML timeseries definitions, checking them for conflicts, and handling updates. Note that **no actual updates** are currently supported. The ingesting code includes a check that version numbers are exactly 1, except in tests. We include tests that check updates in a number of ways, but until the remainder of the work is done, we limit timeseries in the wild to their current version of 1. The actual file format also has version number associated with it, currently asserted to be 1, so that we can more easily update the entire format if needed. - Add a macro for consuming timeseries definitions in TOML, and generating the equivalent Rust code. Developers should use this new `oximeter::use_timeseries!()` proc macro to create new timeseries. - Add an integration test in `oximeter` that pulls in all the centrally-defined timeseries schema, and checks them all for conflicts. As developers add new timeseries in TOML format, they will be picked up automatically by this test, to detect problems at merge time. - Updates `kstat-rs` dep to simplify platform-specific code. This is part of this commit because we need the definitions of the physical-data-link timeseries for our integration test, but those were previously only compiled on illumos platforms. They're now included everywhere, while the tests remain illumos-only.
1fa8020
to
72a1a6c
Compare
This is the first commit in a thread of work to support updates to timeseries schema. In our first step, we're moving the definition of timeseries from the Rust structs we use today, to TOML text files. Each file describes one target and all the metrics associated with it, from which we generate exactly the same Rust code that it replaces.
This opens the door to a lot of downstream work, including well-defined updates to schema; detection of conflicting schema; improved metadata for timeseries; and generation of documentation. As an example, this moves exactly one (1) timeseries into the new format, the physical datalink statistics tracked through illumos kstats.
oximeter
crate into a private implementation crate, and re-export it at its original pathoximeter::use_timeseries!()
proc macro to create new timeseries.oximeter::use_timeseries!()
, and checks them all for conflicts. As developers add new timeseries in TOML format, they must be added here as well to ensure we detect problems at CI time.kstat-rs
dep to simplify platform-specific code. This is part of this commit because we need the definitions of the physical-data-link timeseries for our integration test, but those were previously only compiled on illumos platforms. They're now included everywhere, while the tests remain illumos-only.