-
Notifications
You must be signed in to change notification settings - Fork 40
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
[nexus/sled-agent] move some types around, straighten up the dependency graph #6138
Merged
sunshowers
merged 15 commits into
main
from
sunshowers/spr/nexussled-agent-move-some-types-around-straighten-up-the-dependency-graph
Jul 25, 2024
Merged
[nexus/sled-agent] move some types around, straighten up the dependency graph #6138
sunshowers
merged 15 commits into
main
from
sunshowers/spr/nexussled-agent-move-some-types-around-straighten-up-the-dependency-graph
Jul 25, 2024
Conversation
This file contains bidirectional Unicode text that may be interpreted or compiled differently than what appears below. To review, open the file in an editor that reveals hidden Unicode characters.
Learn more about bidirectional Unicode characters
Created using spr 1.3.6-beta.1
Created using spr 1.3.6-beta.1
Created using spr 1.3.6-beta.1
Created using spr 1.3.6-beta.1
Created using spr 1.3.6-beta.1
Created using spr 1.3.6-beta.1
Created using spr 1.3.6-beta.1
Created using spr 1.3.6-beta.1
andrewjstone
approved these changes
Jul 23, 2024
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This must have been tedious @sunshowers. Thanks for the cleanup.
Created using spr 1.3.6-beta.1
…th eastasia Created using spr 1.3.6-beta.1
Created using spr 1.3.6-beta.1
Created using spr 1.3.6-beta.1
sunshowers
deleted the
sunshowers/spr/nexussled-agent-move-some-types-around-straighten-up-the-dependency-graph
branch
July 25, 2024 02:22
This was referenced Jul 25, 2024
sunshowers
added a commit
that referenced
this pull request
Aug 13, 2024
Building on the changes in #6138, this PR turns the sled-agent API into a trait. This is high-impact because it currently is part of a circular dependency with the Nexus internal API -- this PR gets us to the point where the OpenAPI documents can be generated independently of each other. There is some code movement but there are no functional changes in this PR -- I didn't use `replace` here to keep the PR small. I'll try out `replace` for some of the types with `From` impls in a subsequent PR. (I did change the internal `VpcFirewallRule` to `ResolvedVpcFirewallRule` to match the other types in the destination file.) I've also added some documentation about when `nexus-sled-agent-shared` should be used. Depends on #6283.
Sign up for free
to join this conversation on GitHub.
Already have an account?
Sign in to comment
Add this suggestion to a batch that can be applied as a single commit.
This suggestion is invalid because no changes were made to the code.
Suggestions cannot be applied while the pull request is closed.
Suggestions cannot be applied while viewing a subset of changes.
Only one suggestion per line can be applied in a batch.
Add this suggestion to a batch that can be applied as a single commit.
Applying suggestions on deleted lines is not supported.
You must change the existing code in this line in order to create a valid suggestion.
Outdated suggestions cannot be applied.
This suggestion has been applied or marked resolved.
Suggestions cannot be applied from pending reviews.
Suggestions cannot be applied on multi-line comments.
Suggestions cannot be applied while the pull request is queued to merge.
Suggestion cannot be applied right now. Please check back later.
Reorganize some of our data structures and simplify the dependency graph. In particular, there's now just one copy of
OmicronZonesConfig
and related types.This is mostly code reorganization, but there are a couple of interesting bits in here. Scroll down to the Questions section for more.
Motivation
1. The necessity of sled-agent-types
The primary driver behind this change was turning sled-agent's API into a trait -- breaking the sled-agent/nexus dependency cycle was one of the main motivations for that work. To recall, we currently have this logical cycle in the graph (reference):
With the work, we're attempting to move somewhere closer to this graph:
But to make that happen, some common types need to be moved into a
sled-agent-types
crate, so that the upcomingsled-agent-api
crate can depend on them.Why not put them in
sled-agent-api
itself? Because:sled-agent-client
, we often end up normalizing to a shared type -- so we either want to implementFrom
conversions between these types andsled-agent-client
types, or toreplace
them outright. So one of them has to depend on the other.replace
, the dependency direction must besled-agent-client
->sled-agent-api
.So the only feasible alternative is to put these types in a shared crate. This crate is
sled-agent-types
, which we recently introduced in #6122.2. sled-agent-types <-> nexus-types
For types shared between Nexus and sled-agent, where should we put them? The options are:
tokio-postgres
(and was actually split out of omicron-common for that reason). It wouldn't be great for nexus-client or sled-agent-client to transitively pull in tokio-postgres.Something to consider is whether we want to have two clients in omicron, one for internal consumers with more dependencies and one for external consumers with fewer ones. I'm not saying we should do this; it's just something to keep at the back of one's head.
The changes
There are a number of closely-related changes here -- most of them had to be atomic with this one. A description of the changes and the rationale for them:
1. Zone config types
These are
OmicronZonesConfig
,OmicronZoneConfig
,OmicronZoneType
, andOmicronZoneDataset
. In this PR, I've moved them fromsled-agent
tonexus-sled-agent-shared
.nexus-types
previously used the progenitor-generated copy of them.SocketAddr
/SocketAddrV6
for socket addresses instead of strings #4988.OmicronZonesConfig
in vscode always pointed me at the progenitor macro which wasn't helpful. Now, it points at the definition innexus-sled-agent-shared
.nexus-sled-agent-shared
.There's one incident that I found along the way that I believe is a bug fix. We have rules for DNS server IP addresses:
But the Progenitor-generated copy of
OmicronZonesConfig
uses strings to store address/port pairs. As a result, in some cases we managed to store an IPv4 address for an internal DNS server. With this PR, this case is now detected. (I think it would have been ultimately rejected at some point, but it's possible for the bad DNS server to end up being persisted.)2.
ZoneKind
There were previously two dataless enums describing the kind of Omicron zone floating around.
ZoneKind
insled-agent-client
, used by Nexus: exactlyOmicronZoneType
without any associated data.ZoneType
insled-agent
, used by sled-agent -- identifying the substring used within the service (notably withBoundaryNtp
andInternalNtp
combined into one).While looking around, I also noticed that there were actually four different string representations in use for Omicron zones:
Name
instanceHere, 1-3 are pretty similar to each other (they combine
InternalNtp
andBoundaryNtp
down to one, and are generated by sled-agent'sZoneType
).There is also a
ZoneType
in nexus-db-model that corresponds toZoneKind
.Based on this, I decided to combine it all down into one
ZoneKind
enum, with 4 functions returning 4 string representations: one each for 1-4.Display
impl -- users must choose the representation they want, which I think is correct.3. Physical disk config
OmicronPhysicalDiskConfig
andOmicronPhysicalDisksConfig
are shared types used by both Nexus and sled-agent. I moved them fromsled-storage
toomicron-common
, next toDiskIdentity
(which was already inomicron-common
and which these types use).nexus-types
used thesled-agent-client
-generated copy of these types -- but as discussed in the motivation section, that is no longer an option.nexus-sled-agent-shared
? Well,sled-storage
also needs these types, and I wanted to avoid sled-storage depending on the additional dependencies innexus-sled-agent-shared
.replace
to patch them in so that there's just one copy of these types throughout sled-agent and Nexus.4. Recovery silo types
These are two types,
RecoverySiloConfig
andCertificate
, that are primarily related to Nexus and thatsled-agent
was using vianexus-client
. Users of these types needed to be moved into sled-agent-types, which (as described in the motivation section) cannot depend on nexus-client. So I've moved them into shared crates.Certificate
is a simple type that can go intoomicron-common
'sapi::internal::nexus
.RecoverySiloConfig
depends onomicron-passwords
, so it goes innexus-sled-agent-shared
.We also patch these types in
nexus-client
. One behavior change as a result is that password hash strength is now validated on both the client and the server. I think this is okay, but it's worth mentioning explicitly.5.
UserId
So this is a really interesting one. It needs to be moved into a shared location because
RecoverySiloConfig
depends on it. But during this process I noticed that we weren't doing any validation on it in clients like wicketd. So this counts as a behavior change:UserId
is now validated on both the client and the server.Note that
UserId
is exposed by the Nexus external API. But this change itself has no impact on the external API (this can be seen by the fact thatnexus.json
doesn't have any changes in it)(I realized that the name should probably be something closer to
LocalUserId
, but that would be a breaking change to the external API so I've deferred it. Happy to change the name if it makes sense, though.)6. Inventory types
These are
Inventory
,InventoryDisk
andInventoryZpool
. I've moved them tonexus-sled-agent-shared
as a replacement for the previous scheme where we were using client-generated versions of them.7.
DiskVariant
,SledRole
, other misc typesThese are simple types depended upon by others that have been moved into shared crates -- locations chosen mostly based on which code depends on them.
OpenAPI changes
There are some changes to our OpenAPI specifications.
description
instances now that they're no longer going through Progenitor twice.Questions
Other notes
It would be nice to have tooling which asserts that e.g. none of the
-types
crates ever depend on-client
crates. Haven't written this tooling though.Previously I did similar things using guppy, but that's only available as a library and can take a while to build (the total number of build units for
xtask
goes from 254 to 279, which is significant). Turning this into a binary would solve this but requires some work.