[inventory] Add Sled Agent datasets to inventory #6167
This PR builds on the work in #6144 - it's mostly for visibility, to help us see both "what datasets are configured to run on a sled" and also "what datasets actually exist on a sled" without needing to have shell access on the sled itself.
This PR exposes an API from the Sled Agent which allows Nexus to configure datasets independently from Zones.

Here's an example subset of `zfs list -o name` on a deployed system, with some annotations in-line:

```bash
# This is the pool of an arbitrary U.2
oxp_e12f29b8-1ab8-431e-bc96-1c1298947980
# Crucible has a dataset that isn't encrypted at the ZFS layer, because it's encrypted internally...
oxp_e12f29b8-1ab8-431e-bc96-1c1298947980/crucible
# ... and it contains a lot of region datasets.
oxp_e12f29b8-1ab8-431e-bc96-1c1298947980/crucible/regions/...
# We have a dataset which uses a trust-quorum-derived encryption key.
oxp_e12f29b8-1ab8-431e-bc96-1c1298947980/crypt
# Durable datasets (e.g. Cockroach's) can be stored in here.
oxp_e12f29b8-1ab8-431e-bc96-1c1298947980/crypt/cockroachdb
# The "debug" dataset has been historically created by + managed by the Sled Agent.
oxp_e12f29b8-1ab8-431e-bc96-1c1298947980/crypt/debug
# Transient zone filesystems also exist here, and are encrypted.
oxp_e12f29b8-1ab8-431e-bc96-1c1298947980/crypt/zone
oxp_e12f29b8-1ab8-431e-bc96-1c1298947980/crypt/zone/oxz_cockroachdb_8bbea076-ff60-4330-8302-383e18140ef3
oxp_e12f29b8-1ab8-431e-bc96-1c1298947980/crypt/zone/oxz_crucible_a232eba2-e94f-4592-a5a6-ec23f9be3296
```

## History

Prior to this PR, the sled agent exposed no interfaces to **explicitly** manage datasets on their own. Datasets could be created one of two ways:

1. Created and managed by the sled agent, without telling Nexus. See: the `debug` dataset.
2. Created in response to requests from Nexus to create zones. See: `crucible`, `cockroachdb`, and the `zone` filesystems above.

These APIs did not provide a significant amount of control over dataset usage, and provided no mechanism for setting quotas and reservations.

## This PR

- Expands Nexus' notion of "dataset kind" to include the following variants:
  - `zone_root`, for the `crypt/zone` dataset,
  - `zone`, for any dataset within `crypt/zone` (e.g., `crypt/zone/oxz_cockroachdb_8bbea076-ff60-4330-8302-383e18140ef3`).
  - `debug` for the `crypt/debug` dataset.
- Adds two endpoints to Sled Agent: `datasets_put`, and `datasets_get`, for setting a configuration of expected datasets. At the moment, `datasets_put` is purely additive, and does not remove any missing datasets.
- This API provides a mechanism for Nexus to manage quotas and reservations, which it will do in the future.

This PR is related to #6167, which provides additional tooling through the inventory for inspecting dataset state on deployed sleds.

Fixes #6042, #6107

---------

Co-authored-by: Rain <[email protected]>
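To make the dataset kinds above concrete, here is a minimal sketch of how a full ZFS dataset name from the `zfs list` example could be mapped to a kind. The `DatasetKind` enum and the `classify` function are hypothetical names made up for illustration; they mirror the kinds named in the description but are not the real Omicron types, and the real code may classify datasets differently.

```rust
/// Hypothetical mirror of the dataset kinds named in the description above.
/// This is an illustration only, not the real Omicron type.
#[derive(Debug, PartialEq, Eq)]
pub enum DatasetKind {
    Crucible,
    Cockroach,
    Debug,
    ZoneRoot,
    /// A transient zone filesystem; carries the last path component.
    Zone { name: String },
}

/// Classify a full ZFS dataset name (e.g. "oxp_<uuid>/crypt/debug") by its
/// path relative to the pool. Returns `None` for anything this sketch does
/// not recognize, including the pool root and the "crypt" parent itself.
pub fn classify(full_name: &str) -> Option<DatasetKind> {
    // Split "pool/rest"; a bare pool name has no '/' and yields None.
    let (_pool, rest) = full_name.split_once('/')?;
    match rest {
        "crucible" => Some(DatasetKind::Crucible),
        "crypt/cockroachdb" => Some(DatasetKind::Cockroach),
        "crypt/debug" => Some(DatasetKind::Debug),
        "crypt/zone" => Some(DatasetKind::ZoneRoot),
        _ => {
            let zone = rest.strip_prefix("crypt/zone/")?;
            // Only direct children of crypt/zone count as zone datasets.
            if zone.is_empty() || zone.contains('/') {
                None
            } else {
                Some(DatasetKind::Zone { name: zone.to_string() })
            }
        }
    }
}
```

With this sketch, `classify("oxp_.../crypt/zone/oxz_crucible_...")` yields a `Zone`, while region datasets under `crucible/regions/` are deliberately left unclassified.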
Thanks! Generally looks good, mostly a bunch of local Rust comments.
@@ -4744,6 +4744,44 @@ fn inv_collection_print_sleds(collection: &Collection) {
sled.reservoir_size.to_whole_gibibytes()
);

if !sled.zpools.is_empty() {
Any chance you could paste some example output?
Sure, here's the (mildly redacted) output of omdb db inventory collections ...
sled e68c45ba-1e10-4ae3-9bfb-a770ed7dc643 (role = Gimlet, serial unknown)
found at: 2024-09-06 02:39:19.150884 UTC from http://[fd00:1122:3344:102::1]:12345
address: [fd00:1122:3344:102::1]:12345
usable hw threads: 8
usable memory (GiB): 7
reservoir (GiB): 0
physical disks:
M2: DiskIdentity { vendor: "synthetic-vendor", model: "synthetic-model-M2", serial: "synthetic-serial-g1_0" } in 1024
M2: DiskIdentity { vendor: "synthetic-vendor", model: "synthetic-model-M2", serial: "synthetic-serial-g1_1" } in 1025
U2: DiskIdentity { vendor: "synthetic-vendor", model: "synthetic-model-U2", serial: "synthetic-serial-g1_0" } in 1026
U2: DiskIdentity { vendor: "synthetic-vendor", model: "synthetic-model-U2", serial: "synthetic-serial-g1_1" } in 1027
U2: DiskIdentity { vendor: "synthetic-vendor", model: "synthetic-model-U2", serial: "synthetic-serial-g1_2" } in 1028
U2: DiskIdentity { vendor: "synthetic-vendor", model: "synthetic-model-U2", serial: "synthetic-serial-g1_3" } in 1029
U2: DiskIdentity { vendor: "synthetic-vendor", model: "synthetic-model-U2", serial: "synthetic-serial-g1_4" } in 1030
zpools
96bb0a39-cca1-46ff-b33c-31425fce84f0: total size: 19968 MiB
aee67881-c5ef-469d-baf1-9b57af906c11: total size: 19968 MiB
b74c535a-5044-4d53-9b0a-37b966baf243: total size: 19968 MiB
e1351784-80a3-40e9-b3c3-a200fa3010e9: total size: 19968 MiB
fdc99717-dfb7-4b60-99bb-a1f4db11ba93: total size: 19968 MiB
datasets:
oxp_96bb0a39-cca1-46ff-b33c-31425fce84f0 - id: None, compression: off
available: 19162120 KiB, used: 646136 KiB
reservation: None, quota: None
oxp_96bb0a39-cca1-46ff-b33c-31425fce84f0/crucible - id: Some(7c23501e-a7bf-4614-93c7-2fdbe55bd706 (dataset)), compression: off
available: 19162120 KiB, used: 192 KiB
reservation: None, quota: None
oxp_96bb0a39-cca1-46ff-b33c-31425fce84f0/crypt - id: None, compression: off
available: 19162120 KiB, used: 644996 KiB
reservation: None, quota: None
oxp_96bb0a39-cca1-46ff-b33c-31425fce84f0/crypt/debug - id: None, compression: gzip-9
available: 19162120 KiB, used: 200 KiB
reservation: None, quota: Some(ByteCount(107374182400))
oxp_96bb0a39-cca1-46ff-b33c-31425fce84f0/crypt/zone - id: None, compression: off
available: 19162120 KiB, used: 644580 KiB
reservation: None, quota: None
oxp_96bb0a39-cca1-46ff-b33c-31425fce84f0/crypt/zone/oxz_crucible_7c23501e-a7bf-4614-93c7-2fdbe55bd706 - id: None, compression: off
available: 19162120 KiB, used: 644372 KiB
reservation: None, quota: None
...
oxp_fdc99717-dfb7-4b60-99bb-a1f4db11ba93 - id: None, compression: off
available: 19162072 KiB, used: 646184 KiB
reservation: None, quota: None
oxp_fdc99717-dfb7-4b60-99bb-a1f4db11ba93/crucible - id: Some(cc7926c3-35b7-4bc3-b68a-2ebe2073f33e (dataset)), compression: off
available: 19162072 KiB, used: 192 KiB
reservation: None, quota: None
oxp_fdc99717-dfb7-4b60-99bb-a1f4db11ba93/crypt - id: None, compression: off
available: 19162072 KiB, used: 645044 KiB
reservation: None, quota: None
oxp_fdc99717-dfb7-4b60-99bb-a1f4db11ba93/crypt/debug - id: None, compression: gzip-9
available: 19162072 KiB, used: 200 KiB
reservation: None, quota: Some(ByteCount(107374182400))
oxp_fdc99717-dfb7-4b60-99bb-a1f4db11ba93/crypt/zone - id: None, compression: off
available: 19162072 KiB, used: 644628 KiB
reservation: None, quota: None
oxp_fdc99717-dfb7-4b60-99bb-a1f4db11ba93/crypt/zone/oxz_crucible_cc7926c3-35b7-4bc3-b68a-2ebe2073f33e - id: None, compression: off
available: 19162072 KiB, used: 644420 KiB
reservation: None, quota: None
zones collected from http://[fd00:1122:3344:102::1]:12345 at 2024-09-06 02:39:19.154778 UTC
zones generation: 5 (count: 11)
ZONES FOUND
zone 078b2df5-4dd7-447f-8bba-61a7359bcad5 (type crucible_pantry)
zone 554086cc-b795-4b57-81e2-8c3e740e82aa (type cockroach_db)
zone 5c08a890-733b-4433-90e7-54364e7b6a39 (type boundary_ntp)
zone 66a572bc-8122-4d88-a8f4-4e1e6bb99e57 (type oximeter)
zone 7c23501e-a7bf-4614-93c7-2fdbe55bd706 (type crucible)
zone 8131b4fb-a3f3-4da2-a7b4-0d3c37bf5004 (type internal_dns)
zone b89bfa4d-2419-48f3-ab96-fc1863e9e43a (type crucible)
zone befda934-f3bc-4e9d-900d-04c6a5c3e537 (type crucible)
zone c698e9ab-96e4-4eb5-be4a-9f900faa1da5 (type external_dns)
zone cc7926c3-35b7-4bc3-b68a-2ebe2073f33e (type crucible)
zone e87e7caf-7b8e-4e46-afc3-a55a7912282d (type crucible)
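For readers skimming the output above: the `quota` on each `crypt/debug` dataset is printed as raw bytes, and 107374182400 bytes is exactly 100 GiB (100 × 1024³). Below is a minimal sketch of a renderer that prints one record in roughly the same three-line shape but with quotas and reservations in GiB. The `DatasetRecord` struct, its field names, and `render` are all hypothetical and are not the omdb implementation; the real output uses the `Debug` form of `ByteCount`.

```rust
/// Hypothetical, trimmed-down inventory record. Field names loosely follow
/// the output above; this is not the real inventory type.
pub struct DatasetRecord {
    pub name: String,
    pub id: Option<String>,
    pub compression: String,
    pub available_bytes: u64,
    pub used_bytes: u64,
    pub reservation_bytes: Option<u64>,
    pub quota_bytes: Option<u64>,
}

const KIB: u64 = 1024;
const GIB: u64 = 1024 * 1024 * 1024;

/// Render an optional byte count in whole GiB (integer division truncates),
/// or "None" when the property is unset.
fn opt_gib(value: Option<u64>) -> String {
    match value {
        Some(bytes) => format!("{} GiB", bytes / GIB),
        None => "None".to_string(),
    }
}

/// Render one dataset as three lines, similar in shape to the output above.
pub fn render(d: &DatasetRecord) -> String {
    let id = d.id.as_deref().unwrap_or("None");
    format!(
        "{} - id: {}, compression: {}\n  available: {} KiB, used: {} KiB\n  reservation: {}, quota: {}",
        d.name,
        id,
        d.compression,
        d.available_bytes / KIB,
        d.used_bytes / KIB,
        opt_gib(d.reservation_bytes),
        opt_gib(d.quota_bytes),
    )
}
```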
/// Minimum space guaranteed to dataset and descendents.
pub reservation: Option<ByteCount>,
/// The compression algorithm used for this dataset.
pub compression: String,
Does `compression` need to be a more specific type than this? If not then could you add a comment? (I guess `compression` is not something we parse?)
There is a type in `omicron_common::disk::CompressionAlgorithm` that is the strongly-typed version of this -- I'll link to it in a comment here -- but I didn't want to rely on one of those algorithms being known (nor did I really want to add an `Other(...)` variant) for the inventory to be collected.
Here's my fear: a new Helios update adds a new compression algorithm that the sled agent doesn't know about, and tries to set that value on the root filesystem. If this happens, I don't want a parsing error to prevent inventory collection.
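A minimal sketch of that trade-off, under stated assumptions: the inventory record keeps the raw string, so building it can never fail, and a strongly-typed view is derived separately on a best-effort basis. `KnownCompression`, `InventoryDataset`, and `known_compression` are hypothetical names for illustration; `KnownCompression` is a stand-in and not the definition of `omicron_common::disk::CompressionAlgorithm`.

```rust
/// Hypothetical strongly-typed view of a few ZFS compression values.
/// A stand-in for illustration, not the real Omicron enum.
#[derive(Debug, PartialEq, Eq)]
pub enum KnownCompression {
    Off,
    On,
    Lz4,
    Gzip,
    /// gzip with an explicit level, 1 through 9.
    GzipN(u8),
}

/// Hypothetical inventory record: the compression value is stored verbatim.
pub struct InventoryDataset {
    pub name: String,
    pub compression: String,
}

impl InventoryDataset {
    /// Infallible: an unrecognized algorithm cannot block collection.
    pub fn from_zfs(name: &str, compression: &str) -> Self {
        Self { name: name.to_string(), compression: compression.to_string() }
    }

    /// Best-effort typed view; `None` means "unknown to this build", which
    /// callers can report without losing the raw string.
    pub fn known_compression(&self) -> Option<KnownCompression> {
        match self.compression.as_str() {
            "off" => Some(KnownCompression::Off),
            "on" => Some(KnownCompression::On),
            "lz4" => Some(KnownCompression::Lz4),
            "gzip" => Some(KnownCompression::Gzip),
            other => {
                let level = other.strip_prefix("gzip-")?.parse::<u8>().ok()?;
                (1..=9).contains(&level).then_some(KnownCompression::GzipN(level))
            }
        }
    }
}
```

The point of the split is that a value this build does not recognize still round-trips through inventory as a plain string; only consumers that need the typed form have to handle the `None` case.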
Yep, reasonable
Provides visibility into deployed datasets, building atop #6144.
Similar to Physical Disks (and Zpools), Datasets are observable in two ways:
This PR implements the inventory aspect of datasets, to provide visibility into the state of sled storage.
Additionally, this PR provides some omdb commands for inspection:
- `omdb sled-agent datasets list` has been added to show dataset configuration.
- `omdb db inventory collections show` has been updated to emit disk, zpool, and dataset info from inventory.