[reconfigurator] Only place Crucible zones on provisionable zpools #5601
Conversation
Love how straightforward this is!
```rust
}

/// Returns all in-service, active zpools
pub fn provisionable_zpools(
```
Are we going to need a `ZpoolFilter` soon, similar to the `SledFilter` type? One advantage of that type is that it can be used everywhere from this kind of thing to database queries. Regardless, not important here.
IIRC we considered specific methods like this for zones and decided to go with the filter because there were going to be a bunch of them (and the consistency between this and db queries as you mention). I'd default to making the same decision here to use a filter, but I could see an argument for methods if we don't expect very many.
I went ahead and added a `ZpoolFilter` here for consistency. It's simple, and it's implemented very similarly to the `DiskFilter`, but I'm okay being slightly verbose if it keeps the code consistent.
See: 2e887c5
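For readers unfamiliar with the filter pattern being referenced, here is a minimal sketch of what a `ZpoolFilter` could look like. This is an illustration only, with made-up variant and method names; the actual type added in 2e887c5 may differ.

```rust
/// Hypothetical sketch of a `ZpoolFilter`, in the spirit of the existing
/// `SledFilter`/`DiskFilter` types. The variant and method names here are
/// assumptions, not the actual omicron API.
#[derive(Clone, Copy, Debug, PartialEq, Eq)]
pub enum ZpoolFilter {
    /// Include every zpool, regardless of the backing disk's policy or state.
    All,
    /// Only zpools whose backing disk is in service and active, i.e. pools
    /// that may receive newly provisioned zones.
    InService,
}

impl ZpoolFilter {
    /// Decide whether a zpool with the given disk policy/state passes the
    /// filter.
    pub fn matches(self, in_service: bool, active: bool) -> bool {
        match self {
            ZpoolFilter::All => true,
            ZpoolFilter::InService => in_service && active,
        }
    }
}

fn main() {
    let filter = ZpoolFilter::InService;
    assert!(filter.matches(true, true));
    assert!(!filter.matches(false, true)); // expunged disk: not provisionable
}
```

Centralizing the predicate in one type is what allows the in-memory planning input and the corresponding database queries to agree on what "provisionable" means, which is the consistency argument made above.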
Provides an internal API to remove disks, and wires it into omdb. Additionally, expands omdb commands for visibility.

- `omdb db physical-disks` can be used to view all "control plane physical disks". This is similar to, but distinct from, the `omdb db inventory physical-disks` command, as it reports control plane disks that have been adopted in the control plane. This command is necessary for identifying the UUID of the associated control plane object, which is not observable via inventory.
- `omdb nexus sleds expunge-disk` can be used to expunge a physical disk from a sled. This relies on many prior patches to operate correctly, but with the combination of #5987, #5965, #5931, #5952, #5601, and #5599, we can observe the following behavior: expunging a disk leads to all "users" of that disk (zone filesystems, datasets, zone bundles, etc.) being removed.

I tested this PR on a4x2 using the following steps:

```bash
# Boot a4x2, confirm the Nexus zone is running
# From g0, zlogin oxz_switch

$ omdb db sleds
SERIAL IP                             ROLE     POLICY     STATE  ID
g2     [fd00:1122:3344:103::1]:12345  -        in service active 29fede5f-37e4-4528-bcf2-f3ee94924894
g0     [fd00:1122:3344:101::1]:12345  scrimlet in service active 6a2c7019-d055-4256-8bad-042b97aa0e5e
g1     [fd00:1122:3344:102::1]:12345  -        in service active a611b43e-3995-4cd4-9603-89ca6aca3dc5
g3     [fd00:1122:3344:104::1]:12345  scrimlet in service active f62f2cfe-d17b-4bd6-ae64-57e8224d3672

# We'll plan on expunging a disk on g1, and observing the effects.
$ export SLED_ID=a611b43e-3995-4cd4-9603-89ca6aca3dc5
$ export OMDB_SLED_AGENT_URL=http://[fd00:1122:3344:102::1]:12345

$ omdb sled-agent zones list
"oxz_cockroachdb_b3fecda8-2eb8-4ff3-9cf6-90c94fba7c50"
"oxz_crucible_19831c98-3137-4af4-a93d-fc1a17c138f2"
"oxz_crucible_6adcb8ec-6c9e-4e8a-a8d4-bbf9ad44e2c4"
"oxz_crucible_74b2f587-10ce-4131-97fd-9832c52c8a41"
"oxz_crucible_9e422508-f4d5-4c24-8dde-0080c0916419"
"oxz_crucible_a47e9625-d189-4001-877a-cc3aa5b1f3eb"
"oxz_crucible_pantry_c3b4e3cb-3e23-4f5e-921b-04e4801924fd"
"oxz_external_dns_7e669b6f-a3fe-47a9-addd-20e42c58b8bb"
"oxz_internal_dns_1a45a6e8-5b03-4ab4-a3db-e83fb7767767"
"oxz_ntp_209ad0d0-a5e7-4ab8-ac8f-e99902697b32"
"oxz_oximeter_864efebb-790f-4b7a-8377-b2c82c87f5b8"

$ omdb db physical-disks | grep $SLED_ID
ID                                   SERIAL                VENDOR           MODEL              SLED_ID                              POLICY     STATE
23524716-a331-4d57-aa71-8bd4dbc916f8 synthetic-serial-g1_0 synthetic-vendor synthetic-model-U2 a611b43e-3995-4cd4-9603-89ca6aca3dc5 in service active
3ca1812b-55e3-47ed-861f-f667f626c8a0 synthetic-serial-g1_3 synthetic-vendor synthetic-model-U2 a611b43e-3995-4cd4-9603-89ca6aca3dc5 in service active
40139afb-7076-45d9-84cf-b96eefe7acf8 synthetic-serial-g1_1 synthetic-vendor synthetic-model-U2 a611b43e-3995-4cd4-9603-89ca6aca3dc5 in service active
5c8e33dd-1230-4214-af78-9be892d9f421 synthetic-serial-g1_4 synthetic-vendor synthetic-model-U2 a611b43e-3995-4cd4-9603-89ca6aca3dc5 in service active
85780bbf-8e2d-481e-9013-34611572f191 synthetic-serial-g1_2 synthetic-vendor synthetic-model-U2 a611b43e-3995-4cd4-9603-89ca6aca3dc5 in service active

# Let's expunge the "0th" disk here.
$ omdb nexus sleds expunge-disk 23524716-a331-4d57-aa71-8bd4dbc916f8 -w

$ omdb nexus blueprints regenerate -w
$ omdb nexus blueprints show $NEW_BLUEPRINT_ID

# Observe that the new blueprint for the sled expunges some zones -- minimally,
# the Crucible zone -- and no longer lists the "g1_0" disk. This should also be
# summarized in the blueprint metadata comment.

$ omdb nexus blueprints target set $NEW_BLUEPRINT_ID enabled -w

$ omdb sled-agent zones list
zones:
"oxz_crucible_19831c98-3137-4af4-a93d-fc1a17c138f2"
"oxz_crucible_74b2f587-10ce-4131-97fd-9832c52c8a41"
"oxz_crucible_9e422508-f4d5-4c24-8dde-0080c0916419"
"oxz_crucible_a47e9625-d189-4001-877a-cc3aa5b1f3eb"
"oxz_crucible_pantry_c3b4e3cb-3e23-4f5e-921b-04e4801924fd"
"oxz_ntp_209ad0d0-a5e7-4ab8-ac8f-e99902697b32"
"oxz_oximeter_864efebb-790f-4b7a-8377-b2c82c87f5b8"

# As we can see, the expunged zones have been removed.
# We can also access the sled agent logs from g1 to observe that the expected
# requests have been sent to adjust the set of control plane disks and expunge
# the expected zones.
```

This is a major part of #4719.

Fixes #5370.
Previously, the planner placed Crucible zones on zpools unconditionally. Now we respect the policy and state of the underlying disks, limiting which zpools are eligible to receive these zones.
Partial fix of #5372
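To make the behavior described above concrete, here is a minimal sketch of the idea, assuming hypothetical `ZpoolName`, `DiskPolicy`, and `DiskState` types and a simplified view of a sled's storage. None of these names are the actual omicron planner APIs; they only illustrate filtering zpools by disk policy/state before deciding where Crucible zones are needed.

```rust
use std::collections::BTreeSet;

/// Hypothetical stand-ins for the planner's view of a sled's storage; the
/// real planner input types are richer, this only illustrates the filter.
#[derive(Clone, Debug, PartialEq, Eq, PartialOrd, Ord)]
struct ZpoolName(String);

#[derive(Clone, Copy, Debug, PartialEq, Eq)]
enum DiskPolicy { InService, Expunged }

#[derive(Clone, Copy, Debug, PartialEq, Eq)]
enum DiskState { Active, Decommissioned }

struct SledStorage {
    /// Each zpool together with the policy/state of its backing disk.
    zpools: Vec<(ZpoolName, DiskPolicy, DiskState)>,
    /// Zpools that already host a Crucible zone.
    crucible_zpools: BTreeSet<ZpoolName>,
}

impl SledStorage {
    /// Zpools whose backing disk is in service and active, i.e. the only
    /// pools eligible to receive new Crucible zones.
    fn provisionable_zpools(&self) -> impl Iterator<Item = &ZpoolName> + '_ {
        self.zpools.iter().filter_map(|(zpool, policy, state)| {
            (*policy == DiskPolicy::InService && *state == DiskState::Active)
                .then_some(zpool)
        })
    }

    /// Zpools that still need a Crucible zone: provisionable pools that do
    /// not already host one. Pools on expunged or decommissioned disks
    /// never show up here.
    fn zpools_needing_crucible(&self) -> Vec<&ZpoolName> {
        self.provisionable_zpools()
            .filter(|zpool| !self.crucible_zpools.contains(*zpool))
            .collect()
    }
}

fn main() {
    let storage = SledStorage {
        zpools: vec![
            (ZpoolName("oxp_a".into()), DiskPolicy::InService, DiskState::Active),
            (ZpoolName("oxp_b".into()), DiskPolicy::Expunged, DiskState::Active),
        ],
        crucible_zpools: BTreeSet::new(),
    };
    // Only "oxp_a" is a candidate; "oxp_b" is skipped because its disk is expunged.
    for zpool in storage.zpools_needing_crucible() {
        println!("plan a Crucible zone on {:?}", zpool);
    }
}
```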