
[nexus] Remove zones on expunged disks #5599

Merged: smklein merged 19 commits into main from expunge-zones-with-expunged-disks on May 30, 2024

Conversation

@smklein (Collaborator) commented Apr 23, 2024

Expunges zones for which the corresponding physical disk, holding durable data, has been removed.

A follow-up PR will be necessary to expunge zones whose transient zone filesystems have been removed, as Nexus is not yet aware of where zone placement decisions are made.

Partial fix of #5372
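
As a rough illustration of the rule this PR adds (the types and names below are hypothetical, simplified stand-ins, not the actual Omicron/Nexus structures): a zone is marked for expungement when the zpool holding its durable dataset is backed by an expunged physical disk.

```rust
use std::collections::BTreeSet;

// Hypothetical, simplified stand-ins for the real planner types.
struct Zone {
    id: String,
    // Zpool holding this zone's durable dataset, if it has one.
    durable_zpool: Option<String>,
}

/// Zones whose durable data lives on a zpool backed by an expunged
/// physical disk; these are the zones the planner should expunge.
fn zones_to_expunge<'a>(
    zones: &'a [Zone],
    zpools_on_expunged_disks: &BTreeSet<String>,
) -> Vec<&'a Zone> {
    zones
        .iter()
        .filter(|zone| {
            zone.durable_zpool
                .as_ref()
                .map_or(false, |pool| zpools_on_expunged_disks.contains(pool))
        })
        .collect()
}
```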

@smklein force-pushed the expunge-zones-with-expunged-disks branch from 8fc8c90 to 5229eff on April 23, 2024 00:53
@smklein changed the base branch from main to provision-crucible-on-active-zpools-only on April 23, 2024 00:56
@smklein changed the title from "[nexus] Remove zones on expunged disks, provision crucible to active disks" to "[nexus] Remove zones on expunged disks" on Apr 23, 2024
@smklein marked this pull request as ready for review on April 23, 2024 00:57
@sunshowers (Contributor) left a comment:

I was wondering when we'd do this! Thought it would only be required for r9 but I guess we have to do it for r8. Thanks for doing it :)

Review thread on nexus/reconfigurator/planning/src/planner.rs (outdated, resolved)
Base automatically changed from provision-crucible-on-active-zpools-only to main April 23, 2024 20:35
Comment on lines +482 to +503
```rust
let mut count_disk_expunged = 0;
let mut count_sled_decommissioned = 0;
let mut count_sled_expunged = 0;
for reason in zones_to_expunge.values() {
    match reason {
        ZoneExpungeReason::DiskExpunged => count_disk_expunged += 1,
        ZoneExpungeReason::SledDecommissioned => {
            count_sled_decommissioned += 1;
        }
        ZoneExpungeReason::SledExpunged => count_sled_expunged += 1,
    };
}
let count_and_reason = [
    (count_disk_expunged, "zone was using expunged disk"),
    (count_sled_decommissioned, "sled state is decommissioned"),
    (count_sled_expunged, "sled policy is expunged"),
];
for (count, reason) in count_and_reason {
    if count > 0 {
        self.comment(format!(
            "sled {sled_id} ({reason}): {count} zones expunged",
        ));
```
Contributor:

Yeah this is exactly the comment handling I was talking about in #5493 -- thanks for doing this!

I think I might consider doing this at a higher level. Instead of storing a list of comments in the BlueprintBuilder, store a log of operations. Then, while converting a BlueprintBuilder into a Blueprint, turn the log of operations into comments (after possibly doing some merging of operations). That would separate out the data layer from the presentation layer -- and you can test against this.

@smklein (Collaborator, author):

I'm interested in this proposal, but it seems like it might extend a fair bit beyond the scope of this PR. Right now, the BlueprintBuilder is modifying structures in-place, rather than having a clear delineation of the "before" + "applied operations".

If I was to implement what you're suggesting, presumably we'd alter the change_sled_zones and change_sled_disks - rather than mutating in place, we'd need some way to represent operations as "additions, updates, or removals" that are stored separately, and applied when we build the blueprint?

(That's my interpretation of your comment at least, am I reading that right?)

Contributor:

> If I was to implement what you're suggesting, presumably we'd alter the change_sled_zones and change_sled_disks - rather than mutating in place, we'd need some way to represent operations as "additions, updates, or removals" that are stored separately, and applied when we build the blueprint?

Ah, so we decided not to do this in #5493. Instead, what we can do is both apply the changes and store a structured log of the changes.

@smklein (Collaborator, author):

I've attempted to resolve this in #5834 (or at least make a start on it) with an operation log. It is currently stringified when the blueprint is built, but it could remain a stronger type eventually, if that would be useful.
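
To sketch the idea being discussed here (hypothetical names only; this is not the actual code from #5834 or the real BlueprintBuilder API), the builder could apply each change and push a structured record, then render the records into comments only when the blueprint is built:

```rust
/// Hypothetical structured record of a planning decision; the builder
/// applies the change to its in-memory state and also appends one of these.
enum Operation {
    ZonesExpunged { sled_id: String, reason: ZoneExpungeReason, count: usize },
}

enum ZoneExpungeReason {
    DiskExpunged,
    SledDecommissioned,
    SledExpunged,
}

struct BlueprintBuilder {
    operations: Vec<Operation>,
    // ... other planning state elided ...
}

impl BlueprintBuilder {
    /// Record an operation alongside applying it (data layer).
    fn record(&mut self, op: Operation) {
        self.operations.push(op);
    }

    /// Turn the structured log into human-readable comments when the
    /// blueprint is built (presentation layer), so the log itself stays
    /// testable as data.
    fn comments(&self) -> Vec<String> {
        self.operations
            .iter()
            .map(|op| match op {
                Operation::ZonesExpunged { sled_id, reason, count } => {
                    let why = match reason {
                        ZoneExpungeReason::DiskExpunged => "zone was using expunged disk",
                        ZoneExpungeReason::SledDecommissioned => {
                            "sled state is decommissioned"
                        }
                        ZoneExpungeReason::SledExpunged => "sled policy is expunged",
                    };
                    format!("sled {sled_id} ({why}): {count} zones expunged")
                }
            })
            .collect()
    }
}
```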

Review thread on nexus/reconfigurator/planning/src/planner.rs (outdated, resolved)
@morlandi7 modified the milestones: 8, 9 on May 13, 2024
@smklein merged commit d2af4e4 into main on May 30, 2024
19 checks passed
@smklein deleted the expunge-zones-with-expunged-disks branch on May 30, 2024 16:34
smklein added a commit that referenced this pull request Jun 6, 2024
Attempts to resolve the discussion in #5599 (comment) by using a strongly-typed `Operation` enum instead of arbitrary String comments.
smklein added a commit that referenced this pull request Jul 15, 2024
Provides an internal API to remove disks, and wires it into omdb.
Additionally, expands omdb commands for visibility.

- `omdb db physical-disks` can be used to view all "control plane physical disks". This is similar to, but distinct from, the `omdb db inventory physical-disks` command, as it reports the disks that have been adopted into the control plane. This command is necessary for identifying the UUID of the associated control plane object, which is not observable via inventory.
- `omdb nexus sleds expunge-disk` can be used to expunge a physical disk
from a sled. This relies on many prior patches to operate correctly, but
with the combination of: #5987, #5965, #5931, #5952, #5601, #5599, we
can observe the following behavior: expunging a disk leads to all
"users" of that disk (zone filesystems, datasets, zone bundles, etc)
being removed.

I tested this PR on a4x2 using the following steps:

```bash
# Boot a4x2, confirm the Nexus zone is running
# From g0, zlogin oxz_switch

$ omdb db sleds

SERIAL  IP                             ROLE      POLICY      STATE   ID                                   
 g2      [fd00:1122:3344:103::1]:12345  -         in service  active  29fede5f-37e4-4528-bcf2-f3ee94924894 
 g0      [fd00:1122:3344:101::1]:12345  scrimlet  in service  active  6a2c7019-d055-4256-8bad-042b97aa0e5e 
 g1      [fd00:1122:3344:102::1]:12345  -         in service  active  a611b43e-3995-4cd4-9603-89ca6aca3dc5 
 g3      [fd00:1122:3344:104::1]:12345  scrimlet  in service  active  f62f2cfe-d17b-4bd6-ae64-57e8224d3672

# We'll plan on expunging a disk on g1, and observing the effects.
$ export SLED_ID=a611b43e-3995-4cd4-9603-89ca6aca3dc5
$ export OMDB_SLED_AGENT_URL=http://[fd00:1122:3344:102::1]:12345
$ omdb sled-agent zones list

    "oxz_cockroachdb_b3fecda8-2eb8-4ff3-9cf6-90c94fba7c50"
    "oxz_crucible_19831c98-3137-4af4-a93d-fc1a17c138f2"
    "oxz_crucible_6adcb8ec-6c9e-4e8a-a8d4-bbf9ad44e2c4"
    "oxz_crucible_74b2f587-10ce-4131-97fd-9832c52c8a41"
    "oxz_crucible_9e422508-f4d5-4c24-8dde-0080c0916419"
    "oxz_crucible_a47e9625-d189-4001-877a-cc3aa5b1f3eb"
    "oxz_crucible_pantry_c3b4e3cb-3e23-4f5e-921b-04e4801924fd"
    "oxz_external_dns_7e669b6f-a3fe-47a9-addd-20e42c58b8bb"
    "oxz_internal_dns_1a45a6e8-5b03-4ab4-a3db-e83fb7767767"
    "oxz_ntp_209ad0d0-a5e7-4ab8-ac8f-e99902697b32"
    "oxz_oximeter_864efebb-790f-4b7a-8377-b2c82c87f5b8"

$ omdb db physical-disks | grep $SLED_ID
 ID                                    SERIAL                 VENDOR            MODEL               SLED_ID                               POLICY      STATE
 23524716-a331-4d57-aa71-8bd4dbc916f8  synthetic-serial-g1_0  synthetic-vendor  synthetic-model-U2  a611b43e-3995-4cd4-9603-89ca6aca3dc5  in service  active 
 3ca1812b-55e3-47ed-861f-f667f626c8a0  synthetic-serial-g1_3  synthetic-vendor  synthetic-model-U2  a611b43e-3995-4cd4-9603-89ca6aca3dc5  in service  active 
 40139afb-7076-45d9-84cf-b96eefe7acf8  synthetic-serial-g1_1  synthetic-vendor  synthetic-model-U2  a611b43e-3995-4cd4-9603-89ca6aca3dc5  in service  active 
 5c8e33dd-1230-4214-af78-9be892d9f421  synthetic-serial-g1_4  synthetic-vendor  synthetic-model-U2  a611b43e-3995-4cd4-9603-89ca6aca3dc5  in service  active 
 85780bbf-8e2d-481e-9013-34611572f191  synthetic-serial-g1_2  synthetic-vendor  synthetic-model-U2  a611b43e-3995-4cd4-9603-89ca6aca3dc5  in service  active 

# Let's expunge the "0th" disk here.

$ omdb nexus sleds expunge-disk 23524716-a331-4d57-aa71-8bd4dbc916f8 -w
$ omdb nexus blueprints regenerate -w
$ omdb nexus blueprints show $NEW_BLUEPRINT_ID

# Observe that the new blueprint for the sled expunges some zones -- minimally,
# the Crucible zone -- and no longer lists the "g1_0" disk. This should also be
# summarized in the blueprint metadata comment.

$ omdb nexus blueprints target set $NEW_BLUEPRINT_ID enabled -w
$ omdb sled-agent zones list

zones:
    "oxz_crucible_19831c98-3137-4af4-a93d-fc1a17c138f2"
    "oxz_crucible_74b2f587-10ce-4131-97fd-9832c52c8a41"
    "oxz_crucible_9e422508-f4d5-4c24-8dde-0080c0916419"
    "oxz_crucible_a47e9625-d189-4001-877a-cc3aa5b1f3eb"
    "oxz_crucible_pantry_c3b4e3cb-3e23-4f5e-921b-04e4801924fd"
    "oxz_ntp_209ad0d0-a5e7-4ab8-ac8f-e99902697b32"
    "oxz_oximeter_864efebb-790f-4b7a-8377-b2c82c87f5b8"

# As we can see, the expunged zones have been removed.
# We can also access the sled agent logs from g1 to observe that the expected requests have been sent
# to adjust the set of control plane disks and expunge the expected zones.
```

This is a major part of #4719.

Fixes #5370