
[reconfigurator] Decommission sleds #5698

Merged: 21 commits into main on May 13, 2024

Conversation

jgallagher (Contributor)

This builds on (and is currently pointed at) #5663, and is on the path to fixing #5625, although there is still more work to do there. (Should be small, and I'll start on it while this is in review to make sure it really fixes it.) I don't plan on landing this before R8 is out the door, but wanted to go ahead and open it up to review.

Much of this is fallout from our discussions about what it means to decommission a sled. Major changes are:

  • SledFilter::All has been renamed to SledFilter::Commissioned, and no longer returns sleds that are in SledState::Decommissioned.
  • The blueprint planner will now update the desired sled state to Decommissioned for sleds which satisfy our conditions. (See do_plan_decommission(), and the sketch after this list.)
  • The blueprint planner will carry forward the Omicron zones of decommissioned sleds to child blueprints. Pruning these is tracked in #5552 (Add garbage collection for expunged sleds and zones).
  • The blueprint planner will not carry forward a desired sled state of Decommissioned once the inputs report that the sled has already been decommissioned.
  • The blueprint executor will decommission sleds that the planner said to.
  • Decommissioning a sled implicitly decommissions all its disks. (This matches what happens with sled expungement, and should not interfere with region replacement, which keys off of policy, not state.)
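
A minimal sketch of the decommission decision described in the list above. Everything here (the enum shapes, the exact condition) is an illustrative stand-in, not the actual do_plan_decommission() implementation:

```rust
// Illustrative sketch only: the real logic lives in the planner's
// do_plan_decommission(); these enums and the condition are assumptions.
#[derive(Clone, Copy, PartialEq, Eq, Debug)]
enum SledPolicy {
    InService,
    Expunged,
}

#[derive(Clone, Copy, PartialEq, Eq, Debug)]
enum SledState {
    Active,
    Decommissioned,
}

/// Should the planner set this sled's desired state to `Decommissioned`?
/// Only if it has been expunged and the inputs do not already report it as
/// decommissioned (otherwise the planner stops carrying the state forward).
fn should_decommission(policy: SledPolicy, state: SledState) -> bool {
    policy == SledPolicy::Expunged && state == SledState::Active
}

fn main() {
    assert!(should_decommission(SledPolicy::Expunged, SledState::Active));
    assert!(!should_decommission(SledPolicy::InService, SledState::Active));
    assert!(!should_decommission(
        SledPolicy::Expunged,
        SledState::Decommissioned
    ));
}
```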

@sunshowers (Contributor) left a comment:

Quick review, but looks reasonable to me!

```rust
@@ -1456,7 +1456,7 @@ async fn cmd_db_sleds(
        Some(filter) => filter,
        None => {
            eprintln!("note: listing all sleds (use -F to filter, e.g. -F in-service)");
```
Contributor:

note likely needs to be changed

@jgallagher (Author) replied on May 13, 2024:

Changed the wording here to note it's "all commissioned sleds" in 8d08368. The follow-on PR to this (#5733) adds a SledFilter::Decommissioned that allows us to see the sleds that are left out by default here.
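
A rough sketch of how a state-based filter like this might partition sleds. The variant names come from this thread; the enum shape and match arms are assumptions, not omicron's actual SledFilter implementation:

```rust
#[derive(Clone, Copy, PartialEq, Eq)]
enum SledState {
    Active,
    Decommissioned,
}

enum SledFilter {
    // Formerly `All`: every sled that has not been decommissioned.
    Commissioned,
    // Added in #5733 for omdb: only decommissioned sleds.
    Decommissioned,
}

impl SledFilter {
    // Assumed shape: whether a sled in `state` passes this filter.
    fn matches(&self, state: SledState) -> bool {
        match self {
            SledFilter::Commissioned => state != SledState::Decommissioned,
            SledFilter::Decommissioned => state == SledState::Decommissioned,
        }
    }
}

fn main() {
    assert!(SledFilter::Commissioned.matches(SledState::Active));
    assert!(!SledFilter::Commissioned.matches(SledState::Decommissioned));
    assert!(SledFilter::Decommissioned.matches(SledState::Decommissioned));
}
```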

Comment on lines +1251 to +1259:

```rust
// Observe that the disk state is now decommissioned
assert_eq!(
    PhysicalDiskState::Decommissioned,
    lookup_physical_disk(&datastore, disk1.id()).await.disk_state
);
assert_eq!(
    PhysicalDiskState::Decommissioned,
    lookup_physical_disk(&datastore, disk2.id()).await.disk_state
);
```
Collaborator:

Thanks for both fixing and testing this

```diff
-    mut self,
-    sled_ids: impl Iterator<Item = SledUuid>,
+    self,
+    commissioned_sled_ids: impl Iterator<Item = SledUuid>,
```
Collaborator:

Nitpick: There really isn't any part of this argument that validates the state of these sled IDs as "commissioned", right?

Should this just be called "added_sleds"?

@jgallagher (Author) replied:

Yeah, that does seem better. Renamed and expanded the doc comment in 64e8660.

@andrewjstone (Contributor) left a comment:

Looks great. Mostly nits from me.

```rust
{
    result.found.state()
} else {
    return Err(err.bail(
```
Contributor:

Just for understanding: I'm not used to this pattern. Is the err.bail here related to running in a transaction?

@jgallagher (Author) replied:

I believe so! I see this pattern in other transactions, and err.bail's comments say it sets err to the value we pass here but then returns DieselError::RollbackTransaction.
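
A simplified sketch of that pattern, assuming a helper shaped like omicron's OptionalError (the struct below is a stand-in, not the real type): bail stashes the domain-specific error for later retrieval and returns the sentinel error that tells the transaction machinery to roll back.

```rust
use std::sync::{Arc, Mutex};

// Stand-in for Diesel's error type; only the rollback sentinel matters here.
#[derive(Debug)]
enum DieselError {
    RollbackTransaction,
}

// Stand-in for the OptionalError helper described above.
struct OptionalError<E>(Arc<Mutex<Option<E>>>);

impl<E> OptionalError<E> {
    fn new() -> Self {
        Self(Arc::new(Mutex::new(None)))
    }

    /// Record `err` for later retrieval, then return the sentinel that
    /// causes the surrounding transaction to roll back.
    fn bail(&self, err: E) -> DieselError {
        *self.0.lock().unwrap() = Some(err);
        DieselError::RollbackTransaction
    }

    fn take(&self) -> Option<E> {
        self.0.lock().unwrap().take()
    }
}

fn main() {
    let err = OptionalError::new();
    // Inside the transaction closure: bail with a domain error.
    let result: Result<(), DieselError> =
        Err(err.bail("sled not in expected state"));
    assert!(matches!(result, Err(DieselError::RollbackTransaction)));
    // After the transaction rolls back, recover the domain error.
    assert_eq!(err.take(), Some("sled not in expected state"));
}
```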

(Two further review threads, on nexus/db-queries/src/db/datastore/sled.rs and nexus/reconfigurator/planning/src/planner.rs, were resolved.)
```rust
// the case.
for sled_id in self.blueprint.sled_ids_with_zones() {
    if !commissioned_sled_ids.contains(&sled_id) {
        self.blueprint.expunge_all_zones_for_sled(
```
Contributor:

When wouldn't this be the case? If we see this happen, rather than doing an ensure-like operation, should we check what the issue was and log an error? That seems like an invariant violation to me.

@jgallagher (Author) replied:

Agreed; I changed this to a hard error in 1e629df.
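
A sketch of what that check looks like as a hard error (hypothetical names mirroring the snippet above, not the exact change in 1e629df):

```rust
use std::collections::BTreeSet;

// Stand-in for omicron's typed sled UUID.
type SledUuid = u128;

/// A sled that is not commissioned must not carry zones in the blueprint;
/// finding one is a planner invariant violation, not something to "fix up"
/// by silently expunging the zones.
fn check_decommissioned_sleds_have_no_zones(
    sled_ids_with_zones: impl Iterator<Item = SledUuid>,
    commissioned_sled_ids: &BTreeSet<SledUuid>,
) -> anyhow::Result<()> {
    for sled_id in sled_ids_with_zones {
        if !commissioned_sled_ids.contains(&sled_id) {
            anyhow::bail!(
                "sled {sled_id} is not commissioned but still has zones"
            );
        }
    }
    Ok(())
}

fn main() -> anyhow::Result<()> {
    let commissioned: BTreeSet<SledUuid> = [1, 2].into_iter().collect();
    // Sled 3 still has zones but is not commissioned: hard error.
    assert!(check_decommissioned_sleds_have_no_zones(
        [1, 3].into_iter(),
        &commissioned
    )
    .is_err());
    check_decommissioned_sleds_have_no_zones([1, 2].into_iter(), &commissioned)
}
```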

```rust
    datastore: &DataStore,
    sled_id: SledUuid,
) -> anyhow::Result<()> {
    let (authz_sled,) = LookupPath::new(opctx, datastore)
```
Contributor:

Why do the auth here, rather than in the datastore method?

@jgallagher (Author) replied:

DataStore::sled_set_state_to_decommissioned() was written in such a way as to require that: it wants an authz_sled, so we have to do the lookup to call it at all. I'm not sure why that is, though; @sunshowers, do you remember?
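
The design in question is a capability-style pattern: the datastore method demands an already-authorized handle, so the caller cannot skip the lookup/authorization step. A stand-in sketch (none of these are omicron's real signatures):

```rust
// Stand-in types: `AuthzSled` plays the role of the authorization token
// that the LookupPath step produces; none of this is omicron's real API.
struct AuthzSled {
    id: u32,
}

// The caller-side lookup/authorization step (sketched; a real version
// would check the caller's permissions before handing out the token).
fn lookup_authorized_sled(sled_id: u32) -> Result<AuthzSled, String> {
    Ok(AuthzSled { id: sled_id })
}

// The datastore method only accepts an `AuthzSled`, so the type system
// guarantees the lookup above succeeded before we get here.
fn sled_set_state_to_decommissioned(sled: &AuthzSled) -> Result<(), String> {
    println!("decommissioning sled {}", sled.id);
    Ok(())
}

fn main() -> Result<(), String> {
    let authz_sled = lookup_authorized_sled(42)?;
    sled_set_state_to_decommissioned(&authz_sled)
}
```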

Base automatically changed from john/blueprint-store-sled-state to main May 13, 2024 17:32
@jgallagher jgallagher merged commit a0cc2ff into main May 13, 2024
21 checks passed
@jgallagher jgallagher deleted the john/mark-sleds-decommissioned branch May 13, 2024 21:24
jgallagher added a commit that referenced this pull request May 14, 2024
This is a pretty small delta on top of #5698 that I've been sitting on
for a few days before getting a chance to test it on a4x2 this
afternoon. I set up a4x2 with three sleds (g0, g1, g3), and went through
3.5 add/remove cycles on sled g2 (3 add/remove pairs and a 4th add). g2
is present in `omdb db sleds`:

```
root@oxz_switch:~# omdb db sleds
 SERIAL  IP                             ROLE      POLICY      STATE   ID
 g3      [fd00:1122:3344:103::1]:12345  scrimlet  in service  active  13b8d30d-b66d-4333-b156-9aa2527e130b
 g2      [fd00:1122:3344:124::1]:12345  -         in service  active  218e4e9c-0a27-4460-a65c-7e93bef531c4
 g1      [fd00:1122:3344:102::1]:12345  -         in service  active  661e9e3e-0beb-43fe-9606-109b7145b258
 g0      [fd00:1122:3344:101::1]:12345  scrimlet  in service  active  9f19235f-bfa4-4032-9cad-608448b4f1a0
```

I added `SledFilter::Decommissioned` specifically for use with omdb,
where we can see the 3 previous times `g2` was present (each with a
different sled ID):

```
root@oxz_switch:~# omdb db sleds -F decommissioned
 SERIAL  IP                             ROLE  POLICY    STATE           ID
 g2      [fd00:1122:3344:121::1]:12345  -     expunged  decommissioned  a7b09236-c8ba-4025-934d-b5f0539d9379
 g2      [fd00:1122:3344:123::1]:12345  -     expunged  decommissioned  ec0e81bf-6366-4273-97bc-47223334dc90
 g2      [fd00:1122:3344:122::1]:12345  -     expunged  decommissioned  f5f8ba44-9681-4e6f-84f2-fd654c83dd23
```

The current blueprint shows history of all four times the sled has been
present, because we don't currently prune expunged zones:

```
root@oxz_switch:~# omdb nexus blueprints show current
blueprint  5bfb182e-df8a-41e5-a8a3-862bb42f8feb
parent:    5b2823c8-ca41-46b5-9efd-ffaae0e6e028

  -----------------------------------------------------------------------------------------------
    zone type         zone ID                                disposition  underlay IP
  -----------------------------------------------------------------------------------------------

  sled 13b8d30d-b66d-4333-b156-9aa2527e130b: blueprint zones at generation 5
    clickhouse        9ee21438-4a2f-499a-b867-58993a36f2f0   in service   fd00:1122:3344:103::6
    cockroach_db      b0f22c64-a8e7-4c83-af26-f30a0dc413d5   in service   fd00:1122:3344:103::3
    crucible          29f3db85-722f-41ed-a66c-66aa31a31591   in service   fd00:1122:3344:103::b
    crucible          40317028-eedf-487f-88ae-821632f05f39   in service   fd00:1122:3344:103::8
    crucible          50153da9-da61-4cd9-829c-18129e5a6c52   in service   fd00:1122:3344:103::9
    crucible          765dc0e2-1850-43b1-ae08-531d51772f14   in service   fd00:1122:3344:103::a
    crucible          bcc91ace-f816-400d-9198-22ed20e00ca3   in service   fd00:1122:3344:103::c
    crucible_pantry   bf1c3443-fbf5-4e16-affb-ee4e2598cbcb   in service   fd00:1122:3344:103::7
    external_dns      ed00e862-094c-449b-bfdb-ddba26f36bb2   in service   fd00:1122:3344:103::4
    internal_dns      6b4f3315-ce80-4d89-a495-ff96c5f573cd   in service   fd00:1122:3344:3::1
    internal_ntp      6b9d837b-9bb9-46f1-8f48-c0c8a7d89882   in service   fd00:1122:3344:103::d
    nexus             327c035e-628e-4880-b434-8d77ca362f21   in service   fd00:1122:3344:103::5

  sled 218e4e9c-0a27-4460-a65c-7e93bef531c4: blueprint zones at generation 3
    crucible          5a7167ff-04ef-47fd-93bc-2adc7c8a7087   in service   fd00:1122:3344:124::26
    crucible          9ca15af1-79dc-4248-86d1-5d7d25987609   in service   fd00:1122:3344:124::23
    crucible          a80accc8-3fdd-42b2-8425-eb246a7f0ba0   in service   fd00:1122:3344:124::22
    crucible          d80a6050-f2e3-4827-8fd0-37c5c88ac87d   in service   fd00:1122:3344:124::25
    crucible          e92e8930-6f81-4f09-a520-415ef896a05f   in service   fd00:1122:3344:124::24
    internal_ntp      a37275e8-eb02-4cb3-b216-a47a892313e7   in service   fd00:1122:3344:124::21

  sled 661e9e3e-0beb-43fe-9606-109b7145b258: blueprint zones at generation 5
    boundary_ntp      8eb79dcb-b6a0-4a24-835e-cf9e55b12495   in service   fd00:1122:3344:102::d
    cockroach_db      294868f1-d76b-4ef6-a87b-399ec06ba9a3   in service   fd00:1122:3344:102::3
    cockroach_db      373fa1d2-597c-4245-a87b-31a7f734c806   in service   fd00:1122:3344:102::4
    crucible          00611d19-0752-4e6f-a57a-43df54089987   in service   fd00:1122:3344:102::c
    crucible          05697e11-b936-4319-b1a6-ba8e46c77988   in service   fd00:1122:3344:102::b
    crucible          612ad32d-321d-4159-9265-4c0eca972e2c   in service   fd00:1122:3344:102::9
    crucible          c03f67ee-7f38-445f-b9a5-dcfd1feb976c   in service   fd00:1122:3344:102::a
    crucible          e2e46038-8ecb-4fe0-be1a-94530842cd65   in service   fd00:1122:3344:102::8
    crucible_pantry   4fea76f9-d88e-43e9-ade7-2345dd77d2a7   in service   fd00:1122:3344:102::7
    internal_dns      504f14cf-42ea-4457-b922-fb400eea1495   in service   fd00:1122:3344:2::1
    nexus             8d3ba915-17f6-4c2f-a94d-31120e89f17b   in service   fd00:1122:3344:102::5
    oximeter          4d98514a-a352-4ddc-996f-912958d9a80d   in service   fd00:1122:3344:102::6

  sled 9f19235f-bfa4-4032-9cad-608448b4f1a0: blueprint zones at generation 5
    boundary_ntp      603a1fe0-1fd1-4fc1-968c-af95e1400f2d   in service   fd00:1122:3344:101::d
    cockroach_db      d0f66442-cc46-48b3-9770-d473517a81f5   in service   fd00:1122:3344:101::3
    cockroach_db      ed701854-eb03-43eb-8a7c-5c1aea73cc51   in service   fd00:1122:3344:101::4
    crucible          56151b6f-fa7d-4459-9540-6b04460d2c64   in service   fd00:1122:3344:101::8
    crucible          8b37469e-2660-498e-bad7-9c9158b4fbae   in service   fd00:1122:3344:101::9
    crucible          b602ce7d-2b92-4e12-8b95-dd2c4cad3397   in service   fd00:1122:3344:101::a
    crucible          d87cae8c-d85d-4a25-b7d6-f7bbce50f6e9   in service   fd00:1122:3344:101::b
    crucible          ea549438-9531-491e-b85b-b92668184c9f   in service   fd00:1122:3344:101::c
    crucible_pantry   51e18a3d-a4ce-4ee1-a900-3f23444c9bfd   in service   fd00:1122:3344:101::7
    external_dns      60f77e70-af63-4b4d-b3b2-62610f707d47   in service   fd00:1122:3344:101::5
    internal_dns      d47c3ade-ae61-49b0-8bb7-217a825517de   in service   fd00:1122:3344:1::1
    nexus             a1df997e-3595-4f24-979c-76232ae16164   in service   fd00:1122:3344:101::6

  sled a7b09236-c8ba-4025-934d-b5f0539d9379: blueprint zones at generation 4
    crucible          8aac1836-3d74-4d88-af41-b9e2fda8d3f7   expunged     fd00:1122:3344:121::23
    crucible          a0468efe-3f76-4244-9d92-03a70608c777   expunged     fd00:1122:3344:121::22
    crucible          afc45a35-0860-41d8-bac1-c06ac442fd43   expunged     fd00:1122:3344:121::24
    crucible          b596d696-6521-4ebb-b0b4-22a772f5d1d0   expunged     fd00:1122:3344:121::25
    crucible          b5f1851a-695f-4223-bc67-d30df30e77d0   expunged     fd00:1122:3344:121::26
    internal_ntp      219fdc9b-fe80-436a-9d81-c8f7bb3f0364   expunged     fd00:1122:3344:121::21

  sled ec0e81bf-6366-4273-97bc-47223334dc90: blueprint zones at generation 4
    crucible          5fe4544e-d44b-45fe-bf99-505007654e97   expunged     fd00:1122:3344:123::22
    crucible          76f11abf-cfbf-41ee-972b-1c38c6babd22   expunged     fd00:1122:3344:123::26
    crucible          87821b5d-ed10-4787-ab24-6b7411bcc5d8   expunged     fd00:1122:3344:123::25
    crucible          8ceb3b5b-eb61-4195-b873-6dac8108fd73   expunged     fd00:1122:3344:123::23
    crucible          c1d0cfb9-9489-4fce-a213-40fb5c1b7b0f   expunged     fd00:1122:3344:123::24
    internal_ntp      86a61155-e468-4906-b0b6-d12fa2f3f4dd   expunged     fd00:1122:3344:123::21

  sled f5f8ba44-9681-4e6f-84f2-fd654c83dd23: blueprint zones at generation 4
    crucible          4c82a05d-bbc0-40b2-8b85-f688b56b2f7b   expunged     fd00:1122:3344:122::22
    crucible          6b6a6770-f552-438d-bca0-b1d8bdd8c2ed   expunged     fd00:1122:3344:122::24
    crucible          83eccc0d-afea-4e27-a906-3a629c2094d1   expunged     fd00:1122:3344:122::23
    crucible          8f6cea3e-5489-4665-a37b-c6336c0e54c8   expunged     fd00:1122:3344:122::26
    crucible          bdd61a6c-50b8-450c-b963-443d77a1c7f4   expunged     fd00:1122:3344:122::25
    internal_ntp      ab8b9602-a861-4985-809d-30e4724a63ab   expunged     fd00:1122:3344:122::21

METADATA:
  created by:            a1df997e-3595-4f24-979c-76232ae16164
  created at:            2024-05-10T19:09:09.547Z
  comment:               sled 218e4e9c-0a27-4460-a65c-7e93bef531c4: add zones
  internal DNS version:  11
  external DNS version:  2
```

Also visible in that blueprint is evidence that we went through the "add
NTP" -> "add crucible" reconfigurator steps. I did not go so far as to
add a Nexus to g2 each time (it's considerably more manually intensive
to do so, which we might need to address if that's a thing we want to
test more thoroughly).

This also fixes a bug in `allocate_sled_underlay_subnet_octets` that I
accidentally introduced in #5675, restoring idempotence up until the
point where the sled has upserted itself. (The comments added should
clarify this.) We can and should still fail on an attempt to add a sled
where we (a) have an allocation and (b) have an entry in the `sled`
table, but we no longer fail on an attempt to add a sled where we (a)
have an allocation but (b) do NOT have an entry in the `sled` table.
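
A minimal sketch of that rule with hypothetical types (the real check lives inside the `allocate_sled_underlay_subnet_octets` query):

```rust
// Hypothetical sketch of the idempotence rule described above.
#[derive(Debug, PartialEq)]
struct ExistingAllocation {
    subnet_octet: u8,
}

/// Ok(Some(octet)) reuses an existing allocation, Ok(None) means a fresh
/// octet should be allocated, and Err(_) means the sled was already added.
fn check_existing_allocation(
    allocation: Option<ExistingAllocation>,
    sled_row_exists: bool,
) -> Result<Option<u8>, String> {
    match allocation {
        // (a) allocation exists and (b) `sled` table entry exists:
        // the sled was already added, so this must fail.
        Some(_) if sled_row_exists => Err("sled already added".to_string()),
        // Allocation exists but the sled has not yet upserted itself:
        // an idempotent retry, so return the existing octet.
        Some(a) => Ok(Some(a.subnet_octet)),
        // No allocation yet: caller allocates a fresh octet.
        None => Ok(None),
    }
}

fn main() {
    let alloc = ExistingAllocation { subnet_octet: 33 };
    // Retry before the sled upserted itself: reuse the existing octet.
    assert_eq!(check_existing_allocation(Some(alloc), false), Ok(Some(33)));
    // Sled already fully added: hard error.
    assert!(check_existing_allocation(
        Some(ExistingAllocation { subnet_octet: 33 }),
        true
    )
    .is_err());
    assert_eq!(check_existing_allocation(None, false), Ok(None));
}
```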