[reconfigurator] SledEditor: be more strict about decommissioned sleds (#7234)

This is a follow-up from #7204 (comment) and makes two changes, neither
of which should affect behavior:

* `SledEditor` will now fail if a caller attempts to make changes to a
decommissioned sled (internally, this is statically enforced by a
type-state enum: a sled in the decommissioned state has no methods
that support editing, so we're forced to return an error)
* `SledEditor::set_state()` is now `SledEditor::decommission()`, and it
performs some checks that the sled looks decommissionable

The second bullet is more questionable than I expected it to be:

1. There are some arguments that `SledEditor` shouldn't do any checks
here; in particular, it doesn't have the full context (e.g., any checks
on "should we decommission this sled" that depend on the `PlanningInput`
can't be performed here, because `SledEditor` intentionally doesn't have
access to `PlanningInput`).
2. I wanted to check zones + disks + datasets, but in practice it can
only check zones today; I left a comment (and the commented-out disks +
datasets checks we should do) about why. I think we will eventually be
able to turn these on; the current behavior of removing disks/datasets
from the blueprint for expunged sleds will have to change to fix #7078,
at which point these checks should be valid.

I don't feel super strongly about the checks in `decommission()` or even
this PR as a whole; if this doesn't look like a useful direction, I'd be
fine with discarding it. Please review with a pretty critical eye.
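The type-state idea in the first bullet can be sketched roughly as follows. This is an illustrative sketch, not the actual omicron code: `InnerEditor`, `ActiveSledEditor`, `add_zone`, and the error variants are hypothetical stand-ins, and the real `decommission()` checks are richer than the single zones check shown here.

```rust
// A minimal sketch of the type-state pattern described above. All
// names here are hypothetical stand-ins for the real omicron types.

#[derive(Debug, PartialEq)]
enum SledEditError {
    EditDecommissioned,
    NotDecommissionable,
}

// Only the active state carries editing methods; the decommissioned
// state has none, so edits to a decommissioned sled *must* fail.
enum InnerEditor {
    Active(ActiveSledEditor),
    Decommissioned,
}

struct ActiveSledEditor {
    zones: Vec<String>, // stand-in for real zone configs
}

impl ActiveSledEditor {
    fn add_zone(&mut self, zone: &str) {
        self.zones.push(zone.to_string());
    }
}

struct SledEditor(InnerEditor);

impl SledEditor {
    fn add_zone(&mut self, zone: &str) -> Result<(), SledEditError> {
        match &mut self.0 {
            InnerEditor::Active(editor) => {
                editor.add_zone(zone);
                Ok(())
            }
            // The decommissioned state has no editing methods, so the
            // only option is to return an error.
            InnerEditor::Decommissioned => {
                Err(SledEditError::EditDecommissioned)
            }
        }
    }

    fn decommission(&mut self) -> Result<(), SledEditError> {
        if let InnerEditor::Active(editor) = &self.0 {
            // Refuse if the sled doesn't look decommissionable (here:
            // it still has zones in its config).
            if !editor.zones.is_empty() {
                return Err(SledEditError::NotDecommissionable);
            }
        }
        // Decommissioning an already-decommissioned sled is a no-op.
        self.0 = InnerEditor::Decommissioned;
        Ok(())
    }
}

fn main() {
    let mut editor =
        SledEditor(InnerEditor::Active(ActiveSledEditor { zones: Vec::new() }));
    editor.add_zone("crucible").unwrap();
    // A sled with zones still in its config refuses to decommission.
    assert_eq!(editor.decommission(), Err(SledEditError::NotDecommissionable));
}
```

Once the inner enum is in the decommissioned state there is simply no editing method to dispatch to, which is what forces every public editing entry point to return an error rather than silently mutating a decommissioned sled.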
jgallagher authored Dec 16, 2024
1 parent 85feb09 commit ca21fe7
Showing 6 changed files with 278 additions and 34 deletions.
1 change: 1 addition & 0 deletions Cargo.lock

Some generated files are not rendered by default.

1 change: 1 addition & 0 deletions nexus/reconfigurator/planning/Cargo.toml
@@ -16,6 +16,7 @@ illumos-utils.workspace = true
 indexmap.workspace = true
 internal-dns-resolver.workspace = true
 ipnet.workspace = true
+itertools.workspace = true
 nexus-config.workspace = true
 nexus-inventory.workspace = true
 nexus-sled-agent-shared.workspace = true
46 changes: 28 additions & 18 deletions nexus/reconfigurator/planning/src/blueprint_builder/builder.rs
@@ -228,6 +228,14 @@ pub struct SledEditCounts {
 }

 impl SledEditCounts {
+    pub fn zeroes() -> Self {
+        Self {
+            disks: EditCounts::zeroes(),
+            datasets: EditCounts::zeroes(),
+            zones: EditCounts::zeroes(),
+        }
+    }
+
     fn has_nonzero_counts(&self) -> bool {
         let Self { disks, datasets, zones } = self;
         disks.has_nonzero_counts()
@@ -548,7 +556,7 @@ impl<'a> BlueprintBuilder<'a> {
             generation: Generation::new(),
             datasets: BTreeMap::new(),
         });
-        let editor = SledEditor::new(
+        let editor = SledEditor::for_existing(
             state,
             zones.clone(),
             disks,
@@ -565,8 +573,7 @@ impl<'a> BlueprintBuilder<'a> {
         // that weren't in the parent blueprint. (These are newly-added sleds.)
         for sled_id in input.all_sled_ids(SledFilter::Commissioned) {
             if let Entry::Vacant(slot) = sled_editors.entry(sled_id) {
-                slot.insert(SledEditor::new_empty(
-                    SledState::Active,
+                slot.insert(SledEditor::for_new_active(
                     build_preexisting_dataset_ids(sled_id)?,
                 ));
             }
@@ -735,8 +742,8 @@ impl<'a> BlueprintBuilder<'a> {
             .retain(|sled_id, _| in_service_sled_ids.contains(sled_id));

         // If we have the clickhouse cluster setup enabled via policy and we
-        // don't yet have a `ClickhouseClusterConfiguration`, then we must create
-        // one and feed it to our `ClickhouseAllocator`.
+        // don't yet have a `ClickhouseClusterConfiguration`, then we must
+        // create one and feed it to our `ClickhouseAllocator`.
         let clickhouse_allocator = if self.input.clickhouse_cluster_enabled() {
             let parent_config = self
                 .parent_blueprint
@@ -815,18 +822,18 @@ impl<'a> BlueprintBuilder<'a> {
     }

     /// Set the desired state of the given sled.
-    pub fn set_sled_state(
+    pub fn set_sled_decommissioned(
         &mut self,
         sled_id: SledUuid,
-        desired_state: SledState,
     ) -> Result<(), Error> {
         let editor = self.sled_editors.get_mut(&sled_id).ok_or_else(|| {
             Error::Planner(anyhow!(
                 "tried to set sled state for unknown sled {sled_id}"
             ))
         })?;
-        editor.set_state(desired_state);
-        Ok(())
+        editor
+            .decommission()
+            .map_err(|err| Error::SledEditError { sled_id, err })
     }

     /// Within tests, set an RNG for deterministic results.
@@ -1046,15 +1053,18 @@ impl<'a> BlueprintBuilder<'a> {
         // blueprint
         for (disk_id, (zpool, disk)) in database_disks {
             database_disk_ids.insert(disk_id);
-            editor.ensure_disk(
-                BlueprintPhysicalDiskConfig {
-                    disposition: BlueprintPhysicalDiskDisposition::InService,
-                    identity: disk.disk_identity.clone(),
-                    id: disk_id,
-                    pool_id: *zpool,
-                },
-                &mut self.rng,
-            );
+            editor
+                .ensure_disk(
+                    BlueprintPhysicalDiskConfig {
+                        disposition:
+                            BlueprintPhysicalDiskDisposition::InService,
+                        identity: disk.disk_identity.clone(),
+                        id: disk_id,
+                        pool_id: *zpool,
+                    },
+                    &mut self.rng,
+                )
+                .map_err(|err| Error::SledEditError { sled_id, err })?;
         }

         // Remove any disks that appear in the blueprint, but not the database
