sled agent: don't special-case vmm-not-present handling for requests to stop (#6698)

When sled agent receives a request to stop a VMM that's not in the
agent's VMM table, return `NoSuchVmm` instead of succeeding. This lets
users manually recover an instance that was Running before a sled
reboot but that the instance watcher hasn't yet moved to Failed.
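
The new behavior can be illustrated with a minimal, synchronous sketch. It is not the sled agent's actual code: `VmmTable`, the `u32` IDs, and the two-variant `VmmStateRequested` are simplified stand-ins assumed for illustration only. The point it shows is that a state request for an unknown VMM now fails with `NoSuchVmm` regardless of the requested target, including `Stopped`.

```rust
use std::collections::HashMap;

// Hypothetical, simplified stand-ins for the real sled-agent types.
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
enum VmmStateRequested {
    Running,
    Stopped,
}

#[derive(Debug, PartialEq, Eq)]
enum Error {
    NoSuchVmm(u32),
}

// Assumed model of the agent's VMM table: VMM ID -> last requested state.
struct VmmTable {
    vmms: HashMap<u32, VmmStateRequested>,
}

impl VmmTable {
    // Every target state is treated the same way when the VMM is absent:
    // there is no longer a "stop succeeds if not present" special case.
    fn put_state(
        &mut self,
        id: u32,
        target: VmmStateRequested,
    ) -> Result<(), Error> {
        match self.vmms.get_mut(&id) {
            Some(state) => {
                *state = target;
                Ok(())
            }
            None => Err(Error::NoSuchVmm(id)),
        }
    }
}
```

The trade-off is that a repeated stop request for an already-unregistered VMM is no longer silently idempotent; callers see the error and must decide what it means.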

Tested manually as follows:

1. Modify sled agent's VMM worker loop so that it doesn't publish VMM
state before exiting. This is needed so that manually unregistering an
instance from a sled doesn't cause it to go to Stopped.
2. Launch a dev cluster with both (1) and the change in this PR.
3. Start an instance, then send an HTTP DELETE to sled agent's internal
API to forcibly unregister the VMM.
4. Observe that the instance remains Running in the console.
5. Stop the instance; observe that the "not found, going to Failed"
message is displayed and that the instance then goes to Failed.
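
The outcome observed in steps 4 and 5 can be sketched from the caller's side. This is a hedged illustration, not the control plane's real logic: `StopOutcome`, `InstanceState`, and `state_after_stop_request` are hypothetical names, and the assumption that an accepted stop moves a Running instance to Stopping is mine rather than something this commit states.

```rust
// Hypothetical result of asking sled agent to stop a VMM.
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
enum StopOutcome {
    Accepted,
    NoSuchVmm,
}

// Hypothetical, reduced set of instance states.
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
enum InstanceState {
    Running,
    Stopping,
    Failed,
}

// Assumed caller-side policy: a "no such VMM" reply to a stop request
// means the VMM is gone, so the instance is moved to Failed.
fn state_after_stop_request(
    current: InstanceState,
    outcome: StopOutcome,
) -> InstanceState {
    match (current, outcome) {
        (_, StopOutcome::NoSuchVmm) => InstanceState::Failed,
        (InstanceState::Running, StopOutcome::Accepted) => {
            InstanceState::Stopping
        }
        (other, StopOutcome::Accepted) => other,
    }
}
```

Before this change the agent would have reported success for the missing VMM, so a caller following this kind of policy would never have reached the Failed branch.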

Fixes #4511.
gjcolombo authored Sep 27, 2024
1 parent c727c3f commit 69da5d6
Showing 1 changed file with 2 additions and 18 deletions.
20 changes: 2 additions & 18 deletions sled-agent/src/instance_manager.rs
@@ -650,24 +650,8 @@ impl InstanceManagerRunner {
         target: VmmStateRequested,
     ) -> Result<(), Error> {
         let Some(instance) = self.get_propolis(propolis_id) else {
-            match target {
-                // If the instance isn't registered, then by definition it
-                // isn't running here. Allow requests to stop or destroy the
-                // instance to succeed to provide idempotency. This has to
-                // be handled here (that is, on the "instance not found"
-                // path) to handle the case where a stop request arrived,
-                // Propolis handled it, sled agent unregistered the
-                // instance, and only then did a second stop request
-                // arrive.
-                VmmStateRequested::Stopped => {
-                    tx.send(Ok(VmmPutStateResponse { updated_runtime: None }))
-                        .map_err(|_| Error::FailedSendClientClosed)?;
-                }
-                _ => {
-                    tx.send(Err(Error::NoSuchVmm(propolis_id)))
-                        .map_err(|_| Error::FailedSendClientClosed)?;
-                }
-            }
+            tx.send(Err(Error::NoSuchVmm(propolis_id)))
+                .map_err(|_| Error::FailedSendClientClosed)?;
             return Ok(());
         };
         instance.put_state(tx, target).await?;