Skip to content

Commit

Permalink
Split instance state into Instance and VMM tables (#4194)
Browse files Browse the repository at this point in the history
Refactor the definition of an `Instance` throughout the control plane so
that an `Instance` is separate from the `Vmm`s that incarnate it. This
confers several advantages:

- VMMs have their own state that sled agent can update without
necessarily changing their instance's state. It's also possible to
change an instance's active Propolis ID without having to know or update
an instance's Propolis IP or current sled ID, since these change when an
instance's active Propolis ID changes. This removes a great deal of
complexity in sled agent, especially when live migrating an instance,
and also simplifies the live migration saga considerably.
- Resource reservations for instances have much clearer lifetimes: a
reservation can be released when its VMM has moved to a terminal state.
Nexus no longer has to reason about VMM lifetimes from changes to an
instance's Propolis ID columns.
- It's now possible for an Instance not to have an active Propolis at
all! This allows an instance not to reserve sled resources when it's not
running. It also allows an instance to stop and restart on a different
sled.
- It's also possible to get a history of an instance's VMMs for, e.g.,
zone bundle examination purposes ("my VMM had a problem two days ago but
it went away when I stopped and restarted it; can you investigate?").

Rework callers throughout Nexus who depend on knowing an instance's
current state and/or its current sled ID. In many cases (e.g. disk and
NIC attach and detach), the relevant detail is whether the instance has
an active Propolis; for simplicity, augment these checks with "has an
active Propolis ID" instead of trying to grab both instance and VMM
states.

## Known issues/remaining work

- The virtual provisioning table is still updated only at instance
creation/deletion time. Usage metrics that depend on this table might
report strange and wonderful values if a user creates many more
instances than can be started at one time.
- Instances still use the generic "resource attachment" CTE to manage
attaching and detaching disks. Previously these queries looked at
instance states; now they look at an instance's state and whether it has
an active Propolis, but not at the active Propolis's state. This will
need to be revisited in the future to support disk hotplug.
- `handle_instance_put_result` is still very aggressive about setting
instances to the Failed state if sled agent returns errors other than
invalid-request-flavored errors. I think we should reconsider this
behavior, but this change is big enough as it is. I will file a TODO for
this and update the new comments accordingly before this merges.
- The new live migration logic is not tested yet and differs from the
"two-table" TLA+ model in RFD 361. More work will be needed here before
we can declare live migration fully ready for selfhosting.
- It would be nice to have an `omdb vmm` command; for now I've just
updated existing `omdb` commands to deal with the optionality of
Propolises and sleds.

Tests:

- Unit/integration tests
- On a single-machine dev cluster, created two instances and verified
that:
- The instances only have resource reservations while they're running
(and they reserve reservoir space now)
- The instances can reach each other over their internal and external
IPs when they're both running (and can still reach each other even if
you try to delete one while it's active)
- `scadm` shows the appropriate IP mappings being added/deleted as the
instances start/stop
  - The instances' serial consoles work as expected
- Attaching a new disk to an instance is only possible if the instance
is stopped
- Disk snapshot succeeds when invoked on a running instance's attached
disk
  - Deleting an instance detaches its disks
- `omicron-stress` on a single-machine dev cluster ran for about an hour
and created ~800 instances without any instances going to the Failed
state (previously this would happen in the first 5-10 minutes)
  • Loading branch information
gjcolombo authored Oct 12, 2023
1 parent c6955a5 commit 876e8ca
Show file tree
Hide file tree
Showing 64 changed files with 4,694 additions and 3,087 deletions.
43 changes: 27 additions & 16 deletions clients/nexus-client/src/lib.rs
Original file line number Diff line number Diff line change
Expand Up @@ -88,22 +88,41 @@ impl From<omicron_common::api::internal::nexus::InstanceRuntimeState>
s: omicron_common::api::internal::nexus::InstanceRuntimeState,
) -> Self {
Self {
run_state: s.run_state.into(),
sled_id: s.sled_id,
propolis_id: s.propolis_id,
dst_propolis_id: s.dst_propolis_id,
propolis_addr: s.propolis_addr.map(|addr| addr.to_string()),
gen: s.gen.into(),
migration_id: s.migration_id,
propolis_gen: s.propolis_gen.into(),
ncpus: s.ncpus.into(),
memory: s.memory.into(),
hostname: s.hostname,
propolis_id: s.propolis_id,
time_updated: s.time_updated,
}
}
}

impl From<omicron_common::api::internal::nexus::VmmRuntimeState>
for types::VmmRuntimeState
{
fn from(s: omicron_common::api::internal::nexus::VmmRuntimeState) -> Self {
Self {
gen: s.gen.into(),
state: s.state.into(),
time_updated: s.time_updated,
}
}
}

impl From<omicron_common::api::internal::nexus::SledInstanceState>
for types::SledInstanceState
{
fn from(
s: omicron_common::api::internal::nexus::SledInstanceState,
) -> Self {
Self {
instance_state: s.instance_state.into(),
propolis_id: s.propolis_id,
vmm_state: s.vmm_state.into(),
}
}
}

impl From<omicron_common::api::external::InstanceState>
for types::InstanceState
{
Expand All @@ -124,14 +143,6 @@ impl From<omicron_common::api::external::InstanceState>
}
}

impl From<omicron_common::api::external::InstanceCpuCount>
for types::InstanceCpuCount
{
fn from(s: omicron_common::api::external::InstanceCpuCount) -> Self {
Self(s.0)
}
}

impl From<omicron_common::api::external::Generation> for types::Generation {
fn from(s: omicron_common::api::external::Generation) -> Self {
Self(i64::from(&s) as u64)
Expand Down
38 changes: 24 additions & 14 deletions clients/sled-agent-client/src/lib.rs
Original file line number Diff line number Diff line change
Expand Up @@ -24,16 +24,9 @@ impl From<omicron_common::api::internal::nexus::InstanceRuntimeState>
s: omicron_common::api::internal::nexus::InstanceRuntimeState,
) -> Self {
Self {
run_state: s.run_state.into(),
sled_id: s.sled_id,
propolis_id: s.propolis_id,
dst_propolis_id: s.dst_propolis_id,
propolis_addr: s.propolis_addr.map(|addr| addr.to_string()),
migration_id: s.migration_id,
propolis_gen: s.propolis_gen.into(),
ncpus: s.ncpus.into(),
memory: s.memory.into(),
hostname: s.hostname,
gen: s.gen.into(),
time_updated: s.time_updated,
}
Expand Down Expand Up @@ -85,22 +78,39 @@ impl From<types::InstanceRuntimeState>
{
fn from(s: types::InstanceRuntimeState) -> Self {
Self {
run_state: s.run_state.into(),
sled_id: s.sled_id,
propolis_id: s.propolis_id,
dst_propolis_id: s.dst_propolis_id,
propolis_addr: s.propolis_addr.map(|addr| addr.parse().unwrap()),
migration_id: s.migration_id,
propolis_gen: s.propolis_gen.into(),
ncpus: s.ncpus.into(),
memory: s.memory.into(),
hostname: s.hostname,
gen: s.gen.into(),
time_updated: s.time_updated,
}
}
}

impl From<types::VmmRuntimeState>
for omicron_common::api::internal::nexus::VmmRuntimeState
{
fn from(s: types::VmmRuntimeState) -> Self {
Self {
state: s.state.into(),
gen: s.gen.into(),
time_updated: s.time_updated,
}
}
}

impl From<types::SledInstanceState>
for omicron_common::api::internal::nexus::SledInstanceState
{
fn from(s: types::SledInstanceState) -> Self {
Self {
instance_state: s.instance_state.into(),
propolis_id: s.propolis_id,
vmm_state: s.vmm_state.into(),
}
}
}

impl From<types::InstanceState>
for omicron_common::api::external::InstanceState
{
Expand Down
31 changes: 1 addition & 30 deletions common/src/api/external/mod.rs
Original file line number Diff line number Diff line change
Expand Up @@ -739,6 +739,7 @@ pub enum ResourceType {
UpdateableComponent,
UserBuiltin,
Zpool,
Vmm,
}

// IDENTITY METADATA
Expand Down Expand Up @@ -866,25 +867,6 @@ impl InstanceState {
InstanceState::Destroyed => "destroyed",
}
}

/// Returns true if the given state represents a fully stopped Instance.
/// This means that a transition from an !is_stopped() state must go
/// through Stopping.
pub fn is_stopped(&self) -> bool {
match self {
InstanceState::Starting => false,
InstanceState::Running => false,
InstanceState::Stopping => false,
InstanceState::Rebooting => false,
InstanceState::Migrating => false,

InstanceState::Creating => true,
InstanceState::Stopped => true,
InstanceState::Repairing => true,
InstanceState::Failed => true,
InstanceState::Destroyed => true,
}
}
}

/// The number of CPUs in an Instance
Expand Down Expand Up @@ -912,17 +894,6 @@ pub struct InstanceRuntimeState {
pub time_run_state_updated: DateTime<Utc>,
}

impl From<crate::api::internal::nexus::InstanceRuntimeState>
for InstanceRuntimeState
{
fn from(state: crate::api::internal::nexus::InstanceRuntimeState) -> Self {
InstanceRuntimeState {
run_state: state.run_state,
time_run_state_updated: state.time_updated,
}
}
}

/// View of an Instance
#[derive(ObjectIdentity, Clone, Debug, Deserialize, Serialize, JsonSchema)]
pub struct Instance {
Expand Down
69 changes: 44 additions & 25 deletions common/src/api/internal/nexus.rs
Original file line number Diff line number Diff line change
Expand Up @@ -29,40 +29,59 @@ pub struct DiskRuntimeState {
pub time_updated: DateTime<Utc>,
}

/// Runtime state of the Instance, including the actual running state and minimal
/// metadata
///
/// This state is owned by the sled agent running that Instance.
/// The "static" properties of an instance: information about the instance that
/// doesn't change while the instance is running.
#[derive(Clone, Debug, Deserialize, Serialize, JsonSchema)]
pub struct InstanceRuntimeState {
/// runtime state of the Instance
pub run_state: InstanceState,
/// which sled is running this Instance
pub sled_id: Uuid,
/// which propolis-server is running this Instance
pub propolis_id: Uuid,
/// the target propolis-server during a migration of this Instance
pub dst_propolis_id: Option<Uuid>,
/// address of propolis-server running this Instance
pub propolis_addr: Option<SocketAddr>,
/// migration id (if one in process)
pub migration_id: Option<Uuid>,
/// The generation number for the Propolis and sled identifiers for this
/// instance.
pub propolis_gen: Generation,
/// number of CPUs allocated for this Instance
pub struct InstanceProperties {
pub ncpus: InstanceCpuCount,
/// memory allocated for this Instance
pub memory: ByteCount,
/// RFC1035-compliant hostname for the Instance.
/// RFC1035-compliant hostname for the instance.
// TODO-cleanup different type?
pub hostname: String,
/// generation number for this state
}

/// The dynamic runtime properties of an instance: its current VMM ID (if any),
/// migration information (if any), and the instance state to report if there is
/// no active VMM.
#[derive(Clone, Debug, Deserialize, Serialize, JsonSchema)]
pub struct InstanceRuntimeState {
/// The instance's currently active VMM ID.
pub propolis_id: Option<Uuid>,
/// If a migration is active, the ID of the target VMM.
pub dst_propolis_id: Option<Uuid>,
/// If a migration is active, the ID of that migration.
pub migration_id: Option<Uuid>,
/// Generation number for this state.
pub gen: Generation,
/// timestamp for this information
/// Timestamp for this information.
pub time_updated: DateTime<Utc>,
}

/// The dynamic runtime properties of an individual VMM process.
#[derive(Clone, Debug, Deserialize, Serialize, JsonSchema)]
pub struct VmmRuntimeState {
/// The last state reported by this VMM.
pub state: InstanceState,
/// The generation number for this VMM's state.
pub gen: Generation,
/// Timestamp for the VMM's state.
pub time_updated: DateTime<Utc>,
}

/// A wrapper type containing a sled's total knowledge of the state of a
/// specific VMM and the instance it incarnates.
#[derive(Clone, Debug, Deserialize, Serialize, JsonSchema)]
pub struct SledInstanceState {
/// The sled's conception of the state of the instance.
pub instance_state: InstanceRuntimeState,

/// The ID of the VMM whose state is being reported.
pub propolis_id: Uuid,

/// The most recent state of the sled's VMM process.
pub vmm_state: VmmRuntimeState,
}

// Oximeter producer/collector objects.

/// Information announced by a metric server, used so that clients can contact it and collect
Expand Down
Loading

0 comments on commit 876e8ca

Please sign in to comment.