Expose Sled Agent API for "control plane disk management", use it (#5172)

# Overview

## Virtual Environment Changes

- Acting on Disks, not Zpools
  - Previously, the sled agent operated on "user-supplied zpools", which were
    created by `./tools/virtual_hardware.sh`.
  - Now, in a world where Nexus has more control over zpool allocation, the
    configuration can supply "virtual devices" instead of "zpools", giving
    RSS/Nexus control over when zpools actually get placed on those devices.
    A sketch of the new configuration shape follows this list.
  - Impact:
    - `sled-agent/src/config.rs`
    - `smf/sled-agent/non-gimlet/config.toml`
    - `tools/virtual_hardware.sh`
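
For illustration, here is a minimal sketch of how the sled-agent config might deserialize such a list of virtual devices. The `vdevs` field name and the `Config` shape are assumptions made for this sketch, not the exact contents of `sled-agent/src/config.rs`.

```rust
// Assumed dependencies: serde (derive), toml, camino with the "serde1" feature.
use camino::Utf8PathBuf;
use serde::Deserialize;

/// Sketch of the relevant slice of the sled-agent config: instead of naming
/// zpools that `virtual_hardware.sh` already created, the config names raw
/// file-backed devices; RSS/Nexus decide when zpools get placed on them.
#[derive(Debug, Deserialize)]
struct Config {
    /// Paths to file-backed "virtual devices", e.g. `/scratch/u2_0.vdev`.
    vdevs: Option<Vec<Utf8PathBuf>>,
}

fn main() -> Result<(), toml::de::Error> {
    let example = r#"
        vdevs = ["/scratch/m2_0.vdev", "/scratch/u2_0.vdev", "/scratch/u2_1.vdev"]
    "#;
    let config: Config = toml::from_str(example)?;
    println!("{config:?}");
    Ok(())
}
```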

## Sled Agent Changes

- HTTP API
  - The Sled Agent now exposes an API to "set" and "get" the set of "control
    plane physical disks" specified by Nexus. This set of control plane
    physical disks (usable U.2s) is stored in a ledger on the M.2s (as
    `omicron-physical-disks.json`) and also determines which disks are
    available to the rest of the sled agent. A sketch of the ledgered
    configuration appears after this list.
- StorageManager
  - **Before**: When physical U.2 disks were detected by the Sled Agent, they
    were auto-formatted if empty, and Nexus was notified about them. That
    notification "upserted" them into the DB, so they were effectively adopted
    into the control plane automatically.
  - **After**: As discussed in RFD 457, we want to get to a world where
    physical U.2 disks are **detected** by the Sled Agent but not **used**
    until RSS/Nexus explicitly tells the Sled Agent to use those disks as part
    of the control plane. This set of "in-use control plane disks" is stored
    in a ledger file on the M.2s.
  - **Transition**: On deployed systems, we need to be able to boot all the
    way up to Nexus even though no ledger of control plane disks exists yet.
    Within the implementation of `StorageManager::key_manager_ready`, we
    implement a workaround: if we detect a system with no ledger but with
    existing zpools, we use that set of zpools unconditionally until told
    otherwise. This is a short-term workaround to migrate existing systems,
    and it can be removed once deployed racks reliably have ledgers for their
    control plane disks. This fallback is also sketched after this list.
- StorageManagerTestHarness
  - In an effort to reduce "test fakes" and replace them with real storage,
    `StorageManagerTestHarness` provides testing utilities for spinning up
    vdevs, formatting them with zpools, and managing them. This avoids a fair
    bit of bifurcation between "test-only synthetic disks" and "real disks",
    though it does mean that many of our sled-agent tests are now
    illumos-only. A usage sketch also follows this list.
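
To make the API and ledger shape concrete, here is a minimal sketch of the "control plane physical disks" configuration and the no-ledger fallback described above. The type and field names (`OmicronPhysicalDisksConfig`, `OmicronPhysicalDiskConfig`, `pool_id`) and the helper function are illustrative assumptions, not the exact definitions added in this PR.

```rust
// Assumed dependencies: serde (derive), uuid with the "serde" feature.
use serde::{Deserialize, Serialize};
use uuid::Uuid;

/// Vendor/serial/model triple reported by the disk hardware (analogous to
/// `omicron_common::disk::DiskIdentity`).
#[derive(Clone, Debug, Serialize, Deserialize)]
pub struct DiskIdentity {
    pub vendor: String,
    pub serial: String,
    pub model: String,
}

/// One U.2 that RSS/Nexus has told this sled to use (illustrative shape).
#[derive(Clone, Debug, Serialize, Deserialize)]
pub struct OmicronPhysicalDiskConfig {
    pub identity: DiskIdentity,
    /// Control-plane ID of the physical disk record.
    pub id: Uuid,
    /// Zpool that should live on this disk.
    pub pool_id: Uuid,
}

/// The full set of control plane physical disks: returned by the "get"
/// endpoint, replaced wholesale by "set", and persisted to
/// `omicron-physical-disks.json` on the M.2s.
#[derive(Clone, Debug, Serialize, Deserialize)]
pub struct OmicronPhysicalDisksConfig {
    pub disks: Vec<OmicronPhysicalDiskConfig>,
}

/// Sketch of the transition workaround: if no ledger exists yet, adopt the
/// zpools that are already present until Nexus says otherwise.
pub fn disks_to_manage(
    ledger: Option<OmicronPhysicalDisksConfig>,
    existing_zpool_disks: Vec<OmicronPhysicalDiskConfig>,
) -> Vec<OmicronPhysicalDiskConfig> {
    match ledger {
        Some(config) => config.disks,
        None => existing_zpool_disks,
    }
}
```

Similarly, a sketch of how a sled-agent test might lean on `StorageManagerTestHarness`. The module path and method names (`add_vdevs`, `handle`, `cleanup`) are assumptions about the harness's ergonomics rather than its exact API.

```rust
// Illumos-only test sketch: the harness creates file-backed vdevs, puts real
// zpools on them, and tears everything down at the end.
#[cfg(all(test, target_os = "illumos"))]
mod tests {
    // Assumed module path for the harness added in this PR.
    use sled_storage::manager_test_harness::StorageManagerTestHarness;

    #[tokio::test]
    async fn manages_vdev_backed_u2s() {
        let logctx =
            omicron_test_utils::dev::test_setup_log("manages_vdev_backed_u2s");

        // Spin up one M.2-like and two U.2-like file-backed vdevs.
        let mut harness = StorageManagerTestHarness::new(&logctx.log).await;
        let _raw_disks =
            harness.add_vdevs(&["m2_0.vdev", "u2_0.vdev", "u2_1.vdev"]).await;

        // Exercise the StorageManager under test through the harness's handle.
        let _handle = harness.handle();

        harness.cleanup().await;
        logctx.cleanup_successful();
    }
}
```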
 
## RSS Changes

- RSS is now responsible for provisioning "control plane disks and zpools"
  during initial bootstrapping.
- RSS informs Nexus about the allocation decisions it made via the RSS handoff
  (sketched below).
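
A rough sketch of the allocation records that could travel over that handoff is shown below. The struct and field names are assumptions made for illustration, not the exact request types Nexus accepts.

```rust
// Assumed dependency: uuid.
use uuid::Uuid;

/// A physical disk that RSS decided to adopt into the control plane
/// (illustrative shape).
pub struct PhysicalDiskPutRequest {
    pub id: Uuid,
    pub vendor: String,
    pub serial: String,
    pub model: String,
    pub sled_id: Uuid,
}

/// A zpool RSS asked the sled agent to place on one of those disks.
pub struct ZpoolPutRequest {
    pub id: Uuid,
    pub sled_id: Uuid,
    pub physical_disk_id: Uuid,
}

/// Slice of the RSS -> Nexus handoff relevant to storage: Nexus records these
/// rather than discovering and upserting disks on its own.
pub struct StorageHandoff {
    pub physical_disks: Vec<PhysicalDiskPutRequest>,
    pub zpools: Vec<ZpoolPutRequest>,
}
```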

## Nexus Changes

- Nexus exposes a smaller API (no more notifications for "disk add/remove" or
  "zpool add/remove"). It receives a handoff from RSS, and will later be in
  charge of provisioning decisions based on inventory.
- Dynamically adding and removing disks/zpools after RSS will appear in a
  subsequent PR.

---------

Co-authored-by: Andrew J. Stone <[email protected]>
smklein and andrewjstone authored Mar 29, 2024
1 parent e5094dc commit 4eb8153
Showing 86 changed files with 5,037 additions and 2,632 deletions.
13 changes: 9 additions & 4 deletions .github/buildomat/jobs/deploy.sh
@@ -205,7 +205,7 @@ PXA_END="$EXTRA_IP_END"
export GATEWAY_IP GATEWAY_MAC PXA_START PXA_END

pfexec zpool create -f scratch c1t1d0 c2t1d0
ZPOOL_VDEV_DIR=/scratch ptime -m pfexec ./tools/create_virtual_hardware.sh
VDEV_DIR=/scratch ptime -m pfexec ./tools/create_virtual_hardware.sh

#
# Generate a self-signed certificate to use as the initial TLS certificate for
@@ -214,7 +214,12 @@ ZPOOL_VDEV_DIR=/scratch ptime -m pfexec ./tools/create_virtual_hardware.sh
# real system, the certificate would come from the customer during initial rack
# setup on the technician port.
#
tar xf out/omicron-sled-agent.tar pkg/config-rss.toml
tar xf out/omicron-sled-agent.tar pkg/config-rss.toml pkg/config.toml

# Update the vdevs to point to where we've created them
sed -E -i~ "s/(m2|u2)(.*\.vdev)/\/scratch\/\1\2/g" pkg/config.toml
diff -u pkg/config.toml{~,} || true

SILO_NAME="$(sed -n 's/silo_name = "\(.*\)"/\1/p' pkg/config-rss.toml)"
EXTERNAL_DNS_DOMAIN="$(sed -n 's/external_dns_zone_name = "\(.*\)"/\1/p' pkg/config-rss.toml)"

@@ -241,8 +246,8 @@ addresses = \\[\"$UPLINK_IP/24\"\\]
" pkg/config-rss.toml
diff -u pkg/config-rss.toml{~,} || true

tar rvf out/omicron-sled-agent.tar pkg/config-rss.toml
rm -f pkg/config-rss.toml*
tar rvf out/omicron-sled-agent.tar pkg/config-rss.toml pkg/config.toml
rm -f pkg/config-rss.toml* pkg/config.toml*

#
# By default, OpenSSL creates self-signed certificates with "CA:true". The TLS
5 changes: 5 additions & 0 deletions Cargo.lock


1 change: 1 addition & 0 deletions Cargo.toml
@@ -444,6 +444,7 @@ update-engine = { path = "update-engine" }
usdt = "0.5.0"
uuid = { version = "1.7.0", features = ["serde", "v4"] }
walkdir = "2.4"
whoami = "1.5"
wicket = { path = "wicket" }
wicket-common = { path = "wicket-common" }
wicketd-client = { path = "clients/wicketd-client" }
11 changes: 1 addition & 10 deletions clients/sled-agent-client/src/lib.rs
@@ -35,6 +35,7 @@ progenitor::generate_api!(
// replace directives below?
replace = {
ByteCount = omicron_common::api::external::ByteCount,
DiskIdentity = omicron_common::disk::DiskIdentity,
Generation = omicron_common::api::external::Generation,
MacAddr = omicron_common::api::external::MacAddr,
Name = omicron_common::api::external::Name,
@@ -230,16 +231,6 @@ impl omicron_common::api::external::ClientError for types::Error {
}
}

impl From<types::DiskIdentity> for omicron_common::disk::DiskIdentity {
fn from(identity: types::DiskIdentity) -> Self {
Self {
vendor: identity.vendor,
serial: identity.serial,
model: identity.model,
}
}
}

impl From<omicron_common::api::internal::nexus::InstanceRuntimeState>
for types::InstanceRuntimeState
{
1 change: 1 addition & 0 deletions common/src/api/external/mod.rs
@@ -881,6 +881,7 @@ pub enum ResourceType {
ServiceNetworkInterface,
Sled,
SledInstance,
SledLedger,
Switch,
SagaDbg,
Snapshot,
5 changes: 3 additions & 2 deletions common/src/ledger.rs
@@ -7,7 +7,7 @@
use async_trait::async_trait;
use camino::{Utf8Path, Utf8PathBuf};
use serde::{de::DeserializeOwned, Serialize};
use slog::{debug, info, warn, Logger};
use slog::{debug, error, info, warn, Logger};

#[derive(thiserror::Error, Debug)]
pub enum Error {
@@ -127,14 +127,15 @@ impl<T: Ledgerable> Ledger<T> {
let mut one_successful_write = false;
for path in self.paths.iter() {
if let Err(e) = self.atomic_write(&path).await {
warn!(self.log, "Failed to write to {}: {e}", path);
warn!(self.log, "Failed to write ledger"; "path" => ?path, "err" => ?e);
failed_paths.push((path.to_path_buf(), e));
} else {
one_successful_write = true;
}
}

if !one_successful_write {
error!(self.log, "No successful writes to ledger");
return Err(Error::FailedToWrite { failed_paths });
}
Ok(())
4 changes: 1 addition & 3 deletions illumos-utils/Cargo.toml
@@ -28,6 +28,7 @@ smf.workspace = true
thiserror.workspace = true
tokio.workspace = true
uuid.workspace = true
whoami.workspace = true
zone.workspace = true

# only enabled via the `testing` feature
Expand All @@ -46,6 +47,3 @@ toml.workspace = true
[features]
# Enable to generate MockZones
testing = ["mockall"]
# Useful for tests that want real functionality and ability to run without
# pfexec
tmp_keypath = []
61 changes: 40 additions & 21 deletions illumos-utils/src/zfs.rs
@@ -5,7 +5,7 @@
//! Utilities for poking at ZFS.
use crate::{execute, PFEXEC};
use camino::Utf8PathBuf;
use camino::{Utf8Path, Utf8PathBuf};
use omicron_common::disk::DiskIdentity;
use std::fmt;

@@ -28,8 +28,6 @@ pub const ZFS: &str = "/usr/sbin/zfs";
/// the keys and recreate the files on demand when creating and mounting
/// encrypted filesystems. We then zero them and unlink them.
pub const KEYPATH_ROOT: &str = "/var/run/oxide/";
// Use /tmp so we don't have to worry about running tests with pfexec
pub const TEST_KEYPATH_ROOT: &str = "/tmp";

/// Error returned by [`Zfs::list_datasets`].
#[derive(thiserror::Error, Debug)]
@@ -168,27 +166,34 @@ impl fmt::Display for Keypath {
}
}

#[cfg(not(feature = "tmp_keypath"))]
impl From<&DiskIdentity> for Keypath {
fn from(id: &DiskIdentity) -> Self {
build_keypath(id, KEYPATH_ROOT)
}
}

#[cfg(feature = "tmp_keypath")]
impl From<&DiskIdentity> for Keypath {
fn from(id: &DiskIdentity) -> Self {
build_keypath(id, TEST_KEYPATH_ROOT)
impl Keypath {
/// Constructs a Keypath for the specified disk within the supplied root
/// directory.
///
/// By supplying "root", tests can override the location where these paths
/// are stored to non-global locations.
pub fn new<P: AsRef<Utf8Path>>(id: &DiskIdentity, root: &P) -> Keypath {
let keypath_root = Utf8PathBuf::from(KEYPATH_ROOT);
let mut keypath = keypath_root.as_path();
let keypath_directory = loop {
match keypath.strip_prefix("/") {
Ok(stripped) => keypath = stripped,
Err(_) => break root.as_ref().join(keypath),
}
};
std::fs::create_dir_all(&keypath_directory)
.expect("Cannot ensure directory for keys");

let filename = format!(
"{}-{}-{}-zfs-aes-256-gcm.key",
id.vendor, id.serial, id.model
);
let path: Utf8PathBuf =
[keypath_directory.as_str(), &filename].iter().collect();
Keypath(path)
}
}

fn build_keypath(id: &DiskIdentity, root: &str) -> Keypath {
let filename =
format!("{}-{}-{}-zfs-aes-256-gcm.key", id.vendor, id.serial, id.model);
let path: Utf8PathBuf = [root, &filename].iter().collect();
Keypath(path)
}

#[derive(Debug)]
pub struct EncryptionDetails {
pub keypath: Keypath,
@@ -332,6 +337,20 @@ impl Zfs {
err: err.into(),
})?;

// We ensure that the currently running process has the ability to
// act on the underlying mountpoint.
if !zoned {
let mut command = std::process::Command::new(PFEXEC);
let user = whoami::username();
let mount = format!("{mountpoint}");
let cmd = command.args(["chown", "-R", &user, &mount]);
execute(cmd).map_err(|err| EnsureFilesystemError {
name: name.to_string(),
mountpoint: mountpoint.clone(),
err: err.into(),
})?;
}

if let Some(SizeDetails { quota, compression }) = size_details {
// Apply any quota and compression mode.
Self::apply_properties(name, &mountpoint, quota, compression)?;
17 changes: 12 additions & 5 deletions illumos-utils/src/zpool.rs
@@ -12,10 +12,12 @@ use std::fmt;
use std::str::FromStr;
use uuid::Uuid;

const ZPOOL_EXTERNAL_PREFIX: &str = "oxp_";
const ZPOOL_INTERNAL_PREFIX: &str = "oxi_";
pub const ZPOOL_EXTERNAL_PREFIX: &str = "oxp_";
pub const ZPOOL_INTERNAL_PREFIX: &str = "oxi_";
const ZPOOL: &str = "/usr/sbin/zpool";

pub const ZPOOL_MOUNTPOINT_ROOT: &str = "/";

#[derive(thiserror::Error, Debug, PartialEq, Eq)]
#[error("Failed to parse output: {0}")]
pub struct ParseError(String);
@@ -192,7 +194,7 @@ impl Zpool {
let mut cmd = std::process::Command::new(PFEXEC);
cmd.env_clear();
cmd.env("LC_ALL", "C.UTF-8");
cmd.arg(ZPOOL).arg("create");
cmd.arg(ZPOOL).args(["create", "-o", "ashift=12"]);
cmd.arg(&name.to_string());
cmd.arg(vdev);
execute(&mut cmd).map_err(Error::from)?;
@@ -374,9 +376,14 @@ impl ZpoolName {
/// Returns a path to a dataset's mountpoint within the zpool.
///
/// For example: oxp_(UUID) -> /pool/ext/(UUID)/(dataset)
pub fn dataset_mountpoint(&self, dataset: &str) -> Utf8PathBuf {
pub fn dataset_mountpoint(
&self,
root: &Utf8Path,
dataset: &str,
) -> Utf8PathBuf {
let mut path = Utf8PathBuf::new();
path.push("/pool");
path.push(root);
path.push("pool");
match self.kind {
ZpoolKind::External => path.push("ext"),
ZpoolKind::Internal => path.push("int"),
13 changes: 10 additions & 3 deletions installinator/src/hardware.rs
@@ -9,6 +9,7 @@ use anyhow::Result;
use sled_hardware::DiskVariant;
use sled_hardware::HardwareManager;
use sled_hardware::SledMode;
use sled_storage::config::MountConfig;
use sled_storage::disk::Disk;
use sled_storage::disk::RawDisk;
use slog::info;
@@ -49,9 +50,15 @@ impl Hardware {
);
}
DiskVariant::M2 => {
let disk = Disk::new(log, disk, None)
.await
.context("failed to instantiate Disk handle for M.2")?;
let disk = Disk::new(
log,
&MountConfig::default(),
disk,
None,
None,
)
.await
.context("failed to instantiate Disk handle for M.2")?;
m2_disks.push(disk);
}
}
1 change: 1 addition & 0 deletions installinator/src/write.rs
@@ -116,6 +116,7 @@ impl WriteDestination {

let zpool_name = disk.zpool_name().clone();
let control_plane_dir = zpool_name.dataset_mountpoint(
illumos_utils::zpool::ZPOOL_MOUNTPOINT_ROOT.into(),
sled_storage::dataset::INSTALL_DATASET,
);

2 changes: 1 addition & 1 deletion key-manager/src/lib.rs
@@ -102,7 +102,7 @@ enum StorageKeyRequest {
/// the sled-agent starts. The `HardwareMonitor` gets the StorageKeyRequester
/// from the bootstrap agent. If this changes, we should remove the `Clone` to
/// limit who has access to the storage keys.
#[derive(Clone)]
#[derive(Debug, Clone)]
pub struct StorageKeyRequester {
tx: mpsc::Sender<StorageKeyRequest>,
}
15 changes: 3 additions & 12 deletions nexus/db-model/src/physical_disk.rs
@@ -29,14 +29,15 @@ pub struct PhysicalDisk {

impl PhysicalDisk {
pub fn new(
id: Uuid,
vendor: String,
serial: String,
model: String,
variant: PhysicalDiskKind,
sled_id: Uuid,
) -> Self {
Self {
identity: PhysicalDiskIdentity::new(Uuid::new_v4()),
identity: PhysicalDiskIdentity::new(id),
time_deleted: None,
rcgen: Generation::new(),
vendor,
@@ -47,20 +48,10 @@ impl PhysicalDisk {
}
}

pub fn uuid(&self) -> Uuid {
pub fn id(&self) -> Uuid {
self.identity.id
}

// This is slightly gross, but:
// the `authz_resource` macro really expects that the "primary_key"
// for an object can be acquired by "id()".
//
// The PhysicalDisk object does actually have a separate convenience
// UUID, but may be looked by up vendor/serial/model too.
pub fn id(&self) -> (String, String, String) {
(self.vendor.clone(), self.serial.clone(), self.model.clone())
}

pub fn time_deleted(&self) -> Option<DateTime<Utc>> {
self.time_deleted
}
2 changes: 1 addition & 1 deletion nexus/db-queries/src/authz/api_resources.rs
@@ -1060,7 +1060,7 @@ authz_resource! {
authz_resource! {
name = "PhysicalDisk",
parent = "Fleet",
primary_key = (String, String, String),
primary_key = Uuid,
roles_allowed = false,
polar_snippet = FleetChild,
}
6 changes: 4 additions & 2 deletions nexus/db-queries/src/authz/policy_test/resources.rs
@@ -102,10 +102,12 @@ pub async fn make_resources(

make_services(&mut builder).await;

let physical_disk_id =
"c9f923f6-caf3-4c83-96f9-8ffe8c627dd2".parse().unwrap();
builder.new_resource(authz::PhysicalDisk::new(
authz::FLEET,
("vendor".to_string(), "serial".to_string(), "model".to_string()),
LookupType::ByCompositeId("vendor-serial-model".to_string()),
physical_disk_id,
LookupType::ById(physical_disk_id),
));

let device_user_code = String::from("a-device-user-code");
5 changes: 4 additions & 1 deletion nexus/db-queries/src/db/datastore/dataset.rs
@@ -230,7 +230,10 @@ mod test {
// Create a fake zpool that backs our fake datasets.
let zpool_id = Uuid::new_v4();
let zpool = Zpool::new(zpool_id, sled_id, Uuid::new_v4());
datastore.zpool_upsert(zpool).await.expect("failed to upsert zpool");
datastore
.zpool_upsert(opctx, zpool)
.await
.expect("failed to upsert zpool");

// Inserting a new dataset should succeed.
let dataset1 = datastore
(Remaining changed files not shown.)
