Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

[sled agent] API to manage datasets explicitly #6144

Merged
merged 38 commits into from
Aug 28, 2024
Merged
Show file tree
Hide file tree
Changes from 13 commits
Commits
Show all changes
38 commits
Select commit Hold shift + click to select a range
5c2e5b0
[wip] Starting sled agent API to manage datasets explicitly
smklein Jul 18, 2024
1709d80
Merge branch 'main' into explicit-datasets
smklein Jul 22, 2024
a80313d
list implementation
smklein Jul 22, 2024
7193d1b
Tests
smklein Jul 22, 2024
9464cc1
schemas n stuff
smklein Jul 22, 2024
fb23666
Merge branch 'main' into explicit-datasets
smklein Jul 23, 2024
29c4eb7
Merge branch 'main' into explicit-datasets
smklein Jul 25, 2024
75f006b
Merge branch 'main' into explicit-datasets
smklein Jul 26, 2024
24d93a8
Fix mismerge
smklein Jul 26, 2024
178e20e
Clippy and helios
smklein Jul 26, 2024
cf3f35c
openapi
smklein Jul 26, 2024
11f49fb
More broad support for datasets
smklein Jul 30, 2024
b999bf7
make it a schema change
smklein Jul 30, 2024
fac6456
Merge branch 'main' into explicit-datasets
smklein Jul 30, 2024
fdf0644
queries
smklein Jul 30, 2024
a042439
Merge branch 'main' into explicit-datasets
smklein Jul 31, 2024
9c27213
Merge branch 'main' into explicit-datasets
smklein Aug 7, 2024
c562125
Optional properties still get set
smklein Aug 7, 2024
c7a5adf
clippy
smklein Aug 7, 2024
181507c
fmt
smklein Aug 7, 2024
8e1df07
str not string
smklein Aug 7, 2024
93134c2
cockroachdb, not cockroach_db
smklein Aug 7, 2024
d88d541
generation numbers
smklein Aug 7, 2024
95ebbb8
review feedback
smklein Aug 7, 2024
994757e
Serialize via string, not tag
smklein Aug 7, 2024
7156ba0
safer API for Dataset::new
smklein Aug 8, 2024
14a2dc4
merge main into branch, manually impl JsonSchema
sunshowers Aug 8, 2024
aa60254
update schema JSONs, add replace
sunshowers Aug 8, 2024
67c5d7e
Merge branch 'main' into explicit-datasets
smklein Aug 10, 2024
927efd1
Merge remote-tracking branch 'origin/explicit-datasets' into explicit…
smklein Aug 10, 2024
3e14cc4
Fix schema
smklein Aug 10, 2024
40f93ae
Merge branch 'main' into explicit-datasets
smklein Aug 22, 2024
077908d
Fix clickhouseserver datasetkind serialization
smklein Aug 26, 2024
a600494
Merge branch 'main' into explicit-datasets
smklein Aug 26, 2024
0a63161
Merge branch 'main' into explicit-datasets
smklein Aug 28, 2024
c9f170e
Make CompressionAlgorithm strongly typed
smklein Aug 28, 2024
f5bc35a
Fixing helios-only tests, clippy
smklein Aug 28, 2024
474ec13
schemas
smklein Aug 28, 2024
File filter

Filter by extension

Filter by extension

Conversations
Failed to load comments.
Loading
Jump to
Jump to file
Failed to load files.
Loading
Diff view
Diff view
105 changes: 101 additions & 4 deletions common/src/api/internal/shared.rs
Original file line number Diff line number Diff line change
Expand Up @@ -704,33 +704,130 @@ pub struct ResolvedVpcRouteSet {

/// Describes the purpose of the dataset.
#[derive(
Debug, Serialize, Deserialize, JsonSchema, Clone, Copy, PartialEq, Eq,
Debug,
Serialize,
Deserialize,
JsonSchema,
Clone,
PartialEq,
Eq,
Ord,
PartialOrd,
Hash,
)]
#[serde(rename_all = "snake_case")]
#[serde(tag = "type", rename_all = "snake_case")]
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Will adding this tag break any serialization? I'm thinking of code which attempts to deserialize these values, but which is not necessarily running against the same version of the code here.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'm wondering if we should just always require tag in serde impls to ensure forward compatibility, in case any particular variant starts having associated data in the future. (The requirement to use a tag, if any variant has associated data, is imposed by the openapi tooling.)

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I went down the route of "serialize/deserialize via string", as @sunshowers recommended later.

pub enum DatasetKind {
Crucible,
// Durable datasets for zones

// This renaming exists for backwards compatibility -- this enum variant
// was serialized to "all-zones-request" as "cockroach_db" and should
// stay that way, unless we perform an explicit schema change.
#[serde(rename = "cockroach_db")]
Cockroach,
Crucible,
Clickhouse,
ClickhouseKeeper,
ExternalDns,
InternalDns,

// Zone filesystems
ZoneRoot,
Zone {
name: String,
},

// Other datasets
Debug,
}

impl DatasetKind {
Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

A chunk of this implementation was moved from sled-storage/src/dataset.rs.

History lesson: We used to have types defined that would be usable by Nexus (common/src/api/internal), and types that were internal to the sled agent (sled-storage/src/dataset.rs) for defining dataset types/kinds.

This PR merges both concepts, but tweaks some names to avoid changing any schemas that are saved on M.2s.

pub fn dataset_should_be_encrypted(&self) -> bool {
match self {
// We encrypt all datasets except Crucible.
//
// Crucible already performs encryption internally, and we
// avoid double-encryption.
DatasetKind::Crucible => false,
_ => true,
}
}

/// Returns true if this dataset is delegated to a non-global zone.
pub fn zoned(&self) -> bool {
use DatasetKind::*;
match self {
Cockroach | Crucible | Clickhouse | ClickhouseKeeper
| ExternalDns | InternalDns => true,
ZoneRoot | Zone { .. } | Debug => false,
}
}

/// Returns the zone name, if this is dataset for a zone filesystem.
smklein marked this conversation as resolved.
Show resolved Hide resolved
///
/// Otherwise, returns "None".
pub fn zone_name(&self) -> Option<String> {
smklein marked this conversation as resolved.
Show resolved Hide resolved
if let DatasetKind::Zone { name } = self {
Some(name.clone())
} else {
None
}
}
}

// Be cautious updating this implementation:
//
// - It should align with [DatasetKind::FromStr], below
// - The strings here are used here comprise the dataset name, stored durably
// on-disk
impl fmt::Display for DatasetKind {
fn fmt(&self, f: &mut fmt::Formatter<'_>) -> fmt::Result {
use DatasetKind::*;
let s = match self {
Crucible => "crucible",
Cockroach => "cockroach",
Cockroach => "cockroachdb",
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

When is the Display implementation used and when is a value serialized? I'm wondering why this variant is "cockroachdb" here but "cockroach_db" when serialized.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

+1, i would have a slight preference for the Display and Serialize strings being identical...

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

+1 as well. For ZoneKind, which is related, we already have 4 different string representations X_X:

Copy link
Collaborator Author

@smklein smklein Aug 7, 2024

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

So my mishaps with naming here were intended to keep backwards compatibility between:

  • Names for cockroach we have in deployed systems (e.g., the zfs list-ed name is cockroachdb)
  • Names for cockroach we have stored on-disk in configuration files (e.g., grepping for cockroach_db in the schema directory of Omicron shows that it's used in e.g., all-zones-requests.json, which is used for bootstrapping "what zones get auto-launched when we reboot")

To be clear, I think I misinterpreted this in my original PR.

I was attempting to merge DatasetKind (from this file) and DatasetType (from sled-storage/src/dataset.rs) without breaking backwards compatibility. It was in the all-zones-requests.json configuration file where I saw the underscored "cockroach_db" name and tried to keep compatibility.

HOWEVER, upon closer inspection, the cockroach_db name actually comes from OmicronZoneType in nexus-sled-agent-shared/src/inventory.rs, so I should be able to just completely stick with the cockroachdb name (no underscores) in this DatasetKind variant.

Updated in 93134c2 to just use cockroachdb in serialize + display

Clickhouse => "clickhouse",
ClickhouseKeeper => "clickhouse_keeper",
ExternalDns => "external_dns",
InternalDns => "internal_dns",
ZoneRoot => "zone",
Zone { name } => {
write!(f, "zone/{}", name)?;
return Ok(());
}
smklein marked this conversation as resolved.
Show resolved Hide resolved
Debug => "debug",
};
write!(f, "{}", s)
}
}

#[derive(Debug, thiserror::Error)]
pub enum DatasetKindParseError {
#[error("Dataset unknown: {0}")]
UnknownDataset(String),
}

impl FromStr for DatasetKind {
type Err = DatasetKindParseError;

fn from_str(s: &str) -> Result<Self, Self::Err> {
use DatasetKind::*;
let kind = match s {
"crucible" => Crucible,
"cockroachdb" | "cockroach_db" => Cockroach,
smklein marked this conversation as resolved.
Show resolved Hide resolved
"clickhouse" => Clickhouse,
"clickhouse_keeper" => ClickhouseKeeper,
"external_dns" => ExternalDns,
"internal_dns" => InternalDns,
_ => {
return Err(DatasetKindParseError::UnknownDataset(
s.to_string(),
))
}
};
Ok(kind)
}
}

/// Identifiers for a single sled.
///
/// This is intended primarily to be used in timeseries, to identify
Expand Down
131 changes: 130 additions & 1 deletion common/src/disk.rs
Original file line number Diff line number Diff line change
Expand Up @@ -4,15 +4,20 @@

//! Disk related types shared among crates

use omicron_uuid_kinds::DatasetUuid;
use omicron_uuid_kinds::ZpoolUuid;
use schemars::JsonSchema;
use serde::{Deserialize, Serialize};
use uuid::Uuid;

use crate::{
api::external::Generation, ledger::Ledgerable, zpool_name::ZpoolKind,
api::external::Generation,
ledger::Ledgerable,
zpool_name::{ZpoolKind, ZpoolName},
};

pub use crate::api::internal::shared::DatasetKind;

#[derive(
Clone,
Debug,
Expand Down Expand Up @@ -69,6 +74,130 @@ impl OmicronPhysicalDisksConfig {
}
}

#[derive(
Debug,
PartialEq,
Eq,
Hash,
Serialize,
Deserialize,
Clone,
JsonSchema,
PartialOrd,
Ord,
)]
pub struct DatasetName {
// A unique identifier for the Zpool on which the dataset is stored.
pool_name: ZpoolName,
// A name for the dataset within the Zpool.
kind: DatasetKind,
}

impl DatasetName {
pub fn new(pool_name: ZpoolName, kind: DatasetKind) -> Self {
Self { pool_name, kind }
}

pub fn pool(&self) -> &ZpoolName {
&self.pool_name
}

pub fn dataset(&self) -> &DatasetKind {
&self.kind
}

/// Returns the full name of the dataset, as would be returned from
/// "zfs get" or "zfs list".
///
/// If this dataset should be encrypted, this automatically adds the
/// "crypt" dataset component.
pub fn full_name(&self) -> String {
// Currently, we encrypt all datasets except Crucible.
//
// Crucible already performs encryption internally, and we
// avoid double-encryption.
if self.kind.dataset_should_be_encrypted() {
self.full_encrypted_name()
} else {
self.full_unencrypted_name()
}
}

fn full_encrypted_name(&self) -> String {
format!("{}/crypt/{}", self.pool_name, self.kind)
}

fn full_unencrypted_name(&self) -> String {
format!("{}/{}", self.pool_name, self.kind)
}
}

/// Configuration information necessary to request a single dataset
#[derive(
Clone,
Debug,
Deserialize,
Serialize,
JsonSchema,
PartialEq,
Eq,
Hash,
PartialOrd,
Ord,
)]
pub struct DatasetConfig {
/// The UUID of the dataset being requested
pub id: DatasetUuid,

/// The dataset's name
pub name: DatasetName,

/// The compression mode to be supplied, if any
pub compression: Option<String>,
smklein marked this conversation as resolved.
Show resolved Hide resolved

/// The upper bound on the amount of storage used by this dataset
pub quota: Option<usize>,

/// The lower bound on the amount of storage usable by this dataset
pub reservation: Option<usize>,
}

#[derive(
Clone, Debug, Deserialize, Serialize, JsonSchema, PartialEq, Eq, Hash,
)]
pub struct DatasetsConfig {
/// generation number of this configuration
///
/// This generation number is owned by the control plane (i.e., RSS or
/// Nexus, depending on whether RSS-to-Nexus handoff has happened). It
/// should not be bumped within Sled Agent.
///
/// Sled Agent rejects attempts to set the configuration to a generation
/// older than the one it's currently running.
///
/// Note that "Generation::new()", AKA, the first generation number,
/// is reserved for "no datasets". This is the default configuration
/// for a sled before any requests have been made.
pub generation: Generation,

pub datasets: Vec<DatasetConfig>,
}

impl Default for DatasetsConfig {
fn default() -> Self {
Self { generation: Generation::new(), datasets: vec![] }
}
}

impl Ledgerable for DatasetsConfig {
fn is_newer_than(&self, other: &Self) -> bool {
self.generation > other.generation
}

// No need to do this, the generation number is provided externally.
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Hmm as someone unfamiliar with this code I'm not qutie sure what it means here -- I see that there's a generation field. Do you mean it gets updated by whatever owns this config?

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yeah, the generation field exists in the config, so we don't actually need (nor want) to change anything in the implementation of Ledgerable::generation_bump.

The "Ledgerable" trait exists to help us write a file to many disks, and assure that we'll read back the latest one, if it exists. Part of this is deciphering: If we failed halfway through a write, and we see two distinct ledgers, which should we use?

There are some configurations (basically, in the bootstore only now) that want to bump the generation number on every single write, so this just tracks "what's the latest thing that was written".

However, in this case -- and many others, for Nexus-supplied data -- we already have a generation from Nexus, and we don't need to have a duplicate generation number for writing things to disk. As long as the data is unmodified from Nexus -> Sled Agent -> Disk, we can safely re-use this field.

fn generation_bump(&mut self) {}
}

/// Uniquely identifies a disk.
#[derive(
Debug,
Expand Down
44 changes: 37 additions & 7 deletions illumos-utils/src/zfs.rs
Original file line number Diff line number Diff line change
Expand Up @@ -203,7 +203,8 @@ pub struct EncryptionDetails {
#[derive(Debug, Default)]
pub struct SizeDetails {
pub quota: Option<usize>,
pub compression: Option<&'static str>,
pub reservation: Option<usize>,
pub compression: Option<String>,
}

#[cfg_attr(any(test, feature = "testing"), mockall::automock, allow(dead_code))]
Expand Down Expand Up @@ -274,10 +275,18 @@ impl Zfs {
) -> Result<(), EnsureFilesystemError> {
let (exists, mounted) = Self::dataset_exists(name, &mountpoint)?;
if exists {
if let Some(SizeDetails { quota, compression }) = size_details {
if let Some(SizeDetails { quota, reservation, compression }) =
size_details
{
// apply quota and compression mode (in case they've changed across
// sled-agent versions since creation)
Self::apply_properties(name, &mountpoint, quota, compression)?;
Self::apply_properties(
name,
&mountpoint,
quota,
reservation,
compression,
)?;
}

if encryption_details.is_none() {
Expand Down Expand Up @@ -351,9 +360,17 @@ impl Zfs {
})?;
}

if let Some(SizeDetails { quota, compression }) = size_details {
if let Some(SizeDetails { quota, reservation, compression }) =
size_details
{
// Apply any quota and compression mode.
Self::apply_properties(name, &mountpoint, quota, compression)?;
Self::apply_properties(
name,
&mountpoint,
quota,
reservation,
compression,
)?;
}

Ok(())
Expand All @@ -363,7 +380,8 @@ impl Zfs {
name: &str,
mountpoint: &Mountpoint,
quota: Option<usize>,
compression: Option<&'static str>,
reservation: Option<usize>,
smklein marked this conversation as resolved.
Show resolved Hide resolved
compression: Option<String>,
) -> Result<(), EnsureFilesystemError> {
if let Some(quota) = quota {
if let Err(err) =
Expand All @@ -377,8 +395,20 @@ impl Zfs {
});
}
}
if let Some(reservation) = reservation {
if let Err(err) =
Self::set_value(name, "reservation", &format!("{reservation}"))
{
return Err(EnsureFilesystemError {
name: name.to_string(),
mountpoint: mountpoint.clone(),
// Take the execution error from the SetValueError
err: err.err.into(),
});
}
}
if let Some(compression) = compression {
smklein marked this conversation as resolved.
Show resolved Hide resolved
if let Err(err) = Self::set_value(name, "compression", compression)
if let Err(err) = Self::set_value(name, "compression", &compression)
{
return Err(EnsureFilesystemError {
name: name.to_string(),
Expand Down
3 changes: 3 additions & 0 deletions nexus/db-model/src/dataset.rs
Original file line number Diff line number Diff line change
Expand Up @@ -41,6 +41,7 @@ pub struct Dataset {

pub kind: DatasetKind,
pub size_used: Option<i64>,
zone_name: Option<String>,
smklein marked this conversation as resolved.
Show resolved Hide resolved
}

impl Dataset {
Expand All @@ -49,6 +50,7 @@ impl Dataset {
pool_id: Uuid,
addr: Option<SocketAddrV6>,
kind: DatasetKind,
zone_name: Option<String>,
smklein marked this conversation as resolved.
Show resolved Hide resolved
) -> Self {
let size_used = match kind {
DatasetKind::Crucible => Some(0),
Expand All @@ -63,6 +65,7 @@ impl Dataset {
port: addr.map(|addr| addr.port().into()),
kind,
size_used,
zone_name,
}
}

Expand Down
Loading
Loading