[sled agent] API to manage datasets explicitly #6144
Changes from 25 commits
@@ -10,13 +10,14 @@ use crate::{
 };
 use oxnet::{IpNet, Ipv4Net, Ipv6Net};
 use schemars::JsonSchema;
-use serde::{Deserialize, Serialize};
+use serde::{de, Deserialize, Deserializer, Serialize, Serializer};
 use std::{
     collections::{HashMap, HashSet},
     fmt,
     net::{IpAddr, Ipv4Addr, Ipv6Addr},
     str::FromStr,
 };
+use strum::EnumCount;
 use uuid::Uuid;

 /// The type of network interface
@@ -704,33 +705,142 @@ pub struct ResolvedVpcRouteSet {

 /// Describes the purpose of the dataset.
 #[derive(
-    Debug, Serialize, Deserialize, JsonSchema, Clone, Copy, PartialEq, Eq,
+    Debug, JsonSchema, Clone, PartialEq, Eq, Ord, PartialOrd, Hash, EnumCount,
 )]
-#[serde(rename_all = "snake_case")]
+#[serde(tag = "type", rename_all = "snake_case")]
 pub enum DatasetKind {
-    Crucible,
+    // Durable datasets for zones
+    #[serde(rename = "cockroachdb")]
+    Cockroach,
+    Crucible,
+    Clickhouse,
+    ClickhouseKeeper,
+    ExternalDns,
+    InternalDns,
+
+    // Zone filesystems
+    ZoneRoot,
+    Zone {
+        name: String,
+    },
+
+    // Other datasets
+    Debug,
 }
impl Serialize for DatasetKind {
    fn serialize<S>(&self, serializer: S) -> Result<S::Ok, S::Error>
    where
        S: Serializer,
    {
        serializer.serialize_str(&self.to_string())
    }
}

[Review comment] Oh hmm, I think you'll also want to update the JsonSchema implementation.

[Reply] Just to confirm -- what about the JsonSchema implementation should we update? It's currently being derived, and is used as part of internal APIs, but I don't think it needs to match the serialization + to/from string implementations.

[Reply] Updated the PR with a fixed JSON schema. Basically the issue was that JSON schemas are meant to be a description of the serialization format, and they were out of sync since we'd manually implemented Serialize and Deserialize. I've changed the schema to be just a string.

impl<'de> Deserialize<'de> for DatasetKind {
    fn deserialize<D>(deserializer: D) -> Result<Self, D::Error>
    where
        D: Deserializer<'de>,
    {
        let s = String::deserialize(deserializer)?;
        s.parse().map_err(de::Error::custom)
    }
}
[Author comment] A chunk of this implementation was moved from … History lesson: We used to have types defined that would be usable by Nexus (…). This PR merges both concepts, but tweaks some names to avoid changing any schemas that are saved on M.2s.

impl DatasetKind {
    pub fn dataset_should_be_encrypted(&self) -> bool {
        match self {
            // We encrypt all datasets except Crucible.
            //
            // Crucible already performs encryption internally, and we
            // avoid double-encryption.
            DatasetKind::Crucible => false,
            _ => true,
        }
    }

    /// Returns true if this dataset is delegated to a non-global zone.
    pub fn zoned(&self) -> bool {
        use DatasetKind::*;
        match self {
            Cockroach | Crucible | Clickhouse | ClickhouseKeeper
            | ExternalDns | InternalDns => true,
            ZoneRoot | Zone { .. } | Debug => false,
        }
    }

    /// Returns the zone name, if this is a dataset for a zone filesystem.
    ///
    /// Otherwise, returns "None".
    pub fn zone_name(&self) -> Option<&str> {
        if let DatasetKind::Zone { name } = self {
            Some(name)
        } else {
            None
        }
    }
}
// Be cautious updating this implementation:
//
// - It should align with [DatasetKind::FromStr], below
// - The strings here comprise the dataset name, stored durably on-disk
impl fmt::Display for DatasetKind {
    fn fmt(&self, f: &mut fmt::Formatter<'_>) -> fmt::Result {
        use DatasetKind::*;
        let s = match self {
            Crucible => "crucible",
-            Cockroach => "cockroach",
+            Cockroach => "cockroachdb",
            Clickhouse => "clickhouse",
            ClickhouseKeeper => "clickhouse_keeper",
            ExternalDns => "external_dns",
            InternalDns => "internal_dns",
            ZoneRoot => "zone",
            Zone { name } => {
                write!(f, "zone/{}", name)?;
                return Ok(());
            }
            Debug => "debug",
        };
        write!(f, "{}", s)
    }
}

[Review comment] When is the …

[Reply] +1, I would have a slight preference for the …

[Reply] +1 as well. For …

[Author reply] So my mishaps with naming here were intended to keep backwards compatibility between: … To be clear, I think I misinterpreted this in my original PR. I was attempting to merge … HOWEVER, upon closer inspection, the … Updated in 93134c2 to just use …
#[derive(Debug, thiserror::Error)]
pub enum DatasetKindParseError {
    #[error("Dataset unknown: {0}")]
    UnknownDataset(String),
}

impl FromStr for DatasetKind {
    type Err = DatasetKindParseError;

    fn from_str(s: &str) -> Result<Self, Self::Err> {
        use DatasetKind::*;
        let kind = match s {
            "cockroachdb" => Cockroach,
            "crucible" => Crucible,
            "clickhouse" => Clickhouse,
            "clickhouse_keeper" => ClickhouseKeeper,
            "external_dns" => ExternalDns,
            "internal_dns" => InternalDns,
            "zone" => ZoneRoot,
            "debug" => Debug,
            other => {
                if let Some(name) = other.strip_prefix("zone/") {
                    Zone { name: name.to_string() }
                } else {
                    return Err(DatasetKindParseError::UnknownDataset(
                        s.to_string(),
                    ));
                }
            }
        };
        Ok(kind)
    }
}
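The Display and FromStr implementations above double as the serde wire format, since Serialize simply calls to_string() and Deserialize calls parse(). A minimal, dependency-free sketch of the same pattern (the MiniKind type here is hypothetical, not from the PR): unit variants map to fixed strings, while the data-carrying variant round-trips through a "zone/" prefix.

```rust
use std::fmt;
use std::str::FromStr;

// Hypothetical miniature of the DatasetKind pattern: fixed variants plus
// one variant carrying data, encoded after a "zone/" separator.
#[derive(Debug, Clone, PartialEq, Eq)]
enum MiniKind {
    Crucible,
    ZoneRoot,
    Zone { name: String },
}

impl fmt::Display for MiniKind {
    fn fmt(&self, f: &mut fmt::Formatter<'_>) -> fmt::Result {
        match self {
            MiniKind::Crucible => write!(f, "crucible"),
            MiniKind::ZoneRoot => write!(f, "zone"),
            // The data-carrying variant embeds its payload in the string.
            MiniKind::Zone { name } => write!(f, "zone/{}", name),
        }
    }
}

impl FromStr for MiniKind {
    type Err = String;

    // Must stay symmetric with Display, since (in the PR) serde
    // round-trips through these two impls.
    fn from_str(s: &str) -> Result<Self, Self::Err> {
        match s {
            "crucible" => Ok(MiniKind::Crucible),
            "zone" => Ok(MiniKind::ZoneRoot),
            other => {
                if let Some(name) = other.strip_prefix("zone/") {
                    Ok(MiniKind::Zone { name: name.to_string() })
                } else {
                    Err(format!("unknown dataset kind: {other}"))
                }
            }
        }
    }
}
```

Keeping Display and FromStr symmetric is the only invariant the manual serde impls rely on, which is why the PR's round-trip test exercises both directions for every variant.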
/// Identifiers for a single sled.
///
/// This is intended primarily to be used in timeseries, to identify

@@ -753,6 +863,7 @@ pub struct SledIdentifiers {

#[cfg(test)]
mod tests {
    use super::*;
    use crate::api::internal::shared::AllowedSourceIps;
    use oxnet::{IpNet, Ipv4Net, Ipv6Net};
    use std::net::{Ipv4Addr, Ipv6Addr};

@@ -797,4 +908,48 @@ mod tests {
            serde_json::from_str(r#"{"allow":"any"}"#).unwrap(),
        );
    }
    #[test]
    fn test_dataset_kind_serialization() {
        let kinds = [
            DatasetKind::Crucible,
            DatasetKind::Cockroach,
            DatasetKind::Clickhouse,
            DatasetKind::ClickhouseKeeper,
            DatasetKind::ExternalDns,
            DatasetKind::InternalDns,
            DatasetKind::ZoneRoot,
            DatasetKind::Zone { name: String::from("myzone") },
            DatasetKind::Debug,
        ];

        assert_eq!(kinds.len(), DatasetKind::COUNT);

        for kind in &kinds {
            // To string, from string
            let as_str = kind.to_string();
            let from_str =
                DatasetKind::from_str(&as_str).unwrap_or_else(|_| {
                    panic!("Failed to convert {kind} to and from string")
                });
            assert_eq!(
                *kind, from_str,
                "{kind} failed to convert to/from a string"
            );

            // Serialize, deserialize
            let ser = serde_json::to_string(&kind)
                .unwrap_or_else(|_| panic!("Failed to serialize {kind}"));
            let de: DatasetKind = serde_json::from_str(&ser)
                .unwrap_or_else(|_| panic!("Failed to deserialize {kind}"));
            assert_eq!(*kind, de, "{kind} failed serialization");

            // Test that serialization is equivalent to stringifying.
            assert_eq!(
                format!("\"{as_str}\""),
                ser,
                "{kind} does not match stringification/serialization"
            );
        }
    }
}
@@ -4,15 +4,21 @@

 //! Disk related types shared among crates

 use omicron_uuid_kinds::DatasetUuid;
 use omicron_uuid_kinds::ZpoolUuid;
 use schemars::JsonSchema;
 use serde::{Deserialize, Serialize};
 use std::collections::BTreeMap;
 use uuid::Uuid;

 use crate::{
-    api::external::Generation, ledger::Ledgerable, zpool_name::ZpoolKind,
+    api::external::Generation,
+    ledger::Ledgerable,
+    zpool_name::{ZpoolKind, ZpoolName},
 };

+pub use crate::api::internal::shared::DatasetKind;

 #[derive(
     Clone,
     Debug,
@@ -69,6 +75,130 @@ impl OmicronPhysicalDisksConfig {
     }
 }

#[derive(
    Debug,
    PartialEq,
    Eq,
    Hash,
    Serialize,
    Deserialize,
    Clone,
    JsonSchema,
    PartialOrd,
    Ord,
)]
pub struct DatasetName {
    // A unique identifier for the Zpool on which the dataset is stored.
    pool_name: ZpoolName,
    // A name for the dataset within the Zpool.
    kind: DatasetKind,
}

impl DatasetName {
    pub fn new(pool_name: ZpoolName, kind: DatasetKind) -> Self {
        Self { pool_name, kind }
    }

    pub fn pool(&self) -> &ZpoolName {
        &self.pool_name
    }

    pub fn dataset(&self) -> &DatasetKind {
        &self.kind
    }

    /// Returns the full name of the dataset, as would be returned from
    /// "zfs get" or "zfs list".
    ///
    /// If this dataset should be encrypted, this automatically adds the
    /// "crypt" dataset component.
    pub fn full_name(&self) -> String {
        // Currently, we encrypt all datasets except Crucible.
        //
        // Crucible already performs encryption internally, and we
        // avoid double-encryption.
        if self.kind.dataset_should_be_encrypted() {
            self.full_encrypted_name()
        } else {
            self.full_unencrypted_name()
        }
    }

    fn full_encrypted_name(&self) -> String {
        format!("{}/crypt/{}", self.pool_name, self.kind)
    }

    fn full_unencrypted_name(&self) -> String {
        format!("{}/{}", self.pool_name, self.kind)
    }
}
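The encryption split above determines where a dataset lands in the ZFS hierarchy. As a hedged, standalone sketch of just the name-building rule (the pool string "oxp_data" is invented for illustration; the real code formats a ZpoolName and DatasetKind rather than taking strings):

```rust
// Sketch of the full_name() rule: encrypted datasets live under a
// "crypt" child of the pool, while unencrypted ones (i.e., Crucible,
// which encrypts internally) sit directly under the pool.
fn full_name(pool: &str, kind: &str, encrypted: bool) -> String {
    if encrypted {
        format!("{pool}/crypt/{kind}")
    } else {
        format!("{pool}/{kind}")
    }
}
```

The useful property is that callers never decide on the "crypt" component themselves; it falls out of DatasetKind::dataset_should_be_encrypted().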
/// Configuration information necessary to request a single dataset
#[derive(
    Clone,
    Debug,
    Deserialize,
    Serialize,
    JsonSchema,
    PartialEq,
    Eq,
    Hash,
    PartialOrd,
    Ord,
)]
pub struct DatasetConfig {
    /// The UUID of the dataset being requested
    pub id: DatasetUuid,

    /// The dataset's name
    pub name: DatasetName,

    /// The compression mode to be supplied, if any
    pub compression: Option<String>,

    /// The upper bound on the amount of storage used by this dataset
    pub quota: Option<usize>,

    /// The lower bound on the amount of storage usable by this dataset
    pub reservation: Option<usize>,
}
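For illustration, a DatasetConfig for a Crucible dataset might serialize along these lines. This is a hedged sketch, not output from the PR: the UUIDs are invented, the "oxp_<uuid>" pool-name form is an assumption about ZpoolName's string encoding, and DatasetKind appears as a plain string per the Serialize impl above.

```json
{
  "id": "1e3b1f7a-8f2d-4b9c-9f0a-2d4c5e6f7a8b",
  "name": {
    "pool_name": "oxp_6f2a7bd2-3d5e-4c91-8f0b-15e7a9c2d4e1",
    "kind": "crucible"
  },
  "compression": null,
  "quota": null,
  "reservation": null
}
```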
#[derive(
    Clone, Debug, Deserialize, Serialize, JsonSchema, PartialEq, Eq, Hash,
)]
pub struct DatasetsConfig {
    /// generation number of this configuration
    ///
    /// This generation number is owned by the control plane (i.e., RSS or
    /// Nexus, depending on whether RSS-to-Nexus handoff has happened). It
    /// should not be bumped within Sled Agent.
    ///
    /// Sled Agent rejects attempts to set the configuration to a generation
    /// older than the one it's currently running.
    ///
    /// Note that "Generation::new()", AKA, the first generation number,
    /// is reserved for "no datasets". This is the default configuration
    /// for a sled before any requests have been made.
    pub generation: Generation,

    pub datasets: BTreeMap<DatasetUuid, DatasetConfig>,
}

impl Default for DatasetsConfig {
    fn default() -> Self {
        Self { generation: Generation::new(), datasets: BTreeMap::new() }
    }
}

impl Ledgerable for DatasetsConfig {
    fn is_newer_than(&self, other: &Self) -> bool {
        self.generation > other.generation
    }

    // No need to do this, the generation number is provided externally.
    fn generation_bump(&mut self) {}
}

[Review comment] Hmm, as someone unfamiliar with this code I'm not quite sure what it means here -- I see that there's a …

[Reply] Yeah, the … The "Ledgerable" trait exists to help us write a file to many disks, and assure that we'll read back the latest one, if it exists. Part of this is deciphering: if we failed halfway through a write, and we see two distinct ledgers, which should we use? There are some configurations (basically, in the bootstore only now) that want to bump the generation number on every single write, so this just tracks "what's the latest thing that was written". However, in this case -- and many others, for Nexus-supplied data -- we already have a …
/// Uniquely identifies a disk.
#[derive(
    Debug,

[Review comment] Will adding this tag break any serialization? I'm thinking of code which attempts to deserialize these values, but which is not necessarily running against the same version of the code here.

[Reply] I'm wondering if we should just always require tag in serde impls to ensure forward compatibility, in case any particular variant starts having associated data in the future. (The requirement to use a tag, if any variant has associated data, is imposed by the openapi tooling.)

[Reply] I went down the route of "serialize/deserialize via string", as @sunshowers recommended later.