-
Notifications
You must be signed in to change notification settings - Fork 40
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Serialize blueprints in the database #4899
Conversation
These were performed by nexus when blueprints were in memory, but are now performed by the datastore
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Nice! I checked everything here except the function that writes out the SQL directly. I can come back to this but wanted to get this in now.
.get(&blueprint_id) | ||
.cloned() | ||
.ok_or_else(|| blueprint.not_found()) | ||
self.db_datastore.blueprint_read(opctx, &blueprint).await |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
We could consider making BlueprintMetadata
a resource with its own authz
type and lookup
type so that people could use the usual LookupPath::new(..).blueprint_id(blueprint_id).fetch().await
API instead of a function like this. I'm not sure if that's a good idea and it's definitely not important now.
nexus/db-model/src/deployment.rs
Outdated
#[derive(Queryable, Clone, Debug, Selectable, Insertable)] | ||
#[diesel(table_name = bp_target)] | ||
pub struct BpTarget { | ||
pub version: i64, // i64 only for db serialization; should never be negative |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Should this use one of the SqlU*
types instead?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Sure, switched to SqlU32
. I wasn't 100% sure that was big enough to never worry about exhausting it, but I think I've convinced myself it's fine. Even if runaway automation set a new target every second (which seems pretty unlikely, since that first requires generating and inserting a new blueprint in its entirety), u32 is fine for 100+ years.
// Error messages generated from the above sentinel values. | ||
const NO_SUCH_BLUEPRINT_ERROR_MESSAGE: &str = | ||
"could not parse \"no-such-blueprint\" as type uuid: \ | ||
uuid: incorrect UUID length: no-such-blueprint"; | ||
const PARENT_NOT_TARGET_ERROR_MESSAGE: &str = | ||
"could not parse \"parent-not-target\" as type uuid: \ | ||
uuid: incorrect UUID length: parent-not-target"; |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I am not sure these are stable messages. But I guess as long as we test the behavior, we should be fine.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Agreed on both counts. FWIW this is not a new technique; I cribbed it from
omicron/nexus/db-queries/src/db/queries/network_interface.rs
Lines 89 to 94 in d5dace6
// Error message generated when we're attempting to operate on an instance, | |
// either inserting or deleting an interface, and that instance does not exist | |
// at all or has been destroyed. These are the same thing from the point of view | |
// of the client's API call. | |
const NO_INSTANCE_ERROR_MESSAGE: &'static str = | |
"could not parse \"no-instance\" as type uuid: uuid: incorrect UUID length: no-instance"; |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Epic work @jgallagher. I particularly like the tests!
pub bp_nic_id: Option<Uuid>, | ||
pub dns_gz_address: Option<ipv6::Ipv6Addr>, | ||
pub dns_gz_address_index: Option<SqlU32>, | ||
pub ntp_ntp_servers: Option<Vec<String>>, |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Is ntp_ntp
intentional?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
It is - these are NTP zone's NTP servers.
use uuid::Uuid; | ||
|
||
#[derive(Debug)] | ||
pub(crate) struct OmicronZone { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Just a question here, nothing to do in response.
It looks like this is essentially flattening the OmicronZoneType
enum into a struct full of options. Is this because it's not really possible to represent ZoneType
in the DB without flattening?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
It's not that it's not possible, it's just that it's a lot of work with dubious payoff. At one point I think Dave had 3 different impls at least partly done / sketched out: fully (or mostly) normalized, this one, and "just stuff JSON in there". This seemed like the most reasonable one: we get almost all the benefits of putting this into the DB schema, without all the diesel pain of working with a dozen tables.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Ah, got it. This is what you guys were discussing the other day. Thanks.
ensure!(expected_id == nic_row.id, "caller provided wrong NIC"); | ||
Ok(nic_row.into_network_interface_for_zone(self.id)?) | ||
} | ||
(None, None) => Err(anyhow!( |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Long rambling nit below:
I was confused by the (None, None)
case, even though it's documented well in the block comment. I think this was partially my brain not functioning when reviewing yesterday and it became clear when I read the rest of the function.
However, while I like the pattern I still find it somewhat confusing to mix an early return of a result with an error that will be returned later. I think this could be made clearer if you inlined a comment for the (None, None)
case. Maybe that's just moving the last paragraph of the block comment and slightly rewording it. While I typically dislike inlined comments like this for small match statements it may work better here. I'm interested in other's opinions.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Added an inline comment in 0c14431
) { | ||
let conn = datastore.pool_connection_for_tests().await.unwrap(); | ||
|
||
macro_rules! query_count { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
From embedded closures to embedded macros, the John Gallagher story.
This replaces the in-memory blueprint storage added as a placeholder in #4804 with cockroachdb-backed tables. Both the tables and related queries are heavily derived from the similar tables in the inventory system (particularly serializing omicron zones and their related properties). The tables are effectively identical as of this PR, but we opted to keep the separate because we expect them to diverge some over time (e.g., inventory might start collecting additional per-zone properties that don't exist for blueprints, such as uptime).
The big exception to "basically the same as inventory" is the
bp_target
table which tracks the current (and past) target blueprint. Inserting into this table has some subtleties, and we use a CTE to check and enforce the invariants. This is the first diesel/CTE I've written; it's based on other similar CTEs in Nexus, but I'd still appreciate a particularly careful look there.Fixes #4793.