-
Notifications
You must be signed in to change notification settings - Fork 40
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
update external DNS during blueprint execution #5212
Conversation
This reverts commit c7fad86.
…igurator-external-dns
nexus/src/app/deployment.rs
Outdated
@@ -122,7 +123,7 @@ impl super::Nexus { | |||
&sled_rows, | |||
&zpool_rows, | |||
&ip_pool_range_rows, | |||
NEXUS_REDUNDANCY, | |||
NEXUS_REDUNDANCY + 1, // XXX-dap |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I'm planning to remove this before integration -- this is just for testing.
internal_dns_version INT8 NOT NULL, | ||
-- identifies the latest external DNS version when blueprint planning began | ||
-- XXX-dap migration code must set the value for existing blueprints | ||
external_dns_version INT8 NOT NULL |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I will obviously fix this before integration but wanted to save it for the last step. The plan will be to do something similar to internal_dns_version (schema version 36.0.0), setting the external DNS version to 1 for existing blueprints.
@@ -1177,6 +1180,7 @@ mod tests { | |||
let blueprint = BlueprintBuilder::build_initial_from_collection( | |||
&collection, | |||
Generation::new(), | |||
Generation::new(), |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Definitely not a blocker on this PR: This makes me wonder if we should have typed generations like we're getting typed UUIDs.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
That does seem good.
let err = OptionalError::new(); | ||
let conn = self.pool_connection_authorized(opctx).await?; | ||
self.transaction_retry_wrapper("nexus_external_addresses") | ||
.transaction(&conn, |conn| { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I assume the transaction was here to (attempt to?) ensure we got a consistent view of DNS zones + external IPs; does that not matter in practice?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I don't think it should matter? The list of DNS zones cannot actually change today at all. If it did, I think you'd already have a problem because we're constructing a list of DNS names based on those zones, and that could become out of date by the time we try to write the DNS changes. The set of external IPs could, too, though only via Reconfigurator. Eventually I think we want to have even the Silo DNS changes happen through Reconfigurator, and then we'll know it will at least converge to correct. Anyway, since the list of DNS zones cannot change today, I think it makes sense to do this and revisit if/when we support that.
@@ -45,6 +47,7 @@ use ref_cast::RefCast; | |||
use uuid::Uuid; | |||
|
|||
/// Filter a "silo_list" query based on silos' discoverability | |||
#[derive(Clone, Copy)] |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
maybe while we're here:
#[derive(Clone, Copy)] | |
#[derive(Debug, Clone, Copy)] |
let dendrite_port = | ||
cptestctx.dendrite.get(&switch_location).unwrap().port; | ||
let mgd_port = cptestctx.mgd.get(&switch_location).unwrap().port; | ||
overrides.override_switch_zone_ip(sled_id, ip); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Could all the override_*
methods also be #[cfg(test)]
, and/or not be pub
?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Done in 7028e0b.
/// structure is empty and returns the default (production) values. The test | ||
/// suite overrides these values. | ||
#[derive(Debug, Default)] | ||
pub struct Overridables { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Just thinking out loud, not requesting a change: All these fields are related to the switch zone. If we were to add other kinds of test overrides, would it get unwieldy if we stashed them in here too? Wondering if this would be clearer if it were named something like SwitchZoneAddresses
. Just Overrideables
implies to me that non-tests might also want to override these values, but really we're passing in the switch zone addresses with a way for tests only to set them to unusual values, right?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Yeah, your summary is correct. I also thought about calling it TestOverridables
though without context it didn't seem clearer. SwitchZoneAddresses
is okay though without context I usually think "address" means "IP address" and not TCP ports.
I'm not sure the problem is intrinsically related to the switch zone. It's anything whose addresses or ports we cannot determine from the configuration that's available. It's not a problem for Omicron zones because their addresses and ports are part of the blueprint configuration. It is a problem for the switch zone components because they're not part of that config. In principle there could be other things in that bucket. If we needed to know how to reach the internal DNS servers, we'd almost need this (because in real systems those are at fixed addresses), but we don't because they happen to be in OmicronZoneConfig too. Basically it would be anything that we run on some fixed, known address/port in production, but that runs in the test suite at some dynamic host/port.
I'm indifferent among these names. OverridablesForTest
? Though the actual production values are behind that object, too.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I don't love OverrideablesForTest
for the reason you mentioned, although maybe it'd work better if we added a realize_blueprint_with_test_overrides
? Change realize_blueprint
to not take this argument at all, but instead call straight through to realize_blueprint_with_test_overrides
with Default::default()
?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I like the change to realize_blueprint
and I've done that in 7028e0b.
let external_dns_config_blueprint = blueprint_external_dns_config( | ||
blueprint, | ||
&nexus_external_ips, | ||
&silos.iter().collect::<Vec<_>>(), |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Tiny nit - collecting another Vec
of silos seems weird when we just did that on line 76. Could blueprint_external_dns_config
take &[Silo]
instead? Or maybe even impl Iterator<...>
s instead of slices so we don't have to collect at all?
@@ -315,40 +437,70 @@ fn dns_compute_update( | |||
|
|||
#[cfg(test)] | |||
mod test { | |||
use super::blueprint_dns_config; | |||
use super::blueprint_internal_dns_config; |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Nothing wrong with this, but I suspect if you used super::*
you could drop a bunch of these duplicates.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
If that's considered more idiomatic I can switch to that but if it's an arbitrary choice I really prefer being explicit instead. I've gotten surprisingly burned by trying to move code out of a block that used use super::*
. It's not a big deal though.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I think it probably is? use super::*
shows up over 100 times in omicron, and I have it in my "generate a mod tests
" snippet. But I don't feel strongly about this at all.
.await | ||
.expect("fetching initial external DNS"); | ||
|
||
// Now, use it to construct an initial blueprint. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Really tiny nitpick: "use it" is referring back to the collection, not the DNS contents we just fetched, right? Could we swap these two chunks?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I kept the order to emphasize/ensure that we're collecting state before we change anything, but I have fixed the comment.
assert_eq!(new_zones.len(), 1); | ||
let new_zone_id = *new_zones[0]; | ||
|
||
// Set this blueprint as the current target. We set it to disabled |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Similar tiny nit as above, with apologies: "this blueprint" sounds like blueprint2
(the one we just made), but first we have to set blueprint
as the target. Could we set blueprint
up above after we create, then keep the comment and just deal with blueprint2
here?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Done in 7028e0b.
/// _within_ the external DNS zone (i.e., without that zone's suffix) | ||
/// | ||
/// This specific naming scheme is determined under RFD 357. | ||
pub fn silo_dns_name(name: &omicron_common::api::external::Name) -> String { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Can this go above the mod test
block?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
👍
Depends on #5239.
Fixes #5068 and #5220.