-
Notifications
You must be signed in to change notification settings - Fork 40
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
[nexus] put DNS servers in DNS, so you can DNS while you DNS #5033
Conversation
after a bunch of messing with it i made the executive decision to just get rid of that test. the test for DNS propagation exercises it, and it's harder to test now
I think this might have gotten cut-off!
I think it's wise to defer the deletion to a separate branch, just to keep things decoupled. Happy with this direction though. |
// TODO(eliza): it would be nicer if `Resolver` had a method | ||
// returning an iterator instead of a `Vec`, so we didn't | ||
// have to drain the Vec and then collect it into a new | ||
// one... |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Should be very doable, though it doesn't need to be part of this PR. The lookup_all_ipv6
method is a convenience function within omicron, acting on an iterator type internally, which we could totally return.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
yeah, i might do that, although i didn't think it was particularly important to minimize allocation overhead in the background task --- this code doesn't seem particularly hot.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The lookup methods on Resolver
could definitely use a fresh pass. But agreed, it's not a blocker here
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
@@ -467,29 +476,70 @@ pub mod test { | |||
SocketAddr::V4(_) => panic!("expected v6 address"), | |||
SocketAddr::V6(a) => a, | |||
}; | |||
|
|||
let update = { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Since DNS is now both the thing that configures us and the thing that we're configuring, I feel like this could use a comment. Maybe something like:
let update = { | |
// In order to test that DNS gets propagated to a newly-added server, we | |
// first need to update the source of truth about DNS (the database). | |
// Then we need to wait for that to get propagated (by this same | |
// mechanism) to the existing DNS servers. Only then would we expect | |
// the mechanism to see the new DNS server and then propagate | |
// configuration to it. | |
let update = { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
good call, I'll add that!
In terms of #4987: I agree with not removing all of the |
Depends on #5033. As described in #4947, we would like to remove the `services` table, as its only remaining use is DNS propagation. PR #5033 changes DNS propagation to no longer use the `services` table. Now that we're no longer consuming this table, this commit removes the one method which queries that table (`Datastore::services_list_kind`) and its two remaining callers, the OMDB commands that query the table. I've also removed the test for inserting and querying this table. Note that I have *not* yet removed the code in RSS that actually populates this table when the rack is set up. I thought it might be desirable to still populate this data in case we have to roll back to a previous version of Nexus that uses the `services` table. If this isn't something we care about, I can remove that code as well, allowing us to remove the `Datastore` methods for inserting to `services`.
Co-authored-by: David Pacheco <[email protected]>
One option would be removing the |
Currently, the DNS propagation background task in Nexus uses the
services
table to enumerate the list of DNS servers that it's responsible for keeping up to date. However, we'd really like to get rid of saidservices
table (see #4947), and the DNS propagation code is the only remaining user of theservices
table.Therefore, this branch changes how DNS servers are discovered for DNS propagation. Rather than discovering DNS server addresses from the
services
table, theDnsWatcher
background task now discovers DNS servers...using internal DNS. As described in #4889, this may seem like a cyclical dependency, but, because the initial set of internal DNS servers operate at known addresses -- by design -- so that they can always be discovered. And they have to be up and discoverable for Nexus to even come up and find CockroachDB. So, internal DNS can safely be assumed to be up if Nexus has come up at all. Now, theservices
table is no longer used, andThis change breaks the existing tests
nexus::app::background::init::test_dns_propagation_basic
andnexus::app::background::dns_servers::test_basic
. I've rewritten thetest_dns_propagation_basic
test to test the new expected behavior:DnsWatcher
background task then picks up the DNS records for the new DNS server by querying the existing known DNS serverThe
dns_servers::test_basic
test tested the discovery of DNS server addresses from the database. Because these no longer come from the db, and now come from internal DNS, this test would now end up exercising most of the functionality tested intest_dns_propagation_basic
. I didn't think it was necessary to have two tests for this, so I made the judgement call to deletedns_servers::test_basic
. If we think having a more isolated test that exercises only the DNS watcher task and not the DNS propagation task, we could put this back and create records on the DNS server by manually hitting its API with new configs, but I didn't think this was really worth the effort.I've also removed the
Datastore::upsert_service
method, which was used only for test code and is now dead. I considered deleting all code related to querying theservices
table in this branch as well. However, I noticed that it's still populated when initializing the rack, and thatomdb
has commands for querying that table. I wasn't sure if there were alternative data sources for theomdb
debugging commands yet, so I didn't remove them. If the data provided by those commands is available elsewhere, or if their only real purpose is just to print the state of this table, I'm happy to delete them in this branch, as well.Closes #4889