Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

zone-setup cleanup/refactoring #6015

Closed
wants to merge 3 commits into from
Closed

Conversation

jgallagher
Copy link
Contributor

I need to make some changes to zone-setup to support Reconfigurator rebalancing boundary NTP zones, and found it a little trickier to work with than expected. I think there were two reasons for this:

  • The use of clap's builder interface instead of derive (addressed in 3708774), which requires a surprising amount of boilerplate and custom parsing functions, most of which are now gone
  • Using Result<(), CmdError::Failure> throughout, when really only main cares about that and everywhere else could use anyhow::Result<()> (mostly addressed in cac26dd); I also tried to make sure we always attach some .context() to errors (the old code did this in some cases but not all)

There were some other minor spot-cleanups to remove unnecessary clone/to_string allocations, fixing up long log lines that rustfmt doesn't handle, etc. But I think there should be no material changes that affect omicron at large. One change in the first commit from clap builder to derive is that the --long-form spelling of some command line arguments is different (e.g., what used to be --opte_gateway is now just --gateway), but AFAICT we only ever use the short form (-g in this example, which remained unchanged) in our SMF manifests. I'm hoping either CI or review (or both) will catch any accidental changes!

@jgallagher jgallagher requested a review from karencfv July 8, 2024 19:57
Copy link
Contributor

@karencfv karencfv left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks for cleaning this up! I love how tidy it got.

I have a single concern though, the self assembling switch zone PR makes significant changes/additions to the zone-setup CLI https://github.com/oxidecomputer/omicron/pull/5593/files#diff-9297c246fb20e267fcf814402399019ea520a0e889c86aa01dba2eda7518c7c7

I'm waiting to merge this once release 9 is out the door, and to be frank I'm a little afraid of it falling into merge/rebase hell. Would it be too annoying to hold this until that PR gets merged?

#[arg(short, long, value_parser = parse_string_rejecting_unknown)]
interface: String,
#[arg(short, long)]
gateway: Ipv4Addr,
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

My intention with naming the flag --opte-gateway instead of --gateway was to make it clear that this is a different "gateway" that the one we use for the common networking setup command. I'm OK changing the flag itself, but perhaps we can add a flag description that mentions this?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Hmm. I thought the --opte- prefixes on all of these arguments was superfluous since they're only valid to the opte-interface subcommand. I'm 100% on board with adding descriptions, if you think that's enough; it's also easy to put the prefixes back if you think that would be better.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

There weren't any descriptions on these flags before - what should they be? 😅

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Hmm. I thought the --opte- prefixes on all of these arguments was superfluous since they're only valid to the opte-interface subcommand

I totally agree with this. I guess this all comes from my own experience working with the zones where several things were called "gateway" but they all meant different things and it was a bit confusing to know which went where. The quickest solution I found was to differentiate this one by just calling it "opte-gateway", but I totally get that it looks weird as a flag.

I'm not specially attached to any solution, but perhaps keeping it as gateway and writing a long description with something simple like gateway is unique to OPTE as it is for external connectivity using boundary services makes sense? Happy to go with whatever you think is best

Comment on lines +39 to +41
// Some of our `String` cli arguments may be set to the literal "unknown" if
// sled-agent hasn't populated our SMF config correctly. Reject this at argument
// parsing time.
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This isn't always the case unfortunately. Especially when dealing with the switch zone

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'm not sure where you're going with this - is the comment and/or parser wrong or need more details?

Copy link
Contributor

@karencfv karencfv Jul 10, 2024

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Hm, I think I was thinking we were rejecting all "unknown"s everywhere, but I see that's not the case. We can ignore this :)

args: ChronySetupArgs,
log: &Logger,
) -> anyhow::Result<()> {
println!("running chrony setup: {args:?}");
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Ha! looks like this was a bug from the old code 😅

We should be logging this as info!() like the other ones

let str_line =
format!("pool {} iburst maxdelay 0.1 maxsources 16\n", s);
contents.push_str(&str_line)
writeln!(
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

ah yes, this is much nicer than what I had here before

@jgallagher
Copy link
Contributor Author

I'm waiting to merge this once release 9 is out the door, and to be frank I'm a little afraid of it falling into merge/rebase hell. Would it be too annoying to hold this until that PR gets merged?

Nope - no objection to holding this. It's the first of several stacked PRs for boundary NTP work, most of which aren't done yet anyway!

Comment on lines +52 to +57
#[arg(short, long, value_parser = parse_string_rejecting_unknown)]
datalink: String,
#[arg(short, long)]
gateway: Ipv6Addr,
#[arg(short, long)]
static_addr: Ipv6Addr,
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'm pretty sure some of these flags (and in the other commands as well) are required.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

All the flags are still required, except for --allow for chrony-setup (because its type is Option<...>).

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Ahhhh gotcha gotcha, I think I was expecting a required = true, but if just setting without Option<...> has the same result, then it's all good

@karencfv
Copy link
Contributor

Nope - no objection to holding this. It's the first of several stacked PRs for boundary NTP work, most of which aren't done yet anyway!

Awesome, thanks!

@jgallagher
Copy link
Contributor Author

This needed several changes after the switch zone work (and I've let it sit for quite a while!); I think it'll be cleaner if I close this and reopen it for a new review.

@jgallagher jgallagher closed this Jul 29, 2024
jgallagher added a commit that referenced this pull request Jul 31, 2024
(This is a rebased and expanded version of #6015 now that the
self-assembling switch zone has landed.)

I need to make some changes to zone-setup to support Reconfigurator
rebalancing boundary NTP zones, and found it a little trickier to work
with than expected. I think there were two reasons for this:

* The use of clap's builder interface instead of derive (addressed in
3708774),
which requires a surprising amount of boilerplate and custom parsing
functions, most of which are now gone
* Using Result<(), CmdError::Failure> throughout, when really only main
cares about that and everywhere else could use anyhow::Result<()>
(mostly addressed in
cac26dd);
I also tried to make sure we always attach some .context() to errors
(the old code did this in some cases but not all)

I also made a cleanup pass through the new switch zone user setup code;
the biggest win there was reusing `illumos_utils::execute()` to check
for success and build detailed error messages on failure. There are some
other minor changes like long flags with different names (AFAICT we only
ever use the short flags in SMF manifests, so this shouldn't require
other changes) and potential bugfixes (e.g., the switch zone setup code
was only correct for 0 or 1 profile, and should now be correct for 2+ if
we ever need more than one).

While all these _should_ be correct, I want to run this through madrid
before merging. I think it's ready for review, though; will update here
once I do the madrid test.
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

2 participants