
Optimizations to improve dataset ensuring performance #7236

Merged

merged 13 commits into main from sluggish-datasets on Dec 13, 2024
Conversation

smklein (Collaborator) commented Dec 12, 2024

Major part of #7217

Optimizations made:

  • zfs get is queried once, so properties are sampled for all datasets of interest ahead of time; no subsequent processes are exec'd for datasets that need no changes.
  • The "dataset ensure" process is now concurrent

These optimizations should significantly improve the latency of the "no (or few) changes necessary" cases.
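As a rough illustration of the concurrency change (a sketch only; `DatasetConfig` and `ensure_one` are hypothetical stand-ins rather than the real sled-storage types), the ensure step can kick off every dataset at once and await them together:

// Sketch, not the PR's actual code: ensure datasets concurrently rather
// than one at a time. `DatasetConfig` and `ensure_one` are hypothetical
// stand-ins for the real sled-storage types.
use futures::future::join_all;

struct DatasetConfig {
    name: String,
}

async fn ensure_one(config: DatasetConfig) -> Result<(), String> {
    // The real implementation would compare pre-fetched properties and only
    // exec `zfs` when this dataset actually needs a change.
    println!("ensuring {}", config.name);
    Ok(())
}

async fn ensure_all(configs: Vec<DatasetConfig>) -> Vec<Result<(), String>> {
    // Start every ensure at once and wait for all of them to finish.
    join_all(configs.into_iter().map(ensure_one)).await
}

fn main() {
    let configs = vec![
        DatasetConfig { name: "example/dataset-a".to_string() },
        DatasetConfig { name: "example/dataset-b".to_string() },
    ];
    let results = futures::executor::block_on(ensure_all(configs));
    assert!(results.iter().all(|r| r.is_ok()));
}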

Optimizations still left to be made:

  • Making blueprint execution concurrent, rather than blocking one-sled-at-a-time
  • Patching the illumos_utils "Zfs::ensure_filesystem" to re-use the pre-fetched properties, and minimize re-querying

//
// If one or more datasets don't exist, we can still read stdout to
// learn about the ones that do exist.
let output = cmd.output().map_err(|err| {
smklein (Collaborator, Author):

Here's an easy example to see what's happening:

$ zfs get -o name,property,value used rpool does_not_exist
cannot open 'does_not_exist': dataset does not exist
NAME   PROPERTY  VALUE
rpool  used      256G

"Cannot open 'does_not_exist'" goes to stderr, the rest goes to stdout, and can be parsed.
Since we want to use this API to get as much info as possible about datasets which may or may not exist, it's critical to keep going, even if one or more datasets are missing.
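A minimal sketch of this "parse whatever made it to stdout" approach (illustrative only; the flags and helper names here are assumptions, not the PR's exact code):

use std::process::Command;

// Sketch: query one property for many datasets and parse stdout even if
// some datasets are missing and `zfs get` exits non-zero.
fn get_used(datasets: &[&str]) -> std::io::Result<Vec<(String, String)>> {
    let output = Command::new("zfs")
        // -H: no header, tab-separated output that is easy to split
        .args(["get", "-H", "-o", "name,value", "used"])
        .args(datasets)
        .output()?; // failing to spawn `zfs` at all is still a hard error

    // Deliberately ignore `output.status`: missing datasets only add
    // "cannot open ..." lines to stderr, while every dataset that does
    // exist is still reported on stdout.
    Ok(String::from_utf8_lossy(&output.stdout)
        .lines()
        .filter_map(|line| {
            let mut cols = line.split('\t');
            Some((cols.next()?.to_string(), cols.next()?.to_string()))
        })
        .collect())
}

fn main() -> std::io::Result<()> {
    // Mirrors the example above: one dataset exists, one does not.
    for (name, used) in get_used(&["rpool", "does_not_exist"])? {
        println!("{name}: {used}");
    }
    Ok(())
}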

Contributor:

Ignoring any errors and moving forward with parsing stdout seems pretty dicey. I assume there are lots of ways zfs get could fail that are more serious than being asked about datasets that don't exist. Do we know it's okay to ignore all of those too?

(To be clear: I'm not sure what the alternative is. I guess we could do something like "on error, try to parse stderr, and if it's only dataset does not exist logs then proceed"? Or maybe we could call zfs get without listing any datasets at all, then filter the returned list ourselves to match which?)

smklein (Collaborator, Author):

Yeah, I don't know how to proceed here. The exit code is "1", so there isn't much information to glean from this.

I looked through the zfs get flags and didn't see anything that allows this omission. I think my preference with these tools would be to rely on other mechanisms to detect problems with the datasets, and leave the listing as a best-effort thing.

Namely: If a dataset that should exist actually doesn't, for any reason, we'll try to re-create it later as a part of blueprint execution. Then we'll see errors at "ensure" time, rather than "listing" time.

> I guess we could do something like "on error, try to parse stderr, and if it's only dataset does not exist logs then proceed"?

This is possible - to be clear, would this proposal be:

  • Call zfs get on several datasets (like this PR is currently doing)
  • If we see the command fail, parse stderr
  • Only treat it as "OK" if stderr contains the lines `cannot open 'XXX': dataset does not exist` for one of our datasets?

Small quirk here -- if this failed for any datasets we're trying to query (e.g., maybe a zpool is dying?) it would stop inventory collection for the entire sled. So at the end of the day, I think we need to proceed, parsing what we can, from the set of all datasets we can observe.
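For reference, a rough sketch of that stderr check (illustrative only; this is not the approach the PR ends up taking):

// Sketch of the alternative: treat a failed `zfs get` as acceptable only if
// every non-empty stderr line is a "dataset does not exist" complaint.
fn only_missing_dataset_errors(stderr: &str) -> bool {
    stderr
        .lines()
        .filter(|line| !line.trim().is_empty())
        .all(|line| line.contains("dataset does not exist"))
}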

> Or maybe we could call zfs get without listing any datasets at all, then filter the returned list ourselves to match which?

I also considered this, and would be somewhat okay with this approach. My only beef is that it would need to be fully recursive to be able to see everything, so we'd be e.g. listing all crucible datasets every time -- and that means there could be a lot of data here.

For example, here are the results on a "non-pathological system" -- dogfood:

pilot host exec -c "zfs get -o name,value name | wc -l" {7..25}
 7  BRM27230045        ok: 108
 8  BRM44220011        ok: 4364
 9  BRM44220005        ok: 4190
10  BRM42220009        ok: 263
11  BRM42220006        ok: 4298
12  BRM42220057        ok: 4463
13  BRM42220018        ok: 21
14  BRM42220051        ok: 4281
16  BRM42220014        ok: 4397
17  BRM42220017        ok: 4286
21  BRM42220031        ok: 4315
23  BRM42220016        ok: 4379

Contributor:

> This is possible - to be clear, would this proposal be:
>
> * Call `zfs get` on several datasets (like this PR is currently doing)
>
> * If we see the command fail, parse stderr
>
> * Only treat it as "OK" if stderr contains the lines `cannot open 'XXX': dataset does not exist` for one of our datasets?
>
> Small quirk here -- if this failed for any datasets we're trying to query (e.g., maybe a zpool is dying?) it would stop inventory collection for the entire sled. So at the end of the day, I think we need to proceed, parsing what we can, from the set of all datasets we can observe.

That is what I was proposing, yes, but your quirk is a great point.

> I also considered this, and would be somewhat okay with this approach. My only beef is that it would need to be fully recursive to be able to see everything, so we'd be e.g. listing all crucible datasets every time -- and that means there could be a lot of data here.

That's certainly not awesome.

I think I'm inclined to go with what you have (i.e., ignore errors and parse what we can), although in the back of my mind I'm worried that's going to bite us eventually. Is there any chance (not on this PR) we could get changes to the zfs CLI or a ZFS API that would let us do what we want to do here with less ambiguity?

Collaborator:

Would it be possible/reasonable here to specify a parent dataset that does exist and use -r to get all its children? Then we shouldn't expect any errors.

smklein (Collaborator, Author):

> Would it be possible/reasonable here to specify a parent dataset that does exist and use -r to get all its children? Then we shouldn't expect any errors.

I think this bumps into the problem I mentioned about recursive children - we do want to know about e.g. crucible, but using -r will list all regions, and all properties on all regions, which is in the ballpark of 10,000 - 100,000 lines of output.

// Gather properties about these datasets, if they exist.
//
// This pre-fetching lets us avoid individually querying them later.
let old_datasets = Zfs::get_dataset_properties(
smklein (Collaborator, Author):

This query, as well as the logic below, is the primary optimization made by this PR.

We call zfs get once for all datasets we care about, and if no changes are necessary, that's the only process created.
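As a rough illustration of the shape of that optimization (hypothetical types, not the actual illumos_utils API): fetch properties for every dataset of interest in a single `zfs get`, then only do per-dataset work for the ones whose observed state differs from the desired config.

use std::collections::BTreeMap;

// Hypothetical stand-ins for the real types used in this PR.
struct DatasetProperties {
    quota: Option<u64>,
}
struct DatasetConfig {
    name: String,
    quota: Option<u64>,
}

// One up-front query populates `observed`; after that, only datasets that
// are missing or different need any further `zfs` invocations.
fn datasets_needing_changes<'a>(
    desired: &'a [DatasetConfig],
    observed: &BTreeMap<String, DatasetProperties>,
) -> Vec<&'a DatasetConfig> {
    desired
        .iter()
        .filter(|config| match observed.get(&config.name) {
            // Dataset exists and already matches: nothing to exec.
            Some(props) if props.quota == config.quota => false,
            // Missing or different: this one needs `zfs create` / `zfs set`.
            _ => true,
        })
        .collect()
}

fn main() {
    let desired = vec![DatasetConfig {
        name: "example/dataset-a".to_string(),
        quota: Some(1 << 30),
    }];
    let mut observed = BTreeMap::new();
    observed.insert(
        "example/dataset-a".to_string(),
        DatasetProperties { quota: Some(1 << 30) },
    );
    // Everything already matches, so no additional processes would be spawned.
    assert!(datasets_needing_changes(&desired, &observed).is_empty());
}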

smklein requested a review from jgallagher, December 12, 2024 02:08
jgallagher (Contributor) left a comment:

Thanks for picking this up! Having to go through a CLI is pretty rough to begin with; "now change how we do it so it's fast" is a tall order.

.filter(|(_prop, source)| {
// If a quota has not been set explicitly, it has a default
// source and a value of "zero". Rather than parsing the value
// as zero, it should be ignored.
Contributor:

Was this comment wrong before? (What is `source` in the two `-`/`0` cases now checked?)

smklein (Collaborator, Author):

The comment was kinda right (the default value for new datasets is zero + default), but incomplete. There are other cases where a "quota set to none" shows up.

Either way, I think this code is now more holistically correct: regardless of source, we should treat "0" / "-" as "no quota has been set" (and the same for reservation).
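A small sketch of that interpretation (hypothetical helper, not the PR's exact code): whatever the property's source, a raw value of "0" or "-" maps to "no quota/reservation set":

// Sketch: parse a quota/reservation value, treating both "0" and "-" as
// "not set" regardless of the property's source.
fn parse_quota(raw: &str) -> Option<u64> {
    match raw {
        "-" | "0" => None,
        other => other.parse::<u64>().ok(),
    }
}

fn main() {
    assert_eq!(parse_quota("-"), None);
    assert_eq!(parse_quota("0"), None);
    assert_eq!(parse_quota("1073741824"), Some(1_073_741_824));
}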

smklein (Collaborator, Author) commented Dec 12, 2024:

Thanks for all the feedback! I think I've addressed all the comments in the latest couple of commits.

smklein enabled auto-merge (squash), December 13, 2024 20:54
smklein merged commit de8bc93 into main, Dec 13, 2024
17 checks passed
smklein deleted the sluggish-datasets branch, December 13, 2024 23:14