-
Notifications
You must be signed in to change notification settings - Fork 40
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Skip test_disk_usage
if there's no /usr/bin/du
#6479
Conversation
The `zone_bundle::disk_usage` function in sled-agent executes `du(1)` to measure disk usage for a path. Presently, the path to the `du(1)` executable is hard-coded to `/usr/bin/du`. On some systems, such as NixOS, the `du(1)` executable may not live at the path `/usr/bin/du`. We could, potentially, change the `disk_usage` function to find a `du(1)` binary on the PATH, or fall back to other locations...but, on Helios, it is always at `/usr/bin/du`, so hardcoding it is a good solution for the environment that sled-agents are actually intended to run in. However, hardcoding the path means that this test will always fail on systems where there is no `/usr/bin/du`. Therefore, this commit changes the test to first check if `/usr/bin/du` exists, and return early if it does not, printing a warning indicating that the test is skipped. If, someday, Rust's libtest grows a mechanism for tests to mark themselves as skipped at runtime dynamically (rather than using the `#[ignore]` attribute at compile-time), we should probably switch to using that hypothetical mechanism here. See this [internals.rust-lang.org thread][1]. If folks aren't happy with making the test skip itself, we *could* change the implementation to worth with `du(1)` being at a different location by finding it from the PATH, potentially falling back to `/usr/bin/du` if it isn't on the PATH. My thought process was that it was better to avoid complicating the production sled-agent code just to support environments that the software will never actually run on for test purposes, but I'm open to being convinced otherwise. [1]: https://internals.rust-lang.org/t/pre-rfc-skippable-tests/14611
Another option is changing the dev shell environment provided by the Nix flake to use |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
With the caveat that I know zilch about NixOS, I approve!
if !Utf8Path::new(DU).exists() { | ||
eprintln!("No {DU} executable found, skipping disk usage test!"); | ||
return; | ||
} |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Could we put this behind a #[cfg(target_os = "linux")]
? Super conservative, but would make me slightly more comfortable :)
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Sure, although, perhaps we want #[cfg(not(target_os = "illumos"))]
? Would we also want this behavior on MacOS/Windows?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I was just trying to preserve the existing behavior on the largest number of systems we already run the test on, that's all. Implementing du
in Rust is also fine to me, just sounds like more work, but there may be a good library for it.
Would we consider implementing the |
Actually, here's a thought for another potential alternative: @sunshowers, is there a way to tell Nextest to skip a particular test using an environment variable? The Nix dev shell could set an env var that disables this test, so that it's only skipped on NixOS and you could get a message saying it was skipped, instead of a "PASS"... |
That would certainly be the ideal approach, especially given that there's already code for selecting different omicron/sled-agent/src/zone_bundle.rs Lines 1554 to 1575 in b78cbab
It would be a bit more work just to fix a test that doesn't work on my weird OS for insane people, but it also seems like a better solution in the long run; certainly less flaky than shelling out. On systems where there's a |
I'm gonna take a crack at writing a pure-Rust |
Long requested but not currently supported :( |
Currently, the `zone_bundle::disk_usage` function in sled-agent shells out to `du(1)` and parses its output. This works fine, but it's a bit flaky, especially when running the corresponding tests on "weird" platforms such as NixOS, where the `du(1)` binary doesn't live in `/usr/bin/du` (see #6479). This commit replaces the code that shells out to `/usr/bin/du` with a pure-Rust rewrite. There's a unit test that asserts that the disk usage calculated by the new implementation matches the value calculated by `du(1)`, with code for handling the potential inaccuracy due to `du(1)` operating in terms of 512B blocks on macOS. Closes #6479
@davepacheco #6482 replaces the use of |
FWIW, I think this might be possible by adding a "nixos" profile to the nextest config file that skips the test, and having the devshell set |
That would work but is kind of a big hammer, and the default set would need to stay up-to-date across profiles. I like the RiiR idea too. |
The
zone_bundle::disk_usage
function in sled-agent executesdu(1)
tomeasure disk usage for a path. Presently, the path to the
du(1)
executable is hard-coded to
/usr/bin/du
. On some systems, such asNixOS, the
du(1)
executable may not live at the path/usr/bin/du
. Wecould, potentially, change the
disk_usage
function to find adu(1)
binary on the PATH, or fall back to other locations...but, on Helios, it
is always at
/usr/bin/du
, so hardcoding it is a good solution for theenvironment that sled-agents are actually intended to run in.
However, hardcoding the path means that this test will always fail on
systems where there is no
/usr/bin/du
. Therefore, this commit changesthe test to first check if
/usr/bin/du
exists, and return early if itdoes not, printing a warning indicating that the test is skipped.
If, someday, Rust's libtest grows a mechanism for tests to mark
themselves as skipped at runtime dynamically (rather than using the
#[ignore]
attribute at compile-time), we should probably switch tousing that hypothetical mechanism here. See this
internals.rust-lang.org thread.
If folks aren't happy with making the test skip itself, we could
change the implementation to worth with
du(1)
being at a differentlocation by finding it from the PATH, potentially falling back to
/usr/bin/du
if it isn't on the PATH. My thought process was that itwas better to avoid complicating the production sled-agent code just to
support environments that the software will never actually run on for
test purposes, but I'm open to being convinced otherwise.