-
Notifications
You must be signed in to change notification settings - Fork 40
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
[sled-agent] Replace du(1)
with a Rust function
#6482
Changes from 2 commits
7894e2b
a265af8
ab6aaf0
b4e8731
772cd16
92718e0
8fe21ff
dcf0764
File filter
Filter by extension
Conversations
Jump to
Diff view
Diff view
There are no files selected for viewing
Original file line number | Diff line number | Diff line change |
---|---|---|
|
@@ -1545,61 +1545,54 @@ async fn compute_bundle_utilization( | |
|
||
// Return the number of bytes occupied by the provided directory. | ||
// | ||
// This returns an error if: | ||
// | ||
// - The "du" command fails | ||
// - Parsing stdout fails | ||
// - Parsing the actual size as a u64 fails | ||
// This returns an error if reading any file or directory within the provided | ||
// directory fails. | ||
async fn disk_usage(path: &Utf8PathBuf) -> Result<u64, BundleError> { | ||
There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. I think it's probably worth considering a different name here, given the function appears to be assessing the apparent size of a tree of files rather than the actual disk usage. I definitely would have guessed the other way based on the name! There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. Ugh, that's a good point --- There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. We probably want There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more.
I don't believe it was, because we were using the There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. Ahh, okay, I can just rename the function, then. Unless we want the other behavior --- I'm not totally sure. @bnaecker, this is your code, right? There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. It is, yeah. I have no preference on the name, that was chosen since it was calling out to There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. 772cd16 renames it to There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. Can that be &Utf8Path btw? I know you didn't touch that specific line There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. It can, yeah — I just didn't want to mess with the surrounding code as part of a quick fix. Happy to go and change it as well. |
||
// Each OS implements slightly different `du` options. | ||
// | ||
// Linux and illumos support the "apparent" size in bytes, though using | ||
// different options. macOS doesn't support bytes at all, and has a minimum | ||
// block size of 512. | ||
// | ||
// We'll suffer the lower resolution on macOS, and get higher resolution on | ||
// the others. | ||
cfg_if::cfg_if! { | ||
if #[cfg(target_os = "illumos")] { | ||
const BLOCK_SIZE: u64 = 1; | ||
const DU_ARG: &str = "-A"; | ||
} else if #[cfg(target_os = "linux")] { | ||
const BLOCK_SIZE: u64 = 1; | ||
const DU_ARG: &str = "-b"; | ||
} else if #[cfg(target_os = "macos")] { | ||
const BLOCK_SIZE: u64 = 512; | ||
const DU_ARG: &str = "-k"; | ||
} else { | ||
compile_error!("unsupported target OS"); | ||
let path = path.clone(); | ||
// We could, alternatively, just implement this using `tokio::fs` to read | ||
There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. I have not looked at this much yet but I wonder if we should just use the There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. ah, i didn't realize that was in our dependency tree, I'll take a look at it. There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. b4e8731 changes this to use There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. Looking at https://github.com/uutils/coreutils/blob/8fe93ded741f8257e9d0a5bce12f904e09227a59/src/uu/du/src/du.rs#L319, at least There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. Also that code's recursive, huh. I'd definitely write this using a heap-allocated stack personally. There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more.
That was what my original implementation did! :) |
||
// the directory and so on. However, the `tokio::fs` APIs are basically just | ||
// a wrapper around `spawn_blocking`-ing code that does the `std::fs` | ||
// functions. So, putting the whole thing in one big `spawn_blocking` | ||
// closure that just does all the blocking fs ops lets us spend less time | ||
// creating and destroying tiny blocking tasks. | ||
tokio::task::spawn_blocking(move || { | ||
let mut sum = 0; | ||
// Rather than recursing, we use a VecDeque as a queue of paths to scan | ||
// next. This is so that if we're walking a very deeply nested directory | ||
// structure, we don't overflow our stack. | ||
// | ||
// TODO(eliza): this could, conceivably, be parallelized by spawning a | ||
// bunch of blocking tasks to walk each sub-directory, instead of using | ||
// a queue. But, is optimizing this really all that important? It's | ||
// probably far from the slowest part of zone bundle collection anyway... | ||
let mut dirs_to_scan = std::collections::VecDeque::new(); | ||
dirs_to_scan.push_back(path.to_owned()); | ||
|
||
while let Some(dir) = dirs_to_scan.pop_front() { | ||
let mk_dir_error = |err| BundleError::ReadDirectory { | ||
directory: dir.clone(), | ||
err, | ||
}; | ||
for entry in std::fs::read_dir(&dir).map_err(mk_dir_error)? { | ||
let entry = entry.map_err(mk_dir_error)?; | ||
let mk_error = |err| BundleError::ReadDirectory { | ||
directory: Utf8PathBuf::try_from(entry.path()) | ||
.unwrap_or_else(|_| path.clone()), | ||
err, | ||
}; | ||
if entry.file_type().map_err(mk_error)?.is_dir() { | ||
let path = entry.path().try_into()?; | ||
dirs_to_scan.push_back(path); | ||
} else { | ||
sum += entry.metadata().map_err(mk_error)?.len() | ||
hawkw marked this conversation as resolved.
Show resolved
Hide resolved
|
||
} | ||
} | ||
} | ||
} | ||
const DU: &str = "/usr/bin/du"; | ||
let args = &[DU_ARG, "-s", path.as_str()]; | ||
let output = Command::new(DU).args(args).output().await.map_err(|err| { | ||
BundleError::Command { cmd: format!("{DU} {}", args.join(" ")), err } | ||
})?; | ||
let err = |msg: &str| { | ||
BundleError::Cleanup(anyhow!( | ||
"failed to fetch disk usage for {}: {}", | ||
path, | ||
msg, | ||
)) | ||
}; | ||
if !output.status.success() { | ||
return Err(err("du command failed")); | ||
} | ||
let Ok(s) = std::str::from_utf8(&output.stdout) else { | ||
return Err(err("non-UTF8 stdout")); | ||
}; | ||
let Some(line) = s.lines().next() else { | ||
return Err(err("no lines in du output")); | ||
}; | ||
let Some(part) = line.trim().split_ascii_whitespace().next() else { | ||
return Err(err("no disk usage size computed in output")); | ||
}; | ||
part.parse() | ||
.map(|x: u64| x.saturating_mul(BLOCK_SIZE)) | ||
.map_err(|_| err("failed to parse du output")) | ||
|
||
Ok(sum) | ||
}) | ||
.await | ||
.expect("spawned blocking task should not be cancelled or panic") | ||
} | ||
|
||
// Return the quota for a ZFS dataset, or the available size. | ||
|
@@ -1659,8 +1652,7 @@ async fn zfs_quota(path: &Utf8PathBuf) -> Result<u64, BundleError> { | |
|
||
#[cfg(test)] | ||
mod tests { | ||
use super::disk_usage; | ||
use super::Utf8PathBuf; | ||
use super::*; | ||
|
||
#[tokio::test] | ||
async fn test_disk_usage() { | ||
|
@@ -1676,6 +1668,98 @@ mod tests { | |
let path = Utf8PathBuf::from("/some/nonexistent/path"); | ||
assert!(disk_usage(&path).await.is_err()); | ||
} | ||
|
||
#[tokio::test] | ||
#[cfg(any( | ||
target_os = "illumos", | ||
target_os = "linux", | ||
target_os = "macos" | ||
))] | ||
async fn test_disk_usage_matches_du() { | ||
// Each OS implements slightly different `du` options. | ||
// | ||
// Linux and illumos support the "apparent" size in bytes, though using | ||
// different options. macOS doesn't support bytes at all, and has a minimum | ||
// block size of 512. | ||
// | ||
// We'll suffer the lower resolution on macOS, and get higher resolution on | ||
// the others. | ||
cfg_if::cfg_if! { | ||
if #[cfg(target_os = "illumos")] { | ||
const BLOCK_SIZE: u64 = 1; | ||
const DU_ARG: &str = "-A"; | ||
} else if #[cfg(target_os = "linux")] { | ||
const BLOCK_SIZE: u64 = 1; | ||
const DU_ARG: &str = "-b"; | ||
} else if #[cfg(target_os = "macos")] { | ||
const BLOCK_SIZE: u64 = 512; | ||
const DU_ARG: &str = "-k"; | ||
} else { | ||
compile_error!("unsupported target OS"); | ||
} | ||
} | ||
|
||
async fn disk_usage_du(path: &Utf8PathBuf) -> Result<u64, BundleError> { | ||
const DU: &str = "du"; | ||
let args = &[DU_ARG, "-s", path.as_str()]; | ||
let output = | ||
Command::new(DU).args(args).output().await.map_err(|err| { | ||
BundleError::Command { | ||
cmd: format!("{DU} {}", args.join(" ")), | ||
err, | ||
} | ||
})?; | ||
let err = |msg: &str| { | ||
BundleError::Cleanup(anyhow!( | ||
"failed to fetch disk usage for {}: {}", | ||
path, | ||
msg, | ||
)) | ||
}; | ||
if !output.status.success() { | ||
return Err(err("du command failed")); | ||
} | ||
let Ok(s) = std::str::from_utf8(&output.stdout) else { | ||
return Err(err("non-UTF8 stdout")); | ||
}; | ||
let Some(line) = s.lines().next() else { | ||
return Err(err("no lines in du output")); | ||
}; | ||
let Some(part) = line.trim().split_ascii_whitespace().next() else { | ||
return Err(err("no disk usage size computed in output")); | ||
}; | ||
part.parse() | ||
.map(|x: u64| x.saturating_mul(BLOCK_SIZE)) | ||
.map_err(|_| err("failed to parse du output")) | ||
} | ||
|
||
let path = | ||
Utf8PathBuf::from(concat!(env!("CARGO_MANIFEST_DIR"), "/src")); | ||
let t0 = std::time::Instant::now(); | ||
let usage = dbg!(disk_usage(&path).await) | ||
.expect("disk usage for src dir failed"); | ||
eprintln!("disk_usage({path}) took {:?}", t0.elapsed()); | ||
|
||
let t0 = std::time::Instant::now(); | ||
let du_usage = | ||
dbg!(disk_usage_du(&path).await).expect("running du failed!"); | ||
eprintln!("du -s {path} took {:?}", t0.elapsed()); | ||
|
||
// Round down the Rust disk usage result to `du`'s block size on this | ||
hawkw marked this conversation as resolved.
Show resolved
Hide resolved
|
||
// system. | ||
let usage = if BLOCK_SIZE > 1 { | ||
eprintln!("rounding up to `du(1)`'s block size of {BLOCK_SIZE}B"); | ||
dbg!(usage.div_ceil(BLOCK_SIZE) * BLOCK_SIZE) | ||
} else { | ||
usage | ||
}; | ||
|
||
assert_eq!( | ||
usage, du_usage, | ||
"expected `disk_usage({path})` to == `du -s {path}`\n\ | ||
{usage}B (`disk_usage`) != {du_usage}B (`du`)", | ||
); | ||
} | ||
} | ||
|
||
#[cfg(all(target_os = "illumos", test))] | ||
|
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
question: Is there a reason that these are not doc comments?