-
Notifications
You must be signed in to change notification settings - Fork 40
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
deps unexpectedly rebuilt after running cargo nextest run
differently
#4392
Comments
One problem is that I hadn't run
I don't know if that's enough to cause this. |
Doubt that to be honest -- I don't think it's enough to cause this. |
I noticed a similar thing today, just in one
I would have expected |
Investigation resultsOK, I spent some time looking at this and there are two big things that pop out: 1. Excluding xtask from hakariWe exclude the 2. Host and target deps built with separate optionsOne of the promises hakari makes is to be able to unify builds across host (build) and target (normal/dev) deps. This sadly doesn't work for us at all. That's because we build target (normal/dev) deps and host (build) deps with separate sets of options:
Either of the two options above would make a difference for compile times. The number of units (things to build, what shows up in the progress bar in Expand for diffdiff --git a/Cargo.toml b/Cargo.toml
index b18b20aec..291a1a56d 100644
--- a/Cargo.toml
+++ b/Cargo.toml
@@ -392,12 +392,16 @@ zip = { version = "0.6.6", default-features = false, features = ["deflate","bzip
zone = { version = "0.3", default-features = false, features = ["async"] }
[profile.dev]
-panic = "abort"
# See https://github.com/oxidecomputer/omicron/issues/4009 for some background context here.
# By reducing the debug level (though keeping enough to have meaningful
# backtraces), we reduce incremental build time and binary size significantly.
debug = "line-tables-only"
+# Setting build-override with line-tables-only means that we don't build two
+# copies of dependencies shared at build and runtime
+[profile.dev.build-override]
+debug = "line-tables-only"
+
# `bindgen` is used by `samael`'s build script; building it with optimizations
# makes that build script run ~5x faster, more than offsetting the additional
# build time added to `bindgen` itself. With the command Before: 1230 units (2m 25s fresh build) This is a pretty decent difference. As a third option, we could just accept that we're going to have different builds on the host and target platforms forever, and tell hakari to not try and unify dependencies across the platforms. How I investigatedI used the
Then, I investigated the two outputs, starting from |
OK, so the |
Is the assumption here that the reason we don't get the backtrace is because the generation of a core dump which would have such a backtrace is disabled? |
Ah I was thinking in terms of backtraces printed by Rust itself, e.g. by running |
All of the cases above should now be handled with #4535. Thanks again for all your patience. |
This PR has a few changes that make builds and test runs significantly faster: 1. Remove `xtask` from the list of default-members. This makes it so that `cargo nextest run` and `cargo nextest run -p <package>` use more dependency feature sets in common. 2. Move `opt-level` settings from `profile.test` to `profile.dev`. Again, this results in more cache hits. 3. Set `profile.dev.panic` to `unwind`. This is to unify build units across dev and test builds: tests are always built with `panic = "unwind"` so that proper backtraces can be printed out. Release builds stay as `abort`. 4. For a belt-and-suspenders approach, make the `crdb-seed` script use the `test` profile. If there are any divergences between `dev` and `test` in the future, then crdb-seed should share its build cache with the tests it was presumably invoked for. 5. Set `profile.dev.build-override.debug` to `line-tables-only`. This, along with 3, means that target (normal/dev) and build (host) dependencies are now unified. All of this comes together for a pretty sweet improvement. See #4392 for more details and how I investigated this issue. ## Impact With a fresh build on Linux with mold, I ran three commands in sequence: 1. `cargo nextest run --no-run` 2. `cargo nextest run -p nexus-db-queries` 3. `cargo build -p omicron-nexus` The results were: | **command** | **phase** | **before** | **before, cumul.** | **after** | **after, cumul.** | |-----------------------------------------|-------------------|-----------:|-------------------:|----------:|------------------:| | `cargo nextest run` | build | 173s | 173s | 158s | 158s | | `cargo nextest run -p nexus-db-queries` | build | 61s | 234s | 51s | 209s | | `cargo nextest run -p nexus-db-queries` | `crdb-seed` build | 21s | 255s | 1s | 210s | | `cargo build -p omicron-nexus` | build | 99s | 354s | 69s | 279s | So the cumulative time spent on these three commands went from 354s to 279s. That's a 1.26x speedup. And this should also make other commands better as well (omicron-nexus is a bit of a weird case because it takes a very long time to compile by itself, and that 69s in the "after" column is entirely building omicron-nexus).
I noticed all this working on branch "dap/nexus-inventory-2". Commit 376a37b is sync'd up with "main" from this morning. From there, I ran a full test suite run that failed one test:
I then tried to fix this test, which changed these files:
nexus/db-model/src/schema.rs
,schema/crdb/dbinit.sql
,schema/crdb/9.0.0/
(a whole tree).Then I tried to rerun just the failing test. But it looks like a whole lot of stuff got rebuilt, including a bunch of deps from outside the workspace:
This was unexpected -- I thought hakari was supposed to prevent that. I wondered if mine was out of date. But I think not?
The text was updated successfully, but these errors were encountered: