Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

compile integration tests as a single binary #2914

Merged
merged 1 commit into from
Feb 7, 2024
Merged

Conversation

jyn514
Copy link
Contributor

@jyn514 jyn514 commented Feb 2, 2024

closes #2912

Checklist

If applicable:

  • I have updated CHANGELOG.md
  • I have updated the documentation (README.md, docs/, demos/)
  • I have updated the config schema (cli/src/config-schema.json)
  • I have added tests to cover my changes

@jyn514 jyn514 force-pushed the par-tests branch 2 times, most recently from 3a88e64 to 5daccac Compare February 2, 2024 08:42
Copy link
Contributor

@yuja yuja left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

While this is a bit adhoc, I like the end result.

@arxanas any thoughts?

lib/tests/runner.rs Outdated Show resolved Hide resolved
Copy link
Contributor

@ilyagr ilyagr left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Wow, this is fast 🚀. I might no longer have to be paranoid about linking speed and all the ways phases of the moon that can affect it. LGTM, but I'll let others officially approve.

Aside: I wish Rust more obviously encouraged this style, or defaulted to building one binary for the tests.

cli/tests/test_no_abandoned_test_files.rs Outdated Show resolved Hide resolved
cli/tests/test_no_abandoned_test_files.rs Outdated Show resolved Hide resolved
@arxanas
Copy link
Contributor

arxanas commented Feb 3, 2024

Can you compare the resulting runtimes for running the whole test suite and running a single test? The improved compilation times are admirable, but if it turns out that the full test suite :: 5m -> 4m but incremental tests :: 2s -> 4s then it might not be worth it. Ideally I would have incremental rebuild+test times be <1s. When developing jj, I extensively relied on re-running individual tests repeatedly, so that feedback loop is very important to me. Others might feel differently about those trade-offs.

@ilyagr
Copy link
Contributor

ilyagr commented Feb 3, 2024

The benchmark in the commit message is touch lib/src/lib.rs ; cargo t --test runner --no-run, and I'm assuming hyperfine throws out the first run like a good benchmarking tool should. So, it's all about incremental compilation, I think.

@ilyagr
Copy link
Contributor

ilyagr commented Feb 3, 2024

I also unscientifically tried touch lib/src/lib.rs ; cargo insta test --test-runner nextest -- test_abandon. It feels fast with this PR, though which linker you use might again make a difference.

I found another minor downside: the fish shell has completions for cargo test --test <TAB>, but not for cargo test -- <TAB> (or they don't work with this PR). In this way, doing a subset of tests is less convenient for me with this PR (though I think the speedup is worth it for me).

@arxanas
Copy link
Contributor

arxanas commented Feb 3, 2024

Regarding touch, @jyn514 would probably know better than me, but in my blog post I actually modified the file in the worry that cargo might do something content-addressed as part of compilation and it wouldn't represent a "real" incremental change.

@jyn514
Copy link
Contributor Author

jyn514 commented Feb 3, 2024

Regarding touch, @jyn514 would probably know better than me, but in my blog post I actually modified the file in the worry that cargo might do something content-addressed as part of compilation and it wouldn't represent a "real" incremental change.

cargo does not do content-addressed hashing, no. you can see this if you set CARGO_LOG=info:

; touch lib/src/lib.rs ; CARGO_LOG=info cargo t --no-run
   0.089419238s  INFO cargo::core::compiler::fingerprint: stale: changed "/home/jyn/src/jj/lib/src/lib.rs"
   0.089482460s  INFO cargo::core::compiler::fingerprint:           (vs) "/home/jyn/.local/lib/cargo/target/debug/.fingerprint/jj-lib-49a46e34c360fc2e/dep-lib-jj-lib"
   0.089485415s  INFO cargo::core::compiler::fingerprint:                FileTime { seconds: 1706861032, nanos: 988871571 } < FileTime { seconds: 1706931264, nanos: 137617386 }

Can you compare the resulting runtimes for running the whole test suite and running a single test? The improved compilation times are admirable, but if it turns out that the full test suite :: 5m -> 4m but incremental tests :: 2s -> 4s then it might not be worth it. Ideally I would have incremental rebuild+test times be <1s. When developing jj, I extensively relied on re-running individual tests repeatedly, so that feedback loop is very important to me. Others might feel differently about those trade-offs.

sure; but the times are small enough i don't think times on my computer will be comparable to yours, i'd recommend running this on your own machine to see what the times are.

before this change:

; hyperfine 'touch lib/src/lib.rs && cargo t --test test_working_copy'
Benchmark 1: touch lib/src/lib.rs && cargo t --test test_working_copy
  Time (mean ± σ):      3.543 s ±  0.168 s    [User: 2.597 s, System: 1.262 s]
  Range (min … max):    3.400 s …  3.847 s    10 runs

after this change:

; hyperfine 'touch lib/src/lib.rs && cargo t --test runner -- test_working_copy'
Benchmark 1: touch lib/src/lib.rs && cargo t --test runner -- test_working_copy
  Time (mean ± σ):      4.129 s ±  0.120 s    [User: 3.636 s, System: 1.593 s]
  Range (min … max):    3.933 s …  4.346 s    10 runs

so only about half a second of regression, from 3.5 to 4 seconds.

(using cargo test is intentional, nextest is actually a bit slower than the built-in framework for single tests)

@jyn514
Copy link
Contributor Author

jyn514 commented Feb 3, 2024

I found another minor downside: the fish shell has completions for cargo test --test <TAB>, but not for cargo test -- <TAB> (or they don't work with this PR). In this way, doing a subset of tests is less convenient for me with this PR (though I think the speedup is worth it for me).

yeah, this is unfortunate. i don't know a way to fix this upstream in fish; it would have to parse the mod declarations (which is not super hard but requires it to be aware of cargo's directory structure instead of just running cargo metadata).

@jyn514
Copy link
Contributor Author

jyn514 commented Feb 3, 2024

Can you compare the resulting runtimes for running the whole test suite

it takes about 13.5 seconds for me to run the tests with cargo nextest run after they're compiled, both before and after this PR. so the total time is going from ~45 to ~18 seconds (and it's better than 18 in practice because it's rare for the very last test run to be the only one that fails).

cli/tests/runner.rs Outdated Show resolved Hide resolved
@martinvonz
Copy link
Member

Thanks for the PR, @jyn514, and thanks for reviewing and testing to everyone else. @arxanas, are you okay with this? It sounds like we're just waiting for your approval, but no rush.

Copy link
Member

@martinvonz martinvonz left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'm giving my approval now anyway. It sounds like most people are for this change. We can always roll it back if it turns out we missed something. Thanks again, jyn!

@jyn514 jyn514 enabled auto-merge (rebase) February 7, 2024 01:48
@ilyagr ilyagr disabled auto-merge February 7, 2024 01:51
@ilyagr
Copy link
Contributor

ilyagr commented Feb 7, 2024

I think Martin will cut a release in the next couple of days. I'm likely being overly cautious, but perhaps we can delay merging this until the release, just in case?

P.S. The tests are failing because a few new test files were added in the last week.

@martinvonz
Copy link
Member

I think Martin will cut a release in the next couple of days. I'm likely being overly cautious, but perhaps we can delay merging this until the release, just in case?

It doesn't seem that this can break the binaries, so what's your concern? If it breaks the CI actions, I'd rather learn that tomorrow than in a month.

this greatly speeds up the time to run all tests, at the cost of slightly larger recompile times for individual tests.

this unfortunately adds the requirement that all tests are listed in `runner.rs` for the crate.
to avoid forgetting, i've added a new test that ensures the directory is in sync with the file.

 ## benchmarks

before this change, recompiling all tests took 32-50 seconds and running a single test took 3.5 seconds:

```
; hyperfine 'touch lib/src/lib.rs && cargo t --test test_working_copy'
  Time (mean ± σ):      3.543 s ±  0.168 s    [User: 2.597 s, System: 1.262 s]
  Range (min … max):    3.400 s …  3.847 s    10 runs
```

after this change, recompiling all tests take 4 seconds:
```
;  hyperfine 'touch lib/src/lib.rs ; cargo t --test runner --no-run'
  Time (mean ± σ):      4.055 s ±  0.123 s    [User: 3.591 s, System: 1.593 s]
  Range (min … max):    3.804 s …  4.159 s    10 runs
```
and running a single test takes about the same:
```
; hyperfine 'touch lib/src/lib.rs && cargo t --test runner -- test_working_copy'
  Time (mean ± σ):      4.129 s ±  0.120 s    [User: 3.636 s, System: 1.593 s]
  Range (min … max):    3.933 s …  4.346 s    10 runs
```

about 1.4 seconds of that is the time for the runner, of which .4 is the time for the linker. so
there may be room for further improving the times.
@ilyagr
Copy link
Contributor

ilyagr commented Feb 7, 2024

OK, this is definitely up to Martin. (I was vaguely worried about the CI) Please feel free to merge, and thank you again!

@jyn514 jyn514 merged commit d66fcf2 into jj-vcs:main Feb 7, 2024
15 checks passed
@jyn514 jyn514 deleted the par-tests branch February 7, 2024 02:19
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

FR: make tests faster to recompile
5 participants