Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

build end-to-end tests and live tests by default #6469

Merged
merged 7 commits into from
Aug 29, 2024

Conversation

davepacheco
Copy link
Collaborator

Thanks to recent nextest work by @sunshowers, we can add end-to-end-tests and live-tests to the workspace default members (so that they get built with cargo check, for example) without running their tests by default. These had previously been excluded from the default members because their tests don't work in dev environments.

@davepacheco davepacheco requested a review from sunshowers August 28, 2024 13:43
@davepacheco
Copy link
Collaborator Author

In a workspace without these changes:

dap@ivanova omicron-work $ cargo check --all-targets
    Finished `dev` profile [unoptimized + debuginfo] target(s) in 2.02s
dap@ivanova omicron-work $ touch end-to-end-tests/src/lib.rs 
dap@ivanova omicron-work $ cargo check --all-targets
    Finished `dev` profile [unoptimized + debuginfo] target(s) in 1.30s
dap@ivanova omicron-work $ touch touch live-tests/tests/common/mod.rs
dap@ivanova omicron-work $ cargo check --all-targets
    Finished `dev` profile [unoptimized + debuginfo] target(s) in 1.30s

In one with these changes:

dap@ivanova omicron-merge $ touch end-to-end-tests/src/lib.rs 
dap@ivanova omicron-merge $ cargo check --all-targets
    Checking end-to-end-tests v0.1.0 (/home/dap/omicron-merge/end-to-end-tests)
    Finished `dev` profile [unoptimized + debuginfo] target(s) in 3.11s

dap@ivanova omicron-merge $ touch live-tests/tests/common/mod.rs 
dap@ivanova omicron-merge $ cargo check --all-targets
    Checking omicron-live-tests v0.1.0 (/home/dap/omicron-merge/live-tests)
    Finished `dev` profile [unoptimized + debuginfo] target(s) in 2.29s

@davepacheco davepacheco marked this pull request as draft August 28, 2024 13:46
@davepacheco
Copy link
Collaborator Author

The cargo nextest list output is also unchanged before and after this.

@davepacheco davepacheco marked this pull request as ready for review August 28, 2024 14:56
@iliana
Copy link
Contributor

iliana commented Aug 28, 2024

This also needs updated:

let non_default = workspace
.packages
.iter()
.filter_map(|package| {
[
// Including xtask causes hakari to not work as well and build
// times to be longer (omicron#4392).
"xtask",
// The tests here should not be run by default, as they require
// a running control plane.
"end-to-end-tests",
"omicron-live-tests",
]
.contains(&package.name.as_str())
.then_some(&package.id)
})
.collect::<BTreeSet<_>>();

@davepacheco
Copy link
Collaborator Author

  • The build-and-test job completed successfully, ran over 1400 tests, and didn't run these tests (which is good).
  • The helios-deploy job completed successfully, having run the end-to-end tests. This is good and not surprising anyway because it doesn't use nextest.

However, I tried running the live-tests archive and it doesn't run the live tests. The archive was built with:

running: /home/dap/.rustup/toolchains/1.80.1-x86_64-unknown-illumos/bin/cargo "nextest" "archive" "--package" "omicron-live-tests" "--archive-file" "/dangerzone/omicron_tmp/.tmpJkfSyK/live-tests-archive/omicron-live-tests.tar.zst"

but when I run this in the usual way, it uses the default profile and omits those tests:

$ cargo nextest run --archive-file live-tests-archive/omicron-live-tests.tar.zst 
  Extracting 1 binary, 1 build script output directory, and 1 linked path to /dangerzone/omicron_tmp/nextest-archive-DDbfn3
   Extracted 36 files to /dangerzone/omicron_tmp/nextest-archive-DDbfn3 in 0.50s
info: experimental features enabled: setup-scripts
------------
 Nextest run ID e7d05843-f8e5-45bc-b148-802d01be3d4a with nextest profile: default
    Starting 0 tests across 0 binaries (1 binary skipped via profile.default.default-set)
------------
     Summary [   0.000s] 0 tests run: 0 passed, 0 skipped
warning: no tests to run -- this will become an error in the future
(hint: use `--no-tests` to customize)

If I try to pick a specific test, it (reasonably) says you can't use -p with an archive file:

 $ cargo nextest run -p omicron-live-tests --archive-file live-tests-archive/omicron-live-tests.tar.zst 
error: the argument '--archive-file <PATH>' cannot be used with:
  --package <PACKAGES>
  --workspace
  --exclude <EXCLUDE>
  --all
  --lib
  --bin <BIN>
  --bins
  --example <EXAMPLE>
  --examples
  --test <TEST>
  --tests
  --bench <BENCH>
  --benches
  --all-targets
  --features <FEATURES>
  --all-features
  --no-default-features
  --build-jobs <N>
  --release
  --cargo-profile <NAME>
  --target <TRIPLE>
  --target-dir <DIR>
  --unit-graph
  --timings[=<FMTS>]
  --frozen
  --locked
  --offline
  --cargo-quiet...
  --cargo-verbose...
  --ignore-rust-version
  --future-incompat-report
  -Z <FLAG>

Usage: cargo nextest run --package <PACKAGES> --archive-file <PATH> [FILTERS]... [-- <FILTERS_AND_ARGS>...]

For more information, try '--help'.

I thought this was a bug in the archive stuff but actually I can't run the live tests in the usual way either:

$ cargo nextest run -p omicron-live-tests
info: experimental features enabled: setup-scripts
    Finished `test` profile [unoptimized + debuginfo] target(s) in 0.88s
------------
 Nextest run ID 06770e7a-8255-4e23-87a3-0ca6dc38f634 with nextest profile: default
    Starting 0 tests across 0 binaries (1 binary skipped via profile.default.default-set)
------------
     Summary [   0.000s] 0 tests run: 0 passed, 0 skipped
warning: no tests to run -- this will become an error in the future
(hint: use `--no-tests` to customize)

Rereading https://nexte.st/docs/running/#running-a-subset-of-tests-by-default, it looks like maybe I need to be using --bound=all? I found this surprising but it does seem to work:

$ cargo nextest run --bound=all -p omicron-live-tests
info: experimental features enabled: setup-scripts
    Finished `test` profile [unoptimized + debuginfo] target(s) in 0.91s
------------
 Nextest run ID d5bffcdc-44af-4bcd-be2a-8ae645027626 with nextest profile: default
    Starting 1 test across 1 binary
        FAIL [   0.033s] omicron-live-tests::test_nexus_add_remove test_nexus_add_remove

--- STDOUT:              omicron-live-tests::test_nexus_add_remove test_nexus_add_remove ---

running 1 test
test test_nexus_add_remove ... FAILED

failures:

failures:
    test_nexus_add_remove

test result: FAILED. 0 passed; 1 failed; 0 ignored; 0 measured; 0 filtered out; finished in 0.01s


--- STDERR:              omicron-live-tests::test_nexus_add_remove test_nexus_add_remove ---
log file: /dangerzone/omicron_tmp/test_nexus_add_remove-e761fd3ae73c0feb-test_nexus_add_remove.18458.0.log
note: configured to log to "/dangerzone/omicron_tmp/test_nexus_add_remove-e761fd3ae73c0feb-test_nexus_add_remove.18458.0.log"
note: using DNS server for subnet fd00:1122:3344::/48
thread 'test_nexus_add_remove' panicked at live-tests/tests/test_nexus_add_remove.rs:37:1:
setting up LiveTestContext: check_execution_environment(): failed to look up internal DNS in the internal
DNS servers.

 Are you trying to run this in a development environment?  This test can only
be run on deployed systems and only from a context with connectivity to the
underlay network.

 raw error: proto error: io error: No route to host (os error 148)
note: run with `RUST_BACKTRACE=1` environment variable to display a backtrace

   Canceling due to test failure
------------
     Summary [   0.034s] 1 test run: 0 passed, 1 failed, 0 skipped
        FAIL [   0.033s] omicron-live-tests::test_nexus_add_remove test_nexus_add_remove
error: test run failed

(I expected that failure -- I'm just trying to exercise the filtering behavior here.)

So then I tried to use --bound all when creating the archive file, but that's not accepted:

running: /home/dap/.rustup/toolchains/1.80.1-x86_64-unknown-illumos/bin/cargo "nextest" "archive" "--bound" "all" "--package" "omicron-live-tests" "--archive-file" "/dangerzone/omicron_tmp/.tmpl2Vnxi/live-tests-archive/omicron-live-tests.tar.zst"
error: unexpected argument '--bound' found

Usage: cargo nextest archive [OPTIONS] --archive-file <PATH>

For more information, try '--help'.
Error: failed: exit status: 2

So then I tried passing it to cargo nextest run --archive-file, and that appears to work:

$ cargo nextest run --bound=all --archive-file live-tests-archive/omicron-live-tests.tar.zst 
  Extracting 1 binary, 1 build script output directory, and 1 linked path to /dangerzone/omicron_tmp/nextest-archive-Fka6Cf
   Extracted 36 files to /dangerzone/omicron_tmp/nextest-archive-Fka6Cf in 0.55s
info: experimental features enabled: setup-scripts
------------
 Nextest run ID 54ef6ceb-9905-4dab-8eab-651c9f520cb9 with nextest profile: default
    Starting 1 test across 1 binary
        FAIL [   0.035s] omicron-live-tests::test_nexus_add_remove test_nexus_add_remove

--- STDOUT:              omicron-live-tests::test_nexus_add_remove test_nexus_add_remove ---

running 1 test
test test_nexus_add_remove ... FAILED

failures:

failures:
    test_nexus_add_remove

test result: FAILED. 0 passed; 1 failed; 0 ignored; 0 measured; 0 filtered out; finished in 0.01s


--- STDERR:              omicron-live-tests::test_nexus_add_remove test_nexus_add_remove ---
log file: /dangerzone/omicron_tmp/test_nexus_add_remove-e761fd3ae73c0feb-test_nexus_add_remove.19899.0.log
note: configured to log to "/dangerzone/omicron_tmp/test_nexus_add_remove-e761fd3ae73c0feb-test_nexus_add_remove.19899.0.log"
note: using DNS server for subnet fd00:1122:3344::/48
thread 'test_nexus_add_remove' panicked at live-tests/tests/test_nexus_add_remove.rs:37:1:
setting up LiveTestContext: check_execution_environment(): failed to look up internal DNS in the internal
DNS servers.

 Are you trying to run this in a development environment?  This test can only
be run on deployed systems and only from a context with connectivity to the
underlay network.

 raw error: proto error: io error: No route to host (os error 148)
note: run with `RUST_BACKTRACE=1` environment variable to display a backtrace

   Canceling due to test failure
------------
     Summary [   0.036s] 1 test run: 0 passed, 1 failed, 0 skipped
        FAIL [   0.035s] omicron-live-tests::test_nexus_add_remove test_nexus_add_remove
error: test run failed

So I guess the change needed here is to the instructions that get printed out by cargo xtask live-test.

@davepacheco
Copy link
Collaborator Author

davepacheco commented Aug 28, 2024

From 8b77172, I've retested like this:

  • cargo nextest list reports the same thing it did before this change.
  • touching files in end-to-end-tests or live-tests and then running cargo check --all-targets causes them to be re-checked (see above, it didn't before this change).
  • Running the live-tests archive file by hand using the instructions printed out by cargo xtask live-tests does attempt to run the live tests. (I only tested in this in my dev environment so the tests themselves failed but I think that's fine because it's orthogonal to all this.)
  • When CI finishes I'll confirm again that it ran all the expected tests and none of the unexpected tests in both build-and-test (the main test suite) and helios-deploy (the end-to-end tests).

@@ -35,6 +35,9 @@ clickhouse-cluster = { max-threads = 1 }
# behaviors that conflict with each other. They need to be run serially.
live-tests = { max-threads = 1 }

[profile.default]
default-set = 'all() - package(omicron-live-tests) - package(end-to-end-tests)'
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Id recommend also adding a live-tests profile with exactly 'package(omicron-live-tests)' as the default set, so folks can run this with -P live-tests.

@@ -134,8 +134,11 @@ pub fn run_cmd(_args: Args) -> Result<()> {
// TMPDIR=/var/tmp puts stuff on disk, cached as needed, rather than the
// default /tmp which requires that stuff be in-memory. That can lead to
// great sadness if the tests wind up writing a lot of data.
//
// --bound=all is required to cause nextest to *not* ignore these tests,
Copy link
Contributor

@sunshowers sunshowers Aug 28, 2024

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This shouldn't be an issue with the profile. The bound argument was needed because the UI gets pretty confusing if you try to make it more magical. But it's better to just use a profile per use case I think.

@davepacheco
Copy link
Collaborator Author

Alright, with f9bb4eb:

  • I built a live-tests archive, shipped it to an a4x2 system, ran it with the commands emitted by cargo xtask live-tests, and it passed tests.
  • I've checked that cargo nextest list has not changed from before this PR to after.
  • When CI finishes I'll check that both build-and-test and helios-deploy ran the tests we expected.

@davepacheco
Copy link
Collaborator Author

nextest 0.9.77 renames a few of these fields. I'll be taking another lap shortly (when the release binaries are available).

@davepacheco
Copy link
Collaborator Author

6038bf9 brings this to nextest 0.9.77. With that commit:

  • I built a live-tests archive, shipped it to an a4x2 system, ran it with the commands emitted by cargo xtask live-tests, and it passed tests.
  • I've checked that cargo nextest list has not changed from before this PR to after.
  • When CI finishes I'll check that both build-and-test and helios-deploy ran the tests we expected.

Copy link
Contributor

@sunshowers sunshowers left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thank you!

live-tests/README.adoc Outdated Show resolved Hide resolved
@davepacheco
Copy link
Collaborator Author

Confirmed that helios-deploy ran the expected end-to-end-tests.
Confirmed that build-and-test did not run the end-to-end or live tests, but did run 1442 other tests.

@davepacheco davepacheco merged commit b7ab537 into main Aug 29, 2024
24 checks passed
@davepacheco davepacheco deleted the dap/live-tests-follow-up branch August 29, 2024 00:15
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

3 participants