Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

test failure: wicketd::mod integration_tests::updates::test_update_races 400 "Updates currently running" #4668

Closed
rcgoodfellow opened this issue Dec 11, 2023 · 3 comments · Fixed by #4675
Labels
Test Flake Tests that work. Wait, no. Actually yes. Hang on. Something is broken.

Comments

@rcgoodfellow
Copy link
Contributor

Seeing intermittent failures in CI for the wicketd::mod integration_tests::updates::test_update_races

Link to full output

5174 | 2023-12-11T18:05:02.738Z | FAIL [   2.621s] wicketd::mod integration_tests::updates::test_update_races
-- | -- | --
5175 | 2023-12-11T18:05:02.738Z |  
5176 | 2023-12-11T18:05:02.738Z | --- STDOUT:              wicketd::mod integration_tests::updates::test_update_races ---
5177 | 2023-12-11T18:05:02.738Z |  
5178 | 2023-12-11T18:05:02.738Z | running 1 test
5179 | 2023-12-11T18:05:02.738Z | test integration_tests::updates::test_update_races ... FAILED
5180 | 2023-12-11T18:05:02.738Z |  
5181 | 2023-12-11T18:05:02.738Z | failures:
5182 | 2023-12-11T18:05:02.738Z |  
5183 | 2023-12-11T18:05:02.738Z | failures:
5184 | 2023-12-11T18:05:02.738Z | integration_tests::updates::test_update_races
5185 | 2023-12-11T18:05:02.738Z |  
5186 | 2023-12-11T18:05:02.738Z | test result: FAILED. 0 passed; 1 failed; 0 ignored; 0 measured; 4 filtered out; finished in 2.60s
5187 | 2023-12-11T18:05:02.738Z |  
5188 | 2023-12-11T18:05:02.738Z |  
5189 | 2023-12-11T18:05:02.738Z | --- STDERR:              wicketd::mod integration_tests::updates::test_update_races ---
5190 | 2023-12-11T18:05:02.739Z | log file: /var/tmp/omicron_tmp/mod-608758777f20938b-test_artifact_upload_while_updating.116906.0.log
5191 | 2023-12-11T18:05:02.739Z | note: configured to log to "/var/tmp/omicron_tmp/mod-608758777f20938b-test_artifact_upload_while_updating.116906.0.log"
5192 | 2023-12-11T18:05:02.739Z | hint: Generated a random key:
5193 | 2023-12-11T18:05:02.739Z | hint:
5194 | 2023-12-11T18:05:02.739Z | hint:   ed25519:6741d4fe636340108ecbf622379d8971d7eb042c72dc7d1c681c14a663f88168
5195 | 2023-12-11T18:05:02.739Z | hint:
5196 | 2023-12-11T18:05:02.739Z | hint: To modify this repository, you will need this key. Use the -k/--key
5197 | 2023-12-11T18:05:02.741Z | hint: command line flag or the TUFACEOUS_KEY environment variable:
5198 | 2023-12-11T18:05:02.826Z | hint:
5199 | 2023-12-11T18:05:02.826Z | hint:   export TUFACEOUS_KEY=ed25519:6741d4fe636340108ecbf622379d8971d7eb042c72dc7d1c681c14a663f88168
5200 | 2023-12-11T18:05:02.827Z | hint:
5201 | 2023-12-11T18:05:02.827Z | hint: To prevent this default behavior, use --no-generate-key.
5202 | 2023-12-11T18:05:02.827Z | thread 'integration_tests::updates::test_update_races' panicked at wicketd/tests/integration_tests/updates.rs:495:10:
5203 | 2023-12-11T18:05:02.827Z | no updates currently running: Error Response: status: 400 Bad Request; headers: {"content-type": "application/json", "x-request-id": "191984ee-975f-4941-bcfb-38a1a98bd5b0", "content-length": "120", "date": "Mon, 11 Dec 2023 18:05:00 GMT"}; value: Error { error_code: None, message: "Updates currently running for {running_sps:?}", request_id: "191984ee-975f-4941-bcfb-38a1a98bd5b0" }
5204 | 2023-12-11T18:05:02.827Z | stack backtrace:
5205 | 2023-12-11T18:05:02.827Z | 0: rust_begin_unwind
5206 | 2023-12-11T18:05:02.827Z | at /rustc/79e9716c980570bfd1f666e3b16ac583f0168962/library/std/src/panicking.rs:597:5
5207 | 2023-12-11T18:05:02.827Z | 1: core::panicking::panic_fmt
5208 | 2023-12-11T18:05:02.828Z | at /rustc/79e9716c980570bfd1f666e3b16ac583f0168962/library/core/src/panicking.rs:72:14
5209 | 2023-12-11T18:05:02.828Z | 2: core::result::unwrap_failed
5210 | 2023-12-11T18:05:02.828Z | at /rustc/79e9716c980570bfd1f666e3b16ac583f0168962/library/core/src/result.rs:1652:5
5211 | 2023-12-11T18:05:02.828Z | 3: core::result::Result<T,E>::expect
5212 | 2023-12-11T18:05:02.828Z | at /rustc/79e9716c980570bfd1f666e3b16ac583f0168962/library/core/src/result.rs:1034:23
5213 | 2023-12-11T18:05:02.828Z | 4: {async_block#0}
5214 | 2023-12-11T18:05:02.828Z | at ./tests/integration_tests/updates.rs:491:5
5215 | 2023-12-11T18:05:02.828Z | 5: poll<&mut dyn core::future::future::Future<Output=()>>
5216 | 2023-12-11T18:05:02.828Z | at /rustc/79e9716c980570bfd1f666e3b16ac583f0168962/library/core/src/future/future.rs:125:9
5217 | 2023-12-11T18:05:02.828Z | 6: {closure#0}<core::pin::Pin<&mut dyn core::future::future::Future<Output=()>>>
5218 | 2023-12-11T18:05:02.828Z | at /home/build/.cargo/registry/src/index.crates.io-6f17d22bba15001f/tokio-1.34.0/src/runtime/park.rs:282:63
5219 | 2023-12-11T18:05:02.829Z | 7: with_budget<core::task::poll::Poll<()>, tokio::runtime::park::{impl#4}::block_on::{closure_env#0}<core::pin::Pin<&mut dyn core::future::future::Future<Output=()>>>>
5220 | 2023-12-11T18:05:02.829Z | at /home/build/.cargo/registry/src/index.crates.io-6f17d22bba15001f/tokio-1.34.0/src/runtime/coop.rs:107:5
5221 | 2023-12-11T18:05:02.829Z | 8: budget<core::task::poll::Poll<()>, tokio::runtime::park::{impl#4}::block_on::{closure_env#0}<core::pin::Pin<&mut dyn core::future::future::Future<Output=()>>>>
5222 | 2023-12-11T18:05:02.830Z | at /home/build/.cargo/registry/src/index.crates.io-6f17d22bba15001f/tokio-1.34.0/src/runtime/coop.rs:73:5
5223 | 2023-12-11T18:05:02.830Z | 9: block_on<core::pin::Pin<&mut dyn core::future::future::Future<Output=()>>>
5224 | 2023-12-11T18:05:02.830Z | at /home/build/.cargo/registry/src/index.crates.io-6f17d22bba15001f/tokio-1.34.0/src/runtime/park.rs:282:31
5225 | 2023-12-11T18:05:02.830Z | 10: tokio::runtime::context::blocking::BlockingRegionGuard::block_on
5226 | 2023-12-11T18:05:02.830Z | at /home/build/.cargo/registry/src/index.crates.io-6f17d22bba15001f/tokio-1.34.0/src/runtime/context/blocking.rs:66:9
5227 | 2023-12-11T18:05:02.830Z | 11: tokio::runtime::scheduler::multi_thread::MultiThread::block_on::{{closure}}
5228 | 2023-12-11T18:05:02.830Z | at /home/build/.cargo/registry/src/index.crates.io-6f17d22bba15001f/tokio-1.34.0/src/runtime/scheduler/multi_thread/mod.rs:87:13
5229 | 2023-12-11T18:05:02.830Z | 12: tokio::runtime::context::runtime::enter_runtime
5230 | 2023-12-11T18:05:02.830Z | at /home/build/.cargo/registry/src/index.crates.io-6f17d22bba15001f/tokio-1.34.0/src/runtime/context/runtime.rs:65:16
5231 | 2023-12-11T18:05:02.830Z | 13: tokio::runtime::scheduler::multi_thread::MultiThread::block_on
5232 | 2023-12-11T18:05:02.830Z | at /home/build/.cargo/registry/src/index.crates.io-6f17d22bba15001f/tokio-1.34.0/src/runtime/scheduler/multi_thread/mod.rs:86:9
5233 | 2023-12-11T18:05:02.831Z | 14: tokio::runtime::runtime::Runtime::block_on
5234 | 2023-12-11T18:05:02.831Z | at /home/build/.cargo/registry/src/index.crates.io-6f17d22bba15001f/tokio-1.34.0/src/runtime/runtime.rs:350:45
5235 | 2023-12-11T18:05:02.831Z | 15: test_update_races
5236 | 2023-12-11T18:05:02.831Z | at ./tests/integration_tests/updates.rs:509:5
5237 | 2023-12-11T18:05:02.831Z | 16: mod::integration_tests::updates::test_update_races::{{closure}}
5238 | 2023-12-11T18:05:02.831Z | at ./tests/integration_tests/updates.rs:401:30
5239 | 2023-12-11T18:05:02.831Z | 17: core::ops::function::FnOnce::call_once
5240 | 2023-12-11T18:05:02.831Z | at /rustc/79e9716c980570bfd1f666e3b16ac583f0168962/library/core/src/ops/function.rs:250:5
5241 | 2023-12-11T18:05:02.831Z | 18: core::ops::function::FnOnce::call_once
5242 | 2023-12-11T18:05:02.831Z | at /rustc/79e9716c980570bfd1f666e3b16ac583f0168962/library/core/src/ops/function.rs:250:5
5243 | 2023-12-11T18:05:02.831Z | note: Some details are omitted, run with `RUST_BACKTRACE=full` for a verbose backtrace.
5244 | 2023-12-11T18:05:02.832Z |  
5245 | 2023-12-11T18:05:02.832Z | PASS [   4.402s] wicketd::mod integration_tests::updates::test_installinator_fetch
5246 | 2023-12-11T18:05:02.832Z | PASS [   3.983s] wicketd::mod integration_tests::updates::test_updates
5247 | 2023-12-11T18:05:02.861Z | PASS [  17.547s] omicron-passwords test::test_smoke
5248 | 2023-12-11T18:05:05.554Z | PASS [  16.311s] omicron-test-utils dev::db::test::test_setup_database_default_dir
5249 | 2023-12-11T18:05:11.336Z | PASS [  16.360s] oximeter-db client::tests::test_single_node
5250 | 2023-12-11T18:05:15.780Z | PASS [  20.677s] oximeter-db client::tests::test_ensure_schema_walks_through_multiple_steps_replicated
5251 | 2023-12-11T18:06:15.782Z | SLOW [> 60.000s] oximeter-db client::tests::test_replicated
5252 | 2023-12-11T18:07:15.783Z | SLOW [>120.000s] oximeter-db client::tests::test_replicated
5253 | 2023-12-11T18:08:15.787Z | SLOW [>180.000s] oximeter-db client::tests::test_replicated
5254 | 2023-12-11T18:09:17.826Z | SLOW [>240.000s] oximeter-db client::tests::test_replicated
5255 | 2023-12-11T18:10:15.786Z | SLOW [>300.000s] oximeter-db client::tests::test_replicated
5256 | 2023-12-11T18:11:15.787Z | SLOW [>360.000s] oximeter-db client::tests::test_replicated
5257 | 2023-12-11T18:12:15.789Z | SLOW [>420.000s] oximeter-db client::tests::test_replicated
5258 | 2023-12-11T18:12:27.191Z | PASS [ 431.410s] oximeter-db client::tests::test_replicated
5259 | 2023-12-11T18:12:27.191Z | ------------
5260 | 2023-12-11T18:12:27.191Z | Summary [1960.776s] 940 tests run: 939 passed (13 slow, 1 leaky), 1 failed, 3 skipped
5261 | 2023-12-11T18:12:27.191Z | FAIL [   2.621s] wicketd::mod integration_tests::updates::test_update_races
5262 | 2023-12-11T18:12:27.198Z | error: test run failed
5263 | 2023-12-11T18:12:27.201Z | Command exited with non-zero status 100
5264 | 2023-12-11T18:12:27.201Z | Command being timed: "timeout 2h cargo nextest run --profile ci --locked --verbose"
5265 | 2023-12-11T18:12:27.201Z | User time (seconds): 9613.19
5266 | 2023-12-11T18:12:27.201Z | System time (seconds): 1399.51
5267 | 2023-12-11T18:12:27.201Z | Percent of CPU this job got: 560%
5268 | 2023-12-11T18:12:27.201Z | Elapsed (wall clock) time (h:mm:ss or m:ss): 32:45.70
5269 | 2023-12-11T18:12:27.201Z | Average shared text size (kbytes): 0
5270 | 2023-12-11T18:12:27.202Z | Average unshared data size (kbytes): 0
5271 | 2023-12-11T18:12:27.202Z | Average stack size (kbytes): 0
5272 | 2023-12-11T18:12:27.202Z | Average total size (kbytes): 0
5273 | 2023-12-11T18:12:27.202Z | Maximum resident set size (kbytes): 800360
5274 | 2023-12-11T18:12:27.202Z | Average resident set size (kbytes): 0
5275 | 2023-12-11T18:12:27.202Z | Major (requiring I/O) page faults: 340
5276 | 2023-12-11T18:12:27.202Z | Minor (reclaiming a frame) page faults: 50413726
5277 | 2023-12-11T18:12:27.202Z | Voluntary context switches: 43252700
5278 | 2023-12-11T18:12:27.202Z | Involuntary context switches: 10429010
5279 | 2023-12-11T18:12:27.202Z | Swaps: 0
5280 | 2023-12-11T18:12:27.202Z | File system inputs: 1488
5281 | 2023-12-11T18:12:27.202Z | File system outputs: 32225304
5282 | 2023-12-11T18:12:27.202Z | Socket messages sent: 0
5283 | 2023-12-11T18:12:27.205Z | Socket messages received: 0
5284 | 2023-12-11T18:12:27.205Z | Signals delivered: 0
5285 | 2023-12-11T18:12:27.205Z | Page size (bytes): 4096
5286 | 2023-12-11T18:12:27.205Z | Exit status: 100
5287 | 2023-12-11T18:12:27.205Z | process exited: duration 2814052 ms, exit code 100
@rcgoodfellow rcgoodfellow added the Test Flake Tests that work. Wait, no. Actually yes. Hang on. Something is broken. label Dec 11, 2023
@davepacheco
Copy link
Collaborator

Looks superficially like #4590 (fixed late last week) but the message here is different. @sunshowers can you take a look?

@rcgoodfellow
Copy link
Contributor Author

It looks like I have the fix to #4590 (#4641) in my branch.

@sunshowers
Copy link
Contributor

Thanks, I see the issue. There's a small likelihood of a race here, will require some contortions to fix.

sunshowers added a commit that referenced this issue Dec 12, 2023
The issue here was that we'd send a notification to the test that the
task was
finished, as the last thing in the task. This meant that there was a
period
where `task.is_finished()` was false but the test thought the task had
finished
and proceeded accordingly.

To fix this, use an `Arc<AtomicBool>` to indicate that the task is
finished.

Fixes #4668.
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
Test Flake Tests that work. Wait, no. Actually yes. Hang on. Something is broken.
Projects
None yet
Development

Successfully merging a pull request may close this issue.

3 participants