Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Add a timeout to TestWorld #1467

Merged
merged 4 commits into from
Dec 3, 2024

Conversation

andyleiserson
Copy link
Collaborator

There is a risk that this will introduce flakiness, if there are tests where the tail of the runtime distribution exceeds the timeout. On the other hand, we know that stalls are a predominant failure mode of our protocols, and a clear failure message is a lot nicer than a test job that hangs until somebody notices and kills it.

Copy link
Collaborator

@akoshelev akoshelev left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

having a timeout makes sense to me, I've seen a couple of tests that take more than 60 secs to run.

It is much slower than in the single thread config.
Copy link

codecov bot commented Dec 3, 2024

Codecov Report

Attention: Patch coverage is 91.17647% with 9 lines in your changes missing coverage. Please review.

Project coverage is 93.81%. Comparing base (d3c7469) to head (6174f48).
Report is 2 commits behind head on main.

Files with missing lines Patch % Lines
ipa-core/src/test_fixture/world.rs 87.27% 7 Missing ⚠️
ipa-core/src/protocol/dp/mod.rs 66.66% 2 Missing ⚠️
Additional details and impacted files
@@            Coverage Diff             @@
##             main    #1467      +/-   ##
==========================================
- Coverage   93.84%   93.81%   -0.03%     
==========================================
  Files         234      234              
  Lines       41826    41867      +41     
==========================================
+ Hits        39250    39276      +26     
- Misses       2576     2591      +15     

☔ View full report in Codecov by Sentry.
📢 Have feedback on the report? Share it here.

@andyleiserson
Copy link
Collaborator Author

having a timeout makes sense to me, I've seen a couple of tests that take more than 60 secs to run.

The new timeout applies to each TestWorld protocol execution using one of the helpers (world.semi_honest, etc.). Some of the long running tests are proptests that do this multiple times. There are also some long-running tests that don't use TestWorld at all, including the e2e::complete_query_* tests in query::processor.

After a few rounds of increasing timeouts due to failures in CI (wow the CI machines are slow), there was only one test that seemed to need more than a 60s timeout -- the large_batch test in dzkp_validator when running with multi-threading. This test is much slower in the multi-threading config than the single-thread config, I am guessing because the work it does for each record is very small compared to the overhead of spawning tasks, so I just disabled it in the multi-threading config.

I don't think it's worth being aggressive with timeouts for unit tests (the goal is to catch hangs, not identify performance regressions), so I'm happy to increase the timeouts further if there are places where you think they are too aggressive.

@andyleiserson andyleiserson merged commit dca5be7 into private-attribution:main Dec 3, 2024
12 checks passed
@andyleiserson andyleiserson deleted the timeout branch December 3, 2024 20:24
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

2 participants