discussion: Shall we let the CI run more rounds of the tests for highly critical/complicated PRs #4941

Closed
lmatz opened this issue Aug 29, 2022 · 5 comments

@lmatz
Contributor

lmatz commented Aug 29, 2022

Shall we let the CI run more rounds of tests for highly critical/complicated PRs?

For example, we attach a GitHub label such as "critical" to a PR. Buildkite detects the label and then runs more rounds of tests, or spends more time in CI, for that PR.
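A minimal sketch of the label check, assuming the PR labels reach the CI job as a comma-separated string. The function name and the round counts (10 and 50) are placeholders for illustration, not anything the pipeline defines today.

```python
def rounds_for_labels(labels_csv: str,
                      default_rounds: int = 10,
                      critical_rounds: int = 50) -> int:
    """Pick how many test rounds to run from a comma-separated label list.

    Both round counts are hypothetical placeholders.
    """
    # Normalize: split on commas, trim whitespace, ignore case and empty entries.
    labels = {part.strip().lower() for part in labels_csv.split(",") if part.strip()}
    return critical_rounds if "critical" in labels else default_rounds
```

A CI script could call this once at the start of the job and loop the test step that many times.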

Originally posted by @lmatz in #4857 (comment)

For the deterministic tests, can we simulate more cases, e.g. by using multiple random seeds?

@wangrunji0408
Contributor

After #4917, the deterministic parallel e2e tests will run 10 times with different seeds for each PR.
We will continue optimizing the compile time and execution time so that more rounds can run in a reasonable time.
Meanwhile, we will explore ways to improve efficiency, so that bugs can be caught in fewer rounds.
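To illustrate the multi-seed idea in general terms, here is a small sketch that runs one test under several seeds and reports which seeds fail. It is not the actual simulator harness: `run_with_seeds` and `toy_test` are hypothetical names, and the toy test just fails for a fraction of seeds.

```python
import random
from typing import Callable, Iterable, List


def run_with_seeds(test_fn: Callable[[random.Random], None],
                   seeds: Iterable[int]) -> List[int]:
    """Run test_fn once per seed and return the seeds whose run failed."""
    failing = []
    for seed in seeds:
        rng = random.Random(seed)  # every source of randomness derives from the seed
        try:
            test_fn(rng)
        except AssertionError:
            failing.append(seed)  # the seed alone is enough to replay the failure
    return failing


def toy_test(rng: random.Random) -> None:
    # Hypothetical stand-in for an e2e test: fails on roughly 20% of seeds.
    assert rng.random() >= 0.2


failing_seeds = run_with_seeds(toy_test, range(10))
print("failing seeds:", failing_seeds)
```

Because each run depends only on its seed, any failing seed can be rerun locally to reproduce the same failure.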

@lmatz
Contributor Author

lmatz commented Aug 29, 2022

I see, more efficient testing and more effective bug detection are definitely the way to go.

I am just not sure how far the efficiency improvements (hunting bugs in fewer rounds) can eventually go, since that depends on the nondeterministic nature of each bug. I am very optimistic about the relative improvement; my doubt is about how good the result will be on an absolute scale.

Is 10 times good enough?
Since we largely cannot predict the chance of detecting a given bug, I feel we need to err on the pessimistic side. At the same time, we don't want to run the same number of rounds for a simple/deterministic PR.
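One way to put a number on "is 10 good enough": if a bug shows up in a fraction p of seeds and rounds are independent (both are assumptions, and p is exactly the unknown here), the chance that n rounds catch it at least once is 1 - (1 - p)^n. A rough sketch, with a hypothetical function name:

```python
def detection_probability(p_per_round: float, rounds: int) -> float:
    """Chance that at least one of `rounds` independent rounds hits the bug."""
    if not 0.0 <= p_per_round <= 1.0:
        raise ValueError("p_per_round must be in [0, 1]")
    if rounds < 0:
        raise ValueError("rounds must be non-negative")
    # Probability that every round misses is (1 - p)^rounds.
    return 1.0 - (1.0 - p_per_round) ** rounds


# A bug that appears in 5% of seeds is caught by 10 rounds about 40% of the time.
print(round(detection_probability(0.05, 10), 3))
```

Under these assumptions, 10 rounds are reliable only for bugs that trigger fairly often per seed.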

@wangrunji0408
Contributor

Good point! I think we should first evaluate how effectively the simulator itself can hunt bugs. We can manually construct several common bugs, or reuse existing ones, and measure how many rounds the simulator needs before it catches each of them. After quantifying the effectiveness, we can choose the number of rounds in CI with confidence.
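A sketch of how that measurement could turn into a round count, assuming independent rounds with a fixed per-round catch probability. The function names and the sample numbers are made up for illustration; they are not measured data.

```python
import math
from typing import List


def estimate_p(rounds_to_first_catch: List[int]) -> float:
    """Estimate the per-round catch probability from repeated trials.

    Each entry is the number of rounds one trial needed before the bug was
    caught. Under a geometric model the maximum-likelihood estimate is
    (number of trials) / (total rounds run).
    """
    if not rounds_to_first_catch or min(rounds_to_first_catch) < 1:
        raise ValueError("need at least one trial, each with >= 1 round")
    return len(rounds_to_first_catch) / sum(rounds_to_first_catch)


def rounds_needed(p: float, confidence: float) -> int:
    """Smallest n with 1 - (1 - p)^n >= confidence."""
    if not 0.0 < p <= 1.0 or not 0.0 < confidence < 1.0:
        raise ValueError("need 0 < p <= 1 and 0 < confidence < 1")
    if p == 1.0:
        return 1
    return math.ceil(math.log(1.0 - confidence) / math.log(1.0 - p))


# Hypothetical trials: the injected bug was caught after 3, 12, 7 and 18 rounds.
p_hat = estimate_p([3, 12, 7, 18])   # 4 catches in 40 rounds
print(p_hat, rounds_needed(p_hat, 0.95))
```

With only a handful of trials the estimate is noisy, so leaning toward a lower p when picking the CI round count keeps the result on the pessimistic side.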

@fuyufjh modified the milestone: release-0.1.13 on Aug 31, 2022
@fuyufjh changed the title from "Shall we let the CI run more rounds of the tests for highly critical/complicated PRs" to "discussion: Shall we let the CI run more rounds of the tests for highly critical/complicated PRs" on Sep 5, 2022
@lmatz
Contributor Author

lmatz commented Sep 26, 2022

It seems we have demand for longevity testing (see #4966 (review)); it is currently done manually.

After #5330 is done, maybe we can try it first.

As the longevity test takes a lot of time, we may schedule the test for these critical PRs at night only.

A trivial way to achieve this is to assign high priority to all the other, non-critical PRs and use the standard label for critical ones only.
But we may find a better way to achieve this.
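For comparison, the "at night only" rule could also be written as an explicit time-window check instead of relying on queue priority. This is only a sketch: the function name, the "critical" label, and the 22:00 to 06:00 window are assumptions.

```python
from datetime import time
from typing import Iterable


def should_run_longevity(labels: Iterable[str],
                         now: time,
                         night_start: time = time(22, 0),
                         night_end: time = time(6, 0)) -> bool:
    """Run the longevity test only for critical PRs and only inside the night window."""
    if "critical" not in set(labels):
        return False
    if night_start <= night_end:
        # Window that does not cross midnight, e.g. 01:00 to 05:00.
        return night_start <= now < night_end
    # Window that wraps past midnight, e.g. 22:00 to 06:00.
    return now >= night_start or now < night_end
```

A scheduled nightly build could call this before starting the long test and skip it otherwise.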

@github-actions
Contributor

This issue has been open for 60 days with no activity. Could you please update the status? Feel free to continue discussion or close as not planned.

@lmatz closed this as not planned on May 19, 2024
3 participants