[FEATURE] CI Test should always terminate after 1 hour #14680
Comments
@lupyuen maybe we also need to set a maximum number of minutes for a job to run (GitHub Actions timeout). |
@simbit18 Yep right now it quits after 6 hours: https://github.com/NuttX/nuttx/actions/runs/11714861244 |
@lupyuen so we should put
|
@simbit18 Hmmm suppose right after CI Test there's another build. If CI Test runs for all 3 hours, then the build after CI Test will never run. So actually I prefer if CI Test could terminate itself (after 1 hour) and let other builds run. Unless we always park CI Test at the end of the job? |
This, in my opinion, is the right solution. The GitHub Actions timeout is only for safety, so that we don't fall back into the tunnel (#14376). |
It's happening again: |
Wonder if this will work for GitHub CI? I'm testing it for macOS Build Farm:
## If CI Test Hangs: Kill it after 1 hour
( sleep 3600 ; echo Killing pytest... ; pkill -f pytest )&
## Run the CI Job
./cibuild.sh -i -c -A -R testlist/$job.dat |
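One gap in the one-liner above is that the background timer keeps running even when the job finishes early. A minimal sketch of the same idea with a cancellable timer follows. The helper names `start_watchdog` and `stop_watchdog` and the variable `WATCHDOG_PID` are hypothetical and are not part of the NuttX CI scripts.

```shell
# Sketch only: start_watchdog / stop_watchdog are hypothetical helpers.

# Start a background timer. After LIMIT seconds it kills every process
# whose command line matches PATTERN (same `pkill -f` approach as above).
# The timer's PID is stored in WATCHDOG_PID so it can be cancelled.
start_watchdog() {
  local limit="$1" pattern="$2"
  (
    # If we are cancelled, also stop the pending sleep, then exit quietly
    trap 'kill "$timer" 2>/dev/null; exit 0' TERM
    sleep "$limit" &
    timer=$!
    wait "$timer"
    echo "Killing $pattern..."
    pkill -f "$pattern"
  ) &
  WATCHDOG_PID=$!
}

# Cancel the timer when the job completed before the limit.
stop_watchdog() {
  if [ -n "${WATCHDOG_PID:-}" ]; then
    kill "$WATCHDOG_PID" 2>/dev/null
    wait "$WATCHDOG_PID" 2>/dev/null
    WATCHDOG_PID=
  fi
  return 0
}

# Possible usage (assumed, mirroring the commands above):
#   start_watchdog 3600 pytest
#   ./cibuild.sh -i -c -A -R testlist/$job.dat
#   stop_watchdog
```

Because only `pytest` is matched, `cibuild.sh` itself keeps running and can move on to the next build, which is the behaviour asked for earlier in this thread.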
Yep this kills the CI Test after 2 hours! (Assuming our jobs are not supposed to exceed 2 hours) We changed build.yml:
cd sources/nuttx/tools/ci
if [ "X${{matrix.boards}}" = "Xcodechecker" ]; then
./cibuild.sh -c -A -N -R --codechecker testlist/${{matrix.boards}}.dat
else
## Inserted this
( sleep 7200 ; echo Killing pytest... ; pkill -f pytest )&
./cibuild.sh -c -A -N -R -S testlist/${{matrix.boards}}.dat
fi
(Build Log says "Killing pytest... Terminated" and fails correctly later)
|
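The change above relies on the build failing later, once `pytest` has been terminated. If a more explicit signal were wanted, one option (an assumption, not something the PR does) is to have the timer leave a marker file that the job can check afterwards. The function names and the marker path below are hypothetical.

```shell
# Sketch only: watchdog_with_marker / timed_out are hypothetical helpers.

# Same background timer as in build.yml, but it creates MARKER just before
# killing the matching processes, so the job can tell a timeout from an
# ordinary test failure.
watchdog_with_marker() {
  local limit="$1" pattern="$2" marker="$3"
  (
    sleep "$limit"
    : > "$marker"
    echo "Killing $pattern..."
    pkill -f "$pattern"
  ) &
}

# Succeeds only if the watchdog fired.
timed_out() {
  [ -e "$1" ]
}

# Possible usage (assumed):
#   watchdog_with_marker 7200 pytest /tmp/ci-timeout.marker
#   ./cibuild.sh -c -A -N -R -S testlist/${{matrix.boards}}.dat
#   if timed_out /tmp/ci-timeout.marker; then echo "CI Test exceeded 2 hours"; fi
```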
CI Test will sometimes run for 6 hours (before getting auto-terminated by GitHub):
- apache#14808
- apache#14680

This is a problem because:
- It will increase our usage of GitHub Runners, which may overrun the [GitHub Actions Budget](https://infra.apache.org/github-actions-policy.html) allocated by ASF.
- Suppose right after CI Test there's another build. If CI Test runs for all 6 hours, then the build after CI Test will never run.

For this PR: We assume that every CI Job (e.g. risc-v-05) will complete normally within 2 hours. If any CI Job exceeds 2 hours, this PR will kill the CI Test Process `pytest` and allow the next build to run.
Is your feature request related to a problem? Please describe.
CI Test will sometimes run for 6 hours (before getting killed by GitHub):
This is not so great because:
Describe the solution you'd like
CI Test should complete within 1 hour. It should gracefully terminate itself (and report an error) if the runtime exceeds 1 hour.
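One way to get "terminate and report an error" behaviour is the GNU coreutils `timeout` command, which exits with status 124 when the limit is hit. The sketch below is only an illustration: `run_with_limit` is a hypothetical helper, the 30-second grace period is an arbitrary choice, and `timeout` may need to be installed separately on macOS. Where exactly the CI scripts invoke `pytest` is not shown in this issue, so the wrapping point is an assumption.

```shell
# Sketch only: run_with_limit is a hypothetical helper.

# Run a command with a time limit (in seconds). On overrun, `timeout` sends
# SIGTERM, then SIGKILL 30 seconds later if the command is still alive.
# The command's exit status is passed through; 124 means the limit was hit
# (a command that had to be SIGKILLed reports 137 instead).
run_with_limit() {
  local limit="$1" rc=0
  shift
  timeout -k 30 "$limit" "$@" || rc=$?
  if [ "$rc" -eq 124 ]; then
    echo "ERROR: '$*' exceeded ${limit}s and was terminated" >&2
  fi
  return "$rc"
}

# Possible usage (assumed): wrap only the test step, with a 1 hour limit
#   run_with_limit 3600 <command that runs the CI Test>
```

Unlike the `pkill -f pytest` approach, this stops whatever command it wraps, so it would have to be applied to the test step alone if later builds are still expected to run.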
Describe alternatives you've considered
Right now I'm manually killing all CI Jobs that run over 3 hours. And restarting the Ubuntu PCs in our NuttX Build Farm.
Verification