[testing] Reduce flaky tests by retrying git failures #9907
Comments
From refinement:
Hit this again on 1.3 and 1.4 today
@emmyoop It looks like Given this information, I'm not sure if adding retries to our test runner (tox in this case) would improve the situation. Similar issue in a GCP repo: GoogleCloudPlatform/python-docs-samples#3485 (comment) Thoughts?
Opened a new issue in dbt-labs/docs.getdbt.com: dbt-labs/docs.getdbt.com#5504
…ts due to network failures (#10143) (cherry picked from commit 751139d) Co-authored-by: Kshitij Aranke <[email protected]>
…ts due to network failures (#10142) (cherry picked from commit 751139d) Co-authored-by: Kshitij Aranke <[email protected]>
hey @aranke, it looks like this opened a docs issue -- can I double check what customer-facing changes are needed? From skimming this issue, it looks like this is more about internal testing?
Housekeeping
Short description
We have a lot of tests that are failing because of Git connection issues. Sometimes tox fails to install all dependencies and that causes the entire test run to fail without actually running any tests. This makes our monitoring noisy.
Suggested approach: leverage something like the nick-fields/retry@v3 action (example, but in the tox invocation here)
Acceptance criteria
Anytime we use git when testing, have retry logic
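One way the retry logic could look, as a minimal Python sketch rather than the project's actual implementation: wrap each git invocation in a helper that retries on a non-zero exit code, waiting longer between attempts. The name `run_with_retry`, its parameters, and the backoff schedule are all hypothetical. The `runner` and `sleep` parameters exist only so the retry path can be exercised without a real network failure.

```python
import subprocess
import time


def run_with_retry(cmd, max_attempts=3, wait_seconds=1.0,
                   runner=subprocess.run, sleep=time.sleep):
    """Run cmd, retrying on a non-zero exit code with exponential backoff.

    Hypothetical helper: the name, defaults, and backoff schedule are
    illustrative assumptions, not taken from the project.
    """
    if max_attempts < 1:
        raise ValueError("max_attempts must be at least 1")

    last_error = None
    for attempt in range(1, max_attempts + 1):
        try:
            # check=True turns a non-zero exit code into CalledProcessError.
            return runner(cmd, check=True, capture_output=True, text=True)
        except subprocess.CalledProcessError as exc:
            last_error = exc
            if attempt < max_attempts:
                # Wait 1x, 2x, 4x, ... wait_seconds before the next attempt.
                sleep(wait_seconds * 2 ** (attempt - 1))
    # Every attempt failed: surface the last error to the caller.
    raise last_error
```

A call such as `run_with_retry(["git", "ls-remote", repo_url])` would then tolerate a couple of transient connection failures before failing the run, where `repo_url` is whatever remote the test needs. Passing a fake `runner` that fails a fixed number of times is one way to observe the retry behaviour deterministically.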
Suggested Tests
This task is specifically for tests -- we can force a test to fail in a commit and observe that the retry works as expected at the integration group level
Impact to Other Teams
Adapters team won't be impacted but may be interested if we come up with a solution
Will backports be required?
Backport as far as we can to reduce this noise
Context
log output from test failing on tox
Sample of tests marked as flaky but are likely just connection issues. There may not be a solution when there's a longer GitHub outage. Look through #7808 for other possible failures.
#9906
max retries exceeded
#9905
#9903
timeout
#9902
#9900
Note: integration tests are run with the workflow_dispatch trigger in scheduled testing here. Typically they would be run with the workflow_call trigger, but they aren't because it's special (comment)