Work around codecov.io connection failures #396

Closed
wants to merge 11 commits into from

Conversation

Contributor

@rly rly commented Jul 9, 2020

Motivation

codecov.io has known intermittent connection issues: around 10% of the time, coverage reports from CI services fail to upload to codecov.io. See:
codecov/codecov-python#158
https://community.codecov.io/t/unreliable-coverage-report-uploads/322
https://community.codecov.io/t/github-pr-not-being-updated-with-final-coverage-data/1425/15
h5py/h5py#1398

The most likely cause is that the codecov server is intermittently blocking requests from the CI machine, either through rate limiting or a longer-term ban on the IP address. When testing this PR, re-running the codecov script on failure did not seem to resolve the issue. The error was:

Error: HTTPSConnectionPool(host='codecov.io', port=443): Max retries exceeded with url: /upload/v4<...> (Caused by NewConnectionError('<urllib3.connection.HTTPSConnection object at 0x000001FA35482DC0>: Failed to establish a new connection: [WinError 10060] A connection attempt failed because the connected party did not properly respond after a period of time, or established connection failed because connected host has failed to respond'))

The error may also result from proxy or network issues, though this seems less likely given the repeated failures and other people's experiences. If so, this might not be solvable on our end, and the only resolution might be to re-run the CI when uploading to codecov fails.

This PR tries re-running the codecov script on failure, after a delay to get around rate limits and intermittent network failures, but really, I think the problem needs to be addressed by codecov.
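The retry-after-delay idea described above can be sketched in Python roughly as follows. This is a minimal illustration, not the change made in this PR: the function name `run_with_retries`, its parameters, the default of three attempts, and the 30-second delay are all assumptions chosen for the example, and `["codecov"]` in the usage comment assumes the uploader is available as a command on the CI machine.

```python
import subprocess
import sys
import time


def run_with_retries(command, attempts=3, delay_seconds=30.0,
                     runner=subprocess.run, sleep=time.sleep):
    """Run `command` until it exits with status 0 or the attempts run out.

    `runner` and `sleep` are injectable so the retry logic can be exercised
    without launching a real process or actually waiting.
    Returns the exit code of the last attempt (1 if `attempts` is 0).
    """
    returncode = 1
    for attempt in range(1, attempts + 1):
        returncode = runner(command).returncode
        if returncode == 0:
            return 0
        if attempt < attempts:
            # Wait before the next try; a fixed delay is an assumption here.
            print(f"Attempt {attempt} of {attempts} failed "
                  f"(exit code {returncode}); retrying in {delay_seconds}s",
                  file=sys.stderr)
            sleep(delay_seconds)
    return returncode


# Hypothetical usage in a CI step (the command name is an assumption):
#     sys.exit(run_with_retries(["codecov"], attempts=3, delay_seconds=30.0))
```

A wrapper like this only helps with transient failures. If the server is refusing connections from the CI machine outright, every attempt fails and the job takes longer by roughly `(attempts - 1) * delay_seconds`.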

Checklist

  • Have you checked our Contributing document?
  • Have you ensured the PR clearly describes the problem and the solution?
  • Is your contribution compliant with our coding style? This can be checked by running flake8 from the source directory.
  • Have you checked to ensure that there aren't other open Pull Requests for the same change?
  • Have you included the relevant issue number using "Fix #XXX" notation where XXX is the issue number? By including "Fix #XXX" you allow GitHub to close issue #XXX when the PR is merged.

@rly rly requested a review from a team July 9, 2020 00:20
oruebel
oruebel previously approved these changes Jul 9, 2020

codecov bot commented Jul 9, 2020

Codecov Report

Merging #396 into dev will not change coverage.
The diff coverage is n/a.

@@           Coverage Diff           @@
##              dev     #396   +/-   ##
=======================================
  Coverage   75.70%   75.70%           
=======================================
  Files          33       33           
  Lines        6651     6651           
  Branches     1454     1454           
=======================================
  Hits         5035     5035           
  Misses       1216     1216           
  Partials      400      400           

Legend: Δ = absolute <relative> (impact), ø = not affected, ? = missing data
Last update 7eff8c7...a17ce44.

@rly rly marked this pull request as ready for review July 9, 2020 04:35
@rly rly requested a review from oruebel July 9, 2020 07:40
Contributor Author

rly commented Jul 22, 2020

Analysis of the failed CI runs shows that the attempted fix here does not resolve the problem. It is likely that the codecov server has blocked some CI servers from connecting to it, so even repeated attempts after a delay fail and simply slow down CI. I am closing this PR; it can be reopened if the issue returns and these changes turn out to resolve it.

@rly rly closed this Jul 22, 2020
@rly rly deleted the fx/codecov_flaky branch September 8, 2022 00:39