Fail PRs on links that became broken, but create issues for external links that just started breaking? #134
Not that I know of. #17 is at least related. If we introduce an argument for only checking changes, we might as well add one for new files.

Once we have that functionality you could have two runs: one for new links and one for existing links.
Scope: PR diff (only links from changes)

Your PR-triggered workflow can provide lychee with just the changed content. It won't work reliably for some link syntax, such as reference-style links whose definition sits outside the changed lines.
For those it's probably a bit more reliable to at least check each entire document affected, by passing the relevant files as inputs to lychee.

Scope: Production branch (scheduled checks)

For the other concern, a scheduled workflow can run hourly or daily to check the production branch content. If you have any links to GitHub services that get checked, these appear to be rate-limited, and I think that hitting that limit may risk impacting other workflows.
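For the PR-diff scope above, collecting the affected files to pass as explicit inputs might be scripted roughly like this. This is only a sketch: the helper names and the `origin/main` base ref are assumptions, not part of lychee or lychee-action.

```python
import subprocess


def changed_files(base_ref: str) -> list[str]:
    """Return paths changed between base_ref and HEAD (requires git)."""
    out = subprocess.run(
        ["git", "diff", "--name-only", f"{base_ref}...HEAD"],
        capture_output=True, text=True, check=True,
    ).stdout
    return [line for line in out.splitlines() if line]


def markdown_only(paths: list[str]) -> list[str]:
    """Keep only Markdown documents, so whole affected files get checked."""
    return [p for p in paths if p.endswith((".md", ".markdown"))]

# Usage sketch: pass the result to lychee as explicit inputs, e.g.
#   lychee $(this script's output)
# files = markdown_only(changed_files("origin/main"))
```

Checking the whole file rather than the diff hunks avoids the reference-style-link problem at the cost of occasionally failing on pre-existing broken links in a touched document.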
Ignoring known broken links from previous checks

In both cases, you could store the broken links detected on your production branch (and optionally reported in an issue by a bot) into a file that these workflows cache and restore, reading it as a list of links to ignore/exclude if necessary. If broken links are rare, though, it may be simpler to maintain the exclusion list yourself.
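The ignore-file idea could be sketched like this, assuming a plain-text file with one known-broken URL per line (the file name and function names are hypothetical, not lychee features):

```python
from pathlib import Path


def load_known_broken(path: Path) -> set[str]:
    """Read previously reported broken links, one URL per line."""
    if not path.exists():
        return set()
    return {line.strip() for line in path.read_text().splitlines() if line.strip()}


def new_failures(broken_now: set[str], known: set[str]) -> set[str]:
    """Links that just started breaking and should actually be reported."""
    return broken_now - known

# Example: with known = {"https://example.com/old"}, only
# "https://example.com/new" from the latest run counts as a new failure.
```

The PR workflow would fail on `new_failures`, while the already-known links stay quarantined in the cached file until someone fixes or removes them.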
Is it possible to use the cache feature to require a pre-existing link to be broken for, say, 3 days in a row before reporting it as a failure? That way, if a site goes down for a little bit or returns an intermittent 503, it doesn't cause the link checker to fail.
The cache file is there to avoid repeated checks. For example, if the cache is valid for 24h and you run lychee multiple times within that window, it would not re-check a link but immediately return the cached result. In that sense it's the opposite of what you want. Instead, you could run lychee without a cache and store the output in a file, which you can then cache on GitHub. On the next run you would download that file again and compare its contents with the latest run to determine which links have been broken for longer. This would require some manual scripting work.
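That manual scripting might look something like this: keep a JSON file of consecutive-failure counts per URL in the workflow cache, and only fail once a link has been broken for N runs in a row. All names here are assumptions for illustration, not lychee features.

```python
import json
from pathlib import Path

THRESHOLD = 3  # e.g. three daily runs in a row


def update_streaks(history_file: Path, broken_now: set[str]) -> dict[str, int]:
    """Increment streaks for links still broken; links that recovered drop out."""
    history = json.loads(history_file.read_text()) if history_file.exists() else {}
    streaks = {url: history.get(url, 0) + 1 for url in broken_now}
    history_file.write_text(json.dumps(streaks, indent=2))
    return streaks


def persistent_failures(streaks: dict[str, int], threshold: int = THRESHOLD) -> list[str]:
    """Links broken for at least `threshold` consecutive runs."""
    return sorted(url for url, n in streaks.items() if n >= threshold)
```

The workflow would restore `history_file` from the cache, feed in the latest broken-link set parsed from lychee's output, and fail (or open an issue) only on `persistent_failures`.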
Thank you all for the valuable input on this topic. After reviewing the discussion, it's clear that the core challenge here revolves around managing broken links both in PRs and on the production branch.

However, it's important to note that implementing some of these suggestions extends beyond the current scope of lychee-action. Specifically, the idea of requiring a link to be broken for a certain duration before reporting it as a failure, while interesting, involves a level of complexity and manual scripting that is currently out of scope. As a workaround, I recommend the approach discussed above of having two separate runs for link checking: one for new links in PRs, and a scheduled one for existing links. Focusing on a workaround seems the best course of action.
Is there an obvious recommended way to report issues for any link that became broken over time, and fail the PR for any introduced links that are immediately broken?
I want to specifically block any PRs that introduce broken links (mostly internal but doesn't have to be), and at the same time not break the master build if a website becomes unavailable.