Commit
This commit does not belong to any branch on this repository, and may belong to a fork outside of the repository.
Auto merge of #13946 - iliana:13555-object-not-found, r=weihanglo
fetch specific commits even if the github fast path fails ### What does this PR try to resolve? This PR fixes #13555, which describes a regression from 1.64.0 to 1.65.0 where the inability to fetch commit information from api.github.com (the "GitHub fast path") silently changes Cargo's behavior. Cargo can fetch a specific Git commit from a remote without having to fetch all refs. Prior to #10807, this functionality required a repository hosted on github.com and providing the full commit hash (usually available from the Cargo.lock); after that change, any revision (including abbreviated revisions) that could be resolved by GitHub's API could be fetched directly. However, this logic requires the "GitHub fast path", which was not intended to be robust, to successfully return the resolved commit hash; if a client is currently rate-limited by api.github.com (very common in CI and shared cloud / corporate environments) this fails and Cargo falls back to fetching all refs. Usually this is not noticeable. However, GitHub allows fetching commits that are related to the repository but not actually part of any of its refs, including commits pushed to a fork. This results in the same command working fine in some environments where api.github.com is accessible, and not working in other environments that are rate-limited, which is very confusing and difficult to debug. This change adds another branch to cover the regression case: if we are going through the GitHub fast path with a full commit hash, return early indicating that we need to fetch it. (Previously: ~~when the GitHub fast path was unsuccessful, the user is not using the unstable shallow clone options, and we have a full commit hash and expect to be able to fetch it directly because we know it's a github.com repository.~~) ### How should we test and review this PR? I have been testing this PR by temporarily adding a `0.0.0.0 api.github.com` entry to my `/etc/hosts`, which causes the GitHub fast path to always fail, then running: ``` target/release/cargo install --git https://github.com/haha-business/unstable-test-repo.git --rev c9040898c9183ddbb9402dcbf749ed06d6ea90ad ``` This refers to [a particular commit on a fork of the repo](iliana/unstable-test-repo@c904089) which won't be found by the fallback path or current Cargo. **Note** that you will need to delete `~/.cargo/git/checkouts/unstable-test-repo-*` and `~/.cargo/git/db/unstable-test-repo-*` after a successful run with this change in order to reproduce the broken behavior of the current release. I am having trouble getting the test suite to run at all on my system so I haven't experimented with writing a specific test for this case, but I probably should. ### Additional information This uses the same logic as the unstable shallow clone support to detect if the revision is a full commit hash. This is not compatible with SHA-256 commit hashes; `git2::Oid` specifically expects a 40-character hexadecimal string. Given that the change introducing this bug was meant to future-proof SHA-256 support (despite only doing so for GitHub repositories), it might be good to make the logic more explicit within Cargo and allow either 40- or 64-character hex strings. I wanted to keep this change focused on the regression fix, but in testing, pretty much every Git repository I could think of (including non-forges, like git.kernel.org and some repositories I host on my own infrastructure with cgit) supports fetching directly from a commit, so it would be ideal to eventually relax the GitHub requirement for this functionality. However, it would need some sort of fallback logic because I suspect the HTTP [dumb protocol](https://git-scm.com/book/en/v2/Git-Internals-Transfer-Protocols) doesn't support commit references, and I haven't researched when this functionality was added to the smart protocol.
- Loading branch information