proposal: cmd/go: reduce calls to module proxy /list for deprecation checks in run / install #71414

Open
redloaf opened this issue Jan 24, 2025 · 3 comments

@redloaf

redloaf commented Jan 24, 2025

Proposal Details

Background

We run a fairly large (~16TB/day) go module proxy and have noticed a recent increase in requests to the list endpoint, which is inherently not long-term cacheable. Large public proxies cache responses with a short TTL (e.g. the official proxy.golang.org caches for 60s). For us, using Athens with S3, this traffic increase caused us to hit rate limits of the S3 ListObjectsV2 API, increasing latency, causing timeouts, and failing build steps.

The timing of this seemed to correlate with increasing adoption of go1.23. We traced this increase in list API calls to a new behavior added in go run and go install wherein these tools now check if the requested module has been deprecated. You can see this if you install an older version of a deprecated library. For example, go install -x github.com/golang/protobuf/[email protected] prints these additional lines when running go1.23:

# get https://proxy.golang.org/github.com/golang/protobuf/@v/list
# get https://proxy.golang.org/github.com/golang/protobuf/@v/list: 200 OK
# get https://proxy.golang.org/github.com/golang/protobuf/@v/v1.5.4.info
# get https://proxy.golang.org/github.com/golang/protobuf/@v/v1.5.4.info: 200 OK
# get https://proxy.golang.org/github.com/golang/protobuf/@v/v1.5.4.mod
# get https://proxy.golang.org/github.com/golang/protobuf/@v/v1.5.4.mod: 200 OK
go: module github.com/golang/protobuf is deprecated: Use the "google.golang.org/protobuf" module instead.
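
For readers unfamiliar with where the message above comes from: per the Go modules reference, a module is marked deprecated with a `// Deprecated:` comment attached to the `module` directive in the go.mod of its latest version, which is why the list, .info, and .mod requests appear together. Below is a minimal sketch of that lookup using only the standard library. The function name `deprecationMessage` and the sample go.mod text are illustrative assumptions; it only reads the first line of the message and is not how cmd/go implements the check.

```go
package main

import (
	"fmt"
	"strings"
)

// deprecationMessage returns the text following "Deprecated:" in the comment
// attached to the module directive, or "" if there is none.
// Simplification (assumption): only a comment block directly above the module
// line, or a trailing comment on that line, is considered, and only the first
// line of the message is returned.
func deprecationMessage(goMod string) string {
	var comments []string // comment lines directly above the current line
	for _, line := range strings.Split(goMod, "\n") {
		line = strings.TrimSpace(line)
		switch {
		case strings.HasPrefix(line, "//"):
			comments = append(comments, strings.TrimSpace(strings.TrimPrefix(line, "//")))
		case strings.HasPrefix(line, "module "):
			if _, trailing, ok := strings.Cut(line, "//"); ok {
				comments = append(comments, strings.TrimSpace(trailing))
			}
			for _, c := range comments {
				if msg, ok := strings.CutPrefix(c, "Deprecated:"); ok {
					return strings.TrimSpace(msg)
				}
			}
			return ""
		default:
			comments = nil // a blank line or another directive ends the comment block
		}
	}
	return ""
}

func main() {
	// Illustrative go.mod text, not copied from the real repository.
	goMod := `// Deprecated: Use the "google.golang.org/protobuf" module instead.
module github.com/golang/protobuf

go 1.17
`
	fmt.Println(deprecationMessage(goMod))
}
```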

Proposal

This is a nice feature, but it would be better if it was one we could opt into or out of. Some unattended workloads install or run specific versions of packages, and the addition of a call to a no-cache or low-TTL go module proxy endpoint has reduced the reliability of the go install and go run commands in these workloads. We handle deprecated modules using separate tooling, so these deprecation messages are largely ignored.

Alternatively, the go module proxy API could be extended to include a version deprecation check endpoint and could maybe fall back to the go 1.23 behavior if that endpoint responds with an error. This would allow for longer-duration caching of deprecation check results without impeding the freshness of the list endpoint.

Relatedly, it seems any go command that fetches modules could benefit from retries with exponential backoff. The deprecation checks in go 1.23 result in more overall requests to the go module proxy and now include a call to the most error-prone endpoint. Especially if the current behavior is preserved, failed requests should be retried, perhaps after trying all subsequent proxies.
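
To make the retry idea concrete, here is a rough sketch of "try every proxy, then back off and try again". Everything in it is hypothetical: `fetchFunc`, `fetchWithRetry`, `errUnavailable`, and the parameters are names invented for illustration, and it does not reflect how cmd/go handles the GOPROXY list today.

```go
package main

import (
	"errors"
	"fmt"
	"time"
)

// errUnavailable is a placeholder error used by the example fetcher.
var errUnavailable = errors.New("proxy unavailable")

// fetchFunc fetches one path from one proxy (hypothetical signature).
type fetchFunc func(proxy, path string) ([]byte, error)

// fetchWithRetry tries each proxy in order. If every proxy fails, it sleeps
// and starts another round, doubling the delay after each failed round.
func fetchWithRetry(fetch fetchFunc, proxies []string, path string, rounds int, base time.Duration) ([]byte, error) {
	if len(proxies) == 0 {
		return nil, errors.New("no proxies configured")
	}
	if rounds < 1 {
		rounds = 1
	}
	var errs []error
	delay := base
	for round := 0; round < rounds; round++ {
		if round > 0 {
			time.Sleep(delay)
			delay *= 2 // exponential backoff between rounds
		}
		for _, p := range proxies {
			body, err := fetch(p, path)
			if err == nil {
				return body, nil
			}
			errs = append(errs, fmt.Errorf("round %d, %s: %w", round, p, err))
		}
	}
	return nil, errors.Join(errs...)
}

func main() {
	calls := 0
	// A fake fetcher that fails twice and then succeeds.
	flaky := func(proxy, path string) ([]byte, error) {
		calls++
		if calls < 3 {
			return nil, errUnavailable
		}
		return []byte("v1.0.0\n"), nil
	}
	body, err := fetchWithRetry(flaky, []string{"proxy-a", "proxy-b"}, "example.com/m/@v/list", 3, 10*time.Millisecond)
	fmt.Printf("%q %v (calls=%d)\n", body, err, calls)
}
```

Retrying only after all proxies have been tried keeps the fast path unchanged and adds delay only when every proxy has failed.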

@gopherbot gopherbot added this to the Proposal milestone Jan 24, 2025
@ianlancetaylor ianlancetaylor changed the title proposal: cmd/go/internal/load: go install/run deprecation check module proxy behavior proposal: cmd/go: go install/run deprecation check module proxy behavior Jan 24, 2025
@ianlancetaylor ianlancetaylor added the GoCommand cmd/go label Jan 24, 2025
@ianlancetaylor
Member

CC @matloob @samthanawalla

@seankhliao
Member

seankhliao commented Jan 24, 2025

isn't this an Athens deficiency of not separating index/metadata (which is requested much more frequently and would benefit from a local db) from full module storage?

Since Athens is also responsible for populating its backing storage, it should be capable of generating an up to date list only when modules change, not on every client call.

I also don't see how a new endpoint would reduce load

@seankhliao seankhliao changed the title proposal: cmd/go: go install/run deprecation check module proxy behavior proposal: cmd/go: reduce calls to module proxy /list for deprecation checks in run / install Jan 24, 2025
@redloaf
Author

redloaf commented Jan 24, 2025

isn't this an Athens deficiency of not separating index/metadata (which is requested much more frequently and would benefit from a local db) from full module storage?

I'm not able to address this completely. Athens has an index, but it's maybe more like https://index.golang.org/index and (AFAIK) isn't used for serving list requests. I logged an issue, gomods/athens#2027, to bring this to the attention of the Athens team. I'd be interested in their feedback on this.

Since Athens is also responsible for populating its backing storage, it should be capable of generating an up to date list only when modules change, not on every client call.

At least in our setup, Athens behaves like a read-through cache. Configuration options such as NetworkMode=offline can make Athens serve listings using only the versions it already knows about. Without a separate mechanism to update that list, though, this would break the new behavior, because newer versions would not be fetched from VCS to determine if the version being requested is deprecated. We've contemplated adopting that for some workloads; however, as mentioned, we've already hit storage system rate limits for listings, so we would need a bespoke index or cache for listings.
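
As a rough illustration of what such a bespoke listing index could look like, the sketch below keeps the known versions per module in memory, updates them only when a module version is stored, and renders the one-version-per-line body that the proxy list endpoint returns, so that serving a listing never touches object storage. The type and method names (`versionIndex`, `Add`, `List`) are invented for this example and are not part of Athens.

```go
package main

import (
	"fmt"
	"strings"
	"sync"
)

// versionIndex maps a module path to the versions the proxy has stored.
// Hypothetical type; a real deployment would likely persist this in a database.
type versionIndex struct {
	mu       sync.RWMutex
	versions map[string][]string
}

func newVersionIndex() *versionIndex {
	return &versionIndex{versions: make(map[string][]string)}
}

// Add records a version when the proxy stores it. It reports whether the
// version was new, so the write path runs once per stored version rather than
// once per client request.
func (x *versionIndex) Add(module, version string) bool {
	x.mu.Lock()
	defer x.mu.Unlock()
	for _, v := range x.versions[module] {
		if v == version {
			return false
		}
	}
	x.versions[module] = append(x.versions[module], version)
	return true
}

// List renders a list response body: one known version per line.
func (x *versionIndex) List(module string) string {
	x.mu.RLock()
	defer x.mu.RUnlock()
	vs := x.versions[module]
	if len(vs) == 0 {
		return ""
	}
	return strings.Join(vs, "\n") + "\n"
}

func main() {
	idx := newVersionIndex()
	idx.Add("example.com/m", "v1.0.0")
	idx.Add("example.com/m", "v1.1.0")
	fmt.Print(idx.List("example.com/m"))
}
```

This only covers versions the proxy has already stored; discovering newer upstream versions would still need the separate update mechanism mentioned above.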

I also don't see how a new endpoint would reduce load

It is common practice to cache list API calls for only a short period of time because the freshness of the data is important. Adding a new API call would offer an opportunity to cache responses with a higher TTL and/or treat certain modules differently. For example, a version listing that's stale for a minute is fine, but one that's stale for a day is probably not ideal. Contrast that with the deprecation status of a module, where a day (or even a week or more) of staleness is probably fine. The higher TTL would reduce load.

On the topic of treating other modules differently: we never mark our internal modules deprecated, so performing the check on them isn't useful. We have separate tooling to upgrade versions and to ensure modules aren't referenced if their VCS repos are archived. However, since go 1.23 was released, we've seen higher load on our VCS system as a result of these additional list calls, as the tooling checks for a condition that, for our internal modules, will never be true.
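
The TTL argument can be sketched as a per-endpoint cache policy on the proxy side. Note that the `/@v/deprecated` path is purely hypothetical (it stands in for the proposed deprecation endpoint, which does not exist in the proxy protocol), and the constants and the function name `cacheTTL` are assumptions chosen to mirror the numbers discussed in this comment.

```go
package main

import (
	"fmt"
	"strings"
	"time"
)

const (
	// listTTL: version listings should be at most about a minute stale.
	listTTL = time.Minute
	// deprecationTTL: deprecation status can tolerate a day of staleness.
	deprecationTTL = 24 * time.Hour
)

// cacheTTL picks a cache lifetime from the request path.
// "/@v/deprecated" is a hypothetical endpoint used only for illustration.
func cacheTTL(path string) time.Duration {
	switch {
	case strings.HasSuffix(path, "/@v/deprecated"):
		return deprecationTTL
	case strings.HasSuffix(path, "/@v/list"):
		return listTTL
	default:
		// Be conservative for anything this sketch does not recognize.
		return listTTL
	}
}

func main() {
	for _, p := range []string{
		"github.com/golang/protobuf/@v/list",
		"github.com/golang/protobuf/@v/deprecated",
	} {
		fmt.Println(p, cacheTTL(p))
	}
}
```

With these example values, a deprecation lookup would reach the backend at most once per day per module instead of once per minute, which is where the load reduction would come from.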

I don't think this is isolated to Athens. I'm not sure what software proxy.golang.org runs, but you can easily make it time out by requesting a listing for a major version that doesn't exist. For example, time curl -v https://proxy.golang.org/google.golang.org/api/v$RANDOM/@v/list times out after 58 seconds, returning a 404 with the body "not found: list timed out" and a TTL of 60s.

At least in our environment, the list API call has the highest latency, the highest failure rate, and the lowest cache TTL. The new reliance on it in 1.23's tooling has, as you'd expect, resulted in higher latency and failure rates for any workflows using go install or go run.
