Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

GitHub proxy: download GitHub server keys #50891

Draft
wants to merge 2 commits into
base: STeve/48762_ssh
Choose a base branch
from

Conversation

greedy52
Copy link
Contributor

@greedy52 greedy52 commented Jan 8, 2025

@greedy52 greedy52 added the no-changelog Indicates that a PR does not require a changelog entry label Jan 8, 2025
}

func (c *githubMetadataHTTPClient) fetchFingerprints() ([]string, string, error) {
resp, err := http.Get(c.api)
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Why not use https://pkg.go.dev/github.com/google/go-github/v68/github#MetaService.Get instead? This is something that we likely only need to refresh at most every 24 hours or so.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

it's a pretty simple get call so i would prefer not to import another package. i also like the idea doing If-None-Match.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think I'd rather use pre-existing package to lower the maintenance costs, but it is true an extra dependency of this size is not trivial.

lib/srv/git/github.go Outdated Show resolved Hide resolved
lib/srv/git/github.go Outdated Show resolved Hide resolved
lib/srv/git/github.go Outdated Show resolved Hide resolved
@greedy52 greedy52 requested review from espadolini and Tener January 9, 2025 03:14

// Remove newline from ssh.MarshalAuthorizedKey.
key := strings.TrimSpace(string(ssh.MarshalAuthorizedKey(targetKey)))
if slices.Contains(m.keys, key) {
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

To avoid confusing errors, I'd check that m.keys is non-empty as a special case.

Comment on lines +60 to +66
return &githubServerKeyManager{
clock: clockwork.NewRealClock(),
apiEndpoint: "https://api.github.com/meta",
client: &http.Client{
Timeout: defaults.HTTPRequestTimeout,
},
}
Copy link
Contributor

@Tener Tener Jan 9, 2025

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

We start with an empty list of keys that will only be refreshed on first request. This increases the latency and has unhappy consequences on error. Given how often GitHub experiences hiccups, I'd prefer to run refreshLocked in a separate goroutine, with variable schedule based on current status.

For example:

  1. Empty list of known keys? Retry every minute until we get the list.
  2. Non-empty list of known keys, but last attempt at refresh failed? Retry after 15 minutes.
  3. Non-empty list of known keys and last attempt worked? Repeat after 12h.

With this we decouple update schedule from checking, and check() method becomes very simple.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It is definitely not great experience on unhappy scenarios. the api is also rate-limited.

So doing it in some schedular is preferable. However, I don't want to pull things from the api if they won't be used. so to do it properly, we need a new "service" registered to the proxy process and watch for Git servers. then only call the API when seeing github proxy is present. So i don't think it's worth it at the moment.

With the current implementation, we only need to successfully fetch once to unblock the flow. One thing i can improve with the current implementation, is to stop refreshing attempt if a failure has happened in the last 15 minutes. WDYT?

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think the requests are cheap enough it shouldn't matter if we fire a single one every few hours. On the other hand if the API is unavailable (e.g. due to firewall misconfiguration) it would be better to know that sooner rather than later.

Perhaps we can leave it as it is for now, but simply fire a single go refreshLocked() on startup?

@greedy52 greedy52 changed the title GitHub proxy: download GitHub server fingerprints GitHub proxy: download GitHub server keys Jan 9, 2025
@greedy52 greedy52 mentioned this pull request Jan 9, 2025
17 tasks
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
no-changelog Indicates that a PR does not require a changelog entry
Projects
None yet
Development

Successfully merging this pull request may close these issues.

3 participants