[BUG] Intermittent Node Authorizer Forbidden Errors #4727

Open
phil-fileread opened this issue Jan 1, 2025 · 0 comments

Describe the bug

We are running the latest version of the GitHub Actions ARC runners in our AKS cluster (v1.29.1, free tier, 10.5.6.0/24 service CIDR).
Node pools: Standard_d8ads_v6 / AKSUbuntu-2204gen2containerd-202412.10.0

When the ARC runner controller spins up a container to run a GitHub Actions workflow, we often see the job fail with ECONNREFUSED 10.5.6.1:443. This typically happens in the "Initialize containers" or "Stop containers" pre/post workflow tasks.
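For context, 10.5.6.1 is the first usable address in the cluster's 10.5.6.0/24 service CIDR, which Kubernetes assigns by default to the `kubernetes` API service ClusterIP. The refused connections therefore look like in-cluster calls to the API server. A minimal sketch that checks the address arithmetic, using only the values quoted above:

```python
import ipaddress

# Values taken from this report.
service_cidr = ipaddress.ip_network("10.5.6.0/24")
refused_addr = ipaddress.ip_address("10.5.6.1")

# First usable host address in the service CIDR.
first_host = next(iter(service_cidr.hosts()))

print(refused_addr in service_cidr)  # True
print(refused_addr == first_host)    # True
```

This only confirms which address is being refused; it does not explain why the connection fails.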

At this point, it is not clear what triggers this behaviour.

That said, we see the following logs via diagnostic settings:

I0101 16:13:23.562959 1 node_authorizer.go:205] "NODE DENY" err="node 'aks-cicd-26340822-vmss000006' cannot get unknown secret arc-runners/arc-runner-set-redacted-lr4hc-runner-fqw7h"
We also see the following entries in the systemd journal for the kubelet.

Note that the first entry below is an error on a resource unrelated to the GHA ARC runners, which suggests this is not necessarily scoped to GHA runners and is therefore a broader AKS issue.

Jan 01 02:31:24 aks-cicd-26340822-vmss000000 kubelet[2812]: E0101 02:31:24.056768 2812 reflector.go:147] object-"kube-system"/"metrics-server-config": Failed to watch *v1.ConfigMap: failed to list *v1.ConfigMap: configmaps "metrics-server-config" is forbidden: User "system:node:aks-cicd-26340822-vmss000000" cannot list resource "configmaps" in API group "" in the namespace "kube-system": no relationship found between node 'aks-cicd-26340822-vmss000000' and this object

Jan 01 16:19:51 aks-cicd-26340822-vmss000003 kubelet[2935]: E0101 16:19:51.737343 2935 reflector.go:147] object-"arc-runners"/"arc-runner-set-redacted-lr4hc-runner-czzj2": Failed to watch *v1.Secret: failed to list *v1.Secret: secrets "arc-runner-set-redacted-lr4hc-runner-czzj2" is forbidden: User "system:node:aks-cicd-26340822-vmss000003" cannot list resource "secrets" in API group "" in the namespace "arc-runners": no relationship found between node 'aks-cicd-26340822-vmss000003' and this object
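To see whether the denials cluster on particular nodes or objects, the kubelet entries can be parsed and counted. The following is a rough sketch, not a tested tool: the regular expression is an assumption derived from the two entries above, and the helper names `parse_forbidden` and `count_by_node` are hypothetical.

```python
import re
from collections import Counter

# Matches the "is forbidden" portion of a kubelet reflector error.
# Pattern inferred from the log entries in this report (assumption).
FORBIDDEN = re.compile(
    r'(?P<resource>\w+) "(?P<name>[^"]+)" is forbidden: '
    r'User "system:node:(?P<node>[^"]+)" cannot (?P<verb>\w+) resource '
    r'"[^"]+" in API group "[^"]*" in the namespace "(?P<namespace>[^"]+)"'
)

def parse_forbidden(line):
    """Return the denial fields as a dict, or None if the line does not match."""
    match = FORBIDDEN.search(line)
    return match.groupdict() if match else None

def count_by_node(lines):
    """Count denials per (node, namespace, resource) across journal lines."""
    counts = Counter()
    for line in lines:
        fields = parse_forbidden(line)
        if fields:
            counts[(fields["node"], fields["namespace"], fields["resource"])] += 1
    return counts

# Copy of the first kubelet entry above with the timestamp prefix trimmed.
SAMPLE = (
    'configmaps "metrics-server-config" is forbidden: '
    'User "system:node:aks-cicd-26340822-vmss000000" cannot list resource '
    '"configmaps" in API group "" in the namespace "kube-system": '
    "no relationship found between node 'aks-cicd-26340822-vmss000000' and this object"
)

print(parse_forbidden(SAMPLE))
print(count_by_node([SAMPLE, "unrelated line"]))
```

Feeding it the full kubelet journal from each node would give a per-node tally that can be compared against when pods were scheduled onto those nodes.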

Of note, since cluster initialization we have not adjusted any cluster settings related to the node authorizer or to the additional permissions that nodes seem to receive via RBAC.

To Reproduce
We have not been able to reproduce this issue reliably.

Expected behavior
The runner pods should run to completion without these errors.


Environment (please complete the following information):
Client Version: v1.30.2
Kustomize Version: v5.0.4-0.20230601165947-6ce0bf390ce3
Server Version: v1.29.4
AKS free tier
10.5.6.0/24 service CIDR
Node pools: Standard_d8ads_v6 / AKSUbuntu-2204gen2containerd-202412.10.0

