Describe the bug
We are running the latest version of the GitHub Actions ARC runners in our AKS cluster (v1.29.1, free tier, 10.5.6.0/24 service CIDR).
Node pools: Standard_d8ads_v6 / AKSUbuntu-2204gen2containerd-202412.10.0
When the ARC runner controller spins up a container to run a GitHub Actions workflow, we often see the job fail with ECONNREFUSED 10.5.6.1:443. This typically happens in the "Initialize containers" or "Stop containers" pre/post workflow tasks.
At this point, it is not clear what triggers this behaviour.
That said, we see the following logs via diagnostic settings:
I0101 16:13:23.562959 1 node_authorizer.go:205] "NODE DENY" err="node 'aks-cicd-26340822-vmss000006' cannot get unknown secret arc-runners/arc-runner-set-redacted-lr4hc-runner-fqw7h"
And the following logs in the systemd journal logs for the kubelet:
Note that the following logs also show an error on a resource unrelated to the GHA ARC runners, which suggests this is not necessarily scoped to GHA runners and is therefore a broader AKS issue.
Jan 01 02:31:24 aks-cicd-26340822-vmss000000 kubelet[2812]: E0101 02:31:24.056768 2812 reflector.go:147] object-"kube-system"/"metrics-server-config": Failed to watch *v1.ConfigMap: failed to list *v1.ConfigMap: configmaps "metrics-server-config" is forbidden: User "system:node:aks-cicd-26340822-vmss000000" cannot list resource "configmaps" in API group "" in the namespace "kube-system": no relationship found between node 'aks-cicd-26340822-vmss000000' and this object
Jan 01 16:19:51 aks-cicd-26340822-vmss000003 kubelet[2935]: E0101 16:19:51.737343 2935 reflector.go:147] object-"arc-runners"/"arc-runner-set-redacted-lr4hc-runner-czzj2": Failed to watch *v1.Secret: failed to list *v1.Secret: secrets "arc-runner-set-redacted-lr4hc-runner-czzj2" is forbidden: User "system:node:aks-cicd-26340822-vmss000003" cannot list resource "secrets" in API group "" in the namespace "arc-runners": no relationship found between node 'aks-cicd-26340822-vmss000003' and this object
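To check whether these denials cluster on particular nodes or namespaces, the kubelet journal lines can be tallied. The following is only a minimal sketch: the regular expression is written against the two lines quoted above, and `summarize_denials` is a hypothetical helper name, not part of any Kubernetes tooling.

```python
import re
from collections import Counter

# Pattern derived from the kubelet reflector lines quoted above (an assumption:
# other kubelet versions may word the message differently).
FORBIDDEN = re.compile(
    r'User "system:node:(?P<node>[^"]+)" cannot (?P<verb>\w+) resource '
    r'"(?P<resource>[^"]+)" in API group "[^"]*" '
    r'in the namespace "(?P<namespace>[^"]+)"'
)


def summarize_denials(lines):
    """Count node-authorizer denials per (node, namespace, resource)."""
    counts = Counter()
    for line in lines:
        match = FORBIDDEN.search(line)
        if match:
            key = (match["node"], match["namespace"], match["resource"])
            counts[key] += 1
    return counts
```

Passing the kubelet journal lines from each node to `summarize_denials` (for example via `sys.stdin`) would show whether the denials are limited to the `arc-runners` namespace or also hit `kube-system`, as the first line above suggests.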
Of note, we have not adjusted any cluster settings related to the node authorizer, or to the additional permissions nodes seem to receive via RBAC, since cluster initialization.
To Reproduce
We cannot seem to reliably reproduce this issue.
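Because the failure is intermittent, one way to narrow it down might be to probe the refused endpoint repeatedly from a pod on an affected node and record when connections are refused. This is a rough sketch only: `probe` and `watch` are hypothetical helper names, and treating 10.5.6.1:443 as the in-cluster API server address is an assumption based on the service CIDR given above.

```python
import errno
import socket
import time


def probe(host, port, timeout=2.0):
    """Attempt one TCP connect and classify the outcome."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return "ok"
    except ConnectionRefusedError:
        return "refused"  # corresponds to ECONNREFUSED
    except socket.timeout:
        return "timeout"
    except OSError as exc:
        return "error:" + errno.errorcode.get(exc.errno, str(exc.errno))


def watch(host, port, attempts, interval=1.0):
    """Yield (unix_timestamp, outcome) for a fixed number of probes."""
    for _ in range(attempts):
        yield time.time(), probe(host, port)
        time.sleep(interval)
```

For example, `for ts, outcome in watch("10.5.6.1", 443, attempts=600): print(ts, outcome)` would give timestamps that can be lined up against the "NODE DENY" and kubelet log entries.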
Expected behavior
The runner pods should complete their jobs without connection failures.
Environment
Client Version: v1.30.2
Kustomize Version: v5.0.4-0.20230601165947-6ce0bf390ce3
Server Version: v1.29.4
AKS free tier
10.5.6.0/24 service CIDR
Node pools: Standard_d8ads_v6 / AKSUbuntu-2204gen2containerd-202412.10.0