Apologies if this is not an eksctl issue and is instead related to my own environment, but after much fruitless research I am somewhat desperate.
I have just started using EKS and set up a cluster with "eksctl create cluster" as per the README/guide. This all seems to work; I just used the default options.
I run "kubectl get nodes" and it reports the nodes in my cluster as Ready.
I leave the cluster alone; I don't change any config, nor do I deploy anything to it.
All is well until, around 30 minutes later, the nodes go NotReady.
At the same time, lots of new authorization failures start appearing in the instance system logs (I use "journalctl -u kubelet" to watch them).
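For reference, this is how I watch the failures on a node (a sketch; it assumes shell access to the instance, e.g. via SSH or an SSM session):

```sh
# Follow the kubelet unit's logs; the authorization failures appear here
journalctl -u kubelet -f
```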
So the kubelet process has seemingly lost its authorization to interact with the cluster after some timeout.
I have clearly missed something, but what?
Can you see the details of the authorization failure (e.g. the IAM role involved)? I hit a similar issue, but in my case it was due to kubernetes-sigs/aws-iam-authenticator#268, and I had updated the aws-auth ConfigMap after cluster creation.
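To see which identities are currently mapped, something like this should work (a sketch; replace <cluster-name> with your cluster's name):

```sh
# List the IAM identity mappings eksctl set up for the cluster
eksctl get iamidentitymapping --cluster <cluster-name>

# Or inspect the raw aws-auth ConfigMap directly
kubectl get configmap aws-auth -n kube-system -o yaml
```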
I see lots of authorization failures. In the CloudWatch logs I spotted one that says the ARN is not mapped, and it names the AmazonSSMRoleForInstancesQuickSetup role.
That role is the one returned by get-caller-identity on the instance.
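If the instances really are assuming that role instead of the node role eksctl mapped, one way to confirm this and, as a stopgap, map the role into aws-auth would be roughly the following (a sketch; <cluster-name> and <account-id> are placeholders, and mapping the SSM role this way is a workaround rather than a proper fix):

```sh
# Confirm which IAM role the instance's credentials resolve to
aws sts get-caller-identity

# Workaround sketch: map that role so kubelet can authenticate as a node
eksctl create iamidentitymapping \
  --cluster <cluster-name> \
  --arn arn:aws:iam::<account-id>:role/AmazonSSMRoleForInstancesQuickSetup \
  --username "system:node:{{EC2PrivateDNSName}}" \
  --group system:bootstrappers \
  --group system:nodes
```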