Apologies if this is not an eksctl issue and is instead related to my own environment, but after much fruitless research I am somewhat desperate.
I have just started using EKS and set up a cluster with "eksctl create cluster" as per the README/guide. This all seems to work; I just used the default options.
I run "kubectl get nodes" and it reports the nodes in my cluster as Ready.
I leave the cluster alone; I don't change any config, nor do I deploy anything to it.
All is well until, around 30 minutes later, the nodes go NotReady.
At the same time, lots of new authorization failures start appearing in the instance system logs (I use "journalctl -u kubelet" to watch them).
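For reference, this is how I watch the failures on a node (a sketch; it assumes shell access to the instance, e.g. via SSH or an SSM session):

```sh
# Follow the kubelet unit's logs; the authorization failures appear here
journalctl -u kubelet -f
```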
So the kubelet process has seemingly lost its authorization to interact with the cluster after some timeout.
I have clearly missed something, but what?
Can you see the details of the authorization failure (e.g. the IAM role involved)? I hit a similar issue, but in my case it was due to kubernetes-sigs/aws-iam-authenticator#268, and I had updated the aws-auth ConfigMap after cluster creation.
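To see which identities are currently mapped, something like this should work (a sketch; replace <cluster-name> with your cluster's name):

```sh
# List the IAM identity mappings eksctl set up for the cluster
eksctl get iamidentitymapping --cluster <cluster-name>

# Or inspect the raw aws-auth ConfigMap directly
kubectl get configmap aws-auth -n kube-system -o yaml
```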
I see lots of authorization failures. In the CloudWatch logs I spotted one that says the ARN is not mapped, and it names the AmazonSSMRoleForInstancesQuickSetup role.
That role is the one returned by get-caller-identity on the instance.
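If the instances really are assuming that role instead of the node role eksctl mapped, one way to confirm this and, as a stopgap, map the role into aws-auth would be roughly the following (a sketch; <cluster-name> and <account-id> are placeholders, and mapping the SSM role this way is a workaround rather than a proper fix):

```sh
# Confirm which IAM role the instance's credentials resolve to
aws sts get-caller-identity

# Workaround sketch: map that role so kubelet can authenticate as a node
eksctl create iamidentitymapping \
  --cluster <cluster-name> \
  --arn arn:aws:iam::<account-id>:role/AmazonSSMRoleForInstancesQuickSetup \
  --username "system:node:{{EC2PrivateDNSName}}" \
  --group system:bootstrappers \
  --group system:nodes
```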