Error during EKS Creation: Post "http://localhost/api/v1/namespaces/kube-system/configmaps": dial tcp 127.0.0.1:80: connect: connection refused #1280

Closed
lbornov2 opened this issue Mar 21, 2021 · 73 comments · Fixed by #1680

Comments

@lbornov2

lbornov2 commented Mar 21, 2021

Description

When creating an EKS cluster using Terraform, we get the following error:

Error: Post "http://localhost/api/v1/namespaces/kube-system/configmaps": dial tcp 127.0.0.1:80: connect: connection refused

  on .terraform/modules/deployment.eks/aws_auth.tf line 65, in resource "kubernetes_config_map" "aws_auth":
  65: resource "kubernetes_config_map" "aws_auth" {

To fix this, we have to manually run:

aws eks update-kubeconfig --name ${var.context.app_name} --region ${var.context.region}

and then:

terraform apply -auto-approve

Versions

  • Terraform: 0.12.21
  • Provider(s):
  • aws - 3.22.0
  • terraform-aws-modules/eks/aws - 14.0.0
  • terraform-aws-modules/vpc/aws - 2.61.0
  • AWS CLI: 2.0.30
  • Helm: 3.3.4
  • Kubectl: 1.19.0

Reproduction

  1. Run the terraform code in the Code Snippet to Reproduce section
  2. Run terraform init && terraform apply -auto-approve
  3. You will get the error in the description.

To fix, manually run:

  1. aws eks update-kubeconfig --name ${var.context.app_name} --region ${var.context.region}
  2. terraform apply -auto-approve

Code Snippet to Reproduce

terraform {
  required_version = ">= 0.12.21"
}

provider "aws" {
  version = "~> 3.22.0"
  region  = "${var.context.region}"
}


data "aws_availability_zones" "available" {}

resource "random_string" "suffix" {
  length  = 8
  special = false
}

module "vpc" {
  source  = "terraform-aws-modules/vpc/aws"
  version = "2.61.0"

  name                 = "${var.context.app_name}"
  cidr                 = "10.0.0.0/16"
  azs                  = data.aws_availability_zones.available.names
  private_subnets      = ["10.0.1.0/24", "10.0.2.0/24", "10.0.3.0/24"]
  public_subnets       = ["10.0.4.0/24", "10.0.5.0/24", "10.0.6.0/24"]
  enable_nat_gateway   = true
  single_nat_gateway   = true
  enable_dns_hostnames = true

  tags = {
    "kubernetes.io/cluster/${var.context.app_name}" = "shared"
  }

  public_subnet_tags = {
    "kubernetes.io/cluster/${var.context.app_name}" = "shared"
    "kubernetes.io/role/elb"                      = "1"
  }

  private_subnet_tags = {
    "kubernetes.io/cluster/${var.context.app_name}" = "shared"
    "kubernetes.io/role/internal-elb"             = "1"
  }
}

resource "aws_security_group" "all_worker_mgmt" {
  name_prefix = "${var.context.app_name}-all_worker_management"
  vpc_id      = "${module.vpc.vpc_id}"

  ingress {
    from_port = 22
    to_port   = 22
    protocol  = "tcp"

    cidr_blocks = [
      "10.0.0.0/8",
      "172.16.0.0/12",
      "192.168.0.0/16",
    ]
  }
}

module "eks" {
  source                               = "terraform-aws-modules/eks/aws"
  version                              = "14.0.0"
  cluster_name                         = "${var.context.app_name}"
  cluster_version                      = "1.19"
  subnets                              = "${module.vpc.private_subnets}"
  vpc_id                               = "${module.vpc.vpc_id}"
  cluster_create_timeout               = "30m"
  worker_groups = [
    {
      instance_type = "${var.context.kubernetes.aws.machine_type}"
      asg_desired_capacity = "${var.context.replica_count}"
      asg_min_size = "${var.context.replica_count}"
      asg_max_size  = "${var.context.replica_count}"
      root_volume_type = "gp2"
    }
  ]
  worker_additional_security_group_ids = ["${aws_security_group.all_worker_mgmt.id}"]
  map_users = var.context.iam.aws.map_users
  map_roles = var.context.iam.aws.map_roles
}

Expected behavior

The cluster gets created successfully

Actual behavior

We get this output:

Error: Post "http://localhost/api/v1/namespaces/kube-system/configmaps": dial tcp 127.0.0.1:80: connect: connection refused

  on .terraform/modules/deployment.eks/aws_auth.tf line 65, in resource "kubernetes_config_map" "aws_auth":
  65: resource "kubernetes_config_map" "aws_auth" {
@dak1n1

dak1n1 commented Mar 23, 2021

It looks like the Kubernetes provider isn't receiving a configuration. Here's how I configure mine, which is similar to the EKS module README except it's for the newer version of the Kubernetes provider. (My team recently released version 2.0 of the Kubernetes provider and it requires a slightly different config than shown in this module's README).

data "aws_eks_cluster" "default" {
  name = module.cluster.cluster_id
}

data "aws_eks_cluster_auth" "default" {
  name = module.cluster.cluster_id
}

provider "kubernetes" {
  host                   = data.aws_eks_cluster.default.endpoint
  cluster_ca_certificate = base64decode(data.aws_eks_cluster.default.certificate_authority[0].data)
  token                  = data.aws_eks_cluster_auth.default.token
}

@lbornov2
Author

@dak1n1 - The code example I provided does NOT instantiate the kubernetes provider (at least not directly). The error happens during creation of the EKS module - not before or after. So - the problem happens in the EKS module itself - not outside of the module.

@dak1n1

dak1n1 commented Mar 23, 2021

Right, but the EKS module uses the Kubernetes provider under the hood, so it needs a provider configuration. Otherwise, it will assume a default/empty config. You can skip using the Kubernetes provider within the EKS module by specifying manage_aws_auth = false in your EKS module config. That will skip the section that relies on the Kubernetes provider.

module "cluster" {
  source  = "terraform-aws-modules/eks/aws"
  version = "14.0.0" 
...
  manage_aws_auth  = false
...
}

@ArchiFleKs
Contributor

I have the same issue: on a newly created cluster everything is fine, but then I change the module tag and get this error. The weird thing is that if I put everything back exactly as it is in the state, everything works as expected.

@RobertFischer

RobertFischer commented Mar 30, 2021

@dak1n1 -- How am I supposed to instantiate the Kubernetes provider using output from this module, but before I instantiate this module?

(Also, is everyone who specifies manage_aws_auth = true experiencing this issue? If not, what's different that allows them to avoid it? In the alternative, if we are all experiencing this issue, then isn't manage_aws_auth = true straight-up broken?)

@dak1n1

dak1n1 commented Mar 30, 2021

> @dak1n1 -- How am I supposed to instantiate the Kubernetes provider using output from this module, but before I instantiate this module?
>
> (Also, is everyone who specifies manage_aws_auth = true experiencing this issue? If not, what's different that allows them to avoid it? In the alternative, if we are all experiencing this issue, then isn't manage_aws_auth = true straight-up broken?)

The issues vary depending on the user's configuration, but they're all related to a subject I'm currently researching, which is why I'm volunteering my time here. The EKS module, which is a community-driven effort (not managed by HashiCorp), is using a pattern that is discouraged by the creators of Terraform. (Specifically, where it says a provider config should only reference values that are known before the configuration is applied; that's the issue we're hitting here.)

The issues described in this bug report all have the same root cause: variables are being passed into a provider configuration that are not known at plan time. Terraform simply doesn't support that, so they encourage instead separating out the AWS provider resources and the Kubernetes provider resources, so you can use two applies when needed. However, there are some work-arounds that can still help to achieve this workflow. Since so many users want to use this in a single apply, this is my area of interest: trying to enable that pattern to succeed despite the current limitations in Terraform.

TLDR: you can copy/paste the config I gave above, which will only read the EKS cluster after the variables are known. You can also use the config in this repo's README, which is another way to establish this dependency.

However, since this pattern is not actually supported in Terraform, there will be times when it will fail. Specifically, when the EKS cluster's credentials become unknown, such as when replacing the cluster, or during destroy on Terraform 0.14.x. To avoid these errors, there are work-arounds such as terraform refresh prior to destroy, and removing the kubernetes config map from state prior to replacing/modifying the EKS cluster (I believe that would look something like terraform state rm module.cluster.kubernetes_config_map.aws-auth, but I don't know what impact that would have on the EKS worker nodes).

This page can be helpful for learning about configuring providers that are used with modules. https://www.terraform.io/docs/language/modules/develop/providers.html

I also have a couple working configs here, if anyone wants to reference them.

On the Kubernetes provider side, I have some plans to smooth this out a bit and provide more meaningful error messages. I'm also planning to implement a version of this old request that I found when researching the topic. That will allow the Kubernetes provider to keep trying to contact the Kubernetes API, rather than failing immediately with the obscure localhost error we all see. So there are some changes hopefully coming in the next few months.

@RobertFischer

RobertFischer commented Mar 30, 2021 via email

@jgournet

It probably won't apply to many people, but in case it helps: I had this config:

provider "kubernetes" {
  host                   = data.aws_eks_cluster.this.endpoint
  cluster_ca_certificate = base64decode(data.aws_eks_cluster.this.certificate_authority[0].data)
  config_context         = data.aws_eks_cluster.this.arn
  token                  = data.aws_eks_cluster_auth.this.token
}

Somehow, removing the "config_context" line made this error disappear ...
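
For reference, the block without config_context would then look like this (a sketch based on the config above):

provider "kubernetes" {
  host                   = data.aws_eks_cluster.this.endpoint
  cluster_ca_certificate = base64decode(data.aws_eks_cluster.this.certificate_authority[0].data)
  token                  = data.aws_eks_cluster_auth.this.token
}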

@charneykaye

charneykaye commented Jun 15, 2021

I have never set manage_aws_auth, yet I experienced this issue after I changed the name of my eks module.

The resolution was to:

  • use terraform state list in order to discover the resources that were stored under the legacy module name
  • use terraform state rm to remove the resources stored under the legacy module name
  • use terraform import to re-import the module's resources under the new name

@stevehipwell
Contributor

@dak1n1 this issue is being caused by hashicorp/terraform#24886 (which is also a pretty large pain point in implementing a HashiCorp-native credential flow, e.g. OIDC -> Vault STS -> AWS provider). Without this being solved, what is the logic for an aws_eks_cluster_auth token not being created until something else has been configured? The current "best practice" pattern errors when you plan and don't apply for over 15 minutes; it also errors when you've layered other code on top of this module and a control plane or managed worker change takes over 15 minutes to complete.

TL;DR - Is there a hack to make an aws_eks_cluster_auth data source wait for something else before being calculated, so we can target these tokens to be ready when they're not going to expire?
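
For illustration, a rough sketch of the kind of hack I mean (placeholder names; module-level depends_on needs Terraform 0.13+):

data "aws_eks_cluster_auth" "this" {
  name = module.eks.cluster_id

  # A depends_on in a data block defers the read to apply time, so the token is
  # only generated once the cluster (and anything else listed here) exists.
  depends_on = [module.eks]
}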

@ArchiFleKs
Contributor

@stevehipwell Do you think this might be related to the issue you mentioned above?

@stevehipwell
Contributor

@ArchiFleKs I suspect that it's related and I meant to add a comment to that effect. We're having to delete the aws_auth config map from state when we destroy a cluster, see #1280 (comment).

@ArchiFleKs
Contributor

ArchiFleKs commented Jun 21, 2021

@stevehipwell Yes my workaround is:

  • remove aws_auth from state
  • apply with manage_aws_auth=false
  • import configmap aws_auth to state
  • apply with manage_aws_auth=true

@stevehipwell
Contributor

@ArchiFleKs have you tried reverting the Kubernetes provider version?

@ArchiFleKs
Contributor

> @ArchiFleKs have you tried reverting the Kubernetes provider version?

To which version? Before v2.0?

@stevehipwell
Contributor

That'd be my first suggestion, and if it works, see if any of the v2 versions also work.
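
A sketch of what pinning back might look like in the root module (version is just an example; the source syntax needs Terraform 0.13+):

terraform {
  required_providers {
    kubernetes = {
      source = "hashicorp/kubernetes"
      # Example pre-2.0 pin, purely to test whether the provider version matters here
      version = "~> 1.13"
    }
  }
}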

@jaimehrubiks
Contributor

I am also looking into best practices, although currently I use a single terraform run to deploy some dependencies, this eks module, and a bunch of helm charts and kubernetes yaml files.

Currently, using kubernetes provider version = "~> 1.11.1" and Terraform 0.14.11 (and also manage_aws_auth=true), I am not experiencing big issues. (I do need to refresh before destroy, but I feel that is expected.) My provider blocks look like this:

provider "kubernetes" {
  host                   = data.aws_eks_cluster.cluster.endpoint
  cluster_ca_certificate = base64decode(data.aws_eks_cluster.cluster.certificate_authority.0.data)
  token                  = data.aws_eks_cluster_auth.cluster.token
  load_config_file       = false
}

provider "helm" {
  kubernetes {
    host                   = data.aws_eks_cluster.cluster.endpoint
    cluster_ca_certificate = base64decode(data.aws_eks_cluster.cluster.certificate_authority.0.data)
    token                  = data.aws_eks_cluster_auth.cluster.token
  }
}

provider "kubectl" {
  host                   = data.aws_eks_cluster.cluster.endpoint
  cluster_ca_certificate = base64decode(data.aws_eks_cluster.cluster.certificate_authority.0.data)
  token                  = data.aws_eks_cluster_auth.cluster.token
  load_config_file       = false
}

I thought I'd share my versions and blocks in case someone finds them useful.

Still, I'll keep following the discussion in case I hit an issue in the future and we can come up with workarounds.

@PascalBourdier
Contributor

We encountered this problem too, and @grandria found another workaround: set KUBE_CONFIG_PATH to a valid value (export KUBE_CONFIG_PATH=$KUBECONFIG), per the official doc: https://registry.terraform.io/providers/hashicorp/kubernetes/latest/docs/guides/v2-upgrade-guide
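
The same thing can also be expressed directly in the provider block instead of via an environment variable; a sketch (the path is just an example):

provider "kubernetes" {
  # Equivalent to exporting KUBE_CONFIG_PATH: point the provider at an existing kubeconfig
  config_path = "~/.kube/config"
}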

@KarstenSiemer
Contributor

I experience this issue too, and like @jaimehrubiks I am also using eks as a submodule and install a good amount of helm charts and other stuff into the cluster. I have had huge problems with token lifetimes in the past and always had to spin up clusters in two steps: first create the cluster, then hit an error because there is no token, then replan and apply now that a token exists.

I configured my providers like this in the submodule to overcome these problems:

data "aws_eks_cluster" "this" {
  name       = module.eks_control_plane.cluster.name
  depends_on = [module.eks_control_plane.cluster]
}

data "aws_eks_cluster_auth" "this" {
  count      = module.eks_control_plane.cluster != null ? 1 : 0
  name       = module.eks_control_plane.cluster.name
  depends_on = [module.eks_control_plane.cluster]
}

provider "kubernetes" {
  alias                  = "initial"
  host                   = data.aws_eks_cluster.this.endpoint
  cluster_ca_certificate = base64decode(data.aws_eks_cluster.this.certificate_authority[0].data)
  token                  = data.aws_eks_cluster_auth.this[0].token
}

data "aws_eks_cluster" "iam" {
  name       = module.eks_control_plane.cluster.name
  depends_on = [module.eks_control_plane.cluster, module.aws_auth]
}

provider "kubernetes" {
  host                   = data.aws_eks_cluster.iam.endpoint
  cluster_ca_certificate = base64decode(data.aws_eks_cluster.iam.certificate_authority[0].data)
  exec {
    api_version = "client.authentication.k8s.io/v1alpha1"
    args        = ["eks", "get-token", "--cluster-name", data.aws_eks_cluster.iam.id, "--role-arn", "arn:aws:iam::${var.aws_account_id}:role/Atlantis"]
    command     = "aws"
  }
}

provider "helm" {
  kubernetes {
    host                   = data.aws_eks_cluster.iam.endpoint
    cluster_ca_certificate = base64decode(data.aws_eks_cluster.iam.certificate_authority[0].data)
    exec {
      api_version = "client.authentication.k8s.io/v1alpha1"
      args        = ["eks", "get-token", "--cluster-name", data.aws_eks_cluster.iam.id, "--role-arn", "arn:aws:iam::${var.aws_account_id}:role/Atlantis"]
      command     = "aws"
    }
  }
}

With this I can spin up a cluster and install helm charts and other stuff in a single plan/apply.
Using the aws eks get-token command it is easy to get a long-lived token for helm, but I need to add the configuration for Atlantis to the cluster first so that I can actually get such a token. This is done by the "initial" provider: it installs the aws-auth configmap, which is managed in a separate module apart from the eks cluster itself.
Yet I experience this error:

Error: Get "http://localhost/api/v1/namespaces/kube-system/configmaps/aws-auth": dial tcp [::1]:80: connect: connection refused

The data blocks used to configure the provider are somehow empty, and I cannot just reference a kubeconfig, since I still want to be able to use Atlantis and its easily assumable IAM roles for authentication.

If I get this error, I go into the local Terraform file cache under .terraform/modules/$module-name and edit the provider files for the clusters to look like this:

data "aws_eks_cluster" "this" {
  name       = module.eks_control_plane.cluster.name
  depends_on = [module.eks_control_plane.cluster]
}

data "aws_eks_cluster_auth" "this" {
  count      = module.eks_control_plane.cluster != null ? 1 : 0
  name       = module.eks_control_plane.cluster.name
  depends_on = [module.eks_control_plane.cluster]
}

provider "kubernetes" {
  alias                  = "initial"
  host                   = module.eks_control_plane.cluster.endpoint
  cluster_ca_certificate = base64decode(module.eks_control_plane.cluster.certificate_authority[0].data)
  exec {
    api_version = "client.authentication.k8s.io/v1alpha1"
    args        = ["eks", "get-token", "--cluster-name", module.eks_control_plane.cluster.name, "--role-arn", "arn:aws:iam::${var.aws_account_id}:role/Atlantis"]
    command     = "aws"
  }
}

data "aws_eks_cluster" "iam" {
  name       = module.eks_control_plane.cluster.name
  depends_on = [module.eks_control_plane.cluster, module.aws_auth]
}

provider "kubernetes" {
  host                   = module.eks_control_plane.cluster.endpoint
  cluster_ca_certificate = base64decode(module.eks_control_plane.cluster.certificate_authority[0].data)
  exec {
    api_version = "client.authentication.k8s.io/v1alpha1"
    args        = ["eks", "get-token", "--cluster-name", module.eks_control_plane.cluster.name, "--role-arn", "arn:aws:iam::${var.aws_account_id}:role/Atlantis"]
    command     = "aws"
  }
}

provider "helm" {
  kubernetes {
    host                   = module.eks_control_plane.cluster.endpoint
    cluster_ca_certificate = base64decode(module.eks_control_plane.cluster.certificate_authority[0].data)
    exec {
      api_version = "client.authentication.k8s.io/v1alpha1"
      args        = ["eks", "get-token", "--cluster-name", module.eks_control_plane.cluster.name, "--role-arn", "arn:aws:iam::${var.aws_account_id}:role/Atlantis"]
      command     = "aws"
    }
  }
}

This works every time once the cluster is already spun up. But I can't just use it like that, since I can no longer spin up a cluster in a single plan/apply. I don't understand why the data blocks become stale and empty, though; even refreshing them doesn't help at all.

@ArchiFleKs
Contributor

> We encountered this problem too, and @grandria found another workaround: set KUBE_CONFIG_PATH to a valid value (export KUBE_CONFIG_PATH=$KUBECONFIG), per the official doc: https://registry.terraform.io/providers/hashicorp/kubernetes/latest/docs/guides/v2-upgrade-guide

This technique actually works, but it is a bit awkward to set up when running in CI, for example, where a kubeconfig is not present.

@KarstenSiemer
Contributor

KarstenSiemer commented Aug 23, 2021

Okay, I think I have solved the problem, at least for me.
I just moved the data blocks into the eks module and set outputs with a depends_on, like this (inside module.eks_control_plane):

resource "time_sleep" "wait" {
  depends_on = [aws_eks_cluster.this[0]]

  create_duration = "30s"
}

data "aws_eks_cluster" "this" {
  count      = var.enabled ? 1 : 0
  name       = aws_eks_cluster.this[0].name
  depends_on = [aws_eks_cluster.this[0], time_sleep.wait]
}

data "aws_eks_cluster_auth" "this" {
  count      = var.enabled ? 1 : 0
  name       = aws_eks_cluster.this[0].name
  depends_on = [aws_eks_cluster.this[0], time_sleep.wait]
}

output "aws_eks_cluster" {
  value      = var.enabled ? data.aws_eks_cluster.this[0] : null
  depends_on = [aws_eks_cluster.this[0], time_sleep.wait]
}

output "aws_eks_cluster_auth" {
  value      = var.enabled ? data.aws_eks_cluster_auth.this[0] : null
  depends_on = [aws_eks_cluster.this[0], time_sleep.wait]
}

(I know putting the depends_on twice is actually redundant, but I just like to make really sure)
Anyway, I then refer to that in my parent module inside the providers:

provider "kubernetes" {
  alias                  = "initial"
  host                   = module.eks_control_plane.aws_eks_cluster.endpoint
  cluster_ca_certificate = base64decode(module.eks_control_plane.aws_eks_cluster.certificate_authority[0].data)
  token                  = module.eks_control_plane.aws_eks_cluster_auth.token
}

provider "kubernetes" {
  host                   = module.eks_control_plane.aws_eks_cluster.endpoint
  cluster_ca_certificate = base64decode(module.eks_control_plane.aws_eks_cluster.certificate_authority[0].data)
  exec {
    api_version = "client.authentication.k8s.io/v1alpha1"
    args        = ["eks", "get-token", "--cluster-name", module.eks_control_plane.aws_eks_cluster.id, "--role-arn", "arn:aws:iam::${var.aws_account_id}:role/Atlantis"]
    command     = "aws"
  }
}

provider "helm" {
  kubernetes {
    host                   = module.eks_control_plane.aws_eks_cluster.endpoint
    cluster_ca_certificate = base64decode(module.eks_control_plane.aws_eks_cluster.certificate_authority[0].data)
    exec {
      api_version = "client.authentication.k8s.io/v1alpha1"
      args        = ["eks", "get-token", "--cluster-name", module.eks_control_plane.aws_eks_cluster.id, "--role-arn", "arn:aws:iam::${var.aws_account_id}:role/Atlantis"]
      command     = "aws"
    }
  }
}

That module gets sourced multiple times in a single AWS account to spin up a load of clusters, and this approach has given me the least problems. The sleep makes the API more resilient to timing problems (or so it feels, at least).

@davidgiga1993

I'm facing the same issue after trying to change the tags of the cluster

@ikarlashov

Okay, I found out what the problem is. I spent the entire day fixing it :)

TL;DR

  1. Move the k8s provider configuration into aws_auth.tf.
  2. Set the correct configuration for the k8s provider as shown below.
  3. Configure the AWS CLI before running Terraform.

It's a bad idea to set manage_aws_auth to false as someone suggested earlier. There's a block in the EKS module that depends on it, so just keep the default for manage_aws_auth, which is "true".

First of all, I put the k8s provider configuration inside aws_auth.tf:

provider "kubernetes" {
  host                   = data.aws_eks_cluster.cluster.endpoint
  cluster_ca_certificate = base64decode(data.aws_eks_cluster.cluster.certificate_authority.0.data)
  exec {
    api_version = "client.authentication.k8s.io/v1alpha1"
    args        = ["eks", "get-token", "--cluster-name", var.cluster_name]
    command     = "aws"
  }
}

This is the only correct configuration for EKS, and it is the one pointed out in the official docs.

You can also use a pre-generated kubeconfig file from aws eks update-kubeconfig, but that uses the same aws eks get-token under the hood, so you don't have a hardcoded token either way; it is always generated dynamically by the aws command.

Back to the main point: since an aws command is used to get the token, you need to configure the AWS CLI before running Terraform:

aws configure set aws_access_key_id ${AWS_ACCESS_KEY_ID} --profile ${AWS_PROFILE}
aws configure set aws_secret_access_key $AWS_SECRET_ACCESS_KEY --profile ${AWS_PROFILE}
aws configure set region $AWS_DEFAULT_REGION --profile ${AWS_PROFILE}

And Voila, it works!

@stevehipwell
Contributor

We've been told by Hashicorp to always use the exec plugin for Kubernetes providers to stop this issue or ones like it. FYI Azure AKS has an even bigger problem with this than AWS EKS.
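
A minimal sketch of that pattern for the kubernetes provider on EKS (output names assume a recent version of this module; api_version may need to be v1beta1 with newer AWS CLI releases):

provider "kubernetes" {
  host                   = module.eks.cluster_endpoint
  cluster_ca_certificate = base64decode(module.eks.cluster_certificate_authority_data)

  exec {
    api_version = "client.authentication.k8s.io/v1alpha1"
    command     = "aws"
    args        = ["eks", "get-token", "--cluster-name", module.eks.cluster_id]
  }
}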

@fangj99

fangj99 commented Oct 6, 2021

We hit the same error when we tried to add extra subnets to the VPC and EKS; after removing the extra subnets, the error disappeared.

huguesalary added a commit to huguesalary/terraform-aws-eks that referenced this issue Oct 20, 2021
A user of this module can subsequently use this ConfigMap output as they wish, in their own module.

This should help with issue terraform-aws-modules#1280

```
resource "kubernetes_config_map" "aws_auth" {

  metadata {
    name      = module.eks.config_map_aws_auth_yaml.metadata.name
    namespace = module.eks.config_map_aws_auth_yaml.metadata.namespace
    labels    = module.eks.config_map_aws_auth_yaml.metadata.labels
  }

  data = module.eks.config_map_aws_auth_yaml.data
}
```
@github-actions

This issue has been automatically marked as stale because it has been open 30 days
with no activity. Remove stale label or comment or this issue will be closed in 10 days

@github-actions github-actions bot added the stale label Dec 13, 2021
@gkzz

gkzz commented Dec 14, 2021

same here.

$ terraform plan
╷
│ Error: Get "http://localhost/api/v1/namespaces/kube-system/configmaps/aws-auth": dial tcp 127.0.0.1:80: connect: connection refused
│ 
│   with module.eks.kubernetes_config_map.aws_auth[0],
│   on .terraform/modules/eks/aws_auth.tf line 63, in resource "kubernetes_config_map" "aws_auth":
│   63: resource "kubernetes_config_map" "aws_auth" {
$ terraform version
Terraform v1.1.0
on linux_amd64
+ provider registry.terraform.io/hashicorp/aws v3.63.0
+ provider registry.terraform.io/hashicorp/cloudinit v2.2.0
+ provider registry.terraform.io/hashicorp/kubernetes v2.6.1
+ provider registry.terraform.io/hashicorp/local v2.1.0
+ provider registry.terraform.io/hashicorp/null v3.1.0
+ provider registry.terraform.io/hashicorp/random v3.1.0
+ provider registry.terraform.io/hashicorp/template v2.2.0
+ provider registry.terraform.io/terraform-aws-modules/http v2.4.1
$ cat kubernetes.tf
provider "kubernetes" {
  host                   = data.aws_eks_cluster.cluster.endpoint
  token                  = data.aws_eks_cluster_auth.cluster.token
  cluster_ca_certificate = base64decode(data.aws_eks_cluster.cluster.certificate_authority.0.data)
}

@github-actions github-actions bot removed the stale label Dec 15, 2021
@fabidick22

> We hit the same error when we tried to add extra subnets to the VPC and EKS; after removing the extra subnets, the error disappeared.

I was able to solve my problem in the same way. I have one module to manage network resources and another to manage the EKS cluster.
I first created the cluster with two subnets, but then we had a requirement to create another subnet. After applying that change I had problems with my EKS module, and I had to specify only my two old subnets to avoid the problem (private_subnet_id[0], private_subnet_id[1]).

@philomory

> We hit the same error when we tried to add extra subnets to the VPC and EKS; after removing the extra subnets, the error disappeared.

> I was able to solve my problem in the same way. I have one module to manage network resources and another to manage the EKS cluster. I first created the cluster with two subnets, but then we had a requirement to create another subnet. After applying that change I had problems with my EKS module, and I had to specify only my two old subnets to avoid the problem (private_subnet_id[0], private_subnet_id[1]).

This happened to us as well; we wanted to expand our cluster into additional subnets (specifically for the worker node ASGs, not so much the control plane). Of course, by default the ASGs use the same set of subnets as the cluster control plane, and, apparently, you can't change the subnets of the cluster control plane. That said, it's a bit wild to me that this causes the planning phase to fall back to querying localhost. It's not easy to tell from Terraform's default logging what step is going wrong, but maybe setting TF_LOG to TRACE would reveal more.

@antonbabenko
Member

This issue has been resolved in version 18.0.0 🎉

@timblaktu

@antonbabenko we're still seeing this issue using v18.0.5 of this module. I see in the upgrade doc that:

> Support for managing aws-auth configmap has been removed. This change also removes the dependency on the Kubernetes Terraform provider, the local dependency on aws-iam-authenticator for users, as well as the reliance on the forked http provider to wait and poll on cluster creation. To aid users in this change, an output variable aws_auth_configmap_yaml has been provided which renders the aws-auth configmap necessary to support at least the IAM roles used by the module (additional mapRoles/mapUsers definitions to be provided by users)

...and we have shifted to using the new aws_auth_configmap_yaml module output as suggested, and are now using what I believe is the recommended kubernetes provider config:

################################################################################
# Kubernetes provider configuration: How terraform auth{n,z} to a cluster
################################################################################
data "aws_eks_cluster" "cluster" {
  name = module.eks.cluster_id
}

data "aws_eks_cluster_auth" "cluster" {
  name = module.eks.cluster_id
}

provider "kubernetes" {
  host                   = data.aws_eks_cluster.cluster.endpoint
  cluster_ca_certificate = base64decode(data.aws_eks_cluster.cluster.certificate_authority[0].data)
  token                  = data.aws_eks_cluster_auth.cluster.token
}

...but don't understand what about the release fixed this issue, or what other config this fix is dependent on. Any advice would be appreciated. We could raise another issue, but I feel this is likely a misunderstanding on our part.

@bryantbiggs
Member

bryantbiggs commented Feb 9, 2022

The issue was marked as resolved by release v18 because we removed the management of the aws-auth config map from the module (what the issue was created for)

@timblaktu

Thanks @bryantbiggs. So, does this mean this issue would still be expected to occur in a Terraform project that uses v18+ of this eks module and still uses the kubernetes provider? That's us. We're still managing our aws-auth config map in the same project, which of course requires the kubernetes provider. I understand this was moved out of the eks module.
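
For what it's worth, one way to consume that output (a sketch, not necessarily our final setup, and assuming the gavinbunney/kubectl provider is configured the same way as the kubernetes provider) is:

# Apply the aws-auth ConfigMap rendered by the module's aws_auth_configmap_yaml output.
resource "kubectl_manifest" "aws_auth" {
  yaml_body = module.eks.aws_auth_configmap_yaml
}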

@bryantbiggs
Member

It's possible; it's highly dependent on how your resources are configured, network connectivity, and what actions you are taking

@timblaktu

OK, thanks for the info. We're first going to try using the exec plugin to handle kubernetes provider authentication, as some others have recommended above.

@stevehipwell
Contributor

@timblaktu the tl;dr here is use the exec plugin for all Kubernetes based providers.

The long answer is that if you ask HashiCorp how to do this you will get many different, usually incorrect, answers. Of these, two actually work: the "official" one is that you can't use Kubernetes in the workspace where the cluster was created; the "engineering" one is to use the exec plugin and make sure you plan out your dependencies correctly.

@timblaktu

timblaktu commented Feb 9, 2022

@stevehipwell thanks so much for those insights. I'm trying your "engineering" answer and want to dive a bit into the "make sure you plan out your dependencies correctly" part. I've changed my kubernetes provider config (which used to specify a token) to use the exec plugin for auth, like this:

provider "kubernetes" {
  host                   = data.aws_eks_cluster.cluster.endpoint
  cluster_ca_certificate = base64decode(data.aws_eks_cluster.cluster.certificate_authority[0].data)
  exec {
    api_version = "client.authentication.k8s.io/v1alpha1"
    command     = "aws"
    args        = ["eks", "get-token", "--cluster-name", module.eks.cluster_id]
  }
}

and am still getting the "localhost connection refused" error, presumably the first time the kubernetes provider tries to reach out to the cluster. In my case this is at the declaration of a "kubernetes_role" "configmap_update" resource, which causes it to try to Get "http://localhost/apis/rbac.authorization.k8s.io/v1/namespaces/kube-system/roles/configmap-update".

So,

  1. How do I know my exec plugin is working?
  2. What can I do to ensure the dependencies in my eks/k8s project are "planned out correctly" to avoid this issue?

EDIT: Could my problem here be that I am declaring the kubernetes provider to be implicitly dependent on the eks module through my reference to module.eks.cluster_id to fetch the cluster name? I could just as well fetch the cluster name from a local variable I have sitting around. Looking in the output of terraform graph the only relevant dependency I see between the kubernetes provider and what the eks module manages is this:

                "[root] provider[\"registry.terraform.io/hashicorp/kubernetes\"]" -> "[root] data.aws_eks_cluster.cluster (expand)"

...but this dependency is probably caused by my references to data.aws_eks_cluster.cluster.* in my provider config.

@stevehipwell
Contributor

@timblaktu I've used the following pattern when layering on top of an EKS cluster created by this module for a number of major releases. I think the important point here is to use the module outputs for host and cluster_ca_certificate rather than a data object, due to the way Terraform manages its data collection. I also use the local.cluster_name value that I pass into the module rather than module.eks.cluster_id, but I suspect that might not make any difference.

provider "kubernetes" {
  host                   = module.eks.cluster_endpoint
  cluster_ca_certificate = base64decode(module.eks.cluster_certificate_authority_data)

  exec {
    api_version = "client.authentication.k8s.io/v1alpha1"
    command     = "aws"
    args        = ["eks", "get-token", "--cluster-name", local.cluster_name]
  }
}

@timblaktu

Thanks again, @stevehipwell! This appears to be the final (and quite obscured!) wisdom that completely solves this issue. Using v18 of the module doesn't fix it by itself. Using an exec plugin in your kubernetes provider config doesn't fix it by itself. Your last point is also essential to the solution: you can't work around this bug completely if you are referring to EKS cluster data sources in your kubernetes provider config.

It's too bad that more issues aren't followed up on after being closed with the resulting essential tribal wisdom, like we (mostly you) did here. But I guess that's the way of open source, no? If we (the tribe) don't do it, who will? Thanks again!

@lrstanley

Is it worth calling out @stevehipwell's recommendations/details in the examples, and potentially the module docs, so others are aware of this caveat/don't run into the same issue? We've got many teams who have run into the same thing, and just temporarily split up the terraform to prevent running them so close together, which isn't ideal. I understand it's not the fault of these modules, but it might be worth at least adding a little snippet to point people in the right direction.

@timblaktu

@lrstanley It would probably make the most sense for this info to go prominently into the eks module documentation, since ultimately this is a bug in the module implementation, as noted here, and is the reason why all of these "all planets in alignment" workarounds are necessary.

@stevehipwell
Contributor

@timblaktu this isn't an issue with this module; it's a general Terraform defect disguised as a design choice. I agree that the docs here could be updated with some "suggestions" on use, but HashiCorp should be the ones documenting how their providers work (as noted above, don't hold your breath). Remember that the maintainers have removed the nested provider from the module, so provider logic belongs to the workspace consuming the module; HashiCorp owns supporting this now.

@BlueShells

@dak1n1 Hi, I ran into the same issue and fixed it with your method. Glad to see you here!

@mareq

mareq commented Jul 30, 2022

I am not sure I understand the solution correctly: is it to use exec instead of a data source for the token, and to initialise host and cluster_ca_certificate from the module outputs instead of the data objects, as suggested above?

That is what I have done (hopefully no silly mistakes there):

provider "kubernetes" {
  host                   = module.eks.cluster_endpoint
  cluster_ca_certificate = base64decode(module.eks.cluster_certificate_authority_data)

  exec {
    api_version = "client.authentication.k8s.io/v1beta1"
    command     = "aws"
    args        = ["eks", "get-token", "--cluster-name", var.cluster_name]
  }
}

But I am still getting the original error:

╷
│ Error: configmaps "aws-auth" already exists
│
│   with module.eks.kubernetes_config_map.aws_auth[0],
│   on .terraform/modules/eks/main.tf line 453, in resource "kubernetes_config_map" "aws_auth":
│  453: resource "kubernetes_config_map" "aws_auth" {
│
╵

All I am doing is spinning up an empty cluster with eks_managed_node_groups.

This workaround does help, but is kind of ugly:

$ terraform apply
   ..error..
$ terraform import module.eks.kubernetes_config_map.aws_auth kube-system/aws-auth
$ terraform apply

It is probably possible to avoid the error by passing -target=module.eks in the first apply, but what I am really after is avoiding this whole dance altogether. Is that somehow possible, or have I misunderstood the thread above and this IS the solution, at least for now?

@glyhood

glyhood commented Nov 3, 2022

> It looks like the Kubernetes provider isn't receiving a configuration. Here's how I configure mine, which is similar to the EKS module README except it's for the newer version of the Kubernetes provider. (My team recently released version 2.0 of the Kubernetes provider and it requires a slightly different config than shown in this module's README).
>
> data "aws_eks_cluster" "default" {
>   name = module.cluster.cluster_id
> }
>
> data "aws_eks_cluster_auth" "default" {
>   name = module.cluster.cluster_id
> }
>
> provider "kubernetes" {
>   host                   = data.aws_eks_cluster.default.endpoint
>   cluster_ca_certificate = base64decode(data.aws_eks_cluster.default.certificate_authority[0].data)
>   token                  = data.aws_eks_cluster_auth.default.token
> }

This worked for me.

@marcos-gomes-ishop

You can declare an env variable in your GitHub Actions YAML like this and it will work:

env:
  KUBE_CONFIG_PATH: /home/runner/.kube/config

@github-actions

I'm going to lock this issue because it has been closed for 30 days ⏳. This helps our maintainers find and focus on the active issues. If you have found a problem that seems similar to this, please open a new issue and complete the issue template so we can capture all the details necessary to investigate further.

@github-actions github-actions bot locked as resolved and limited conversation to collaborators Dec 11, 2022