Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Add updated examples for GKE and EKS #1115

Merged
merged 30 commits into from
Jan 21, 2021
Merged

Add updated examples for GKE and EKS #1115

merged 30 commits into from
Jan 21, 2021

Conversation

dak1n1
Copy link
Contributor

@dak1n1 dak1n1 commented Jan 12, 2021

In order to prevent confusion around progressive apply, provide some working examples of creating Kubernetes resources in the same apply where the cluster is created.

These were tested using the current (master) version of the kubernetes provider, which will become version 2.0. (The required_providers version = attribute has been omitted until the new version is released).

Description

Acceptance tests

  • Have you added an acceptance test for the functionality being added?
  • Have you run the acceptance tests on this branch?

Output from acceptance testing:

$ make testacc TESTARGS='-run=TestAccXXX'

...

Release Note

Release note for CHANGELOG:

...

References

hashicorp/terraform-provider-helm#647

Community Note

  • Please vote on this issue by adding a 👍 reaction to the original issue to help the community and maintainers prioritize this request
  • If you are interested in working on this issue or have submitted a pull request, please leave a comment

@dak1n1
Copy link
Contributor Author

dak1n1 commented Jan 20, 2021

Added docs in a separate PR #1121

@dak1n1 dak1n1 requested a review from jrhouston January 20, 2021 19:54
@dak1n1 dak1n1 mentioned this pull request Jan 20, 2021
2 tasks
@ghost ghost added size/XXL and removed size/XL labels Jan 20, 2021
Copy link
Collaborator

@jrhouston jrhouston left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

These examples are amazing and are going to be super helpful for me! 😄

I just left a couple of things to fix:

  • Better to use >= (or ~>) in the version constraints so we don't have to update the versions in the examples too often
  • There's a bunch of formatting quirks, I would just give everything a run through terraform fmt

@@ -0,0 +1,56 @@
resource "kubernetes_namespace" "test" {
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Need to run terraform fmt on this file I think.

}
helm = {
source = "hashicorp/helm"
version = "2.0.1"
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Should have a => constraint here and in the one above since this is an example.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Do you mean version = ">= 2.0.1"? I haven't used version constraints much. You can also click "insert a suggestion" to help me figure out exactly where to put this. Thanks!

Copy link
Collaborator

@jrhouston jrhouston Jan 21, 2021

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yes sorry that's what I mean. Just so it will pull down the latest version of the provider without us having to go back and update the doc examples all the time. This way we would only have to go back and change it when we have another major version bump.

The syntax is [here](https://www.terraform.io/docs/configuration/version-constraints.html.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

For the cloud providers (like azurerm), I was going to leave the versions static so we don't have to update them until we want to. Potentially it would keep working for years. For Kubernetes and Helm though, I'm ok with these auto-updating, since we'll know about any breaking changes.

Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yep that makes sense!

@@ -0,0 +1,12 @@
variable "location" {
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

also need terraform fmt on this file

_examples/gke/main.tf Outdated Show resolved Hide resolved
@@ -2,7 +2,7 @@ terraform {
required_providers {
kubernetes = {
source = "hashicorp/kubernetes"
version = "1.13"
version = "9.9.9"
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Is this version change on purpose or a left-over from testing?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It's a left-over. Thanks for catching this!

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

oh oops, actually, this is in the test-infra dir, so it's on purpose. This PR should only modify _examples, which should not have 9.9.9 in them.

Copy link
Collaborator

@jrhouston jrhouston left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

🚀

Copy link
Member

@alexsomesan alexsomesan left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

🏆

@dak1n1 dak1n1 merged commit 0f97829 into hashicorp:master Jan 21, 2021
@kaloyanhs
Copy link

@dak1n1 Sorry for commenting closed PR , but I'm little confused, the examples and the Terraform documentation contradicts a bit.

  • Here, right after the warning message, there is a paragraph saying the most reliable way is to have separate apply stage for Kubernetes. What could be classified as production setup, the example here or an external pipeline with two apply stages ?
  • Here there is generic advise not to use provider block in modules , but to pass the provider config from the root module. How this relates to the examples here ?

@dak1n1
Copy link
Contributor Author

dak1n1 commented Feb 5, 2021

@kaloyanhs Hi, thanks for bringing this up! You're right that I should move the provider blocks into main.tf and remove them from the modules. This is from the docs you gave me:

A legacy module containing its own provider configurations is not compatible with the for_each, count, and depends_on arguments that were introduced in Terraform v0.13.

I did run into that problem during testing, and couldn't use depends_on because of it. So that's a good reason for me to update the modules and move the provider blocks into the root module.

My main goal with these examples was to come up with a configuration that works reliably, since so many people wanted to run the Kubernetes provider in a single apply with their cluster build. What I came up with works in testing, but it could use some refinement.

My advice in general, regarding production configurations, is to always test in your specific environment. Especially when there are short-lived cloud credentials involved, you'll want to test that the configuration still works an hour later, after the short-lived token has expired. (They expire after about an hour in GKE, but each cloud provider is different. EKS tokens expire after 15 minutes, etc).

In one of the 3 examples here, I did pass the providers into the modules as the documentation states to do, but the other two examples need updating to follow this pattern too. They still work, but it's not best practice.

Regarding the warning message in the docs, I'll try to explain. Previously, we were telling everyone to apply Kubernetes resources in a separate apply from the cluster build. That's what the docs used to say. But after seeing how often people wanted to use it in a single apply, I set out to find a way to accomplish this. I found that in many cases, you can use a single apply. But there are some cases in which a single apply will fail. Specifically, it will fail if the credentials being passed into the Kubernetes provider have expired, such as in the case of rotating out client certificates in the Kubernetes cluster.

For this reason, I updated the warning message in the docs to say that you'll need to keep the Kubernetes resources in a separate module, so that you have the option of using a targeted apply when needed. This one I tested out by replacing the AKS client certificates, which is something people will have to do in production for long-lived clusters:

https://github.com/hashicorp/terraform-provider-kubernetes/tree/master/_examples/aks#kubeconfig-for-manual-cli-access

You'll also want the Kubernetes resources to be in a separate module if you ever plan to replace the Kubernetes cluster out from under existing Kubernetes resources. This will ensure the resources get re-created during the apply that creates the new Kubernetes cluster. I added a case for that at the end of each of the 3 READMEs for GKE, EKS, AKS.

We do have some upcoming docs improvements planned, regarding all this authentication best practice. It's not very intuitive or even well documented at the moment, and often times users end up with outdated credentials being passed into the provider. I think for the next iteration of these examples, I'll try and add an exec block to each Kubernetes and Helm provider config, since that seems to be the most reliable way to ensure the credentials are always up-to-date. That would look like this:

provider "kubernetes" {
  host                   = var.cluster_endpoint
  cluster_ca_certificate = base64decode(var.cluster_ca_cert)
  exec {
    api_version = "client.authentication.k8s.io/v1alpha1"
    args        = ["eks", "get-token", "--cluster-name", var.cluster_name]
    command     = "aws"
  }
}

Any option that is passed into a provider block has the potential to cause some issues in a single apply, if any of those options become outdated. In the example above, issues could arise if the value being passed into host changes, or cluster_ca_certificate. Thankfully, the value of exec is always up-to-date, but if host or cluster_ca_cert changes, you have the potential for what we call a progressive apply issue (it's more of a "type" of issue, rather than a single concrete problem), which is most easily mitigated by using two applies.

This is why the Learn guide was recently updated to recommend the exec block too.

TL;DR: two applies will always work. A single apply works in all cases except those that replace data being passed into the provider block (host, certs, token, etc). Many users want a single apply and we'll continue working on ways to support these users, but it is a large and complex issue that is outside of the Kubernetes/Helm Provider's control. So we don't currently have a fix to guarantee a single apply will work under all conditions.

@dak1n1
Copy link
Contributor Author

dak1n1 commented Feb 6, 2021

So in the above comment, I basically gave an explanation of the state of everything as I knew it... This lead to some additional testing and improvement of the AKS and GKE examples, to add the suggested changes (thanks for that!). I'm seeing some promising results with using depends_on in the data sources that fetch the cluster details (host, certs, token), so we might actually have a way to make this work every time with a single apply! (Maybe). This is not a route I'd pursued before, since the docs had discouraged using data sources with depends_on in the past. However, I just checked the docs here, and it would seem this is now ok to do as of Terraform version 0.13.

I'm continuing to work on this in a draft PR here since it's related to some other work I had in progress. I'll push out some updates to the examples and their READMEs to demonstrate the most reliable and straightforward way to configure this.

alexsomesan pushed a commit that referenced this pull request Feb 9, 2021
Add some examples of using cloud providers with version 2 of the Kubernetes and Helm providers.
@ghost
Copy link

ghost commented Feb 21, 2021

I'm going to lock this issue because it has been closed for 30 days ⏳. This helps our maintainers find and focus on the active issues.

If you feel this issue should be reopened, we encourage creating a new issue linking back to this one for added context. If you feel I made an error 🤖 🙉 , please reach out to my human friends 👉 [email protected]. Thanks!

@ghost ghost locked as resolved and limited conversation to collaborators Feb 21, 2021
Sign up for free to subscribe to this conversation on GitHub. Already have an account? Sign in.
Projects
None yet
Development

Successfully merging this pull request may close these issues.

5 participants