Deal with built-in Venafi approval steps #100

Open
jkacou opened this issue Oct 27, 2022 · 7 comments
Labels
enhancement New feature or request

Comments

@jkacou

jkacou commented Oct 27, 2022

BUSINESS PROBLEM
This issue concerns creating a certificate that requires an approval before it is actually issued.
Currently, Terraform times out while waiting for the approval. The failure sequence is as follows:

  1. Terraform requests a new certificate.
  2. The creation is paused by the approval step.
  3. Since the approval is not granted within 3 minutes, Terraform times out.
  4. No entry is saved in the state file (so Terraform never knows about the request).
  5. On the next apply (let's say a Jenkins job triggers it), another certificate is requested (even though the previous one has since been approved).
  6. The request fails (assuming there is a template policy to prevent duplicates).
  7. We are forced to manually delete the latest failed creation request and import the resource created during the approval step.

PROPOSED SOLUTION
As a solution, Venafi would need to return an HTTP response indicating that an approval step is in progress (I don't know whether this is already the case).
The provider could then handle this specific case and update the state file accordingly (as a temporary state that is updated on the next apply).

CURRENT ALTERNATIVES
Using a Python script to compute the delta between the state file and the created certificates (with the required filters), then importing that delta.
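
A minimal sketch of such a delta check, assuming a version 4 Terraform state file, a resource type of `venafi_certificate`, and a `common_name` attribute on each instance. The list of certificates that exist in Venafi is passed in as plain strings; how it is retrieved and filtered is left out, and the function names are hypothetical.

```python
import json


def common_names_in_state(state: dict, resource_type: str = "venafi_certificate") -> set[str]:
    """Collect the common names of all certificates tracked in a Terraform state."""
    names = set()
    for resource in state.get("resources", []):
        if resource.get("type") != resource_type:
            continue
        for instance in resource.get("instances", []):
            cn = instance.get("attributes", {}).get("common_name")
            if cn:
                names.add(cn)
    return names


def missing_from_state(state: dict, issued_common_names: list[str]) -> list[str]:
    """Return certificates that exist in Venafi but are absent from the state."""
    tracked = common_names_in_state(state)
    return sorted(set(issued_common_names) - tracked)


if __name__ == "__main__":
    # Inline sample data; a real script would json.load() the state file instead.
    state = json.loads("""
    {"version": 4, "resources": [
      {"type": "venafi_certificate", "name": "web",
       "instances": [{"attributes": {"common_name": "web.example.com"}}]}
    ]}
    """)
    issued = ["web.example.com", "api.example.com"]
    for cn in missing_from_state(state, issued):
        print(f"candidate for import: {cn}")
```

Each name reported this way is a certificate that was issued after approval but never recorded, which makes it a candidate for a manual import.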

VENAFI EXPERIENCE
Using Venafi for 1 year
10% of my time today as I am working on our module to leverage its features

@jkacou jkacou added the enhancement New feature or request label Oct 27, 2022
@jkacou
Author

jkacou commented Oct 27, 2023

Hello,
It has been a year since I opened this issue.
Is there anything planned for it?

@jswartzy

jswartzy commented May 9, 2024

Ditto for our team. Is this going to get any love?

@BeardedPrincess

There are some fundamental technical (and security-related) issues that arise from introducing human approvals into an automated flow like Terraform (this also applies to vCert playbooks). The primary question is: what is the expected behavior when applying a Terraform plan that could take days, or even weeks, to complete because it is waiting for manual approvals? I don't believe it's possible to wait indefinitely, but even if it were, is that desirable? In general, you'd expect a TF plan to apply consistently on every run. To make this work, however, we'd need to somehow keep state of what has already been requested and whether it is done, and expect the plan to be continuously retried until the certificate is approved. Can anyone provide examples of other providers that allow human interventions with potentially indefinite wait periods like this? I would be curious to see what the best practice is for handling that.

My recommendation instead: you should carefully evaluate what criteria the "human approver" is applying when making a decision about whether to approve or deny a request, and look to implement that with the policy enforcement available in the Venafi platform, or using an adaptable workflow to address more complex enforcements.

One way to start this evaluation is to look at what requests were rejected by humans over the last 90-180 days. If the humans are not rejecting any, that is a good clue that they may be "rubber-stamping" all requests. If they have rejected, look at the reasons for rejection, and determine if those things are already being enforced by the Venafi Platform anyway, or if they can be implemented in an adaptable workflow.
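
As a rough sketch of that evaluation, assuming the approval history can be exported as records with a decision date, an outcome, and a free-text reason. The record layout and field names are hypothetical, not a Venafi export format.

```python
from collections import Counter
from datetime import date, timedelta


def rejection_summary(records, today, window_days=180):
    """Summarise human decisions made within the last `window_days` days."""
    cutoff = today - timedelta(days=window_days)
    recent = [r for r in records if r["decided_on"] >= cutoff]
    rejected = [r for r in recent if r["outcome"] == "rejected"]
    return {
        "total": len(recent),
        "rejected": len(rejected),
        # Most frequent reasons hint at what a policy could enforce instead.
        "reasons": Counter(r["reason"] for r in rejected),
    }


if __name__ == "__main__":
    sample = [
        {"decided_on": date(2024, 5, 1), "outcome": "approved", "reason": ""},
        {"decided_on": date(2024, 4, 2), "outcome": "rejected", "reason": "typo in hostname"},
    ]
    print(rejection_summary(sample, today=date(2024, 5, 24)))
```

A `rejected` count of zero over the window is the "rubber-stamping" signal described above; otherwise, the reasons show which checks are worth automating.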

@abrahamoshel

At least for our team, I think it is really a second pair of eyes to make sure there is no spelling error in the cert or domain name, since we are looking to provision fairly expensive certs.

@BeardedPrincess

> At least for our team, I think it is really a second pair of eyes to make sure there is no spelling error in the cert or domain name, since we are looking to provision fairly expensive certs.

That's understandable, but misspelling the hostname in a Terraform plan (which I expect is doing much more than just creating a certificate) would have other significant impacts: wrong DNS entries registered, incorrect host / SNI settings on load balancers and web servers, etc. Is it common for your IaC or CI/CD processes to reach a production deployment stage with such errors?

Additionally, it is possible to have Venafi automatically enforce a policy setting to only allow specific domains, or even enforce specific patterns (RegEx) on every request. With Trust Protection Platform specifically, very complex Adaptable Workflows can even be used to do automated verification of such things, even consulting CMDB or other data sources to correlate what is being requested.

Not only are these approaches more accurate and reliable than a human, they allow for true and full automation. I don't see where manual human approvals in the middle of an automated process can work. Do you have other steps in your terraform plans that do not complete for days or weeks? What is the behavior and how are you handling those today?
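
To illustrate the kind of pattern check described above, here is a minimal sketch in Python. It is not Venafi policy syntax; the allowed suffix `example.com` and the function name are assumptions chosen for the example.

```python
import re

# Hypothetical rule: one or more lowercase DNS labels under example.com only.
ALLOWED = re.compile(r"(?:[a-z0-9](?:[a-z0-9-]{0,61}[a-z0-9])?\.)+example\.com")


def is_allowed(common_name: str) -> bool:
    """Return True when the whole name matches the allowed pattern."""
    return ALLOWED.fullmatch(common_name) is not None


if __name__ == "__main__":
    for name in ("web.example.com", "web.exmaple.com", "web.example.com.evil.org"):
        print(name, is_allowed(name))
```

A misspelled domain such as `web.exmaple.com` is rejected by this rule. A typo in the host label inside an allowed domain would still pass, so a pattern like this narrows the class of errors rather than eliminating it.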

@jkacou
Author

jkacou commented May 24, 2024

On our side, the main concern is cost, since each public certificate implies some expense. We already rely on PR review for configuration checks and validation, GitOps-style.
So we need this approval flow for all public certificate requests to keep some "control" over certificate creation costs.
Waiting for the approval can work in some limited cases, but it is certainly not realistic if it can take days.
That is why I was wondering whether a transient state is possible.
The plan could be to wait for a certain time (let's say 30 minutes) and then mark the certificate creation as incomplete until the next run.
That way, we avoid the timeout, we know what is happening with the certificate being created, the state file reflects the real status, and we don't have a process that waits forever.
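
The bounded wait described here could look roughly like the following sketch. `check_status` is a hypothetical callable standing in for whatever the provider would use to query the request, and the status strings are assumptions as well.

```python
import time


def wait_for_approval(check_status, timeout_s=30 * 60, poll_s=30,
                      clock=time.monotonic, sleep=time.sleep):
    """Poll until the request leaves 'pending' or the deadline passes.

    Returns the final status reported by check_status (for example 'created'
    or 'rejected'), or 'incomplete' if the deadline is reached while pending.
    """
    deadline = clock() + timeout_s
    while True:
        status = check_status()
        if status != "pending":
            return status
        if clock() >= deadline:
            return "incomplete"
        sleep(poll_s)
```

The `clock` and `sleep` parameters exist only so the loop can be exercised without actually waiting 30 minutes.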

@jkacou
Author

jkacou commented May 24, 2024

So the state could be updated with one of three possible outcomes: created (the state is updated; it is a success), incomplete (no change; it is neither a success nor a failure), or rejected (a failure; we can raise an error). The last one is essentially the same as a failed certificate creation.
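
One way to read this three-outcome model, as a sketch only: the state is represented as a plain dictionary, and the names `Outcome`, `apply_outcome`, and `RequestRejected` are hypothetical, not part of the provider.

```python
from enum import Enum


class Outcome(Enum):
    CREATED = "created"
    INCOMPLETE = "incomplete"
    REJECTED = "rejected"


class RequestRejected(Exception):
    """Raised when the approver denies the certificate request."""


def apply_outcome(state: dict, name: str, outcome: Outcome) -> dict:
    """Return the state after one apply, following the three cases."""
    if outcome is Outcome.CREATED:
        # Success: record the certificate in the state.
        return {**state, name: "created"}
    if outcome is Outcome.INCOMPLETE:
        # Neither success nor failure: leave the state unchanged.
        return state
    # Rejected: treated like a failed certificate creation.
    raise RequestRejected(f"request for {name} was rejected")
```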


4 participants