
Installing CDK on AWS EKS #679

Open
sandeepk24 opened this issue Mar 14, 2024 · 15 comments

@sandeepk24

I cloned the repo and checked out the 7.3 branch as per the Backstage document (git checkout release/7.3-20240131) (https://backstage.forgerock.com/docs/forgeops/7.3/forgeops.html). Most of the directories, such as charts and helm, disappear, but when I stay on master I see all the files. Could you please suggest what I should clone?
Also, when I try to install the ingress in the EKS cluster, the ingress pods don't come up.
When I install DS using forgeops it complains that the PV and PVC are not available, and when I try to install IG using forgeops that pod does not come up either. Could you please help? I can provide whatever is needed for debugging.

@stenolan1

Hello Sandeep, are you following the guide for AWS EKS here? https://backstage.forgerock.com/docs/forgeops/7.3/cdk/cloud/setup/eks/forgeops.html
The clone itself should be git clone https://github.com/ForgeRock/forgeops.git regardless of which branch you check out.
The branch that you check out for 7.3 should be git checkout release/7.3-20240131
Make sure you obtain details about your EKS cluster according to the details here:
https://backstage.forgerock.com/docs/forgeops/7.3/cdk/cloud/setup/eks/clusterinfo.html
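
Putting those two steps together, the expected workflow is:

# Clone the forgeops repo, then switch to the 7.3 release branch
git clone https://github.com/ForgeRock/forgeops.git
cd forgeops
git checkout release/7.3-20240131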

Steve Nolan

@sandeepk24 (Author)

Thank you Steve for getting back so promptly! I am following the links from Backstage, the same ones you mentioned. When I clone the repo on master I get all the files, but when I run git checkout release/7.3-20240131 to switch to that branch I lose many of the files, as you can see in the screenshot below.
Let me know if I am doing anything wrong, or if I can schedule a call so I can articulate it more clearly.
I am also facing a few other issues, such as not being able to bring the DS and IG pods up.

Thanks,
Sandeep

[screenshot: forgerock_master]

@lee-baines (Contributor)

Hi @sandeepk24. Ignore the difference in the commits. That's just because master is equivalent to 7.5 (unreleased), so there are significant differences between it and 7.3.
Also, Helm charts were introduced in 7.4, which is why the charts folder isn't there in 7.3. You just need to strictly follow the docs for 7.3. I've checked that branch and all the directories are correct.

@lee-baines (Contributor) commented Mar 18, 2024

Regarding DS, you need to ensure that you have the correct storage class available so the PVC can be correctly provisioned. So for EKS you need to apply the following:

# Create the storage classes DS expects on EKS
createStorageClasses() {
    kubectl create -f - <<EOF
kind: StorageClass
apiVersion: storage.k8s.io/v1
metadata:
    name: fast
provisioner: ebs.csi.aws.com
volumeBindingMode: WaitForFirstConsumer
---
kind: StorageClass
apiVersion: storage.k8s.io/v1
metadata:
    name: standard
provisioner: kubernetes.io/aws-ebs
parameters:
    type: gp2
EOF
}
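
To confirm the classes were created before retrying the DS install, a quick check with plain kubectl (the namespace below is a placeholder):

kubectl get storageclass fast standard
# "fast" uses WaitForFirstConsumer, so its PVCs stay Pending until a DS pod is scheduled onto a node
kubectl -n <your-namespace> get pvc,pv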

@sandeepk24 (Author)

Thank you @lee-baines for getting back. That information is helpful. We tried to install the ingress and were not able to, as the shell script uses Helm to build it. So we are planning on using the AWS ALB controller instead of the nginx ingress controller. Do you have any solution for ingress?

@lee-baines (Contributor)

Hi @sandeepk24. Why can't you use Helm?
Regardless, if you use an AWS ALB, then you'll need to set the correct annotation on the ingress.yaml:

annotations:
    kubernetes.io/ingress.class: alb

Beyond that, I haven't configured an ALB in 7 years :). So you'll have to look at the docs. The key consideration is that nginx offloads SSL inside the cluster. With an ALB, you'll offload SSL at the ALB load balancer, so traffic between the load balancer and the cluster will be unencrypted. We do have some ongoing work to address this but it's still in progress.
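
For illustration only (not from the ForgeOps docs), an ALB-terminated Ingress typically carries a few more AWS Load Balancer Controller annotations along these lines; the certificate ARN is a placeholder and the exact set depends on your controller version:

annotations:
    kubernetes.io/ingress.class: alb
    alb.ingress.kubernetes.io/scheme: internet-facing          # public-facing ALB
    alb.ingress.kubernetes.io/target-type: ip                  # route straight to pod IPs
    alb.ingress.kubernetes.io/listen-ports: '[{"HTTPS":443}]'
    alb.ingress.kubernetes.io/certificate-arn: <acm-certificate-arn>   # placeholder; SSL offloaded at the ALB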

@paulbsch any more considerations for ALBs?

@sandeepk24 (Author)

Thank you @lee-baines, that helps.
However, when I run the forgeops install command I am facing an issue:

./forgeops install ig --mini --deploy-env test --config-profile test -n iam-test --fqdn removed-this --debug
Flag --short has been deprecated, and will be removed in the future.
deployment manifest path: kustomize/"/data/forgeops/bin/../kustomize/deploy-test"
[DEBUG] Running: "kubectl version --client=true -o json"
[DEBUG] Running: "kubectl version -o json"
[DEBUG] Running: "kustomize version --short"
Flag --short has been deprecated, and will be removed in the future.
Checking cert-manager and related CRDs: [DEBUG] Running: "kubectl get crd certificaterequests.cert-manager.io"
[DEBUG] Running: "kubectl get crd certificates.cert-manager.io"
[DEBUG] Running: "kubectl get crd clusterissuers.cert-manager.io"
cert-manager CRD found in cluster.
[DEBUG] Running: "kubectl -n cert-manager get deployment cert-manager -o jsonpath={.spec.template.spec.containers[0].image}"
Checking secret-agent operator and related CRDs: [DEBUG] Running: "kubectl get crd secretagentconfigurations.secret-agent.secrets.forgerock.io"
secret-agent CRD found in cluster.

Checking secret-agent operator is running...
[DEBUG] Running: "kubectl wait --for=condition=Established crd secretagentconfigurations.secret-agent.secrets.forgerock.io --timeout=30s"
customresourcedefinition.apiextensions.k8s.io/secretagentconfigurations.secret-agent.secrets.forgerock.io condition met
[DEBUG] Running: "kubectl -n secret-agent-system wait --for=condition=available deployment --all --timeout=120s"
deployment.apps/secret-agent-controller-manager condition met
[DEBUG] Running: "kubectl -n secret-agent-system get pod -l app.kubernetes.io/name=secret-agent-manager --field-selector=status.phase==Running"
NAME READY STATUS RESTARTS AGE
secret-agent-controller-manager-74f6b575b8-tkztc 2/2 Running 2 (11d ago) 12d
secret-agent operator is running
[DEBUG] Running: "kubectl -n secret-agent-system get deployment secret-agent-controller-manager -o jsonpath={.spec.template.spec.containers[0].image}"
Checking ds-operator and related CRDs: [DEBUG] Running: "kubectl get crd directoryservices.directory.forgerock.io"
ds-operator CRD found in cluster.
[DEBUG] Running: "kubectl -n fr-system get deployment ds-operator-ds-operator -o jsonpath={.spec.template.spec.containers[0].image}"
Traceback (most recent call last):
  File "./forgeops", line 431, in <module>
    main()
  File "./forgeops", line 410, in main
    utils.install_dependencies(args.legacy)
  File "/data/forgeops/bin/utils.py", line 675, in install_dependencies
    _, img, _ = run('kubectl', f'-n fr-system get deployment ds-operator-ds-operator -o jsonpath={{.spec.template.spec.containers[0].image}}',
  File "/data/forgeops/bin/utils.py", line 229, in run
    raise(e)
  File "/data/forgeops/bin/utils.py", line 223, in run
    _r = subprocess.run(shlex.split(runcmd), stdout=stdo_pipe, stderr=stde_pipe,
  File "/usr/lib/python3.8/subprocess.py", line 516, in run
    raise CalledProcessError(retcode, process.args,
subprocess.CalledProcessError: Command '['kubectl', '-n', 'fr-system', 'get', 'deployment', 'ds-operator-ds-operator', '-o', 'jsonpath={.spec.template.spec.containers[0].image}']' returned non-zero exit status 1.

@lee-baines (Contributor)

Are you trying to just install IG?

@sandeepk24 (Author) commented Mar 20, 2024

Yes, for now only IG. I am running this on an AWS EKS cluster and trying to install the mini deployment for now. I tried adding all the roles and permissions mentioned in the document.

@sandeepk24 (Author) commented Mar 20, 2024

@lee-baines I ran the install with all components and the cert-manager download from GitHub fails, so it is unable to install cert-manager. utils.py is also failing.

./forgeops install --mini --deploy-env test --config-profile test -n iam-test --fqdn --debug
Could not verify Kubernetes server version. Continuing for now.
Flag --short has been deprecated, and will be removed in the future.
deployment manifest path: kustomize/"/data/forgeops/bin/../kustomize/deploy-test"
[DEBUG] Running: "kubectl version --client=true -o json"
[DEBUG] Running: "kubectl version -o json"
Could not verify Kubernetes server version. Continuing for now.
[DEBUG] Running: "kustomize version --short"
Flag --short has been deprecated, and will be removed in the future.
Checking cert-manager and related CRDs: [DEBUG] Running: "kubectl get crd certificaterequests.cert-manager.io"
cert-manager CRD not found. Installing cert-manager.
[DEBUG] Running: "kubectl apply -f https://github.com/jetstack/cert-manager/releases/latest/download/cert-manager.crds.yaml "
error: error validating "https://github.com/jetstack/cert-manager/releases/latest/download/cert-manager.crds.yaml": error validating data: failed to download openapi: the server has asked for the client to provide credentials; if you choose to ignore these errors, turn validation off with --validate=false
Traceback (most recent call last):
  File "/data/forgeops/bin/utils.py", line 623, in install_dependencies
    run('kubectl', 'get crd certificaterequests.cert-manager.io',
  File "/data/forgeops/bin/utils.py", line 229, in run
    raise(e)
  File "/data/forgeops/bin/utils.py", line 223, in run
    _r = subprocess.run(shlex.split(runcmd), stdout=stdo_pipe, stderr=stde_pipe,
  File "/usr/lib/python3.8/subprocess.py", line 516, in run
    raise CalledProcessError(retcode, process.args,
subprocess.CalledProcessError: Command '['kubectl', 'get', 'crd', 'certificaterequests.cert-manager.io']' returned non-zero exit status 1.

During handling of the above exception, another exception occurred:

Traceback (most recent call last):
  File "./forgeops", line 431, in <module>
    main()
  File "./forgeops", line 410, in main
    utils.install_dependencies(args.legacy)
  File "/data/forgeops/bin/utils.py", line 631, in install_dependencies
    certmanager('apply', tag=REQ_VERSIONS['cert-manager']['DEFAULT'])
  File "/data/forgeops/bin/utils.py", line 746, in certmanager
    run('kubectl',
  File "/data/forgeops/bin/utils.py", line 229, in run
    raise(e)
  File "/data/forgeops/bin/utils.py", line 223, in run
    _r = subprocess.run(shlex.split(runcmd), stdout=stdo_pipe, stderr=stde_pipe,
  File "/usr/lib/python3.8/subprocess.py", line 516, in run
    raise CalledProcessError(retcode, process.args,
subprocess.CalledProcessError: Command '['kubectl', 'apply', '-f', 'https://github.com/jetstack/cert-manager/releases/latest/download/cert-manager.crds.yaml']' returned non-zero exit status 1.

@lee-baines (Contributor)

I see there is a related support ticket for this. I think this is related to your Kubernetes versions.
Can you check your versions with kubectl version? Your local client version should be similar to your cluster version.
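
For reference, one way to compare the two (plain kubectl, nothing forgeops-specific):

kubectl version -o json
# Compare .clientVersion.gitVersion with .serverVersion.gitVersion;
# kubectl is only supported within one minor version of the API server.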

@lee-baines (Contributor)

@sandeepk24 (Author)

./certmanager-deploy.sh
"jetstack" already exists with the same configuration, skipping
Hang tight while we grab the latest from your chart repositories...
...Successfully got an update from the "ingress-nginx" chart repository
...Successfully got an update from the "jetstack" chart repository
Update Complete. ⎈Happy Helming!⎈
customresourcedefinition.apiextensions.k8s.io/certificaterequests.cert-manager.io unchanged
customresourcedefinition.apiextensions.k8s.io/certificates.cert-manager.io unchanged
customresourcedefinition.apiextensions.k8s.io/challenges.acme.cert-manager.io unchanged
customresourcedefinition.apiextensions.k8s.io/clusterissuers.cert-manager.io unchanged
customresourcedefinition.apiextensions.k8s.io/issuers.cert-manager.io unchanged
customresourcedefinition.apiextensions.k8s.io/orders.acme.cert-manager.io unchanged
Error: INSTALLATION FAILED: Unable to continue with install: ClusterRole "cert-manager-controller-certificates" in namespace "" exists and cannot be imported into the current release: invalid ownership metadata; label validation error: missing key "app.kubernetes.io/managed-by": must be set to "Helm"; annotation validation error: missing key "meta.helm.sh/release-name": must be set to "cert-manager"; annotation validation error: missing key "meta.helm.sh/release-namespace": must be set to "cert-manager"
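
For reference only (not from the thread's resolution): this ownership error usually means cert-manager resources already exist outside of Helm's control. One common way past it is either to remove the old cert-manager resources and re-run the script, or to add the labels and annotations named in the error so Helm can adopt the existing objects, e.g.:

# Let Helm adopt the pre-existing ClusterRole named in the error
kubectl label clusterrole cert-manager-controller-certificates app.kubernetes.io/managed-by=Helm --overwrite
kubectl annotate clusterrole cert-manager-controller-certificates \
    meta.helm.sh/release-name=cert-manager \
    meta.helm.sh/release-namespace=cert-manager --overwrite
# Repeat for any other resources reported, then re-run certmanager-deploy.sh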

@sandeepk24 (Author) commented Mar 25, 2024

Hey @lee-baines seeing this error now:

Traceback (most recent call last):
  File "/data/forgeops/bin/./forgeops", line 431, in <module>
    main()
  File "/data/forgeops/bin/./forgeops", line 410, in main
    utils.install_dependencies(args.legacy)
  File "/data/forgeops/bin/utils.py", line 675, in install_dependencies
    _, img, _ = run('kubectl', f'-n fr-system get deployment ds-operator-ds-operator -o jsonpath={{.spec.template.spec.containers[0].image}}',
  File "/data/forgeops/bin/utils.py", line 229, in run
    raise(e)
  File "/data/forgeops/bin/utils.py", line 223, in run
    _r = subprocess.run(shlex.split(runcmd), stdout=stdo_pipe, stderr=stde_pipe,
  File "/usr/local/lib/python3.11/subprocess.py", line 571, in run
    raise CalledProcessError(retcode, process.args,
subprocess.CalledProcessError: Command '['kubectl', '-n', 'fr-system', 'get', 'deployment', 'ds-operator-ds-operator', '-o', 'jsonpath={.spec.template.spec.containers[0].image}']' returned non-zero exit status 1.

@sandeepk24 (Author)

This got fixed once we ran ds-operator.sh from the bin directory. But now we are seeing a bunch of application failure errors in the DS, AM, and IG apps. I have sent you the logs in the ForgeRock community.
