error to launch task : ResourceInitializationError #2

Closed
Nydareld opened this issue Dec 17, 2021 · 6 comments

Comments

@Nydareld

Hello, I'm trying to launch a test Keycloak system using your tool, and I ran into some issues.

First, I had to downgrade Terraform from v1.1.1 to v1.0.11 because of a "branch not found" error while downloading requirements.
(I didn't save the error, but I'm not the only one, so I think it's a Terraform issue. If you want, I can upgrade to 1.1.1 to reproduce it.)

Then I got an error while executing make all in my environment:

│ Error: Invalid count argument
│
│   on .terraform/modules/keycloak.alb.access_logs.s3_bucket/main.tf line 163, in resource "aws_s3_bucket_policy" "default":
│  163:   count      = module.this.enabled && (var.allow_ssl_requests_only || var.allow_encrypted_uploads_only || var.policy != "") ? 1 : 0
│
│ The "count" value depends on resource attributes that cannot be determined until apply, so Terraform cannot predict how many instances will be created. To work around this, use the -target argument to first
│ apply only the resources that the count depends on.

I probably did something wrong here: I manually set the count to 1 in ".terraform/modules/keycloak.alb.access_logs.s3_bucket/main.tf" to see what would happen.
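
For readers hitting the same message, here is a minimal sketch of how "Invalid count argument" arises in general. It is not the upstream module's code: the `random` and `null` providers, the `policy_enabled` variable, and the resource names are all assumptions chosen for illustration. The point is that `count` must be resolvable during planning, so a condition built from an attribute that only exists after apply cannot be used.

```hcl
terraform {
  required_providers {
    random = {
      source = "hashicorp/random"
    }
    null = {
      source = "hashicorp/null"
    }
  }
}

# Hypothetical flag. It comes from a variable, so it is known at plan time.
variable "policy_enabled" {
  type    = bool
  default = true
}

# The hex value is generated during apply, so it is unknown while planning.
resource "random_id" "suffix" {
  byte_length = 4
}

# Uncommenting this block reproduces "Invalid count argument": the
# condition depends on an attribute Terraform cannot know until
# random_id.suffix has been created.
# resource "null_resource" "broken" {
#   count = random_id.suffix.hex != "" ? 1 : 0
# }

# Plans cleanly: count depends only on a plan-time value, while the
# apply-time value is used inside the resource body instead.
resource "null_resource" "ok" {
  count = var.policy_enabled ? 1 : 0

  triggers = {
    suffix = random_id.suffix.hex
  }
}
```

Note that hand edits under `.terraform/modules` are lost whenever the module is downloaded again, so they are only useful as a quick experiment.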

Then I built my Docker image and pushed it with make all ENV=test2 in the build folder.

The service updated correctly, and I see a redeployment event in my service.

Finally, all my tasks keep failing to launch, and the task details show an error that I don't understand:

ResourceInitializationError: unable to pull secrets or registry auth: execution resource retrieval failed: unable to retrieve secrets from ssm: service call has been retried 5 time(s): RequestCanceled: request context canceled caused by: context deadli...

I'm probably doing something wrong, but I don't see where.

Do you have any idea?

@deadlysyn
Owner

Hi, thanks for trying it and providing feedback.

The first issue is likely this terraform bug they are aware of:

hashicorp/terraform#30119

I was waiting on that to test 1.1 but sounds like a fix is in 1.1.2.

The last issue looks like required secrets are not accessible (bad path, IAM, etc). I am afk so can’t troubleshoot further now. Look over container_definition.json and it shows type and paths of required environment/secrets. If we figure it out I can make the docs better.
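
To rule out the IAM possibility, a rough sketch of the read permission an ECS task execution role typically needs for SSM-backed secrets follows. Everything named here is an assumption for illustration: the `/keycloak/test` path prefix, the variable names, and the policy name are hypothetical, and the real paths are whatever container_definition.json lists. A timeout such as the truncated "context deadli..." message usually suggests connectivity rather than permissions, so this is only one of the things to check.

```hcl
# Hypothetical inputs; replace with the values for your deployment.
variable "aws_region" {
  type = string
}

variable "execution_role_name" {
  type = string
}

# Hypothetical prefix; use the parameter paths from container_definition.json.
variable "ssm_path_prefix" {
  type    = string
  default = "/keycloak/test"
}

data "aws_caller_identity" "current" {}

# Allow the execution role to read parameters under the prefix.
# SecureString parameters encrypted with a customer-managed KMS key
# would also need kms:Decrypt on that key (not shown).
data "aws_iam_policy_document" "read_secrets" {
  statement {
    sid     = "ReadTaskSecrets"
    actions = ["ssm:GetParameters"]
    resources = [
      "arn:aws:ssm:${var.aws_region}:${data.aws_caller_identity.current.account_id}:parameter${var.ssm_path_prefix}/*",
    ]
  }
}

resource "aws_iam_role_policy" "read_secrets" {
  name   = "read-task-secrets"
  role   = var.execution_role_name
  policy = data.aws_iam_policy_document.read_secrets.json
}
```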

@deadlysyn
Owner

deadlysyn commented Jan 2, 2022

@Nydareld back at home and looking into this. the invalid count issue seems to come from this:

cloudposse/terraform-aws-alb#103

i missed that when bumping module versions too hastily last time. i may need to submit PR to fix the upstream issue. for now, i've rolled the alb module back to 0.33.1 and was able to cleanly apply.
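
for anyone pinning this themselves, a fragment along these lines would hold the module at the known-good release. this is a sketch, not the repo's actual code: it assumes the module is pulled from the public registry as `cloudposse/alb/aws` under the local name `alb`, and the inputs are omitted.

```hcl
module "alb" {
  # Assumption: registry source for cloudposse/terraform-aws-alb.
  source = "cloudposse/alb/aws"

  # Exact pin, so a later `terraform init -upgrade` cannot pull in the
  # release that triggers the invalid count error.
  version = "0.33.1"

  # Existing inputs (vpc_id, subnet_ids, and so on) stay unchanged
  # and are omitted from this fragment.
}
```

after changing the version, `terraform init -upgrade` is needed so the pinned release is actually downloaded.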

the ssm issue appears well explained here:

https://stackoverflow.com/questions/61265108/aws-ecs-fargate-resourceinitializationerror-unable-to-pull-secrets-or-registry

it sounds like fargate platform changes caused this, and it should only affect internal deploys. i believe the fix is additional vpc endpoints, but before going too far could you confirm you are deploying with internal = true?

@Nydareld
Author

Nydareld commented Jan 2, 2022

@deadlysyn First, thanks for spending time on my issue. I haven't spent much time on it myself, sorry.

I'm running with internal = false, and I provided some public subnet IDs.

I'm currently trying your latest release for the ALB issue.

By the way, the download issue is indeed fixed in Terraform 1.1.2.

@deadlysyn
Owner

successfully tested using:

enable_network       = false
private_subnet_ids   = ["subnet-03a9de8d6d60450ce", "subnet-09b0c7e0d434ade78", "subnet-091ce66e48470a35f"]
rds_source_region    = "us-east-2a"
route_table_ids      = ["rtb-60e05f0b"] # poor name; needs to be rtb w/ route to igw
vpc_id               = "vpc-4d2fd826"
public_subnet_ids    = ["subnet-e3edb899", "subnet-4dac1001", "subnet-b39f80db"]
internal             = false

per SO thread, key is having private subnets with a route table/route through NAT gateway on public subnet. this fixes ssm connectivity.
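
a minimal sketch of that network layout, assuming the vpc and subnets already exist and are passed in by id. the variable and resource names are hypothetical, and this is not what the module creates for you; it only illustrates "private subnets whose default route goes through a NAT gateway living in a public subnet".

```hcl
variable "vpc_id" {
  type = string
}

# A subnet whose route table has a route to an internet gateway.
variable "public_subnet_id" {
  type = string
}

variable "private_subnet_ids" {
  type = list(string)
}

# Elastic IP for the NAT gateway.
resource "aws_eip" "nat" {
  domain = "vpc" # AWS provider v5 and later; older versions use `vpc = true`
}

# The NAT gateway itself sits in the public subnet.
resource "aws_nat_gateway" "this" {
  allocation_id = aws_eip.nat.id
  subnet_id     = var.public_subnet_id
}

# Route table for the private subnets, with a default route via NAT.
resource "aws_route_table" "private" {
  vpc_id = var.vpc_id
}

resource "aws_route" "private_default" {
  route_table_id         = aws_route_table.private.id
  destination_cidr_block = "0.0.0.0/0"
  nat_gateway_id         = aws_nat_gateway.this.id
}

# Associate each private subnet with that route table. This conflicts
# if a subnet already has an explicit route table association.
resource "aws_route_table_association" "private" {
  count          = length(var.private_subnet_ids)
  subnet_id      = var.private_subnet_ids[count.index]
  route_table_id = aws_route_table.private.id
}
```

with this in place, tasks in the private subnets can reach the ssm api over the internet path without needing public ips.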

i'll try to add the required vpc endpoints for ssm and test internal = true in the coming week, then roll a 15.1.1 hotfix which includes these changes.
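
for reference, a hedged sketch of what those endpoints might look like. this is not the planned hotfix: the variable names are hypothetical, and the service list is the set commonly cited for fargate platform 1.4 tasks without internet access (ssm for secrets, ecr.api and ecr.dkr for image pulls, logs for awslogs, plus an s3 gateway endpoint for image layers).

```hcl
variable "aws_region" {
  type = string
}

variable "vpc_id" {
  type = string
}

variable "vpc_cidr" {
  type = string
}

variable "private_subnet_ids" {
  type = list(string)
}

variable "private_route_table_ids" {
  type = list(string)
}

# Allow HTTPS from inside the VPC to the interface endpoints.
resource "aws_security_group" "endpoints" {
  name_prefix = "vpc-endpoints-"
  vpc_id      = var.vpc_id

  ingress {
    from_port   = 443
    to_port     = 443
    protocol    = "tcp"
    cidr_blocks = [var.vpc_cidr]
  }
}

# One interface endpoint per service. Private DNS requires DNS support
# and DNS hostnames to be enabled on the VPC.
resource "aws_vpc_endpoint" "interface" {
  for_each = toset(["ssm", "ecr.api", "ecr.dkr", "logs"])

  vpc_id              = var.vpc_id
  service_name        = "com.amazonaws.${var.aws_region}.${each.key}"
  vpc_endpoint_type   = "Interface"
  subnet_ids          = var.private_subnet_ids
  security_group_ids  = [aws_security_group.endpoints.id]
  private_dns_enabled = true
}

# ECR stores image layers in S3, reached through a gateway endpoint.
resource "aws_vpc_endpoint" "s3" {
  vpc_id            = var.vpc_id
  service_name      = "com.amazonaws.${var.aws_region}.s3"
  vpc_endpoint_type = "Gateway"
  route_table_ids   = var.private_route_table_ids
}
```

if secrets are stored in secrets manager rather than ssm parameter store, `secretsmanager` would be added to the interface list.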

@Nydareld
Author

Nydareld commented Jan 4, 2022

> per SO thread, key is having private subnets with a route table/route through NAT gateway on public subnet. this fixes ssm connectivity.

That was it, i fixed my network, thanks for the help and sorry for the newbie problem ^^

Nydareld closed this as completed Jan 4, 2022
@deadlysyn
Owner

> > per SO thread, key is having private subnets with a route table/route through NAT gateway on public subnet. this fixes ssm connectivity.
>
> That was it, i fixed my network, thanks for the help and sorry for the newbie problem ^^

awesome, glad you got it working!

never a need to apologize for misunderstanding something new to you. i do it all the time :-) your report was quite useful, it exposed the need for better release testing, helped me fix a couple bugs which would have bitten anyone during deployment, and uncovered the internal = true bits no longer work as expected on fargate 1.4 (hadn't tested since 1.3 since i did that work in relation to another issue vs using it myself).

happy new year!
