Upgrading provider version results in plan run time increase #271

Closed
4 of 5 tasks
nbaju1 opened this issue Nov 12, 2024 · 10 comments · Fixed by #274

nbaju1 commented Nov 12, 2024

Describe the bug
We recently refactored our JFrog IaC and, as part of that, bumped the Artifactory provider version from 10.8.3 to ~12.3.0. The refactor was mostly renaming resources for better readability; no new resources were added.

After the version update, the run time for terraform plan increased from ~30 seconds to ~8 minutes. Looking at the changes introduced in versions after 10.8.3, the addition of the UpgradeState function for artifactory_user is one of the suspects for the main source of the increase. Is the intention to execute this on each terraform plan run?

We have ~1000 internal users (used for service accounts), and from the logs I see that an API call is made for each user, which I did not see before the update.

As I would rather not be stuck on an outdated version of the provider, is there something we can do to avoid this long run time?

Requirements for an issue

  • A description of the bug
  • A fully functioning terraform snippet that can be copy&pasted (no outside files or ENV vars unless that's part of the issue). If this is not supplied, this issue will likely be closed without any effort expended. NOT APPLICABLE
  • Your version of artifactory (you can curl it at $host/artifactory/api/system/version)
  • Your version of terraform
  • Your version of terraform provider

Terraform 1.9.8
Artifactory provider 12.3.0
Artifactory version: 7.100.2

Expected behavior
Hopefully not a 1500% increase in run time for terraform plan.

nbaju1 added the bug label Nov 12, 2024

alexhung commented Nov 12, 2024

@nbaju1 The UpgradeState function should run only once, when the Terraform state schema version is upgraded from 0 to 1 (e.g. during the first execution of terraform apply after the provider upgrade). Subsequent Terraform executions will not call UpgradeState, so the increase in plan/execution time should occur only once.

If you have access to your state file/data, you will see the schema_version field has the value of 0 for your artifactory_user resources that were created using provider v10.8.3.

After upgrading your provider and running terraform plan, you will see a REST API request that fetches the user data from Artifactory. This is not related to the UpgradeState func. The fetch is part of terraform plan: it checks the latest state of the resource and, if anything differs, generates an update plan. Assuming no change is needed, the schema version of that resource is not upgraded at this point. If there are changes, you will need to run terraform apply, which will also upgrade the state schema.

You can run terraform refresh which will refresh the state of the resource and upgrade its state schema in the process. Afterward, the schema_version should be 1.
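
One way to check this across many resources is to count instances per schema_version in the state JSON. The following is a minimal Python sketch, not part of the provider. It assumes the version 4 state layout (a top-level resources list whose instances each carry a schema_version field); the helper name schema_versions is made up for illustration, and the inline sample stands in for real output of terraform state pull.

```python
import json
from collections import Counter


def schema_versions(state: dict, resource_type: str) -> Counter:
    """Count managed instances of `resource_type` per schema_version.

    Assumes a version 4 state document: state["resources"] is a list and
    each resource has an "instances" list with a "schema_version" field.
    """
    counts = Counter()
    for resource in state.get("resources", []):
        if resource.get("mode") != "managed":
            continue  # skip data sources
        if resource.get("type") != resource_type:
            continue
        for instance in resource.get("instances", []):
            counts[instance.get("schema_version", 0)] += 1
    return counts


# Hypothetical, heavily trimmed sample; a real document could be obtained
# with `terraform state pull` and loaded with json.load().
sample_state = json.loads("""
{
  "version": 4,
  "resources": [
    {"mode": "managed", "type": "artifactory_user", "name": "svc_a",
     "instances": [{"schema_version": 0, "attributes": {}}]},
    {"mode": "managed", "type": "artifactory_user", "name": "svc_b",
     "instances": [{"schema_version": 1, "attributes": {}}]},
    {"mode": "managed", "type": "xray_security_policy", "name": "p",
     "instances": [{"schema_version": 0, "attributes": {}}]}
  ]
}
""")

print(schema_versions(sample_state, "artifactory_user"))
```

If any artifactory_user instances are still counted under 0 after a refresh or apply, those are the ones whose state schema has not been upgraded yet.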

What other resources are in your Terraform configuration? There were reports of slowness with artifactory_permission_target resource.

alexhung added the question label Nov 12, 2024

nbaju1 commented Nov 13, 2024

On further inspection of the log output I see that terraform plan seemingly spends ~6 minutes on refreshing the xray_license_policy resource.

2024-11-13T10:13:50.387Z [DEBUG] provider: plugin exited

##[error]2024-11-13T10:16:45.287Z [WARN]  Provider "registry.terraform.io/jfrog/xray" produced an unexpected new value for xray_license_policy.xray_allowed_licenses_policy_90BCC419 during refresh.
      - .rule: planned set element cty.ObjectVal(map[string]cty.Value{"actions":cty.SetVal([]cty.Value{cty.ObjectVal(map[string]cty.Value{"block_download":cty.SetVal([]cty.Value{cty.ObjectVal(map[string]cty.Value{"active":cty.False, "unscanned":cty.False})}), "block_release_bundle_distribution":cty.False, "block_release_bundle_promotion":cty.False, "build_failure_grace_period_in_days":cty.NullVal(cty.Number), "create_ticket_enabled":cty.False, "custom_severity":cty.StringVal("Medium"), "fail_build":cty.False, "mails":cty.SetVal([]cty.Value{cty.StringVal("<redacted>")}), "notify_deployer":cty.False, "notify_watch_recipients":cty.False, "webhooks":cty.NullVal(cty.Set(cty.String))})}), "criteria":cty.SetVal([]cty.Value{cty.ObjectVal(map[string]cty.Value{"allow_unknown":cty.True, "allowed_licenses":cty.SetVal([]cty.Value{<redacted>}), "banned_licenses":cty.NullVal(cty.Set(cty.String)), "multi_license_permissive":cty.True})}), "name":cty.StringVal("allowed_licenses"), "priority":cty.NumberIntVal(1)}) does not correlate with any element in actual

##[error]2024-11-13T10:20:03.341Z [DEBUG] provider.stdio: received EOF, stopping recv loop: err="rpc error: code = Unavailable desc = error reading from server: EOF"

All of our policy resources have a similar error, but the 6-minute time jump in the logs only happens after the license policy.
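
A time jump like this can be located mechanically by comparing consecutive timestamps in the debug log. The sketch below is an illustration only, assuming every relevant line contains a UTC timestamp in the format shown above; the function name largest_gaps is hypothetical, and the sample reuses the three log lines quoted in this comment with the long messages truncated.

```python
import re
from datetime import datetime

# Matches timestamps such as 2024-11-13T10:13:50.387Z anywhere in a line.
TIMESTAMP = re.compile(r"(\d{4}-\d{2}-\d{2}T\d{2}:\d{2}:\d{2}\.\d{3})Z")


def largest_gaps(lines, top=3):
    """Return up to `top` (seconds, earlier_line, later_line) tuples,
    sorted by the time elapsed between consecutive timestamped lines."""
    stamped = []
    for line in lines:
        match = TIMESTAMP.search(line)
        if match:
            when = datetime.strptime(match.group(1), "%Y-%m-%dT%H:%M:%S.%f")
            stamped.append((when, line.strip()))
    gaps = [
        ((later[0] - earlier[0]).total_seconds(), earlier[1], later[1])
        for earlier, later in zip(stamped, stamped[1:])
    ]
    return sorted(gaps, key=lambda gap: gap[0], reverse=True)[:top]


# Lines taken from the log excerpt above; messages are truncated here.
sample_log = [
    "2024-11-13T10:13:50.387Z [DEBUG] provider: plugin exited",
    "##[error]2024-11-13T10:16:45.287Z [WARN]  Provider produced an unexpected new value",
    "##[error]2024-11-13T10:20:03.341Z [DEBUG] provider.stdio: received EOF, stopping recv loop",
]

for seconds, earlier, later in largest_gaps(sample_log):
    print(f"{seconds:8.3f}s before: {later[:70]}")
```

Run against a full TF_LOG=debug capture, the top entries point at the log lines that precede and follow the longest pauses.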

When upgrading from Xray provider 2.8.0 to ~2.13, the expected type for actions and criteria on PolicyRule changed from this

Image

to this

Image

Similar type changes happened for block_download on PolicyRuleActions, where a list is now expected instead of an object.

Reverting to Xray provider 2.8.0 and changing our policy code accordingly brought the plan run time back to ~30 seconds, and the warning for the policy resources no longer appears in the logs.

(I guess this issue should be migrated to the Xray provider repository)

alexhung transferred this issue from jfrog/terraform-provider-artifactory Nov 13, 2024

alexhung commented

@nbaju1 Interesting. Can you try with Xray provider v2.11.0 and v2.11.1? I want to pin down when this performance issue first occurs. I have my suspicion but would like someone to confirm it 😄

alexhung commented

@nbaju1 I wonder if this is related to #262 as well.


nbaju1 commented Nov 13, 2024

@alexhung 2.11.0 works similar to 2.8.0. 2.11.1 introduces the new expected types mentioned above and results in the increase in run time.


nbaju1 commented Nov 13, 2024

> @nbaju1 I wonder if this is related to #262 as well.

@alexhung We only have 128 licenses in our allow list, which might be why our plan eventually finishes instead of hanging like the one in that issue.

alexhung commented

> @alexhung 2.11.0 works similar to 2.8.0. 2.11.1 introduces the new expected types mentioned above and results in the increase in run time.

That's what I suspected. The change between 2.11.0 and 2.11.1 should be transparent to end users. The attribute types should be compatible between versions.

Let me take a deeper look into this.

alexhung commented

@nbaju1 See my comment. If you have an opinion on which approach works for you, please let me know. I suspect switching the attribute type will help your use case.


nbaju1 commented Nov 18, 2024

@alexhung, thanks for the update. I believe changing attribute type will work for us. Since we use the CDK, adding our own validation is relatively simple, so the loss of that functionality is not that important.

alexhung removed the question label Nov 18, 2024
alexhung linked a pull request Nov 18, 2024 that will close this issue