Cluster Autoscaler pod fails with error "MissingRegion" #7389

Open
sivachandran-s opened this issue Oct 14, 2024 · 4 comments
Labels
area/cluster-autoscaler kind/bug Categorizes issue or PR as related to a bug.

Comments

@sivachandran-s

sivachandran-s commented Oct 14, 2024

cluster-autoscaler: 1.30

Component version:

What k8s version are you using (kubectl version)?:

Client Version: v1.30.2
Kustomize Version: v5.0.4-0.20230601165947-6ce0bf390ce3
Server Version: v1.30.4-eks-a737599

What environment is this in?: Test

What did you expect to happen?:

I1014 17:54:22.064047 1 main.go:644] Cluster Autoscaler 1.30.0
I1014 17:54:22.155804 1 leaderelection.go:250] attempting to acquire leader lease kube-system/cluster-autoscaler...
I1014 17:54:22.168685 1 leaderelection.go:260] successfully acquired lease kube-system/cluster-autoscaler
I1014 17:54:22.169026 1 event_sink_logging_wrapper.go:48] Event(v1.ObjectReference{Kind:"Lease", Namespace:"kube-system", Name:"cluster-autoscaler", UID:"b82b9120-4fc3-4bc2-8b92-21daf9dd151f", APIVersion:"coordination.k8s.io/v1", ResourceVersion:"19453", FieldPath:""}): type: 'Normal' reason: 'LeaderElection' cluster-autoscaler-7c5484cd44-59xj8 became leader
I1014 17:54:22.251583 1 framework.go:373] "the scheduler starts to work with those plugins" Plugins={"PreEnqueue":{"Enabled":[{"Name":"SchedulingGates","Weight":0}],"Disabled":null},"QueueSort":{"Enabled":[{"Name":"PrioritySort","Weight":0}],"Disabled":null},"PreFilter":{"Enabled":[{"Name":"NodeAffinity","Weight":0},{"Name":"NodePorts","Weight":0},{"Name":"NodeResourcesFit","Weight":0},{"Name":"VolumeRestrictions","Weight":0},{"Name":"EBSLimits","Weight":0},{"Name":"GCEPDLimits","Weight":0},{"Name":"NodeVolumeLimits","Weight":0},{"Name":"AzureDiskLimits","Weight":0},{"Name":"VolumeBinding","Weight":0},{"Name":"VolumeZone","Weight":0},{"Name":"PodTopologySpread","Weight":0},{"Name":"InterPodAffinity","Weight":0}],"Disabled":null},"Filter":{"Enabled":[{"Name":"NodeUnschedulable","Weight":0},{"Name":"NodeName","Weight":0},{"Name":"TaintToleration","Weight":0},{"Name":"NodeAffinity","Weight":0},{"Name":"NodePorts","Weight":0},{"Name":"NodeResourcesFit","Weight":0},{"Name":"VolumeRestrictions","Weight":0},{"Name":"EBSLimits","Weight":0},{"Name":"GCEPDLimits","Weight":0},{"Name":"NodeVolumeLimits","Weight":0},{"Name":"AzureDiskLimits","Weight":0},{"Name":"VolumeBinding","Weight":0},{"Name":"VolumeZone","Weight":0},{"Name":"PodTopologySpread","Weight":0},{"Name":"InterPodAffinity","Weight":0}],"Disabled":null},"PostFilter":{"Enabled":[{"Name":"DefaultPreemption","Weight":0}],"Disabled":null},"PreScore":{"Enabled":[{"Name":"TaintToleration","Weight":0},{"Name":"NodeAffinity","Weight":0},{"Name":"NodeResourcesFit","Weight":0},{"Name":"VolumeBinding","Weight":0},{"Name":"PodTopologySpread","Weight":0},{"Name":"InterPodAffinity","Weight":0},{"Name":"NodeResourcesBalancedAllocation","Weight":0}],"Disabled":null},"Score":{"Enabled":[{"Name":"TaintToleration","Weight":3},{"Name":"NodeAffinity","Weight":2},{"Name":"NodeResourcesFit","Weight":1},{"Name":"VolumeBinding","Weight":1},{"Name":"PodTopologySpread","Weight":2},{"Name":"InterPodAffinity","Weight":2},{"Name":"NodeResourcesB
alancedAllocation","Weight":1},{"Name":"ImageLocality","Weight":1}],"Disabled":null},"Reserve":{"Enabled":[{"Name":"VolumeBinding","Weight":0}],"Disabled":null},"Permit":{"Enabled":null,"Disabled":null},"PreBind":{"Enabled":[{"Name":"VolumeBinding","Weight":0}],"Disabled":null},"Bind":{"Enabled":[{"Name":"DefaultBinder","Weight":0}],"Disabled":null},"PostBind":{"Enabled":null,"Disabled":null},"MultiPoint":{"Enabled":null,"Disabled":null}}
I1014 17:54:22.265983 1 cloud_provider_builder.go:30] Building aws cloud provider.
E1014 17:54:25.405156 1 aws_cloud_provider.go:433] Failed to generate AWS EC2 Instance Types: MissingRegion: could not find region configuration, falling back to static list with last update time: 2024-04-08

What happened instead?:

How to reproduce it (as minimally and precisely as possible):

Anything else we need to know?:
The cluster-autoscaler pod is crashing in our setup with the MissingRegion error. How can this be solved?

Also, when I deploy EKS 1.29 with cluster-autoscaler 1.29, I do not see any issue, and I do not see it after upgrading EKS from 1.29 to 1.30 either. I get the reported issue only on a fresh install of EKS 1.30 with cluster-autoscaler 1.30.
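
For anyone triaging the same MissingRegion message, a quick first check is whether the pod's environment carries a region at all. The sketch below is only an illustration: it assumes the AWS SDK inside the autoscaler consults `AWS_REGION` (and possibly `AWS_DEFAULT_REGION`), and the helper name `find_region` is made up for this example. It does not reproduce the SDK's actual lookup order.

```python
import os

# Assumption: these are the environment variables an AWS SDK commonly
# consults for the region. The real lookup order may differ.
REGION_VARS = ("AWS_REGION", "AWS_DEFAULT_REGION")


def find_region(env):
    """Return (variable, value) for the first non-empty region variable, or None."""
    for name in REGION_VARS:
        value = env.get(name, "").strip()
        if value:
            return name, value
    return None


if __name__ == "__main__":
    found = find_region(os.environ)
    if found is None:
        print("no region variable set; the SDK would need another source")
    else:
        print(f"{found[0]}={found[1]}")
```

Running it inside the autoscaler container (or checking the same variables in the pod spec) shows whether a region is being passed explicitly or has to be discovered some other way.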

@sivachandran-s sivachandran-s added the kind/bug Categorizes issue or PR as related to a bug. label Oct 14, 2024
@adrianmoisey
Member

/area cluster-autoscaler

@layor2257

I started experiencing this issue as well. I added this environment variable block:
env {
name = "AWS_REGION"
value = "eu-west-1"
}

but now I am getting this error:

I1016 10:13:31.629902 1 auto_scaling_groups.go:360] Regenerating instance to ASG map for ASG names: []
I1016 10:13:31.629918 1 auto_scaling_groups.go:367] Regenerating instance to ASG map for ASG tags: map[k8s.io/cluster-autoscaler/enabled: k8s.io/cluster-autoscaler/fcmb-stg-tco0001-cluster:]
I1016 10:13:34.932762 1 trace.go:219] Trace[774965466]: "Reflector ListAndWatch" name:k8s.io/autoscaler/cluster-autoscaler/utils/kubernetes/listers.go:212 (16-Oct-2024 10:13:24.533) (total time: 10398ms):
Trace[774965466]: ---"Objects listed" error: 10398ms (10:13:34.932)
Trace[774965466]: ---"Resource version extracted" 0ms (10:13:34.932)
Trace[774965466]: ---"Objects extracted" 0ms (10:13:34.932)
Trace[774965466]: ---"SyncWith done" 0ms (10:13:34.932)
Trace[774965466]: ---"Resource version updated" 0ms (10:13:34.932)
Trace[774965466]: [10.398963846s] [10.398963846s] END
I1016 10:13:35.129666 1 trace.go:219] Trace[1852186258]: "Reflector ListAndWatch" name:k8s.io/client-go/informers/factory.go:150 (16-Oct-2024 10:13:24.530) (total time: 10598ms):
Trace[1852186258]: ---"Objects listed" error: 10507ms (10:13:35.038)
Trace[1852186258]: ---"Resource version extracted" 0ms (10:13:35.038)
Trace[1852186258]: ---"Objects extracted" 90ms (10:13:35.128)
Trace[1852186258]: ---"SyncWith done" 0ms (10:13:35.129)
Trace[1852186258]: ---"Resource version updated" 0ms (10:13:35.129)
Trace[1852186258]: [10.598647288s] [10.598647288s] END
E1016 10:13:36.828775 1 aws_manager.go:125] Failed to regenerate ASG cache: NoCredentialProviders: no valid providers in chain. Deprecated.
For verbose messaging see aws.Config.CredentialsChainVerboseErrors
F1016 10:13:36.828823 1 aws_cloud_provider.go:419] Failed to create AWS Manager: NoCredentialProviders: no valid providers in chain. Deprecated.
For verbose messaging see aws.Config.CredentialsChainVerboseErrors
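
The NoCredentialProviders message suggests the SDK found no usable credential source once the region was set. As a rough aid, the sketch below lists which common credential sources are fully configured in an environment. The environment variable names are standard AWS ones, but the grouping, labels, and the `configured_sources` helper are assumptions made for this example, and the list does not cover the whole provider chain.

```python
import os

# Hypothetical grouping for illustration; each source counts as configured
# only when all of its variables are set to non-empty values.
CREDENTIAL_SOURCES = {
    "static keys": ("AWS_ACCESS_KEY_ID", "AWS_SECRET_ACCESS_KEY"),
    "web identity (IRSA)": ("AWS_ROLE_ARN", "AWS_WEB_IDENTITY_TOKEN_FILE"),
    "container credentials": ("AWS_CONTAINER_CREDENTIALS_FULL_URI",),
}


def configured_sources(env):
    """Return the labels of credential sources whose variables are all present."""
    return [
        label
        for label, names in CREDENTIAL_SOURCES.items()
        if all(env.get(name) for name in names)
    ]


if __name__ == "__main__":
    sources = configured_sources(os.environ)
    print(sources if sources else "no env-based credential source found")
```

An empty result would mean the pod depends on a source outside its environment, such as the node's instance role, so that path is worth checking next.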

@sivachandran-s
Author

One workaround that worked for me was changing the EKS AMI type to "AL2_x86_64" instead of using the default type.

@EvgeniiIakubov

I also got this error (the same logs, also with the AWS_REGION env variable set).
The error does not appear on upgraded clusters (I tried upgrading from 1.29 in sequence), only on new clusters.
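
Since the error shows up only on new clusters, one way to compare a new node with an upgraded one is to check from inside a pod whether the EC2 instance metadata service answers. This is a minimal sketch, assuming the autoscaler falls back to instance metadata for the region and credentials. The function names are made up for this example, and this thread does not establish why new clusters behave differently.

```python
import urllib.request

IMDS = "http://169.254.169.254"


def token_request(ttl_seconds=60):
    """Build the IMDSv2 request that asks for a session token."""
    return urllib.request.Request(
        f"{IMDS}/latest/api/token",
        method="PUT",
        headers={"X-aws-ec2-metadata-token-ttl-seconds": str(ttl_seconds)},
    )


def region_request(token):
    """Build the request that reads the instance's region using a session token."""
    return urllib.request.Request(
        f"{IMDS}/latest/meta-data/placement/region",
        headers={"X-aws-ec2-metadata-token": token},
    )


def probe_region(timeout=2.0):
    """Return the region reported by IMDS, or None if it cannot be reached."""
    try:
        with urllib.request.urlopen(token_request(), timeout=timeout) as resp:
            token = resp.read().decode()
        with urllib.request.urlopen(region_request(token), timeout=timeout) as resp:
            return resp.read().decode()
    except OSError:
        # Covers timeouts, refused connections, and HTTP errors.
        return None


if __name__ == "__main__":
    print(probe_region() or "instance metadata not reachable from here")
```

If the probe returns a region on an upgraded cluster but nothing on a fresh one, the difference likely lies in how pods on the new nodes can reach instance metadata.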
