bug(nvidia-container-toolkit): unsupported IMEX channel #2062

Closed
ugurgural opened this issue Nov 19, 2024 · 2 comments

ugurgural commented Nov 19, 2024

Hi, since AMI version v20241109 I have been getting the error shown below. I initially thought my nvidia-device-plugin might be outdated, but updating to the latest version (0.17.0) didn't solve it. As a workaround, I reverted the Karpenter configuration to point to the previous version, v20241016. Does anyone have an idea what could be causing this?

Note: the configuration is almost bare-bones and uses the original EKS GPU AMI; I only enable the Docker service in the user data for DinD (Docker-in-Docker) scenarios.

Error: failed to create containerd task: failed to create shim task: OCI runtime create failed: runc create failed: unable to start container process: error during container init: error running hook #0: error running hook: exit status 1, stdout: , stderr: Auto-detected mode as 'legacy' nvidia-container-cli: error parsing IMEX info: unsupported IMEX channel value: all: unknown
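
For anyone who wants to check whether their own container start failures match this report, here is a minimal sketch that looks for the same error signature in a message string. The function name `find_imex_error` and the pattern are illustrative assumptions based only on the error text quoted above; they are not part of any NVIDIA or EKS tooling.

```python
import re

# Pattern built from the error text quoted in this issue (an assumption:
# other toolkit versions may word the message differently).
IMEX_ERROR = re.compile(r"unsupported IMEX channel value: (?P<value>[^:\s]+)")


def find_imex_error(message):
    """Return the rejected IMEX channel value, or None if the message does not match."""
    match = IMEX_ERROR.search(message)
    return match.group("value") if match else None


# Excerpt of the stderr output reported above.
sample = (
    "stderr: Auto-detected mode as 'legacy' nvidia-container-cli: "
    "error parsing IMEX info: unsupported IMEX channel value: all: unknown"
)
print(find_imex_error(sample))
```

A return value of `None` means the message does not contain this particular error, so the failure is probably a different problem.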

Environment:

  • AWS Region: eu-central-1
  • Instance Type(s): g4dn.8xlarge
  • EKS Platform version (use aws eks describe-cluster --name <name> --query cluster.platformVersion): eks.8
  • Kubernetes version (use aws eks describe-cluster --name <name> --query cluster.version): 1.30
  • AMI Version: v20241109
  • Kernel (e.g. uname -a): Linux ip-xxx-xx-xx-xx.eu-central-1.compute.internal 5.10.227-219.884.amzn2.x86_64 1 SMP Tue Oct 22 16:38:23 UTC 2024 x86_64 x86_64 x86_64 GNU/Linux
  • Release information (run cat /etc/eks/release on a node):
BASE_AMI_ID="ami-07d4ef2c1e42a1ce8"
BUILD_TIME="Sat Nov  9 22:55:23 UTC 2024"
BUILD_KERNEL="5.10.227-219.884.amzn2.x86_64"
ARCH="x86_64"
@Issacwww
Member

I think this issue is related to NVIDIA/nvidia-container-toolkit#797. We will have another release soon (the one after v20241115) that includes the updated nvidia-container-toolkit 1.17.2.

@cartermckinnon changed the title from '"unsupported IMEX channel" when using GPU AMI' to 'bug(nvidia-container-toolkit): unsupported IMEX channel' on Nov 21, 2024
@mselim00

This should now be addressed by the latest release, which upgrades nvidia-container-toolkit to v1.17.2:
https://github.com/awslabs/amazon-eks-ami/releases/tag/v20241121
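
As a rough way to tell whether a given AMI release tag is at or after the fixed release, the following sketch compares tags by their embedded date. It assumes tags follow the `vYYYYMMDD` form seen in this thread; the helper names `release_date` and `includes_toolkit_fix` are hypothetical and not part of any EKS tooling.

```python
import re

# First release reported in this thread to ship nvidia-container-toolkit v1.17.2.
FIXED_RELEASE = "v20241121"


def release_date(tag):
    """Parse a tag of the assumed form vYYYYMMDD into an integer such as 20241121."""
    match = re.fullmatch(r"v(\d{8})", tag)
    if match is None:
        raise ValueError(f"unexpected release tag format: {tag!r}")
    return int(match.group(1))


def includes_toolkit_fix(tag):
    """True if the tag is dated on or after the fixed release named in this thread."""
    return release_date(tag) >= release_date(FIXED_RELEASE)


for tag in ("v20241016", "v20241109", "v20241115", "v20241121"):
    print(tag, includes_toolkit_fix(tag))
```

Note that a `False` result only means the release predates the toolkit upgrade, not that it shows the error: v20241016 predates the upgrade but was reported above as working.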
