Is your feature request related to a problem?
We’re running an OpenSearch cluster with dedicated ML nodes backed by CUDA. These nodes run on GPU-enabled container runtimes and require a specific image with CUDA built in, while the other nodes should keep using the original images, which are smaller and runtime-agnostic.
What solution would you like?
Add an optional `image` property to the node spec that overrides the image in the Pod spec template of the resulting StatefulSet.
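To make the request concrete, here is a minimal Go sketch of the fallback behaviour I have in mind. It is not the operator's actual code: the `NodePool` type below is a simplified, hypothetical stand-in, `resolveImage` is a made-up helper, and the image names are placeholders.

```go
package main

import "fmt"

// NodePool is a hypothetical, simplified stand-in for the operator's
// node pool spec. Only the fields needed for this sketch are included.
type NodePool struct {
	Component string
	// Image is the proposed optional override. Empty means
	// "use the cluster-wide image".
	Image string
}

// resolveImage (hypothetical helper) picks the image for a node pool's
// StatefulSet: the pool-level override if set, else the cluster default.
func resolveImage(clusterImage string, pool NodePool) string {
	if pool.Image != "" {
		return pool.Image
	}
	return clusterImage
}

func main() {
	// Placeholder image names, for illustration only.
	clusterImage := "example.com/opensearch:default"
	pools := []NodePool{
		{Component: "data"},
		{Component: "ml", Image: "example.com/opensearch-cuda:custom"},
	}
	for _, p := range pools {
		fmt.Printf("%s -> %s\n", p.Component, resolveImage(clusterImage, p))
	}
}
```

With this shape, existing clusters that never set the field keep their current behaviour, and only the ML pool pulls the larger CUDA image.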
What alternatives have you considered?
If all ML nodes are CUDA-enabled, we can endure the larger images and just use the CUDA version for all nodes.
Do you have any additional context?
To use CUDA with a specific runtime, we also need the ability to set the `runtimeClassName` property in the Pod spec. This should be another small feature request.
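The `runtimeClassName` part could follow the same optional-field pattern. The sketch below is an assumption about how it might look, not existing operator code: `PodSpec` here is a local stand-in that mirrors only the `RuntimeClassName` pointer field of the Kubernetes Pod spec, `NodePool` and `applyRuntimeClass` are hypothetical, and `nvidia` is just an example RuntimeClass name.

```go
package main

import "fmt"

// PodSpec is a local stand-in that mirrors a single field of the
// Kubernetes Pod spec, where runtimeClassName is an optional string.
type PodSpec struct {
	RuntimeClassName *string
}

// NodePool is a hypothetical, simplified node pool spec.
type NodePool struct {
	Component string
	// RuntimeClassName is the proposed optional setting.
	// Empty means "leave the Pod spec untouched".
	RuntimeClassName string
}

// applyRuntimeClass (hypothetical helper) copies the pool-level setting
// into the Pod spec only when it is set.
func applyRuntimeClass(spec *PodSpec, pool NodePool) {
	if pool.RuntimeClassName == "" {
		return
	}
	name := pool.RuntimeClassName
	spec.RuntimeClassName = &name
}

func main() {
	spec := PodSpec{}
	// "nvidia" is an example value; the real name depends on the cluster.
	applyRuntimeClass(&spec, NodePool{Component: "ml", RuntimeClassName: "nvidia"})
	fmt.Println(*spec.RuntimeClassName)
}
```

Pools that leave the field empty would get Pods with no runtime class set, so the cluster's default runtime still applies to them.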
[Triage]
Hey @stevapple, as of today a custom image at the node pool level (spec.nodePools[0].image) is not supported in the NodePool struct. Also, just curious: CUDA built-in OpenSearch images are not officially released by the project today (going by the issue opensearch-project/opensearch-build#4743 that you created :) ). Do you have a custom-built image for this purpose?
Also, if you are open to it, could you please contribute the feature to allow overriding the image spec for a node group?