Note: the tolerations are applied based on the task-wide resources specified in the @task decorator, but not based on the launcher/worker-specific resource specification.
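For illustration, a minimal sketch of the two ways of requesting GPUs (assuming the current flytekitplugins-kfmpi API; task names and bodies are made up):

```python
from flytekit import Resources, task
from flytekitplugins.kfmpi import Launcher, MPIJob, Worker


# GPUs requested via the top-level @task resources: the default GPU
# tolerations from the charts are applied, but the launcher pod also
# inherits the GPU placement.
@task(
    task_config=MPIJob(launcher=Launcher(replicas=1), worker=Worker(replicas=2)),
    requests=Resources(gpu="1"),
    limits=Resources(gpu="1"),
)
def train_with_task_resources():
    ...


# GPUs requested only for the workers via the MPIJob config: the worker
# pods get the GPU request, but the default GPU tolerations are missing.
@task(
    task_config=MPIJob(
        launcher=Launcher(replicas=1),
        worker=Worker(replicas=2, requests=Resources(gpu="1"), limits=Resources(gpu="1")),
    ),
)
def train_with_worker_resources():
    ...
```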
Describe the bug
When defining an MPITask, the tolerations that are set by default for GPU usage in the deployment charts are not applied to the created workers.
I.e., this configuration is not applied:
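For reference, a sketch of the kind of chart configuration meant here, assuming the flytepropeller k8s plugin settings exposed through the Helm values (exact keys, taints, and paths may differ in your deployment):

```yaml
# Sketch of the propeller k8s plugin configuration; the taint key/effect
# are assumptions and should match your cluster's GPU node taints.
plugins:
  k8s:
    resource-tolerations:
      nvidia.com/gpu:
        - key: nvidia.com/gpu
          operator: Exists
          effect: NoSchedule
```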
This also has a bad side effect: the MPI launcher ends up on a GPU node, even though it could run on a CPU node, wasting resources.
Expected behavior
The GPU tolerations should be applied automatically to MPI workers when GPUs are requested.
Currently this can be worked around by passing a flytekit.PodTemplate to the MPI task, but this makes the code very cumbersome, e.g.:
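Something along these lines (a sketch assuming flytekitplugins-kfmpi and the kubernetes Python client; the toleration values are placeholders for whatever the charts would normally inject):

```python
from flytekit import PodTemplate, Resources, task
from flytekitplugins.kfmpi import Launcher, MPIJob, Worker
from kubernetes.client import V1Container, V1PodSpec, V1Toleration

# The GPU toleration that the deployment charts would normally add
# automatically (values are an assumption; match your GPU node taint).
gpu_toleration = V1Toleration(key="nvidia.com/gpu", operator="Exists", effect="NoSchedule")


@task(
    task_config=MPIJob(
        launcher=Launcher(replicas=1),
        worker=Worker(replicas=2, requests=Resources(gpu="1"), limits=Resources(gpu="1")),
    ),
    pod_template=PodTemplate(
        pod_spec=V1PodSpec(
            # An (almost) empty "primary" container so the spec merges with
            # the default task container instead of replacing it.
            containers=[V1Container(name="primary")],
            tolerations=[gpu_toleration],
        ),
    ),
)
def train():
    ...
```

This has to be repeated, and kept in sync with the chart configuration, for every GPU MPI task, which is what makes it cumbersome.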
Additional context to reproduce
No response
Screenshots
No response