We've had success using torch-ccl with ResNet and other AI workloads to test libfabric over psm3, but when we try to use libmlx-fi.so, torch-ccl does not seem to see it, even when the provider has been copied into the provider directory.
Is this a known limitation of torch-ccl? Is there a makefile we need to modify?
TIA.
@mwheinz torch-ccl doesn't work with the mlx provider. I think the issue is that oneCCL needs thread-multiple capability to use multiple workers, and the MLX provider doesn't support it, so it fails at the init call itself.
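For anyone who wants to confirm where the failure happens, here is a rough sketch of environment overrides that make libfabric and oneCCL log which provider they select. The helper names `provider_env` and `apply_env` are made up for illustration. The variables themselves (`FI_PROVIDER`, `FI_PROVIDER_PATH`, `FI_LOG_LEVEL`, `CCL_ATL_TRANSPORT`, `CCL_WORKER_COUNT`, `CCL_LOG_LEVEL`) are documented libfabric and oneCCL settings, but whether a single worker avoids the threading requirement with mlx is an untested assumption, not something confirmed in this thread.

```python
import os


def provider_env(provider, provider_path=None, worker_count=1):
    """Build environment overrides for a libfabric provider experiment.

    Hypothetical helper: it only assembles a dict and does not touch
    the running process.
    """
    env = {
        "FI_PROVIDER": provider,          # ask libfabric for this provider only
        "FI_LOG_LEVEL": "info",           # have libfabric log provider selection
        "CCL_ATL_TRANSPORT": "ofi",       # have oneCCL use its libfabric transport
        "CCL_WORKER_COUNT": str(worker_count),  # assumption: 1 worker may relax threading needs
        "CCL_LOG_LEVEL": "info",          # have oneCCL log its init steps
    }
    if provider_path is not None:
        # Directory where libfabric looks for dynamically loaded providers.
        env["FI_PROVIDER_PATH"] = provider_path
    return env


def apply_env(env):
    """Apply the overrides; call this before importing the torch-ccl bindings."""
    os.environ.update(env)
```

Calling `apply_env(provider_env("mlx"))` at the top of the training script, before torch-ccl is imported, should at least show in the logs whether the provider is found and whether init is where it fails.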