Error when running on Apple Silicon with device = "mps" #40

Open
KennyOrellana opened this issue Mar 19, 2023 · 6 comments

Comments

@KennyOrellana

Hi, I'm running the example use_vmas_env. It runs well on an Apple M1 Max with device = "cpu", but I get an error when I change it to device = "mps".

I installed PyTorch for Apple Silicon following the documentation https://developer.apple.com/metal/pytorch/

Could you help me to figure out how to fix this problem?

Here is the console log

/Users/kenny/.conda/envs/pythonProject/bin/python /Users/kenny/Projects/Pycharm/pythonProject/main.py 
Step 1
/Users/kenny/Projects/Pycharm/pythonProject/VectorizedMultiAgentSimulator/vmas/simulator/core.py:1594: UserWarning: The operator 'aten::linalg_vector_norm' is not currently supported on the MPS backend and will fall back to run on the CPU. This may have performance implications. (Triggered internally at /Users/runner/work/pytorch/pytorch/pytorch/aten/src/ATen/mps/MPSFallback.mm:12.)
  torch.linalg.vector_norm(a.state.pos - b.state.pos, dim=1)
Traceback (most recent call last):
  File "/Users/kenny/Projects/Pycharm/pythonProject/main.py", line 93, in <module>
    use_vmas_env(render=True, save_render=False)
  File "/Users/kenny/Projects/Pycharm/pythonProject/main.py", line 73, in use_vmas_env
    env.render(
  File "/Users/kenny/Projects/Pycharm/pythonProject/VectorizedMultiAgentSimulator/vmas/simulator/environment/environment.py", line 491, in render
    self.viewer.set_bounds(
  File "/Users/kenny/Projects/Pycharm/pythonProject/VectorizedMultiAgentSimulator/vmas/simulator/rendering.py", line 131, in set_bounds
    self.bounds = np.array([left, right, bottom, top])
  File "/Users/kenny/.conda/envs/pythonProject/lib/python3.10/site-packages/torch/_tensor.py", line 970, in __array__
    return self.numpy()
TypeError: can't convert mps:0 device type tensor to numpy. Use Tensor.cpu() to copy the tensor to host memory first.

Process finished with exit code 1
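
The final TypeError in the log above comes from passing tensors that live on the mps device straight to np.array. A minimal sketch of the usual workaround follows, assuming the bounds are scalar tensors or plain floats; to_numpy_bounds is a hypothetical helper written for illustration and is not part of vmas.

```python
import numpy as np
import torch


def to_numpy_bounds(left, right, bottom, top):
    """Build a NumPy array from values that may be tensors on any device.

    Hypothetical helper for illustration; not the vmas implementation.
    """
    values = []
    for v in (left, right, bottom, top):
        if isinstance(v, torch.Tensor):
            # .cpu() copies the value to host memory (no-op for CPU tensors),
            # which is what the TypeError message asks for.
            v = v.detach().cpu().item()
        values.append(float(v))
    return np.array(values)


# Use "mps" when this PyTorch build exposes it, else "cpu", so the sketch runs anywhere.
mps_backend = getattr(torch.backends, "mps", None)
device = "mps" if mps_backend is not None and mps_backend.is_available() else "cpu"

left = torch.tensor(-1.0, device=device)
bounds = to_numpy_bounds(left, 1.0, -1.0, 1.0)
```

The same pattern (call .cpu() before any NumPy conversion) applies to any tensor that reaches rendering code.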
@matteobettini
Member

matteobettini commented Mar 19, 2023

Hello,

mps is not yet fully supported as a torch device, so we haven't tested vmas on it.

My suggestion is to use the cpu on Mac until the remaining mps issues are fixed
(this is a torch limitation rather than a vmas one).

EDIT: Now it should run (almost) smoothly
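
One way to follow the suggestion above is to make the device choice explicit and default to the cpu. This is only a sketch: pick_device and its allow_mps flag are hypothetical names, not part of vmas or torch.

```python
import torch


def pick_device(allow_mps: bool = False) -> str:
    """Return "mps" only when explicitly allowed and available, else "cpu".

    Hypothetical helper for illustration.
    """
    # torch.backends.mps is missing on older PyTorch builds, so look it up defensively.
    mps_backend = getattr(torch.backends, "mps", None)
    if allow_mps and mps_backend is not None and mps_backend.is_available():
        return "mps"
    return "cpu"


device = pick_device()  # defaults to "cpu", as suggested for Mac
x = torch.zeros(3, device=device)
```

The resulting string can then be passed wherever the example script sets device.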

@KennyOrellana
Author

Is it feasible to replace PyTorch with TensorFlow? Or does TensorFlow have limitations that would prevent this?

@matteobettini
Member

The whole simulator is based on PyTorch, so I do not think it is feasible.

I'll look into testing vmas with mps and fixing what I can, and I will let you know if there are any improvements.

@KennyOrellana
Author

Thanks, that would be really helpful 🙌🏽 .

@matteobettini
Member

matteobettini commented Mar 20, 2023

There are a lot of operators which are not yet supported on mps, like the 'norm' one.
These are core to vmas and are used everywhere.

UserWarning: The operator 'aten::linalg_vector_norm' is not currently supported on the MPS backend and will fall back to run on the CPU. This may have performance implications. (Triggered internally at /Users/runner/work/pytorch/pytorch/pytorch/aten/src/ATen/mps/MPSFallback.mm:11.)
  torch.linalg.vector_norm(a.state.pos - b.state.pos, dim=1)

I think for now we will just have to wait for torch to add mps support for these operators.

In the meantime, the Mac M1/M2 CPUs are really fast, so you can use those.

@matteobettini
Member

matteobettini commented Jul 27, 2023

The simulator now seems to run fine with device="mps". The problem is that many operations, like the norm mentioned above, are extremely slow because they are not supported on mps and fall back to the cpu.
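
To check how much the norm operation costs on a given machine, a rough timing sketch such as the one below can help. It assumes a batch of 2D positions like the a.state.pos - b.state.pos expression in the warning; time_norm, the batch size, and the repeat count are arbitrary choices made for illustration, and no particular result is implied.

```python
import time

import torch


def time_norm(device: str, batch: int = 4096, repeats: int = 50) -> float:
    """Return the seconds spent on repeated torch.linalg.vector_norm calls.

    Hypothetical benchmark helper; sizes are arbitrary.
    """
    a = torch.rand(batch, 2, device=device)
    b = torch.rand(batch, 2, device=device)
    # Warm-up call so one-time setup cost is not measured.
    torch.linalg.vector_norm(a - b, dim=1).cpu()
    start = time.perf_counter()
    for _ in range(repeats):
        dist = torch.linalg.vector_norm(a - b, dim=1)
    # Copying to host forces any pending device work to finish before timing stops.
    dist.cpu()
    return time.perf_counter() - start


cpu_seconds = time_norm("cpu")
mps_backend = getattr(torch.backends, "mps", None)
if mps_backend is not None and mps_backend.is_available():
    mps_seconds = time_norm("mps")
    print(f"cpu: {cpu_seconds:.4f}s, mps: {mps_seconds:.4f}s")
else:
    print(f"cpu: {cpu_seconds:.4f}s (mps not available)")
```

Comparing the two numbers on the target machine shows whether mps is worthwhile for this workload.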

@matteobettini matteobettini pinned this issue Jul 27, 2023
@matteobettini matteobettini unpinned this issue Jun 26, 2024