
[Bug Report] TTNN typecast operation fails when the tensor is on host #16279

sdjordjevicTT opened this issue on Dec 23, 2024
Labels: bug, op_cat: copy
Describe the bug
The TTNN typecast operation fails with the following error when the tensor is stored on the host:

2024-12-23 14:17:39,026 - ERROR - ERROR: test=/__w/tt-mlir/tt-mlir/build/test/ttmlir/Silicon/TTNN/embedding/Output/simple_embedding.mlir.tmp.ttnn experienced an error with exception=TT_THROW @ /__w/tt-mlir/tt-mlir/third_party/tt-metal/src/tt-metal/ttnn/cpp/ttnn/tensor/tensor.hpp:329: tt::exception
info:
Cannot get the device from a tensor with host storage

To Reproduce
Steps to reproduce the behavior:

  1. Run the following simple TTNN test:
import ttnn
import torch

device = ttnn.open_device(device_id=0)

# Create a TTNN tensor with host storage (no device argument is passed to from_torch).
torch_tensor = torch.rand(32, 32, dtype=torch.float32)
ttnn_tensor_cpu = ttnn.from_torch(torch_tensor, layout=ttnn.ROW_MAJOR_LAYOUT)

# Typecasting the host tensor throws "Cannot get the device from a tensor with host storage".
ttnn_tensor = ttnn.typecast(ttnn_tensor_cpu, dtype=ttnn.uint32)

ttnn.close_device(device)

  2. Observe that the test fails with the following exception:

Always | FATAL    | Cannot get the device from a tensor with host storage
Traceback (most recent call last):
  File "/localdev/sdjordjevic/src/tt-metal/python_test.py", line 9, in <module>
    ttnn_tensor = ttnn.typecast(ttnn_tensor_cpu, dtype=ttnn.uint32)
  File "/localdev/sdjordjevic/src/tt-metal/ttnn/ttnn/decorators.py", line 329, in __call__
    return self.function(*function_args, **function_kwargs)
RuntimeError: TT_THROW @ /localdev/sdjordjevic/src/tt-metal/ttnn/cpp/ttnn/tensor/tensor.hpp:329: tt::exception
info:
Cannot get the device from a tensor with host storage
backtrace:

Expected behavior
The typecast op should not fail here; when the input tensor has host storage, it should perform the conversion on the host under the hood (see the sketch of the currently working device path below for contrast).
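
For contrast, a minimal sketch of the path that does work today: moving the tensor to the device before calling typecast. This assumes ttnn.to_device for the host-to-device transfer and TILE_LAYOUT as the layout required by device-side typecast; both are assumptions on my part, not something confirmed in this issue.

import ttnn
import torch

device = ttnn.open_device(device_id=0)

torch_tensor = torch.rand(32, 32, dtype=torch.float32)
# TILE_LAYOUT is assumed here because typecast runs as a device op.
ttnn_tensor_cpu = ttnn.from_torch(torch_tensor, layout=ttnn.TILE_LAYOUT)

# Moving the tensor onto the device first avoids the host-storage error.
ttnn_tensor_dev = ttnn.to_device(ttnn_tensor_cpu, device)
ttnn_tensor = ttnn.typecast(ttnn_tensor_dev, dtype=ttnn.uint32)

ttnn.close_device(device)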

Screenshots
N/A

Please complete the following environment information:

  • OS: both Ubuntu 20 and Ubuntu 22
  • Version of software: latest main

Additional context
As a workaround, we can use to_dtype instead of the typecast op to perform the conversion on the host (a sketch is below), but I believe the semantics of typecast should cover this case as well.
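
A minimal sketch of that workaround, assuming ttnn.to_dtype converts the dtype of a host tensor without needing a device; the uint32 target dtype is taken from the repro above.

import ttnn
import torch

torch_tensor = torch.rand(32, 32, dtype=torch.float32)
ttnn_tensor_cpu = ttnn.from_torch(torch_tensor, layout=ttnn.ROW_MAJOR_LAYOUT)

# to_dtype performs the conversion on the host, so no device is opened here.
ttnn_tensor_u32 = ttnn.to_dtype(ttnn_tensor_cpu, ttnn.uint32)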

Comment from sdjordjevicTT (author):
Hi @sjameelTT, we are currently facing this issue with our MLIR-based compiler. @jnie-TT mentioned that you spoke with him and have plans to improve the typecast operation to support casting on the host. I've created this issue so we can track our progress on this matter.
