Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Bug fix multiple runtimes #330

Merged
merged 2 commits into from
Aug 9, 2024
Merged

Bug fix multiple runtimes #330

merged 2 commits into from
Aug 9, 2024

Conversation

nsmithtt
Copy link
Contributor

@nsmithtt nsmithtt commented Aug 8, 2024

  • For some reason the cmake variables for toggling different runtimes were not working for me. Only the TTNN runtime was able to be enabled.
  • We also need to set the runtime before we open the device, otherwise the device and the rest of the runtime will mismatch.
  • For now we also just hardcode ttrt to be single device

- For some reason the cmake variables for toggling different runtimes
  were not working for me.  Only the TTNN runtime was able to be
  enabled.
- We also need to set the runtime before we open the device, otherwise
  the device and the rest of the runtime will mismatch.
- For now we also just hardcode ttrt to be single device
@nsmithtt nsmithtt force-pushed the nsmith/runtime-fix branch from d886751 to 7370312 Compare August 8, 2024 21:05
atexit.register(lambda: ttrt.runtime.close_device(device))

torch.manual_seed(args.seed)

for (binary_name, fbb, fbb_dict, program_indices) in fbb_list:
ttrt.runtime.set_compatible_runtime(fbb)
Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@jnie-TT, I'm actually glad we got bit by this right away! What happened was that we called open_device above which opens a ttnn device, then we ttrt.runtime.set_compatible_runtime(fbb) to ttmetal runtime and boom!

Can you file an issue to track this? We could do something like:

enum class DeviceRuntime {    
  Disabled,    
  TTNN,    
  TTMetal,    
}; 

struct ObjectImpl {             
  DeviceRuntime runtime;                                                                                                                                                               
  std::shared_ptr<void> handle;                                                                                                                                                      
                                                                                                                                                                                    
  ObjectImpl(std::shared_ptr<void> handle) : handle(handle) {}                                                                                                                                           
  template <DeviceRuntime R, typename T> T &as() { assert(associatedRuntime == R); return *static_cast<T *>(handle.get()); }                                                                                                                     
  template <DeviceRuntime R, typename T> T const &as() const {                  
    assert(associatedRuntime == R);                                                                                                                      
    return *static_cast<T const *>(handle.get());                                                                                                                                    
  }                                                                                                                                                                             
};

Let's just make it impossible to mix types between the two runtime worlds.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@nsmithtt great catch! I filed an issue #331

@nsmithtt nsmithtt merged commit b67fba8 into main Aug 9, 2024
7 checks passed
@nsmithtt nsmithtt deleted the nsmith/runtime-fix branch August 9, 2024 15:10
nsmithtt added a commit that referenced this pull request Aug 9, 2024
- For some reason the cmake variables for toggling different runtimes
  were not working for me.  Only the TTNN runtime was able to be
  enabled.
- We also need to set the runtime before we open the device, otherwise
  the device and the rest of the runtime will mismatch.
- For now we also just hardcode ttrt to be single device
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

5 participants