-
Notifications
You must be signed in to change notification settings - Fork 91
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Sub-Device fabric api updates, and make global sem/cb thread safe in metal #15936
Conversation
f3e0647
to
eb76f95
Compare
GlobalCircularBuffer( | ||
Device* device, | ||
const std::unordered_map<CoreCoord, CoreRangeSet>& sender_receiver_core_mapping, | ||
uint32_t size, | ||
BufferType buffer_type, | ||
tt::stl::Span<const SubDeviceId> sub_device_ids); | ||
tt::stl::Span<const SubDeviceId> sub_device_ids, | ||
Private); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Make the ctor itself private instead?
Same for GlobalSemaphore
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This was following the pattern that was added for the Buffer class, which enables using std::make_shared instead of std::shared_ptr(new T) in the creation functions.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
@sminakov-tt thoughts on this as you originally added this pattern for buffer?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
By thoughts are basically the same as you described. It's the suggested way to be used for enable_shared_from_this: https://en.cppreference.com/w/cpp/memory/enable_shared_from_this.
But can you please move this "private" constructor closer to the bottom of the class and add a comment.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Done. I've moved the creation fns to the top, constructors to the bottom and a comment saying they're "private" and to use the create fns instead.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Thanks, I haven't noticed the shared_ptr
! Why are we switching to the shared ownership though? If this is really desired / needed, another option is to create a shared_ptr
using the move constructor at the client side, so that we are not imposing this on the user.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
^ s/switching/using - I see this is the existing usage.
5c0aafe
to
189032a
Compare
…ager with a fabric sub-device. Remove create_sub_device_manager overload of MeshDevice that accepts a diffent sub-device config per device This was added due to eth cores being different per device, but this is now handled internally
…o be thread-safe instead of depending on ttnn apis
189032a
to
bc81165
Compare
Ticket
#15944
Problem description
Want to clean up the api regarding how to instantiate the ethernet cores as its own sub-device by adding a new temporary api that adds it automatically when creating the sub-device manager. This will be removed when actual fabric is implemented.
Global CBs + global sems were not thread-safe with the metal api Normally thread safety was handled in ttnn apis by pushing from the top level to workers, but some of the use cases of these apis were calling the metal api on the main thread, so we need to make them internally thread-safe.
What's changed
Add a new api
create_sub_device_manager_with_fabric
which has the same signature ascreate_sub_device_manager
, but will automatically instantiate the ethernet cores as a separate sub-device. This also allows us to remove the MeshDevicecreate_sub_device_manager
overload that allowed users to specify a different sub-device configuration per device that was added due to eth cores being different per device. This api is temporary until we have proper fabric, where eth cores will no longer be user programmable and won't be a sub-device anymore.Make global cbs + sems metal apis thread safe by pushing the relevant code (writing the config/initial value) to the worker thread.
Checklist