-
Notifications
You must be signed in to change notification settings - Fork 90
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
#13609: Uplift dram and l1 allocators to use dram/l1 specific alignment #13762
base: main
Are you sure you want to change the base?
Conversation
…rm_with_block_sharded_v2_8x8_grid
ttnn/cpp/ttnn/operations/normalization/groupnorm/device/multi_core/groupnorm_op_multi_core.cpp
Outdated
Show resolved
Hide resolved
@llongTT could you re-run post-commit, BH post commit, Model tests, and ttnn sweeps and update the links in the PR description |
Please also run the device perf and model pipelines, there are CNN models using group norm so need to make sure those are passing |
@cmaryanTT we may need to announce when this change goes in because using l1/dram specific alignment for allocating buffers could expose noc violations when doing misaligned dram -> l1 reads. We could explain how ops are handling this |
…name for l1 alignment
…lignment(dst_buffer->buffer_type()) to dst_buffer->alignment()
Have them running now. |
Have them running now. |
Noted. @llongTT - please announce on #tt-metal-developers with a brief description of the change and the risks. Also, @llongTT will be presenting this in the learning forum in a couple of weeks. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Clang-Tidy
found issue(s) with the introduced code (1/1)
@@ -449,7 +449,7 @@ DeviceAddr Buffer::bank_local_page_address(uint32_t bank_id, uint32_t page_index | |||
} | |||
|
|||
uint32_t Buffer::alignment() const { | |||
return this->allocator_->config.alignment; | |||
return this->device_->get_allocator_alignment(this->buffer_type_); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
no matching member function for call to get_allocator_alignment
@@ -1406,14 +1407,14 @@ allocator::Statistics Device::get_memory_allocation_statistics(const BufferType | |||
return allocator::get_statistics(*allocator, buffer_type); | |||
} | |||
|
|||
uint32_t Device::get_allocator_alignment() const { | |||
uint32_t Device::get_allocator_alignment(const BufferType &buffer_type) const { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
out-of-line definition of get_allocator_alignment
does not match any declaration in tt::tt_metal::Device
@@ -129,8 +129,8 @@ class Device : public IDevice { | |||
allocator::Statistics get_memory_allocation_statistics(const BufferType &buffer_type) const override; | |||
allocator::Statistics get_memory_allocation_statistics(const BufferType &buffer_type, SubDeviceId sub_device_id) const override; | |||
|
|||
uint32_t get_allocator_alignment() const override; | |||
uint32_t get_allocator_alignment(SubDeviceId sub_device_id) const override; | |||
uint32_t get_allocator_alignment(const BufferType &buffer_type) const override; |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
non-virtual member function marked override
hides virtual member functions
uint32_t get_allocator_alignment() const override; | ||
uint32_t get_allocator_alignment(SubDeviceId sub_device_id) const override; | ||
uint32_t get_allocator_alignment(const BufferType &buffer_type) const override; | ||
uint32_t get_allocator_alignment(const BufferType &buffer_type, SubDeviceId sub_device_id) const override; |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
non-virtual member function marked override
hides virtual member functions
@@ -19,7 +19,7 @@ uint64_t get_alloc_limit(const tt::tt_metal::IDevice* device) { | |||
auto dispatch_core_config = dispatch_core_manager::instance().get_dispatch_core_config(device->id()); | |||
auto storage_core_bank_size = | |||
tt::get_storage_core_bank_size(device->id(), device->num_hw_cqs(), dispatch_core_config); | |||
const uint32_t allocator_alignment = device->get_allocator_alignment(); | |||
const uint32_t allocator_alignment = device->get_allocator_alignment(BufferType::L1); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
no matching member function for call to get_allocator_alignment
@@ -788,7 +788,7 @@ void detail::Program_::allocate_circular_buffers(const IDevice* device) { | |||
} | |||
} | |||
} | |||
computed_addr = align(computed_addr, device->get_allocator_alignment()); | |||
computed_addr = align(computed_addr, device->get_allocator_alignment(BufferType::DRAM)); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
no matching member function for call to get_allocator_alignment
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Clang-Tidy
found issue(s) with the introduced code (1/1)
ttnn/cpp/ttnn/operations/data_movement/transpose/device/transpose_program_factory.cpp
Outdated
Show resolved
Hide resolved
@@ -676,7 +676,6 @@ | |||
// face-lines from C tiles to form a single tile, we can load a single tile and then write out its face-lines to C | |||
// tiles | |||
uint32_t alignment = dst_buffer->buffer_type() == tt::tt_metal::BufferType::DRAM ? hal::get_dram_alignment() | |||
: hal::get_l1_alignment(); | |||
bool misaligned = alignment > sub_tile_line_bytes; |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
expected (
for function-style cast or type construction
…ose_program_factory.cpp Co-authored-by: github-actions[bot] <41898282+github-actions[bot]@users.noreply.github.com>
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Clang-Tidy
found issue(s) with the introduced code (1/1)
@@ -675,8 +675,7 @@ operation::ProgramWithCallbacks transpose_hc_multi_core( | |||
// TODO: noc_async_write only require 16B alignment for both DRAM and L1 for Blackhole, so instead of reading in | |||
// face-lines from C tiles to form a single tile, we can load a single tile and then write out its face-lines to C | |||
// tiles | |||
uint32_t alignment = dst_buffer->buffer_type() == tt::tt_metal::BufferType::DRAM ? hal::get_dram_alignment() | |||
: hal::get_l1_alignment(); | |||
uint32_t alignment = dst_buffer->buffer_type() == tt::tt_metal::BufferType::DRAM ? hal::get_dram_alignment(); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
expected :
uint32_t alignment = dst_buffer->buffer_type() == tt::tt_metal::BufferType::DRAM ? hal::get_dram_alignment(); | |
uint32_t alignment = dst_buffer->buffer_type() == tt::tt_metal::BufferType::DRAM ? hal::get_dram_alignment(): ; |
@@ -675,8 +675,7 @@ | |||
// TODO: noc_async_write only require 16B alignment for both DRAM and L1 for Blackhole, so instead of reading in | |||
// face-lines from C tiles to form a single tile, we can load a single tile and then write out its face-lines to C | |||
// tiles | |||
uint32_t alignment = dst_buffer->buffer_type() == tt::tt_metal::BufferType::DRAM ? hal::get_dram_alignment() | |||
: hal::get_l1_alignment(); | |||
uint32_t alignment = dst_buffer->buffer_type() == tt::tt_metal::BufferType::DRAM ? hal::get_dram_alignment(); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
expected expression
Ticket
#13609
Problem description
Using the max of DRAM and L1 alignment for both DRAM and L1 buffers was causing pcc mismatches in i2s and s2i.
What's changed
Use L1/DRAM specific alignment for respective allocations. This will require some ops to be uplifted to handle re-alignment
@yugaoTT and @ntarafdar to add corresponding op changes
Checklist
Below post commits were triggered 12/03