Add support non-tile-multiple shard width #11435
2f7ba35 to 59c347a
ttnn/cpp/ttnn/deprecated/tt_dnn/op_library/untilize/untilize_with_halo_op_v2.cpp
Outdated
ttnn/cpp/ttnn/operations/conv2d/device/kernels/conv_bmm_tilize_col_major_out_blocks_sharded.cpp
Outdated
...s/conv2d/device/kernels/reader_conv_activations_2d_mcast_padded_with_halo_3x3_weights_v2.cpp
Outdated
.../operations/conv2d/device/multi_core_optimized_conv_sharded/optimized_conv_op_sharded_v2.cpp
Outdated
59c347a to c4ebbe5
Commit messages:
tensor util changes
conv2d changes
Normal test cases working with updated conv block config.
kernel changes
Resolve hang
Clean up debug statements.
Clean up debug statements and functions.
Solve WS test hang.
Update input offset calculations based on alignment.
Align input matrix M for 1x1 conv according to new changes.
Fix after rebase
Resolve Comments.
Remove Debug statements and commented code.
Changes with rebase.
Undo some changes after rebase cleanup
Make pipeline work. Clean up needed.
Update input offset calculations based on alignment. Resolves maxpool and model pipeline failures.
Enable variable to support non-tile multiple width.
Divide tiles among cores instead of total height
Modify Condition
Change after latest rebase.
Modify test cases to accommodate small number of cores.
Resolve Yolo failure.
Add support for multi-device tensor for weight and bias tensors.
Add comment for prepare weight and bias matrix and modify condition.
Remove debug statements.
Address review comments
Signed-off-by: Nilaykumar K Patel <[email protected]>
177f9f9 to bb48f05
must be moved to somewhere like tests/ttnn/unit_tests/gtests/
There is no tt_eager; eventually all of those tests must be moved out of that folder.
Will approve, but please address this issue next: #15691. In general, there is so much duplicate code in ttnn/cpp/ttnn/tensor/tensor_utils.cpp. Please think of a way to commonize the code.
Thank you, @TT-BrianLiu! I agree, this is really important.
@@ -103,6 +120,8 @@ void kernel_main() {
    constexpr uint32_t elem_nbytes = sizeof(uint16_t);
    constexpr uint16_t pad_core_id = 0xFFFF;

    uint32_t input_aligned_page_size = get_arg_val<uint32_t>(0);
this should be a compile time arg
    if constexpr (is_read) {
        uint32_t dst_addr = out_base_l1_addr + dst_offset;
        uint64_t src_addr = base_addr + src_offset;
        noc_async_read(src_addr, dst_addr, size);
        if (stick_nbytes == input_aligned_page_size) {
Please change this condition to use constexpr so that it is evaluated at compile time instead of at runtime (RT), as it is now.
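The request can be illustrated with a standalone sketch; it is not the kernel code from this PR. It assumes both sizes are known at compile time and models them as template parameters, which stand in for the kernel's compile-time arguments. The names `copy_sticks`, `StickNBytes`, and `AlignedPageSize` are hypothetical, `std::memcpy` stands in for `noc_async_read`, and the source layout (each stick starting on an aligned page boundary) is an assumption made only for this example.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <cstring>

// Hypothetical stand-in for the kernel's copy step. Both sizes are template
// parameters, so the comparison below is resolved at compile time and only
// the selected branch is instantiated.
template <uint32_t StickNBytes, uint32_t AlignedPageSize>
void copy_sticks(const uint8_t* src, uint8_t* dst, uint32_t num_sticks) {
    static_assert(StickNBytes <= AlignedPageSize,
                  "a stick cannot be larger than its aligned page");
    assert(src != nullptr && dst != nullptr);

    if constexpr (StickNBytes == AlignedPageSize) {
        // No padding between sticks: a single contiguous copy is enough.
        std::memcpy(dst, src, static_cast<std::size_t>(num_sticks) * StickNBytes);
    } else {
        // Assumed layout: every source stick starts on an aligned page
        // boundary, while the destination is tightly packed.
        for (uint32_t i = 0; i < num_sticks; ++i) {
            std::memcpy(dst + static_cast<std::size_t>(i) * StickNBytes,
                        src + static_cast<std::size_t>(i) * AlignedPageSize,
                        StickNBytes);
        }
    }
}
```

With a runtime comparison, both branches are compiled into the kernel and the check runs on every call; with `if constexpr`, the branch that is not taken is discarded for each instantiation.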
Ticket
[https://github.com//issues/10109]
Problem description
With tile multiple shard width, all cores might not be utilised which can degrade performance.
What's changed
Add non-tile multiple shard width support.
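The core-utilisation argument can be made concrete with a little arithmetic. The sketch below is illustrative only and is not taken from the PR: the helper names (`div_up`, `round_up`, `shard_width`, `cores_used`), the tile width of 32, the alternative alignment of 16, and the 256-wide tensor on 16 cores are all assumptions chosen for the example.

```cpp
#include <cstdint>

// Integer ceiling division.
constexpr uint32_t div_up(uint32_t a, uint32_t b) { return (a + b - 1) / b; }

// Round a up to the nearest multiple of m.
constexpr uint32_t round_up(uint32_t a, uint32_t m) { return div_up(a, m) * m; }

// Width of each shard when total_width is split across num_cores and the
// shard width must be a multiple of alignment (hypothetical helper).
constexpr uint32_t shard_width(uint32_t total_width, uint32_t num_cores, uint32_t alignment) {
    return round_up(div_up(total_width, num_cores), alignment);
}

// Number of cores that actually receive a shard.
constexpr uint32_t cores_used(uint32_t total_width, uint32_t num_cores, uint32_t alignment) {
    return div_up(total_width, shard_width(total_width, num_cores, alignment));
}

// Assumed example: width 256 split across 16 available cores.
// Forcing a tile-multiple (32) shard width leaves half of the cores idle.
static_assert(shard_width(256, 16, 32) == 32, "shards are padded up to a full tile");
static_assert(cores_used(256, 16, 32) == 8, "only 8 of 16 cores get work");
// Relaxing the alignment to 16 lets every core take a shard.
static_assert(cores_used(256, 16, 16) == 16, "all 16 cores get work");
```

Under these assumptions, rounding the shard width up to a full tile doubles the work per active core and leaves 8 cores unused, which is the utilisation loss the problem description refers to.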