Add support non-tile-multiple shard width #11435

nkpatel-tt · 2024-08-14T10:13:45Z

Ticket

[https://github.com//issues/10109]

Problem description

When the shard width must be a multiple of the tile width, some cores may go unused, which can degrade performance.

What's changed

Add support for shard widths that are not a multiple of the tile width.
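To make the motivation concrete, here is a minimal arithmetic sketch of how core utilisation can differ between the two schemes. It is not taken from the PR: the tile width of 32, the helper names, and the way work is divided are all assumptions chosen for illustration.

```cpp
#include <cstdint>

// Assumed tile width in elements; 32 is used here purely for illustration.
constexpr uint32_t kTileWidth = 32;

// Integer division rounding up.
constexpr uint32_t div_up(uint32_t a, uint32_t b) { return (a + b - 1) / b; }

// Hypothetical helper: cores that receive work when every shard must be
// a whole number of tiles wide.
constexpr uint32_t cores_used_tile_multiple(uint32_t width, uint32_t num_cores) {
    const uint32_t tiles = div_up(width, kTileWidth);
    const uint32_t tiles_per_core = div_up(tiles, num_cores);
    return div_up(tiles, tiles_per_core);
}

// Hypothetical helper: cores that receive work when a shard may be any
// number of elements wide.
constexpr uint32_t cores_used_any_width(uint32_t width, uint32_t num_cores) {
    const uint32_t shard_width = div_up(width, num_cores);
    return div_up(width, shard_width);
}

// A width of 64 elements over 8 cores: only 2 tiles exist, so only 2 cores
// get work with tile-multiple shards, while 8-element shards keep all 8 busy.
static_assert(cores_used_tile_multiple(64, 8) == 2, "two tiles -> two cores");
static_assert(cores_used_any_width(64, 8) == 8, "8-element shards -> eight cores");
```

Under these assumptions the gap only appears when the width holds fewer tiles than there are cores; a width of 256 elements over 8 cores uses all 8 cores either way.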

Checklist

- Post commit CI passes: Link 1, Link 2, Link 3, Link 4, Link 5, Latest
- Nightly fast dispatch: Link 1, Latest, Latest (same as main)
- Model regression CI testing passes (if applicable): Link 1, Link 2, Latest
- Device performance regression CI testing passes (if applicable): Link, Latest
- New/Existing tests provide coverage for changes

tests/ttnn/unit_tests/operations/test_new_conv2d.py

ttnn/cpp/ttnn/deprecated/tt_dnn/op_library/untilize/untilize_with_halo_op_v2.cpp

ttnn/cpp/ttnn/operations/conv2d/conv2d.cpp

ttnn/cpp/ttnn/operations/conv2d/device/kernels/conv_bmm_tilize_col_major_out_blocks_sharded.cpp

...s/conv2d/device/kernels/reader_conv_activations_2d_mcast_padded_with_halo_3x3_weights_v2.cpp

.../operations/conv2d/device/multi_core_optimized_conv_sharded/optimized_conv_op_sharded_v2.cpp

ttnn/cpp/ttnn/operations/conv2d/device/optimized_conv_op.hpp

- tensor util changes
- conv2d changes
- Normal test cases working with updated conv block config.
- kernel changes
- Resolve hang
- Clean up debug statements.
- Clean up debug statements and functions.
- Solve WS test hang.
- Update input offset calculations based on alignment.
- Align input matrix M for 1x1 conv according to new changes.
- Fix after rebase
- Resolve Comments.
- Remove Debug statements and commented code.
- Changes with rebase.
- Undo some changes after rebase cleanup
- Make pipeline work. Clean up needed.
- Update input offset calculations based on alignment. Resolves maxpool and model pipeline failures.
- Enable variable to support non-tile multiple width.
- Divide tiles among cores instead of total height
- Modify Condition
- Change after latest rebase.
- Modify test cases to accommodate small number of cores.
- Resolve Yolo failure.
- Add support for multi-device tensor for weight and bias tensors.
- Add comment for prepare weight and bias matrix and modify condition.
- Remove debug statements.
- Address review comments

Signed-off-by: Nilaykumar Patel <[email protected]>

Signed-off-by: Nilaykumar K Patel <[email protected]>

ayerofieiev-tt · 2024-12-04T17:03:41Z

tests/tt_eager/ops/test_tensor_utils.cpp

This must be moved to somewhere like tests/ttnn/unit_tests/gtests/.

There is no tt_eager; eventually all of those tests must be moved out of that folder.

TT-BrianLiu

Will approve, but please address this issue next: #15691 In general, there is so much duplicate code in ttnn/cpp/ttnn/tensor/tensor_utils.cpp. Please think of a way to commonize code.

ayerofieiev-tt · 2024-12-04T17:52:14Z

Thank you, @TT-BrianLiu ! I agree, this is really important.

mywoodstock · 2024-12-04T18:19:32Z

.../ttnn/operations/data_movement/untilize_with_halo_v2/device/kernels/dataflow/halo_gather.cpp

@@ -103,6 +120,8 @@ void kernel_main() {
    constexpr uint32_t elem_nbytes = sizeof(uint16_t);
    constexpr uint16_t pad_core_id = 0xFFFF;

+    uint32_t input_aligned_page_size = get_arg_val<uint32_t>(0);


This should be a compile-time arg.

mywoodstock · 2024-12-04T18:20:09Z

.../ttnn/operations/data_movement/untilize_with_halo_v2/device/kernels/dataflow/halo_gather.cpp

            if constexpr (is_read) {
                uint32_t dst_addr = out_base_l1_addr + dst_offset;
                uint64_t src_addr = base_addr + src_offset;
-                noc_async_read(src_addr, dst_addr, size);
+                if (stick_nbytes == input_aligned_page_size) {


Please change this condition to use `if constexpr` so that it is evaluated at compile time instead of at runtime, as it is now.
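As a rough illustration of what this review request amounts to, the sketch below moves both sizes into template parameters so the comparison can be resolved with `if constexpr` (C++17). It is not the kernel code: `copy_sticks` is a hypothetical host-side stand-in, `std::memcpy` stands in for the NoC read, and the assumption that the unequal case copies one stick at a time is a guess based on the visible diff.

```cpp
#include <cstdint>
#include <cstring>

// Hypothetical stand-in for the gather loop. Both sizes are compile-time
// constants, so only one branch is instantiated for a given kernel build.
// Returns the number of copy operations issued.
template <uint32_t StickNBytes, uint32_t AlignedPageSize>
uint32_t copy_sticks(const uint8_t* src, uint8_t* dst, uint32_t num_sticks) {
    uint32_t num_copies = 0;
    if constexpr (StickNBytes == AlignedPageSize) {
        // Sticks are densely packed in the source: one contiguous copy.
        std::memcpy(dst, src, StickNBytes * num_sticks);
        num_copies = 1;
    } else {
        // Source pages are padded to the aligned size: copy stick by stick,
        // skipping the padding between pages.
        for (uint32_t i = 0; i < num_sticks; ++i) {
            std::memcpy(dst + i * StickNBytes, src + i * AlignedPageSize, StickNBytes);
            ++num_copies;
        }
    }
    return num_copies;
}
```

For example, `copy_sticks<8, 8>(src, dst, 4)` issues a single copy, while `copy_sticks<8, 16>(src, dst, 4)` issues four. The discarded branch is never compiled into that instantiation, so no runtime comparison remains in the loop.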

nkpatel-tt force-pushed the nkpatel/conv_op_non_tile_multiple_shard_widht branch 3 times, most recently from 2f7ba35 to 59c347a Compare August 15, 2024 10:18

mywoodstock requested changes Aug 15, 2024

nkpatel-tt force-pushed the nkpatel/conv_op_non_tile_multiple_shard_widht branch from 59c347a to c4ebbe5 Compare August 20, 2024 20:19

nkpatel-tt temporarily deployed to dev August 22, 2024 04:31 — with GitHub Actions Inactive

nkpatel-tt temporarily deployed to dev August 22, 2024 04:43 — with GitHub Actions Inactive

nkpatel-tt temporarily deployed to dev August 22, 2024 04:44 — with GitHub Actions Inactive

nkpatel-tt temporarily deployed to production August 22, 2024 05:09 — with GitHub Actions Inactive

nkpatel-tt temporarily deployed to dev August 23, 2024 08:13 — with GitHub Actions Inactive

nkpatel-tt requested review from shwetankTT and mywoodstock December 4, 2024 00:45

mywoodstock approved these changes Dec 4, 2024

nkpatel-tt mentioned this pull request Dec 4, 2024

In general, there is so much duplicate code in ttnn/cpp/ttnn/tensor/tensor_utils.cpp. Please think of a way to commonize code. #15691

Closed

nkpatel-tt added 9 commits December 4, 2024 10:01

Fix after rebase

088fd4a

Move in_non_tile_mul_width variable to main function.

773c658

Signed-off-by: Nilaykumar Patel <[email protected]>

Remove extra exposed variable for non tile multiple width.

ae2f4bd

Signed-off-by: Nilaykumar Patel <[email protected]>

Remove std namespace.

d842f6a

Signed-off-by: Nilaykumar Patel <[email protected]>

Add test cases for weight and bias tensor convert functions.

4914340

Signed-off-by: Nilaykumar Patel <[email protected]>

tensor utils changes.

11e3ebf

Signed-off-by: Nilaykumar Patel <[email protected]>

Address comment and remove unnecessary variable.

df8aab3

Signed-off-by: Nilaykumar Patel <[email protected]>

Align tensor utils functions to new changes.

bb48f05

Signed-off-by: Nilaykumar K Patel <[email protected]>

nkpatel-tt force-pushed the nkpatel/conv_op_non_tile_multiple_shard_widht branch from 177f9f9 to bb48f05 Compare December 4, 2024 11:01

bbradelTT approved these changes Dec 4, 2024

ayerofieiev-tt reviewed Dec 4, 2024

TT-BrianLiu approved these changes Dec 4, 2024

mywoodstock requested changes Dec 4, 2024

Address review comments.

cc48842

Signed-off-by: Nilaykumar Patel <[email protected]>

mywoodstock approved these changes Dec 5, 2024

nkpatel-tt added 2 commits December 5, 2024 04:54

Merge branch 'main' into nkpatel/conv_op_non_tile_multiple_shard_widht

714baaa

Signed-off-by: Nilaykumar Patel <[email protected]>

Resolve conflicts after rebase.

c162e9b

Signed-off-by: Nilaykumar Patel <[email protected]>

nkpatel-tt merged commit a318130 into main Dec 5, 2024
168 of 171 checks passed

nkpatel-tt deleted the nkpatel/conv_op_non_tile_multiple_shard_widht branch December 5, 2024 09:11

nkpatel-tt mentioned this pull request Dec 5, 2024

Resolve failure on master for clang-tidy failure for tensor_utils funcions #15738

Closed

tt-rkim pushed a commit that referenced this pull request Dec 5, 2024

Revert "Add support non-tile-multiple shard width (#11435)"

5344b7c

This reverts commit a318130.

nkpatel-tt restored the nkpatel/conv_op_non_tile_multiple_shard_widht branch December 5, 2024 13:33

nkpatel-tt mentioned this pull request Dec 5, 2024

Nkpatel/conv op non tile multiple shard widht #15742

Merged

5 tasks
