Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Remove restriction of input_nsticks_per_core % w == 0 #15205

Conversation

nkpatel-tt
Copy link
Contributor

@nkpatel-tt nkpatel-tt commented Nov 19, 2024

Ticket

Link

Problem description

Currently, whole input row is processed per core which inefficient since other cores could be idle

What's changed

Distribute work to all possible cores.

Checklist

  • Post commit CI passes Link
  • Nightly fast dispatch Link
  • Model regression CI testing passes Link
  • Device performance regression CI testing passes (if applicable) Link
  • (For models and ops writers) Full new models tests Link

Copy link
Contributor

@github-actions github-actions bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

⚠️ Clang-Tidy found issue(s) with the introduced code (1/1)

@nkpatel-tt nkpatel-tt self-assigned this Nov 20, 2024
	ToDo: commonize code.

Signed-off-by: Nilaykumar Patel <[email protected]>
@nkpatel-tt nkpatel-tt force-pushed the nkpatel/upsample2d-generalization-remove-restrictions-input_width branch from 607825e to e39f225 Compare December 10, 2024 16:08
@nkpatel-tt nkpatel-tt force-pushed the nkpatel/upsample2d-generalization-remove-restrictions-input_width branch from a6ede9f to 67eba82 Compare December 13, 2024 07:44
Signed-off-by: Nilaykumar Patel <[email protected]>
…trictions-input_width

Signed-off-by: Nilaykumar Patel <[email protected]>
Copy link
Contributor

@github-actions github-actions bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

⚠️ Clang-Tidy found issue(s) with the introduced code (1/1)

Signed-off-by: Nilaykumar Patel <[email protected]>
Suggested-by: Pavle Josipović <[email protected]>
Signed-off-by: Nilaykumar Patel <[email protected]>
@nkpatel-tt nkpatel-tt force-pushed the nkpatel/upsample2d-generalization-remove-restrictions-input_width branch from b677eea to 09b57cb Compare December 18, 2024 17:54
@nkpatel-tt nkpatel-tt requested a review from esmalTT as a code owner December 18, 2024 17:54
@nkpatel-tt nkpatel-tt requested a review from uaydonat as a code owner December 18, 2024 17:54
writer_rt_args[5] = out_w;
writer_rt_args[6] = 0; // set for each core below
writer_rt_args[4] = input_nsticks_per_core;
writer_rt_args[5] = output_nsticks_per_core / 2; // half of the outputs are processed by each core
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

What happens if output_nsticks_per_core is odd?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Moved logic to kernel where odd case is handled.

tests/ttnn/unit_tests/operations/test_upsample.py Outdated Show resolved Hide resolved
constexpr uint32_t config_cb_id = get_compile_time_arg_val(3);

uint32_t reader_nsticks_per_core = (in_nsticks_per_core + is_reader) / 2;
uint32_t writer_nsticks_per_core = in_nsticks_per_core / 2;
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

same comment.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I have removed this code and moved output sticks logic here and am deriving output sticks from reader_nsticks_per_core so that we handle even and odd case correctly.

Signed-off-by: Nilaykumar Patel <[email protected]>
@nkpatel-tt nkpatel-tt changed the title Remove restriction of input_nsticks_per_core % w == 0 for height shar… Remove restriction of input_nsticks_per_core % w == 0 Dec 20, 2024
@nkpatel-tt nkpatel-tt requested a review from esmalTT December 20, 2024 15:58
Copy link
Contributor

@esmalTT esmalTT left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Approving to unblock, but please open an issue for this: #15205 (comment)

@mywoodstock
Copy link
Contributor

please also run nightly pipeline to make sure there are no perf regressions.

@nkpatel-tt
Copy link
Contributor Author

please also run nightly pipeline to make sure there are no perf regressions.

Done.

@nkpatel-tt nkpatel-tt merged commit a94c89e into main Jan 10, 2025
284 checks passed
@nkpatel-tt nkpatel-tt deleted the nkpatel/upsample2d-generalization-remove-restrictions-input_width branch January 10, 2025 04:06
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

4 participants