Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

#14732: add bert-tiny test_performance using trace and 2cq-WIP #14799

Open
wants to merge 1 commit into
base: main
Choose a base branch
from

Conversation

vigneshkeerthivasanx
Copy link
Contributor

@vigneshkeerthivasanx vigneshkeerthivasanx commented Nov 6, 2024

Ticket

Link to Github Issue

Problem description

Add trace+2cq test perf file for BERT-Tiny model.

What's changed

Added test perf using trace+2cq file for BERT-Tiny model.

Checklist

  • Post commit CI passes
  • Blackhole Post commit (if applicable)
  • Model regression CI testing passes (if applicable)
  • Device performance regression CI testing passes (if applicable)
  • New/Existing tests provide coverage for changes

Passing CI Links:

  • All post-commit tests - Link
  • Nightly fast dispatch tests - Link
  • (Single-card) Demo tests - Link
  • (Single-card) Model perf tests - Link

@vigneshkeerthivasanx
Copy link
Contributor Author

vigneshkeerthivasanx commented Nov 11, 2024

@mbahnasTT ,
Trace+2cq test performance for the BERT-Tiny model is complete. Bert-Tiny model doesn't support sharded inputs and implemented for the unsharded input tensors. Could I get a review for this pipeline?

Copy link
Member

@ayerofieiev-tt ayerofieiev-tt left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

approving for codeowners only, not reviewing this PR

@uaydonat
Copy link
Contributor

Is this a superset of #13471? If so, should we just combine the PRs into just one?

@uaydonat
Copy link
Contributor

@esmalTT can you review the 2CQ implementation?

@vigneshkeerthivasanx
Copy link
Contributor Author

Is this a superset of #13471? If so, should we just combine the PRs into just one?

We're awaiting the merge of #13471 first and will rebase afterward. We didn't want #13471 to get held up due to trace+2cq implementation.

@saichandax saichandax requested a review from esmalTT November 18, 2024 08:53
@saichandax
Copy link
Contributor

@esmalTT can you review the 2CQ implementation?

Added @esmalTT as reviewer, to notify. Please note.

@tt-rkim
Copy link
Collaborator

tt-rkim commented Nov 18, 2024

No CI runs, at all? Again?

Screenshot 2024-11-18 at 10 48 33 AM

@saichandax
Copy link
Contributor

@tt-rkim,
Trace_2cqs perf implementation hasn't been done before by MCW. This is the first model for which we have done trace+2cqs successfully. So, we are waiting for any review comments.
We'll trigger CIs if there are no changes requested.

We didn't want to trigger without any approval on this PR, specifically. Is that fine?

@tt-rkim
Copy link
Collaborator

tt-rkim commented Nov 18, 2024

You won't get my approval without passing CI.

This is fine, but then I will report back that CI is not the blocker.

models/demos/bert_tiny/tests/test_performance.py Outdated Show resolved Hide resolved
models/demos/bert_tiny/tests/test_perf_e2e_bert_tiny.py Outdated Show resolved Hide resolved
models/demos/bert_tiny/tests/perf_e2e_bert_tiny.py Outdated Show resolved Hide resolved
ttnn.record_event(1, write_event)
ttnn.wait_for_event(0, write_event)
# TODO: Add in place support to ttnn to_memory_config
# input_tensor = ttnn.reshard(tt_input_ids, input_mem_config, input_tensor)
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This needs to be uncommented to work properly

Copy link
Contributor Author

@vigneshkeerthivasanx vigneshkeerthivasanx Nov 27, 2024

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

In our first attempt on trace+2cq implementation, we have refered to existing work. Bert-Tiny is not sharded, whereas all the reference models are, and they are image models that process only a single input, whereas Bert-Tiny process multiple inputs.

We have modified the code to make it compatible with Bert-Tiny as an unsharded model.
Do we require the above lines as the inputs are not sharded?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

In our first attempt on trace+2cq implementation, we have refered to existing work. Bert-Tiny is not sharded, whereas all the reference models are, and they are image models that process only a single input, whereas Bert-Tiny process multiple inputs.

We have modified the code to make it compatible with Bert-Tiny as an unsharded model. Do we require the above lines as the inputs are not sharded?

@esmalTT awaiting your approval on this PR and could you add your suggestion on this as well?

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

From what I can tell, my comments haven't been addressed. Please address the comments, and re-run pipelines before requesting re-review.

signpost(header="stop")
ttnn.dump_device_profiler(device)

ttnn.release_trace(device, tid)
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

You should probably have some asserts (like checking PCC) in here to make sure things are working as expected.

@esmalTT
Copy link
Contributor

esmalTT commented Nov 19, 2024

@vigneshkeerthivasanx Please fill out the checklist section with links to CI runs before requesting reviews. There is no point in reviewing broken code.

@uaydonat
Copy link
Contributor

Did you run the posted code at least locally and confirmed that it works?

If so, it is safe to launch the CI, see that your tests pass, then post the passing CI links.

Then, we review the code. As Evan pointed out, it will waste of time to review untested broken code.

@vigneshkeerthivasanx
Copy link
Contributor Author

Will update the code and trigger CIs.

@vigneshkeerthivasanx vigneshkeerthivasanx force-pushed the vignesh/bert_tiny_trace_2cq branch from 5b8d6d6 to f2e62d6 Compare November 21, 2024 14:36
@vigneshkeerthivasanx vigneshkeerthivasanx force-pushed the vignesh/bert_tiny_trace_2cq branch 2 times, most recently from b6d3abb to 58785ba Compare November 25, 2024 13:08
@vigneshkeerthivasanx
Copy link
Contributor Author

CI Links
All post-commit tests: Link (Bert-tiny is passing)
Nightly fast dispatch tests: Link (Bert-tiny is passing)
(Single-card) Demo tests: Link (Bert-tiny is passing)
(Single-card) Device perf regressions: Link (Bert-tiny is passing)
(Single-card) Model perf tests : Link (Bert-tiny is passing)

Copy link
Collaborator

@tt-rkim tt-rkim left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Please address Evan's requests

@vigneshkeerthivasanx vigneshkeerthivasanx force-pushed the vignesh/bert_tiny_trace_2cq branch 2 times, most recently from ea5bc80 to d2ac87e Compare December 3, 2024 07:04
@vigneshkeerthivasanx vigneshkeerthivasanx force-pushed the vignesh/bert_tiny_trace_2cq branch from d2ac87e to 75d639b Compare December 3, 2024 07:12
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

6 participants