Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

#6344: Update RoBERTa QA demo #8896

Open
wants to merge 1 commit into
base: main
Choose a base branch
from

Conversation

kkeerthana0573
Copy link
Contributor

@kkeerthana0573 kkeerthana0573 commented May 28, 2024

Ticket

  • Link to Github Issue

Problem description

  • Functional RoBERTa model Demo.
  • Port RoBERTa model to n300.

What's changed

  • TTNN support for RoBERTa model is done using ttnn_optimized_bert pipeline.
  • Ported RoBERTa model functionality to n300.
  • Added demo for RoBERTaForQuestionAnswering task.

Checklist

  • Post commit CI passes
  • Blackhole Post commit (if applicable)
  • Model regression CI testing passes (if applicable)
  • Device performance regression CI testing passes (if applicable)
  • New/Existing tests provide coverage for changes

Passing CI links - 09.12.2024:

Copy link
Contributor

@eyonland eyonland left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Looking for tests with both these annotations...

@pytest.mark.models_device_performance_bare_metal
@pytest.mark.models_performance_bare_metal

Note that we document this here: https://tenstorrent.github.io/tt-metal/latest/ttnn/ttnn/demos.html
See for example:
tests/ttnn/integration_tests/resnet/test_performance.py

@kkeerthana0573
Copy link
Contributor Author

@eyonland,
Will add the test and update the PR.
Thank you.

@kkeerthana0573 kkeerthana0573 force-pushed the keerthana/functional_roberta_demo branch from 519788f to 72b75e3 Compare June 19, 2024 08:09
@saichandax
Copy link
Contributor

@esmalTT , Could you please review and give your approval for this PR?

tt_output_start_logits = tt_output[..., :, 0]
tt_output_end_logits = tt_output[..., :, 1]

assert_with_pcc(torch_output_start_logits, tt_output_start_logits, 0.81 if is_grayskull else 0.89)
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Do these PCC bounds need to be updated?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The base model for RoBERTa is ttnn_optimized_bert, at models/demos/bert/tt/ttnn_optimized_bert.py. The PCC for roberta model is 0.81 on GS and 0.89 on WH using ttnn_optimized_bert.

cc: @boris-drazic

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

How do we know this model is correct? This PCC value seems pretty low to me.

tt_output_start_logits = tt_output[..., :, 0]
tt_output_end_logits = tt_output[..., :, 1]

assert_with_pcc(torch_output_start_logits, tt_output_start_logits, 0.81 if is_grayskull else 0.89)
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

How do we know this model is correct? This PCC value seems pretty low to me.

models/demos/roberta/demo/demo.py Show resolved Hide resolved
models/demos/roberta/demo/demo.py Show resolved Hide resolved
@kkeerthana0573 kkeerthana0573 force-pushed the keerthana/functional_roberta_demo branch 2 times, most recently from 8df21f2 to 60c9f3e Compare November 22, 2024 06:37
@esmalTT
Copy link
Contributor

esmalTT commented Nov 22, 2024

@kkeerthana0573 When you are ready, please update the description with the links to successful CI runs.

@saichandax
Copy link
Contributor

@kkeerthana0573 , please add the details, why CIs are failing for RobertA.

@kkeerthana0573
Copy link
Contributor Author

The model is working as expected on top of 019a5ccaed35c1864cc102895114657e6dff08b6, Here are the passing CIs links:

However, with changes introduced in commit 93fccc5e8880010e8b5739478b48c037e747cbb4, ttnn.linear op using core_grids here is throwing Statically allocated circular buffers in program 15 clash with L1 buffers on core range [(x=0,y=0) - (x=7,y=6)]. L1 buffer allocated at 1032832 and static circular buffer region ends at 1325568.

Here is the failure log in Device Perf CIs for reference : link

@kkeerthana0573 kkeerthana0573 force-pushed the keerthana/functional_roberta_demo branch 2 times, most recently from cb3678c to 7d6dbe0 Compare December 2, 2024 21:06
@saichandax saichandax requested a review from esmalTT December 3, 2024 04:40
@kkeerthana0573 kkeerthana0573 force-pushed the keerthana/functional_roberta_demo branch from 7d6dbe0 to ec34d65 Compare December 3, 2024 08:59
@kkeerthana0573 kkeerthana0573 requested a review from a team as a code owner December 3, 2024 08:59
@uaydonat uaydonat self-requested a review December 4, 2024 02:15
@kkeerthana0573
Copy link
Contributor Author

The model is working as expected on top of 019a5ccaed35c1864cc102895114657e6dff08b6, Here are the passing CIs links:

However, with changes introduced in commit 93fccc5e8880010e8b5739478b48c037e747cbb4, ttnn.linear op using core_grids here is throwing Statically allocated circular buffers in program 15 clash with L1 buffers on core range [(x=0,y=0) - (x=7,y=6)]. L1 buffer allocated at 1032832 and static circular buffer region ends at 1325568.

Here is the failure log in Device Perf CIs for reference : link

This is resolved after Disabling packer L1 accum. and math_approx_mode as well as setting math_fidelity to HiFi2.

@kkeerthana0573
Copy link
Contributor Author

@kkeerthana0573 When you are ready, please update the description with the links to successful CI runs.

@esmalTT,
The PR is ready for review and Successful CI links are added in the description.
Thank you.

@kkeerthana0573 kkeerthana0573 force-pushed the keerthana/functional_roberta_demo branch from ec34d65 to 3b3b37e Compare December 9, 2024 07:09
@github-actions github-actions bot closed this Dec 17, 2024
@github-actions github-actions bot deleted the keerthana/functional_roberta_demo branch December 17, 2024 00:29
@kkeerthana0573 kkeerthana0573 restored the keerthana/functional_roberta_demo branch December 17, 2024 12:35
@esmalTT esmalTT force-pushed the keerthana/functional_roberta_demo branch from 3b3b37e to bd30318 Compare December 20, 2024 12:58
@esmalTT
Copy link
Contributor

esmalTT commented Dec 20, 2024

Re-running demo pipeline here: https://github.com/tenstorrent/tt-metal/actions/runs/12431869447

@kkeerthana0573 kkeerthana0573 force-pushed the keerthana/functional_roberta_demo branch from bd30318 to d4bf99c Compare December 23, 2024 15:19
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

8 participants