#13397: Add data parallel support for SqueezeBERT model #13418
base: main
Conversation
force-pushed from e698cc6 to e041c8a
Can you post passing links...
force-pushed from e02f5ea to 7842fe6
You posted the wrong device perf link. I found it: https://github.com/tenstorrent/tt-metal/actions/runs/11908678446/job/33185055179 Please post the right link next time. By the way, it seems to have failed on your model.
@tt-rkim,
force-pushed from 7842fe6 to 5189ca0
Passing CI links: (Single-card) Model perf tests - In Progress
I will approve to unblock since it's the only one left, but please ensure ttnn nightly passes.
force-pushed from 2f5a5b4 to 010fa8e
del tt_output
i += 1
eval_score = squad_metric.compute(predictions=pred_labels, references=true_labels)
It would be safer if the test failed when eval_score drops below some threshold.
The evaluation scores should ideally remain consistent with the batch size and number of iterations specified in the demo. However, they may or may not vary with changes in batch size. I’ve now added an assertion to validate the expected scores.
the check should be tighter. The purpose of this assert is to catch regressions if a bad commit goes in. For example, if I push a change to a kernel and the score drops from 98 to 97.9, this is a bug, and this test should catch it.
So we want to assert that the eval score is unchanged, maybe with a small margin.
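A sketch of the tightened check being suggested. The reference values in `EXPECTED_SCORES` and the `ABS_MARGIN` tolerance are placeholders, not the model's actual SQuAD results, and `check_eval_score` is a hypothetical helper rather than code from this PR:

```python
# Hypothetical tightened regression check for the SQuAD eval score.
# The expected values below are placeholders; record the real scores
# observed on a known-good commit and pin them here.
EXPECTED_SCORES = {"exact_match": 80.0, "f1": 87.0}  # assumed reference values
ABS_MARGIN = 0.1  # small absolute tolerance, so a 98 -> 97.9 drop still fails


def check_eval_score(eval_score: dict) -> None:
    """Fail loudly if any metric drifts from its recorded reference value."""
    for metric, expected in EXPECTED_SCORES.items():
        actual = eval_score[metric]
        assert abs(actual - expected) <= ABS_MARGIN, (
            f"{metric} changed: expected {expected} +/- {ABS_MARGIN}, got {actual}"
        )
```

With this shape, the test catches both regressions and unexplained improvements, which also forces the pinned values to be updated deliberately when a real change lands.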
models/demos/wormhole/squeezebert/tests/test_perf_device_squeezebert.py
def get_expected_times(squeezebert):
    return {ttnn_functional_squeezebert: (29.29, 15.5)}[squeezebert]
What's the current run time? How close is it to the expected times?
The current runtimes in the test file have been updated based on the average times observed during CI runs.
We’re uncertain about the target numbers. Is there any other metric, besides the average CI times, that we can use to determine the target numbers?
cc: @boris-drazic, @mbahnasTT.
You want the expected times to be close to the current time, but not so close that a small variation will cause the test to fail. Maybe a 10-20% margin is good.
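One way to encode the suggested 10-20% margin, as a sketch: the baseline numbers are taken from `get_expected_times` in the diff above, while `expected_time` and `assert_within_expected` are hypothetical helpers, not part of this PR:

```python
# Baselines from the test file's get_expected_times (compile, inference):
MEASURED_COMPILE_TIME = 29.29  # seconds
MEASURED_INFERENCE_TIME = 15.5  # seconds

# 15% relative cushion, inside the 10-20% band suggested by the reviewer.
MARGIN = 0.15


def expected_time(baseline: float, margin: float = MARGIN) -> float:
    """Upper bound derived from the measured baseline plus a relative margin."""
    return baseline * (1.0 + margin)


def assert_within_expected(measured: float, baseline: float, margin: float = MARGIN) -> None:
    """Fail if a run is slower than the baseline by more than the margin."""
    bound = expected_time(baseline, margin)
    assert measured <= bound, (
        f"runtime {measured:.2f}s exceeds {bound:.2f}s "
        f"(baseline {baseline:.2f}s + {margin:.0%} margin)"
    )
```

Deriving the bound from the measured baseline keeps the constants honest: updating the baseline automatically moves the bound, and the margin stays one named knob.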
models/demos/wormhole/squeezebert/tt/ttnn_functional_squeezebert.py
force-pushed from 436aa1d to b25513c
force-pushed from b25513c to 8ddc243
force-pushed from 8ddc243 to 080e26d
mesh_device=mesh_device,
use_program_cache=use_program_cache,
model_name=model_name,
batch_size=8,
This should be called with the correct batch size.
mesh_device=mesh_device,
use_program_cache=use_program_cache,
model_name=model_name,
batch_size=8,
This should be called with the correct batch size.
profiler.start(f"preprocessing_parameter")
mesh_device_flag = is_wormhole_b0() and ttnn.GetNumAvailableDevices() == 2
batch_size = batch_size * 2 if mesh_device_flag else batch_size
Do not overwrite the batch_size that the caller provides.
tt_model_name = f"ttnn_{model_name}_optimized"
mesh_device_flag = is_wormhole_b0() and ttnn.GetNumAvailableDevices() == 2
batch_size = batch_size * 2 if mesh_device_flag else batch_size
Do not overwrite the batch_size that the caller provides.
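A minimal sketch of the suggested fix: compute the total batch size once at the call site instead of rescaling it inside model setup. `resolve_batch_size` is a hypothetical helper; the device-count doubling mirrors the `mesh_device_flag` check in the diff above:

```python
# Hypothetical helper: the caller decides the total batch size up front,
# so nothing downstream needs to (or may) overwrite it.
def resolve_batch_size(per_device_batch: int, num_devices: int) -> int:
    """Total batch size scales with the number of devices at the call site."""
    return per_device_batch * num_devices
```

At the call site this replaces the in-function doubling: a test running on an N300 (2 devices) would pass `resolve_batch_size(8, 2)` down as `batch_size=16`, and the model code treats the value it receives as authoritative.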
force-pushed from 5dd5731 to 9d46e12
force-pushed from 9d46e12 to 6a66479
Ticket
Link to GitHub Issue
Problem description
The SqueezeBERT model is configured to run on either N150 or N300, depending on the available machine.
Checklist