#14257: few more optimization for yolo #15582
Nightly Regression: https://github.com/tenstorrent/tt-metal/actions/runs/12103838737
Is there any perf impact from this change?
Does the trace size need to be adjusted?
Yes. Perf is now 200 fps, up from 170 fps. I ran the nightly CI; no trace size adjustment is needed.
So we don't have a perf CI in place for yolo ATM?
Yeah, we don't have a perf CI for yolo. I am getting the numbers from IRD machines (1st sheet with final name).
OK then, document the perf diff in the PR description/commit message.
To calculate the FPS, we sum the device kernel duration (ns) column and divide by 10^9 to convert nanoseconds to seconds. That gives the device time per inference; the FPS is the reciprocal of that time.
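A minimal sketch of that calculation, in case it helps when documenting the numbers. The function names, the `batch_size` parameter, and the CSV column header `DEVICE KERNEL DURATION [ns]` are assumptions for illustration; adjust them to match the actual profiler report.

```python
import csv
import io

NS_PER_SECOND = 1_000_000_000


def read_duration_column(csv_text, column="DEVICE KERNEL DURATION [ns]"):
    """Return the per-op device kernel durations (ns) from a CSV report.

    The column header is an assumption; rows with an empty cell are skipped.
    """
    reader = csv.DictReader(io.StringIO(csv_text))
    return [float(row[column]) for row in reader if row.get(column)]


def fps_from_kernel_durations(durations_ns, batch_size=1):
    """Convert summed device kernel time (ns) into frames per second.

    FPS = batch_size / (sum(durations_ns) / 1e9).
    """
    total_ns = sum(durations_ns)
    if total_ns <= 0:
        raise ValueError("total device kernel duration must be positive")
    return batch_size * NS_PER_SECOND / total_ns


# Hypothetical two-op report: 2 ms + 3 ms = 5 ms of device time per frame.
sample_report = (
    "OP CODE,DEVICE KERNEL DURATION [ns]\n"
    "Conv2d,2000000\n"
    "Concat,3000000\n"
)
durations = read_duration_column(sample_report)
fps = fps_from_kernel_durations(durations)
print(f"{fps:.1f} fps")
```

Note that this counts only device kernel time, so it ignores host overhead and any gaps between ops; the end-to-end frame rate can be lower.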
Please run:
https://github.com/tenstorrent/tt-metal/actions/runs/12103838737
4adfa9c to 2ee94e3
@shwetankTT Let's get this on the perf CI so that it does not regress.
@mywoodstock Yeah. Added the perf test to CI. Validating it now.
Ticket
#14257:
Problem description
YoloV4 optimization.
Improvement: 200 fps, up from 170 fps.
What's changed
Added the conv optimization flag for a few layers. Removed some sharding that is not needed.
Checklist