capture nsys report #14
Signed-off-by: Allen Xu <[email protected]>
val nsysStopCommand = "nsys stop"
val result: String = nsysStopCommand.!!
println(s"Nsys Stop Command output: $result")
Thread.sleep(120 * 1000)
Will this make the GPU perf worse ?
Not really. Only enabling CUDA memory tracking (--cuda-memory-usage=true) makes perf worse, and I didn't enable that.
Should we enable it? It is used to see peak memory usage.
I mean the "sleep" here; it will take 2 minutes.
Customers will also use this branch for benchmarks. I am not sure debug-only code should get in.
@winningsix what do you think?
Oh, I get your point. I removed the sleep: the command itself is blocking, so there is no need to wait.
And as we discussed offline, the report generation time is counted, so use the new branch for analysis and the original one, which doesn't run nsys, for perf benchmarks.
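For reference, a minimal sketch of the stop step without the sleep, assuming `nsys` is on the PATH and a profiling session is active. The `NsysStop` object and `runBlocking` helper are hypothetical names for illustration, not code from this PR. It relies on `!!` from `scala.sys.process`, which waits for the process to exit before returning its output:

```scala
import scala.sys.process._

object NsysStop {
  // Hypothetical helper: runs the command and blocks until it exits.
  // `!!` returns the captured stdout and throws a RuntimeException
  // if the process exits with a non-zero code.
  def runBlocking(command: Seq[String]): String = command.!!

  def main(args: Array[String]): Unit = {
    // Assumption: `nsys` is installed and a session was started earlier.
    val result = runBlocking(Seq("nsys", "stop"))
    println(s"Nsys Stop Command output: $result")
    // No Thread.sleep here: control only reaches this point after
    // the `nsys stop` process has returned.
  }
}
```

Passing the command as a `Seq[String]` rather than a single string avoids whitespace-splitting surprises if arguments are added later.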
Signed-off-by: Allen Xu <[email protected]>
This lets customers try capturing an nsys report. The report is generated before the executor exits, so users should be able to upload it to their persistent storage.
This requires the change at entry point as well:
NOTE: --cuda-memory-usage=true causes a GPU perf drop, so if you don't need memory usage analysis, you can remove it from the nsys command.
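One way to make that flag easy to switch off is to build the nsys argument list conditionally. The sketch below is illustrative only: `NsysCommand`, `withMemoryTracking`, and the `nsys profile` base used in the usage comment are assumptions, since the actual entry-point command is not shown here.

```scala
object NsysCommand {
  val MemoryUsageFlag = "--cuda-memory-usage=true"

  // Hypothetical helper: `base` is the nsys portion of the command only
  // (before the profiled application). Adds the flag when memory analysis
  // is wanted and strips it otherwise, so the default run avoids the
  // GPU perf drop.
  def withMemoryTracking(base: Seq[String], trackMemory: Boolean): Seq[String] = {
    val withoutFlag = base.filterNot(_ == MemoryUsageFlag)
    if (trackMemory) withoutFlag :+ MemoryUsageFlag else withoutFlag
  }
}

// Usage (assumed base command):
//   NsysCommand.withMemoryTracking(Seq("nsys", "profile"), trackMemory = false)
```

Stripping the flag before re-adding it keeps the helper idempotent, so calling it on a command that already contains the flag does not duplicate it.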