Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Move the device timer conversion in cvk_device instead of commands #630

Open
wants to merge 2 commits into
base: main
Choose a base branch
from

Conversation

rjodinchr
Copy link
Contributor

No description provided.

Copy link
Owner

@kpet kpet left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

What's the motivation for this change? Performance improvements?

if (sync_host > sync_dev) {
return (sync_host - sync_dev) + dev;
cl_ulong cvk_device::device_timer_to_host(cl_ulong dev) {
if (dev > m_sync_dev) {
Copy link
Owner

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Suggested change
if (dev > m_sync_dev) {
if (dev <= m_sync_dev) {

Isn't your condition backwards? You want a new pair of aligned timestamps only when you've wrapped around and the timestamp to convert to the host's time base is smaller than the device timestamp used as a reference.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I don't think it is backwards.
The idea is to get new sync points when the time we want is bigger than what we have converted so far. Because they might have been a slight divergence since we have the last sync point.

Copy link
Owner

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

So you'll end up requesting a new pair of timestamps for pretty much each command since the device timestamp will almost always be greater than the last one obtained when getting a host/device pair. The only time where you're not requesting a host/device pair is when the device timestamp wraps around. At which point I don't see much motivation for the change.

src/device.cpp Outdated Show resolved Hide resolved
@rjodinchr
Copy link
Contributor Author

What's the motivation for this change? Performance improvements?

From what I remember, this is needed to fix issues introduced by the new pattern from using timeline semaphore.
But as without timeline semaphore this is just a different implementation that works as well, I thought it would be better to submit it before in order to reduce the size of the timeline semaphore PR.

@rjodinchr
Copy link
Contributor Author

I have rerun the CTS without this patch (but with the timeline semaphore) to remind me why this patch is needed with timeline semaphore.

In the implementation I have, there is a race condition between the application thread and the executor thread. It can lead to the thread app looking for an event profiling info, because the event is complete, but the executor has not computed the profiling info.
But this event might be related to a cvk_command inside of a cvk_command_batch. As the sync points are taken care by the batch command, it was making things complicated. And this is the way I dealt with it, moving the sync point to the device instead of the batch command.

@rjodinchr rjodinchr requested a review from kpet May 14, 2024 07:30
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

2 participants