-
Notifications
You must be signed in to change notification settings - Fork 912
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Extend device_scalar
to optionally use pinned bounce buffer
#16947
Extend device_scalar
to optionally use pinned bounce buffer
#16947
Conversation
…fea-pinned-aware-device_scalar
…fea-host-device-copy
device_scalar
that optionally uses pinned bounce buffer
device_scalar
that optionally uses pinned bounce bufferdevice_scalar
to optionally uses pinned bounce buffer
device_scalar
to optionally uses pinned bounce bufferdevice_scalar
to optionally use pinned bounce buffer
…fea-pinned-aware-device_scalar
…uule/cudf into fea-pinned-aware-device_scalar
device_scalar
to optionally use pinned bounce bufferdevice_scalar
to optionally use pinned bounce buffer
…fea-pinned-aware-device_scalar
|
||
void set_value_async(T&& value, rmm::cuda_stream_view stream) | ||
{ | ||
bounce_buffer[0] = std::move(value); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
bonus feature from having a bounce buffer - we don't need to worry about the value
lifetime. rmm::device_scalar
prohibits passing an rvalue here, but we don't need to.
…fea-host-device-copy
…fea-pinned-aware-device_scalar
Why the global change? Could we still use |
(discussed offline, posting here for viz) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The new class seems fine, but do we really need to use it everywhere? Does this mean that any usage of rmm::device_scalar in cudf is now forbidden because it could introduce unexpected performance overheads? If so, should we include a pre-commit hook or something to that effect to enforce that? Also, if this implication is correct @davidwendt may want to weigh in.
The title says "optionally use pinned bounce buffer", but this usage looks unconditional. Is the optionality encoded in the host vector, or is it simply no longer optional?
The use of a bounce buffer is unconditional, but |
Pretty much. My aim is to stop using rmm::device_scalar outside of public APIs. Same goes for |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Approving based on some extensive offline discussion. We're going to move frward with this and see how this type works in libcudf, then consider upstreaming it to rmm if it's generalizable, and if it's not then looking into developing clear guidelines for when it should be used in cudf.
: rmm::device_scalar<T>{std::move(other)}, bounce_buffer{std::move(other.bounce_buffer)} | ||
{ | ||
} | ||
device_scalar& operator=(device_scalar&&) noexcept = default; |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Curious why the move ctor required code but this did not?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Default implementations should be fine in both cases. Compiled fine on 12.5 🤷
I suspect it's an 11.8 compiler bug, but really didn't want to dig into it, with a handy workaround available.
/merge |
Description
Depends on #16945
Added
cudf::detail::device_scalar
, derived fromrmm::device_scalar
. The new class overrides function members that perform copies between host and device. New implementation uses acudf::detail::host_vector
as a bounce buffer to avoid performing a pageable copy.Replaced
rmm::device_scalar
withcudf::detail::device_scalar
across libcudf.Checklist