Replace direct cudaMemcpyAsync calls with utility functions (within /src) #17550
base: branch-25.02
@@ -78,7 +78,7 @@ class device_scalar : public rmm::device_scalar<T> {
   [[nodiscard]] T value(rmm::cuda_stream_view stream) const
   {
     cuda_memcpy<T>(bounce_buffer, device_span<T const>{this->data(), 1}, stream);
-    return bounce_buffer[0];
+    return std::move(bounce_buffer[0]);
Just in case we want to make a device_scalar of a non-copyable type.
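To illustrate the point of this comment, here is a minimal host-only sketch, not the cudf implementation. The names `move_only` and `take_first` are hypothetical, and a plain `std::vector` stands in for the bounce buffer. Returning the element by value without `std::move` would need a copy constructor, which a non-copyable type does not have.

```cpp
#include <utility>
#include <vector>

// Hypothetical non-copyable type used only for illustration.
struct move_only {
  int value;
  explicit move_only(int v) : value(v) {}
  move_only(move_only const&)            = delete;
  move_only& operator=(move_only const&) = delete;
  move_only(move_only&&)                 = default;
  move_only& operator=(move_only&&)      = default;
};

// Hypothetical stand-in for reading the single element out of a host bounce buffer.
template <typename T>
T take_first(std::vector<T>& bounce_buffer)
{
  // `return bounce_buffer[0];` would copy-construct T from an lvalue and
  // fail to compile for move_only; std::move selects the move constructor.
  return std::move(bounce_buffer[0]);
}
```

With this shape, `take_first` compiles for both copyable and move-only element types; for trivially movable types the move is equivalent to a copy.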
Co-authored-by: David Wendt <[email protected]>
Looks good to me
Description
Replaced the calls to cudaMemcpyAsync with the new cuda_memcpy/cuda_memcpy_async utility, which optionally avoids using the copy engine.
Also took the opportunity to use cudf::detail::host_vector and its factories to enable wider pinned memory use.
Remaining instances are either not viable (e.g. copying h_needs_fallback, interop) or D2D copies.
Checklist