[FEA] provide external libraries a way of getting a DeviceBuffer pointer that can become spillable again #14029

Open
wence- opened this issue Sep 1, 2023 · 2 comments
Labels
0 - Backlog In queue waiting for assignment feature request New feature or request improvement Improvement / enhancement to an existing function Performance Performance related issue Python Affects Python cuDF API.

Comments

@wence-
Contributor

wence- commented Sep 1, 2023

Is your feature request related to a problem? Please describe.

When running in a multi-GPU setting, message passing with ucx-py takes a DeviceBuffer and obtains the device memory pointer through the __cuda_array_interface__. This correctly marks the buffer as unspillable, but the buffer then has no way to become spillable again.

It would be nice if there were a way to expose a pointer that keeps the buffer unspillable only until the external library drops its reference (kind of like acquire_spill_lock). ucx-py could then use it and scope the pointer use to the lifetime of the message request: once the request is completed, the pointer can be dropped and the buffer is available for spilling again.

Describe the solution you'd like

If we were to hand back an object with a weakref.finalize(obj, unmark_spillable) callback attached, the buffer could become spillable again as soon as that object was dropped.
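To make the idea concrete, here is a minimal pure-Python sketch. It is only a toy model: `PinCountingBuffer`, `PointerHandle`, `expose`, and `_unpin` are hypothetical names for illustration and are not part of cuDF or ucx-py. The only real API used is `weakref.finalize` from the standard library.

```python
import gc
import weakref


class PointerHandle:
    """Hypothetical object handed to the external library."""

    def __init__(self, ptr):
        self.ptr = ptr


class PinCountingBuffer:
    """Hypothetical stand-in for a spillable device buffer (not cuDF's class)."""

    def __init__(self, ptr):
        self._ptr = ptr
        self._pins = 0  # number of live external handles

    @property
    def spillable(self):
        return self._pins == 0

    def _unpin(self):
        self._pins -= 1

    def expose(self):
        """Return a handle; the buffer stays unspillable while it is alive."""
        self._pins += 1
        handle = PointerHandle(self._ptr)
        # The callback runs once `handle` is garbage collected. It must not
        # reference `handle` itself, otherwise the handle could never die.
        weakref.finalize(handle, self._unpin)
        return handle


buf = PinCountingBuffer(ptr=0x1000)
handle = buf.expose()
print(buf.spillable)  # False: the handle is still alive

del handle
gc.collect()  # not needed under CPython refcounting, harmless elsewhere
print(buf.spillable)  # True: the finalizer has unpinned the buffer
```

One side effect of this design is that the finalizer holds a strong reference to the buffer through the bound method, so the underlying memory cannot be freed while an external handle still exists.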

Describe alternatives you've considered

Making ucx-py aware of cudf and using acquire_spill_lock.
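For comparison, a rough sketch of what this alternative implies: the transport itself has to wrap every pointer use in a lock scope. The names below (`LockCountingBuffer`, `toy_spill_lock`, `toy_send`) are hypothetical stand-ins and do not reproduce the real acquire_spill_lock or any ucx-py function.

```python
import contextlib


class LockCountingBuffer:
    """Hypothetical buffer that is spillable only when no lock scope is open."""

    def __init__(self, ptr):
        self.ptr = ptr
        self.lock_count = 0

    @property
    def spillable(self):
        return self.lock_count == 0


@contextlib.contextmanager
def toy_spill_lock(buf):
    """Hypothetical lock scope: pins `buf` for the duration of the block."""
    buf.lock_count += 1
    try:
        yield buf.ptr
    finally:
        # Released even if the body raises.
        buf.lock_count -= 1


def toy_send(buf, log):
    """Hypothetical transport call that must know about the lock."""
    with toy_spill_lock(buf) as ptr:
        # The pointer is only safe to use inside this block.
        log.append((ptr, buf.spillable))


buf = LockCountingBuffer(ptr=0x3000)
log = []
toy_send(buf, log)
print(log[0][1])      # False: pinned while the send was in flight
print(buf.spillable)  # True: released when the scope exited
```

The drawback is visible in `toy_send`: the external library has to import and call the lock itself, which is the coupling the feature request is trying to avoid.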

cc @madsbk / @vyasr / @galipremsagar

@madsbk
Member

madsbk commented Sep 4, 2023

Agreed, this would be very useful, but note that we already do something like this when serializing.
SpillableBuffer.serialize() returns a Buffer that has the spill lock as its owner, so the lock lives exactly as long as the returned Buffer:
https://github.com/rapidsai/cudf/blob/branch-23.10/python/cudf/cudf/core/buffer/spillable_buffer.py#L460-L474
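The "lock as owner" pattern can be sketched in plain Python. This is a toy model under my own assumptions, not the code behind the link: `ToySpillLock`, `ToySpillableBuffer`, `ToyView`, and `export` are hypothetical names, and the real implementation may track locks differently.

```python
import gc
import weakref


class ToySpillLock:
    """Hypothetical token: a buffer is unspillable while any token is alive."""


class ToyView:
    """Hypothetical plain buffer that keeps its owner alive."""

    def __init__(self, ptr, size, owner):
        self.ptr = ptr
        self.size = size
        self._owner = owner


class ToySpillableBuffer:
    """Hypothetical stand-in, not cuDF's SpillableBuffer."""

    def __init__(self, ptr, size):
        self._ptr = ptr
        self._size = size
        # Weak references only: the buffer never keeps a lock alive itself.
        self._locks = weakref.WeakSet()

    @property
    def spillable(self):
        return len(self._locks) == 0

    def export(self):
        """Return a view whose owner tuple carries a fresh lock."""
        lock = ToySpillLock()
        self._locks.add(lock)
        return ToyView(self._ptr, self._size, owner=(self, lock))


buf = ToySpillableBuffer(ptr=0x5000, size=64)
view = buf.export()
print(buf.spillable)  # False: the view's owner holds the lock

del view
gc.collect()  # not needed under CPython refcounting, harmless elsewhere
print(buf.spillable)  # True: the lock died together with the view
```

Because the only strong reference to the lock lives in the view's owner, an external library that simply holds on to the view and later drops it gets the scoped behaviour without knowing anything about spilling.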

@wence-
Contributor Author

wence- commented Sep 4, 2023

Ah, I hadn't spotted that. I think that also works, so things are less bad than I thought.
