[FEA] spill data from main memory to disk memory #3740
Comments
I guess this is more of a question for RMM, but we had a long discussion about this some time ago. @jrhemstad, do you have any idea of how managed memory could eventually allow spilling to disk? I don't think this is currently supported, given that it's a driver-level feature.
This is out of the scope of
Thank you for clarifying; that is good enough for my case.
I am looking for a feature to spill computation vmem->mem->disk using
We have spilling configurations defined in dask-cuda. I would suggest reading over the linked doc and, if you have questions, posting them in the dask-cuda GH repo.
@quasiben Thank you. I found an existing issue for that there already; linking it for future readers: rapidsai/dask-cuda#37
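For future readers, here is a minimal sketch of what a dask-cuda spilling setup might look like on the machine described in this issue (11 GB GPU, 120 GB host). The helper names spill_settings and start_cluster, the 80% thresholds, and the /tmp/dask-spill path are illustrative assumptions, not recommendations from the maintainers; device_memory_limit, memory_limit, and local_directory are keyword arguments of dask_cuda.LocalCUDACluster, but check the dask-cuda docs for their exact semantics in your version.

```python
def spill_settings(gpu_gb, host_gb, spill_dir, percent=80):
    """Build keyword arguments for dask_cuda.LocalCUDACluster.

    The percent threshold is an arbitrary illustrative choice.
    """
    return {
        # Above this much GPU memory, the worker spills device data to host.
        "device_memory_limit": f"{gpu_gb * percent // 100}GB",
        # Host memory budget for the worker; beyond it, data goes to disk.
        "memory_limit": f"{host_gb * percent // 100}GB",
        # Directory where spilled data is written (hypothetical path).
        "local_directory": spill_dir,
    }


def start_cluster(settings):
    """Start a local cluster; requires a GPU plus dask-cuda and distributed."""
    # Imported lazily so spill_settings can be used on a machine without a GPU.
    from dask.distributed import Client
    from dask_cuda import LocalCUDACluster

    cluster = LocalCUDACluster(**settings)
    return Client(cluster)


settings = spill_settings(gpu_gb=11, host_gb=120, spill_dir="/tmp/dask-spill")
print(settings)
```

With a client from start_cluster(settings), the CSV would then be read through dask_cudf rather than cudf directly, so that the data is partitioned and individual partitions can be spilled.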
I have a dataset stored as CSV in 3 different sizes: 0.5 GB, 5 GB and 50 GB.
My machine has 120 GB of main memory and 11 GB of GPU memory.
cudf can run the computation on the 0.5 GB data using GPU memory alone.
If I want to run computations on the 5 GB data, I can set cudf.set_allocator("managed"). It works really nicely and fast; more importantly, it allows me to run the computation on my medium data.
The problem is that when I attempt to run the computation on the 50 GB data, I get the following error:
I assume there is not enough main memory for this computation. This is a known issue in Python for pandas and (in-memory) dask when attempting to run the same computation on data of that size.
My feature request is to extend the existing spilling of GPU memory to main memory so that data can be spilled to disk as well.
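To make the requested behaviour concrete, here is a toy sketch of three-tier spilling (GPU memory -> main memory -> disk) in plain Python. It is only an illustration of the idea, not how cudf, RMM or dask-cuda implement anything: the TieredStore class and its methods are hypothetical, the "device" and "host" tiers are ordinary dictionaries, values are raw bytes so that sizes are exact, and eviction is simple least-recently-used.

```python
import os
import tempfile
from collections import OrderedDict


class TieredStore:
    """Toy store that spills least-recently-used items device -> host -> disk."""

    def __init__(self, device_limit, host_limit, spill_dir=None):
        self.device_limit = device_limit  # bytes allowed in the fast tier
        self.host_limit = host_limit      # bytes allowed in the middle tier
        self.spill_dir = spill_dir or tempfile.mkdtemp(prefix="spill-")
        self.device = OrderedDict()       # key -> bytes, oldest first
        self.host = OrderedDict()         # key -> bytes, oldest first
        self.disk = {}                    # key -> path of the spilled file
        self._counter = 0                 # used to build unique file names

    @staticmethod
    def _used(tier):
        return sum(len(blob) for blob in tier.values())

    def where(self, key):
        """Return the name of the tier currently holding key, or None."""
        for name in ("device", "host", "disk"):
            if key in getattr(self, name):
                return name
        return None

    def put(self, key, blob):
        """Store bytes under key in the device tier, spilling older items."""
        self._discard(key)
        self.device[key] = blob
        self._rebalance()

    def get(self, key):
        """Return the bytes for key, promoting them back to the device tier."""
        if key in self.device:
            self.device.move_to_end(key)  # mark as most recently used
            return self.device[key]
        if key in self.host:
            blob = self.host.pop(key)
        else:
            path = self.disk.pop(key)     # raises KeyError for unknown keys
            with open(path, "rb") as f:
                blob = f.read()
            os.remove(path)
        self.device[key] = blob
        self._rebalance()
        return blob

    def _discard(self, key):
        self.device.pop(key, None)
        self.host.pop(key, None)
        path = self.disk.pop(key, None)
        if path is not None:
            os.remove(path)

    def _rebalance(self):
        # Device -> host; always keep the newest item on the device.
        while self._used(self.device) > self.device_limit and len(self.device) > 1:
            key, blob = self.device.popitem(last=False)
            self.host[key] = blob
        # Host -> disk.
        while self._used(self.host) > self.host_limit and self.host:
            key, blob = self.host.popitem(last=False)
            self._counter += 1
            path = os.path.join(self.spill_dir, f"{self._counter}.bin")
            with open(path, "wb") as f:
                f.write(blob)
            self.disk[key] = path


# Each tier holds 10 bytes; three 6-byte items force a spill through all tiers.
store = TieredStore(device_limit=10, host_limit=10)
store.put("a", b"x" * 6)
store.put("b", b"y" * 6)
store.put("c", b"z" * 6)
print({key: store.where(key) for key in "abc"})
```

Reading an item that has been spilled brings it back to the fastest tier and pushes older items down, which is the access pattern that makes disk spilling slow but lets a computation finish instead of running out of memory.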