Extend memory spilling to multiple storage media #37

Open
pentschev opened this issue Apr 21, 2019 · 11 comments

@pentschev
Member

With the work in progress in #35, we will have the capability of spilling CUDA device memory to host, and host memory to disk. However, as pointed out by @kkraus14 here, it would be beneficial to allow spilling host memory to multiple user-defined storage media.

I think we could follow the same configuration structure as Alluxio, as suggested by @kkraus14. Based on the current structure suggested in #35 (still subject to change), it would look something like the following:

cuda.worker.dirs.path=/mnt/nvme,/mnt/ssd,/mnt/nfs
cuda.worker.dirs.quota=16GB,100GB,1000GB
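
For illustration only: neither these keys nor a tiered-spilling option exist in dask-cuda today, but if the proposal above were adopted, the same per-tier paths and quotas could presumably be set programmatically through dask.config, e.g.:

import dask

# Hypothetical keys taken from the proposal above -- not an existing dask-cuda option.
dask.config.set({
    "cuda.worker.dirs.path": ["/mnt/nvme", "/mnt/ssd", "/mnt/nfs"],
    "cuda.worker.dirs.quota": ["16GB", "100GB", "1000GB"],
})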

@mrocklin FYI

@jakirkham
Member

One related note for tracking: it would be useful to leverage GPUDirect Storage to allow spilling directly from GPU memory to disk.

pentschev added the feature request (New feature or request) label on Jan 8, 2021
@github-actions

This issue has been marked stale due to no recent activity in the past 30d. Please close this issue if no further response or action is needed. Otherwise, please respond with a comment indicating any updates or changes to the original issue and/or confirm this issue still needs to be addressed. This issue will be marked rotten if there is no activity in the next 60d.

@github-actions

This issue has been labeled inactive-90d due to no recent activity in the past 90 days. Please close this issue if no further response or action is needed. Otherwise, please respond with a comment indicating any updates or changes to the original issue and/or confirm this issue still needs to be addressed.

@jangorecki

@pentschev could you link documentation which explains how to set up spilling to disk? I found https://github.com/rapidsai/dask-cuda/pull/51/files but there doesn't seem to be any documentation on the new feature.
I want to use dask_cudf to spill from vmem to main memory, and then from main memory to disk when main memory is not enough. Searching https://docs.rapids.ai/ doesn't provide any answer.

@quasiben
Member

@jangorecki

This doc doesn't seem to answer my use case.

@pentschev
Member Author

Currently, --device-memory-limit/device_memory_limit (dask-cuda-worker/LocalCUDACluster) will spill from device to host; similarly, --memory-limit/memory_limit spills from host to disk, just like in mainline Dask, and the spilled data is stored in --local-directory/local_directory. Spilling to disk is today only supported by the default mechanism; JIT spilling still doesn't support it.
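
As a minimal sketch of those options (the sizes and path below are placeholders, to be tuned to your hardware):

from dask.distributed import Client
from dask_cuda import LocalCUDACluster

cluster = LocalCUDACluster(
    device_memory_limit="10GB",           # spill device memory to host above this threshold
    memory_limit="32GB",                  # spill host memory to disk above this threshold (per worker)
    local_directory="/tmp/dask-scratch",  # where spilled data is written
)
client = Client(cluster)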

@jangorecki

jangorecki commented May 27, 2021

@pentschev thank you for the reply, although it doesn't correspond to my current approach (cu.set_allocator("managed")).
AFAIU, to use it with Dask I should have:

client = Client(cluster)
client.run(cu.set_allocator, "managed")  # switch cudf's allocator to CUDA managed memory on every worker

Is this going to handle spilling vmem -> mem -> disk?
I don't want to change the default memory limits, only enable spilling.

@pentschev
Member Author

No, managed memory is handled by the CUDA driver; we have no control over how it handles spilling, and it doesn't support spilling to disk whatsoever. Within Dask, you can enable spilling as I mentioned above. That path doesn't make use of managed memory and thus is not as performant, but it will allow Dask to spill Python memory (i.e., Dask array/DataFrame chunks). However, it also has no control over memory that's handled internally by libraries such as cuDF.
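
If you do want the workers themselves to allocate through managed memory anyway, recent dask-cuda versions expose an rmm_managed_memory option; a sketch is below, but note that, per the above, the CUDA driver then handles device<->host paging and there is still no spilling to disk:

from dask.distributed import Client
from dask_cuda import LocalCUDACluster

# Workers allocate GPU memory through RMM's managed (unified) memory;
# the driver pages device<->host as needed, and disk is not involved.
cluster = LocalCUDACluster(rmm_managed_memory=True)
client = Client(cluster)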

@github-actions

This issue has been labeled inactive-90d due to no recent activity in the past 90 days. Please close this issue if no further response or action is needed. Otherwise, please respond with a comment indicating any updates or changes to the original issue and/or confirm this issue still needs to be addressed.

@github-actions

This issue has been labeled inactive-30d due to no recent activity in the past 30 days. Please close this issue if no further response or action is needed. Otherwise, please respond with a comment indicating any updates or changes to the original issue and/or confirm this issue still needs to be addressed. This issue will be labeled inactive-90d if there is no activity in the next 60 days.
