Add default pinned pool that falls back to new pinned allocations #15665
Original file line number | Diff line number | Diff line change |
---|---|---|
|
@@ -206,14 +206,16 @@ static_assert(cuda::mr::resource_with<fixed_pinned_pool_memory_resource, | |
cuda::mr::host_accessible>, | ||
""); | ||
|
||
rmm::host_async_resource_ref default_pinned_mr() | ||
rmm::host_async_resource_ref make_default_pinned_mr(std::optional<size_t> config_size) | ||
{ | ||
static fixed_pinned_pool_memory_resource mr = []() { | ||
auto const size = []() -> size_t { | ||
static fixed_pinned_pool_memory_resource mr = [config_size]() { | ||
auto const size = [&config_size]() -> size_t { | ||
if (auto const env_val = getenv("LIBCUDF_PINNED_POOL_SIZE"); env_val != nullptr) { | ||
return std::atol(env_val); | ||
} | ||
|
||
if (config_size.has_value()) { return *config_size; } | ||
|
||
size_t free{}, total{}; | ||
CUDF_CUDA_TRY(cudaMemGetInfo(&free, &total)); | ||
// 0.5% of the total device memory, capped at 100MB | ||
|
@@ -228,16 +230,22 @@ rmm::host_async_resource_ref default_pinned_mr() | |
return mr; | ||
} | ||
|
||
rmm::host_async_resource_ref make_host_mr(std::optional<size_t> size) | ||
{ | ||
static rmm::host_async_resource_ref mr_ref = make_default_pinned_mr(size); | ||
There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. Perhaps this was asked before but I'm curious when/how this object is destroyed? There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. great question. Currently the pool itself is not destroyed as it caused a segfault at the end of some tests; presumably because of the call to cudaFreeHost after main(). But this is something I should revisit and verify what exactly the issue was. There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. Yeah, can't destroy a static pool resource object. Open to suggestions to avoid the pool leak. |
||
return mr_ref; | ||
} | ||
|
||
std::mutex& host_mr_mutex() | ||
{ | ||
static std::mutex map_lock; | ||
return map_lock; | ||
} | ||
|
||
rmm::host_async_resource_ref host_mr() | ||
rmm::host_async_resource_ref& host_mr() | ||
{ | ||
static rmm::host_async_resource_ref host_mr = default_pinned_mr(); | ||
return host_mr; | ||
static rmm::host_async_resource_ref mr_ref = make_host_mr(std::nullopt); | ||
return mr_ref; | ||
} | ||
|
||
} // namespace | ||
|
@@ -256,4 +264,10 @@ rmm::host_async_resource_ref get_host_memory_resource() | |
return host_mr(); | ||
} | ||
|
||
void config_host_memory_resource(size_t size) | ||
{ | ||
std::scoped_lock lock{host_mr_mutex()}; | ||
make_host_mr(size); | ||
} | ||
|
||
} // namespace cudf::io |
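On the destruction question raised in the review of make_host_mr above: one common way to get "never destroyed" behavior for a function-local static is to heap-allocate it once and never delete it. The following is a minimal sketch of that pattern, not the code in this PR; `fake_pool`, `leaked_pool`, `destroyed_count`, and the size 1024 are hypothetical stand-ins, and no CUDA calls are made.

```cpp
#include <cassert>
#include <cstddef>

// Counts destructor runs, for illustration only.
int destroyed_count = 0;

// Hypothetical stand-in for the pinned pool. In the real pool the destructor
// would release pinned memory (e.g. via cudaFreeHost).
struct fake_pool {
  explicit fake_pool(std::size_t s) : size{s} {}
  ~fake_pool() { ++destroyed_count; }
  std::size_t size;
};

fake_pool& leaked_pool()
{
  // Allocated on the first call and intentionally never deleted, so the
  // destructor does not run during static teardown after main().
  static fake_pool* pool = new fake_pool{1024};
  assert(pool != nullptr);
  return *pool;
}
```

The trade-off is the one noted in the thread: the pool is leaked at process exit in exchange for avoiding teardown-order problems.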
This is a dangerous requirement, and may not be satisfied. How about making the static function re-configurable?
To do so: host_mr will initialize it with a std::nullopt size if it is nullptr; otherwise it just dereferences the current pointer and returns.
To clarify, the issue with calling config after get/set is that it would have no effect.
Allowing this opens another can of worms, e.g. what is the intended effect of calling config after set?
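The "no effect" behavior follows from how function-local statics work: the initializer runs on the first call only, so arguments passed on later calls are ignored. A minimal sketch of that, using hypothetical names (`fake_pool`, `make_pool`) and an arbitrary default of 1024 rather than the real cudf types:

```cpp
#include <cassert>
#include <cstddef>
#include <optional>

// Hypothetical stand-in for the pinned pool: it only records its size.
struct fake_pool {
  std::size_t size;
};

// Same shape as make_host_mr in the diff: the function-local static is
// initialized by the first call, and later arguments are silently ignored.
fake_pool& make_pool(std::optional<std::size_t> size)
{
  static fake_pool pool{size.value_or(1024)};  // 1024 is an arbitrary default
  return pool;
}

// Usage: a "get" followed by a late "config" returns the same pool.
void show_late_config_is_ignored()
{
  auto const first  = make_pool(std::nullopt).size;  // first use creates the pool
  auto const second = make_pool(4096).size;          // requested size is ignored
  assert(first == second);
}
```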
If we don't allow this, let's add a validity check to prevent it from being accidentally misused. It sounds unsafe to just rely on an assumption.
@abellina what behavior do you suggest when config is called after the first resource use? I'm not sure if we should throw or just warn.
I think we should throw. I agree with @ttnghia that we should do something in that case.
Added a mechanism to throw if config is called after the default resource has already been created.
@abellina might be good to test your branch with this change.
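For readers following the thread, here is a rough sketch of what a "throw if config is called after creation" guard can look like. It is an illustration under assumptions, not the mechanism actually added in this PR: `fake_pool`, `pool_state`, `get_pool`, `config_pool`, the default size 1024, and the choice of std::logic_error are all hypothetical.

```cpp
#include <cassert>
#include <cstddef>
#include <mutex>
#include <optional>
#include <stdexcept>

// Hypothetical stand-in for the pinned pool: it only records its size.
struct fake_pool {
  std::size_t size;
};

// All shared state lives behind one mutex.
struct pool_state {
  std::mutex mtx;
  std::optional<std::size_t> requested_size;  // set by config_pool
  std::optional<fake_pool> pool;              // created on first use
};

pool_state& state()
{
  static pool_state s;
  return s;
}

// First use creates the pool with the configured size, or a default.
fake_pool& get_pool()
{
  auto& s = state();
  std::scoped_lock lock{s.mtx};
  if (!s.pool.has_value()) { s.pool = fake_pool{s.requested_size.value_or(1024)}; }
  assert(s.pool.has_value());
  return *s.pool;
}

// Configuration is only accepted before the pool exists; afterwards it throws
// instead of silently doing nothing.
void config_pool(std::size_t size)
{
  auto& s = state();
  std::scoped_lock lock{s.mtx};
  if (s.pool.has_value()) {
    throw std::logic_error{"config_pool called after the pool was created"};
  }
  s.requested_size = size;
}
```

With this shape, calling `config_pool(4096)` before the first `get_pool()` takes effect, while any `config_pool` call afterwards raises an exception, which makes the ordering requirement visible to callers.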