Control whether a file data source memory-maps the file with an environment variable #17004

Merged Oct 18, 2024 (8 commits)
20 changes: 14 additions & 6 deletions cpp/src/io/utilities/datasource.cpp
@@ -15,6 +15,7 @@
  */

 #include "file_io_utilities.hpp"
+#include "getenv_or.hpp"

 #include <cudf/detail/utilities/logger.hpp>
 #include <cudf/detail/utilities/vector_factories.hpp>
@@ -392,14 +393,21 @@ std::unique_ptr<datasource> datasource::create(std::string const& filepath,
                                                 size_t offset,
                                                 size_t max_size_estimate)
 {
-#ifdef CUFILE_FOUND
-  if (cufile_integration::is_always_enabled()) {
-    // avoid mmap as GDS is expected to be used for most reads
+  auto const use_memory_mapping = [] {
+    auto const policy = getenv_or("LIBCUDF_MMAP_ENABLED", std::string{"ON"});
+
+    if (policy == "ON") { return true; }
+    if (policy == "OFF") { return false; }
+
+    CUDF_FAIL("Invalid LIBCUDF_MMAP_ENABLED value: " + policy);
+  }();
+
+  if (use_memory_mapping) {
+    return std::make_unique<memory_mapped_source>(filepath.c_str(), offset, max_size_estimate);
+  } else {
+    // `file_source` reads the file directly, without memory mapping
     return std::make_unique<file_source>(filepath.c_str());
   }
-#endif
-  // Use our own memory mapping implementation for direct file reads
-  return std::make_unique<memory_mapped_source>(filepath.c_str(), offset, max_size_estimate);
 }

 std::unique_ptr<datasource> datasource::create(host_buffer const& buffer)
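
The `LIBCUDF_MMAP_ENABLED` policy check in this change can be tried outside the libcudf tree. The following is a minimal sketch under stated assumptions, not the library's code: `getenv_or_sketch` is a hypothetical stand-in for the internal `getenv_or` helper (its implementation is not part of this diff), and `std::invalid_argument` stands in for `CUDF_FAIL`.

```cpp
#include <cstdlib>
#include <sstream>
#include <stdexcept>
#include <string>

// Hypothetical stand-in for libcudf's internal `getenv_or` helper (assumption):
// returns the variable's value converted to T, or `default_val` if it is unset.
template <typename T>
T getenv_or_sketch(char const* name, T const& default_val)
{
  char const* env_val = std::getenv(name);
  if (env_val == nullptr) { return default_val; }
  std::stringstream sstream(env_val);
  T converted{};
  sstream >> converted;
  return converted;
}

// Mirrors the policy check from the diff: "ON" selects memory mapping,
// "OFF" selects direct file reads, and any other value is rejected.
// std::invalid_argument is used here in place of CUDF_FAIL (assumption).
bool use_memory_mapping()
{
  auto const policy = getenv_or_sketch("LIBCUDF_MMAP_ENABLED", std::string{"ON"});

  if (policy == "ON") { return true; }
  if (policy == "OFF") { return false; }

  throw std::invalid_argument("Invalid LIBCUDF_MMAP_ENABLED value: " + policy);
}
```

With the variable unset, `use_memory_mapping()` returns `true`, matching the default of `"ON"` in the diff. The comparison is exact, so a value such as `on` or `0` is rejected rather than silently mapped to one of the two policies.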