Add to_arrow_device() functions that accept views #15465

Merged
merged 20 commits into from
Apr 22, 2024
Changes from 10 commits
Commits
6edf1c2
Add to_arrow_device() functions that accept views
davidwendt Apr 4, 2024
6808443
Merge branch 'branch-24.06' into to-arrow-from-view
davidwendt Apr 4, 2024
b5d44a1
remove changes from test
davidwendt Apr 4, 2024
4989510
Merge branch 'branch-24.06' into to-arrow-from-view
davidwendt Apr 4, 2024
8b55b10
Merge branch 'branch-24.06' into to-arrow-from-view
davidwendt Apr 5, 2024
94bfd66
add cuda-try to event-destroy
davidwendt Apr 5, 2024
17ae26d
Merge branch 'branch-24.06' into to-arrow-from-view
davidwendt Apr 8, 2024
9f17519
Merge branch 'branch-24.06' into to-arrow-from-view
davidwendt Apr 9, 2024
6517524
refactor; add gtests
davidwendt Apr 10, 2024
65ed374
Merge branch 'branch-24.06' into to-arrow-from-view
davidwendt Apr 10, 2024
838a32b
Merge branch 'branch-24.06' into to-arrow-from-view
davidwendt Apr 10, 2024
9835511
Merge branch 'branch-24.06' into to-arrow-from-view
davidwendt Apr 10, 2024
d5bd427
add data_type_error to CUDF_FAILS
davidwendt Apr 11, 2024
077b566
change input view parameters to const&
davidwendt Apr 11, 2024
ac9ff09
Merge branch 'branch-24.06' into to-arrow-from-view
davidwendt Apr 11, 2024
662920f
Merge branch 'branch-24.06' into to-arrow-from-view
davidwendt Apr 16, 2024
b8c5f43
Merge branch 'branch-24.06' into to-arrow-from-view
davidwendt Apr 17, 2024
5f3ec83
change CUDF_CUDA_TRY to RMM_ASSERT_CUDA_SUCCESS
davidwendt Apr 17, 2024
b2fae15
add all copy types to doxygen
davidwendt Apr 18, 2024
27b57b5
Merge branch 'branch-24.06' into to-arrow-from-view
davidwendt Apr 18, 2024
2 changes: 2 additions & 0 deletions cpp/CMakeLists.txt
@@ -359,6 +359,8 @@ add_library(
src/interop/from_arrow.cu
src/interop/to_arrow.cu
src/interop/to_arrow_device.cu
src/interop/to_arrow_schema.cpp
src/interop/to_arrow_utilities.cpp
src/interop/detail/arrow_allocator.cpp
src/io/avro/avro.cpp
src/io/avro/avro_gpu.cu
62 changes: 60 additions & 2 deletions cpp/include/cudf/interop.hpp
@@ -256,6 +256,66 @@ unique_device_array_t to_arrow_device(
rmm::cuda_stream_view stream = cudf::get_default_stream(),
rmm::mr::device_memory_resource* mr = rmm::mr::get_current_device_resource());

/**
* @brief Create `ArrowDeviceArray` from a table view
*
* Populates the C struct ArrowDeviceArray performing copies only if necessary.
* This wraps the data on the GPU device and gives a view of the table data
* to the ArrowDeviceArray struct. If the caller frees the data referenced by
* the table_view, using the returned object results in undefined behavior.
*
* After calling this function, the release callback on the returned ArrowDeviceArray
* must be called to clean up any memory created during conversion.
*
* @note For decimals, since the precision is not stored for them in libcudf
* it will be converted to an Arrow decimal128 with the widest precision the cudf decimal type
* supports. For example, numeric::decimal32 will be converted to Arrow decimal128 of the precision
* 9 which is the maximum precision for 32-bit types. Similarly, numeric::decimal128 will be
* converted to Arrow decimal128 of the precision 38.
*
* @note Copies will be performed in the cases where cudf differs from Arrow
* such as in the representation of bools (Arrow uses a bitmap, cudf uses 1 byte per value).
*
* @param table Input table
* @param stream CUDA stream used for device memory operations and kernel launches
* @param mr Device memory resource used for any allocations during conversion
* @return ArrowDeviceArray which will have ownership of any copied data
*/
unique_device_array_t to_arrow_device(
cudf::table_view table,
rmm::cuda_stream_view stream = cudf::get_default_stream(),
rmm::mr::device_memory_resource* mr = rmm::mr::get_current_device_resource());
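Note on the decimal `@note` above: the precisions quoted in the comment (9 for 32-bit, 38 for 128-bit) are the number of decimal digits that every value of a signed integer of that width can hold. The sketch below only illustrates that arithmetic. `max_decimal_precision` is a hypothetical helper, not a libcudf function, and the 64-bit value of 18 is derived from the same formula rather than taken from this diff.

```cpp
// Hypothetical helper (not part of libcudf): the largest number of decimal
// digits that any value of a signed two's-complement integer with `bits`
// bits is guaranteed to hold, i.e. floor((bits - 1) * log10(2)).
constexpr int max_decimal_precision(int bits)
{
  // 30103 / 100000 approximates log10(2); exact enough for 32, 64 and 128 bits.
  return (bits - 1) * 30103 / 100000;
}

static_assert(max_decimal_precision(32) == 9, "decimal32 -> precision 9");
static_assert(max_decimal_precision(64) == 18, "decimal64 -> precision 18");
static_assert(max_decimal_precision(128) == 38, "decimal128 -> precision 38");
```

Since the target type is always Arrow decimal128, only the reported precision changes with the cudf source type in this reading of the comment.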

/**
* @brief Create `ArrowDeviceArray` from a column view
*
* Populates the C struct ArrowDeviceArray performing copies only if necessary.
* This wraps the data on the GPU device and gives a view of the column data
* to the ArrowDeviceArray struct. If the caller frees the data referenced by
* the column_view, using the returned object results in undefined behavior.
*
* After calling this function, the release callback on the returned ArrowDeviceArray
* must be called to clean up any memory created during conversion.
*
* @note For decimals, since the precision is not stored for them in libcudf
* it will be converted to an Arrow decimal128 with the widest precision the cudf decimal type
* supports. For example, numeric::decimal32 will be converted to Arrow decimal128 of the precision
* 9 which is the maximum precision for 32-bit types. Similarly, numeric::decimal128 will be
* converted to Arrow decimal128 of the precision 38.
*
* @note Copies will be performed in the cases where cudf differs from Arrow such as
* in the representation of bools (Arrow uses a bitmap, cudf uses 1 byte per value).
*
* @param col Input column
* @param stream CUDA stream used for device memory operations and kernel launches
* @param mr Device memory resource used for any allocations during conversion
* @return ArrowDeviceArray which will have ownership of any copied data
*/
unique_device_array_t to_arrow_device(
cudf::column_view col,
rmm::cuda_stream_view stream = cudf::get_default_stream(),
rmm::mr::device_memory_resource* mr = rmm::mr::get_current_device_resource());
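Note on the bool `@note` above: a copy is needed because the two layouts differ in size and addressing, so the cudf buffer cannot simply be wrapped. The host-side sketch below shows the layout change only. `pack_bools_to_bitmap` is a hypothetical name, and this is not the libcudf implementation, which would do the equivalent work on the device. It assumes Arrow's least-significant-bit-first bit order, where value `i` lives in bit `i % 8` of byte `i / 8`.

```cpp
#include <cstddef>
#include <cstdint>
#include <vector>

// Hypothetical host-side illustration (not libcudf code): convert a
// one-byte-per-value boolean buffer into a packed, LSB-first bitmap.
std::vector<std::uint8_t> pack_bools_to_bitmap(std::vector<std::uint8_t> const& bytes)
{
  // One bit per value, rounded up to whole bytes; unused trailing bits stay 0.
  std::vector<std::uint8_t> bitmap((bytes.size() + 7) / 8, 0);
  for (std::size_t i = 0; i < bytes.size(); ++i) {
    if (bytes[i] != 0) {
      // Any non-zero byte is treated as true.
      bitmap[i / 8] |= static_cast<std::uint8_t>(1u << (i % 8));
    }
  }
  return bitmap;
}
```

Nine input values pack into two bytes here, which is why the output needs its own allocation (the `mr` parameter) instead of pointing at the column's existing data.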

/**
* @brief Create `cudf::table` from given arrow Table input
*
@@ -264,7 +324,6 @@ unique_device_array_t to_arrow_device(
* @param mr Device memory resource used to allocate `cudf::table`
* @return cudf table generated from given arrow Table
*/

std::unique_ptr<table> from_arrow(
arrow::Table const& input,
rmm::cuda_stream_view stream = cudf::get_default_stream(),
@@ -278,7 +337,6 @@ std::unique_ptr<table> from_arrow(
* @param mr Device memory resource used to allocate `cudf::scalar`
* @return cudf scalar generated from given arrow Scalar
*/

std::unique_ptr<cudf::scalar> from_arrow(
arrow::Scalar const& input,
rmm::cuda_stream_view stream = cudf::get_default_stream(),