Commit

Add row conversion code from spark-rapids-jni (#14664)
This temporarily moves the row conversion code from spark-rapids-jni into libcudf. The row conversion code must be compiled into a static library to work around a CCCL issue that triggers invalid memory access when calling `thrust::inclusive_scan` or `thrust::exclusive_scan` (NVIDIA/spark-rapids-jni#1567).

Once CCCL is updated with a fix for that issue (NVIDIA/spark-rapids-jni#1567), we may need to move the code back into spark-rapids-jni.

Authors:
  - Nghia Truong (https://github.com/ttnghia)

Approvers:
  - Mike Wilson (https://github.com/hyperbolic2346)
  - Vyas Ramasubramani (https://github.com/vyasr)
  - MithunR (https://github.com/mythrocks)

URL: #14664
ttnghia authored Dec 21, 2023
1 parent 49e4011 commit 36f56c9
Showing 10 changed files with 1,990 additions and 1,012 deletions.
1 change: 1 addition & 0 deletions cpp/CMakeLists.txt
@@ -628,6 +628,7 @@ add_library(
src/transform/nans_to_nulls.cu
src/transform/one_hot_encode.cu
src/transform/row_bit_count.cu
src/transform/row_conversion.cu
src/transform/transform.cpp
src/transpose/transpose.cu
src/unary/cast_ops.cu
54 changes: 54 additions & 0 deletions cpp/include/cudf/row_conversion.hpp
@@ -0,0 +1,54 @@
/*
* Copyright (c) 2022-2023, NVIDIA CORPORATION.
*
* Licensed under the Apache License, Version 2.0 (the "License");
* you may not use this file except in compliance with the License.
* You may obtain a copy of the License at
*
* http://www.apache.org/licenses/LICENSE-2.0
*
* Unless required by applicable law or agreed to in writing, software
* distributed under the License is distributed on an "AS IS" BASIS,
* WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
* See the License for the specific language governing permissions and
* limitations under the License.
*/

#pragma once

#include <memory>

#include <cudf/lists/lists_column_view.hpp>
#include <cudf/table/table_view.hpp>
#include <cudf/utilities/default_stream.hpp>
#include <rmm/cuda_stream_view.hpp>

namespace cudf {
//! @cond Doxygen_Suppress

std::vector<std::unique_ptr<cudf::column>> convert_to_rows_fixed_width_optimized(
  cudf::table_view const& tbl,
  // TODO need something for validity
  rmm::cuda_stream_view stream        = cudf::get_default_stream(),
  rmm::mr::device_memory_resource* mr = rmm::mr::get_current_device_resource());

std::vector<std::unique_ptr<cudf::column>> convert_to_rows(
  cudf::table_view const& tbl,
  // TODO need something for validity
  rmm::cuda_stream_view stream        = cudf::get_default_stream(),
  rmm::mr::device_memory_resource* mr = rmm::mr::get_current_device_resource());

std::unique_ptr<cudf::table> convert_from_rows_fixed_width_optimized(
  cudf::lists_column_view const& input,
  std::vector<cudf::data_type> const& schema,
  rmm::cuda_stream_view stream        = cudf::get_default_stream(),
  rmm::mr::device_memory_resource* mr = rmm::mr::get_current_device_resource());

std::unique_ptr<cudf::table> convert_from_rows(
  cudf::lists_column_view const& input,
  std::vector<cudf::data_type> const& schema,
  rmm::cuda_stream_view stream        = cudf::get_default_stream(),
  rmm::mr::device_memory_resource* mr = rmm::mr::get_current_device_resource());

//! @endcond
} // namespace cudf
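
The declarations above pair a columnar-to-row conversion (`convert_to_rows`) with its inverse (`convert_from_rows`), which needs a `schema` because the packed rows carry no type information. The following is a minimal host-side sketch of that round trip using only the standard library. It is not libcudf's implementation: the `Columns` struct, `row_size`, `to_rows`, and `from_rows` are hypothetical names, the two-column schema is hard-coded, and the packed layout (no padding, no validity bits) is an assumption made for illustration only.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <cstring>
#include <vector>

// Hypothetical column-major table with a fixed schema: (int32, double).
// Both vectors are assumed to have the same length.
struct Columns {
  std::vector<std::int32_t> ids;
  std::vector<double> scores;
};

// Bytes per packed row in this sketch: one int32 followed by one double.
// Assumption: no alignment padding and no validity bytes.
constexpr std::size_t row_size = sizeof(std::int32_t) + sizeof(double);

// Column-major -> row-major: copy each row's values into one contiguous buffer.
std::vector<std::uint8_t> to_rows(Columns const& cols)
{
  std::size_t const num_rows = cols.ids.size();
  std::vector<std::uint8_t> rows(num_rows * row_size);
  for (std::size_t i = 0; i < num_rows; ++i) {
    std::uint8_t* row = rows.data() + i * row_size;
    std::memcpy(row, &cols.ids[i], sizeof(std::int32_t));
    std::memcpy(row + sizeof(std::int32_t), &cols.scores[i], sizeof(double));
  }
  return rows;
}

// Row-major -> column-major: the fixed schema tells us how to slice each row.
Columns from_rows(std::vector<std::uint8_t> const& rows)
{
  std::size_t const num_rows = rows.size() / row_size;
  Columns cols;
  cols.ids.resize(num_rows);
  cols.scores.resize(num_rows);
  for (std::size_t i = 0; i < num_rows; ++i) {
    std::uint8_t const* row = rows.data() + i * row_size;
    std::memcpy(&cols.ids[i], row, sizeof(std::int32_t));
    std::memcpy(&cols.scores[i], row + sizeof(std::int32_t), sizeof(double));
  }
  return cols;
}
```

Calling `from_rows(to_rows(cols))` should reproduce the original columns. The libcudf functions declared above express the same round trip on device data, taking a `cudf::table_view` in one direction and a `cudf::lists_column_view` plus a `std::vector<cudf::data_type>` schema in the other.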


2 changes: 2 additions & 0 deletions cpp/tests/CMakeLists.txt
@@ -257,6 +257,8 @@ ConfigureTest(
transform/one_hot_encode_tests.cpp
)

ConfigureTest(ROW_CONVERSION_TEST transform/row_conversion.cpp)

# ##################################################################################################
# * interop tests -------------------------------------------------------------------------
ConfigureTest(