Fix offset value for generating test data in `parquet_chunked_reader_test.cu` (#15200)

In `parquet_chunked_reader_test.cu`, the code that generates test data uses an `offset` value that should increase at every iteration. It shifts the null positions of each column so that the generated table does not have all of its nulls in the same rows. The value was left unchanged across all iterations; this change increments it once per column.
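
To illustrate why the increment matters, here is a minimal, self-contained sketch in plain C++ (no cudf). It assumes a validity pattern in which row `i` is null when `(i + offset) % period == 0`; that pattern and the names `make_validity`, `make_table_validity`, and `count_all_null_rows` are hypothetical and are not taken from the test file, which builds its bitmask with cudf utilities.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Hypothetical stand-in for the test's deterministic validity pattern:
// row i is null (false) when (i + offset) is a multiple of `period`.
std::vector<bool> make_validity(std::size_t num_rows, std::size_t offset, std::size_t period)
{
  assert(period > 0);
  std::vector<bool> valid(num_rows);
  for (std::size_t i = 0; i < num_rows; ++i) {
    valid[i] = (i + offset) % period != 0;
  }
  return valid;
}

// Build one validity mask per column. With `increment_offset == false`, every
// column receives the same mask, which is the situation the commit describes.
std::vector<std::vector<bool>> make_table_validity(std::size_t num_cols,
                                                   std::size_t num_rows,
                                                   std::size_t period,
                                                   bool increment_offset)
{
  std::vector<std::vector<bool>> masks;
  std::size_t offset = 0;
  for (std::size_t c = 0; c < num_cols; ++c) {
    masks.push_back(make_validity(num_rows, offset, period));
    if (increment_offset) { ++offset; }  // shift nulls of the next column by one row
  }
  return masks;
}

// Count the rows in which every column is null.
std::size_t count_all_null_rows(std::vector<std::vector<bool>> const& masks)
{
  if (masks.empty()) { return 0; }
  std::size_t count = 0;
  for (std::size_t r = 0; r < masks.front().size(); ++r) {
    bool all_null = true;
    for (auto const& mask : masks) {
      if (mask[r]) { all_null = false; }
    }
    if (all_null) { ++count; }
  }
  return count;
}
```

Under these assumptions, three columns of six rows with `period = 2` share the null rows 0, 2, and 4 when the offset never changes, and share no null row once the offset is incremented per column.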

Authors:
  - Nghia Truong (https://github.com/ttnghia)
  - Karthikeyan (https://github.com/karthikeyann)

Approvers:
  - https://github.com/nvdbaranec
  - Bradley Dice (https://github.com/bdice)
  - Karthikeyan (https://github.com/karthikeyann)

URL: #15200
ttnghia authored Mar 19, 2024
1 parent 4a5fab7 commit ea40596
Showing 1 changed file with 5 additions and 3 deletions.
8 changes: 5 additions & 3 deletions cpp/tests/io/parquet_chunked_reader_test.cu
@@ -66,8 +66,6 @@ auto write_file(std::vector<std::unique_ptr<cudf::column>>& input_columns,
std::size_t max_page_size_bytes = cudf::io::default_max_page_size_bytes,
std::size_t max_page_size_rows = cudf::io::default_max_page_size_rows)
{
- // Just shift nulls of the next column by one position to avoid having all nulls in the same
- // table rows.
if (nullable) {
// Generate deterministic bitmask instead of random bitmask for easy computation of data size.
auto const valid_iter = cudf::detail::make_counting_transform_iterator(
@@ -83,6 +81,10 @@ auto write_file(std::vector<std::unique_ptr<cudf::column>>& input_columns,
std::move(col),
cudf::get_default_stream(),
rmm::mr::get_current_device_resource());

+ // Shift nulls of the next column by one position, to avoid having all nulls
+ // in the same table rows.
+ ++offset;
}
}

@@ -988,7 +990,7 @@ TEST_F(ParquetChunkedReaderTest, TestChunkedReadWithListsOfStructs)

{
auto const [result, num_chunks] = chunked_read(filepath_with_nulls, 1'500'000);
- EXPECT_EQ(num_chunks, 4);
+ EXPECT_EQ(num_chunks, 5);
CUDF_TEST_EXPECT_TABLES_EQUAL(*expected_with_nulls, *result);
}
