Commit
Fixed an issue with output chunking computation stemming from input chunking. (rapidsai#14889)

Fixes rapidsai#14883

The core issue was that the output chunking code was expecting all columns to have terminating pages that end in the same row count. Previously this was the case because we always processed entire row groups. But now with the subrowgroup reader, we can split on page boundaries that cause a jagged max row index for different columns.

Example:
```
        0          100             200
Col A   [-----------][--------------]      300
Col B   [-----------][----------------------]
```

The input chunking would have computed a max row index of 200 for the subpass. But when computing the _output_ chunks, there was code that would have tried finding where row 300 was in column A, resulting in an out-of-bounds read. The fix is simply to cap the max row seen for column B to be the max expected row for the subpass.

Authors:
  - https://github.com/nvdbaranec

Approvers:
  - Nghia Truong (https://github.com/ttnghia)
  - Vukasin Milovanovic (https://github.com/vuule)
  - Mike Wilson (https://github.com/hyperbolic2346)

URL: rapidsai#14889
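
The capping described above can be illustrated with a small Python model. This is only a sketch, not the actual reader code: the names `page_index_for_row` and `capped_end_rows`, the list-of-cumulative-end-rows representation, and the choice to derive the subpass limit as the smallest terminating row are all assumptions made for illustration.

```python
from bisect import bisect_left

def page_index_for_row(page_end_rows, row):
    """Return the index of the first page whose cumulative end row is >= row.

    Hypothetical helper: raises IndexError where the real code would have
    read out of bounds.
    """
    idx = bisect_left(page_end_rows, row)
    if idx >= len(page_end_rows):
        raise IndexError(f"row {row} is past the last page (ends at {page_end_rows[-1]})")
    return idx

def capped_end_rows(columns, subpass_max_row):
    """Cap every page end row at the max row expected for the subpass."""
    return {name: [min(end, subpass_max_row) for end in ends]
            for name, ends in columns.items()}

# Cumulative end rows per page, matching the example: A ends at 200, B at 300.
columns = {"A": [100, 200], "B": [100, 300]}

# Assumption: the subpass covers only rows every column can supply.
subpass_max_row = min(ends[-1] for ends in columns.values())

# Without the cap, the largest row seen is 300, which column A cannot locate.
uncapped_max = max(ends[-1] for ends in columns.values())
try:
    page_index_for_row(columns["A"], uncapped_max)
except IndexError as err:
    print("uncapped lookup fails:", err)

# With the cap, the largest row seen is 200 and the lookup stays in bounds.
capped = capped_end_rows(columns, subpass_max_row)
capped_max = max(ends[-1] for ends in capped.values())
print("capped lookup finds page", page_index_for_row(columns["A"], capped_max))
```

In this model the cap touches only column B's final page, which mirrors the jagged boundary in the diagram.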