Adding assertion to check for regular JSON inputs of size greater than INT_MAX bytes #17057

Merged 6 commits on Oct 14, 2024


Contributor

@shrshi shrshi commented Oct 10, 2024

Description

Addresses #17017

libcudf does not support parsing regular JSON inputs larger than INT_MAX bytes. This PR adds an assertion that rejects such inputs. Note that the batched reader can only be used for JSON Lines inputs, so it is not an option for regular JSON.
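For illustration only, the size check described above could look roughly like the following standalone sketch. The function names `exceeds_regular_json_limit` and `check_regular_json_size` and the use of `std::invalid_argument` are assumptions made for this example; they are not taken from the libcudf sources, and the actual change in `read_json.cu` may differ.

```cpp
#include <climits>
#include <cstddef>
#include <stdexcept>

// Hypothetical helper (not the libcudf implementation): returns true when the
// input is too large to be parsed as regular (non-lines) JSON.
constexpr bool exceeds_regular_json_limit(std::size_t total_bytes, bool is_json_lines)
{
  // Assumption: JSON Lines inputs are exempt because they can be handed to
  // the batched reader instead.
  return !is_json_lines && total_bytes > static_cast<std::size_t>(INT_MAX);
}

// Hypothetical wrapper that fails loudly, in the spirit of an assertion.
inline void check_regular_json_size(std::size_t total_bytes, bool is_json_lines)
{
  if (exceeds_regular_json_limit(total_bytes, is_json_lines)) {
    throw std::invalid_argument(
      "Regular JSON inputs larger than INT_MAX bytes are not supported");
  }
}
```

In this sketch an input of exactly INT_MAX bytes is still accepted; only strictly larger regular JSON inputs are rejected.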

Checklist

  • I am familiar with the Contributing Guidelines.
  • New or existing tests cover these changes.
  • The documentation is up to date with these changes.

@github-actions github-actions bot added the libcudf Affects libcudf (C++/CUDA) code. label Oct 10, 2024
@shrshi shrshi added bug Something isn't working cuIO cuIO issue non-breaking Non-breaking change labels Oct 10, 2024
@shrshi shrshi marked this pull request as ready for review October 11, 2024 00:49
@shrshi shrshi requested a review from a team as a code owner October 11, 2024 00:49
Member

@mhaseeb123 mhaseeb123 left a comment

Minor comment, may need style fix. LGTM otherwise!

Review thread on cpp/src/io/json/read_json.cu (outdated, resolved)
Contributor

@karthikeyann karthikeyann left a comment

Does check_input_size need an update as well? It uses std::numeric_limits<cudf::io::json::SymbolOffsetT>::max(), and SymbolOffsetT is uint32_t.

Contributor Author

shrshi commented Oct 11, 2024

Does check_input_size need an update as well? It uses std::numeric_limits<cudf::io::json::SymbolOffsetT>::max(), and SymbolOffsetT is uint32_t.

That's correct, I've updated check_input_size to make it consistent with the check in read_json.
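To make the mismatch raised in the review concrete, here is a small standalone sketch comparing the two limits. The alias `SymbolOffsetT = std::uint32_t` follows the reviewer's comment; the predicate names and the exact comparison operators are assumptions for illustration, since the body of check_input_size is not shown in this thread.

```cpp
#include <climits>
#include <cstdint>
#include <limits>

// Alias assumed from the review comment ("aka uint32_t").
using SymbolOffsetT = std::uint32_t;

// The two limits under discussion, widened to 64 bits for comparison.
constexpr std::uint64_t offset_limit = std::numeric_limits<SymbolOffsetT>::max();
constexpr std::uint64_t int_limit    = static_cast<std::uint64_t>(INT_MAX);

// Hypothetical predicates standing in for the two size checks.
constexpr bool passes_offset_check(std::uint64_t n) { return n <= offset_limit; }
constexpr bool passes_int_check(std::uint64_t n) { return n <= int_limit; }

// A 3 GiB input fits in uint32_t but exceeds INT_MAX, so it would pass a
// check based on SymbolOffsetT while failing one based on INT_MAX.
constexpr std::uint64_t three_gib = 3ULL << 30;
static_assert(passes_offset_check(three_gib), "3 GiB fits in uint32_t");
static_assert(!passes_int_check(three_gib), "3 GiB exceeds INT_MAX");
```

Any size strictly between INT_MAX and the uint32_t maximum falls in this gap, which is why using the same bound in both places keeps the checks consistent.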

@shrshi shrshi mentioned this pull request Oct 11, 2024
3 tasks
Contributor Author

shrshi commented Oct 14, 2024

/merge

@rapids-bot rapids-bot bot merged commit 319ec3b into rapidsai:branch-24.12 Oct 14, 2024
102 checks passed
Labels
bug Something isn't working cuIO cuIO issue libcudf Affects libcudf (C++/CUDA) code. non-breaking Non-breaking change

4 participants