JSON reader validation of values #15968
…_spark_validation
* change device_uvector to device_buffer
* update tests
---------
Co-authored-by: Karthikeyan Natarajan <[email protected]>
Co-authored-by: Nghia Truong <[email protected]>
@@ -1683,6 +1696,10 @@ JNIEXPORT jlong JNICALL Java_ai_rapids_cudf_Table_readAndInferJSON(JNIEnv* env,
.recovery_mode(recovery_mode)
.normalize_single_quotes(static_cast<bool>(normalize_single_quotes))
.normalize_whitespace(static_cast<bool>(normalize_whitespace))
.strict_validation(strict_validation)
This code has to change now for the tests to pass. When strict_validation is disabled, we must not set
.numeric_leading_zeros(allow_leading_zeros)
.nonnumeric_numbers(allow_nonnumeric_numbers)
.unquoted_control_chars(allow_unquoted_control)
at all. This applies to every API in this file, because of the assertion that was just added.
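The pattern the comment asks for can be sketched roughly as follows. This is a minimal, self-contained illustration and not the cudf code: `mock_json_options`, `mock_json_options_builder`, and `make_options` are hypothetical stand-ins, the default values are assumptions, and the exception thrown here only imitates the assertion mentioned above.

```cpp
#include <cassert>
#include <stdexcept>
#include <string>

// Hypothetical options struct; field defaults are assumptions for this sketch.
struct mock_json_options {
  bool strict_validation      = false;
  bool numeric_leading_zeros  = true;
  bool nonnumeric_numbers     = true;
  bool unquoted_control_chars = true;
};

// Hypothetical builder (NOT the cudf class). Its sub-option setters reject
// calls while strict validation is off, imitating the newly added assertion.
class mock_json_options_builder {
  mock_json_options opts_;

  void require_strict(char const* name) const
  {
    if (!opts_.strict_validation) {
      throw std::logic_error(std::string(name) + " requires strict_validation");
    }
  }

 public:
  mock_json_options_builder& strict_validation(bool v)
  {
    opts_.strict_validation = v;
    return *this;
  }
  mock_json_options_builder& numeric_leading_zeros(bool v)
  {
    require_strict("numeric_leading_zeros");
    opts_.numeric_leading_zeros = v;
    return *this;
  }
  mock_json_options_builder& nonnumeric_numbers(bool v)
  {
    require_strict("nonnumeric_numbers");
    opts_.nonnumeric_numbers = v;
    return *this;
  }
  mock_json_options_builder& unquoted_control_chars(bool v)
  {
    require_strict("unquoted_control_chars");
    opts_.unquoted_control_chars = v;
    return *this;
  }
  mock_json_options build() const { return opts_; }
};

// Set the validation sub-options only when strict validation is enabled;
// otherwise leave them untouched so the setters' check never fires.
mock_json_options make_options(bool strict,
                               bool allow_leading_zeros,
                               bool allow_nonnumeric_numbers,
                               bool allow_unquoted_control)
{
  mock_json_options_builder builder;
  builder.strict_validation(strict);
  if (strict) {
    builder.numeric_leading_zeros(allow_leading_zeros)
      .nonnumeric_numbers(allow_nonnumeric_numbers)
      .unquoted_control_chars(allow_unquoted_control);
  }
  return builder.build();
}
```

With this shape, a call such as `make_options(false, false, false, false)` never touches the sub-option setters, so the guard cannot trip when strict validation is off.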
fixed.
/merge
Description
Addresses part of #15222
This change adds a validation stage to the JSON reader at the token level. If any validation fails in a row, the entire row is set to null.
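As a rough CPU-side illustration of the "one bad token nulls the whole row" behavior, consider the sketch below. It is not how the PR implements validation: the `row` type, `is_valid_value_token`, `validate_rows`, and the single leading-zero rule are all hypothetical simplifications chosen for this example.

```cpp
#include <cassert>
#include <cctype>
#include <cstddef>
#include <optional>
#include <string>
#include <vector>

// A row is modeled as the list of its value tokens (an assumption of this sketch).
using row = std::vector<std::string>;

// Hypothetical rule: reject numeric tokens with a leading zero followed by
// another digit, such as "007" or "-01". "0", "-0" and "0.5" remain valid.
bool is_valid_value_token(std::string const& tok)
{
  std::size_t const i = (!tok.empty() && tok[0] == '-') ? 1 : 0;
  bool const starts_with_zero = i < tok.size() && tok[i] == '0';
  bool const digit_follows =
    i + 1 < tok.size() && std::isdigit(static_cast<unsigned char>(tok[i + 1])) != 0;
  return !(starts_with_zero && digit_follows);
}

// Validate every token of every row. A row with at least one invalid token
// is replaced by std::nullopt, i.e. the entire row becomes null.
std::vector<std::optional<row>> validate_rows(std::vector<row> const& rows)
{
  std::vector<std::optional<row>> out;
  out.reserve(rows.size());
  for (auto const& r : rows) {
    bool ok = true;
    for (auto const& tok : r) {
      if (!is_valid_value_token(tok)) {
        ok = false;
        break;
      }
    }
    if (ok) {
      out.push_back(r);
    } else {
      out.push_back(std::nullopt);
    }
  }
  return out;
}
```

Valid rows pass through unchanged, so a caller can tell a nulled row apart from an empty one by checking `has_value()`.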
thrust::tabulate_output_iterator (NVIDIA/cccl#2282)
Checklist