-
Notifications
You must be signed in to change notification settings - Fork 3.7k
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Fix json inputs for drill windowing tests #15148
Conversation
@imply-cheddar , @rohangarg could you please take a look? |
*/ | ||
@Nullable | ||
private static Object convertField(Group g, String fieldName, boolean binaryAsString) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Just double checking, this binaryAsString
argument. That was being passed around but never actually used?
Any changes you want to make to ParquetToJson
are good, that code is just for our own dev purposes, but these changes in ParquetGroupConverter
is changing the code that our current parquet-based file ingestions end up using. So, I just want to double check that this cleanup is truly a redundant argument and not just something that wasn't needed for ParquetToJson
but used for the other production code paths.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
binaryAsString
was passed around in private static
methods in a class on which an instance method
was called first....so I've choosen to remove them (and use the implict class access - to get it where needed) ; instead of adding another boolean to every static method
|
||
public ParquetGroupConverter(boolean binaryAsString) | ||
public ParquetGroupConverter(boolean binaryAsString, boolean convertCorruptDates) | ||
{ | ||
this.binaryAsString = binaryAsString; |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
is binaryAsString
still used somewhere?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
yes; it should be used around here
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This is a design nit: with the change to remove the static
label from the methods, I would expect them to start coming after the constructor. That is, we tend to follow the code flow of static first, then constructor then class methods. I noticed that it was inverted because I kept searching for the constructor at the top of the file and didn't see it, and then realized it was at the bottom and that's because the methods used to be static but now are not.
The current structure reads really nicely for the diff though, I hope that the whitespace change that I'm nit picking doesn't screw up the diff...
Thank you @imply-cheddar and @abhishekagarwal87 for reviewing and merging the changes! |
This PR: adds a flag to JsonToParquet to do the fix during conversion updates the json files to more correct conents some resultset mismatches were fixed by this updates parquet to 1.13.1 (cherry picked from commit 9fb0dbf)
This PR: adds a flag to JsonToParquet to do the fix during conversion updates the json files to more correct conents some resultset mismatches were fixed by this updates parquet to 1.13.1 (cherry picked from commit 9fb0dbf)
This PR: adds a flag to JsonToParquet to do the fix during conversion updates the json files to more correct conents some resultset mismatches were fixed by this updates parquet to 1.13.1
Apparently Drill had some issue with parquet files up to 1.10 ; they have added a feature to fix those corruptions.
Unfortunately - we are working with parquet files which do had that issue...
This PR:
JsonToParquet
to do the fix during conversion