[WIP] Add support of HiveTableScan TextHive #709
Signed-off-by: Ahmed Hussein (amahussein) <[email protected]>
@nartal1
You are right. Currently the tool looks for "ReadSchema", which is present in the eventlogs for the ORC and Parquet file formats, and we try to get the schema from it as well. IIRC, the schema is not present in the eventlogs for Scan Hive. We need to update that function to support the Scan Hive file formats.
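To make the gap concrete, here is a minimal sketch of that kind of lookup. It assumes the scan node description carries a `ReadSchema: struct<...>` entry as its last field; `ReadSchemaSketch` and `extractReadSchema` are hypothetical names for illustration, not the tool's actual function.

```scala
// Sketch only: hypothetical helper, not the function used by the tool.
object ReadSchemaSketch {
  // Captures the "struct<...>" that follows "ReadSchema: " in a node description.
  // The match is greedy, so it assumes ReadSchema is the last entry on the line.
  private val ReadSchemaRe = """ReadSchema: (struct<.*>)""".r

  // Returns the schema string if present, None otherwise
  // (the case described above for Scan Hive nodes).
  def extractReadSchema(nodeDesc: String): Option[String] =
    ReadSchemaRe.findFirstMatchIn(nodeDesc).map(_.group(1))
}
```

A description without a `ReadSchema` entry yields `None`, so a Scan Hive node would need a separate code path to report its format.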
"HiveParquet"),
HiveScanSerdeClasses("org.apache.hadoop.hive.serde2.avro.AvroSerDe", "HiveAvro"),
HiveScanSerdeClasses("org.apache.hadoop.hive.serde2.OpenCSVSerde", "HiveCSV"),
HiveScanSerdeClasses("org.apache.hadoop.hive.ql.io.orc.OrcSerde", "HiveORC")
I wonder whether these are the only SerDe classes in use, or whether a different class could be used for HiveParquet, HiveORC, etc. I understand that we cannot cover every class, but it would be good to document it if that is the case.
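One way to make that limitation explicit is a lookup that falls back to a catch-all label for unlisted classes. This is a rough sketch, not the PR's code: the case class below is a stand-in with assumed field names, `formatFor` and the `"HiveUnknown"` label are hypothetical, and the Parquet entry is left out because its class name is not visible in the snippet above.

```scala
// Sketch only: stand-in definitions, not the tool's actual types.
object HiveSerdeLookupSketch {
  // Assumed shape: SerDe class name paired with the reported format label.
  final case class HiveScanSerdeClasses(className: String, format: String)

  // The three complete entries visible in the diff.
  val known: Seq[HiveScanSerdeClasses] = Seq(
    HiveScanSerdeClasses("org.apache.hadoop.hive.serde2.avro.AvroSerDe", "HiveAvro"),
    HiveScanSerdeClasses("org.apache.hadoop.hive.serde2.OpenCSVSerde", "HiveCSV"),
    HiveScanSerdeClasses("org.apache.hadoop.hive.ql.io.orc.OrcSerde", "HiveORC")
  )

  // Exact match on the class name; anything not listed maps to a
  // hypothetical "HiveUnknown" label instead of being silently dropped.
  def formatFor(serdeClass: String): String =
    known.find(_.className == serdeClass).map(_.format).getOrElse("HiveUnknown")
}
```

With a fallback like this, an unlisted SerDe class still shows up in the output and can be documented as unsupported.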
#723 supersedes this PR.
Fixes #681