Jklamer/avro block builder system #1

Closed
wants to merge 135 commits into from

Conversation

jklamer
Owner

@jklamer jklamer commented Apr 22, 2024

Description

Additional context and related issues

Release notes

( ) This is not user-visible or is docs only, and no release notes are required.
( ) Release notes are required. Please propose a release note for me.
( ) Release notes are required, with the following suggested text:

# Section
* Fix some things. ({issue}`issuenumber`)

@jklamer jklamer force-pushed the jklamer/AvroBlockBuilderSystem branch 3 times, most recently from 5ee2b6e to 32bcab0 Compare April 22, 2024 19:47
@jklamer jklamer force-pushed the jklamer/AvroBlockBuilderSystem branch 4 times, most recently from 37819b5 to 2733772 Compare May 7, 2024 19:31
@jklamer jklamer force-pushed the jklamer/AvroBlockBuilderSystem branch from 2733772 to 9e8b2d2 Compare May 30, 2024 15:56
findepi and others added 19 commits June 6, 2024 11:17
Older versions of Hive don't support the Float/Real type for the Parquet table format, but Hive 3.0+
has no such restriction.
This coercion works for partitioned tables in all formats; for unpartitioned tables
it works for the ORC and Parquet table formats.
It is rare but possible to get empty spooling output stats for a task that completed successfully.
This may happen if we observe the FINISHED task state based on a received TaskStatus but are later unable to
successfully retrieve the TaskInfo. In such a case we build the final TaskInfo from the last known TaskInfo, just
updating the taskState field, so the spooling output stats will not be present.
As we need this information in FTE mode, we need to fail such a task artificially.
Some BI tools don't pass a `catalog` when calling the `DatabaseMetaData`
`getTables`, `getColumns` and `getSchemas` methods. This makes the JDBC
driver search across all catalogs, which can be expensive.

This commit introduces a new boolean connection property
`assumeNullCatalogMeansCurrentCatalog` (disabled by default) to be used
with such BI tools. If enabled, the driver will try to use the current
`catalog` of the JDBC connection when fetching Trino metadata through
`getTables`, `getColumns` or `getSchemas` if the `catalog` argument to
those methods is passed as `null`.

Co-authored-by: Rafał Połomka <[email protected]>
Co-authored-by: Ashhar Hasan <[email protected]>
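
For illustration, here is a minimal sketch of how a client might enable the property described in the commit message above, and of the fallback it describes. Only the property name `assumeNullCatalogMeansCurrentCatalog` comes from the commit message. The class name, the `effectiveCatalog` helper, the user name and the catalog name are hypothetical, and the helper is a simplified model of the described behavior, not the driver's actual code.

```java
import java.util.Properties;

public class NullCatalogExample {
    // Property name taken from the commit message; everything else here is illustrative.
    public static final String PROPERTY = "assumeNullCatalogMeansCurrentCatalog";

    // Builds JDBC connection properties with the new flag switched on.
    // The user name is a placeholder supplied by the caller.
    public static Properties buildProperties(String user) {
        Properties props = new Properties();
        props.setProperty("user", user);
        props.setProperty(PROPERTY, "true"); // the property is disabled by default
        return props;
    }

    // Hypothetical model of the described fallback: a null catalog argument
    // resolves to the connection's current catalog only when the flag is enabled.
    public static String effectiveCatalog(String requested, String current, boolean assumeNullMeansCurrent) {
        if (requested == null && assumeNullMeansCurrent) {
            return current;
        }
        return requested;
    }

    public static void main(String[] args) {
        Properties props = buildProperties("example-user");
        boolean enabled = Boolean.parseBoolean(props.getProperty(PROPERTY));

        // "hive" stands in for whatever catalog the connection currently uses.
        System.out.println(effectiveCatalog(null, "hive", enabled));
        System.out.println(effectiveCatalog(null, "hive", false));
        System.out.println(effectiveCatalog("tpch", "hive", enabled));
    }
}
```

With the Trino JDBC driver on the classpath, the same `Properties` object could be passed to `DriverManager.getConnection` together with a `jdbc:trino://host:port/catalog` URL. The sketch stops short of opening a connection so that it runs without a server.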