Add enable_list_inference and intermediate_format as new BQ configurations for 1.7 #5452

Merged
merged 7 commits into from
May 16, 2024
21 changes: 21 additions & 0 deletions website/docs/reference/resource-configs/bigquery-configs.md
@@ -902,3 +902,24 @@ As with most data platforms, there are limitations associated with materialized
Find more information about materialized view limitations in Google's BigQuery [docs](https://cloud.google.com/bigquery/docs/materialized-views-intro#limitations).

</VersionBlock>

<VersionBlock firstVersion="1.9">

## Python models

The BigQuery adapter supports Python models with the following additional configuration parameters:

| Parameter | Type | Required | Default | Valid values |
|-------------------------|-------------|----------|-----------|------------------|
| `enable_list_inference` | `<boolean>` | no | `True` | `True`, `False` |
| `intermediate_format` | `<string>` | no | `parquet` | `parquet`, `orc` |

### The `enable_list_inference` parameter
The `enable_list_inference` parameter enables a PySpark DataFrame to read multiple records in the same operation.
By default, this is set to `True` to support the default `intermediate_format` of `parquet`.

### The `intermediate_format` parameter
The `intermediate_format` parameter specifies which file format to use when writing records to a table. The default is `parquet`.
This parameter became configurable when the default write method changed from `direct` to `indirect` to support partitioning and clustering.
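
As an illustration of the two parameters above, here is a minimal sketch of a Python model that sets both explicitly. It assumes they can be passed through `dbt.config()` like other model configurations; the upstream model name `my_upstream_model` is hypothetical and not part of this change.

```python
def model(dbt, session):
    # Assumption: both parameters are accepted as ordinary model configs.
    # The values shown are the documented defaults, written out explicitly.
    dbt.config(
        materialized="table",
        enable_list_inference=True,
        intermediate_format="parquet",
    )

    # "my_upstream_model" is a hypothetical upstream model name.
    df = dbt.ref("my_upstream_model")

    # Return the data frame; the adapter writes it out using the
    # configured intermediate format.
    return df
```

To try the other valid value from the table, change `intermediate_format` to `"orc"`.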

</VersionBlock>