add microbatch to data platform configs #6588

Merged (9 commits) on Dec 5, 2024
7 changes: 4 additions & 3 deletions website/docs/reference/resource-configs/bigquery-configs.md
@@ -425,9 +425,10 @@ Please note that in order for policy tags to take effect, [column-level `persist

The [`incremental_strategy` config](/docs/build/incremental-strategy) controls how dbt builds incremental models. dbt uses a [merge statement](https://cloud.google.com/bigquery/docs/reference/standard-sql/dml-syntax) on BigQuery to refresh incremental tables.

- The `incremental_strategy` config can be set to one of two values:
- - `merge` (default)
- - `insert_overwrite`
+ The `incremental_strategy` config can be set to one of the following values:
+ - `merge` (default)
+ - `insert_overwrite`
+ - [`microbatch`](/docs/build/incremental-microbatch)
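
As a rough illustration of the new option, a BigQuery model using `microbatch` might be configured as in the sketch below. The model name `stg_events`, the column `event_occurred_at`, and the `begin` date are hypothetical. The `partition_by` entry reflects the assumption that BigQuery expects a partition whose granularity matches `batch_size`; check the linked microbatch page for the exact requirements.

```sql
-- Hypothetical model file, e.g. models/events_microbatch.sql
{{
    config(
        materialized='incremental',
        incremental_strategy='microbatch',
        event_time='event_occurred_at',  -- hypothetical timestamp column
        batch_size='day',
        begin='2024-01-01',              -- hypothetical start of the first batch
        lookback=1,
        -- Assumption: partition granularity is chosen to match batch_size
        partition_by={
            'field': 'event_occurred_at',
            'data_type': 'timestamp',
            'granularity': 'day'
        }
    )
}}

select *
from {{ ref('stg_events') }}  -- hypothetical upstream model
```

Unlike a typical `merge` model, this sketch has no `is_incremental()` filter. With microbatch, dbt derives each batch's time window from `event_time` and applies it to upstream models that also declare an `event_time`.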

### Performance and cost

@@ -11,6 +11,7 @@ In dbt-postgres, the following incremental materialization strategies are suppor
- `append` (default when `unique_key` is not defined)
- `merge`
- `delete+insert` (default when `unique_key` is defined)
+ - [`microbatch`](/docs/build/incremental-microbatch)

## Performance optimizations

@@ -17,6 +17,7 @@ In dbt-redshift, the following incremental materialization strategies are suppor
- `append` (default when `unique_key` is not defined)
- `merge`
- `delete+insert` (default when `unique_key` is defined)
+ - [`microbatch`](/docs/build/incremental-microbatch)

All of these strategies are inherited from dbt-postgres.

17 changes: 12 additions & 5 deletions website/docs/reference/resource-configs/snowflake-configs.md
@@ -38,11 +38,11 @@ flags:
The following configurations are supported.
For more information, check out the Snowflake reference for [`CREATE ICEBERG TABLE` (Snowflake as the catalog)](https://docs.snowflake.com/en/sql-reference/sql/create-iceberg-table-snowflake).

- | Field | Type | Required | Description | Sample input | Note |
- | --------------------- | ------ | -------- | -------------------------------------------------------------------------------------------------------------------------- | ------------------------- | ------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------- |
- | Table Format | String | Yes | Configures the objects table format. | `iceberg` | `iceberg` is the only accepted value. |
+ | Field | Type | Required | Description | Sample input | Note |
+ | ------ | ----- | -------- | ------------- | ------------ | ------ |
+ | Table Format | String | Yes | Configures the objects table format. | `iceberg` | `iceberg` is the only accepted value. |
| External volume | String | Yes(*) | Specifies the identifier (name) of the external volume where Snowflake writes the Iceberg table's metadata and data files. | `my_s3_bucket` | *You don't need to specify this if the account, database, or schema already has an associated external volume. [More info](https://docs.snowflake.com/en/sql-reference/sql/create-iceberg-table-snowflake#:~:text=Snowflake%20Table%20Structures.-,external_volume) |
- | Base location Subpath | String | No | An optional suffix to add to the `base_location` path that dbt automatically specifies. | `jaffle_marketing_folder` | We recommend that you do not specify this. Modifying this parameter results in a new Iceberg table. See [Base Location](#base-location) for more info. |
+ | Base location Subpath | String | No | An optional suffix to add to the `base_location` path that dbt automatically specifies. | `jaffle_marketing_folder` | We recommend that you do not specify this. Modifying this parameter results in a new Iceberg table. See [Base Location](#base-location) for more info. |

### Example configuration

@@ -470,8 +470,15 @@ In this example, you can set up a query tag to be applied to every query with th

The [`incremental_strategy` config](/docs/build/incremental-strategy) controls how dbt builds incremental models. By default, dbt will use a [merge statement](https://docs.snowflake.net/manuals/sql-reference/sql/merge.html) on Snowflake to refresh incremental tables.

+ Snowflake supports the following incremental strategies:
+ - Merge (default)
+ - Append
+ - Delete+insert
+ - [`microbatch`](/docs/build/incremental-microbatch)
+
Snowflake's `merge` statement fails with a "nondeterministic merge" error if the `unique_key` specified in your model config is not actually unique. If you encounter this error, you can instruct dbt to use a two-step incremental approach by setting the `incremental_strategy` config for your model to `delete+insert`.
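
A minimal sketch of that switch is shown below, assuming a hypothetical upstream model `stg_orders` with hypothetical columns `order_id` and `updated_at`:

```sql
-- Hypothetical model file, e.g. models/orders_incremental.sql
{{
    config(
        materialized='incremental',
        incremental_strategy='delete+insert',  -- two-step approach instead of a single merge
        unique_key='order_id'                  -- hypothetical key column
    )
}}

select *
from {{ ref('stg_orders') }}  -- hypothetical upstream model

{% if is_incremental() %}
-- On incremental runs, only process rows newer than what is already in the target table
where updated_at > (select max(updated_at) from {{ this }})
{% endif %}
```

With `delete+insert`, dbt first deletes target rows whose `unique_key` matches the new data and then inserts the new rows, so the run does not depend on Snowflake's `merge` statement. Duplicate keys in the new data are still inserted as-is, so this avoids the error rather than deduplicating the source.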


## Configuring table clustering

dbt supports [table clustering](https://docs.snowflake.net/manuals/user-guide/tables-clustering-keys.html) on Snowflake. To control clustering for a <Term id="table" /> or incremental model, use the `cluster_by` config. When this configuration is applied, dbt will do two things:
@@ -701,4 +708,4 @@ flags:

```

- </VersionBlock>
+ </VersionBlock>
3 changes: 2 additions & 1 deletion website/docs/reference/resource-configs/spark-configs.md
@@ -37,7 +37,8 @@ For that reason, the dbt-spark plugin leans heavily on the [`incremental_strateg
- **`append`** (default): Insert new records without updating or overwriting any existing data.
- **`insert_overwrite`**: If `partition_by` is specified, overwrite partitions in the <Term id="table" /> with new data. If no `partition_by` is specified, overwrite the entire table with new data.
- **`merge`** (Delta, Iceberg and Hudi file format only): Match records based on a `unique_key`; update old records, insert new ones. (If no `unique_key` is specified, all new data is inserted, similar to `append`.)
-
+ - `microbatch` Implements the [microbatch strategy](/docs/build/incremental-microbatch) using `event_time` to define time-based ranges for filtering data.
+
Each of these strategies has its pros and cons, which we'll discuss below. As with any model config, `incremental_strategy` may be specified in `dbt_project.yml` or within a model file's `config()` block.
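
For instance, setting the strategy inside a model file's `config()` block could look like the sketch below. It uses `merge` with the Delta file format; the model name `stg_users` and the key `user_id` are hypothetical.

```sql
-- Hypothetical model file, e.g. models/users_merge.sql
{{
    config(
        materialized='incremental',
        incremental_strategy='merge',
        file_format='delta',   -- merge is limited to Delta, Iceberg, and Hudi per the list above
        unique_key='user_id'   -- hypothetical key; without it, new rows are simply inserted
    )
}}

select *
from {{ ref('stg_users') }}  -- hypothetical upstream model
```

The same keys can instead be set once for a folder of models in `dbt_project.yml`, which is convenient when many models share a strategy.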

### The `append` strategy