User-defined custom incremental strategies (#4716)
[Preview](https://docs-getdbt-com-git-dbeatty-custom-incremental-d92d96-dbt-labs.vercel.app/docs/build/incremental-models#custom-strategies)

## What are you changing in this pull request and why?

This addresses the "**For end users**" portion of
#1761.

The feature request in dbt-labs/dbt-core#5245
describes the value proposition as well as the previous and new
behavior:

#### Functional Requirement
- Advanced users that wish to specify a custom incremental strategy must
be able to do so.

#### Previous behavior
- Advanced dbt users who wished to specify a custom incremental strategy
had to override the same boilerplate Jinja macro by copying and pasting it into
their dbt project.

#### New behavior
- Advanced dbt users who wish to specify a custom incremental strategy
will only need to create a macro that conforms to the naming convention
`get_incremental_NAME_sql` that produces the correct SQL for the target
warehouse.

## Also

To address the questions raised in
dbt-labs/dbt-core#8769, we also want to
document how to utilize custom incremental macros that come from a
package.

For example, to use the `merge_null_safe` custom incremental strategy
from the `example` package, first [install the
package](/build/packages#how-do-i-add-a-package-to-my-project), then add
this macro to your project:

```sql
{% macro get_incremental_merge_null_safe_sql(arg_dict) %}
    {% do return(example.get_incremental_merge_null_safe_sql(arg_dict)) %}
{% endmacro %}
```

## 🎩 

<img width="503" alt="image"
src="https://github.com/dbt-labs/docs.getdbt.com/assets/44704949/51c3266e-e3fb-49bd-9428-7c43920a5412">

## Checklist
- [x] Review the [Content style
guide](https://github.com/dbt-labs/docs.getdbt.com/blob/current/contributing/content-style-guide.md)
so my content adheres to these guidelines.
- [x] For [docs
versioning](https://github.com/dbt-labs/docs.getdbt.com/blob/current/contributing/single-sourcing-content.md#about-versioning),
review how to [version a whole
page](https://github.com/dbt-labs/docs.getdbt.com/blob/current/contributing/single-sourcing-content.md#adding-a-new-version)
and [version a block of
content](https://github.com/dbt-labs/docs.getdbt.com/blob/current/contributing/single-sourcing-content.md#versioning-blocks-of-content).
mirnawong1 authored Jan 8, 2024
2 parents 09ae55f + 89bc5e1 commit fdfffe8
Showing 1 changed file with 125 additions and 1 deletion.
126 changes: 125 additions & 1 deletion website/docs/docs/build/incremental-models.md
@@ -236,7 +236,7 @@ Instead, whenever the logic of your incremental changes, execute a full-refresh

## About `incremental_strategy`

There are various ways (strategies) to implement the concept of an incremental materializations. The value of each strategy depends on:
There are various ways (strategies) to implement the concept of incremental materializations. The value of each strategy depends on:

* the volume of data,
* the reliability of your `unique_key`, and
@@ -450,5 +450,129 @@ The syntax depends on how you configure your `incremental_strategy`:

</VersionBlock>

### Built-in strategies

Before diving into [custom strategies](#custom-strategies), it's important to understand the built-in incremental strategies in dbt and their corresponding macros:

| `incremental_strategy` | Corresponding macro |
|------------------------|----------------------------------------|
| `append` | `get_incremental_append_sql` |
| `delete+insert` | `get_incremental_delete_insert_sql` |
| `merge` | `get_incremental_merge_sql` |
| `insert_overwrite` | `get_incremental_insert_overwrite_sql` |
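
The macro names in this table follow a single pattern: `get_incremental_`, then the strategy name with `+` replaced by `_`, then `_sql`. A minimal sketch of that pattern as a Jinja macro; `incremental_macro_name` is a hypothetical helper written for illustration, not something dbt provides:

```sql
{# Hypothetical helper (not part of dbt): builds the macro name for a strategy. #}
{% macro incremental_macro_name(strategy) %}
    {# Replace "+" with "_" so that "delete+insert" becomes "delete_insert". #}
    {% do return("get_incremental_" ~ (strategy | replace("+", "_")) ~ "_sql") %}
{% endmacro %}
```

Under this sketch, `incremental_macro_name("delete+insert")` returns `get_incremental_delete_insert_sql`, matching the table.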


For example, the built-in `append` strategy can be defined and used with the following files:

<File name='macros/append.sql'>

```sql
{% macro get_incremental_append_sql(arg_dict) %}

{% do return(some_custom_macro_with_sql(arg_dict["target_relation"], arg_dict["temp_relation"], arg_dict["unique_key"], arg_dict["dest_columns"], arg_dict["incremental_predicates"])) %}

{% endmacro %}


{% macro some_custom_macro_with_sql(target_relation, temp_relation, unique_key, dest_columns, incremental_predicates) %}

{%- set dest_cols_csv = get_quoted_csv(dest_columns | map(attribute="name")) -%}

insert into {{ target_relation }} ({{ dest_cols_csv }})
(
select {{ dest_cols_csv }}
from {{ temp_relation }}
)

{% endmacro %}
```
</File>

Define a model in `models/my_model.sql`:

```sql
{{ config(
    materialized="incremental",
    incremental_strategy="append",
) }}

select * from {{ ref("some_model") }}
```

### Custom strategies

<VersionBlock lastVersion="1.1">

Custom incremental strategies can be defined beginning in dbt v1.2.

</VersionBlock>

<VersionBlock firstVersion="1.2">

As an easier alternative to [creating an entirely new materialization](/guides/create-new-materializations), users can define and use their own custom incremental strategies by:

1. Defining a macro named `get_incremental_STRATEGY_sql`, where `STRATEGY` is a placeholder for the name of your custom incremental strategy.
2. Configuring `incremental_strategy: STRATEGY` within an incremental model.

dbt won't validate user-defined strategies. It just looks for a macro with that name and raises an error if it can't find one.

For example, a user-defined strategy named `insert_only` can be defined and used with the following files:

<File name='macros/my_custom_strategies.sql'>

```sql
{% macro get_incremental_insert_only_sql(arg_dict) %}

{% do return(some_custom_macro_with_sql(arg_dict["target_relation"], arg_dict["temp_relation"], arg_dict["unique_key"], arg_dict["dest_columns"], arg_dict["incremental_predicates"])) %}

{% endmacro %}


{% macro some_custom_macro_with_sql(target_relation, temp_relation, unique_key, dest_columns, incremental_predicates) %}

{%- set dest_cols_csv = get_quoted_csv(dest_columns | map(attribute="name")) -%}

insert into {{ target_relation }} ({{ dest_cols_csv }})
(
select {{ dest_cols_csv }}
from {{ temp_relation }}
)

{% endmacro %}
```

</File>

<File name='models/my_model.sql'>

```sql
{{ config(
materialized="incremental",
incremental_strategy="insert_only",
...
) }}

...
```

</File>
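
The `insert_only` example above ignores `unique_key`. As a rough sketch of how the other `arg_dict` entries might be used, the following hypothetical `insert_new_keys` strategy inserts only rows whose key is not already present in the target. It assumes `unique_key` is a single column name (not a list) and that the warehouse supports `not exists` subqueries; the strategy name and the `src` and `tgt` aliases are illustrative and not part of dbt.

```sql
{% macro get_incremental_insert_new_keys_sql(arg_dict) %}

    {#- Pull the pieces this sketch needs out of arg_dict -#}
    {%- set target = arg_dict["target_relation"] -%}
    {%- set source = arg_dict["temp_relation"] -%}
    {#- Assumption: unique_key is a single column name, not a list -#}
    {%- set unique_key = arg_dict["unique_key"] -%}
    {%- set dest_cols_csv = get_quoted_csv(arg_dict["dest_columns"] | map(attribute="name")) -%}

    insert into {{ target }} ({{ dest_cols_csv }})
    select {{ dest_cols_csv }}
    from {{ source }} as src
    -- Skip rows whose key already exists in the target
    where not exists (
        select 1
        from {{ target }} as tgt
        where tgt.{{ unique_key }} = src.{{ unique_key }}
    )

{% endmacro %}
```

A model would opt in with `incremental_strategy="insert_new_keys"` and a `unique_key` in its config, following the same pattern as `insert_only`.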

### Custom strategies from a package

To use the `merge_null_safe` custom incremental strategy from the `example` package:
- [Install the package](/docs/build/packages#how-do-i-add-a-package-to-my-project)
- Then add the following macro to your project:

<File name='macros/my_custom_strategies.sql'>

```sql
{% macro get_incremental_merge_null_safe_sql(arg_dict) %}
    {% do return(example.get_incremental_merge_null_safe_sql(arg_dict)) %}
{% endmacro %}
```

</File>
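
With the wrapper macro in place, a model can presumably select the strategy by name in the same way as any other custom strategy. A minimal sketch, assuming the strategy needs a `unique_key`; the column name `id` is hypothetical and `some_model` is a placeholder:

<File name='models/my_model.sql'>

```sql
{{ config(
    materialized="incremental",
    incremental_strategy="merge_null_safe",
    unique_key="id"
) }}

-- "id" and "some_model" are placeholders for illustration
select * from {{ ref("some_model") }}
```

</File>
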
</VersionBlock>

<Snippet path="discourse-help-feed-header" />
<DiscourseHelpFeed tags="incremental"/>
