diff --git a/website/docs/reference/configs-and-properties.md b/website/docs/reference/configs-and-properties.md index 20d762b7462..b3f23584a4a 100644 --- a/website/docs/reference/configs-and-properties.md +++ b/website/docs/reference/configs-and-properties.md @@ -26,9 +26,18 @@ Whereas you can use **configurations** to: Depending on the resource type, configurations can be defined in the dbt project and also in an installed package by: + + +1. Using a [`config` property](/reference/resource-properties/config) in a `.yml` file in the `models/`, `snapshots/`, or `tests/` directory +2. From the [`dbt_project.yml` file](dbt_project.yml), under the corresponding resource key (`models:`, `snapshots:`, `tests:`, etc) + + + + 1. Using a [`config()` Jinja macro](/reference/dbt-jinja-functions/config) within a `model`, `snapshot`, or `test` SQL file -2. Using a [`config` property](/reference/resource-properties/config) in a `.yml` file +2. Using a [`config` property](/reference/resource-properties/config) in a `.yml` file in the `models/`, `snapshots/`, or `tests/` directory. 3. From the [`dbt_project.yml` file](dbt_project.yml), under the corresponding resource key (`models:`, `snapshots:`, `tests:`, etc) + ### Config inheritance diff --git a/website/docs/reference/resource-configs/check_cols.md b/website/docs/reference/resource-configs/check_cols.md index bd187409379..b8e7ae8398f 100644 --- a/website/docs/reference/resource-configs/check_cols.md +++ b/website/docs/reference/resource-configs/check_cols.md @@ -3,6 +3,31 @@ resource_types: [snapshots] description: "Read this guide to understand the check_cols configuration in dbt." 
datatype: "[column_name] | all" --- + + + + + ```yml + snapshots: + - name: snapshot_name + relation: source('jaffle_shop', 'orders') + config: + schema: string + unique_key: column_name_or_expression + strategy: check + check_cols: + - column_name + ``` + + + + + + +:::info Use the latest snapshot syntax + +In Versionless and dbt v1.9 and later, snapshots are defined in an updated syntax using a YAML file within your `snapshots/` directory (as defined by the [`snapshot-paths` config](/reference/project-configs/snapshot-paths)). For faster and more efficient management, consider the updated snapshot YAML syntax, [available in Versionless](/docs/dbt-versions/versionless-cloud) or [dbt Core v1.9 and later](/docs/dbt-versions/core). +::: ```jinja2 @@ -14,7 +39,7 @@ datatype: "[column_name] | all" ``` - + @@ -42,6 +67,30 @@ No default is provided. ### Check a list of columns for changes + + + + +```yaml +snapshots: + - name: orders_snapshot_check + relation: source('jaffle_shop', 'orders') + config: + schema: snapshots + unique_key: id + strategy: check + check_cols: + - status + - is_cancelled +``` + + +To select from this snapshot in a downstream model: `select * from {{ ref('orders_snapshot_check') }}` + + + + + ```sql {% snapshot orders_snapshot_check %} @@ -58,8 +107,32 @@ No default is provided. {% endsnapshot %} ``` + + ### Check all columns for changes + + + + +```yaml +snapshots: + - name: orders_snapshot_check + relation: source('jaffle_shop', 'orders') + config: + schema: snapshots + unique_key: id + strategy: check + check_cols: + - all + ``` + + +To select from this snapshot in a downstream model: `select * from {{ ref('orders_snapshot_check') }}` + + + + ```sql {% snapshot orders_snapshot_check %} @@ -75,3 +148,4 @@ No default is provided.
{% endsnapshot %} ``` + diff --git a/website/docs/reference/resource-configs/invalidate_hard_deletes.md b/website/docs/reference/resource-configs/invalidate_hard_deletes.md index ba5b37c5d71..94fa40ade9d 100644 --- a/website/docs/reference/resource-configs/invalidate_hard_deletes.md +++ b/website/docs/reference/resource-configs/invalidate_hard_deletes.md @@ -4,6 +4,32 @@ description: "Invalidate_hard_deletes - Read this in-depth guide to learn about datatype: column_name --- + + + + + +```yaml +snapshots: + - name: snapshot + relation: source('my_source', 'my_table') + [config](/reference/snapshot-configs): + strategy: timestamp + invalidate_hard_deletes: true | false +``` + + + + + + + + +:::info Use the latest snapshot syntax + +In Versionless and dbt v1.9 and later, snapshots are defined in an updated syntax using a YAML file within your `snapshots/` directory (as defined by the [`snapshot-paths` config](/reference/project-configs/snapshot-paths)). For faster and more efficient management, consider the updated snapshot YAML syntax, [available in Versionless](/docs/dbt-versions/versionless-cloud) or [dbt Core v1.9 and later](/docs/dbt-versions/core). +::: + ```jinja2 @@ -17,6 +43,7 @@ datatype: column_name ``` + @@ -39,6 +66,26 @@ By default the feature is disabled. ## Example + + + +```yaml +snapshots: + - name: orders_snapshot + relation: source('jaffle_shop', 'orders') + config: + schema: snapshots + database: analytics + unique_key: id + strategy: timestamp + updated_at: updated_at + invalidate_hard_deletes: true + ``` + + + + + ```sql @@ -60,3 +107,4 @@ By default the feature is disabled. 
``` + diff --git a/website/docs/reference/resource-configs/pre-hook-post-hook.md b/website/docs/reference/resource-configs/pre-hook-post-hook.md index e1e7d67f02e..cde914fd639 100644 --- a/website/docs/reference/resource-configs/pre-hook-post-hook.md +++ b/website/docs/reference/resource-configs/pre-hook-post-hook.md @@ -109,6 +109,8 @@ snapshots: + + ```sql @@ -125,13 +127,15 @@ select ... ``` + ```yml snapshots: - name: [] - config: + [config](/reference/resource-properties/config): + [](/reference/snapshot-configs): [pre_hook](/reference/resource-configs/pre-hook-post-hook): | [] [post_hook](/reference/resource-configs/pre-hook-post-hook): | [] ``` diff --git a/website/docs/reference/resource-configs/snapshot_name.md b/website/docs/reference/resource-configs/snapshot_name.md index bb4826a116b..a3ce6cbd63b 100644 --- a/website/docs/reference/resource-configs/snapshot_name.md +++ b/website/docs/reference/resource-configs/snapshot_name.md @@ -2,6 +2,27 @@ description: "Snapshot-name - Read this in-depth guide to learn about configurations in dbt." --- + + + +```yaml +snapshots: + - name: snapshot_name + relation: source('my_source', 'my_table') + config: + schema: string + database: string + unique_key: column_name_or_expression + strategy: timestamp | check + updated_at: column_name # Required if strategy is 'timestamp' + +``` + + + + + + ```jinja2 @@ -13,9 +34,16 @@ description: "Snapshot-name - Read this in-depth guide to learn about configurat +:::info Use the latest snapshot syntax + +In Versionless and dbt v1.9 and later, snapshots are defined in an updated syntax using a YAML file within your `snapshots/` directory (as defined by the [`snapshot-paths` config](/reference/project-configs/snapshot-paths)). For faster and more efficient management, consider the updated snapshot YAML syntax, [available in Versionless](/docs/dbt-versions/versionless-cloud) or [dbt Core v1.9 and later](/docs/dbt-versions/core). 
+::: + + + ## Description -The name of a snapshot, as defined in the `{% snapshot %}` block header. This name is used when selecting from a snapshot using the [`ref` function](/reference/dbt-jinja-functions/ref) +The name of a snapshot, which is used when selecting from a snapshot using the [`ref` function](/reference/dbt-jinja-functions/ref) This name must not conflict with the name of any other "refable" resource (models, seeds, other snapshots) defined in this project or package. @@ -24,6 +52,26 @@ The name does not need to match the file name. As a result, snapshot filenames d ## Examples ### Name a snapshot `order_snapshot` + + + + +```yaml +snapshots: + - name: order_snapshot + relation: source('my_source', 'my_table') + config: + schema: string + database: string + unique_key: column_name_or_expression + strategy: timestamp | check + updated_at: column_name # Required if strategy is 'timestamp' +``` + + + + + ```jinja2 @@ -35,6 +83,7 @@ The name does not need to match the file name. As a result, snapshot filenames d + To select from this snapshot in a downstream model: diff --git a/website/docs/reference/resource-configs/strategy.md b/website/docs/reference/resource-configs/strategy.md index b67feb64fbd..f55b29703f9 100644 --- a/website/docs/reference/resource-configs/strategy.md +++ b/website/docs/reference/resource-configs/strategy.md @@ -4,6 +4,14 @@ description: "Strategy - Read this in-depth guide to learn about configurations datatype: timestamp | check --- + + +:::info Use the latest snapshot syntax + +In Versionless and dbt v1.9 and later, snapshots are defined in an updated syntax using a YAML file within your `snapshots/` directory (as defined by the [`snapshot-paths` config](/reference/project-configs/snapshot-paths)). For faster and more efficient management, consider the updated snapshot YAML syntax, [available in Versionless](/docs/dbt-versions/versionless-cloud) or [dbt Core v1.9 and later](/docs/dbt-versions/core). 
+::: + + + + + + + ```yaml + snapshots: + - [name: snapshot_name](/reference/resource-configs/snapshot_name): + relation: source('my_source', 'my_table') + config: + strategy: timestamp + updated_at: column_name + ``` + + + + + ```jinja2 @@ -30,6 +55,7 @@ select ... ``` + @@ -47,6 +73,23 @@ snapshots: + + + + + ```yaml + snapshots: + - [name: snapshot_name](/reference/resource-configs/snapshot_name): + relation: source('my_source', 'my_table') + config: + strategy: check + check_cols: + - [column_name] | "all" + ``` + + + + ```jinja2 @@ -62,6 +105,7 @@ snapshots: ``` + @@ -88,7 +132,25 @@ This is a **required configuration**. There is no default value. ## Examples ### Use the timestamp strategy + + + +```yaml +snapshots: + - name: orders_snapshot_timestamp + relation: source('jaffle_shop', 'orders') + config: + schema: snapshots + strategy: timestamp + unique_key: id + updated_at: updated_at + +``` + + + + ```sql @@ -109,10 +171,33 @@ This is a **required configuration**. There is no default value. ``` + ### Use the check strategy + + + +```yaml +# snapshots/check_example.yml +snapshots: + - name: orders_snapshot_check + relation: source('jaffle_shop', 'orders') + config: + schema: snapshots + unique_key: id + strategy: check + check_cols: + - status + - is_cancelled + +``` + + + + + ```sql {% snapshot orders_snapshot_check %} @@ -129,6 +214,7 @@ This is a **required configuration**. There is no default value. {% endsnapshot %} ``` + ### Advanced: define and use custom snapshot strategy Behind the scenes, snapshot strategies are implemented as macros, named `snapshot_<strategy>_strategy` @@ -140,6 +226,24 @@ It's possible to implement your own snapshot strategy by adding a macro with the 1. Create a macro named `snapshot_timestamp_with_deletes_strategy`. Use the existing code as a guide and adjust as needed. 2.
Use this strategy via the `strategy` configuration: + + + +```yaml +snapshots: + - name: my_custom_snapshot + relation: source('my_source', 'my_table') + config: + strategy: timestamp_with_deletes + updated_at: updated_at_column + unique_key: id + schema: snapshots +``` + + + + + ```jinja2 @@ -155,3 +259,4 @@ It's possible to implement your own snapshot strategy by adding a macro with the ``` + diff --git a/website/docs/reference/resource-configs/unique_key.md b/website/docs/reference/resource-configs/unique_key.md index 9ad3417fd5e..ac2e08ec61a 100644 --- a/website/docs/reference/resource-configs/unique_key.md +++ b/website/docs/reference/resource-configs/unique_key.md @@ -4,6 +4,30 @@ description: "Unique_key - Read this in-depth guide to learn about configuration datatype: column_name_or_expression --- + + + + + +```yaml +snapshots: + - name: orders_snapshot + relation: source('my_source', 'my_table') + [config](/reference/snapshot-configs): + unique_key: id + +``` + + + + + + +:::info Use the latest snapshot syntax + +In Versionless and dbt v1.9 and later, snapshots are defined in an updated syntax using a YAML file within your `snapshots/` directory (as defined by the [`snapshot-paths` config](/reference/project-configs/snapshot-paths)). For faster and more efficient management, consider the updated snapshot YAML syntax, [available in Versionless](/docs/dbt-versions/versionless-cloud) or [dbt Core v1.9 and later](/docs/dbt-versions/core). +::: + ```jinja2 @@ -12,8 +36,8 @@ datatype: column_name_or_expression ) }} ``` - + @@ -29,6 +53,8 @@ snapshots: ## Description A column name or expression that is unique for the inputs of a snapshot. dbt uses this to match records between a result set and an existing snapshot, so that changes can be captured correctly. +In Versionless and dbt v1.9 and later, [snapshots](/docs/build/snapshots) are defined and configured in YAML files within your `snapshots/` directory. 
The `unique_key` is specified within the `config` block of your snapshot YAML file. + :::caution Providing a non-unique key will result in unexpected snapshot results. dbt **will not** test the uniqueness of this key, consider [testing](/blog/primary-key-testing#how-to-test-primary-keys-with-dbt) the source data to ensure that this key is indeed unique. @@ -41,6 +67,26 @@ This is a **required parameter**. No default is provided. ## Examples ### Use an `id` column as a unique key + + + + + +```yaml +snapshots: + - name: orders_snapshot + relation: source('jaffle_shop', 'orders') + config: + schema: snapshots + unique_key: id + strategy: timestamp + updated_at: updated_at + +``` + + + + ```jinja2 @@ -55,7 +101,9 @@ This is a **required parameter**. No default is provided. You can also write this in yaml. This might be a good idea if multiple snapshots share the same `unique_key` (though we prefer to apply this configuration in a config block, as above). + +You can also specify configurations in your `dbt_project.yml` file if multiple snapshots share the same `unique_key`: ```yml @@ -70,6 +118,25 @@ snapshots: ### Use a combination of two columns as a unique key This configuration accepts a valid column expression. As such, you can concatenate two columns together as a unique key if required. It's a good idea to use a separator (e.g. `'-'`) to ensure uniqueness. 
+ + + + ```yaml +snapshots: + - name: transaction_items_snapshot + relation: source('erp', 'transactions') + config: + schema: snapshots + unique_key: "transaction_id || '-' || line_item_id" + strategy: timestamp + updated_at: updated_at + +``` + + + + @@ -93,10 +160,41 @@ from {{ source('erp', 'transactions') }} ``` + Though, it's probably a better idea to construct this column in your query and use that as the `unique_key`: + + + + +```yaml +snapshots: + - name: transaction_items_snapshot + relation: ref('transaction_items_ephemeral') + config: + schema: snapshots + unique_key: id + strategy: timestamp + updated_at: updated_at + +# models/transaction_items_ephemeral.sql +{{ config(materialized='ephemeral') }} + +select + transaction_id || '-' || line_item_id as id, + * +from {{ source('erp', 'transactions') }} + +``` + + + +In this example, we create an ephemeral model `transaction_items_ephemeral` that builds the unique key `id`, and then reference that model in our snapshot. + + + ```jinja2 @@ -121,3 +219,4 @@ from {{ source('erp', 'transactions') }} ``` + diff --git a/website/docs/reference/resource-configs/updated_at.md b/website/docs/reference/resource-configs/updated_at.md index c61b04264be..9c15e99c512 100644 --- a/website/docs/reference/resource-configs/updated_at.md +++ b/website/docs/reference/resource-configs/updated_at.md @@ -3,6 +3,30 @@ resource_types: [snapshots] description: "Updated_at - Read this in-depth guide to learn about configurations in dbt." datatype: column_name --- + + + + + + +```yaml +snapshots: + - name: snapshot + relation: source('my_source', 'my_table') + [config](/reference/snapshot-configs): + strategy: timestamp + updated_at: column_name +``` + + + + + +:::info Use the latest snapshot syntax + +In Versionless and dbt v1.9 and later, snapshots are defined in an updated syntax using a YAML file within your `snapshots/` directory (as defined by the [`snapshot-paths` config](/reference/project-configs/snapshot-paths)).
For faster and more efficient management, consider the updated snapshot YAML syntax, [available in Versionless](/docs/dbt-versions/versionless-cloud) or [dbt Core v1.9 and later](/docs/dbt-versions/core). +::: + ```jinja2 @@ -14,6 +38,7 @@ datatype: column_name ``` + @@ -37,7 +62,6 @@ You will get a warning if the data type of the `updated_at` column does not matc - ## Description A column within the results of your snapshot query that represents when the record row was last updated. @@ -50,6 +74,25 @@ No default is provided. ## Examples ### Use a column name `updated_at` + + + + +```yaml +snapshots: + - name: orders_snapshot + relation: source('jaffle_shop', 'orders') + config: + schema: snapshots + unique_key: id + strategy: timestamp + updated_at: updated_at + +``` + + + + ```sql @@ -72,12 +115,55 @@ select * from {{ source('jaffle_shop', 'orders') }} ``` + ### Coalesce two columns to create a reliable `updated_at` column Consider a data source that only has an `updated_at` column filled in when a record is updated (so a `null` value indicates that the record hasn't been updated after it was created). Since the `updated_at` configuration only takes a column name, rather than an expression, you should update your snapshot query to include the coalesced column. + + + +1. Create a staging model to perform the transformation. + In your `models/` directory, create a SQL file that configures a staging model to coalesce the `updated_at` and `created_at` columns into a new column `updated_at_for_snapshot`. + + + + ```sql + select *, coalesce(updated_at, created_at) as updated_at_for_snapshot + from {{ source('jaffle_shop', 'orders') }} + + ``` + + +2. Define the snapshot configuration in a YAML file. + In your `snapshots/` directory, create a YAML file that defines your snapshot and references the staging model you just created (`staging_orders` in this example).
+ + + + ```yaml + snapshots: + - name: orders_snapshot + relation: ref('staging_orders') + config: + schema: snapshots + unique_key: id + strategy: timestamp + updated_at: updated_at_for_snapshot + + ``` + + +3. Run `dbt snapshot` to execute the snapshot. + +Alternatively, you can create an ephemeral model to perform the required transformations, then reference this model in your snapshot's `relation` key. + + + + + + ```sql @@ -104,3 +190,4 @@ from {{ source('jaffle_shop', 'orders') }} ``` + diff --git a/website/docs/reference/snapshot-configs.md b/website/docs/reference/snapshot-configs.md index f7005021940..bda6da5a26e 100644 --- a/website/docs/reference/snapshot-configs.md +++ b/website/docs/reference/snapshot-configs.md @@ -323,7 +323,7 @@ The following examples demonstrate how to configure snapshots using the `dbt_pro - + ```yaml snapshots: