add new snapshot spec - main (#6187)
mirnawong1 authored Oct 3, 2024
2 parents 39ca60b + 271eb62 commit c8b4993
Showing 15 changed files with 1,068 additions and 388 deletions.
633 changes: 355 additions & 278 deletions website/docs/docs/build/snapshots.md

Large diffs are not rendered by default.

4 changes: 4 additions & 0 deletions website/docs/docs/dbt-versions/release-notes.md
Original file line number Diff line number Diff line change
Expand Up @@ -20,6 +20,9 @@ Release notes are grouped by month for both multi-tenant and virtual private clo

## October 2024

- **New**: In dbt Cloud Versionless, [Snapshots](/docs/build/snapshots) have been updated to use YAML configuration files instead of SQL snapshot blocks. This new feature simplifies snapshot management and improves performance, and will soon be released in dbt Core 1.9.
  - Who does this affect? New users on Versionless can define snapshots using the new YAML specification. Users upgrading to Versionless who use snapshots can keep their existing configuration or migrate their snapshot definitions to YAML.
  - Users on dbt 1.8 and earlier: No action is needed; existing snapshots will continue to work as before. However, we recommend upgrading to Versionless to take advantage of the new snapshot features.
- **Behavior change:** Set [`state_modified_compare_more_unrendered`](/reference/global-configs/behavior-changes#source-definitions-for-state) to true to reduce false positives for `state:modified` when configs differ between `dev` and `prod` environments.
- **Behavior change:** Set the [`skip_nodes_if_on_run_start_fails`](/reference/global-configs/behavior-changes#failures-in-on-run-start-hooks) flag to `True` to skip running all selected resources if an `on-run-start` hook fails.
- **Enhancement**: In dbt Cloud Versionless, snapshots defined in SQL files can now use `config` defined in `schema.yml` YAML files. This update resolves the previous limitation that required snapshot properties to be defined exclusively in `dbt_project.yml` and/or a `config()` block within the SQL file. This enhancement will be included in the upcoming dbt Core v1.9 release.
Expand All @@ -28,6 +31,7 @@ Read about the [order dbt infers columns can be used as primary key of a model](
- **New:** dbt Explorer now includes trust signal icons, which are currently available as a [Preview](/docs/dbt-versions/product-lifecycles#dbt-cloud). Trust signals offer a quick, at-a-glance view of data health when browsing your dbt models in Explorer. These icons indicate whether a model is **Healthy**, **Caution**, **Degraded**, or **Unknown**. For accurate health data, ensure the resource is up-to-date and has had a recent job run. Refer to [Trust signals](/docs/collaborate/explore-projects#trust-signals-for-resources) for more information.
- **New:** Auto exposures are now available in Preview in dbt Cloud. Auto exposures help users understand how their models are used in downstream analytics tools to inform investments and reduce incidents. The feature imports and auto-generates exposures based on Tableau dashboards, with user-defined curation. To learn more, refer to [Auto exposures](/docs/collaborate/auto-exposures).
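
As a minimal sketch of the `skip_nodes_if_on_run_start_fails` behavior change listed above, assuming behavior flags are set under the top-level `flags:` key of `dbt_project.yml` (check the linked behavior-changes page for the exact flag names and defaults in your version):

```yaml
# dbt_project.yml (fragment)
flags:
  # Opt in: skip all selected resources when an on-run-start hook fails.
  skip_nodes_if_on_run_start_fails: true
```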


## September 2024

- **New**: Use the new recommended syntax for [defining `foreign_key` constraints](/reference/resource-properties/constraints) using `refs`, available in dbt Cloud Versionless. This will soon be released in dbt Core v1.9. This syntax captures dependencies and works across different environments.
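
A hedged sketch of the `foreign_key` syntax described above. The model and column names (`orders`, `customers`, `customer_id`, `id`) and the file name are hypothetical, and the fragment omits the contract and column definitions that constraints typically require. Refer to the linked constraints reference for the authoritative form:

```yaml
# models/schema.yml (hypothetical file, fragment)
models:
  - name: orders
    constraints:
      - type: foreign_key
        columns: [customer_id]
        # Pointing at ref() rather than a hard-coded relation name lets dbt
        # record the dependency and resolve the target per environment.
        to: ref('customers')
        to_columns: [id]
```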
Expand Up @@ -27,3 +27,4 @@ A snapshot must have a materialized value of 'snapshot'
This tells you to change your `materialized` config to `snapshot`. But when you make that change, you might encounter an error message saying that certain fields like `dbt_scd_id` are missing. This error happens because, previously, when dbt treated snapshots as tables, it didn't include the necessary [snapshot meta-fields](/docs/build/snapshots#snapshot-meta-fields) in your target table. Since those meta-fields don't exist, dbt correctly identifies that you're trying to create a snapshot in a table that isn't actually a snapshot.

When this happens, you have to start from scratch: drop your "snapshot" (which isn't a real snapshot table) and re-snapshot your source data as if for the first time. Running `dbt snapshot` then creates a new snapshot and inserts the snapshot meta-fields as expected.

11 changes: 10 additions & 1 deletion website/docs/reference/configs-and-properties.md
Expand Up @@ -26,9 +26,18 @@ Whereas you can use **configurations** to:

Depending on the resource type, configurations can be defined in the dbt project and also in an installed package by:

<VersionBlock firstVersion="1.9">

1. Using a [`config` property](/reference/resource-properties/config) in a `.yml` file in the `models/`, `snapshots/`, `seeds/`, `analyses/`, or `tests/` directory
2. From the [`dbt_project.yml` file](dbt_project.yml), under the corresponding resource key (`models:`, `snapshots:`, `tests:`, etc.)
</VersionBlock>

<VersionBlock lastVersion="1.8">

1. Using a [`config()` Jinja macro](/reference/dbt-jinja-functions/config) within a `model`, `snapshot`, or `test` SQL file
2. Using a [`config` property](/reference/resource-properties/config) in a `.yml` file
2. Using a [`config` property](/reference/resource-properties/config) in a `.yml` file in the `models/`, `snapshots/`, `seeds/`, `analyses/`, or `tests/` directory.
3. From the [`dbt_project.yml` file](dbt_project.yml), under the corresponding resource key (`models:`, `snapshots:`, `tests:`, etc.)
</VersionBlock>
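
To illustrate the two YAML-based options listed above, here is a minimal sketch. The project name `my_project`, the model name `my_model`, and the file name `models/properties.yml` are placeholders, not names taken from this documentation:

```yaml
# Option: a `config` property in a .yml file, e.g. models/properties.yml
models:
  - name: my_model
    config:
      materialized: table
---
# Option: dbt_project.yml, under the corresponding resource key
models:
  my_project:
    +materialized: view
```

The `---` separator only divides the two files in this sketch; each fragment belongs in its own file.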

### Config inheritance

11 changes: 10 additions & 1 deletion website/docs/reference/project-configs/snapshot-paths.md
Expand Up @@ -12,7 +12,16 @@ snapshot-paths: [directorypath]
</File>
## Definition
Optionally specify a custom list of directories where [snapshots](/docs/build/snapshots) are located. Note that you cannot co-locate models and snapshots.
Optionally specify a custom list of directories where [snapshots](/docs/build/snapshots) are located.
<VersionBlock firstVersion="1.9">
In [Versionless](/docs/dbt-versions/versionless-cloud) and on dbt v1.9 and higher, you can co-locate your snapshots with models if they are [defined using the latest YAML syntax](/docs/build/snapshots).
</VersionBlock>
<VersionBlock lastVersion="1.8">
Note that you cannot co-locate models and snapshots. However, in [Versionless](/docs/dbt-versions/versionless-cloud) and on dbt v1.9 and higher, you can co-locate your snapshots with models if they are [defined using the latest YAML syntax](/docs/build/snapshots).
</VersionBlock>
## Default
By default, dbt will search for snapshots in the `snapshots` directory, i.e. `snapshot-paths: ["snapshots"]`
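
For example, a project that keeps snapshots in more than one directory might set the following in `dbt_project.yml`. The `archived_snapshots` directory name is hypothetical:

```yaml
# dbt_project.yml (fragment)
# dbt searches each listed directory for snapshots.
snapshot-paths: ["snapshots", "archived_snapshots"]
```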
75 changes: 74 additions & 1 deletion website/docs/reference/resource-configs/check_cols.md
Expand Up @@ -3,6 +3,31 @@ resource_types: [snapshots]
description: "Read this guide to understand the check_cols configuration in dbt."
datatype: "[column_name] | all"
---

<VersionBlock firstVersion="1.9">
<File name="snapshots/<filename>.yml">

```yml
snapshots:
  - name: snapshot_name
    relation: source('my_source', 'my_table')
    config:
      schema: string
      unique_key: column_name_or_expression
      strategy: check
      check_cols:
        - column_name
```
</File>
</VersionBlock>
<VersionBlock lastVersion="1.8">
import SnapshotYaml from '/snippets/_snapshot-yaml-spec.md';
<SnapshotYaml/>
<File name='snapshots/<filename>.sql'>
```jinja2
Expand All @@ -14,7 +39,7 @@ datatype: "[column_name] | all"
```

</File>

</VersionBlock>

<File name='dbt_project.yml'>

Expand Down Expand Up @@ -42,6 +67,29 @@ No default is provided.

### Check a list of columns for changes

<VersionBlock firstVersion="1.9">

<File name="snapshots/orders_snapshot_check.yml">

```yaml
snapshots:
  - name: orders_snapshot_check
    relation: source('jaffle_shop', 'orders')
    config:
      schema: snapshots
      unique_key: id
      strategy: check
      check_cols:
        - status
        - is_cancelled
```
</File>
To select from this snapshot in a downstream model: `select * from {{ ref('orders_snapshot_check') }}`
</VersionBlock>

<VersionBlock lastVersion="1.8">

```sql
{% snapshot orders_snapshot_check %}
Expand All @@ -58,8 +106,32 @@ No default is provided.
{% endsnapshot %}
```

</VersionBlock>

### Check all columns for changes

<VersionBlock firstVersion="1.9">

<File name="snapshots/orders_snapshot_check.yml">

```yaml
snapshots:
  - name: orders_snapshot_check
    relation: source('jaffle_shop', 'orders')
    config:
      schema: snapshots
      unique_key: id
      strategy: check
      check_cols:
        - all
```
</File>

To select from this snapshot in a downstream model: `select * from {{ ref('orders_snapshot_check') }}`
</VersionBlock>

<VersionBlock lastVersion="1.8">

```sql
{% snapshot orders_snapshot_check %}
Expand All @@ -75,3 +147,4 @@ No default is provided.
{% endsnapshot %}
```
</VersionBlock>
47 changes: 47 additions & 0 deletions website/docs/reference/resource-configs/invalidate_hard_deletes.md
Expand Up @@ -4,6 +4,31 @@ description: "Invalidate_hard_deletes - Read this in-depth guide to learn about
datatype: column_name
---


<VersionBlock firstVersion="1.9">

<File name='snapshots/<filename>.yml'>

```yaml
snapshots:
  - name: snapshot
    relation: source('my_source', 'my_table')
    [config](/reference/snapshot-configs):
      strategy: timestamp
      invalidate_hard_deletes: true | false
```
</File>
</VersionBlock>
<VersionBlock lastVersion="1.8">
import SnapshotYaml from '/snippets/_snapshot-yaml-spec.md';
<SnapshotYaml/>
<File name='snapshots/<filename>.sql'>
```jinja2
Expand All @@ -17,6 +42,7 @@ datatype: column_name
```

</File>
</VersionBlock>

<File name='dbt_project.yml'>

Expand All @@ -39,6 +65,26 @@ By default the feature is disabled.

## Example

<VersionBlock firstVersion="1.9">
<File name='snapshots/orders.yml'>

```yaml
snapshots:
  - name: orders_snapshot
    relation: source('jaffle_shop', 'orders')
    config:
      schema: snapshots
      database: analytics
      unique_key: id
      strategy: timestamp
      updated_at: updated_at
      invalidate_hard_deletes: true
```
</File>
</VersionBlock>
<VersionBlock lastVersion="1.8">
<File name='snapshots/orders.sql'>
```sql
Expand All @@ -60,3 +106,4 @@ By default the feature is disabled.
```

</File>
</VersionBlock>
7 changes: 5 additions & 2 deletions website/docs/reference/resource-configs/pre-hook-post-hook.md
Expand Up @@ -109,6 +109,8 @@ snapshots:

</File>

<VersionBlock lastVersion="1.8">

<File name='snapshots/<filename>.sql'>

```sql
Expand All @@ -125,13 +127,14 @@ select ...
```

</File>
</VersionBlock>

<File name='snapshots/properties.yml'>
<File name='snapshots/snapshot.yml'>

```yml
snapshots:
  - name: [<snapshot_name>]
    config:
    [config](/reference/resource-properties/config):
      [pre_hook](/reference/resource-configs/pre-hook-post-hook): <sql-statement> | [<sql-statement>]
      [post_hook](/reference/resource-configs/pre-hook-post-hook): <sql-statement> | [<sql-statement>]
```
50 changes: 49 additions & 1 deletion website/docs/reference/resource-configs/snapshot_name.md
Expand Up @@ -2,6 +2,27 @@
description: "Snapshot-name - Read this in-depth guide to learn about configurations in dbt."
---

<VersionBlock firstVersion="1.9">
<File name='snapshots/<filename>.yml'>

```yaml
snapshots:
  - name: snapshot_name
    relation: source('my_source', 'my_table')
    config:
      schema: string
      database: string
      unique_key: column_name_or_expression
      strategy: timestamp | check
      updated_at: column_name  # Required if strategy is 'timestamp'

```

</File>
</VersionBlock>

<VersionBlock lastVersion="1.8">

<File name='snapshots/<filename>.sql'>

```jinja2
Expand All @@ -13,9 +34,15 @@ description: "Snapshot-name - Read this in-depth guide to learn about configurat

</File>

import SnapshotYaml from '/snippets/_snapshot-yaml-spec.md';

<SnapshotYaml/>

</VersionBlock>

## Description

The name of a snapshot, as defined in the `{% snapshot %}` block header. This name is used when selecting from a snapshot using the [`ref` function](/reference/dbt-jinja-functions/ref).
The name of a snapshot, which is used when selecting from a snapshot using the [`ref` function](/reference/dbt-jinja-functions/ref).

This name must not conflict with the name of any other "refable" resource (models, seeds, other snapshots) defined in this project or package.

Expand All @@ -24,6 +51,26 @@ The name does not need to match the file name. As a result, snapshot filenames d
## Examples
### Name a snapshot `order_snapshot`

<VersionBlock firstVersion="1.9">
<File name='snapshots/order_snapshot.yml'>


```yaml
snapshots:
  - name: order_snapshot
    relation: source('my_source', 'my_table')
    config:
      schema: string
      database: string
      unique_key: column_name_or_expression
      strategy: timestamp | check
      updated_at: column_name  # Required if strategy is 'timestamp'
```
</File>
</VersionBlock>
<VersionBlock lastVersion="1.8">
<File name='snapshots/orders.sql'>
```jinja2
Expand All @@ -35,6 +82,7 @@ The name does not need to match the file name. As a result, snapshot filenames d

</File>

</VersionBlock>

To select from this snapshot in a downstream model:
