New snapshot_meta_column_names config for dbt snapshots #6211

Merged
merged 43 commits into from
Oct 3, 2024
43 commits
f6a5780
Getting started on docs for `snapshot_meta_column_names` config
dbeatty10 Oct 2, 2024
008d818
Merge branch 'add/new-snapshot-spec-main' into dbeatty10-patch-1
mirnawong1 Oct 2, 2024
c4ead5f
The metadata columns for snapshots can be customized via the `snapsho…
dbeatty10 Oct 2, 2024
53bd2ff
Add the new `snapshot_meta_column_names` config to the release notes
dbeatty10 Oct 2, 2024
a5208f8
Skeleton reference page for `snapshot_meta_column_names` config for s…
dbeatty10 Oct 2, 2024
fcd5ecd
Links to `snapshot_meta_column_names` page across project file, prope…
dbeatty10 Oct 2, 2024
3d3e09d
Add new page to `sidebars.js`
dbeatty10 Oct 2, 2024
98a66b4
Rename `snapshot_meta_column_names` to `snapshot_meta_column_names.md`
dbeatty10 Oct 2, 2024
97aa3a2
Add hyperlinks for property YAML file `schema.yml` example
dbeatty10 Oct 2, 2024
aceaaf9
Uniform newlines with the code examples for project and property YAML…
dbeatty10 Oct 2, 2024
5090922
Link to the reference page for the `snapshot_meta_column_names` config
dbeatty10 Oct 2, 2024
e21dad5
Remove the `<Version>` tag so that it is always visible
dbeatty10 Oct 2, 2024
74991e6
Merge branch 'add/new-snapshot-spec-main' into dbeatty10-patch-1
dbeatty10 Oct 2, 2024
0fe2acd
Rough draft for `snapshot_meta_column_names` config
dbeatty10 Oct 2, 2024
31b4cfc
Remove the reference to the `dbt_valid_to_current` config until it is…
dbeatty10 Oct 2, 2024
911d83a
Remove extraneous content
dbeatty10 Oct 2, 2024
6781603
Add a filename for the example
dbeatty10 Oct 2, 2024
dc9746f
Align with other snapshot configs that lead with Jinja and project fi…
dbeatty10 Oct 2, 2024
faf60df
Add an example of the table output
dbeatty10 Oct 2, 2024
fa38b7b
Update release-notes.md
dbeatty10 Oct 2, 2024
2bb58f5
Add a property file / `schema.yml` example to the top
dbeatty10 Oct 2, 2024
28d84e5
Fix SQL file docs for Jinja config
dbeatty10 Oct 2, 2024
5e67484
Available in v1.9 or with versionless dbt Cloud
dbeatty10 Oct 2, 2024
4cc1efc
Link to SCD type 2 wikipedia page
dbeatty10 Oct 2, 2024
f28e48c
Fix hyperlink
dbeatty10 Oct 2, 2024
28e51d3
Link to page for building dbt snapshots
dbeatty10 Oct 2, 2024
f9dc63e
Add `datatype` and `id` for this docs page
dbeatty10 Oct 2, 2024
a55ab99
Add `default_value` for this page
dbeatty10 Oct 2, 2024
fb73bc3
Merge branch 'add/new-snapshot-spec-main' into dbeatty10-patch-1
dbeatty10 Oct 2, 2024
3e36fc3
Link to the metadata fields specifically
dbeatty10 Oct 2, 2024
cc8175e
Link to dbt Cloud Versionless
dbeatty10 Oct 2, 2024
2ac9a9f
Starting in 1.9
dbeatty10 Oct 2, 2024
f9ba325
Starting in v1.9
dbeatty10 Oct 2, 2024
3409771
Update snapshot_meta_column_names.md
dbeatty10 Oct 2, 2024
36c1547
Separate release note
dbeatty10 Oct 2, 2024
e47e381
Version entire page
dbeatty10 Oct 2, 2024
0b2c49b
Improve wording and fix misspellings
dbeatty10 Oct 2, 2024
464ce51
Convert from a note to a warning
dbeatty10 Oct 2, 2024
5467e1f
Update wording in release notes
dbeatty10 Oct 2, 2024
54e9f06
Merge branch 'add/new-snapshot-spec-main' into dbeatty10-patch-1
dbeatty10 Oct 2, 2024
af80431
Merge branch 'current' into dbeatty10-patch-1
dbeatty10 Oct 3, 2024
a9f25e3
Remove extraneous newline
dbeatty10 Oct 3, 2024
54153b4
Merge branch 'current' into dbeatty10-patch-1
dbeatty10 Oct 3, 2024
4 changes: 4 additions & 0 deletions website/dbt-versions.js
@@ -42,6 +42,10 @@ exports.versions = [
* @property {string} lastVersion The last version the page is visible in the sidebar
*/
exports.versionedPages = [
{
page: "reference/resource-configs/snapshot_meta_column_names",
firstVersion: "1.9",
},
{
page: "reference/resource-configs/target_database",
lastVersion: "1.8",
2 changes: 2 additions & 0 deletions website/docs/docs/build/snapshots.md
@@ -409,6 +409,8 @@ Basically – keep your query as simple as possible! Some reasonable exceptions

Snapshot <Term id="table">tables</Term> will be created as a clone of your source dataset, plus some additional meta-fields*.

Starting in 1.9 or with [dbt Cloud Versionless](/docs/dbt-versions/upgrade-dbt-version-in-cloud#versionless), these column names can be customized to your team or organizational conventions via the [`snapshot_meta_column_names`](/reference/resource-configs/snapshot_meta_column_names) config.

| Field | Meaning | Usage |
| -------------- | ------- | ----- |
| dbt_valid_from | The timestamp when this snapshot row was first inserted | This column can be used to order the different "versions" of a record. |
2 changes: 1 addition & 1 deletion website/docs/docs/dbt-versions/release-notes.md
@@ -26,12 +26,12 @@ Release notes are grouped by month for both multi-tenant and virtual private clo
- **Behavior change:** Set [`state_modified_compare_more_unrendered`](/reference/global-configs/behavior-changes#source-definitions-for-state) to true to reduce false positives for `state:modified` when configs differ between `dev` and `prod` environments.
- **Behavior change:** Set the [`skip_nodes_if_on_run_start_fails`](/reference/global-configs/behavior-changes#failures-in-on-run-start-hooks) flag to `True` to skip all selected resources from running if there is a failure on an `on-run-start` hook.
- **Enhancement**: In dbt Cloud Versionless, snapshots defined in SQL files can now use `config` defined in `schema.yml` YAML files. This update resolves the previous limitation that required snapshot properties to be defined exclusively in `dbt_project.yml` and/or a `config()` block within the SQL file. This enhancement will be included in the upcoming dbt Core v1.9 release.
- **New**: In dbt Cloud Versionless, the `snapshot_meta_column_names` config allows for customizing the snapshot metadata columns. This feature allows an organization to align these automatically-generated column names with their conventions, and will be included in the upcoming dbt Core 1.9 release.
- **Enhancement**: In May 2024, dbt Cloud versionless began inferring a model's `primary_key` based on configured data tests and/or constraints within `manifest.json`. The inferred `primary_key` is visible in dbt Explorer and utilized by the dbt Cloud [compare changes](/docs/deploy/run-visibility#compare-tab) feature. This will also be released in dbt Core 1.9.
Read about the [order in which dbt infers which columns can be used as the primary key of a model](https://github.com/dbt-labs/dbt-core/blob/7940ad5c7858ff11ef100260a372f2f06a86e71f/core/dbt/contracts/graph/nodes.py#L534-L541).
- **New:** dbt Explorer now includes trust signal icons, which are currently available as a [Preview](/docs/dbt-versions/product-lifecycles#dbt-cloud). Trust signals offer a quick, at-a-glance view of data health when browsing your dbt models in Explorer. These icons indicate whether a model is **Healthy**, **Caution**, **Degraded**, or **Unknown**. For accurate health data, ensure the resource is up-to-date and has had a recent job run. Refer to [Trust signals](/docs/collaborate/explore-projects#trust-signals-for-resources) for more information.
- **New:** Auto exposures are now available in Preview in dbt Cloud. Auto exposures help users understand how their models are used in downstream analytics tools to inform investments and reduce incidents. The feature imports and auto-generates exposures based on Tableau dashboards, with user-defined curation. To learn more, refer to [Auto exposures](/docs/collaborate/auto-exposures).


## September 2024

- **New**: Use the new recommended syntax for [defining `foreign_key` constraints](/reference/resource-properties/constraints) using `refs`, available in dbt Cloud Versionless. This will soon be released in dbt Core v1.9. This new syntax will capture dependencies and works across different environments.
@@ -0,0 +1,109 @@
---
resource_types: [snapshots]
description: "Snapshot meta column names"
datatype: "{<dictionary>}"
default_value: {"dbt_valid_from": "dbt_valid_from", "dbt_valid_to": "dbt_valid_to", "dbt_scd_id": "dbt_scd_id", "dbt_updated_at": "dbt_updated_at"}
id: "snapshot_meta_column_names"
---

Available starting in dbt Core v1.9 or with [versionless](/docs/dbt-versions/upgrade-dbt-version-in-cloud#versionless) dbt Cloud.

<File name='snapshots/schema.yml'>

```yaml
snapshots:
  - name: <snapshot_name>
    config:
      snapshot_meta_column_names:
        dbt_valid_from: <string>
        dbt_valid_to: <string>
        dbt_scd_id: <string>
        dbt_updated_at: <string>

```

</File>

<File name='snapshots/<filename>.sql'>

```jinja2
{{
    config(
        snapshot_meta_column_names={
            "dbt_valid_from": "<string>",
            "dbt_valid_to": "<string>",
            "dbt_scd_id": "<string>",
            "dbt_updated_at": "<string>",
        }
    )
}}

```

</File>

<File name='dbt_project.yml'>

```yml
snapshots:
  [<resource-path>](/reference/resource-configs/resource-path):
    +snapshot_meta_column_names:
      dbt_valid_from: <string>
      dbt_valid_to: <string>
      dbt_scd_id: <string>
      dbt_updated_at: <string>

```

</File>

## Description

In order to align with an organization's naming conventions, the `snapshot_meta_column_names` config can be used to customize the names of the [metadata columns](/docs/build/snapshots#snapshot-meta-fields) within each snapshot.

## Default

By default, dbt snapshots use the following column names to track change history using [Type 2 slowly changing dimension](https://en.wikipedia.org/wiki/Slowly_changing_dimension#Type_2:_add_new_row) records:

| Field | Meaning | Notes |
| -------------- | ------- | ----- |
| `dbt_valid_from` | The timestamp when this snapshot row was first inserted and became valid. | The value is affected by the [`strategy`](/reference/resource-configs/strategy). |
| `dbt_valid_to` | The timestamp when this row is no longer valid. | |
| `dbt_scd_id` | A unique key generated for each snapshot row. | This is used internally by dbt. |
| `dbt_updated_at` | The `updated_at` timestamp of the source record when this snapshot row was inserted. | This is used internally by dbt. |

However, these column names can be customized using the `snapshot_meta_column_names` config.
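
As a rough illustration of how the defaults and a custom mapping relate, the sketch below overlays user-supplied names on the default dictionary shown in this page's `default_value`. This is not dbt's implementation: `resolve_meta_column_names` is a hypothetical helper, and it assumes that any key left out of the config keeps its default name.

```python
# Illustrative sketch only; not dbt's internal implementation.
DEFAULT_META_COLUMN_NAMES = {
    "dbt_valid_from": "dbt_valid_from",
    "dbt_valid_to": "dbt_valid_to",
    "dbt_scd_id": "dbt_scd_id",
    "dbt_updated_at": "dbt_updated_at",
}


def resolve_meta_column_names(overrides=None):
    """Return the effective meta column names (hypothetical helper).

    Assumption: keys omitted from `overrides` keep their default names.
    """
    overrides = overrides or {}
    # Reject keys that are not one of the four known meta columns.
    unknown = set(overrides) - set(DEFAULT_META_COLUMN_NAMES)
    if unknown:
        raise ValueError(f"Unknown meta columns: {sorted(unknown)}")
    # Later entries win, so overrides replace the matching defaults.
    return {**DEFAULT_META_COLUMN_NAMES, **overrides}


resolved = resolve_meta_column_names(
    {"dbt_valid_from": "start_date", "dbt_valid_to": "end_date"}
)
```

Here `resolved` maps `dbt_valid_from` and `dbt_valid_to` to the custom names, while `dbt_scd_id` and `dbt_updated_at` keep their defaults.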

:::warning

To avoid any unintentional data modification, dbt will **not** automatically apply any column renames. If you apply the `snapshot_meta_column_names` config to a snapshot without first updating the pre-existing table, dbt raises an error. We recommend either using this config only for net-new snapshots, or arranging an update of pre-existing tables before committing a column name change.

:::
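
One way to prepare a pre-existing table is to rename its meta columns by hand before the config change is deployed. The sketch below only builds the statements; it does not run them. The helper `rename_statements` and the table name `analytics.snapshots.orders_snapshot` are hypothetical, and the `alter table ... rename column ... to ...` form is an assumption that holds on many warehouses but should be checked against your platform's SQL reference.

```python
def rename_statements(table, meta_column_names):
    """Build one rename statement per changed meta column (hypothetical helper)."""
    return [
        f"alter table {table} rename column {old} to {new};"
        for old, new in meta_column_names.items()
        if old != new  # skip columns that keep their default name
    ]


statements = rename_statements(
    "analytics.snapshots.orders_snapshot",  # hypothetical table name
    {
        "dbt_valid_from": "start_date",
        "dbt_valid_to": "end_date",
        "dbt_scd_id": "scd_id",
        "dbt_updated_at": "modified_date",
    },
)

for statement in statements:
    print(statement)
```

Generating the statements from the same mapping used in the config keeps the table and the config from drifting apart.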

## Example

<File name='snapshots/schema.yml'>

```yaml
snapshots:
  - name: orders_snapshot
    relation: ref("orders")
    config:
      unique_key: id
      strategy: check
      check_cols: all
      snapshot_meta_column_names:
        dbt_valid_from: start_date
        dbt_valid_to: end_date
        dbt_scd_id: scd_id
        dbt_updated_at: modified_date
```

</File>

The resulting snapshot table contains the configured meta column names:

| id | scd_id | modified_date | start_date | end_date |
| -- | -------------------- | -------------------- | -------------------- | -------------------- |
| 1 | 60a1f1dbdf899a4dd... | 2024-10-02 ... | 2024-10-02 ... | 2024-10-02 ... |
| 2 | b1885d098f8bcff51... | 2024-10-02 ... | 2024-10-02 ... | |
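
Downstream queries then need to use the custom names. The sketch below uses an in-memory SQLite table as a stand-in for the warehouse, with made-up rows shaped like the table above, to select the currently valid version of each record. It assumes that the row with an empty `end_date` is the current one.

```python
import sqlite3

# In-memory SQLite table standing in for the snapshot table (illustrative data).
conn = sqlite3.connect(":memory:")
conn.execute(
    "create table orders_snapshot "
    "(id integer, scd_id text, modified_date text, start_date text, end_date text)"
)
conn.executemany(
    "insert into orders_snapshot values (?, ?, ?, ?, ?)",
    [
        (1, "a", "2024-10-01", "2024-10-01", "2024-10-02"),  # superseded version
        (1, "b", "2024-10-02", "2024-10-02", None),          # current version
        (2, "c", "2024-10-02", "2024-10-02", None),          # current version
    ],
)

# Current rows are those whose (renamed) valid-to column is null.
current_rows = conn.execute(
    "select id, start_date from orders_snapshot "
    "where end_date is null order by id"
).fetchall()
conn.close()
```

With these sample rows, `current_rows` holds one entry per `id`, each taken from the version that has no `end_date`.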
3 changes: 3 additions & 0 deletions website/docs/reference/snapshot-configs.md
@@ -78,6 +78,7 @@ snapshots:
[+](/reference/resource-configs/plus-prefix)[strategy](/reference/resource-configs/strategy): timestamp | check
[+](/reference/resource-configs/plus-prefix)[updated_at](/reference/resource-configs/updated_at): <column_name>
[+](/reference/resource-configs/plus-prefix)[check_cols](/reference/resource-configs/check_cols): [<column_name>] | all
[+](/reference/resource-configs/plus-prefix)[snapshot_meta_column_names](/reference/resource-configs/snapshot_meta_column_names): {<dictionary>}

```

@@ -111,6 +112,7 @@ snapshots:
[strategy](/reference/resource-configs/strategy): timestamp | check
[updated_at](/reference/resource-configs/updated_at): <column_name>
[check_cols](/reference/resource-configs/check_cols): [<column_name>] | all
[snapshot_meta_column_names](/reference/resource-configs/snapshot_meta_column_names): {<dictionary>}

```
</File>
@@ -138,6 +140,7 @@ Configurations can be applied to snapshots using [YAML syntax](/docs/build/snaps
[strategy](/reference/resource-configs/strategy)="timestamp" | "check",
[updated_at](/reference/resource-configs/updated_at)="<column_name>",
[check_cols](/reference/resource-configs/check_cols)=["<column_name>"] | "all",
[snapshot_meta_column_names](/reference/resource-configs/snapshot_meta_column_names)={<dictionary>}
) }}

```
1 change: 1 addition & 0 deletions website/sidebars.js
@@ -974,6 +974,7 @@ const sidebarSettings = {
"reference/resource-configs/unique_key",
"reference/resource-configs/updated_at",
"reference/resource-configs/invalidate_hard_deletes",
"reference/resource-configs/snapshot_meta_column_names",
],
},
{