Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Document new recommended syntax for foreign_key constraints #6189

Merged
merged 9 commits into from
Oct 1, 2024
81 changes: 62 additions & 19 deletions website/docs/reference/resource-properties/constraints.md
Original file line number Diff line number Diff line change
Expand Up @@ -21,16 +21,60 @@ The structure of a constraint is:
- `type` (required): one of `not_null`, `unique`, `primary_key`, `foreign_key`, `check`, `custom`
- `expression`: Free text input to qualify the constraint. Required for certain constraint types, and optional for others.
- `name` (optional): Human-friendly name for this constraint. Supported by some data platforms.
- `columns` (model-level only): List of column names to apply the constraint over
- `columns` (model-level only): List of column names to apply the constraint over.

<VersionBlock lastVersion="1.8">
Foreign key constraints accept two additional inputs:
dbeatty10 marked this conversation as resolved.
Show resolved Hide resolved
- `to`: A relation input, likely `ref()`, indicating the referenced table.
- `to_columns`: A list of column(s) in that table containing the corresponding primary or unique key.

When using `foreign_key`, you need to specify the referenced table's schema manually. Use `{{ target.schema }}` in the `expression` field to automatically pass the schema used by the target environment. Note that later versions of dbt will have more efficient ways of handling this.
This syntax for defining foreign keys uses `ref`, meaning it will capture dependencies and works across different environments. It is available in [dbt Cloud Versionless](/docs/dbt-versions/upgrade-dbt-version-in-cloud#versionless), and versions of dbt Core starting with v1.9.
mirnawong1 marked this conversation as resolved.
Show resolved Hide resolved

For example: `expression: "{{ target.schema }}.customers(customer_id)"`
<VersionBlock firstVersion="1.9">
<File name='models/schema.yml'>

```yml
models:
- name: <model_name>

# required
config:
contract: {enforced: true}

# model-level constraints
constraints:
- type: primary_key
columns: [first_column, second_column, ...]
- type: foreign_key # multi_column
columns: [first_column, second_column, ...]
to: "{{ ref('other_model_name') }}"
to_columns: [other_model_first_column, other_model_second_columns, ...]
- type: check
columns: [first_column, second_column, ...]
expression: "first_column != second_column"
name: human_friendly_name
- type: ...

columns:
- name: first_column
data_type: string

# column-level constraints
constraints:
- type: not_null
- type: unique
- type: foreign_key
to: "{{ ref('other_model_name') }}"
to_columns: other_model_column
- type: ...
```

</File>
</VersionBlock>

<VersionBlock lastVersion="1.8">

In older versions of dbt Core, when defining a `foreign_key` constraint, you need to manually specify the referenced table in the `expression` field. You can use `{{ target }}` variables to make this expression environment-aware, but the dependency between this model and the referenced table is not captured. Starting in dbt Core v1.9, you can specify the referenced table using the `ref()` function.

<File name='models/schema.yml'>

```yml
Expand All @@ -39,44 +83,43 @@ models:

# required
config:
contract:
enforced: true
contract: {enforced: true}

# model-level constraints
constraints:
- type: primary_key
columns: [FIRST_COLUMN, SECOND_COLUMN, ...]
- type: FOREIGN_KEY # multi_column
columns: [FIRST_COLUMN, SECOND_COLUMN, ...]
expression: "OTHER_MODEL_SCHEMA.OTHER_MODEL_NAME (OTHER_MODEL_FIRST_COLUMN, OTHER_MODEL_SECOND_COLUMN, ...)"
columns: [first_column, second_column, ...]
- type: foreign_key # multi_column
columns: [first_column, second_column, ...]
expression: "{{ target.schema }}.other_model_name (other_model_first_column, other_model_second_column, ...)"
- type: check
columns: [FIRST_COLUMN, SECOND_COLUMN, ...]
expression: "FIRST_COLUMN != SECOND_COLUMN"
name: HUMAN_FRIENDLY_NAME
columns: [first_column, second_column, ...]
expression: "first_column != second_column"
name: human_friendly_name
- type: ...

columns:
- name: FIRST_COLUMN
data_type: DATA_TYPE
- name: first_column
data_type: string

# column-level constraints
constraints:
- type: not_null
- type: unique
- type: foreign_key
expression: OTHER_MODEL_SCHEMA.OTHER_MODEL_NAME (OTHER_MODEL_COLUMN)
expression: "{{ target.schema }}.other_model_name (other_model_column)"
- type: ...
```

</File>


</VersionBlock>

## Platform-specific support

In transactional databases, it is possible to define "constraints" on the allowed values of certain columns, stricter than just the data type of those values. For example, Postgres supports and enforces all the constraints in the ANSI SQL standard (`not null`, `unique`, `primary key`, `foreign key`), plus a flexible row-level `check` constraint that evaluates to a boolean expression.

Most analytical data platforms support and enforce a `not null` constraint, but they either do not support or do not enforce the rest. It is sometimes still desirable to add an "informational" constraint, knowing it is _not_ enforced, for the purpose of integrating with legacy data catalog or entity-relation diagram tools ([dbt-core#3295](https://github.com/dbt-labs/dbt-core/issues/3295)).
Most analytical data platforms support and enforce a `not null` constraint, but they either do not support or do not enforce the rest. It is sometimes still desirable to add an "informational" constraint, knowing it is _not_ enforced, for the purpose of integrating with legacy data catalog or entity-relation diagram tools ([dbt-core#3295](https://github.com/dbt-labs/dbt-core/issues/3295)). Some data platforms can optionally use primary or foreign key constraints for query optimization if you specify an additional keyword.

To that end, there are two optional fields you can specify on any filter:
- `warn_unenforced: False` to skip warning on constraints that are supported, but not enforced, by this data platform. The constraint will be included in templated DDL.
Expand Down Expand Up @@ -244,7 +287,7 @@ select
Snowflake suppports four types of constraints: `unique`, `not null`, `primary key`, and `foreign key`.

It is important to note that only the `not null` (and the `not null` property of `primary key`) are actually checked at present.
The rest of the constraints are purely metadata, not verified when inserting data.
The rest of the constraints are purely metadata, not verified when inserting data. Although Snowflake does not validate `unique`, `primary`, or `foreign_key` constraints, you may optionally instruct Snowflake to use them for query optimization by specifying [`rely`](https://docs.snowflake.com/en/user-guide/join-elimination) in the constraint `expression` field.

Currently, Snowflake doesn't support the `check` syntax and dbt will skip the `check` config and raise a warning message if it is set on some models in the dbt project.

Expand Down
Loading