Adding Unit testing docs and reference page #4603

Merged
merged 94 commits into from
Feb 14, 2024
Changes from 88 commits
61fdb9c
Adding multi cell
matthewshaver Dec 1, 2023
2e1aa1a
Updating links to access
matthewshaver Dec 1, 2023
b242461
Adding reference page for unit tests
matthewshaver Dec 6, 2023
840eed4
Unit tests
matthewshaver Dec 6, 2023
3dd2aad
Merge branch 'current' into next
matthewshaver Dec 11, 2023
6ce45ff
Update website/docs/reference/resource-properties/unit-tests.md
matthewshaver Jan 9, 2024
47d2827
Merge branch 'next' into unit-testing
matthewshaver Jan 9, 2024
214eb6f
Update website/dbt-versions.js
matthewshaver Jan 9, 2024
12fbe42
Fixing spacing changes
matthewshaver Jan 9, 2024
8eee934
Update website/docs/reference/resource-properties/unit-tests.md
matthewshaver Jan 10, 2024
a790962
Update website/docs/reference/resource-properties/unit-tests.md
matthewshaver Jan 10, 2024
b6d8ed6
Update website/docs/reference/resource-properties/unit-tests.md
matthewshaver Feb 7, 2024
9772609
Adding /docs page for unit tests
matthewshaver Feb 7, 2024
19fd463
fixing spacing on YML
matthewshaver Feb 7, 2024
f044ff1
Update unit-tests.md
matthewshaver Feb 7, 2024
56db5ca
Update unit-tests.md
matthewshaver Feb 7, 2024
6112e01
Update unit-tests.md
matthewshaver Feb 7, 2024
1687dd3
Update website/docs/docs/build/unit-tests.md
matthewshaver Feb 9, 2024
a18b387
Update website/docs/docs/build/unit-tests.md
matthewshaver Feb 9, 2024
5ffa00b
Updates
matthewshaver Feb 9, 2024
2329f3f
Updates from feedback
matthewshaver Feb 9, 2024
02e057b
Updates
matthewshaver Feb 9, 2024
d1e46f0
Merge branch 'current' into unit-testing
matthewshaver Feb 9, 2024
829c40c
Reverting changes
matthewshaver Feb 9, 2024
7547ecb
fixing edits
matthewshaver Feb 9, 2024
ada5919
fixing
matthewshaver Feb 9, 2024
59cbb84
fixing
matthewshaver Feb 9, 2024
4fb38a2
Updating sidebar folders
matthewshaver Feb 9, 2024
6fe72ff
Fixing errors
matthewshaver Feb 9, 2024
d621b2b
Update unit-tests.md
matthewshaver Feb 9, 2024
b6376c1
fixing yet another merge error
matthewshaver Feb 9, 2024
b0a3ba7
Merge branch 'unit-testing' of https://github.com/dbt-labs/docs.getdb…
matthewshaver Feb 9, 2024
688aede
Update unit-tests.md
matthewshaver Feb 9, 2024
80ec818
Update unit-tests.md
matthewshaver Feb 9, 2024
ed9e312
Updating based on feedback
matthewshaver Feb 9, 2024
0cf9022
Merge branch 'unit-testing' of https://github.com/dbt-labs/docs.getdb…
matthewshaver Feb 9, 2024
420e2f1
Update website/docs/docs/build/unit-tests.md
matthewshaver Feb 9, 2024
75ca6f5
Update website/docs/docs/build/unit-tests.md
matthewshaver Feb 9, 2024
4281976
Update unit-tests.md
matthewshaver Feb 9, 2024
9dbb79a
Update unit-tests.md
matthewshaver Feb 9, 2024
0de14e7
Apply suggestions from code review
dbeatty10 Feb 10, 2024
b94fb0a
Apply suggestions from code review
matthewshaver Feb 10, 2024
353c4f3
Update unit-tests.md
matthewshaver Feb 10, 2024
5bfff10
updating spacing
matthewshaver Feb 10, 2024
37df470
Update website/docs/reference/resource-properties/unit-tests.md
runleonarun Feb 10, 2024
39b83aa
Update website/docs/reference/commands/test.md
matthewshaver Feb 10, 2024
90526f6
Update website/docs/reference/resource-properties/unit-tests.md
runleonarun Feb 10, 2024
e213f47
Merge branch 'current' into unit-testing
runleonarun Feb 10, 2024
ab28f99
Apply suggestions from code review
matthewshaver Feb 10, 2024
f5753a7
Update website/docs/docs/build/unit-tests.md
runleonarun Feb 10, 2024
3872aaa
Update website/docs/docs/build/unit-tests.md
runleonarun Feb 10, 2024
567ef15
Update website/docs/docs/build/unit-tests.md
runleonarun Feb 10, 2024
57cbfe1
Update unit-tests.md
matthewshaver Feb 10, 2024
1a7d70d
Apply suggestions from code review
matthewshaver Feb 10, 2024
0d6b78a
Update unit-tests.md
matthewshaver Feb 10, 2024
5c50489
Update website/docs/docs/build/unit-tests.md
matthewshaver Feb 13, 2024
92b55c0
Update website/docs/reference/resource-properties/unit-tests.md
matthewshaver Feb 13, 2024
810180d
Update website/docs/docs/build/unit-tests.md
matthewshaver Feb 13, 2024
225f48f
Update website/docs/reference/resource-properties/unit-tests.md
matthewshaver Feb 13, 2024
ea79e05
Update unit-tests.md
matthewshaver Feb 13, 2024
fc0ba3a
Update unit-tests.md
matthewshaver Feb 13, 2024
fb45528
Update unit-tests.md
matthewshaver Feb 13, 2024
c0aa1d2
Merge branch 'current' into unit-testing
matthewshaver Feb 13, 2024
2edec42
Update website/sidebars.js
matthewshaver Feb 13, 2024
158ea89
Update unit-tests.md
matthewshaver Feb 13, 2024
8b4b31b
Update unit-tests.md
matthewshaver Feb 13, 2024
0a91b5e
Merge branch 'current' into unit-testing
runleonarun Feb 13, 2024
27ecde3
Update website/docs/docs/build/unit-tests.md
matthewshaver Feb 14, 2024
fba1903
Update website/docs/docs/build/unit-tests.md
matthewshaver Feb 14, 2024
bdfc000
Update website/docs/docs/build/unit-tests.md
matthewshaver Feb 14, 2024
58c41a6
Update website/docs/docs/build/unit-tests.md
matthewshaver Feb 14, 2024
f67b779
Update website/docs/docs/build/unit-tests.md
matthewshaver Feb 14, 2024
db435da
Update website/docs/docs/build/unit-tests.md
matthewshaver Feb 14, 2024
97f7e7a
Update website/docs/docs/build/unit-tests.md
matthewshaver Feb 14, 2024
62ce8c6
Update website/docs/docs/build/unit-tests.md
matthewshaver Feb 14, 2024
8bd64f7
Update website/docs/docs/build/unit-tests.md
matthewshaver Feb 14, 2024
72b84f2
Update website/docs/docs/build/unit-tests.md
matthewshaver Feb 14, 2024
dbd0aa4
Update website/docs/docs/build/unit-tests.md
matthewshaver Feb 14, 2024
12f92a8
Update website/docs/docs/build/unit-tests.md
matthewshaver Feb 14, 2024
dd87c90
Update website/docs/docs/build/unit-tests.md
matthewshaver Feb 14, 2024
694037a
Update unit-tests.md
matthewshaver Feb 14, 2024
528e653
Update website/docs/docs/build/unit-tests.md
matthewshaver Feb 14, 2024
5870f96
Update website/docs/reference/resource-properties/unit-tests.md
matthewshaver Feb 14, 2024
c10ea49
Update website/docs/reference/resource-properties/unit-tests.md
matthewshaver Feb 14, 2024
95cd74b
Update website/docs/reference/resource-properties/unit-tests.md
matthewshaver Feb 14, 2024
7b3f23b
Update website/docs/reference/resource-properties/unit-tests.md
matthewshaver Feb 14, 2024
ef9ceee
Update website/docs/reference/resource-properties/unit-tests.md
matthewshaver Feb 14, 2024
d6b4fa1
Apply suggestions from code review
matthewshaver Feb 14, 2024
8503acd
Apply suggestions from code review
matthewshaver Feb 14, 2024
a1b3614
Update website/docs/docs/build/unit-tests.md
matthewshaver Feb 14, 2024
ba74fe5
Apply suggestions from code review
graciegoheen Feb 14, 2024
9c626b9
Fixing format and spacing
matthewshaver Feb 14, 2024
6b8e034
fixing links
matthewshaver Feb 14, 2024
11fc966
Merge branch 'current' into unit-testing
matthewshaver Feb 14, 2024
256 changes: 256 additions & 0 deletions website/docs/docs/build/unit-tests.md
@@ -0,0 +1,256 @@
---
title: "Unit tests"
sidebar_label: "Unit tests"
description: "Learn how to use unit tests on your SQL models."
search_weight: "heavy"
id: "unit-tests"
keywords:
- unit test, unit tests, unit testing, dag
---
:::note closed beta

Unit testing is currently in closed beta for dbt Cloud accounts that have updated to a [versionless environment](/docs/dbt-versions/upgrade-core-in-cloud).

It is available now as an alpha feature for dbt Core v1.8 users.

:::

Historically, dbt's test coverage was confined to [“data” tests](/docs/build/data-tests), which assess the quality of input data or the structure of the resulting datasets. However, these tests could only be executed _after_ building a model.

Now, we are introducing a new type of test to dbt - unit tests. In software programming, unit tests validate small portions of your functional code, and they work much the same way here. Unit tests allow you to validate your SQL modeling logic on a small set of static inputs _before_ you materialize your full model in production. Unit tests enable test-driven development, benefiting developer efficiency and code reliability.

## Before you begin

- We currently only support unit testing SQL models.
- We currently only support adding unit tests to models in your _current_ project.
- If your model has multiple versions, by default the unit test will run on *all* versions of your model. Read [unit testing versioned models](#unit-testing-versioned-models) for more information.

Read the [reference doc](/reference/resource-properties/unit-tests) for more details about formatting your unit tests.

### When to add a unit test to your model

You should unit test a model:
- When your SQL contains complex logic:
    - Regex
    - Date math
    - Window functions
    - `case when` statements when there are many `when`s
    - Truncation
    - Recursion
- When you're writing custom logic to process input data, similar to creating a function.
    - We don't recommend conducting unit testing for functions like `min()` since these functions are tested extensively by the warehouse. If an unexpected issue arises, it's more likely a result of issues in the underlying data rather than the function itself. Therefore, fixture data in the unit test won't provide valuable information.
- Logic for which you had bugs reported before.
- Edge cases not yet seen in your actual data that you want to handle.
- Prior to refactoring the transformation logic (especially if the refactor is significant).
- Models with high "criticality" (public, contracted models or models directly upstream of an exposure).

## Unit testing a model

This example creates a new `dim_customers` model with a field `is_valid_email_address` that calculates whether or not the customer’s email is valid:

<file name='dim_customers.sql'>

```sql
with customers as (

    select * from {{ ref('stg_customers') }}

),

accepted_email_domains as (

    select * from {{ ref('top_level_email_domains') }}

),

check_valid_emails as (

    select
        customers.customer_id,
        customers.first_name,
        customers.last_name,
        coalesce (regexp_like(
            customers.email, '^[A-Za-z0-9._%+-]+@[A-Za-z0-9.-]+\\.[A-Za-z]{2,}$'
        )
        = true
        and accepted_email_domains.tld is not null,
        false) as is_valid_email_address
    from customers
    left join accepted_email_domains
        on customers.email_top_level_domain = lower(accepted_email_domains.tld)

)

select * from check_valid_emails
```
</file>

The logic posed in this example can be challenging to validate. You can add a unit test to this model to ensure the `is_valid_email_address` logic captures all known edge cases: emails without `.`, emails without `@`, and emails from invalid domains.

<file name='dbt_project.yml'>

```yaml
unit_tests:
  - name: test_is_valid_email_address
    description: "Check my is_valid_email_address logic captures all known edge cases - emails without ., emails without @, and emails from invalid domains."
    model: dim_customers
    given:
      - input: ref('stg_customers')
        rows:
          - {customer_id: 1, email: [email protected], email_top_level_domain: example.com}
          - {customer_id: 2, email: [email protected], email_top_level_domain: unknown.com}
          - {customer_id: 3, email: badgmail.com, email_top_level_domain: gmail.com}
          - {customer_id: 4, email: missingdot@gmailcom, email_top_level_domain: gmail.com}
      - input: ref('top_level_email_domains')
        rows:
          - {tld: example.com}
          - {tld: gmail.com}
    expect:
      rows:
        - {customer_id: 1, is_valid_email_address: true}
        - {customer_id: 2, is_valid_email_address: false}
        - {customer_id: 3, is_valid_email_address: false}
        - {customer_id: 4, is_valid_email_address: false}

```
</file>

The previous example defines the mock data using the inline `dict` format, but you can also use `csv` either inline or in a separate fixture file.

You only have to define the mock data for the columns you care about. This enables you to write succinct and _specific_ unit tests.
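
As a rough sketch of the inline `csv` alternative to the `dict` format, the same kind of mock data could be written as shown here. The email addresses and the reduced set of rows are illustrative placeholders, not the values from the example above, and the exact keys (`format: csv` with a block-scalar `rows`) should be checked against the reference doc for your dbt version.

```yaml
unit_tests:
  - name: test_is_valid_email_address
    model: dim_customers
    given:
      - input: ref('stg_customers')
        format: csv
        # Inline CSV: the first line is the header, each following line is one mock row.
        # The addresses below are placeholder values for illustration only.
        rows: |
          customer_id,email,email_top_level_domain
          1,valid_user@example.com,example.com
          2,missingdot@gmailcom,gmail.com
      - input: ref('top_level_email_domains')
        format: csv
        rows: |
          tld
          example.com
          gmail.com
    expect:
      format: csv
      rows: |
        customer_id,is_valid_email_address
        1,true
        2,false
```

Only the columns that matter to the logic under test appear in the CSV headers, in the same way the `dict` rows list only the relevant keys.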

:::note

The direct parents of the model that you’re unit testing (in this example, `stg_customers` and `top_level_email_domains`) need to exist in the warehouse before you can execute the unit test.

Use the `--empty` flag to build an empty version of the models to save warehouse spend.

```bash

dbt run --select "stg_customers top_level_email_domains" --empty

```

Alternatively, use `dbt build` to, in lineage order:

- Run the unit tests on your model.
- Materialize your model in the warehouse.
- Run the data tests on your model.

:::

Now you’re ready to run this unit test. You have a couple of options for commands depending on how specific you want to be:

- `dbt test --select dim_customers` runs _all_ of the tests on `dim_customers`.
- `dbt test --select "dim_customers,test_type:unit"` runs all of the _unit_ tests on `dim_customers`.
- `dbt test --select test_is_valid_email_address` runs the test named `test_is_valid_email_address`.

```bash

dbt test --select test_is_valid_email_address
16:03:49 Running with dbt=1.8.0-a1
16:03:49 Registered adapter: postgres=1.8.0-a1
16:03:50 Found 6 models, 5 seeds, 4 data tests, 0 sources, 0 exposures, 0 metrics, 410 macros, 0 groups, 0 semantic models, 1 unit test
16:03:50
16:03:50 Concurrency: 5 threads (target='postgres')
16:03:50
16:03:50 1 of 1 START unit_test dim_customers::test_is_valid_email_address ................... [RUN]
16:03:51 1 of 1 FAIL 1 dim_customers::test_is_valid_email_address ............................ [FAIL 1 in 0.26s]
16:03:51
16:03:51 Finished running 1 unit_test in 0 hours 0 minutes and 0.67 seconds (0.67s).
16:03:51
16:03:51 Completed with 1 error and 0 warnings:
16:03:51
16:03:51 Failure in unit_test test_is_valid_email_address (models/marts/unit_tests.yml)
16:03:51

actual differs from expected:

@@ ,customer_id,is_valid_email_address
→ ,1 ,True→False
,2 ,False
...,... ,...


16:03:51
16:03:51 compiled Code at models/marts/unit_tests.yml
16:03:51
16:03:51 Done. PASS=0 WARN=0 ERROR=1 SKIP=0 TOTAL=1

```

The clever regex statement wasn’t as clever as initially thought: the diff shows that the model incorrectly flagged the well-formed email address for `customer_id` 1 as invalid.

Updating the regex logic to `'^[A-Za-z0-9._%+-]+@[A-Za-z0-9.-]+\.[A-Za-z]{2,}$'` (those pesky escape characters) and rerunning the unit test solves the problem:

```bash

dbt test --select test_is_valid_email_address
16:09:11 Running with dbt=1.8.0-a1
16:09:12 Registered adapter: postgres=1.8.0-a1
16:09:12 Found 6 models, 5 seeds, 4 data tests, 0 sources, 0 exposures, 0 metrics, 410 macros, 0 groups, 0 semantic models, 1 unit test
16:09:12
16:09:13 Concurrency: 5 threads (target='postgres')
16:09:13
16:09:13 1 of 1 START unit_test dim_customers::test_is_valid_email_address ................. [RUN]
16:09:13 1 of 1 PASS dim_customers::test_is_valid_email_address ............................ [PASS in 0.26s]
16:09:13
16:09:13 Finished running 1 unit_test in 0 hours 0 minutes and 0.75 seconds (0.75s).
16:09:13
16:09:13 Completed successfully
16:09:13
16:09:13 Done. PASS=1 WARN=0 ERROR=0 SKIP=0 TOTAL=1

```

Your model is now ready for production! Adding this unit test helped catch an issue with the SQL logic _before_ you materialized `dim_customers` in your warehouse and will better ensure the reliability of this model in the future.

## Unit testing versioned models

When a unit test is added to a model, it will run on all versions of the model by default.
Using the example in this article, if you have versions 1, 2, and 3 of `my_model`, the `test_is_valid_email_address` unit test will run on all 3 versions.

To only unit test a specific version (or versions) of a model, include the desired version(s) in the model config:

```yml

unit_tests:
  - name: test_is_valid_email_address # this is the unique name of the test
    model: my_model # name of the model I'm unit testing
    versions:
      include:
        - 2
    given: # optional: list of inputs to provide as fixtures

```

In this scenario, if you have versions 1, 2, and 3 of `my_model`, the `test_is_valid_email_address` unit test will run on _only_ version 2.

To unit test all versions except a specific version (or versions) of a model, you can exclude the relevant version(s) in the model config:

```yml

unit_tests:
  - name: test_is_valid_email_address # this is the unique name of the test
    model: my_model # name of the model I'm unit testing
    versions:
      exclude:
        - 1
    given: # optional: list of inputs to provide as fixtures

```
So, if you have versions 1, 2, and 3 of `my_model`, your `test_is_valid_email_address` unit test will run on _only_ versions 2 and 3.

If you want to unit test a model that references a pinned version of another model, specify that version in the `ref` of your input:

```yml

unit_tests:
  - name: test_is_valid_email_address # this is the unique name of the test
    model: my_model # name of the model I am unit testing
    given: # optional: list of inputs to provide as fixtures

```
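
A minimal sketch of what a version-pinned input could look like. It assumes a hypothetical upstream model named `my_upstream_model` that has a version 1, a hypothetical `id` column, and that the `v` argument of `ref` is accepted in unit test inputs the same way it is in model SQL. Verify all three against your project and dbt version.

```yml
unit_tests:
  - name: test_is_valid_email_address
    model: my_model
    given:
      # my_upstream_model is a hypothetical versioned model used for illustration;
      # v=1 pins the fixture to version 1 of that model.
      - input: ref('my_upstream_model', v=1)
        rows:
          - {id: 1} # hypothetical column
    expect:
      rows:
        - {id: 1} # hypothetical column
```

The pinned `ref` in the input should match the `ref` call that the model under test uses in its SQL, so that the fixture replaces the version the model actually reads.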


47 changes: 46 additions & 1 deletion website/docs/reference/commands/test.md
@@ -3,6 +3,7 @@ title: "About dbt test command"
sidebar_label: "test"
id: "test"
---
<VersionBlock lastVersion="1.7">

`dbt test` runs tests defined on models, sources, snapshots, and seeds. It expects that you have already created those resources through the appropriate commands.

@@ -28,4 +29,48 @@ dbt test --select "one_specific_model,test_type:singular"
dbt test --select "one_specific_model,test_type:generic"
```

For more information on writing tests, see the [Testing Documentation](/docs/build/data-tests).
For more information on writing tests, see the [Testing Documentation](/docs/build/tests).

</VersionBlock>

<VersionBlock firstVersion="1.8">

`dbt test` runs data tests defined on models, sources, snapshots, and seeds, as well as unit tests defined on SQL models. It expects that you have already created those resources through the appropriate commands.

The tests to run can be selected using the `--select` flag discussed [here](/reference/node-selection/syntax).

```bash
# run data and unit tests
dbt test

# run only data tests
dbt test --select test_type:data

# run only unit tests
dbt test --select test_type:unit

# run tests for one_specific_model
dbt test --select "one_specific_model"

# run tests for all models in package
dbt test --select "some_package.*"

# run only data tests defined singularly
dbt test --select "test_type:singular"

# run only data tests defined generically
dbt test --select "test_type:generic"

# run singular data tests limited to one_specific_model
dbt test --select "one_specific_model,test_type:singular"

# run generic data tests limited to one_specific_model
dbt test --select "one_specific_model,test_type:generic"
```

For more information on writing tests, read the [data testing](/docs/build/tests) and [unit testing](/docs/build/unit-tests) documentation.

</VersionBlock>


