Skip to content

Commit

Permalink
Apply suggestions from code review
Browse files Browse the repository at this point in the history
Co-authored-by: Grace Goheen <[email protected]>
  • Loading branch information
matthewshaver and graciegoheen authored Dec 7, 2023
1 parent 7f758f4 commit 1ba9c68
Show file tree
Hide file tree
Showing 6 changed files with 14 additions and 14 deletions.
16 changes: 8 additions & 8 deletions website/docs/docs/build/data-tests.md
Original file line number Diff line number Diff line change
Expand Up @@ -25,19 +25,19 @@ Like almost everything in dbt, data tests are SQL queries. In particular, they a

There are two ways of defining data tests in dbt:
* A **singular** data test is testing in its simplest form: If you can write a SQL query that returns failing rows, you can save that query in a `.sql` file within your [test directory](/reference/project-configs/test-paths). It's now a data test, and it will be executed by the `dbt test` command.
* A **generic** test is a parameterized query that accepts arguments. The test query is defined in a special `test` block (like a [macro](jinja-macros)). Once defined, you can reference the generic test by name throughout your `.yml` files—define it on models, columns, sources, snapshots, and seeds. dbt ships with four generic tests built in, and we think you should use them!
* A **generic** data test is a parameterized query that accepts arguments. The test query is defined in a special `test` block (like a [macro](jinja-macros)). Once defined, you can reference the generic test by name throughout your `.yml` files—define it on models, columns, sources, snapshots, and seeds. dbt ships with four generic data tests built in, and we think you should use them!

Defining data tests is a great way to confirm that your code is working correctly, and helps prevent regressions when your code changes. Because you can use them over and over again, making similar assertions with minor variations, generic data tests tend to be much more common—they should make up the bulk of your dbt testing suite. That said, both ways of defining data tests have their time and place.
Defining data tests is a great way to confirm that your outputs and inputs are as expected, and helps prevent regressions when your code changes. Because you can use them over and over again, making similar assertions with minor variations, generic data tests tend to be much more common—they should make up the bulk of your dbt data testing suite. That said, both ways of defining data tests have their time and place.

:::tip Creating your first data tests
If you're new to dbt, we recommend that you check out our [quickstart guide](/guides) to build your first dbt project with models and tests.
:::

## Singular data tests

The simplest way to define a data test is by writing the exact SQL that will return failing records. We call these "singular" tests, because they're one-off assertions usable for a single purpose.
The simplest way to define a data test is by writing the exact SQL that will return failing records. We call these "singular" data tests, because they're one-off assertions usable for a single purpose.

These tests are defined in `.sql` files, typically in your `tests` directory (as defined by your [`test-paths` config](/reference/project-configs/test-paths)). You can use Jinja (including `ref` and `source`) in the test definition, just like you can when creating models. Each `.sql` file contains one `select` statement, and it defines one test:
These tests are defined in `.sql` files, typically in your `tests` directory (as defined by your [`test-paths` config](/reference/project-configs/test-paths)). You can use Jinja (including `ref` and `source`) in the test definition, just like you can when creating models. Each `.sql` file contains one `select` statement, and it defines one data test:

<File name='tests/assert_total_payment_amount_is_positive.sql'>

Expand All @@ -58,8 +58,8 @@ The name of this test is the name of the file: `assert_total_payment_amount_is_p

Singular data tests are easy to write—so easy that you may find yourself writing the same basic structure over and over, only changing the name of a column or model. By that point, the test isn't so singular! In that case, we recommend...

## Generic tests
Certain data tests are generic: they can be reused over and over again. A generic test is defined in a `test` block, which contains a parametrized query and accepts arguments. It might look like:
## Generic data tests
Certain data tests are generic: they can be reused over and over again. A generic data test is defined in a `test` block, which contains a parametrized query and accepts arguments. It might look like:

```sql
{% test not_null(model, column_name) %}
Expand Down Expand Up @@ -108,11 +108,11 @@ In plain English, these data tests translate to:

Behind the scenes, dbt constructs a `select` query for each data test, using the parametrized query from the generic test block. These queries return the rows where your assertion is _not_ true; if the test returns zero rows, your assertion passes.

You can find more information about these tests, and additional configurations (including [`severity`](/reference/resource-configs/severity) and [`tags`](/reference/resource-configs/tags)) in the [reference section](/reference/resource-properties/data-tests).
You can find more information about these data tests, and additional configurations (including [`severity`](/reference/resource-configs/severity) and [`tags`](/reference/resource-configs/tags)) in the [reference section](/reference/resource-properties/data-tests).

### More generic data tests

Those four tests are enough to get you started. You'll quickly find you want to use a wider variety of tests—a good thing! You can also install generic tests from a package, or write your own, to use (and reuse) across your dbt project. Check out the [guide on custom generic tests](/best-practices/writing-custom-generic-tests) for more information.
Those four tests are enough to get you started. You'll quickly find you want to use a wider variety of tests—a good thing! You can also install generic data tests from a package, or write your own, to use (and reuse) across your dbt project. Check out the [guide on custom generic tests](/best-practices/writing-custom-generic-tests) for more information.

:::info
There are generic tests defined in some open source packages, such as [dbt-utils](https://hub.getdbt.com/dbt-labs/dbt_utils/latest/) and [dbt-expectations](https://hub.getdbt.com/calogica/dbt_expectations/latest/) — skip ahead to the docs on [packages](/docs/build/packages) to learn more!
Expand Down
2 changes: 1 addition & 1 deletion website/docs/docs/build/projects.md
Original file line number Diff line number Diff line change
Expand Up @@ -14,7 +14,7 @@ At a minimum, all a project needs is the `dbt_project.yml` project configuration
| [models](/docs/build/models) | Each model lives in a single file and contains logic that either transforms raw data into a dataset that is ready for analytics or, more often, is an intermediate step in such a transformation. |
| [snapshots](/docs/build/snapshots) | A way to capture the state of your mutable tables so you can refer to it later. |
| [seeds](/docs/build/seeds) | CSV files with static data that you can load into your data platform with dbt. |
| [tests](/docs/build/data-tests) | SQL queries that you can write to test the models and resources in your project. |
| [data tests](/docs/build/data-tests) | SQL queries that you can write to test the models and resources in your project. |
| [macros](/docs/build/jinja-macros) | Blocks of code that you can reuse multiple times. |
| [docs](/docs/collaborate/documentation) | Docs for your project that you can build. |
| [sources](/docs/build/sources) | A way to name and describe the data loaded into your warehouse by your Extract and Load tools. |
Expand Down
2 changes: 1 addition & 1 deletion website/docs/docs/build/sources.md
Original file line number Diff line number Diff line change
Expand Up @@ -88,7 +88,7 @@ Using the `{{ source () }}` function also creates a dependency between the model

### Testing and documenting sources
You can also:
- Add tests to sources
- Add data tests to sources
- Add descriptions to sources, that get rendered as part of your documentation site

These should be familiar concepts if you've already added tests and descriptions to your models (if not check out the guides on [testing](/docs/build/data-tests) and [documentation](/docs/collaborate/documentation)).
Expand Down
4 changes: 2 additions & 2 deletions website/docs/docs/collaborate/govern/model-contracts.md
Original file line number Diff line number Diff line change
Expand Up @@ -183,9 +183,9 @@ Any model meeting the criteria described above _can_ define a contract. We recom

A model's contract defines the **shape** of the returned dataset. If the model's logic or input data doesn't conform to that shape, the model does not build.

[Tests](/docs/build/data-tests) are a more flexible mechanism for validating the content of your model _after_ it's built. So long as you can write the query, you can run the test. Tests are more configurable, such as with [custom severity thresholds](/reference/resource-configs/severity). They are easier to debug after finding failures, because you can query the already-built model, or [store the failing records in the data warehouse](/reference/resource-configs/store_failures).
[Data Tests](/docs/build/data-tests) are a more flexible mechanism for validating the content of your model _after_ it's built. So long as you can write the query, you can run the data test. Data tests are more configurable, such as with [custom severity thresholds](/reference/resource-configs/severity). They are easier to debug after finding failures, because you can query the already-built model, or [store the failing records in the data warehouse](/reference/resource-configs/store_failures).

In some cases, you can replace a test with its equivalent constraint. This has the advantage of guaranteeing the validation at build time, and it probably requires less compute (cost) in your data platform. The prerequisites for replacing a test with a constraint are:
In some cases, you can replace a data test with its equivalent constraint. This has the advantage of guaranteeing the validation at build time, and it probably requires less compute (cost) in your data platform. The prerequisites for replacing a data test with a constraint are:
- Making sure that your data platform can support and enforce the constraint that you need. Most platforms only enforce `not_null`.
- Materializing your model as `table` or `incremental` (**not** `view` or `ephemeral`).
- Defining a full contract for this model by specifying the `name` and `data_type` of each column.
Expand Down
2 changes: 1 addition & 1 deletion website/docs/faqs/Models/specifying-column-types.md
Original file line number Diff line number Diff line change
Expand Up @@ -38,6 +38,6 @@ So long as your model queries return the correct column type, the table you crea

To define additional column options:

* Rather than enforcing uniqueness and not-null constraints on your column, use dbt's [testing](/docs/build/data-tests) functionality to check that your assertions about your model hold true.
* Rather than enforcing uniqueness and not-null constraints on your column, use dbt's [data testing](/docs/build/data-tests) functionality to check that your assertions about your model hold true.
* Rather than creating default values for a column, use SQL to express defaults (e.g. `coalesce(updated_at, current_timestamp()) as updated_at`)
* In edge-cases where you _do_ need to alter a column (e.g. column-level encoding on Redshift), consider implementing this via a [post-hook](/reference/resource-configs/pre-hook-post-hook).
2 changes: 1 addition & 1 deletion website/docs/faqs/Tests/available-tests.md
Original file line number Diff line number Diff line change
Expand Up @@ -12,6 +12,6 @@ Out of the box, dbt ships with the following tests:
* `accepted_values`
* `relationships` (i.e. referential integrity)

You can also write your own [custom schema tests](/docs/build/data-tests).
You can also write your own [custom schema data tests](/docs/build/data-tests).

Some additional custom schema tests have been open-sourced in the [dbt-utils package](https://github.com/dbt-labs/dbt-utils/tree/0.2.4/#schema-tests), check out the docs on [packages](/docs/build/packages) to learn how to make these tests available in your project.

0 comments on commit 1ba9c68

Please sign in to comment.