Partial parsing UI #4646

Merged
merged 64 commits into from
Jan 5, 2024
Changes from 54 commits
Commits
3096821
Partial parsing in dbt Cloud
nghi-ly Dec 14, 2023
8ce208a
Merge branch 'current' into ly-docs-partial-parse-ui
nghi-ly Dec 14, 2023
6363a30
This branch was auto-updated!
github-actions[bot] Dec 14, 2023
0881667
This branch was auto-updated!
github-actions[bot] Dec 14, 2023
5950748
This branch was auto-updated!
github-actions[bot] Dec 14, 2023
f997203
This branch was auto-updated!
github-actions[bot] Dec 14, 2023
49826b4
This branch was auto-updated!
github-actions[bot] Dec 14, 2023
209cb41
Feedback
nghi-ly Dec 14, 2023
ee311b2
Merge branch 'ly-docs-partial-parse-ui' of github.com:dbt-labs/docs.g…
nghi-ly Dec 14, 2023
2ff4bbf
This branch was auto-updated!
github-actions[bot] Dec 14, 2023
e34514f
This branch was auto-updated!
github-actions[bot] Dec 14, 2023
1e91e29
This branch was auto-updated!
github-actions[bot] Dec 14, 2023
9f6099a
This branch was auto-updated!
github-actions[bot] Dec 14, 2023
93dc4a5
This branch was auto-updated!
github-actions[bot] Dec 14, 2023
1c61db3
This branch was auto-updated!
github-actions[bot] Dec 14, 2023
7cb32c8
This branch was auto-updated!
github-actions[bot] Dec 15, 2023
ccf3450
This branch was auto-updated!
github-actions[bot] Dec 15, 2023
311b8bb
This branch was auto-updated!
github-actions[bot] Dec 15, 2023
29ea69c
This branch was auto-updated!
github-actions[bot] Dec 15, 2023
c957197
This branch was auto-updated!
github-actions[bot] Dec 15, 2023
2ae32c6
This branch was auto-updated!
github-actions[bot] Dec 15, 2023
7013212
This branch was auto-updated!
github-actions[bot] Dec 18, 2023
7897b0e
This branch was auto-updated!
github-actions[bot] Dec 18, 2023
d851dc7
This branch was auto-updated!
github-actions[bot] Dec 18, 2023
bcad10b
This branch was auto-updated!
github-actions[bot] Dec 18, 2023
830ae22
This branch was auto-updated!
github-actions[bot] Dec 18, 2023
b5b63db
This branch was auto-updated!
github-actions[bot] Dec 18, 2023
979822a
This branch was auto-updated!
github-actions[bot] Dec 18, 2023
0c8192c
This branch was auto-updated!
github-actions[bot] Dec 18, 2023
522e825
This branch was auto-updated!
github-actions[bot] Dec 18, 2023
106859b
This branch was auto-updated!
github-actions[bot] Dec 18, 2023
631bbe6
This branch was auto-updated!
github-actions[bot] Dec 18, 2023
39ae887
This branch was auto-updated!
github-actions[bot] Dec 18, 2023
9fd5e51
This branch was auto-updated!
github-actions[bot] Dec 18, 2023
6b6ca77
This branch was auto-updated!
github-actions[bot] Dec 18, 2023
e217fb4
This branch was auto-updated!
github-actions[bot] Dec 18, 2023
35ecd70
This branch was auto-updated!
github-actions[bot] Dec 18, 2023
8e7091c
This branch was auto-updated!
github-actions[bot] Dec 18, 2023
97fab57
This branch was auto-updated!
github-actions[bot] Dec 19, 2023
2bba3c0
This branch was auto-updated!
github-actions[bot] Dec 19, 2023
a00749a
This branch was auto-updated!
github-actions[bot] Dec 19, 2023
dacccec
This branch was auto-updated!
github-actions[bot] Dec 19, 2023
256b31a
This branch was auto-updated!
github-actions[bot] Dec 19, 2023
55cf203
This branch was auto-updated!
github-actions[bot] Dec 19, 2023
d137220
This branch was auto-updated!
github-actions[bot] Dec 20, 2023
4f99dbf
This branch was auto-updated!
github-actions[bot] Dec 20, 2023
a1f8650
This branch was auto-updated!
github-actions[bot] Dec 20, 2023
1181dfa
This branch was auto-updated!
github-actions[bot] Dec 21, 2023
5ebcca4
This branch was auto-updated!
github-actions[bot] Dec 21, 2023
fc1aaaa
Update website/docs/reference/parsing.md
nghi-ly Jan 2, 2024
0a14d2e
Update website/docs/reference/parsing.md
nghi-ly Jan 2, 2024
cb418f9
Update website/snippets/_cloud-environments-info.md
nghi-ly Jan 2, 2024
2221548
Feedback
nghi-ly Jan 2, 2024
2f23b6b
Merge branch 'current' into ly-docs-partial-parse-ui
nghi-ly Jan 2, 2024
7b88dab
This branch was auto-updated!
github-actions[bot] Jan 2, 2024
c4ad423
This branch was auto-updated!
github-actions[bot] Jan 2, 2024
6e60440
This branch was auto-updated!
github-actions[bot] Jan 3, 2024
da0626d
This branch was auto-updated!
github-actions[bot] Jan 3, 2024
c096b9d
This branch was auto-updated!
github-actions[bot] Jan 3, 2024
3255d8f
This branch was auto-updated!
github-actions[bot] Jan 4, 2024
4c4f7e1
This branch was auto-updated!
github-actions[bot] Jan 4, 2024
402d96f
This branch was auto-updated!
github-actions[bot] Jan 4, 2024
748e1a7
This branch was auto-updated!
github-actions[bot] Jan 4, 2024
9db2463
This branch was auto-updated!
github-actions[bot] Jan 5, 2024
@@ -0,0 +1,15 @@
---
title: "New: Native support for partial parsing"
description: "December 2023: For faster run times with your dbt invocations, configure dbt Cloud to parse only the changed files in your project."
sidebar_label: "New: Native support for partial parsing"
sidebar_position: 09
tags: [Jan-2024]
date: 2024-01-03
---

By default, dbt parses all the files in your project at the beginning of every dbt invocation. Depending on the size of your project, this operation can take a long time to complete. With the new partial parsing feature in dbt Cloud, you can reduce the time it takes for dbt to parse your project. When enabled, dbt Cloud parses only the changed files in your project instead of parsing all the project files. As a result, your dbt invocations will take less time to run.

To learn more, refer to [Partial parsing](/docs/deploy/deploy-environments#partial-parsing).

<Lightbox src="/img/docs/deploy/example-account-settings.png" width="85%" title="Example of the Partial parsing option" />

@@ -11,4 +11,4 @@ Now available for dbt Cloud Enterprise plans is a new option to enable Git repos

To learn more, refer to [Repo caching](/docs/deploy/deploy-environments#git-repository-caching).

<Lightbox src="/img/docs/deploy/example-repo-caching.png" width="85%" title="Example of the Repository caching option" />
<Lightbox src="/img/docs/deploy/example-account-settings.png" width="85%" title="Example of the Repository caching option" />
6 changes: 4 additions & 2 deletions website/docs/reference/parsing.md
@@ -41,7 +41,7 @@ The [`PARTIAL_PARSE` global config](/reference/global-configs/parsing) can be en

Parse-time attributes (dependencies, configs, and resource properties) are resolved using the parse-time context. When partial parsing is enabled, and certain context variables change, those attributes will _not_ be re-resolved, and are likely to become stale.

In particular, you may see **incorrect results** if these attributes depend on "volatile" context variables, such as [`run_started_at`](/reference/dbt-jinja-functions/run_started_at), [`invocation_id`](/reference/dbt-jinja-functions/invocation_id), or [flags](/reference/dbt-jinja-functions/flags). These variables are likely (or even guaranteed!) to change in each invocation. We _highly discourage_ you from using these variables to set parse-time attributes (dependencies, configs, and resource properties).
In particular, you may see incorrect results if these attributes depend on "volatile" context variables, such as [`run_started_at`](/reference/dbt-jinja-functions/run_started_at), [`invocation_id`](/reference/dbt-jinja-functions/invocation_id), or [flags](/reference/dbt-jinja-functions/flags). These variables are likely (or even guaranteed!) to change in each invocation. dbt Labs _strongly discourages_ you from using these variables to set parse-time attributes (dependencies, configs, and resource properties).

Starting in v1.0, dbt _will_ detect changes in environment variables. It will selectively re-parse only the files that depend on that [`env_var`](/reference/dbt-jinja-functions/env_var) value. (If the env var is used in `profiles.yml` or `dbt_project.yml`, a full re-parse is needed.) However, dbt will _not_ re-render **descriptions** that include env vars. If your descriptions include frequently changing env vars (this is highly uncommon), we recommend that you fully re-parse when generating documentation: `dbt --no-partial-parse docs generate`.
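The staleness described above can be illustrated with a minimal Python sketch. This is not dbt's implementation: `parse_cached`, the `_parse_cache` dictionary, and the `alias` attribute are hypothetical names chosen for illustration. The sketch assumes parse results are cached by file content, so a value derived from a volatile input (a stand-in for `run_started_at`) is only recomputed when the file itself changes.

```python
import hashlib

# Hypothetical cache: maps a file's content checksum to its parsed attributes.
_parse_cache = {}


def parse_cached(file_content: str, run_started_at: str) -> dict:
    """Return parse-time attributes, reusing cached results for unchanged content."""
    checksum = hashlib.sha256(file_content.encode("utf-8")).hexdigest()
    if checksum not in _parse_cache:
        # Only reached when the file content is new or has changed.
        _parse_cache[checksum] = {"alias": f"orders_{run_started_at}"}
    return _parse_cache[checksum]


model_sql = "select 1 as id"
first = parse_cached(model_sql, run_started_at="2024-01-03")
second = parse_cached(model_sql, run_started_at="2024-01-04")

# The file did not change, so the second invocation reuses the first result
# and the alias still reflects the earlier run_started_at value.
print(first["alias"])
print(second["alias"])
```

Both calls return the attributes computed during the first invocation, which is the kind of stale result the paragraph warns about.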

@@ -51,7 +51,9 @@ If certain inputs change between runs, dbt will trigger a full re-parse. The res
- `dbt_project.yml` content (or `env_var` values used within)
- installed packages
- dbt version
- certain widely-used macros, e.g. [builtins](/reference/dbt-jinja-functions/builtins) overrides or `generate_x_name` for `database`/`schema`/`alias`
- certain widely-used macros (for example, [builtins](/reference/dbt-jinja-functions/builtins) overrides or `generate_x_name` for `database`/`schema`/`alias`)

If you're triggering [CI](/docs/deploy/continuous-integration) job runs, new pull requests (PRs) and new branches don't benefit from partial parsing. The benefits apply only to subsequent commits on that PR or branch.

If you ever get into a bad state, you can disable partial parsing and trigger a full re-parse by setting the `PARTIAL_PARSE` global config to false, or by deleting `target/partial_parse.msgpack` (e.g. by running `dbt clean`).
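As a rough sketch of the file-deletion route, the following Python helper removes the saved manifest so that the next dbt invocation has to parse the whole project. The function name `force_full_reparse` is hypothetical, and the sketch assumes the project uses the default `target` directory; adjust the path if your project configures a different one.

```python
from pathlib import Path


def force_full_reparse(project_dir: str) -> bool:
    """Delete target/partial_parse.msgpack if present.

    Returns True if a file was removed, False if there was nothing to delete.
    Assumes the default `target` directory inside the project.
    """
    manifest = Path(project_dir) / "target" / "partial_parse.msgpack"
    if manifest.is_file():
        manifest.unlink()
        return True
    return False
```

Calling `force_full_reparse(".")` from the project root before a run has a narrower effect than `dbt clean`, since it removes only the partial parsing file.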

64 changes: 39 additions & 25 deletions website/snippets/_cloud-environments-info.md
@@ -34,31 +34,6 @@ Both development and deployment environments have a section called **General Set
- If you select a current version with `(latest)` in the name, your environment will automatically install the latest stable version of the minor version selected.
:::

### Git repository caching
Comment from the PR author: this section was moved (not deleted) to reorder the four H3 subsections alphabetically.


At the start of every job run, dbt Cloud clones the project's Git repository so it has the latest versions of your project's code and runs `dbt deps` to install your dependencies.

For improved reliability and performance on your job runs, you can enable dbt Cloud to keep a cache of the project's Git repository. So, if there's a third-party outage that causes the cloning operation to fail, dbt Cloud will instead use the cached copy of the repo so your jobs can continue running as scheduled.

dbt Cloud caches your project's Git repo after each successful run and retains it for 8 days if there are no repo updates. It caches all packages regardless of installation method and does not fetch code outside of the job runs.

dbt Cloud will use the cached copy of your project's Git repo under these circumstances:

- Outages from third-party services (for example, the [dbt package hub](https://hub.getdbt.com/)).
- Git authentication fails.
- There are syntax errors in the `packages.yml` file. You can set up and use [continuous integration (CI)](/docs/deploy/continuous-integration) to find these errors sooner.
- If a package doesn't work with the current dbt version. You can set up and use [continuous integration (CI)](/docs/deploy/continuous-integration) to identify this issue sooner.

To enable Git repository caching, select **Account settings** from the gear menu and enable the **Repository caching** option.

<Lightbox src="/img/docs/deploy/example-repo-caching.png" width="85%" title="Example of the Repository caching option" />

:::note

This feature is only available on the dbt Cloud Enterprise plan.

:::

### Custom branch behavior

By default, all environments will use the default branch in your repository (usually the `main` branch) when accessing your dbt code. You can override this within each dbt Cloud Environment using the **Default to a custom branch** option. This setting has slightly different behavior depending on the environment type:
@@ -99,3 +74,42 @@ schema: dbt_alice
threads: 4
```

### Git repository caching

At the start of every job run, dbt Cloud clones the project's Git repository so it has the latest versions of your project's code and runs `dbt deps` to install your dependencies.

For improved reliability and performance on your job runs, you can enable dbt Cloud to keep a cache of the project's Git repository. So, if there's a third-party outage that causes the cloning operation to fail, dbt Cloud will instead use the cached copy of the repo so your jobs can continue running as scheduled.

dbt Cloud caches your project's Git repo after each successful run and retains it for 8 days if there are no repo updates. It caches all packages regardless of installation method and does not fetch code outside of the job runs.

dbt Cloud will use the cached copy of your project's Git repo under these circumstances:

- Outages from third-party services (for example, the [dbt package hub](https://hub.getdbt.com/)).
- Git authentication fails.
- There are syntax errors in the `packages.yml` file. You can set up and use [continuous integration (CI)](/docs/deploy/continuous-integration) to find these errors sooner.
- If a package doesn't work with the current dbt version. You can set up and use [continuous integration (CI)](/docs/deploy/continuous-integration) to identify this issue sooner.
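The fallback behavior can be sketched as a small decision function. This is a simplified illustration rather than dbt Cloud's actual logic: `choose_repo_source`, its parameters, and its return values are hypothetical, a single `clone_succeeded` flag stands in for all of the failure circumstances listed above, and treating the 8-day retention period as a hard cutoff from the time the cache was written is an assumption.

```python
from datetime import datetime, timedelta
from typing import Optional

# Retention period stated in the text; the exact cutoff semantics are assumed.
CACHE_RETENTION = timedelta(days=8)


def choose_repo_source(
    clone_succeeded: bool, cached_at: Optional[datetime], now: datetime
) -> str:
    """Return "fresh_clone", "cache", or "fail" for a job run."""
    if clone_succeeded:
        # Normal path: use the freshly cloned repository.
        return "fresh_clone"
    if cached_at is not None and now - cached_at <= CACHE_RETENTION:
        # Clone failed, but a sufficiently recent cached copy exists.
        return "cache"
    # No usable cache: the run cannot proceed.
    return "fail"
```

For example, a failed clone on January 10 with a cache written on January 5 would fall back to the cache, while a cache written on January 1 would be treated as expired under this assumed cutoff.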

To enable Git repository caching, select **Account settings** from the gear menu and enable the **Repository caching** option.

<Lightbox src="/img/docs/deploy/example-account-settings.png" width="85%" title="Example of the Repository caching option" />

:::note

This feature is only available on the dbt Cloud Enterprise plan.

:::

### Partial parsing

At the start of every dbt invocation, dbt reads all the files in your project, extracts information, and constructs an internal manifest containing every object (model, source, macro, and so on). Among other things, it uses the `ref()`, `source()`, and `config()` macro calls within models to set properties, infer dependencies, and construct your project's DAG. When dbt finishes parsing your project, it stores the internal manifest in a file called `partial_parse.msgpack`.

Parsing projects can be time-consuming, especially for large projects with hundreds of models and thousands of files. To reduce the time it takes dbt to parse your project, use the partial parsing feature in dbt Cloud for your environment. When enabled, dbt Cloud uses the `partial_parse.msgpack` file to determine which files have changed (if any) since the project was last parsed, and then it parses _only_ the changed files and the files related to those changes.
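The idea of detecting changed files can be sketched in Python as a checksum comparison. This is a conceptual illustration under stated assumptions, not how dbt reads `partial_parse.msgpack`: the `checksum` and `find_changed_files` functions and the `saved_checksums` dictionary (relative path to checksum from the previous parse) are hypothetical. The sketch also does not handle deleted files or the "files related to those changes" step.

```python
import hashlib
from pathlib import Path


def checksum(path: Path) -> str:
    """Return the SHA-256 hex digest of a file's contents."""
    return hashlib.sha256(path.read_bytes()).hexdigest()


def find_changed_files(project_dir: str, saved_checksums: dict) -> list:
    """Return relative paths of files that are new or differ from the saved checksum."""
    root = Path(project_dir)
    changed = []
    for path in sorted(root.rglob("*")):
        if not path.is_file():
            continue
        rel = path.relative_to(root).as_posix()
        # A missing entry (new file) or a different checksum both count as changed.
        if saved_checksums.get(rel) != checksum(path):
            changed.append(rel)
    return changed
```

With an empty `saved_checksums` dictionary, every file counts as changed, which corresponds to a full parse when no earlier state exists.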

Partial parsing in dbt Cloud requires dbt version 1.4 or newer. The feature does have some known limitations. Refer to [Known limitations](/reference/parsing#known-limitations) to learn more about them.

To enable partial parsing, select **Account settings** from the gear menu and enable the **Partial parsing** option.

<Lightbox src="/img/docs/deploy/example-account-settings.png" width="85%" title="Example of the Partial parsing option" />
