resolves #3550
resolves #3632
resolves #3574

## What are you changing in this pull request and why?

Create a new page for "cross-project `ref`" under `collaborate > govern`. I've decided to call the page "Project dependencies," and use it as an opportunity to highlight the differences between project + package dependencies.

I started tackling two closely related issues, since we should be thematically consistent across all of them:
- `enforce_access` for packages <> model access
- `packages` can be configured in a file named `dependencies.yml`

## Previews
- [Project dependencies](https://deploy-preview-3577--docs-getdbt-com.netlify.app/docs/collaborate/govern/project-dependencies)
- [Packages: How do I add a package to my project?](https://deploy-preview-3577--docs-getdbt-com.netlify.app/docs/build/packages#how-do-i-add-a-package-to-my-project)
- [Model access: How do I ref a model from another project?](https://deploy-preview-3577--docs-getdbt-com.netlify.app/docs/collaborate/govern/model-access#how-do-i-ref-a-model-from-another-project)
- [dbt_project.yml: restrict-access](https://deploy-preview-3577--docs-getdbt-com.netlify.app/reference/dbt_project.yml)
- [Upgrading to v1.6](https://deploy-preview-3577--docs-getdbt-com.netlify.app/guides/migration/versions/upgrading-to-v1.6)

## Checklist
- [x] Add versioning components, as described in [Versioning Docs](https://github.com/dbt-labs/docs.getdbt.com/blob/current/contributing/single-sourcing-content.md#versioning-entire-pages)
- [x] Add a note to the prerelease version [Migration Guide](https://github.com/dbt-labs/docs.getdbt.com/tree/current/website/docs/guides/migration/versions)
- [ ] Review the [Content style guide](https://github.com/dbt-labs/docs.getdbt.com/blob/current/contributing/content-style-guide.md) and [About versioning](https://github.com/dbt-labs/docs.getdbt.com/blob/current/contributing/single-sourcing-content.md#adding-a-new-version) so my content adheres to these guidelines.
- [ ] Add a checklist item for anything that needs to happen before this PR is merged, such as "needs technical review" or "change base branch."

Adding new pages (delete if not applicable):
- [x] Add page to `website/sidebars.js`
- [x] Provide a unique filename for the new page

---------

Co-authored-by: Matt Shaver <[email protected]>
Co-authored-by: mirnawong1 <[email protected]>
1 parent ebba69c, commit b62768c
Showing 7 changed files with 141 additions and 9 deletions.

96 changes: 96 additions & 0 deletions
website/docs/docs/collaborate/govern/project-dependencies.md
@@ -0,0 +1,96 @@
---
title: "Project dependencies"
id: project-dependencies
sidebar_label: "Project dependencies"
description: "Reference public models across dbt projects"
---

:::info
"Project" dependencies and cross-project `ref` are currently in closed beta and are features of dbt Cloud Enterprise. To access these features, please contact your account team.
:::

For a long time, dbt has supported code reuse and extension by installing other projects as [packages](/docs/build/packages). When you install another project as a package, you are pulling in its full source code, and adding it to your own. This enables you to call macros and run models defined in that other project.

While this is a great way to reuse code, share utility macros, and establish a starting point for common transformations, it's not a great way to enable collaboration across teams and at scale, especially at larger organizations.

This year, dbt Labs is introducing an expanded notion of `dependencies` across multiple dbt projects:
- **Packages**: The familiar, pre-existing type of dependency. You take this dependency by installing the package's full source code (like a software library).
- **Projects**: A _new_ way to take a dependency on another project. Using a metadata service that runs behind the scenes, dbt Cloud resolves references on the fly to public models defined in other projects. You don't need to parse or run those upstream models yourself. Instead, you treat your dependency on those models as an API that returns a dataset. The maintainer of the public model is responsible for guaranteeing its quality and stability.


## Example

As an example, let's say you work on the Marketing team at the Jaffle Shop. The name of your team's project is `jaffle_marketing`:

<File name="dbt_project.yml">

```yml
name: jaffle_marketing
```

</File>

As part of your modeling of marketing data, you need to take a dependency on two other projects:
- `dbt_utils` as a [package](#packages-use-case): A collection of utility macros that you can use while writing the SQL for your own models. This package is open source, public, and maintained by dbt Labs.
- `jaffle_finance` as a [project](#projects-use-case): Data models about the Jaffle Shop's revenue. This project is private and maintained by your colleagues on the Finance team. You want to select from some of this project's final models, as a starting point for your own work.

<File name="dependencies.yml">

```yml
packages:
  - package: dbt-labs/dbt_utils
    version: 1.1.1

projects:
  - name: jaffle_finance  # matches the 'name' in their 'dbt_project.yml'
```

</File>


What's happening here?

The `dbt_utils` package: When you run `dbt deps`, dbt will pull down this package's full contents (100+ macros) as source code and add them to your environment. You can then call any macro from the package, just as you can call macros defined in your own project.

The `jaffle_finance` project: This is a new scenario. Unlike installing a package, the models in the `jaffle_finance` project will _not_ be pulled down as source code and parsed into your project. Instead, dbt Cloud provides a metadata service that resolves references to [**public models**](/docs/collaborate/govern/model-access) defined in the `jaffle_finance` project.

### Advantages

When you're building on top of another team's work, resolving the references in this way has several advantages:
- You're using an intentional interface designated by the model's maintainer with `access: public`.
- You're keeping the scope of your project narrow, and avoiding unnecessary resources and complexity. This is faster for you and faster for dbt.
- You don't need to mirror any conditional configuration of the upstream project such as `vars`, environment variables, or `target.name`. You can reference the Finance team's models directly, wherever they are built in production. Even if the Finance team makes changes like renaming the model, changing the name of its schema, or [bumping its version](/docs/collaborate/govern/model-versions), your `ref` would still resolve successfully.
- You eliminate the risk of accidentally building those models with `dbt run` or `dbt build`. While you can select those models, you can't actually build them. This prevents unexpected warehouse costs and permissions issues. This also ensures proper ownership and cost allocation for each team's models.
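
To make the "intentional interface" concrete, here is a minimal sketch of how the Finance team might designate `monthly_revenue` as public in their own project. The file name and the description text are hypothetical, chosen only for illustration; `access: public` is the property described on the [model access](/docs/collaborate/govern/model-access) page.

<File name="models/marts/finance_models.yml">

```yml
# Lives in the jaffle_finance project (hypothetical file name)
models:
  - name: monthly_revenue
    access: public  # designates this model as referenceable from other projects
    description: "Revenue by month for the Jaffle Shop."  # illustrative text
```

</File>

Any model the Finance team does not mark this way stays outside the interface, so it is not something `jaffle_marketing` should expect to `ref`.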

### Usage

**Writing `ref`:** Models referenced from a `project`-type dependency must use [two-argument `ref`](/reference/dbt-jinja-functions/ref#two-argument-variant), including the project name:

<File name="models/marts/roi_by_channel.sql">

```sql
with monthly_revenue as (
    select * from {{ ref('jaffle_finance', 'monthly_revenue') }}
),
...
```

</File>

**Cycle detection:** Currently, "project" dependencies can only go in one direction, meaning that the `jaffle_finance` project could not add a new model that depends, in turn, on `jaffle_marketing.roi_by_channel`. dbt will check for cycles across projects and raise errors if any are detected. We are considering support for this pattern in the future, whereby dbt would still check for node-level cycles while allowing cycles at the project level.


### Comparison

If you were to install the `jaffle_finance` project as a `package` dependency instead, you would be pulling down its full source code and adding it to your runtime environment. This means:
- dbt needs to parse and resolve more inputs (which is slower)
- dbt expects you to configure these models as if they were your own (with `vars`, environment variables, and so on)
- dbt will run these models as your own unless you explicitly `--exclude` them
- You could be using the project's models in a way that their maintainer (the Finance team) hasn't intended
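
On the last point, the upstream team is not entirely without recourse when their project is installed as a package. The sketch below assumes the `restrict-access` project-level config documented for `dbt_project.yml` in v1.6, and assumes it is off by default; confirm the exact behavior against that reference page before relying on it.

<File name="dbt_project.yml">

```yml
# In the jaffle_finance project
name: jaffle_finance

# Assumption: when enabled, model access rules are enforced even if this
# project is installed as a package, so a ref to a non-public model from
# the installing project raises an error.
restrict-access: true
```

</File>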

There are a few cases where installing another internal project as a package can be a useful pattern:
- Unified deployments: In a production environment, if the central data platform team of Jaffle Shop wanted to schedule the deployment of models across both `jaffle_finance` and `jaffle_marketing`, they could create a new "passthrough" project that installs both projects as packages, and then use dbt's [selection syntax](/reference/node-selection/syntax) to choose which models to run.
- Coordinated changes: In development, if you wanted to test the effects of a change to a public model in an upstream project (`jaffle_finance.monthly_revenue`) on a downstream model (`jaffle_marketing.roi_by_channel`) _before_ introducing changes to a staging or production environment, you can install the `jaffle_finance` project as a package within `jaffle_marketing`. The installation can point to a specific git branch. However, if you find yourself frequently needing to perform end-to-end testing across both projects, we recommend you re-examine whether this represents a stable interface boundary.
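
For the coordinated-changes case, the temporary installation in `jaffle_marketing` might look like the sketch below. The repository URL and branch name are hypothetical placeholders; `git` and `revision` are the standard keys for installing a package from a git repository, and the file name assumes the usual `packages.yml`.

<File name="packages.yml">

```yml
packages:
  # Temporary, for end-to-end testing only: pulls jaffle_finance in as source code
  - git: "https://github.com/example-org/jaffle_finance.git"  # hypothetical URL
    revision: my-feature-branch  # hypothetical branch; a tag or commit also works
```

</File>

After `dbt deps`, the Finance models are parsed and built as part of your project, so this entry is best removed once the test is done.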

These are the exceptions, rather than the rule. Installing another team's project as a package adds complexity, latency, and risk of unnecessary costs. By defining clear interface boundaries across teams, by serving one team's public models as "APIs" to another, and by enabling practitioners to develop with a more narrowly-defined scope, we can enable more people to contribute, with more confidence, while requiring less context upfront.