diff --git a/website/docs/docs/build/packages.md b/website/docs/docs/build/packages.md index 74e25262994..8d18a55e949 100644 --- a/website/docs/docs/build/packages.md +++ b/website/docs/docs/build/packages.md @@ -3,7 +3,7 @@ title: "Packages" id: "packages" --- -## What is a package? + Software engineers frequently modularize code into libraries. These libraries help programmers operate with leverage: they can spend more time focusing on their unique business logic, and less time implementing code that someone else has already spent the time perfecting. In dbt, libraries like these are called _packages_. dbt's packages are so powerful because so many of the analytic problems we encountered are shared across organizations, for example: @@ -22,13 +22,19 @@ dbt _packages_ are in fact standalone dbt projects, with models and macros that * Models in the package will be materialized when you `dbt run`. * You can use `ref` in your own models to refer to models from the package. * You can use macros in the package in your own project. +* It's important to note that defining and installing dbt packages is different from [defining and installing Python packages](/docs/build/python-models#using-pypi-packages) -:::note Using Python packages -Defining and installing dbt packages is different from [defining and installing Python packages](/docs/build/python-models#using-pypi-packages). +:::info `dependencies.yml` has replaced `packages.yml` +Starting from dbt v1.6, `dependencies.yml` has replaced `packages.yml`. This file can now contain both types of dependencies: "package" and "project" dependencies. +- "Package" dependencies lets you add source code from someone else's dbt project into your own, like a library. +- "Project" dependencies provide a different way to build on top of someone else's work in dbt. Refer to [Project dependencies](/docs/collaborate/govern/project-dependencies) for more info. +- +You can rename `packages.yml` to `dependencies.yml`, _unless_ you need to use Jinja within your packages specification. This could be necessary, for example, if you want to add an environment variable with a git token in a private git package specification. ::: + ## How do I add a package to my project? 1. Add a file named `dependencies.yml` or `packages.yml` to your dbt project. This should be at the same level as your `dbt_project.yml` file. 2. Specify the package(s) you wish to add using one of the supported syntaxes, for example: @@ -366,3 +372,4 @@ packages: ``` + diff --git a/website/docs/docs/collaborate/govern/project-dependencies.md b/website/docs/docs/collaborate/govern/project-dependencies.md index 68115aded30..3a111667c03 100644 --- a/website/docs/docs/collaborate/govern/project-dependencies.md +++ b/website/docs/docs/collaborate/govern/project-dependencies.md @@ -6,16 +6,14 @@ description: "Reference public models across dbt projects" pagination_next: null --- -:::caution Closed Beta - dbt Cloud Enterprise -"Project" dependencies and cross-project `ref` are features of dbt Cloud Enterprise, currently in Closed Beta. To access these features while they are in beta, please contact your account team at dbt Labs. +:::info Available in Public Preview for dbt Cloud Enterprise accounts -**Prerequisites:** In order to add project dependencies and resolve cross-project `ref`, you must: -- Have the feature enabled (speak to your account team) -- Use dbt v1.6 for **both** the upstream ("producer") project and the downstream ("consumer") project. -- Have a deployment environment in the upstream ("producer") project [that is set to be your production environment](/docs/deploy/deploy-environments#set-as-production-environment-beta) -- Have a successful run of the upstream ("producer") project +Project dependencies and cross-project `ref` are features available in [dbt Cloud Enterprise](https://www.getdbt.com/pricing), currently in [Public Preview](/docs/dbt-versions/product-lifecycles#dbt-cloud). + +Enterprise users can use these features by designating a [public model](/docs/collaborate/govern/model-access) and adding a [cross-project ref](#how-to-use-ref). ::: + For a long time, dbt has supported code reuse and extension by installing other projects as [packages](/docs/build/packages). When you install another project as a package, you are pulling in its full source code, and adding it to your own. This enables you to call macros and run models defined in that other project. While this is a great way to reuse code, share utility macros, and establish a starting point for common transformations, it's not a great way to enable collaboration across teams and at scale, especially at larger organizations. @@ -24,6 +22,33 @@ This year, dbt Labs is introducing an expanded notion of `dependencies` across m - **Packages** — Familiar and pre-existing type of dependency. You take this dependency by installing the package's full source code (like a software library). - **Projects** — A _new_ way to take a dependency on another project. Using a metadata service that runs behind the scenes, dbt Cloud resolves references on-the-fly to public models defined in other projects. You don't need to parse or run those upstream models yourself. Instead, you treat your dependency on those models as an API that returns a dataset. The maintainer of the public model is responsible for guaranteeing its quality and stability. + +Starting in dbt v1.6 or higher, `packages.yml` has been renamed to `dependencies.yml`. However, if you need use Jinja within your packages config, such as an environment variable for your private package, you need to keep using `packages.yml` for your packages for now. Refer to [FAQs](#faqs) for more info. + +## Prerequisites + +In order to add project dependencies and resolve cross-project `ref`, you must: +- Use dbt v1.6 or higher for **both** the upstream ("producer") project and the downstream ("consumer") project. +- Have a deployment environment in the upstream ("producer") project [that is set to be your production environment](/docs/deploy/deploy-environments#set-as-production-environment-beta) +- Have a successful run of the upstream ("producer") project +- Have a multi-tenant or single-tenant [dbt Cloud Enterprise](https://www.getdbt.com/pricing) account (Azure ST is not supported but coming soon) + + ## Example As an example, let's say you work on the Marketing team at the Jaffle Shop. The name of your team's project is `jaffle_marketing`: @@ -37,7 +62,7 @@ name: jaffle_marketing As part of your modeling of marketing data, you need to take a dependency on two other projects: -- `dbt_utils` as a [package](#packages-use-case): An collection of utility macros that you can use while writing the SQL for your own models. This package is, open-source public, and maintained by dbt Labs. +- `dbt_utils` as a [package](#packages-use-case): A collection of utility macros that you can use while writing the SQL for your own models. This package is, open-source public, and maintained by dbt Labs. - `jaffle_finance` as a [project use-case](#projects-use-case): Data models about the Jaffle Shop's revenue. This project is private and maintained by your colleagues on the Finance team. You want to select from some of this project's final models, as a starting point for your own work. @@ -67,7 +92,7 @@ When you're building on top of another team's work, resolving the references in - You don't need to mirror any conditional configuration of the upstream project such as `vars`, environment variables, or `target.name`. You can reference them directly wherever the Finance team is building their models in production. Even if the Finance team makes changes like renaming the model, changing the name of its schema, or [bumping its version](/docs/collaborate/govern/model-versions), your `ref` would still resolve successfully. - You eliminate the risk of accidentally building those models with `dbt run` or `dbt build`. While you can select those models, you can't actually build them. This prevents unexpected warehouse costs and permissions issues. This also ensures proper ownership and cost allocation for each team's models. -### Usage +### How to use ref **Writing `ref`:** Models referenced from a `project`-type dependency must use [two-argument `ref`](/reference/dbt-jinja-functions/ref#two-argument-variant), including the project name: @@ -88,6 +113,8 @@ with monthly_revenue as ( **Cycle detection:** Currently, "project" dependencies can only go in one direction, meaning that the `jaffle_finance` project could not add a new model that depends, in turn, on `jaffle_marketing.roi_by_channel`. dbt will check for cycles across projects and raise errors if any are detected. We are considering support for this pattern in the future, whereby dbt would still check for node-level cycles while allowing cycles at the project level. +For more guidance on how to use dbt Mesh, refer to the dedicated [dbt Mesh guide](/guides/best-practices/how-we-mesh/mesh-1-intro). + ### Comparison If you were to instead install the `jaffle_finance` project as a `package` dependency, you would instead be pulling down its full source code and adding it to your runtime environment. This means: @@ -100,7 +127,7 @@ There are a few cases where installing another internal project as a package can - Unified deployments — In a production environment, if the central data platform team of Jaffle Shop wanted to schedule the deployment of models across both `jaffle_finance` and `jaffle_marketing`, they could use dbt's [selection syntax](/reference/node-selection/syntax) to create a new "passthrough" project that installed both projects as packages. - Coordinated changes — In development, if you wanted to test the effects of a change to a public model in an upstream project (`jaffle_finance.monthly_revenue`) on a downstream model (`jaffle_marketing.roi_by_channel`) _before_ introducing changes to a staging or production environment, you can install the `jaffle_finance` package as a package within `jaffle_marketing`. The installation can point to a specific git branch, however, if you find yourself frequently needing to perform end-to-end testing across both projects, we recommend you re-examine if this represents a stable interface boundary. -These are the exceptions, rather than the rule. Installing another team's project as a package adds complexity, latency, and risk of unnecessary costs. By defining clear interface boundaries across teams, by serving one team's public models as "APIs" to another, and by enabling practitioners to develop with a more narrowly-defined scope, we can enable more people to contribute, with more confidence, while requiring less context upfront. +These are the exceptions, rather than the rule. Installing another team's project as a package adds complexity, latency, and risk of unnecessary costs. By defining clear interface boundaries across teams, by serving one team's public models as "APIs" to another, and by enabling practitioners to develop with a more narrowly defined scope, we can enable more people to contribute, with more confidence, while requiring less context upfront. ## FAQs @@ -109,3 +136,7 @@ These are the exceptions, rather than the rule. Installing another team's projec If you're using private packages with the [git token method](/docs/build/packages#git-token-method), you must define them in the `packages.yml` file instead of the `dependencies.yml` file. This is because conditional rendering (like Jinja-in-yaml) is not supported. + + +## Related docs +- [dbt Mesh guide](/guides/best-practices/how-we-mesh/mesh-1-intro)