From 8d2e5181180d6bddd65ae3c35793b0d2bde66ef6 Mon Sep 17 00:00:00 2001 From: mirnawong1 Date: Wed, 20 Dec 2023 10:48:11 -0500 Subject: [PATCH 01/31] add dbt mesh faqs --- .../best-practices/how-we-mesh/mesh-4-faqs.md | 80 +++++++++++++++++++ website/sidebars.js | 1 + 2 files changed, 81 insertions(+) create mode 100644 website/docs/best-practices/how-we-mesh/mesh-4-faqs.md diff --git a/website/docs/best-practices/how-we-mesh/mesh-4-faqs.md b/website/docs/best-practices/how-we-mesh/mesh-4-faqs.md new file mode 100644 index 00000000000..7faf4c31a3d --- /dev/null +++ b/website/docs/best-practices/how-we-mesh/mesh-4-faqs.md @@ -0,0 +1,80 @@ +--- +title: "dbt Mesh FAQs" +description: "Read some frequently asked questions about dbt Mesh." +hoverSnippet: "dbt Mesh FAQs" +sidebar_label: "Frequently asked dbt Mesh questions" +--- + +dbt Mesh is a new architecture enabled by dbt Cloud. It allows you to better manage complexity by deploying multiple interconnected dbt projects instead of a single large, monolithic project. It’s designed to accelerate development, without compromising governance. + +The following frequently asked questions (FAQs) are categorized into the following topics: + +- Overview of dbt Mesh +- How dbt Mesh works +- Access and permissions +- Compatibility with other features +- When and where dbt Mesh is available +- Tips on implementing dbt Mesh + +## Overview of Mesh + + + +In dbt, [model contracts](https://docs.getdbt.com/docs/collaborate/govern/model-contracts) serve as a governance tool enabling the definition and enforcement of data structure standards in your dbt models. They allow you to specify and uphold data model guarantees, including column data types, allowing for stability of dependent models. Should a model fail to adhere to its established contracts, it will not successfully build. + + + + + +dbt [model versions](https://docs.getdbt.com/docs/collaborate/govern/model-versions) are iterations of your dbt models made over time. 
In many cases, you may knowingly choose to change a model’s structure in a way that “breaks” the previous model contract — when you do so, you find it useful to create a new version of the model to signify this change. + +You can use use **model versions** to: + +- Test "prerelease" changes (in production, in downstream systems) +- Bump the latest version, to be used as the canonical source of truth +- Offer a migration window off the "old" version + + + + + +A [model access modifier](/docs/collaborate/govern/model-access) in dbt is a feature that defines the level of accessibility of a model to other parts of the dbt project, or to other dbt projects. It specifies who can reference a model using [the `ref` function](https://docs.getdbt.com/reference/dbt-jinja-functions/ref). There are three types of access modifiers: + +1. **Private:** A model with a private access modifier is only referenceable by models within the same group. This is intended for models that are implementation details and are meant to be used only within a specific group of related models. +2. **Protected:** Models with a protected access modifier can be referenced by any other model within the same dbt project or when the project is installed as a package. This is the default setting for all models, ensuring backward compatibility, especially when groups are assigned to an existing set of models. +3. **Public:** A public model can be referenced across different groups, packages, or projects. This is suitable for stable and mature models that serve as interfaces for other teams or projects. + + + + + +A model group in dbt is a concept used to organize models under a common category or ownership. This categorization can be based on various criteria, such as the team responsible for the models or the specific data source they model, like GitHub data. + + + + + +1. 
**Agility in Development**: With a more modular architecture, teams can make changes rapidly and independently in specific areas without impacting the entire system, leading to faster development cycles. +2. **Improved Collaboration**: Teams are able to share and build upon each other's work without duplicating efforts. +3. **Reduced Complexity**: By organizing transformation logic into distinct domains, dbt Mesh reduces the complexity inherent in large, monolithic projects, making them easier to manage and understand. +4. **Improved Data Trust.** Adopting a dbt Mesh can help ensure that changes in one domain's data models do not unexpectedly break dependencies in other domain areas, leading to a more secure and predictable data environment. + +Most importantly, all this can be accomplished without the central data team losing the ability to see lineage across the entire organization, or compromising on governance mechanisms. + + + + + +This is a new way of working, and the intentionality required to build, and then maintain, cross-project interfaces and dependencies may feel like a slowdown versus what some developers are used to. This way of working introduces intentional friction that makes it more difficult to change everything at once. + +Orchestration across multiple projects is also likely to be slightly more challenging for many organizations, although we’re currently developing new functionality that will make this process simpler. + + + + + +This is a new way of working, and the intentionality required to build, and then maintain, cross-project interfaces and dependencies may feel like a slowdown versus what some developers are used to. This way of working introduces intentional friction that makes it more difficult to change everything at once. + +Orchestration across multiple projects is also likely to be slightly more challenging for many organizations, although we’re currently developing new functionality that will make this process simpler. 
+ + diff --git a/website/sidebars.js b/website/sidebars.js index 23a58360bbc..a3160ce481c 100644 --- a/website/sidebars.js +++ b/website/sidebars.js @@ -1044,6 +1044,7 @@ const sidebarSettings = { items: [ "best-practices/how-we-mesh/mesh-2-structures", "best-practices/how-we-mesh/mesh-3-implementation", + "best-practices/how-we-mesh/mesh-4-faqs", ], }, { From c0364934e064c958d1a389c829870f694cb7821d Mon Sep 17 00:00:00 2001 From: mirnawong1 Date: Wed, 3 Jan 2024 11:27:07 -0500 Subject: [PATCH 02/31] toggles --- .../best-practices/how-we-mesh/mesh-4-faqs.md | 203 +++++++++++++++++- 1 file changed, 199 insertions(+), 4 deletions(-) diff --git a/website/docs/best-practices/how-we-mesh/mesh-4-faqs.md b/website/docs/best-practices/how-we-mesh/mesh-4-faqs.md index 7faf4c31a3d..da9e26f20f9 100644 --- a/website/docs/best-practices/how-we-mesh/mesh-4-faqs.md +++ b/website/docs/best-practices/how-we-mesh/mesh-4-faqs.md @@ -18,9 +18,14 @@ The following frequently asked questions (FAQs) are categorized into the followi ## Overview of Mesh + + +dbt Mesh is a new architecture enabled by dbt Cloud. It allows you to better manage complexity by deploying multiple interconnected dbt projects instead of a single large, monolithic project. It’s designed to accelerate development, without compromising governance. + + -In dbt, [model contracts](https://docs.getdbt.com/docs/collaborate/govern/model-contracts) serve as a governance tool enabling the definition and enforcement of data structure standards in your dbt models. They allow you to specify and uphold data model guarantees, including column data types, allowing for stability of dependent models. Should a model fail to adhere to its established contracts, it will not successfully build. +In dbt, [model contracts](/docs/collaborate/govern/model-contracts) serve as a governance tool enabling the definition and enforcement of data structure standards in your dbt models. 
They allow you to specify and uphold data model guarantees, including column data types, allowing for stability of dependent models. Should a model fail to adhere to its established contracts, it will not successfully build. @@ -71,10 +76,200 @@ Orchestration across multiple projects is also likely to be slightly more challe - +## How dbt Mesh works -This is a new way of working, and the intentionality required to build, and then maintain, cross-project interfaces and dependencies may feel like a slowdown versus what some developers are used to. This way of working introduces intentional friction that makes it more difficult to change everything at once. + -Orchestration across multiple projects is also likely to be slightly more challenging for many organizations, although we’re currently developing new functionality that will make this process simpler. +Yes. In addition to being viewable natively through [dbt Explorer](https://www.getdbt.com/product/dbt-explorer), it is possible to view cross-project lineage using partner integrations with data cataloging tools. For a list of available dbt Cloud integrations, refer to the [Integrations page](https://www.getdbt.com/product/integrations). + + + + + +Yes, all of this metadata is accessible via the [dbt Cloud Admin API](/docs/dbt-cloud-apis/admin-cloud-api). This metadata can be fed into a monitoring tool, or used to create reports and dashboards. + +We also expose some of this information in dbt Cloud itself in [jobs](/docs/deploy/jobs), [environments](https://docs.getdbt.com/docs/environments-in-dbt) and in [dbt Explorer](https://www.getdbt.com/product/dbt-explorer). + + + + + +Like resource dependencies, project dependencies are acyclic, meaning they only move in one direction. This prevents `ref` cycles (or loops). For example, if project B depends on project A, a new model in project A could not import and use a public model from project B. 
Refer to [Project dependencies](/docs/collaborate/govern/project-dependencies#how-to-use-ref) for more information. + + + + +While it’s not currently possible to share sources across projects, it would be possible to have a shared foundational project, with staging models on top of those sources, exposed as “public” models to other teams/projects. + + + + +This would be a breaking change for downstream consumers of that model. If the maintainers of the upstream project wish to remove the model (or “downgrade” its access modifier, effectively the same thing), they should mark that model for deprecation (using [deprecation_date](/reference/resource-properties/deprecation_date)), which will deliver a warning to all downstream consumers of that model. + +In the future, we plan for dbt Cloud to also be able to proactively flag this scenario in [continuous integration](/docs/deploy/continuous-integration) for the maintainers of the upstream public model. + + + + + +No, unless downstream projects are installed as [packages](/docs/build/packages) (source code). In that case, the models in the project installed as a package become “your” models, and you can select or run them. There are cases in which this can be desirable; see docs on [project dependencies](/docs/collaborate/govern/project-dependencies). + + + + + +Yes; as long as they’re in the same data platform (such as BigQuery, Databricks, Redshift, Snowflake, or Starburst) and you have configured permissions and sharing in that data platform provider to allow for this. + + + + + +Yes, because the cross-project collaboration is done using the `{{ ref() }}` macro, you can use those models from other teams in singular tests. + + + + + +Each team defines their connection to the data warehouse, and the default schema names for dbt to use when materializing datasets. 
+ +By default, each project belonging to a team will create: + +- One schema for production runs (for example, `finance`) +- One schema per developer (for example, `dev_jerco`) + +Depending on each team’s needs, this can be customized with model-level [schema configurations](/docs/build/custom-schemas), including the ability to define different rules by environment. + + + + + +No, contracts can currently only be applied at the [model level](/docs/collaborate/govern/model-contracts). It is a recommended best practice to [define staging models](https://docs.getdbt.com/best-practices/how-we-structure/2-staging) on top of sources, and it is possible to define contracts on top of those staging models. + + + + + +No. A contract applies to an entire model, including all columns in the model’s output. This is the same set of columns that a consumer would see when viewing the model’s details in Explorer, or when querying the model in the data platform. If you wish to contract only a subset of columns, you can create a separate model (materialized as a view) selecting only that subset. If you wish to limit which rows or columns a downstream consumer can see when they query the model’s data, depending on who they are, specific data platforms offer advanced capabilities around dynamic row-level access and column-level data masking. + + + + + +No, a [group](/docs/collaborate/govern/model-access#groups) can currently only be assigned a single owner. Note, however, that the assigned owner can be a team, not just an individual. + + + + + +Not directly, but contracts are [assigned to models](/docs/collaborate/govern/model-contracts) and models can be assigned to individual owners. You can use meta fields for this purpose. + + + + + +This is not currently possible, but something we hope to enable in the near future. If you’re interested in this functionality, please reach out to your dbt Labs account team. 
+ + + + +dbt Cloud will soon offer a capability to trigger jobs on the completion of another job, including a job in a different project. This offers one mechanism for executing a pipeline from start to finish across projects. + + + +## Access and permissions + + + +The existence of projects that have at least one public model will be visible to everyone in the organization with read-only access. Private or protected models require a user to have read-only access on the specific project in order to see its existence. + + + + + +There is not currently! But this is something we may evaluate for the future. + + + + + +The "multi-project" view in dbt Cloud Explorer (and the underlying Discovery API) will include any project that has defined at least one "public" model, including the list of public models. (Read more about [model access modifiers](/docs/collaborate/govern/model-access#access-modifiers).) If a user has additional permissions on that project (managed via dbt Cloud RBAC), including read-only permissions, they can explore further into that project and see details about all models (private, protected, and public). + + + + + +Yes! As long as a user has permissions (at least read-only access) on all projects in a dbt Cloud account, they can navigate across the entirety of the organization’s DAG in dbt Explorer, and see models at all levels of detail. + + + +## Compatibility with other features + + + +The [dbt Semantic Layer](/docs/use-dbt-semantic-layer/dbt-sl) and dbt Mesh are complementary mechanisms enabled by dbt Cloud that work together to enhance the management, usability, and governance of data in large-scale data environments. + +The Semantic Layer in dbt Cloud allows teams to centrally define business metrics and dimensions. It ensures consistent and reliable metric definitions across various analytics tools and platforms. 
+ +dbt Mesh enables organizations to split their data architecture into multiple domain-specific projects, while retaining the ability to reference “public” models across projects. It is also possible to reference a “public” model from another project for the purpose of defining semantic models and metrics. In this way, your organization can have multiple dbt projects feed into a unified semantic layer, ensuring that metrics and dimensions are consistently defined and understood across these different domains. + + + + + +**[dbt Explorer](/docs/collaborate/explore-projects)** is a tool within dbt Cloud that serves as a knowledge base and lineage visualization platform. It provides a comprehensive view of your dbt assets, including models, tests, sources, and their interdependencies. + +Used in conjunction with dbt Mesh, dbt Explorer becomes a powerful tool for visualizing and understanding the relationships and dependencies between models across multiple dbt projects. + + + + + +The [dbt Cloud CLI](/docs/cloud/cloud-cli-installation) allows users to develop and run dbt commands from their preferred development environments, like VS Code, Sublime Text, or terminal interfaces. This flexibility is particularly beneficial in a dbt Mesh setup, where managing multiple projects can be complex. Developers can work in their preferred tools while leveraging the centralized capabilities of dbt Cloud. + + + + + +The [dbt Cloud CLI](/docs/cloud/cloud-cli-installation) allows users to develop and run dbt commands from their preferred development environments, like VS Code, Sublime Text, or terminal interfaces. This flexibility is particularly beneficial in a dbt Mesh setup, where managing multiple projects can be complex. Developers can work in their preferred tools while leveraging the centralized capabilities of dbt Cloud. 
+ + + +## When and where dbt Mesh is available + + + +Yes — your account must be on at least dbt v1.6 to take advantage of [cross-project dependencies](/docs/collaborate/govern/project-dependencies), one of the most crucial underlying capabilities required to implement a dbt Mesh. + + + + + +Not all of them. While dbt Core provides some of the foundational elements for dbt Mesh, dbt Cloud offers an enhanced experience that leverages these elements for scaled collaboration across multiple teams. + +Many of the key components that underpin dbt Mesh functionality, such as model contracts, versions, and access modifier levels (public and private), are available in dbt Core. To enable cross-project dependencies, users can also leverage [packages](/docs/build/packages). This enables users to import models from an upstream project, which allows the resolution of cross-project references. + +The major distinction comes with dbt Cloud's metadata service, which is unique to the dbt Cloud platform and allows for the resolution of references to only the public models in a project. This service enables users to take dependencies on resources from upstream projects without needing to load the full complexity of those upstream projects into their local development environment. + + + + + +Yes, a [dbt Cloud Enterprise](https://www.getdbt.com/pricing) plan is required to set up multiple projects and reference models across them. + + + +## Tips on implementing dbt Mesh + + + +Refer to our developer guide on [How we structure our dbt Mesh projects](https://docs.getdbt.com/best-practices/how-we-mesh/mesh-1-intro). You may also be interested in watching the recording of this talk from Coalesce 2023: [Unlocking model governance and multi-project deployments with dbt-meshify](https://docs.getdbt.com/best-practices/how-we-mesh/mesh-1-intro). 
+ + + + +`dbt-meshify` is a [CLI tool](https://github.com/dbt-labs/dbt-meshify) that automates the creation of model governance and cross-project lineage features introduced in dbt-core v1.5 and v1.6. This package will leverage your dbt project metadata to create and/or edit the files in your project to properly configure the models in your project with these features. From a49d93908af200473c2f2dd6c48bd5ef7427e9fd Mon Sep 17 00:00:00 2001 From: mirnawong1 Date: Wed, 3 Jan 2024 12:06:00 -0500 Subject: [PATCH 03/31] remove toggle --- .../best-practices/how-we-mesh/mesh-4-faqs.md | 38 ++++++------------- 1 file changed, 11 insertions(+), 27 deletions(-) diff --git a/website/docs/best-practices/how-we-mesh/mesh-4-faqs.md b/website/docs/best-practices/how-we-mesh/mesh-4-faqs.md index da9e26f20f9..783a79fc6ff 100644 --- a/website/docs/best-practices/how-we-mesh/mesh-4-faqs.md +++ b/website/docs/best-practices/how-we-mesh/mesh-4-faqs.md @@ -18,20 +18,14 @@ The following frequently asked questions (FAQs) are categorized into the followi ## Overview of Mesh - - +**What is dbt Mesh?**
dbt Mesh is a new architecture enabled by dbt Cloud. It allows you to better manage complexity by deploying multiple interconnected dbt projects instead of a single large, monolithic project. It’s designed to accelerate development, without compromising governance. -
- - +**How do I implement contracts for my models?**
In dbt, [model contracts](/docs/collaborate/govern/model-contracts) serve as a governance tool enabling the definition and enforcement of data structure standards in your dbt models. They allow you to specify and uphold data model guarantees, including column data types, allowing for stability of dependent models. Should a model fail to adhere to its established contracts, it will not successfully build. -
- - - -dbt [model versions](https://docs.getdbt.com/docs/collaborate/govern/model-versions) are iterations of your dbt models made over time. In many cases, you may knowingly choose to change a model’s structure in a way that “breaks” the previous model contract — when you do so, you find it useful to create a new version of the model to signify this change. +**What are model versions?**
+dbt [model versions](/docs/collaborate/govern/model-versions) are iterations of your dbt models made over time. In many cases, you may knowingly choose to change a model’s structure in a way that “breaks” the previous model contract — when you do so, you find it useful to create a new version of the model to signify this change. You can use use **model versions** to: @@ -39,34 +33,24 @@ You can use use **model versions** to: - Bump the latest version, to be used as the canonical source of truth - Offer a migration window off the "old" version -
- - - +**What are model access modifiers?**
A [model access modifier](/docs/collaborate/govern/model-access) in dbt is a feature that defines the level of accessibility of a model to other parts of the dbt project, or to other dbt projects. It specifies who can reference a model using [the `ref` function](https://docs.getdbt.com/reference/dbt-jinja-functions/ref). There are three types of access modifiers: 1. **Private:** A model with a private access modifier is only referenceable by models within the same group. This is intended for models that are implementation details and are meant to be used only within a specific group of related models. 2. **Protected:** Models with a protected access modifier can be referenced by any other model within the same dbt project or when the project is installed as a package. This is the default setting for all models, ensuring backward compatibility, especially when groups are assigned to an existing set of models. 3. **Public:** A public model can be referenced across different groups, packages, or projects. This is suitable for stable and mature models that serve as interfaces for other teams or projects. -
- - - +**What are model groups?**
A model group in dbt is a concept used to organize models under a common category or ownership. This categorization can be based on various criteria, such as the team responsible for the models or the specific data source they model, like GitHub data. -
- - - -1. **Agility in Development**: With a more modular architecture, teams can make changes rapidly and independently in specific areas without impacting the entire system, leading to faster development cycles. -2. **Improved Collaboration**: Teams are able to share and build upon each other's work without duplicating efforts. -3. **Reduced Complexity**: By organizing transformation logic into distinct domains, dbt Mesh reduces the complexity inherent in large, monolithic projects, making them easier to manage and understand. -4. **Improved Data Trust.** Adopting a dbt Mesh can help ensure that changes in one domain's data models do not unexpectedly break dependencies in other domain areas, leading to a more secure and predictable data environment. +**What are the main benefits of implementing dbt Mesh?**
+1. **Agility in development**: With a more modular architecture, teams can make changes rapidly and independently in specific areas without impacting the entire system, leading to faster development cycles. +2. **Improved collaboration**: Teams are able to share and build upon each other's work without duplicating efforts. +3. **Reduced complexity**: By organizing transformation logic into distinct domains, dbt Mesh reduces the complexity inherent in large, monolithic projects, making them easier to manage and understand. +4. **Improved data trust.** Adopting a dbt Mesh can help ensure that changes in one domain's data models do not unexpectedly break dependencies in other domain areas, leading to a more secure and predictable data environment. Most importantly, all this can be accomplished without the central data team losing the ability to see lineage across the entire organization, or compromising on governance mechanisms. -
From 4e4572708a4710c508ccc9c78a4feee70a345ac3 Mon Sep 17 00:00:00 2001 From: mirnawong1 Date: Wed, 3 Jan 2024 12:19:00 -0500 Subject: [PATCH 04/31] turn to bullets --- .../best-practices/how-we-mesh/mesh-4-faqs.md | 208 +++++------------- 1 file changed, 61 insertions(+), 147 deletions(-) diff --git a/website/docs/best-practices/how-we-mesh/mesh-4-faqs.md b/website/docs/best-practices/how-we-mesh/mesh-4-faqs.md index 783a79fc6ff..f7cc450911f 100644 --- a/website/docs/best-practices/how-we-mesh/mesh-4-faqs.md +++ b/website/docs/best-practices/how-we-mesh/mesh-4-faqs.md @@ -18,242 +18,156 @@ The following frequently asked questions (FAQs) are categorized into the followi ## Overview of Mesh -**What is dbt Mesh?**
+1. **What is dbt Mesh?**
dbt Mesh is a new architecture enabled by dbt Cloud. It allows you to better manage complexity by deploying multiple interconnected dbt projects instead of a single large, monolithic project. It’s designed to accelerate development, without compromising governance. -**How do I implement contracts for my models?**
+2. **How do I implement contracts for my models?**
In dbt, [model contracts](/docs/collaborate/govern/model-contracts) serve as a governance tool enabling the definition and enforcement of data structure standards in your dbt models. They allow you to specify and uphold data model guarantees, including column data types, allowing for stability of dependent models. Should a model fail to adhere to its established contracts, it will not successfully build. -**What are model versions?**
+3. **What are model versions?**
dbt [model versions](/docs/collaborate/govern/model-versions) are iterations of your dbt models made over time. In many cases, you may knowingly choose to change a model’s structure in a way that “breaks” the previous model contract — when you do so, you find it useful to create a new version of the model to signify this change. -You can use use **model versions** to: + You can use **model versions** to: -- Test "prerelease" changes (in production, in downstream systems) -- Bump the latest version, to be used as the canonical source of truth -- Offer a migration window off the "old" version + - Test "prerelease" changes (in production, in downstream systems) + - Bump the latest version, to be used as the canonical source of truth + - Offer a migration window off the "old" version -**What are model access modifiers?**
+4. **What are model access modifiers?**
A [model access modifier](/docs/collaborate/govern/model-access) in dbt is a feature that defines the level of accessibility of a model to other parts of the dbt project, or to other dbt projects. It specifies who can reference a model using [the `ref` function](https://docs.getdbt.com/reference/dbt-jinja-functions/ref). There are three types of access modifiers: -1. **Private:** A model with a private access modifier is only referenceable by models within the same group. This is intended for models that are implementation details and are meant to be used only within a specific group of related models. -2. **Protected:** Models with a protected access modifier can be referenced by any other model within the same dbt project or when the project is installed as a package. This is the default setting for all models, ensuring backward compatibility, especially when groups are assigned to an existing set of models. -3. **Public:** A public model can be referenced across different groups, packages, or projects. This is suitable for stable and mature models that serve as interfaces for other teams or projects. + - **Private:** A model with a private access modifier is only referenceable by models within the same group. This is intended for models that are implementation details and are meant to be used only within a specific group of related models. + - **Protected:** Models with a protected access modifier can be referenced by any other model within the same dbt project or when the project is installed as a package. This is the default setting for all models, ensuring backward compatibility, especially when groups are assigned to an existing set of models. + - **Public:** A public model can be referenced across different groups, packages, or projects. This is suitable for stable and mature models that serve as interfaces for other teams or projects. -**What are model groups?**
+5. **What are model groups?**
A model group in dbt is a concept used to organize models under a common category or ownership. This categorization can be based on various criteria, such as the team responsible for the models or the specific data source they model, like GitHub data. -**What are the main benefits of implementing dbt Mesh?**
-1. **Agility in development**: With a more modular architecture, teams can make changes rapidly and independently in specific areas without impacting the entire system, leading to faster development cycles. -2. **Improved collaboration**: Teams are able to share and build upon each other's work without duplicating efforts. -3. **Reduced complexity**: By organizing transformation logic into distinct domains, dbt Mesh reduces the complexity inherent in large, monolithic projects, making them easier to manage and understand. -4. **Improved data trust.** Adopting a dbt Mesh can help ensure that changes in one domain's data models do not unexpectedly break dependencies in other domain areas, leading to a more secure and predictable data environment. +6. **What are the main benefits of implementing dbt Mesh?**
+ - **Agility in development**: With a more modular architecture, teams can make changes rapidly and independently in specific areas without impacting the entire system, leading to faster development cycles. + - **Improved collaboration**: Teams are able to share and build upon each other's work without duplicating efforts. + - **Reduced complexity**: By organizing transformation logic into distinct domains, dbt Mesh reduces the complexity inherent in large, monolithic projects, making them easier to manage and understand. + - **Improved data trust.** Adopting a dbt Mesh can help ensure that changes in one domain's data models do not unexpectedly break dependencies in other domain areas, leading to a more secure and predictable data environment. -Most importantly, all this can be accomplished without the central data team losing the ability to see lineage across the entire organization, or compromising on governance mechanisms. - - - + Most importantly, all this can be accomplished without the central data team losing the ability to see lineage across the entire organization, or compromising on governance mechanisms. +7. **What are some potential drawbacks of using a dbt Mesh?**
This is a new way of working, and the intentionality required to build, and then maintain, cross-project interfaces and dependencies may feel like a slowdown versus what some developers are used to. This way of working introduces intentional friction that makes it more difficult to change everything at once. -Orchestration across multiple projects is also likely to be slightly more challenging for many organizations, although we’re currently developing new functionality that will make this process simpler. - -
+ Orchestration across multiple projects is also likely to be slightly more challenging for many organizations, although we’re currently developing new functionality that will make this process simpler. ## How dbt Mesh works - - +1. **Are there integrations between the dbt Cloud Discovery API and other data cataloging tools that would make it possible to view cross-project lineage in those tools?**
Yes. In addition to being viewable natively through [dbt Explorer](https://www.getdbt.com/product/dbt-explorer), it is possible to view cross-project lineage using partner integrations with data cataloging tools. For a list of available dbt Cloud integrations, refer to the [Integrations page](https://www.getdbt.com/product/integrations). -
- - - +2. **How does dbt handle job run logs? Can it feed them to standard monitoring tools and reports or dashboards?**
Yes, all of this metadata is accessible via the [dbt Cloud Admin API](/docs/dbt-cloud-apis/admin-cloud-api). This metadata can be fed into a monitoring tool, or used to create reports and dashboards. -We also expose some of this information in dbt Cloud itself in [jobs](/docs/deploy/jobs), [environments](https://docs.getdbt.com/docs/environments-in-dbt) and in [dbt Explorer](https://www.getdbt.com/product/dbt-explorer). - -
- - + We also expose some of this information in dbt Cloud itself in [jobs](/docs/deploy/jobs), [environments](/docs/environments-in-dbt) and in [dbt Explorer](https://www.getdbt.com/product/dbt-explorer). -Like resource dependencies, project dependencies are acyclic, meaning they only move in one direction. This prevents `ref` cycles (or loops). For example, if project B depends on project A, a new model in project A could not import and use a public model from project B. Refer to [Project dependencies](/docs/collaborate/govern/project-dependencies#how-to-use-ref) for more information. - - - - +3. **Can dbt Mesh handle cyclic dependencies between projects?**
+Like resource dependencies, project dependencies are acyclic, meaning they only move in one direction. This prevents `ref` cycles (or loops). For example, if project B depends on project A, a new model in project A could not import and use a public model from project B. Refer to [Project dependencies](/docs/collaborate/govern/project-dependencies#how-to-write-cross-project-ref) for more information. +4. **Is it possible for multiple projects to directly reference a shared source?**
While it’s not currently possible to share sources across projects, it would be possible to have a shared foundational project, with staging models on top of those sources, exposed as “public” models to other teams/projects. -
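As a minimal sketch of this pattern, the shared foundational project could mark a staging model as public in its properties file. The file path and the model name `stg_github__issues` are hypothetical; only the `access` property comes from dbt's model access feature.

```yaml
# models/staging/_staging.yml in the shared foundational project (hypothetical path)
models:
  - name: stg_github__issues   # hypothetical staging model built on top of a source
    access: public             # allows other projects to reference this model with ref()
```

Downstream projects would then reference the staging model rather than the underlying source.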
- - - +5. **What if a model I've already built on from another project later becomes protected?**
This would be a breaking change for downstream consumers of that model. If the maintainers of the upstream project wish to remove the model (or “downgrade” its access modifier, effectively the same thing), they should mark that model for deprecation (using [deprecation_date](/reference/resource-properties/deprecation_date)), which will deliver a warning to all downstream consumers of that model. -In the future, we plan for dbt Cloud to also be able to proactively flag this scenario in [continuous integration](/docs/deploy/continuous-integration) for the maintainers of the upstream public model. - -
- - + In the future, we plan for dbt Cloud to also be able to proactively flag this scenario in [continuous integration](/docs/deploy/continuous-integration) for the maintainers of the upstream public model. +6. **If I run `dbt build --select +model`, will this trigger a run of upstream models in other projects?**
No, unless the upstream projects are installed as [packages](/docs/build/packages) (source code). In that case, the models in the project installed as a package become “your” models, and you can select or run them. There are cases in which this can be desirable; see docs on [project dependencies](/docs/collaborate/govern/project-dependencies). -
- - - +7. **If each project/domain has its own data warehouse, is it still possible to build models across them?**
Yes; as long as they’re in the same data platform (such as BigQuery, Databricks, Redshift, Snowflake, or Starburst) and you have configured permissions and sharing in that data platform provider to allow for this. -
- - - +8. **Can I run tests that involve tables from multiple different projects?**
Yes, because the cross-project collaboration is done using the `{{ ref() }}` macro, you can use those models from other teams in singular tests. -
- - - +9. **Which team's data schema would dbt Mesh create?**
Each team defines their connection to the data warehouse, and the default schema names for dbt to use when materializing datasets. -By default, each project belonging to a team will create: - -- One schema for production runs (for example, `finance`) -- One schema per developer (for example, `dev_jerco`) - -Depending on each team’s needs, this can be customized with model-level [schema configurations](/docs/build/custom-schemas), including the ability to define different rules by environment. + By default, each project belonging to a team will create: -
+ - One schema for production runs (for example, `finance`) + - One schema per developer (for example, `dev_jerco`) - + Depending on each team’s needs, this can be customized with model-level [schema configurations](/docs/build/custom-schemas), including the ability to define different rules by environment. +10. **Is it possible to apply model contracts to source data?**
No, contracts can currently only be applied at the [model level](/docs/collaborate/govern/model-contracts). It is a recommended best practice to [define staging models](https://docs.getdbt.com/best-practices/how-we-structure/2-staging) on top of sources, and it is possible to define contracts on top of those staging models. -
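A minimal sketch of a contract on a staging model might look like the following. The model name, column names, and data types are hypothetical, and the valid `data_type` values depend on your data platform.

```yaml
# models/staging/_staging.yml (hypothetical path)
models:
  - name: stg_orders            # hypothetical staging model defined on top of a source
    config:
      contract:
        enforced: true          # the build fails if the model's output does not match
    columns:
      - name: order_id
        data_type: int
        constraints:
          - type: not_null
      - name: ordered_at
        data_type: timestamp
```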
- - - +11. **Can contracts be partially enforced (for example, to ensure specific columns exist and go unchanged, but allow for other columns to be added or removed)?**
No. A contract applies to an entire model, including all columns in the model’s output. This is the same set of columns that a consumer would see when viewing the model’s details in Explorer, or when querying the model in the data platform. If you wish to contract only a subset of columns, you can create a separate model (materialized as a view) selecting only that subset. If you wish to limit which rows or columns a downstream consumer can see when they query the model’s data, depending on who they are, specific data platforms offer advanced capabilities around dynamic row-level access and column-level data masking. -
- - - +12. **Can you have multiple owners in a group?**
No, a [group](/docs/collaborate/govern/model-access#groups) can currently only be assigned a single owner. Note, however, that the assigned owner can be a team, not just an individual. -
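For illustration, a group owned by a team rather than an individual could be declared roughly as follows. The group name, owner details, and model name are hypothetical.

```yaml
# models/_groups.yml (hypothetical path)
groups:
  - name: finance
    owner:
      name: Finance Analytics Team    # a team can be the single owner
      email: finance-data@example.com

models:
  - name: fct_revenue                 # hypothetical model assigned to the group
    config:
      group: finance
```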
- - - +13. **Can contracts be assigned individual owners?**
Not directly, but contracts are [assigned to models](/docs/collaborate/govern/model-contracts) and models can be assigned to individual owners. You can use meta fields for this purpose. -
- - - +14. **Can I make a model “public” only for specific team(s) to use?**
This is not currently possible, but something we hope to enable in the near future. If you’re interested in this functionality, please reach out to your dbt Labs account team. -
- - - +15. **Is it possible to orchestrate job runs across multiple different projects?**
dbt Cloud will soon offer a capability to trigger jobs on the completion of another job, including a job in a different project. This offers one mechanism for executing a pipeline from start to finish across projects. -
## Access and permissions - - +1. **How do user access permissions work in a dbt Mesh? Who in the organization can see which projects and models?**
The existence of projects that have at least one public model will be visible to everyone in the organization with read-only access. Private or protected models require a user to have read-only access on the specific project in order to see its existence. -
- - - +2. **Is it possible to request access permissions from other teams? Is there a built-in workflow for this in dbt Cloud?**
There is not currently! But this is something we may evaluate for the future. -
- - - +3. **Can projects be hidden for confidentiality?**
The "multi-project" view in dbt Cloud Explorer (and the underlying Discovery API) will include any project that has defined at least one "public" model, including the list of public models. (Read more about [model access modifiers](/docs/collaborate/govern/model-access#access-modifiers).) If a user has additional permissions on that project (managed via dbt Cloud RBAC), including read-only permissions, they can explore further into that project and see details about all models (private, protected, and public). -
- - - +4. **As a central data team member, can I still maintain visibility on the entire organizational DAG?**
Yes! As long as a user has permissions (at least read-only access) on all projects in a dbt Cloud account, they can navigate across the entirety of the organization’s DAG in dbt Explorer, and see models at all levels of detail. -
- ## Compatibility with other features - - +1. **How does the dbt Semantic Layer relate to and work with dbt Mesh?**
The [dbt Semantic Layer](/docs/use-dbt-semantic-layer/dbt-sl) and dbt Mesh are complementary mechanisms enabled by dbt Cloud that work together to enhance the management, usability, and governance of data in large-scale data environments. -The Semantic Layer in dbt Cloud allows teams to centrally define business metrics and dimensions. It ensures consistent and reliable metric definitions across various analytics tools and platforms. - -dbt Mesh enables organizations to split their data architecture into multiple domain-specific projects, while retaining the ability to reference “public” models across projects. It is also possible to reference a “public” model from another project for the purpose of defining semantic models and metrics. In this way, your organization can have multiple dbt projects feed into a unified semantic layer, ensuring that metrics and dimensions are consistently defined and understood across these different domains. - -
+ The Semantic Layer in dbt Cloud allows teams to centrally define business metrics and dimensions. It ensures consistent and reliable metric definitions across various analytics tools and platforms. - + dbt Mesh enables organizations to split their data architecture into multiple domain-specific projects, while retaining the ability to reference “public” models across projects. It is also possible to reference a “public” model from another project for the purpose of defining semantic models and metrics. In this way, your organization can have multiple dbt projects feed into a unified semantic layer, ensuring that metrics and dimensions are consistently defined and understood across these different domains. -**[dbt Explorer](/docs/collaborate/explore-projects)** is a tool within dbt Cloud that serves as a knowledge base and lineage visualization platform. It provides a comprehensive view of your dbt assets, including models, tests, sources, and their interdependencies. +2. **How does dbt Explorer relate to and work with dbt Mesh?**
+[dbt Explorer](/docs/collaborate/explore-projects) is a tool within dbt Cloud that serves as a knowledge base and lineage visualization platform. It provides a comprehensive view of your dbt assets, including models, tests, sources, and their interdependencies. -Used in conjunction with dbt Mesh, dbt Explorer becomes a powerful tool for visualizing and understanding the relationships and dependencies between models across multiple dbt projects. - -
- - + Used in conjunction with dbt Mesh, dbt Explorer becomes a powerful tool for visualizing and understanding the relationships and dependencies between models across multiple dbt projects. +3. **How does the dbt Cloud CLI relate to and work with dbt Mesh?**
The [dbt Cloud CLI](/docs/cloud/cloud-cli-installation) allows users to develop and run dbt commands from their preferred development environments, like VS Code, Sublime Text, or terminal interfaces. This flexibility is particularly beneficial in a dbt Mesh setup, where managing multiple projects can be complex. Developers can work in their preferred tools while leveraging the centralized capabilities of dbt Cloud. -
- ## When and where dbt Mesh is available - - +1. **Does dbt Mesh require me to be on a specific version of dbt?**
Yes — your account must be on at least dbt v1.6 to take advantage of [cross-project dependencies](/docs/collaborate/govern/project-dependencies), one of the most crucial underlying capabilities required to implement a dbt Mesh. -
- - - +2. **Is there a way to leverage dbt Mesh capabilities in dbt Core?**
Not all of them. While dbt Core provides some of the foundational elements for dbt Mesh, dbt Cloud offers an enhanced experience that leverages these elements for scaled collaboration across multiple teams. -Many of the key components that underpin dbt Mesh functionality, such as model contracts, versions, and access modifier levels (public and private), are available in dbt Core. To enable cross-project dependencies, users can also leverage [packages](/docs/build/packages). This enables users to import models from an upstream project, which allows the resolution of cross-project references. - -The major distinction comes with dbt Cloud's metadata service, which is unique to the dbt Cloud platform and allows for the resolution of references to only the public models in a project. This service enables users to take dependencies on resources from upstream projects without needing to load the full complexity of those upstream projects into their local development environment. + Many of the key components that underpin dbt Mesh functionality, such as model contracts, versions, and access modifier levels (public and private), are available in dbt Core. To enable cross-project dependencies, users can also leverage [packages](/docs/build/packages). This enables users to import models from an upstream project, which allows the resolution of cross-project references. -
- - + The major distinction comes with dbt Cloud's metadata service, which is unique to the dbt Cloud platform and allows for the resolution of references to only the public models in a project. This service enables users to take dependencies on resources from upstream projects without needing to load the full complexity of those upstream projects into their local development environment. +3. **Does dbt Mesh require a specific dbt Cloud plan?**
Yes, a [dbt Cloud Enterprise](https://www.getdbt.com/pricing) plan is required to set up multiple projects and reference models across them. -
- ## Tips on implementing dbt Mesh - - +1. **Is there a recommended migration or implementation process?**
Refer to our developer guide on [How we structure our dbt Mesh projects](https://docs.getdbt.com/best-practices/how-we-mesh/mesh-1-intro). You may also be interested in watching the recording of this talk from Coalesce 2023: [Unlocking model governance and multi-project deployments with dbt-meshify](https://docs.getdbt.com/best-practices/how-we-mesh/mesh-1-intro). -
- - - +2. **Are there tools available to help me migrate to a dbt Mesh?**
`dbt-meshify` is a [CLI tool](https://github.com/dbt-labs/dbt-meshify) that automates the creation of model governance and cross-project lineage features introduced in dbt-core v1.5 and v1.6. This package will leverage your dbt project metadata to create and/or edit the files in your project to properly configure the models in your project with these features. -
From 4f0352849126ac03442fd6e3f0eeecaeb4ab8e94 Mon Sep 17 00:00:00 2001 From: mirnawong1 Date: Wed, 3 Jan 2024 12:39:39 -0500 Subject: [PATCH 05/31] shorten faqs --- .../best-practices/how-we-mesh/mesh-4-faqs.md | 233 +++++++++++++----- 1 file changed, 167 insertions(+), 66 deletions(-) diff --git a/website/docs/best-practices/how-we-mesh/mesh-4-faqs.md b/website/docs/best-practices/how-we-mesh/mesh-4-faqs.md index f7cc450911f..92fbb56185f 100644 --- a/website/docs/best-practices/how-we-mesh/mesh-4-faqs.md +++ b/website/docs/best-practices/how-we-mesh/mesh-4-faqs.md @@ -18,156 +18,257 @@ The following frequently asked questions (FAQs) are categorized into the followi ## Overview of Mesh -1. **What is dbt Mesh?**
+ + dbt Mesh is a new architecture enabled by dbt Cloud. It allows you to better manage complexity by deploying multiple interconnected dbt projects instead of a single large, monolithic project. It’s designed to accelerate development, without compromising governance. + + + + +In dbt, [model contracts](/docs/collaborate/govern/model-contracts) serve as a governance tool enabling the definition and enforcement of data structure standards in your dbt models. They allow you to specify and uphold data model guarantees, including column data types, allowing for the stability of dependent models. Should a model fail to adhere to its established contracts, it will not successfully build. + + + + -2. **How do I implement contracts for my models?**
-In dbt, [model contracts](/docs/collaborate/govern/model-contracts) serve as a governance tool enabling the definition and enforcement of data structure standards in your dbt models. They allow you to specify and uphold data model guarantees, including column data types, allowing for stability of dependent models. Should a model fail to adhere to its established contracts, it will not successfully build. +dbt [model versions](https://docs.getdbt.com/docs/collaborate/govern/model-versions) are iterations of your dbt models made over time. In many cases, you may knowingly choose to change a model’s structure in a way that “breaks” the previous model contract — when you do so, you find it useful to create a new version of the model to signify this change. -3. **What are model versions?**
-dbt [model versions](/docs/collaborate/govern/model-versions) are iterations of your dbt models made over time. In many cases, you may knowingly choose to change a model’s structure in a way that “breaks” the previous model contract — when you do so, you find it useful to create a new version of the model to signify this change. +You can use **model versions** to: - You can use use **model versions** to: +- Test "prerelease" changes (in production, in downstream systems) +- Bump the latest version, to be used as the canonical source of truth +- Offer a migration window off the "old" version - - Test "prerelease" changes (in production, in downstream systems) - - Bump the latest version, to be used as the canonical source of truth - - Offer a migration window off the "old" version +
+ + -4. **What are model access modifiers?**
A [model access modifier](/docs/collaborate/govern/model-access) in dbt is a feature that defines the level of accessibility of a model to other parts of the dbt project, or to other dbt projects. It specifies who can reference a model using [the `ref` function](https://docs.getdbt.com/reference/dbt-jinja-functions/ref). There are three types of access modifiers: - - **Private:** A model with a private access modifier is only referenceable by models within the same group. This is intended for models that are implementation details and are meant to be used only within a specific group of related models. - - **Protected:** Models with a protected access modifier can be referenced by any other model within the same dbt project or when the project is installed as a package. This is the default setting for all models, ensuring backward compatibility, especially when groups are assigned to an existing set of models. - - **Public:** A public model can be referenced across different groups, packages, or projects. This is suitable for stable and mature models that serve as interfaces for other teams or projects. +1. **Private:** A model with a private access modifier is only referenceable by models within the same group. This is intended for models that are implementation details and are meant to be used only within a specific group of related models. +2. **Protected:** Models with a protected access modifier can be referenced by any other model within the same dbt project or when the project is installed as a package. This is the default setting for all models, ensuring backward compatibility, especially when groups are assigned to an existing set of models. +3. **Public:** A public model can be referenced across different groups, packages, or projects. This is suitable for stable and mature models that serve as interfaces for other teams or projects. + +
+ + -5. **What are model groups?**
A model group in dbt is a concept used to organize models under a common category or ownership. This categorization can be based on various criteria, such as the team responsible for the models or the specific data source they model, like GitHub data. -6. **What are the main benefits of implementing dbt Mesh?**
- - **Agility in development**: With a more modular architecture, teams can make changes rapidly and independently in specific areas without impacting the entire system, leading to faster development cycles. - - **Improved collaboration**: Teams are able to share and build upon each other's work without duplicating efforts. - - **Reduced complexity**: By organizing transformation logic into distinct domains, dbt Mesh reduces the complexity inherent in large, monolithic projects, making them easier to manage and understand. - - **Improved data trust.** Adopting a dbt Mesh can help ensure that changes in one domain's data models do not unexpectedly break dependencies in other domain areas, leading to a more secure and predictable data environment. +
+ + - Most importantly, all this can be accomplished without the central data team losing the ability to see lineage across the entire organization, or compromising on governance mechanisms. +1. **Agility in Development**: With a more modular architecture, teams can make changes rapidly and independently in specific areas without impacting the entire system, leading to faster development cycles. +2. **Improved Collaboration**: Teams are able to share and build upon each other's work without duplicating efforts. +3. **Reduced Complexity**: By organizing transformation logic into distinct domains, dbt Mesh reduces the complexity inherent in large, monolithic projects, making them easier to manage and understand. +4. **Improved Data Trust.** Adopting a dbt Mesh can help ensure that changes in one domain's data models do not unexpectedly break dependencies in other domain areas, leading to a more secure and predictable data environment. + +Most importantly, all this can be accomplished without the central data team losing the ability to see lineage across the entire organization, or compromising on governance mechanisms. + + + + -7. **What are some potential drawbacks of using a dbt Mesh?**
This is a new way of working, and the intentionality required to build, and then maintain, cross-project interfaces and dependencies may feel like a slowdown versus what some developers are used to. This way of working introduces intentional friction that makes it more difficult to change everything at once. - Orchestration across multiple projects is also likely to be slightly more challenging for many organizations, although we’re currently developing new functionality that will make this process simpler. +Orchestration across multiple projects is also likely to be slightly more challenging for many organizations, although we’re currently developing new functionality that will make this process simpler. + +
## How dbt Mesh works -1. **Are there integrations between the dbt Cloud Discovery API and other data cataloging tools that would make it possible to view cross-project lineage in those tools?**
+ + Yes. In addition to being viewable natively through [dbt Explorer](https://www.getdbt.com/product/dbt-explorer), it is possible to view cross-project lineage using partner integrations with data cataloging tools. For a list of available dbt Cloud integrations, refer to the [Integrations page](https://www.getdbt.com/product/integrations). -2. **How does dbt handle job run logs? Can it feed them to standard monitoring tools and reports or dashboards?**
+
+ + + Yes, all of this metadata is accessible via the [dbt Cloud Admin API](/docs/dbt-cloud-apis/admin-cloud-api). This metadata can be fed into a monitoring tool, or used to create reports and dashboards. - We also expose some of this information in dbt Cloud itself in [jobs](/docs/deploy/jobs), [environments](/docs/environments-in-dbt) and in [dbt Explorer](https://www.getdbt.com/product/dbt-explorer). +We also expose some of this information in dbt Cloud itself in [jobs](/docs/deploy/jobs), [environments](https://docs.getdbt.com/docs/environments-in-dbt) and in [dbt Explorer](https://www.getdbt.com/product/dbt-explorer). + + -3. **Can dbt Mesh handle cyclic dependencies between projects?**
-Like resource dependencies, project dependencies are acyclic, meaning they only move in one direction. This prevents `ref` cycles (or loops). For example, if project B depends on project A, a new model in project A could not import and use a public model from project B. Refer to [Project dependencies](/docs/collaborate/govern/project-dependencies#how-to-write-cross-project-ref) for more information. + + +Like resource dependencies, project dependencies are acyclic, meaning they only move in one direction. This prevents `ref` cycles (or loops). For example, if project B depends on project A, a new model in project A could not import and use a public model from project B. Refer to [Project dependencies](/docs/collaborate/govern/project-dependencies#how-to-use-ref) for more information. + + + + -4. **Is it possible for multiple projects to directly reference a shared source?**
While it’s not currently possible to share sources across projects, it would be possible to have a shared foundational project, with staging models on top of those sources, exposed as “public” models to other teams/projects. -5. **What if a model I've already built on from another project later becomes protected?**
+
+ + + This would be a breaking change for downstream consumers of that model. If the maintainers of the upstream project wish to remove the model (or “downgrade” its access modifier, effectively the same thing), they should mark that model for deprecation (using [deprecation_date](/reference/resource-properties/deprecation_date)), which will deliver a warning to all downstream consumers of that model. - In the future, we plan for dbt Cloud to also be able to proactively flag this scenario in [continuous integration](/docs/deploy/continuous-integration) for the maintainers of the upstream public model. +In the future, we plan for dbt Cloud to also be able to proactively flag this scenario in [continuous integration](/docs/deploy/continuous-integration) for the maintainers of the upstream public model. + + + + -6. **If I run `dbt build --select +model`, will this trigger a run of upstream models in other projects?**
No, unless the upstream projects are installed as [packages](/docs/build/packages) (source code). In that case, the models in the project installed as a package become “your” models, and you can select or run them. There are cases in which this can be desirable; see docs on [project dependencies](/docs/collaborate/govern/project-dependencies). -7. **If each project/domain has its own data warehouse, is it still possible to build models across them?**
+
+ + + Yes; as long as they’re in the same data platform (such as BigQuery, Databricks, Redshift, Snowflake, or Starburst) and you have configured permissions and sharing in that data platform provider to allow for this. -8. **Can I run tests that involve tables from multiple different projects?**
+
+ + + Yes, because the cross-project collaboration is done using the `{{ ref() }}` macro, you can use those models from other teams in singular tests. -9. **Which team's data schema would dbt Mesh create?**
+
+ + + Each team defines their connection to the data warehouse, and the default schema names for dbt to use when materializing datasets. - By default, each project belonging to a team will create: +By default, each project belonging to a team will create: - - One schema for production runs (for example, `finance`) - - One schema per developer (for example, `dev_jerco`) +- One schema for production runs (for example, `finance`) +- One schema per developer (for example, `dev_jerco`) - Depending on each team’s needs, this can be customized with model-level [schema configurations](/docs/build/custom-schemas), including the ability to define different rules by environment. +Depending on each team’s needs, this can be customized with model-level [schema configurations](/docs/build/custom-schemas), including the ability to define different rules by environment. + + + + -10. **Is it possible to apply model contracts to source data?**
No, contracts can currently only be applied at the [model level](/docs/collaborate/govern/model-contracts). It is a recommended best practice to [define staging models](https://docs.getdbt.com/best-practices/how-we-structure/2-staging) on top of sources, and it is possible to define contracts on top of those staging models. -11. **Can contracts be partially enforced (for example, to ensure specific columns exist and go unchanged, but allow for other columns to be added or removed)?**
-No. A contract applies to an entire model, including all columns in the model’s output. This is the same set of columns that a consumer would see when viewing the model’s details in Explorer, or when querying the model in the data platform. If you wish to contract only a subset of columns, you can create a separate model (materialized as a view) selecting only that subset. If you wish to limit which rows or columns a downstream consumer can see when they query the model’s data, depending on who they are, specific data platforms offer advanced capabilities around dynamic row-level access and column-level data masking. +
+ + + +No. A contract applies to an entire model, including all columns in the model’s output. This is the same set of columns that a consumer would see when viewing the model’s details in Explorer, or when querying the model in the data platform. +- This means it's not possible to selectively enforce contracts on specific columns, such as ensuring certain columns exist and remain unchanged while allowing modifications to others. +- If you wish to contract only a subset of columns, you can create a separate model (materialized as a view) selecting only that subset. If you wish to limit which rows or columns a downstream consumer can see when they query the model’s data, depending on who they are, specific data platforms offer advanced capabilities around dynamic row-level access and column-level data masking. + + + + -12. **Can you have multiple owners in a group?**
No, a [group](/docs/collaborate/govern/model-access#groups) can currently only be assigned a single owner. Note, however, that the assigned owner can be a team, not just an individual. -13. **Can contracts be assigned individual owners?**
+
+ + + Not directly, but contracts are [assigned to models](/docs/collaborate/govern/model-contracts) and models can be assigned to individual owners. You can use meta fields for this purpose. -14. **Can I make a model “public” only for specific team(s) to use?**
+
+ + + This is not currently possible, but something we hope to enable in the near future. If you’re interested in this functionality, please reach out to your dbt Labs account team. -15. **Is it possible to orchestrate job runs across multiple different projects?**
+
+ + + dbt Cloud will soon offer a capability to trigger jobs on the completion of another job, including a job in a different project. This offers one mechanism for executing a pipeline from start to finish across projects. + ## Access and permissions -1. **How do user access permissions work in a dbt Mesh? Who in the organization can see which projects and models?**
-The existence of projects that have at least one public model will be visible to everyone in the organization with read-only access. Private or protected models require a user to have read-only access on the specific project in order to see its existence. + + +The existence of projects that have at least one public model will be visible to everyone in the organization with read-only access. + +Private or protected models require a user to have read-only access on the specific project in order to see its existence. + + + + -2. **Is it possible to request access permissions from other teams? Is there a built-in workflow for this in dbt Cloud?**
There is not currently! But this is something we may evaluate for the future. -3. **Can projects be hidden for confidentiality?**
+
+ + + The "multi-project" view in dbt Cloud Explorer (and the underlying Discovery API) will include any project that has defined at least one "public" model, including the list of public models. (Read more about [model access modifiers](/docs/collaborate/govern/model-access#access-modifiers).) If a user has additional permissions on that project (managed via dbt Cloud RBAC), including read-only permissions, they can explore further into that project and see details about all models (private, protected, and public). -4. **As a central data team member, can I still maintain visibility on the entire organizational DAG?**
+
+ + + Yes! As long as a user has permissions (at least read-only access) on all projects in a dbt Cloud account, they can navigate across the entirety of the organization’s DAG in dbt Explorer, and see models at all levels of detail. + + ## Compatibility with other features -1. **How does the dbt Semantic Layer relate to and work with dbt Mesh?**
+ + The [dbt Semantic Layer](/docs/use-dbt-semantic-layer/dbt-sl) and dbt Mesh are complementary mechanisms enabled by dbt Cloud that work together to enhance the management, usability, and governance of data in large-scale data environments. - The Semantic Layer in dbt Cloud allows teams to centrally define business metrics and dimensions. It ensures consistent and reliable metric definitions across various analytics tools and platforms. +The Semantic Layer in dbt Cloud allows teams to centrally define business metrics and dimensions. It ensures consistent and reliable metric definitions across various analytics tools and platforms. - dbt Mesh enables organizations to split their data architecture into multiple domain-specific projects, while retaining the ability to reference “public” models across projects. It is also possible to reference a “public” model from another project for the purpose of defining semantic models and metrics. In this way, your organization can have multiple dbt projects feed into a unified semantic layer, ensuring that metrics and dimensions are consistently defined and understood across these different domains. +dbt Mesh enables organizations to split their data architecture into multiple domain-specific projects, while retaining the ability to reference “public” models across projects. It is also possible to reference a “public” model from another project for the purpose of defining semantic models and metrics. In this way, your organization can have multiple dbt projects feed into a unified semantic layer, ensuring that metrics and dimensions are consistently defined and understood across these different domains. -2. **How does dbt Explorer relate to and work with dbt Mesh?**
-[dbt Explorer](/docs/collaborate/explore-projects) is a tool within dbt Cloud that serves as a knowledge base and lineage visualization platform. It provides a comprehensive view of your dbt assets, including models, tests, sources, and their interdependencies. +
- Used in conjunction with dbt Mesh, dbt Explorer becomes a powerful tool for visualizing and understanding the relationships and dependencies between models across multiple dbt projects. + -3. **How does the dbt Cloud CLI relate to and work with dbt Mesh?**
-The [dbt Cloud CLI](/docs/cloud/cloud-cli-installation) allows users to develop and run dbt commands from their preferred development environments, like VS Code, Sublime Text, or terminal interfaces. This flexibility is particularly beneficial in a dbt Mesh setup, where managing multiple projects can be complex. Developers can work in their preferred tools while leveraging the centralized capabilities of dbt Cloud. +**[dbt Explorer](/docs/collaborate/explore-projects)** is a tool within dbt Cloud that serves as a knowledge base and lineage visualization platform. It provides a comprehensive view of your dbt assets, including models, tests, sources, and their interdependencies. + +Used in conjunction with dbt Mesh, dbt Explorer becomes a powerful tool for visualizing and understanding the relationships and dependencies between models across multiple dbt projects. + +
+ + -4. **How does the dbt Cloud CLI relate to and work with dbt Mesh?**
The [dbt Cloud CLI](/docs/cloud/cloud-cli-installation) allows users to develop and run dbt commands from their preferred development environments, like VS Code, Sublime Text, or terminal interfaces. This flexibility is particularly beneficial in a dbt Mesh setup, where managing multiple projects can be complex. Developers can work in their preferred tools while leveraging the centralized capabilities of dbt Cloud. +
+ + ## When and where dbt Mesh is available -1. **Does dbt Mesh require me to be on a specific version of dbt?**
+ + Yes — your account must be on at least dbt v1.6 to take advantage of [cross-project dependencies](/docs/collaborate/govern/project-dependencies), one of the most crucial underlying capabilities required to implement a dbt Mesh. -2. **Is there a way to leverage dbt Mesh capabilities in dbt Core?**
+
+ + + Not all of them. While dbt Core provides some of the foundational elements for dbt Mesh, dbt Cloud offers an enhanced experience that leverages these elements for scaled collaboration across multiple teams. - Many of the key components that underpin dbt Mesh functionality, such as model contracts, versions, and access modifier levels (public and private), are available in dbt Core. To enable cross-project dependencies, users can also leverage [packages](/docs/build/packages). This enables users to import models from an upstream project, which allows the resolution of cross-project references. +Many of the key components that underpin dbt Mesh functionality, such as model contracts, versions, and access modifier levels (public and private), are available in dbt Core. To enable cross-project dependencies, users can also leverage [packages](/docs/build/packages). This enables users to import models from an upstream project, which allows the resolution of cross-project references. + +The major distinction comes with dbt Cloud's metadata service, which is unique to the dbt Cloud platform and allows for the resolution of references to only the public models in a project. This service enables users to take dependencies on resources from upstream projects without needing to load the full complexity of those upstream projects into their local development environment. + + - The major distinction comes with dbt Cloud's metadata service, which is unique to the dbt Cloud platform and allows for the resolution of references to only the public models in a project. This service enables users to take dependencies on resources from upstream projects without needing to load the full complexity of those upstream projects into their local development environment. + -3. **Does dbt Mesh require a specific dbt Cloud plan?**
Yes, a [dbt Cloud Enterprise](https://www.getdbt.com/pricing) plan is required to set up multiple projects and reference models across them. +
+ ## Tips on implementing dbt Mesh -1. **Is there a recommended migration or implementation process?**
+ + Refer to our developer guide on [How we structure our dbt Mesh projects](https://docs.getdbt.com/best-practices/how-we-mesh/mesh-1-intro). You may also be interested in watching the recording of this talk from Coalesce 2023: [Unlocking model governance and multi-project deployments with dbt-meshify](https://docs.getdbt.com/best-practices/how-we-mesh/mesh-1-intro). -2. **Are there tools available to help me migrate to a dbt Mesh?**
+
+ + + `dbt-meshify` is a [CLI tool](https://github.com/dbt-labs/dbt-meshify) that automates the creation of model governance and cross-project lineage features introduced in dbt-core v1.5 and v1.6. This package will leverage your dbt project metadata to create and/or edit the files in your project to properly configure the models in your project with these features. + From 3953b5a10ac6119f3a2f75b45ddb43da287a3806 Mon Sep 17 00:00:00 2001 From: mirnawong1 Date: Wed, 3 Jan 2024 13:01:40 -0500 Subject: [PATCH 06/31] add faqs --- .../best-practices/how-we-mesh/mesh-4-faqs.md | 12 +- website/docs/faqs/dbt Mesh/dbt-mesh-faqs.md | 272 ++++++++++++++++++ 2 files changed, 278 insertions(+), 6 deletions(-) create mode 100644 website/docs/faqs/dbt Mesh/dbt-mesh-faqs.md diff --git a/website/docs/best-practices/how-we-mesh/mesh-4-faqs.md b/website/docs/best-practices/how-we-mesh/mesh-4-faqs.md index 92fbb56185f..20f857f8c9d 100644 --- a/website/docs/best-practices/how-we-mesh/mesh-4-faqs.md +++ b/website/docs/best-practices/how-we-mesh/mesh-4-faqs.md @@ -9,12 +9,12 @@ dbt Mesh is a new architecture enabled by dbt Cloud. 
It allows you to better man The following frequently asked questions (FAQs) are categorized into the following topics: -- Overview of dbt Mesh -- How dbt Mesh works -- Access and permissions -- Compatibility with other features -- When and where dbt Mesh is available -- Tips on implementing dbt Mesh +- [Overview of Mesh](#overview-of-mesh) +- [How dbt Mesh works](#how-dbt-mesh-works) +- [Access and permissions](#access-and-permissions) +- [Compatibility with other features](#compatibility-with-other-features) +- [When and where dbt Mesh is available](#when-and-where-dbt-mesh-is-available) +- [Tips on implementing dbt Mesh](#tips-on-implementing-dbt-mesh) ## Overview of Mesh diff --git a/website/docs/faqs/dbt Mesh/dbt-mesh-faqs.md b/website/docs/faqs/dbt Mesh/dbt-mesh-faqs.md new file mode 100644 index 00000000000..d13fd449d21 --- /dev/null +++ b/website/docs/faqs/dbt Mesh/dbt-mesh-faqs.md @@ -0,0 +1,272 @@ +--- +title: "dbt Mesh FAQs" +description: "Read some frequently asked questions about dbt Mesh." +--- + +dbt Mesh is a new architecture enabled by dbt Cloud. It allows you to better manage complexity by deploying multiple interconnected dbt projects instead of a single large, monolithic project. It’s designed to accelerate development, without compromising governance. + +The following frequently asked questions (FAQs) are categorized into the following topics: + +- [Overview of Mesh](#overview-of-mesh) +- [How dbt Mesh works](#how-dbt-mesh-works) +- [Access and permissions](#access-and-permissions) +- [Compatibility with other features](#compatibility-with-other-features) +- [When and where dbt Mesh is available](#when-and-where-dbt-mesh-is-available) +- [Tips on implementing dbt Mesh](#tips-on-implementing-dbt-mesh) + +## Overview of Mesh + + + +dbt Mesh is a new architecture enabled by dbt Cloud. It allows you to better manage complexity by deploying multiple interconnected dbt projects instead of a single large, monolithic project. 
It’s designed to accelerate development, without compromising governance. + + + + +In dbt, [model contracts](/docs/collaborate/govern/model-contracts) serve as a governance tool enabling the definition and enforcement of data structure standards in your dbt models. They allow you to specify and uphold data model guarantees, including column data types, allowing for the stability of dependent models. Should a model fail to adhere to its established contracts, it will not successfully build. + + + + + +dbt [model versions](https://docs.getdbt.com/docs/collaborate/govern/model-versions) are iterations of your dbt models made over time. In many cases, you may knowingly choose to change a model’s structure in a way that “breaks” the previous model contract — when you do so, you find it useful to create a new version of the model to signify this change. + +You can use **model versions** to: + +- Test "prerelease" changes (in production, in downstream systems) +- Bump the latest version, to be used as the canonical source of truth +- Offer a migration window off the "old" version + + + + + +A [model access modifier](/docs/collaborate/govern/model-access) in dbt is a feature that defines the level of accessibility of a model to other parts of the dbt project, or to other dbt projects. It specifies who can reference a model using [the `ref` function](https://docs.getdbt.com/reference/dbt-jinja-functions/ref). There are three types of access modifiers: + +1. **Private:** A model with a private access modifier is only referenceable by models within the same group. This is intended for models that are implementation details and are meant to be used only within a specific group of related models. +2. **Protected:** Models with a protected access modifier can be referenced by any other model within the same dbt project or when the project is installed as a package. 
This is the default setting for all models, ensuring backward compatibility, especially when groups are assigned to an existing set of models. +3. **Public:** A public model can be referenced across different groups, packages, or projects. This is suitable for stable and mature models that serve as interfaces for other teams or projects. + + + + + +A model group in dbt is a concept used to organize models under a common category or ownership. This categorization can be based on various criteria, such as the team responsible for the models or the specific data source they model, like GitHub data. + + + + + +1. **Agility in Development**: With a more modular architecture, teams can make changes rapidly and independently in specific areas without impacting the entire system, leading to faster development cycles. +2. **Improved Collaboration**: Teams are able to share and build upon each other's work without duplicating efforts. +3. **Reduced Complexity**: By organizing transformation logic into distinct domains, dbt Mesh reduces the complexity inherent in large, monolithic projects, making them easier to manage and understand. +4. **Improved Data Trust.** Adopting a dbt Mesh can help ensure that changes in one domain's data models do not unexpectedly break dependencies in other domain areas, leading to a more secure and predictable data environment. + +Most importantly, all this can be accomplished without the central data team losing the ability to see lineage across the entire organization, or compromising on governance mechanisms. + + + + + +This is a new way of working, and the intentionality required to build, and then maintain, cross-project interfaces and dependencies may feel like a slowdown versus what some developers are used to. This way of working introduces intentional friction that makes it more difficult to change everything at once. 
+ +Orchestration across multiple projects is also likely to be slightly more challenging for many organizations, although we’re currently developing new functionality that will make this process simpler. + + + +## How dbt Mesh works + + + +Yes. In addition to being viewable natively through [dbt Explorer](https://www.getdbt.com/product/dbt-explorer), it is possible to view cross-project lineage using partner integrations with data cataloging tools. For a list of available dbt Cloud integrations, refer to the [Integrations page](https://www.getdbt.com/product/integrations). + + + + + +Yes, all of this metadata is accessible via the [dbt Cloud Admin API](/docs/dbt-cloud-apis/admin-cloud-api). This metadata can be fed into a monitoring tool, or used to create reports and dashboards. + +We also expose some of this information in dbt Cloud itself in [jobs](/docs/deploy/jobs), [environments](https://docs.getdbt.com/docs/environments-in-dbt) and in [dbt Explorer](https://www.getdbt.com/product/dbt-explorer). + + + + + +Like resource dependencies, project dependencies are acyclic, meaning they only move in one direction. This prevents `ref` cycles (or loops). For example, if project B depends on project A, a new model in project A could not import and use a public model from project B. Refer to [Project dependencies](/docs/collaborate/govern/project-dependencies#how-to-use-ref) for more information. + + + + + +While it’s not currently possible to share sources across projects, it would be possible to have a shared foundational project, with staging models on top of those sources, exposed as “public” models to other teams/projects. + + + + + +This would be a breaking change for downstream consumers of that model. 
If the maintainers of the upstream project wish to remove the model (or “downgrade” its access modifier, effectively the same thing), they should mark that model for deprecation (using [deprecation_date](/reference/resource-properties/deprecation_date)), which will deliver a warning to all downstream consumers of that model. + +In the future, we plan for dbt Cloud to also be able to proactively flag this scenario in [continuous integration](/docs/deploy/continuous-integration) for the maintainers of the upstream public model. + + + + + +No, unless downstream projects are installed as [packages](/docs/build/packages) (source code). In that case, the models in the project installed as a package become “your” models, and you can select or run them. There are cases in which this can be desirable; see docs on [project dependencies](/docs/collaborate/govern/project-dependencies). + + + + + +Yes; as long as they’re in the same data platform (such as BigQuery, Databricks, Redshift, Snowflake, or Starburst) and you have configured permissions and sharing in that data platform provider to allow for this. + + + + + +Yes, because the cross-project collaboration is done using the `{{ ref() }}` macro, you can use those models from other teams in singular tests. + + + + + +Each team defines their connection to the data warehouse, and the default schema names for dbt to use when materializing datasets. + +By default, each project belonging to a team will create: + +- One schema for production runs (for example, `finance`) +- One schema per developer (for example, `dev_jerco`) + +Depending on each team’s needs, this can be customized with model-level [schema configurations](/docs/build/custom-schemas), including the ability to define different rules by environment. + + + + + +No, contracts can currently only be applied at the [model level](/docs/collaborate/govern/model-contracts). 
It is a recommended best practice to [define staging models](https://docs.getdbt.com/best-practices/how-we-structure/2-staging) on top of sources, and it is possible to define contracts on top of those staging models. + + + + + +No. A contract applies to an entire model, including all columns in the model’s output. This is the same set of columns that a consumer would see when viewing the model’s details in Explorer, or when querying the model in the data platform. +- This means it's not possible to selectively enforce contracts on specific columns, such as ensuring certain columns exist and remain unchanged while allowing modifications to others. +- If you wish to contract only a subset of columns, you can create a separate model (materialized as a view) selecting only that subset. If you wish to limit which rows or columns a downstream consumer can see when they query the model’s data, depending on who they are, specific data platforms offer advanced capabilities around dynamic row-level access and column-level data masking. + + + + + +No, a [group](/docs/collaborate/govern/model-access#groups) can currently only be assigned to have a single owner. Note however that the assigned owner can be a team, not just an individual. + + + + + +Not directly, but contracts are [assigned to models](/docs/collaborate/govern/model-contracts) and models can be assigned to individual owners. You can use meta fields for this purpose. + + + + + +This is not currently possible, but something we hope to enable in the near future. If you’re interested in this functionality, please reach out to your dbt Labs account team. + + + + + +dbt Cloud will soon offer a capability to trigger jobs on the completion of another job, including a job in a different project. This offers one mechanism for executing a pipeline from start to finish across projects. 
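As a rough illustration of the contract, group, and ownership answers above, a model's YAML properties might look like the following. This is a minimal sketch, not taken from the FAQ itself: the group name `finance`, the model `stg_payments`, its columns, and the `meta` key `owner` are all hypothetical names chosen for the example.

```yaml
# Hypothetical example: every name and value below is an illustrative assumption.
groups:
  - name: finance
    owner:
      # A group has a single owner, but that owner can be a team.
      name: Finance Team
      email: finance@example.com

models:
  - name: stg_payments          # staging model built on top of a source
    group: finance
    access: public              # referenceable from other projects
    config:
      contract:
        enforced: true          # covers every column the model returns
      meta:
        owner: "jane@example.com"   # free-form meta field for an individual owner
    columns:
      - name: payment_id
        data_type: integer
      - name: amount
        data_type: numeric
```

Because the contract covers the whole model, every column the model returns is declared here with a `data_type`; there is no setting that enforces only some of them.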
+ + + +## Access and permissions + + + +The existence of projects that have at least one public model will be visible to everyone in the organization with read-only access. + +Private or protected models require a user to have read-only access on the specific project in order to see its existence. + + + + + +There is not currently! But this is something we may evaluate for the future. + + + + + +The "multi-project" view in dbt Cloud Explorer (and the underlying Discovery API) will include any project that has defined at least one "public" model, including the list of public models. (Read more about [model access modifiers](/docs/collaborate/govern/model-access#access-modifiers).) If a user has additional permissions on that project (managed via dbt Cloud RBAC), including read-only permissions, they can explore further into that project and see details about all models (private, protected, and public). + + + + + +Yes! As long as a user has permissions (at least read-only access) on all projects in a dbt Cloud account, they can navigate across the entirety of the organization’s DAG in dbt Explorer, and see models at all levels of detail. + + + +## Compatibility with other features + + + +The [dbt Semantic Layer](/docs/use-dbt-semantic-layer/dbt-sl) and dbt Mesh are complementary mechanisms enabled by dbt Cloud that work together to enhance the management, usability, and governance of data in large-scale data environments. + +The Semantic Layer in dbt Cloud allows teams to centrally define business metrics and dimensions. It ensures consistent and reliable metric definitions across various analytics tools and platforms. + +dbt Mesh enables organizations to split their data architecture into multiple domain-specific projects, while retaining the ability to reference “public” models across projects. It is also possible to reference a “public” model from another project for the purpose of defining semantic models and metrics. 
In this way, your organization can have multiple dbt projects feed into a unified semantic layer, ensuring that metrics and dimensions are consistently defined and understood across these different domains. + + + + + +**[dbt Explorer](/docs/collaborate/explore-projects)** is a tool within dbt Cloud that serves as a knowledge base and lineage visualization platform. It provides a comprehensive view of your dbt assets, including models, tests, sources, and their interdependencies. + +Used in conjunction with dbt Mesh, dbt Explorer becomes a powerful tool for visualizing and understanding the relationships and dependencies between models across multiple dbt projects. + + + + + +The [dbt Cloud CLI](/docs/cloud/cloud-cli-installation) allows users to develop and run dbt commands from their preferred development environments, like VS Code, Sublime Text, or terminal interfaces. This flexibility is particularly beneficial in a dbt Mesh setup, where managing multiple projects can be complex. Developers can work in their preferred tools while leveraging the centralized capabilities of dbt Cloud. + + + + +## When and where dbt Mesh is available + + + +Yes — your account must be on at least dbt v1.6 to take advantage of [cross-project dependencies](/docs/collaborate/govern/project-dependencies), one of the most crucial underlying capabilities required to implement a dbt Mesh. + + + + + +Not all of them. While dbt Core provides some of the foundational elements for dbt Mesh, dbt Cloud offers an enhanced experience that leverages these elements for scaled collaboration across multiple teams. + +Many of the key components that underpin dbt Mesh functionality, such as model contracts, versions, and access modifier levels (public and private), are available in dbt Core. To enable cross-project dependencies, users can also leverage [packages](/docs/build/packages). This enables users to import models from an upstream project, which allows the resolution of cross-project references. 
+ +The major distinction comes with dbt Cloud's metadata service, which is unique to the dbt Cloud platform and allows for the resolution of references to only the public models in a project. This service enables users to take dependencies on resources from upstream projects without needing to load the full complexity of those upstream projects into their local development environment. + + + + + +Yes, a [dbt Cloud Enterprise](https://www.getdbt.com/pricing) plan is required to set up multiple projects and reference models across them. + + + +## Tips on implementing dbt Mesh + + + +Refer to our developer guide on [How we structure our dbt Mesh projects](https://docs.getdbt.com/best-practices/how-we-mesh/mesh-1-intro). You may also be interested in watching the recording of this talk from Coalesce 2023: [Unlocking model governance and multi-project deployments with dbt-meshify](https://docs.getdbt.com/best-practices/how-we-mesh/mesh-1-intro). + + + + + +`dbt-meshify` is a [CLI tool](https://github.com/dbt-labs/dbt-meshify) that automates the creation of model governance and cross-project lineage features introduced in dbt-core v1.5 and v1.6. This package will leverage your dbt project metadata to create and/or edit the files in your project to properly configure the models in your project with these features. 
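For the cross-project dependencies discussed above, a downstream project declares its upstream project in a `dependencies.yml` file. A minimal sketch, assuming a hypothetical upstream project named `jaffle_finance` that exposes a public model called `monthly_revenue` (both names are made up for illustration):

```yaml
# dependencies.yml in the downstream project.
# `jaffle_finance` is a hypothetical upstream project name.
projects:
  - name: jaffle_finance

# A downstream model could then select from the upstream public model
# with a two-argument ref (project name, then model name):
#   select * from {{ ref('jaffle_finance', 'monthly_revenue') }}
```

Listing the upstream project under `projects` rather than `packages` is what lets references resolve against only its public models, without pulling its full source code into the downstream project.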
+ From 64ab3d306a88bea2f72f782038ae475074c81bfc Mon Sep 17 00:00:00 2001 From: mirnawong1 Date: Wed, 3 Jan 2024 14:54:26 -0500 Subject: [PATCH 07/31] flag outstanding questions --- .../best-practices/how-we-mesh/mesh-4-faqs.md | 25 +- website/docs/faqs/dbt Mesh/dbt-mesh-faqs.md | 272 ------------------ 2 files changed, 24 insertions(+), 273 deletions(-) delete mode 100644 website/docs/faqs/dbt Mesh/dbt-mesh-faqs.md diff --git a/website/docs/best-practices/how-we-mesh/mesh-4-faqs.md b/website/docs/best-practices/how-we-mesh/mesh-4-faqs.md index 20f857f8c9d..f83dd065b82 100644 --- a/website/docs/best-practices/how-we-mesh/mesh-4-faqs.md +++ b/website/docs/best-practices/how-we-mesh/mesh-4-faqs.md @@ -209,6 +209,18 @@ Yes! As long as a user has permissions (at least read-only access) on all projec
+ + +OUTSTANDING COPY + + + + + +My production environments contain sensitive data. How can I limit my developers from accessing production data when referencing from other projects? + + + ## Compatibility with other features @@ -221,6 +233,12 @@ dbt Mesh enables organizations to split their data architecture into multiple do + + +OUTSTANDING COPY + + + **[dbt Explorer](/docs/collaborate/explore-projects)** is a tool within dbt Cloud that serves as a knowledge base and lineage visualization platform. It provides a comprehensive view of your dbt assets, including models, tests, sources, and their interdependencies. @@ -264,7 +282,7 @@ Yes, a [dbt Cloud Enterprise](https://www.getdbt.com/pricing) plan is required t -Refer to our developer guide on [How we structure our dbt Mesh projects](https://docs.getdbt.com/best-practices/how-we-mesh/mesh-1-intro). You may also be interested in watching the recording of this talk from Coalesce 2023: [Unlocking model governance and multi-project deployments with dbt-meshify](https://docs.getdbt.com/best-practices/how-we-mesh/mesh-1-intro). +Refer to our developer guide on [How we structure our dbt Mesh projects](https://docs.getdbt.com/best-practices/how-we-mesh/mesh-1-intro). You may also be interested in watching the recording of this talk from Coalesce 2023: [Unlocking model governance and multi-project deployments with dbt-meshify](https://www.youtube.com/watch?v=FAsY0Qx8EyU). @@ -272,3 +290,8 @@ Refer to our developer guide on [How we structure our dbt Mesh projects](https:/ `dbt-meshify` is a [CLI tool](https://github.com/dbt-labs/dbt-meshify) that automates the creation of model governance and cross-project lineage features introduced in dbt-core v1.5 and v1.6. This package will leverage your dbt project metadata to create and/or edit the files in your project to properly configure the models in your project with these features. 
+ + + + My team isn’t structured in a way that would require multiple projects today (though that may change in the future). What aspects of dbt Mesh are relevant to me? + diff --git a/website/docs/faqs/dbt Mesh/dbt-mesh-faqs.md b/website/docs/faqs/dbt Mesh/dbt-mesh-faqs.md deleted file mode 100644 index d13fd449d21..00000000000 --- a/website/docs/faqs/dbt Mesh/dbt-mesh-faqs.md +++ /dev/null @@ -1,272 +0,0 @@ ---- -title: "dbt Mesh FAQs" -description: "Read some frequently asked questions about dbt Mesh." ---- - -dbt Mesh is a new architecture enabled by dbt Cloud. It allows you to better manage complexity by deploying multiple interconnected dbt projects instead of a single large, monolithic project. It’s designed to accelerate development, without compromising governance. - -The following frequently asked questions (FAQs) are categorized into the following topics: - -- [Overview of Mesh](#overview-of-mesh) -- [How dbt Mesh works](#how-dbt-mesh-works) -- [Access and permissions](#access-and-permissions) -- [Compatibility with other features](#compatibility-with-other-features) -- [When and where dbt Mesh is available](#when-and-where-dbt-mesh-is-available) -- [Tips on implementing dbt Mesh](#tips-on-implementing-dbt-mesh) - -## Overview of Mesh - - - -dbt Mesh is a new architecture enabled by dbt Cloud. It allows you to better manage complexity by deploying multiple interconnected dbt projects instead of a single large, monolithic project. It’s designed to accelerate development, without compromising governance. - - - - -In dbt, [model contracts](/docs/collaborate/govern/model-contracts) serve as a governance tool enabling the definition and enforcement of data structure standards in your dbt models. They allow you to specify and uphold data model guarantees, including column data types, allowing for the stability of dependent models. Should a model fail to adhere to its established contracts, it will not successfully build. 
- - - - - -dbt [model versions](https://docs.getdbt.com/docs/collaborate/govern/model-versions) are iterations of your dbt models made over time. In many cases, you may knowingly choose to change a model’s structure in a way that “breaks” the previous model contract — when you do so, you find it useful to create a new version of the model to signify this change. - -You can use **model versions** to: - -- Test "prerelease" changes (in production, in downstream systems) -- Bump the latest version, to be used as the canonical source of truth -- Offer a migration window off the "old" version - - - - - -A [model access modifier](/docs/collaborate/govern/model-access) in dbt is a feature that defines the level of accessibility of a model to other parts of the dbt project, or to other dbt projects. It specifies who can reference a model using [the `ref` function](https://docs.getdbt.com/reference/dbt-jinja-functions/ref). There are three types of access modifiers: - -1. **Private:** A model with a private access modifier is only referenceable by models within the same group. This is intended for models that are implementation details and are meant to be used only within a specific group of related models. -2. **Protected:** Models with a protected access modifier can be referenced by any other model within the same dbt project or when the project is installed as a package. This is the default setting for all models, ensuring backward compatibility, especially when groups are assigned to an existing set of models. -3. **Public:** A public model can be referenced across different groups, packages, or projects. This is suitable for stable and mature models that serve as interfaces for other teams or projects. - - - - - -A model group in dbt is a concept used to organize models under a common category or ownership. This categorization can be based on various criteria, such as the team responsible for the models or the specific data source they model, like GitHub data. 
- - - - - -1. **Agility in Development**: With a more modular architecture, teams can make changes rapidly and independently in specific areas without impacting the entire system, leading to faster development cycles. -2. **Improved Collaboration**: Teams are able to share and build upon each other's work without duplicating efforts. -3. **Reduced Complexity**: By organizing transformation logic into distinct domains, dbt Mesh reduces the complexity inherent in large, monolithic projects, making them easier to manage and understand. -4. **Improved Data Trust.** Adopting a dbt Mesh can help ensure that changes in one domain's data models do not unexpectedly break dependencies in other domain areas, leading to a more secure and predictable data environment. - -Most importantly, all this can be accomplished without the central data team losing the ability to see lineage across the entire organization, or compromising on governance mechanisms. - - - - - -This is a new way of working, and the intentionality required to build, and then maintain, cross-project interfaces and dependencies may feel like a slowdown versus what some developers are used to. This way of working introduces intentional friction that makes it more difficult to change everything at once. - -Orchestration across multiple projects is also likely to be slightly more challenging for many organizations, although we’re currently developing new functionality that will make this process simpler. - - - -## How dbt Mesh works - - - -Yes. In addition to being viewable natively through [dbt Explorer](https://www.getdbt.com/product/dbt-explorer), it is possible to view cross-project lineage using partner integrations with data cataloging tools. For a list of available dbt Cloud integrations, refer to the [Integrations page](https://www.getdbt.com/product/integrations). - - - - - -Yes, all of this metadata is accessible via the [dbt Cloud Admin API](/docs/dbt-cloud-apis/admin-cloud-api). 
This metadata can be fed into a monitoring tool, or used to create reports and dashboards. - -We also expose some of this information in dbt Cloud itself in [jobs](/docs/deploy/jobs), [environments](https://docs.getdbt.com/docs/environments-in-dbt) and in [dbt Explorer](https://www.getdbt.com/product/dbt-explorer). - - - - - -Like resource dependencies, project dependencies are acyclic, meaning they only move in one direction. This prevents `ref` cycles (or loops). For example, if project B depends on project A, a new model in project A could not import and use a public model from project B. Refer to [Project dependencies](/docs/collaborate/govern/project-dependencies#how-to-use-ref) for more information. - - - - - -While it’s not currently possible to share sources across projects, it would be possible to have a shared foundational project, with staging models on top of those sources, exposed as “public” models to other teams/projects. - - - - - -This would be a breaking change for downstream consumers of that model. If the maintainers of the upstream project wish to remove the model (or “downgrade” its access modifier, effectively the same thing), they should mark that model for deprecation (using [deprecation_date](/reference/resource-properties/deprecation_date)), which will deliver a warning to all downstream consumers of that model. - -In the future, we plan for dbt Cloud to also be able to proactively flag this scenario in [continuous integration](/docs/deploy/continuous-integration) for the maintainers of the upstream public model. - - - - - -No, unless downstream projects are installed as [packages](/docs/build/packages) (source code). In that case, the models in the project installed as a package become “your” models, and you can select or run them. There are cases in which this can be desirable; see docs on [project dependencies](/docs/collaborate/govern/project-dependencies). 
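The answers above on project dependencies and deprecation can be sketched as two small YAML documents. This is only an illustration under stated assumptions: the project name `jaffle_finance`, the model name `monthly_revenue`, the file paths, and the date are all hypothetical.

```yaml
# dependencies.yml in the downstream project:
# declares the upstream project whose public models can be referenced.
projects:
  - name: jaffle_finance
---
# models/_models.yml in the upstream project (jaffle_finance):
# marks a public model for deprecation so downstream consumers get a warning.
models:
  - name: monthly_revenue
    access: public
    deprecation_date: 2024-12-31
```

A downstream model would then reference the upstream model with the two-argument form of `ref`, for example `{{ ref('jaffle_finance', 'monthly_revenue') }}`. Because dependencies are acyclic, `jaffle_finance` could not in turn declare the downstream project in its own `dependencies.yml`.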
- - - - - -Yes; as long as they’re in the same data platform (such as BigQuery, Databricks, Redshift, Snowflake, or Starburst) and you have configured permissions and sharing in that data platform provider to allow for this. - - - - - -Yes, because the cross-project collaboration is done using the `{{ ref() }}` macro, you can use those models from other teams in singular tests. - - - - - -Each team defines their connection to the data warehouse, and the default schema names for dbt to use when materializing datasets. - -By default, each project belonging to a team will create: - -- One schema for production runs (for example, `finance`) -- One schema per developer (for example, `dev_jerco`) - -Depending on each team’s needs, this can be customized with model-level [schema configurations](/docs/build/custom-schemas), including the ability to define different rules by environment. - - - - - -No, contracts can currently only be applied at the [model level](/docs/collaborate/govern/model-contracts). It is a recommended best practice to [define staging models](https://docs.getdbt.com/best-practices/how-we-structure/2-staging) on top of sources, and it is possible to define contracts on top of those staging models. - - - - - -No. A contract applies to an entire model, including all columns in the model’s output. This is the same set of columns that a consumer would see when viewing the model’s details in Explorer, or when querying the model in the data platform. -- This means it's not possible to selectively enforce contracts on specific columns, such as ensuring certain columns exist and remain unchanged while allowing modifications to others. -- If you wish to contract only a subset of columns, you can create a separate model (materialized as a view) selecting only that subset. 
If you wish to limit which rows or columns a downstream consumer can see when they query the model’s data, depending on who they are, specific data platforms offer advanced capabilities around dynamic row-level access and column-level data masking. - - - - - -No, a [group](/docs/collaborate/govern/model-access#groups) can currently only be assigned to have a single owner. Note however that the assigned owner can be a team, not just an individual. - - - - - -Not directly, but contracts are [assigned to models](/docs/collaborate/govern/model-contracts) and models can be assigned to individual owners. You can use meta fields for this purpose. - - - - - -This is not currently possible, but it is something we hope to enable in the near future. If you’re interested in this functionality, please reach out to your dbt Labs account team. - - - - - -dbt Cloud will soon offer a capability to trigger jobs on the completion of another job, including a job in a different project. This offers one mechanism for executing a pipeline from start to finish across projects. - - - -## Access and permissions - - - -The existence of projects that have at least one public model will be visible to everyone in the organization with read-only access. - -Private or protected models require a user to have read-only access on the specific project in order to see its existence. - - - - - -There is not currently! But this is something we may evaluate for the future. - - - - - -The "multi-project" view in dbt Cloud Explorer (and the underlying Discovery API) will include any project that has defined at least one "public" model, including the list of public models. (Read more about [model access modifiers](/docs/collaborate/govern/model-access#access-modifiers).) If a user has additional permissions on that project (managed via dbt Cloud RBAC), including read-only permissions, they can explore further into that project and see details about all models (private, protected, and public). - - - - - -Yes! 
As long as a user has permissions (at least read-only access) on all projects in a dbt Cloud account, they can navigate across the entirety of the organization’s DAG in dbt Explorer, and see models at all levels of detail. - - - -## Compatibility with other features - - - -The [dbt Semantic Layer](/docs/use-dbt-semantic-layer/dbt-sl) and dbt Mesh are complementary mechanisms enabled by dbt Cloud that work together to enhance the management, usability, and governance of data in large-scale data environments. - -The Semantic Layer in dbt Cloud allows teams to centrally define business metrics and dimensions. It ensures consistent and reliable metric definitions across various analytics tools and platforms. - -dbt Mesh enables organizations to split their data architecture into multiple domain-specific projects, while retaining the ability to reference “public” models across projects. It is also possible to reference a “public” model from another project for the purpose of defining semantic models and metrics. In this way, your organization can have multiple dbt projects feed into a unified semantic layer, ensuring that metrics and dimensions are consistently defined and understood across these different domains. - - - - - -**[dbt Explorer](/docs/collaborate/explore-projects)** is a tool within dbt Cloud that serves as a knowledge base and lineage visualization platform. It provides a comprehensive view of your dbt assets, including models, tests, sources, and their interdependencies. - -Used in conjunction with dbt Mesh, dbt Explorer becomes a powerful tool for visualizing and understanding the relationships and dependencies between models across multiple dbt projects. - - - - - -The [dbt Cloud CLI](/docs/cloud/cloud-cli-installation) allows users to develop and run dbt commands from their preferred development environments, like VS Code, Sublime Text, or terminal interfaces. 
This flexibility is particularly beneficial in a dbt Mesh setup, where managing multiple projects can be complex. Developers can work in their preferred tools while leveraging the centralized capabilities of dbt Cloud. - - - - -## When and where dbt Mesh is available - - - -Yes — your account must be on at least dbt v1.6 to take advantage of [cross-project dependencies](/docs/collaborate/govern/project-dependencies), one of the most crucial underlying capabilities required to implement a dbt Mesh. - - - - - -Not all of them. While dbt Core provides some of the foundational elements for dbt Mesh, dbt Cloud offers an enhanced experience that leverages these elements for scaled collaboration across multiple teams. - -Many of the key components that underpin dbt Mesh functionality, such as model contracts, versions, and access modifier levels (public and private), are available in dbt Core. To enable cross-project dependencies, users can also leverage [packages](/docs/build/packages). This enables users to import models from an upstream project, which allows the resolution of cross-project references. - -The major distinction comes with dbt Cloud's metadata service, which is unique to the dbt Cloud platform and allows for the resolution of references to only the public models in a project. This service enables users to take dependencies on resources from upstream projects without needing to load the full complexity of those upstream projects into their local development environment. - - - - - -Yes, a [dbt Cloud Enterprise](https://www.getdbt.com/pricing) plan is required to set up multiple projects and reference models across them. - - - -## Tips on implementing dbt Mesh - - - -Refer to our developer guide on [How we structure our dbt Mesh projects](https://docs.getdbt.com/best-practices/how-we-mesh/mesh-1-intro). 
You may also be interested in watching the recording of this talk from Coalesce 2023: [Unlocking model governance and multi-project deployments with dbt-meshify](https://docs.getdbt.com/best-practices/how-we-mesh/mesh-1-intro). - - - - - -`dbt-meshify` is a [CLI tool](https://github.com/dbt-labs/dbt-meshify) that automates the creation of model governance and cross-project lineage features introduced in dbt-core v1.5 and v1.6. This package will leverage your dbt project metadata to create and/or edit the files in your project to properly configure the models in your project with these features. - From ab071e8b1306c105562e6c18f519c158abbb85ef Mon Sep 17 00:00:00 2001 From: mirnawong1 Date: Wed, 3 Jan 2024 15:08:51 -0500 Subject: [PATCH 08/31] final tweaks --- website/docs/best-practices/how-we-mesh/mesh-4-faqs.md | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-) diff --git a/website/docs/best-practices/how-we-mesh/mesh-4-faqs.md b/website/docs/best-practices/how-we-mesh/mesh-4-faqs.md index f83dd065b82..d3bd7e308c9 100644 --- a/website/docs/best-practices/how-we-mesh/mesh-4-faqs.md +++ b/website/docs/best-practices/how-we-mesh/mesh-4-faqs.md @@ -209,7 +209,7 @@ Yes! As long as a user has permissions (at least read-only access) on all projec
- + OUTSTANDING COPY From 39c8b41b30274a221df57446f6746887bcb772b5 Mon Sep 17 00:00:00 2001 From: mirnawong1 Date: Wed, 3 Jan 2024 15:15:54 -0500 Subject: [PATCH 09/31] add link to --- website/docs/best-practices/how-we-mesh/mesh-1-intro.md | 2 ++ website/docs/best-practices/how-we-mesh/mesh-4-faqs.md | 4 ++-- 2 files changed, 4 insertions(+), 2 deletions(-) diff --git a/website/docs/best-practices/how-we-mesh/mesh-1-intro.md b/website/docs/best-practices/how-we-mesh/mesh-1-intro.md index ba1660a8d82..b50746fc6b6 100644 --- a/website/docs/best-practices/how-we-mesh/mesh-1-intro.md +++ b/website/docs/best-practices/how-we-mesh/mesh-1-intro.md @@ -32,6 +32,8 @@ dbt Cloud is designed to coordinate the features above and simplify the complexi If you're just starting your dbt journey, don't worry about building a multi-project architecture right away. You can _incrementally_ adopt the features in this guide as you scale. The collection of features work effectively as independent tools. Familiarizing yourself with the tooling and features that make up a multi-project architecture, and how they can apply to your organization will help you make better decisions as you grow. +For additional information, refer to the [dbt Mesh frequently asked questions](/best-practices/how-we-mesh/mesh-4-faqs) (FAQs). + ## Learning goals - Understand the **purpose and tradeoffs** of building a multi-project architecture. diff --git a/website/docs/best-practices/how-we-mesh/mesh-4-faqs.md b/website/docs/best-practices/how-we-mesh/mesh-4-faqs.md index d3bd7e308c9..5d17dc9b995 100644 --- a/website/docs/best-practices/how-we-mesh/mesh-4-faqs.md +++ b/website/docs/best-practices/how-we-mesh/mesh-4-faqs.md @@ -1,8 +1,8 @@ --- title: "dbt Mesh FAQs" -description: "Read some frequently asked questions about dbt Mesh." +description: "Read the FAQs to learn more about dbt Mesh, how it works, compatibility, and more." 
hoverSnippet: "dbt Mesh FAQs" -sidebar_label: "Frequently asked dbt Mesh questions" +sidebar_label: "dbt Mesh FAQs" --- dbt Mesh is a new architecture enabled by dbt Cloud. It allows you to better manage complexity by deploying multiple interconnected dbt projects instead of a single large, monolithic project. It’s designed to accelerate development, without compromising governance. From cf70c31b1e2c0ce7619e8acff0fed2407a36870d Mon Sep 17 00:00:00 2001 From: mirnawong1 Date: Wed, 3 Jan 2024 15:21:41 -0500 Subject: [PATCH 10/31] sentence case --- website/docs/best-practices/how-we-mesh/mesh-4-faqs.md | 6 +++--- 1 file changed, 3 insertions(+), 3 deletions(-) diff --git a/website/docs/best-practices/how-we-mesh/mesh-4-faqs.md b/website/docs/best-practices/how-we-mesh/mesh-4-faqs.md index 5d17dc9b995..3579819c9c5 100644 --- a/website/docs/best-practices/how-we-mesh/mesh-4-faqs.md +++ b/website/docs/best-practices/how-we-mesh/mesh-4-faqs.md @@ -59,10 +59,10 @@ A model group in dbt is a concept used to organize models under a common categor -1. **Agility in Development**: With a more modular architecture, teams can make changes rapidly and independently in specific areas without impacting the entire system, leading to faster development cycles. -2. **Improved Collaboration**: Teams are able to share and build upon each other's work without duplicating efforts. +1. **Agility in development**: With a more modular architecture, teams can make changes rapidly and independently in specific areas without impacting the entire system, leading to faster development cycles. +2. **Improved collaboration**: Teams are able to share and build upon each other's work without duplicating efforts. 3. **Reduced Complexity**: By organizing transformation logic into distinct domains, dbt Mesh reduces the complexity inherent in large, monolithic projects, making them easier to manage and understand. -4. 
**Improved Data Trust.** Adopting a dbt Mesh can help ensure that changes in one domain's data models do not unexpectedly break dependencies in other domain areas, leading to a more secure and predictable data environment. +4. **Improved data trust.** Adopting a dbt Mesh can help ensure that changes in one domain's data models do not unexpectedly break dependencies in other domain areas, leading to a more secure and predictable data environment. Most importantly, all this can be accomplished without the central data team losing the ability to see lineage across the entire organization, or compromising on governance mechanisms. From db6445ce7804d11ccd2c6499c6efac3c788ee2c5 Mon Sep 17 00:00:00 2001 From: mirnawong1 Date: Wed, 3 Jan 2024 15:34:53 -0500 Subject: [PATCH 11/31] sentence case --- website/docs/best-practices/how-we-mesh/mesh-4-faqs.md | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-) diff --git a/website/docs/best-practices/how-we-mesh/mesh-4-faqs.md b/website/docs/best-practices/how-we-mesh/mesh-4-faqs.md index 3579819c9c5..fcbe457be01 100644 --- a/website/docs/best-practices/how-we-mesh/mesh-4-faqs.md +++ b/website/docs/best-practices/how-we-mesh/mesh-4-faqs.md @@ -61,7 +61,7 @@ A model group in dbt is a concept used to organize models under a common categor 1. **Agility in development**: With a more modular architecture, teams can make changes rapidly and independently in specific areas without impacting the entire system, leading to faster development cycles. 2. **Improved collaboration**: Teams are able to share and build upon each other's work without duplicating efforts. -3. **Reduced Complexity**: By organizing transformation logic into distinct domains, dbt Mesh reduces the complexity inherent in large, monolithic projects, making them easier to manage and understand. +3. 
**Reduced complexity**: By organizing transformation logic into distinct domains, dbt Mesh reduces the complexity inherent in large, monolithic projects, making them easier to manage and understand. 4. **Improved data trust.** Adopting a dbt Mesh can help ensure that changes in one domain's data models do not unexpectedly break dependencies in other domain areas, leading to a more secure and predictable data environment. Most importantly, all this can be accomplished without the central data team losing the ability to see lineage across the entire organization, or compromising on governance mechanisms. From 71c385c696eccde63183308bfb7d862b67fe288d Mon Sep 17 00:00:00 2001 From: "Leona B. Campbell" <3880403+runleonarun@users.noreply.github.com> Date: Fri, 5 Jan 2024 11:37:48 -0800 Subject: [PATCH 12/31] updating FAQs to match current version --- .../best-practices/how-we-mesh/mesh-4-faqs.md | 132 +++++++++++------- website/docs/docs/dbt-support.md | 1 + 2 files changed, 80 insertions(+), 53 deletions(-) diff --git a/website/docs/best-practices/how-we-mesh/mesh-4-faqs.md b/website/docs/best-practices/how-we-mesh/mesh-4-faqs.md index fcbe457be01..82ff64e3e7e 100644 --- a/website/docs/best-practices/how-we-mesh/mesh-4-faqs.md +++ b/website/docs/best-practices/how-we-mesh/mesh-4-faqs.md @@ -18,20 +18,15 @@ The following frequently asked questions (FAQs) are categorized into the followi ## Overview of Mesh - - -dbt Mesh is a new architecture enabled by dbt Cloud. It allows you to better manage complexity by deploying multiple interconnected dbt projects instead of a single large, monolithic project. It’s designed to accelerate development, without compromising governance. - - -In dbt, [model contracts](/docs/collaborate/govern/model-contracts) serve as a governance tool enabling the definition and enforcement of data structure standards in your dbt models. 
They allow you to specify and uphold data model guarantees, including column data types, allowing for the stability of dependent models. Should a model fail to adhere to its established contracts, it will not successfully build. +In dbt, [model contracts](/docs/collaborate/govern/model-contracts) serve as a governance tool enabling the definition and enforcement of data structure standards in your dbt models. They allow you to specify and uphold data model guarantees, including column data types, allowing for stability of dependent models. Should a model fail to adhere to its established contracts, it will not successfully build. -dbt [model versions](https://docs.getdbt.com/docs/collaborate/govern/model-versions) are iterations of your dbt models made over time. In many cases, you may knowingly choose to change a model’s structure in a way that “breaks” the previous model contract — when you do so, you find it useful to create a new version of the model to signify this change. +dbt [model versions](https://docs.getdbt.com/docs/collaborate/govern/model-versions) are iterations of your dbt models made over time. In many cases, you may knowingly choose to change a model’s structure in a way that “breaks” the previous model contract, and may break downstream queries depending on that model’s structure. When you do so, you may find it useful to create a new version of the model to signify this change. You can use **model versions** to: @@ -43,7 +38,7 @@ You can use **model versions** to: -A [model access modifier](/docs/collaborate/govern/model-access) in dbt is a feature that defines the level of accessibility of a model to other parts of the dbt project, or to other dbt projects. It specifies who can reference a model using [the `ref` function](https://docs.getdbt.com/reference/dbt-jinja-functions/ref). 
There are three types of access modifiers: +A [model access modifier](/docs/collaborate/govern/model-access) in dbt determines if a model is accessible as an input to other dbt models and projects. It specifies where a model can be referenced using [the `ref` function](https://docs.getdbt.com/reference/dbt-jinja-functions/ref). There are three types of access modifiers: 1. **Private:** A model with a private access modifier is only referenceable by models within the same group. This is intended for models that are implementation details and are meant to be used only within a specific group of related models. 2. **Protected:** Models with a protected access modifier can be referenced by any other model within the same dbt project or when the project is installed as a package. This is the default setting for all models, ensuring backward compatibility, especially when groups are assigned to an existing set of models. @@ -53,16 +48,16 @@ A [model access modifier](/docs/collaborate/govern/model-access) in dbt is a fea -A model group in dbt is a concept used to organize models under a common category or ownership. This categorization can be based on various criteria, such as the team responsible for the models or the specific data source they model, like GitHub data. +A [model group](/docs/collaborate/govern/model-access#groups) in dbt is a concept used to organize models under a common category or ownership. This categorization can be based on various criteria, such as the team responsible for the models or the specific data source they model. -1. **Agility in development**: With a more modular architecture, teams can make changes rapidly and independently in specific areas without impacting the entire system, leading to faster development cycles. -2. **Improved collaboration**: Teams are able to share and build upon each other's work without duplicating efforts. -3. 
**Reduced complexity**: By organizing transformation logic into distinct domains, dbt Mesh reduces the complexity inherent in large, monolithic projects, making them easier to manage and understand. -4. **Improved data trust.** Adopting a dbt Mesh can help ensure that changes in one domain's data models do not unexpectedly break dependencies in other domain areas, leading to a more secure and predictable data environment. +1. **Ship data products faster**: With a more modular architecture, teams can make changes rapidly and independently in specific areas without impacting the entire system, leading to faster development cycles. +2. **Improve trust in data**: Adopting a dbt Mesh can help ensure that changes in one domain's data models do not unexpectedly break dependencies in other domain areas, leading to a more secure and predictable data environment. +3. **Reduce complexity**: By organizing transformation logic into distinct domains, dbt Mesh reduces the complexity inherent in large, monolithic projects, making them easier to manage and understand. +4. **Improve collaboration**: Teams are able to share and build upon each other's work without duplicating efforts. Most importantly, all this can be accomplished without the central data team losing the ability to see lineage across the entire organization, or compromising on governance mechanisms. @@ -76,22 +71,18 @@ Orchestration across multiple projects is also likely to be slightly more challe -## How dbt Mesh works - - - -Yes. In addition to being viewable natively through [dbt Explorer](https://www.getdbt.com/product/dbt-explorer), it is possible to view cross-project lineage using partner integrations with data cataloging tools. For a list of available dbt Cloud integrations, refer to the [Integrations page](https://www.getdbt.com/product/integrations). + - +dbt Mesh allows you to better **operationalize** a Data Mesh by enabling decentralized, domain-specific data ownership and collaboration. 
- +In a Data Mesh, each business domain is responsible for its own data as a product. This is the same goal that dbt Mesh facilitates by enabling organizations to break down large, monolithic data projects into smaller, domain-specific dbt projects. Each team or domain can independently develop, maintain, and share their data models, fostering a decentralized data environment. -Yes, all of this metadata is accessible via the [dbt Cloud Admin API](/docs/dbt-cloud-apis/admin-cloud-api). This metadata can be fed into a monitoring tool, or used to create reports and dashboards. - -We also expose some of this information in dbt Cloud itself in [jobs](/docs/deploy/jobs), [environments](https://docs.getdbt.com/docs/environments-in-dbt) and in [dbt Explorer](https://www.getdbt.com/product/dbt-explorer). +dbt Mesh also enhances the interoperability and reusability of data across different domains, a key aspect of the Data Mesh philosophy. By allowing cross-project references and shared governance through model contracts and access controls, dbt Mesh ensures that while data ownership is decentralized, there is still a governed structure to the overall data architecture. +## How dbt Mesh works + Like resource dependencies, project dependencies are acyclic, meaning they only move in one direction. This prevents `ref` cycles (or loops). For example, if project B depends on project A, a new model in project A could not import and use a public model from project B. Refer to [Project dependencies](/docs/collaborate/govern/project-dependencies#how-to-use-ref) for more information. @@ -126,7 +117,7 @@ Yes; as long as they’re in the same data platform (such as BigQuery, Databrick -Yes, because the cross-project collaboration is done using the `{{ ref() }}` macro, you can use those models from other teams in singular tests. +Yes! 
Because the cross-project collaboration is done using the `{{ ref() }}` macro, you can use those models from other teams in [singular tests](/docs/build/data-tests#singular-data-tests). @@ -145,21 +136,22 @@ Depending on each team’s needs, this can be customized with model-level [schem -No, contracts can currently only be applied at the [model level](/docs/collaborate/govern/model-contracts). It is a recommended best practice to [define staging models](https://docs.getdbt.com/best-practices/how-we-structure/2-staging) on top of sources, and it is possible to define contracts on top of those staging models. +No, contracts can currently only be applied at the [model level](/docs/collaborate/govern/model-contracts). It is a recommended best practice to [define staging models](/best-practices/how-we-structure/2-staging) on top of sources, and it is possible to define contracts on top of those staging models. No. A contract applies to an entire model, including all columns in the model’s output. This is the same set of columns that a consumer would see when viewing the model’s details in Explorer, or when querying the model in the data platform. -- This means it's not possible to selectively enforce contracts on specific columns, such as ensuring certain columns exist and remain unchanged while allowing modifications to others. -- If you wish to contract only a subset of columns, you can create a separate model (materialized as a view) selecting only that subset. If you wish to limit which rows or columns a downstream consumer can see when they query the model’s data, depending on who they are, specific data platforms offer advanced capabilities around dynamic row-level access and column-level data masking. + +- If you wish to contract only a subset of columns, you can create a separate model (materialized as a view) selecting only that subset. 
+- If you wish to limit which rows or columns a downstream consumer can see when they query the model’s data, depending on who they are, specific data platforms offer advanced capabilities around dynamic row-level access and column-level data masking. -No, a [group](/docs/collaborate/govern/model-access#groups) can currently only be assigned to have a single owner. Note however that the assigned owner can be a team, not just an individual. +No, a [group](/docs/collaborate/govern/model-access#groups) can currently only be assigned to have a single owner. Note, however, that the assigned owner can be a _team_, not just an individual. @@ -181,25 +173,63 @@ dbt Cloud will soon offer a capability to trigger jobs on the completion of anot -## Access and permissions + + +Yes. In addition to being viewable natively through [dbt Explorer](https://www.getdbt.com/product/dbt-explorer), it is possible to view cross-project lineage using partner integrations with data cataloging tools. For a list of available dbt Cloud integrations, refer to the [Integrations page](https://www.getdbt.com/product/integrations). + + + + + +Tests and model contracts in dbt help eliminate the need to restate data in the first place. With these tools, you can incorporate checks at the source and output layers of your dbt projects to assess data quality in the most critical places. When there are changes in transformation logic (for example, the definition of a particular column is changed), restating the data is as easy as merging the updated code and running a dbt Cloud job. + +If a data quality issue does slip through, you also have the option of simply rolling back the Git commit, and then re-running the dbt Cloud job with the old code. + + + + + +Yes, all of this metadata is accessible via the [dbt Cloud Admin API](/docs/dbt-cloud-apis/admin-cloud-api). This metadata can be fed into a monitoring tool, or used to create reports and dashboards. 
+ +We also expose some of this information in dbt Cloud itself in [jobs](/docs/deploy/jobs), [environments](/docs/environments-in-dbt), and in [dbt Explorer](https://www.getdbt.com/product/dbt-explorer). + + + +## Permissions and access -The existence of projects that have at least one public model will be visible to everyone in the organization with read-only access. +The existence of projects that have at least one public model will be visible to everyone in the organization with [read-only access](/docs/cloud/manage-access/seats-and-users). Private or protected models require a user to have read-only access on the specific project in order to see its existence. - + -There is not currently! But this is something we may evaluate for the future. +There’s model-level access within dbt; role-based access for users and groups in dbt Cloud; and access to actual underlying data within the data platform. + +First things first: access to underlying data is always defined and enforced by the underlying data platform (for example, BigQuery, Databricks, Redshift, Snowflake, or Starburst). This access is managed by executing “DCL statements” (namely `grant`). dbt [makes it easy to configure `grants` on models](/reference/resource-configs/grants), which provision data access for other roles/users/groups in the data warehouse. However, dbt does **not** automatically define or coordinate those grants unless they are configured explicitly. It’s possible your organization prefers to use a separate system for managing data warehouse permissions. + +[dbt Cloud Enterprise plans](https://www.getdbt.com/pricing) allow a system of role-based access that manages granular permissions for users and user groups. In this way, you can control which users can see or edit all aspects of a dbt Cloud project. A user’s access to a dbt Cloud project also informs whether they can “explore” that project in detail. 
Roles, users, and groups are defined within the dbt Cloud application, via the UI or by integrating with an identity provider. + +[Model access](/docs/collaborate/govern/model-access) is about defining where models can be **referenced.** It also informs the discoverability of those projects within dbt Explorer. Model `access` is defined in code, just like any other model configuration (`materialized`, `tags`, etc). + +**Public:** Models with `public` access can be referenced everywhere. These are the “data products” of your organization. + +**Protected:** Models with `protected` access can only be referenced within the same project. This is the default level of model access. (We are discussing a future extension to `protected` models that will allow for their reference in *specific* downstream projects. Please read [the GitHub issue](https://github.com/dbt-labs/dbt-core/issues/9340), and upvote/comment if you’re interested in this use case.) + +**Private:** Model `groups` enable more-granular control over where `private` models can be referenced. By defining a group, and configuring models to belong to that group, you can restrict other models (not in the same group) from referencing any `private` models the group contains. Groups also provide a standard mechanism for defining the `owner` of all resources it contains. + +Within dbt Explorer, `public` models are discoverable for every user in the dbt Cloud account — every public model is listed in the “multi-project” view. By contrast, `protected` and `private` models in a project are visible only to users who have access to that project (including read-only access). + +Because dbt does not implicitly coordinate data warehouse `grants` with model-level `access`, it is possible for there to be a mismatch between them. 
For example, a `public` model’s metadata is viewable to all dbt Cloud users, anyone can write a `ref` to that model, but when they actually run or preview, they realize they do not have access to the underlying data in the data warehouse. **This is intentional.** In this way, your organization can retain least-privileged access to underlying data, while providing visibility and discoverability for the wider organization. Armed with the knowledge of which other “data products” (public models) exist — their descriptions, their ownership, which columns they contain — an analyst on another team can prepare a well-informed request for access to the underlying data. - + -The "multi-project" view in dbt Cloud Explorer (and the underlying Discovery API) will include any project that has defined at least one "public" model, including the list of public models. (Read more about [model access modifiers](/docs/collaborate/govern/model-access#access-modifiers).) If a user has additional permissions on that project (managed via dbt Cloud RBAC), including read-only permissions, they can explore further into that project and see details about all models (private, protected, and protected). +There is not currently! But this is something we may evaluate for the future. @@ -209,15 +239,13 @@ Yes! As long as a user has permissions (at least read-only access) on all projec - + -OUTSTANDING COPY +By default, cross-project references resolve to the “Production” deployment environment of the upstream project. If your organization has genuinely different data in production versus non-production environments, this poses an issue. 
- - - +For this reason, we will soon roll out a new canonical type of deployment environment: “Staging.” If a project defines both a “Production” environment and a “Staging” environment, then cross-project references from development and “Staging” environments will resolve to “Staging,” whereas only references coming from “Production” environments will resolve to “Production.” In this way, you are guaranteed separation of data environments, without needing to duplicate project configurations. -My production environments contain sensitive data. How can I limit my developers from accessing production data when referencing from other projects? +If you’re interested in beta access to “Staging” environments, let your dbt Labs account representative know! @@ -233,12 +261,6 @@ dbt Mesh enables organizations to split their data architecture into multiple do - - -OUTSTANDING COPY - - - **[dbt Explorer](/docs/collaborate/explore-projects)** is a tool within dbt Cloud that serves as a knowledge base and lineage visualization platform. It provides a comprehensive view of your dbt assets, including models, tests, sources, and their interdependencies. @@ -253,8 +275,7 @@ The [dbt Cloud CLI](/docs/cloud/cloud-cli-installation) allows users to develop - -## When and where dbt Mesh is available +## Availability @@ -264,11 +285,13 @@ Yes — your account must be on at least dbt v1.6 to take advantage of [cross-pr -Not all of them. While dbt Core provides some of the foundational elements for dbt Mesh, dbt Cloud offers an enhanced experience that leverages these elements for scaled collaboration across multiple teams. +Not all of them. While dbt Core defines several of the foundational elements for dbt Mesh, dbt Cloud offers an enhanced experience that leverages these elements for scaled collaboration across multiple teams, facilitated by multi-project discovery in dbt Explorer that’s tailored to each user’s individual access. 
-Many of the key components that underpin dbt Mesh functionality, such as model contracts, versions, and access modifier levels (public and private), are available in dbt Core. To enable cross-project dependencies, users can also leverage [packages](/docs/build/packages). This enables users to import models from an upstream project, which allows the resolution of cross-project references. +Several of the key components that underpin the dbt Mesh pattern — including model contracts, versions, and access modifiers — are defined and implemented in dbt Core. We believe these are components of the core language, which is why their implementations are open source. We want to define a standard pattern that analytics engineers everywhere can adopt, extend, and help us improve. -The major distinction comes with dbt Cloud's metadata service, which is unique to the dbt Cloud platform and allows for the resolution of references to only the public models in a project. This service enables users to take dependencies on resources from upstream projects without needing to load the full complexity of those upstream projects into their local development environment. +To reference models defined in another project, users can also leverage a longstanding feature of dbt Core: [packages](/docs/build/packages). By importing an upstream project as a package, dbt will import all models defined in that project, which enables the resolution of cross-project references to those models — [optionally restricted](/docs/collaborate/govern/model-access#how-do-i-restrict-access-to-models-defined-in-a-package) to just the models with `public` access. + +The major distinction comes with dbt Cloud's metadata service, which is unique to the dbt Cloud platform and allows for the resolution of references to only the public models in a project. 
This service enables users to take dependencies on upstream projects, and reference just their `public` models, *without* needing to load the full complexity of those upstream projects into their local development environment. @@ -291,7 +314,10 @@ Refer to our developer guide on [How we structure our dbt Mesh projects](https:/ `dbt-meshify` is a [CLI tool](https://github.com/dbt-labs/dbt-meshify) that automates the creation of model governance and cross-project lineage features introduced in dbt-core v1.5 and v1.6. This package will leverage your dbt project metadata to create and/or edit the files in your project to properly configure the models in your project with these features. - + + +Let’s say your organization has fewer than 500 models, and fewer than a dozen regular contributors to dbt. You’re operating at a scale that’s well served by the monolith (a single project), and the larger pattern of dbt Mesh probably isn’t a good fit. + +That said, it’s *never too early* to think about how you’re organizing models **within** that project. Use model `groups` to define clear ownership boundaries, and `private` access to restrict purpose-built models from becoming load-bearing blocks in an unrelated section of the DAG. Your future selves will thank you for having defined these interfaces, especially if you reach a scale where it makes sense to “graduate” the interfaces between `groups` into boundaries between projects. - My team isn’t structured in a way that would require multiple projects today (though that may change in the future). What aspects of dbt Mesh are relevant to me? 
diff --git a/website/docs/docs/dbt-support.md b/website/docs/docs/dbt-support.md index 84bf92482c5..ac63bdb81cc 100644 --- a/website/docs/docs/dbt-support.md +++ b/website/docs/docs/dbt-support.md @@ -97,3 +97,4 @@ For SQL writing, project performance review, or project building, refer to dbt P For help writing SQL, reviewing the overall performance of your project, or want someone to actually help build your dbt project, refer to the following pages: - List of [dbt Preferred Consulting Providers](https://www.getdbt.com/ecosystem/). - dbt Labs' [Services](https://www.getdbt.com/dbt-labs/services/). + From d5e87408e91e54de9f595622ad7fc9bc44f0db9a Mon Sep 17 00:00:00 2001 From: "Leona B. Campbell" <3880403+runleonarun@users.noreply.github.com> Date: Fri, 5 Jan 2024 11:49:37 -0800 Subject: [PATCH 13/31] Update website/docs/docs/dbt-support.md --- website/docs/docs/dbt-support.md | 1 - 1 file changed, 1 deletion(-) diff --git a/website/docs/docs/dbt-support.md b/website/docs/docs/dbt-support.md index ac63bdb81cc..84bf92482c5 100644 --- a/website/docs/docs/dbt-support.md +++ b/website/docs/docs/dbt-support.md @@ -97,4 +97,3 @@ For SQL writing, project performance review, or project building, refer to dbt P For help writing SQL, reviewing the overall performance of your project, or want someone to actually help build your dbt project, refer to the following pages: - List of [dbt Preferred Consulting Providers](https://www.getdbt.com/ecosystem/). - dbt Labs' [Services](https://www.getdbt.com/dbt-labs/services/). - From 588ba1ff313b1e3299ce017b3da835dea01a13d4 Mon Sep 17 00:00:00 2001 From: "Leona B. 
Campbell" <3880403+runleonarun@users.noreply.github.com> Date: Fri, 5 Jan 2024 12:10:55 -0800 Subject: [PATCH 14/31] removing toc --- website/docs/best-practices/how-we-mesh/mesh-4-faqs.md | 9 --------- 1 file changed, 9 deletions(-) diff --git a/website/docs/best-practices/how-we-mesh/mesh-4-faqs.md b/website/docs/best-practices/how-we-mesh/mesh-4-faqs.md index 82ff64e3e7e..09957dcc319 100644 --- a/website/docs/best-practices/how-we-mesh/mesh-4-faqs.md +++ b/website/docs/best-practices/how-we-mesh/mesh-4-faqs.md @@ -7,15 +7,6 @@ sidebar_label: "dbt Mesh FAQs" dbt Mesh is a new architecture enabled by dbt Cloud. It allows you to better manage complexity by deploying multiple interconnected dbt projects instead of a single large, monolithic project. It’s designed to accelerate development, without compromising governance. -The following frequently asked questions (FAQs) are categorized into the following topics: - -- [Overview of Mesh](#overview-of-mesh) -- [How dbt Mesh works](#how-dbt-mesh-works) -- [Access and permissions](#access-and-permissions) -- [Compatibility with other features](#compatibility-with-other-features) -- [When and where dbt Mesh is available](#when-and-where-dbt-mesh-is-available) -- [Tips on implementing dbt Mesh](#tips-on-implementing-dbt-mesh) - ## Overview of Mesh From 345583f387e5510168c42ea4117124fbbf784d83 Mon Sep 17 00:00:00 2001 From: "Leona B. 
Campbell" <3880403+runleonarun@users.noreply.github.com> Date: Fri, 5 Jan 2024 13:07:12 -0800 Subject: [PATCH 15/31] Apply suggestions from code review Co-authored-by: Matt Shaver <60105315+matthewshaver@users.noreply.github.com> --- .../best-practices/how-we-mesh/mesh-1-intro.md | 2 +- .../docs/best-practices/how-we-mesh/mesh-4-faqs.md | 14 +++++++------- 2 files changed, 8 insertions(+), 8 deletions(-) diff --git a/website/docs/best-practices/how-we-mesh/mesh-1-intro.md b/website/docs/best-practices/how-we-mesh/mesh-1-intro.md index b97870a256b..fcd379de9cf 100644 --- a/website/docs/best-practices/how-we-mesh/mesh-1-intro.md +++ b/website/docs/best-practices/how-we-mesh/mesh-1-intro.md @@ -32,7 +32,7 @@ dbt Cloud is designed to coordinate the features above and simplify the complexi If you're just starting your dbt journey, don't worry about building a multi-project architecture right away. You can _incrementally_ adopt the features in this guide as you scale. The collection of features work effectively as independent tools. Familiarizing yourself with the tooling and features that make up a multi-project architecture, and how they can apply to your organization will help you make better decisions as you grow. -For additional information, refer to the [dbt Mesh frequently asked questions](/best-practices/how-we-mesh/mesh-4-faqs) (FAQs). +For additional information, refer to the [dbt Mesh FAQs](/best-practices/how-we-mesh/mesh-4-faqs). ## Learning goals diff --git a/website/docs/best-practices/how-we-mesh/mesh-4-faqs.md b/website/docs/best-practices/how-we-mesh/mesh-4-faqs.md index 09957dcc319..f0b9b2be5a8 100644 --- a/website/docs/best-practices/how-we-mesh/mesh-4-faqs.md +++ b/website/docs/best-practices/how-we-mesh/mesh-4-faqs.md @@ -9,21 +9,21 @@ dbt Mesh is a new architecture enabled by dbt Cloud. 
It allows you to better man ## Overview of Mesh - + -In dbt, [model contracts](/docs/collaborate/govern/model-contracts) serve as a governance tool enabling the definition and enforcement of data structure standards in your dbt models. They allow you to specify and uphold data model guarantees, including column data types, allowing for stability of dependent models. Should a model fail to adhere to its established contracts, it will not successfully build. +dbt [model contracts](/docs/collaborate/govern/model-contracts) serve as a governance tool enabling the definition and enforcement of data structure standards in your dbt models. They allow you to specify and uphold data model guarantees, including column data types, allowing for the stability of dependent models. Should a model fail to adhere to its established contracts, it will not build successfully. -dbt [model versions](https://docs.getdbt.com/docs/collaborate/govern/model-versions) are iterations of your dbt models made over time. In many cases, you may knowingly choose to change a model’s structure in a way that “breaks” the previous model contract, and may break downstream queries depending on that model’s structure. When you do so, you may find it useful to create a new version of the model to signify this change. +dbt [model versions](https://docs.getdbt.com/docs/collaborate/govern/model-versions) are iterations of your dbt models made over time. In many cases, you might knowingly choose to change a model’s structure in a way that “breaks” the previous model contract, and may break downstream queries depending on that model’s structure. When you do so, creating a new version of the model is useful to signify this change. 
-You can use **model versions** to: +You can use model versions to: -- Test "prerelease" changes (in production, in downstream systems) -- Bump the latest version, to be used as the canonical source of truth -- Offer a migration window off the "old" version +- Test "prerelease" changes (in production, in downstream systems). +- Bump the latest version, to be used as the canonical "source of truth." +- Offer a migration window off the "old" version. From 20565fb1a3bb5b75b4a9a5c82609f26557682a0e Mon Sep 17 00:00:00 2001 From: "Leona B. Campbell" <3880403+runleonarun@users.noreply.github.com> Date: Fri, 5 Jan 2024 13:09:02 -0800 Subject: [PATCH 16/31] Apply suggestions from code review --- website/docs/best-practices/how-we-mesh/mesh-4-faqs.md | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-) diff --git a/website/docs/best-practices/how-we-mesh/mesh-4-faqs.md b/website/docs/best-practices/how-we-mesh/mesh-4-faqs.md index f0b9b2be5a8..7070e2b6849 100644 --- a/website/docs/best-practices/how-we-mesh/mesh-4-faqs.md +++ b/website/docs/best-practices/how-we-mesh/mesh-4-faqs.md @@ -29,7 +29,7 @@ You can use model versions to: -A [model access modifier](/docs/collaborate/govern/model-access) in dbt determines if a model is accessible as an input to other dbt models and projects. It specifies where a model can be referenced using [the `ref` function](https://docs.getdbt.com/reference/dbt-jinja-functions/ref). There are three types of access modifiers: +A [model access modifier](/docs/collaborate/govern/model-access) in dbt determines if a model is accessible as an input to other dbt models and projects. It specifies where a model can be referenced using [the `ref` function](/reference/dbt-jinja-functions/ref). There are three types of access modifiers: 1. **Private:** A model with a private access modifier is only referenceable by models within the same group. 
This is intended for models that are implementation details and are meant to be used only within a specific group of related models. 2. **Protected:** Models with a protected access modifier can be referenced by any other model within the same dbt project or when the project is installed as a package. This is the default setting for all models, ensuring backward compatibility, especially when groups are assigned to an existing set of models. From 887df9d51917ce99349a561961268f7a863e9c52 Mon Sep 17 00:00:00 2001 From: "Leona B. Campbell" <3880403+runleonarun@users.noreply.github.com> Date: Fri, 5 Jan 2024 13:10:28 -0800 Subject: [PATCH 17/31] Apply suggestions from code review Co-authored-by: Matt Shaver <60105315+matthewshaver@users.noreply.github.com> --- website/docs/best-practices/how-we-mesh/mesh-4-faqs.md | 4 ++-- 1 file changed, 2 insertions(+), 2 deletions(-) diff --git a/website/docs/best-practices/how-we-mesh/mesh-4-faqs.md b/website/docs/best-practices/how-we-mesh/mesh-4-faqs.md index 7070e2b6849..4f748ebda8f 100644 --- a/website/docs/best-practices/how-we-mesh/mesh-4-faqs.md +++ b/website/docs/best-practices/how-we-mesh/mesh-4-faqs.md @@ -44,9 +44,9 @@ A [model group](/docs/collaborate/govern/model-access#groups) in dbt is a concep - +Here are some benefits of implementing dbt Mesh: 1. **Shop data products faster**: With a more modular architecture, teams can make changes rapidly and independently in specific areas without impacting the entire system, leading to faster development cycles. -2. **Improve trust in data:** Adopting a dbt Mesh can help ensure that changes in one domain's data models do not unexpectedly break dependencies in other domain areas, leading to a more secure and predictable data environment. +2. **Improve trust in data:** Adopting dbt Mesh helps ensure that changes in one domain's data models do not unexpectedly break dependencies in other domain areas, leading to a more secure and predictable data environment. 3. 
**Reduce complexity**: By organizing transformation logic into distinct domains, dbt Mesh reduces the complexity inherent in large, monolithic projects, making them easier to manage and understand. 4. **Improve collaboration**: Teams are able to share and build upon each other's work without duplicating efforts. From 6d67df95eb4329628bf7d1b3141cf3b4e035680e Mon Sep 17 00:00:00 2001 From: "Leona B. Campbell" <3880403+runleonarun@users.noreply.github.com> Date: Fri, 5 Jan 2024 13:11:13 -0800 Subject: [PATCH 18/31] Update website/docs/best-practices/how-we-mesh/mesh-4-faqs.md Co-authored-by: Matt Shaver <60105315+matthewshaver@users.noreply.github.com> --- website/docs/best-practices/how-we-mesh/mesh-4-faqs.md | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-) diff --git a/website/docs/best-practices/how-we-mesh/mesh-4-faqs.md b/website/docs/best-practices/how-we-mesh/mesh-4-faqs.md index 4f748ebda8f..d8cfcca8fc0 100644 --- a/website/docs/best-practices/how-we-mesh/mesh-4-faqs.md +++ b/website/docs/best-practices/how-we-mesh/mesh-4-faqs.md @@ -54,7 +54,7 @@ Most importantly, all this can be accomplished without the central data team los - + This is a new way of working, and the intentionality required to build, and then maintain, cross-project interfaces and dependencies may feel like a slowdown versus what some developers are used to. This way of working introduces intentional friction that makes it more difficult to change everything at once. From 2c57a44c0d774f9e381f2ce1379414a5b2d1ea04 Mon Sep 17 00:00:00 2001 From: "Leona B. 
Campbell" <3880403+runleonarun@users.noreply.github.com> Date: Fri, 5 Jan 2024 13:21:05 -0800 Subject: [PATCH 19/31] Apply suggestions from code review Co-authored-by: Matt Shaver <60105315+matthewshaver@users.noreply.github.com> --- website/docs/best-practices/how-we-mesh/mesh-4-faqs.md | 6 +++--- 1 file changed, 3 insertions(+), 3 deletions(-) diff --git a/website/docs/best-practices/how-we-mesh/mesh-4-faqs.md b/website/docs/best-practices/how-we-mesh/mesh-4-faqs.md index d8cfcca8fc0..bf987fedb6a 100644 --- a/website/docs/best-practices/how-we-mesh/mesh-4-faqs.md +++ b/website/docs/best-practices/how-we-mesh/mesh-4-faqs.md @@ -62,11 +62,11 @@ Orchestration across multiple projects is also likely to be slightly more challe - + -dbt Mesh allows you to better **operationalize** a Data Mesh by enabling decentralized, domain-specific data ownership and collaboration. +dbt Mesh allows you to better _operationalize_ data mesh by enabling decentralized, domain-specific data ownership and collaboration. -In a Data Mesh, each business domain is responsible for its own data as a product. This is the same goal that dbt Mesh facilitates by enabling organizations to break down large, monolithic data projects into smaller, domain-specific dbt projects. Each team or domain can independently develop, maintain, and share their data models, fostering a decentralized data environment. +In data mesh, each business domain is responsible for its data as a product. This is the same goal that dbt Mesh facilitates by enabling organizations to break down large, monolithic data projects into smaller, domain-specific dbt projects. Each team or domain can independently develop, maintain, and share its data models, fostering a decentralized data environment. dbt Mesh also enhances the interoperability and reusability of data across different domains, a key aspect of the Data Mesh philosophy. 
By allowing cross-project references and shared governance through model contracts and access controls, dbt Mesh ensures that while data ownership is decentralized, there is still a governed structure to the overall data architecture. From b2333588dc14eed37df053514567a4deb4f81d8b Mon Sep 17 00:00:00 2001 From: "Leona B. Campbell" <3880403+runleonarun@users.noreply.github.com> Date: Fri, 5 Jan 2024 13:47:33 -0800 Subject: [PATCH 20/31] Apply suggestions from code review Co-authored-by: Matt Shaver <60105315+matthewshaver@users.noreply.github.com> --- .../best-practices/how-we-mesh/mesh-4-faqs.md | 18 +++++++++--------- 1 file changed, 9 insertions(+), 9 deletions(-) diff --git a/website/docs/best-practices/how-we-mesh/mesh-4-faqs.md b/website/docs/best-practices/how-we-mesh/mesh-4-faqs.md index bf987fedb6a..b82045d1c7e 100644 --- a/website/docs/best-practices/how-we-mesh/mesh-4-faqs.md +++ b/website/docs/best-practices/how-we-mesh/mesh-4-faqs.md @@ -54,7 +54,7 @@ Most importantly, all this can be accomplished without the central data team los - + This is a new way of working, and the intentionality required to build, and then maintain, cross-project interfaces and dependencies may feel like a slowdown versus what some developers are used to. This way of working introduces intentional friction that makes it more difficult to change everything at once. @@ -68,7 +68,7 @@ dbt Mesh allows you to better _operationalize_ data mesh by enabling decentraliz In data mesh, each business domain is responsible for its data as a product. This is the same goal that dbt Mesh facilitates by enabling organizations to break down large, monolithic data projects into smaller, domain-specific dbt projects. Each team or domain can independently develop, maintain, and share its data models, fostering a decentralized data environment. -dbt Mesh also enhances the interoperability and reusability of data across different domains, a key aspect of the Data Mesh philosophy. 
By allowing cross-project references and shared governance through model contracts and access controls, dbt Mesh ensures that while data ownership is decentralized, there is still a governed structure to the overall data architecture. +dbt Mesh also enhances the interoperability and reusability of data across different domains, a key aspect of the data mesh philosophy. By allowing cross-project references and shared governance through model contracts and access controls, dbt Mesh ensures that while data ownership is decentralized, there is still a governed structure to the overall data architecture. @@ -102,13 +102,13 @@ No, unless downstream projects are installed as [packages](/docs/build/packages) -Yes; as long as they’re in the same data platform (such as BigQuery, Databricks, Redshift, Snowflake, or Starburst) and you have configured permissions and sharing in that data platform provider to allow for this. +Yes, as long as they’re in the same data platform (BigQuery, Databricks, Redshift, Snowflake, etc.) and you have configured permissions and sharing in that data platform provider to allow this. -Yes! because the cross-project collaboration is done using the `{{ ref() }}` macro, you can use those models from other teams in [singular tests](/docs/build/data-tests#singular-data-tests). +Yes, because the cross-project collaboration is done using the `{{ ref() }}` macro, you can use those models from other teams in [singular tests](/docs/build/data-tests#singular-data-tests). @@ -118,8 +118,8 @@ Each team defines their connection to the data warehouse, and the default schema By default, each project belonging to a team will create: -- One schema for production runs (for example, `finance`) -- One schema per developer (for example, `dev_jerco`) +- One schema for production runs (for example, `finance`). +- One schema per developer (for example, `dev_jerco`). 
Depending on each team’s needs, this can be customized with model-level [schema configurations](/docs/build/custom-schemas), including the ability to define different rules by environment. @@ -127,7 +127,7 @@ Depending on each team’s needs, this can be customized with model-level [schem -No, contracts can currently only be applied at the [model level](/docs/collaborate/govern/model-contracts). It is a recommended best practice to [define staging models](/best-practices/how-we-structure/2-staging) on top of sources, and it is possible to define contracts on top of those staging models. +No, contracts can only be applied at the [model level](/docs/collaborate/govern/model-contracts). It is a recommended best practice to [define staging models](/best-practices/how-we-structure/2-staging) on top of sources, and it is possible to define contracts on top of those staging models. @@ -136,13 +136,13 @@ No, contracts can currently only be applied at the [model level](/docs/collabora No. A contract applies to an entire model, including all columns in the model’s output. This is the same set of columns that a consumer would see when viewing the model’s details in Explorer, or when querying the model in the data platform. - If you wish to contract only a subset of columns, you can create a separate model (materialized as a view) selecting only that subset. -- If you wish to limit which rows or columns a downstream consumer can see when they query the model’s data, depending on who they are, specific data platforms offer advanced capabilities around dynamic row-level access and column-level data masking. +- If you wish to limit which rows or columns a downstream consumer can see when they query the model’s data, depending on who they are, some data platforms offer advanced capabilities around dynamic row-level access and column-level data masking. -No, a [group](/docs/collaborate/govern/model-access#groups) can currently only be assigned to have a single owner. 
Note, however, that the assigned can be a _team_, not just an individual. +No, a [group](/docs/collaborate/govern/model-access#groups) can only be assigned to a single owner. However, the assigned owner can be a _team_, rather than an individual. From ea247a1626087180e3f5536f9b14ee0e7b3f2ae8 Mon Sep 17 00:00:00 2001 From: "Leona B. Campbell" <3880403+runleonarun@users.noreply.github.com> Date: Fri, 5 Jan 2024 13:48:08 -0800 Subject: [PATCH 21/31] Update website/docs/best-practices/how-we-mesh/mesh-4-faqs.md --- website/docs/best-practices/how-we-mesh/mesh-4-faqs.md | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-) diff --git a/website/docs/best-practices/how-we-mesh/mesh-4-faqs.md b/website/docs/best-practices/how-we-mesh/mesh-4-faqs.md index b82045d1c7e..742d8212a2a 100644 --- a/website/docs/best-practices/how-we-mesh/mesh-4-faqs.md +++ b/website/docs/best-practices/how-we-mesh/mesh-4-faqs.md @@ -45,7 +45,7 @@ A [model group](/docs/collaborate/govern/model-access#groups) in dbt is a concep Here are some benefits of implementing dbt Mesh: -1. **Shop data products faster**: With a more modular architecture, teams can make changes rapidly and independently in specific areas without impacting the entire system, leading to faster development cycles. +1. **Ship data products faster**: With a more modular architecture, teams can make changes rapidly and independently in specific areas without impacting the entire system, leading to faster development cycles. 2. **Improve trust in data:** Adopting dbt Mesh helps ensure that changes in one domain's data models do not unexpectedly break dependencies in other domain areas, leading to a more secure and predictable data environment. 3. **Reduce complexity**: By organizing transformation logic into distinct domains, dbt Mesh reduces the complexity inherent in large, monolithic projects, making them easier to manage and understand. 4. 
**Improve collaboration**: Teams are able to share and build upon each other's work without duplicating efforts. From 316e0ba8f18e7903a140b10e579f0598f84d9bdf Mon Sep 17 00:00:00 2001 From: "Leona B. Campbell" <3880403+runleonarun@users.noreply.github.com> Date: Fri, 5 Jan 2024 13:54:39 -0800 Subject: [PATCH 22/31] Update mesh-4-faqs.md --- .../best-practices/how-we-mesh/mesh-4-faqs.md | 22 +++++++++---------- 1 file changed, 11 insertions(+), 11 deletions(-) diff --git a/website/docs/best-practices/how-we-mesh/mesh-4-faqs.md b/website/docs/best-practices/how-we-mesh/mesh-4-faqs.md index 742d8212a2a..8c37ebcd35d 100644 --- a/website/docs/best-practices/how-we-mesh/mesh-4-faqs.md +++ b/website/docs/best-practices/how-we-mesh/mesh-4-faqs.md @@ -9,6 +9,17 @@ dbt Mesh is a new architecture enabled by dbt Cloud. It allows you to better man ## Overview of Mesh + +Here are some benefits of implementing dbt Mesh: +1. **Ship data products faster**: With a more modular architecture, teams can make changes rapidly and independently in specific areas without impacting the entire system, leading to faster development cycles. +2. **Improve trust in data:** Adopting dbt Mesh helps ensure that changes in one domain's data models do not unexpectedly break dependencies in other domain areas, leading to a more secure and predictable data environment. +3. **Reduce complexity**: By organizing transformation logic into distinct domains, dbt Mesh reduces the complexity inherent in large, monolithic projects, making them easier to manage and understand. +4. **Improve collaboration**: Teams are able to share and build upon each other's work without duplicating efforts. + +Most importantly, all this can be accomplished without the central data team losing the ability to see lineage across the entire organization, or compromising on governance mechanisms. 
+ + + dbt [model contracts](/docs/collaborate/govern/model-contracts) serve as a governance tool enabling the definition and enforcement of data structure standards in your dbt models. They allow you to specify and uphold data model guarantees, including column data types, allowing for the stability of dependent models. Should a model fail to adhere to its established contracts, it will not build successfully. @@ -43,17 +54,6 @@ A [model group](/docs/collaborate/govern/model-access#groups) in dbt is a concep - -Here are some benefits of implementing dbt Mesh: -1. **Ship data products faster**: With a more modular architecture, teams can make changes rapidly and independently in specific areas without impacting the entire system, leading to faster development cycles. -2. **Improve trust in data:** Adopting dbt Mesh helps ensure that changes in one domain's data models do not unexpectedly break dependencies in other domain areas, leading to a more secure and predictable data environment. -3. **Reduce complexity**: By organizing transformation logic into distinct domains, dbt Mesh reduces the complexity inherent in large, monolithic projects, making them easier to manage and understand. -4. **Improve collaboration**: Teams are able to share and build upon each other's work without duplicating efforts. - -Most importantly, all this can be accomplished without the central data team losing the ability to see lineage across the entire organization, or compromising on governance mechanisms. - - - This is a new way of working, and the intentionality required to build, and then maintain, cross-project interfaces and dependencies may feel like a slowdown versus what some developers are used to. This way of working introduces intentional friction that makes it more difficult to change everything at once. From 040dbb33df7d7e7bcbc1bb76ce799ae9a1fa853f Mon Sep 17 00:00:00 2001 From: "Leona B. 
Campbell" <3880403+runleonarun@users.noreply.github.com> Date: Fri, 5 Jan 2024 14:02:31 -0800 Subject: [PATCH 23/31] Apply suggestions from code review --- website/docs/best-practices/how-we-mesh/mesh-4-faqs.md | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-) diff --git a/website/docs/best-practices/how-we-mesh/mesh-4-faqs.md b/website/docs/best-practices/how-we-mesh/mesh-4-faqs.md index 8c37ebcd35d..820b609402e 100644 --- a/website/docs/best-practices/how-we-mesh/mesh-4-faqs.md +++ b/website/docs/best-practices/how-we-mesh/mesh-4-faqs.md @@ -56,7 +56,7 @@ A [model group](/docs/collaborate/govern/model-access#groups) in dbt is a concep -This is a new way of working, and the intentionality required to build, and then maintain, cross-project interfaces and dependencies may feel like a slowdown versus what some developers are used to. This way of working introduces intentional friction that makes it more difficult to change everything at once. +This is a new way of working, and the intentionality required to build, and then maintain, cross-project interfaces and dependencies may feel like a slowdown versus what some developers are used to. The intentional friction introduced promotes thoughtful changes, fostering a mindset that values stability and systematic adjustments over rapid transformations. Orchestration across multiple projects is also likely to be slightly more challenging for many organizations, although we’re currently developing new functionality that will make this process simpler. From 43c3c3749120a44bdb62cd0b9a5056928485e69c Mon Sep 17 00:00:00 2001 From: "Leona B. 
Campbell" <3880403+runleonarun@users.noreply.github.com> Date: Fri, 5 Jan 2024 14:03:43 -0800 Subject: [PATCH 24/31] Update website/docs/best-practices/how-we-mesh/mesh-4-faqs.md Co-authored-by: Matt Shaver <60105315+matthewshaver@users.noreply.github.com> --- website/docs/best-practices/how-we-mesh/mesh-4-faqs.md | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-) diff --git a/website/docs/best-practices/how-we-mesh/mesh-4-faqs.md b/website/docs/best-practices/how-we-mesh/mesh-4-faqs.md index 820b609402e..c036800c33b 100644 --- a/website/docs/best-practices/how-we-mesh/mesh-4-faqs.md +++ b/website/docs/best-practices/how-we-mesh/mesh-4-faqs.md @@ -160,7 +160,7 @@ This is not currently possible, but something we hope to enable in the near futu -dbt Cloud will soon offer a capability to trigger jobs on the completion of another job, including a job in a different project. This offers one mechanism for executing a pipeline from start to finish across projects. +dbt Cloud will soon offer the capability to trigger jobs on the completion of another job, including a job in a different project. This offers one mechanism for executing a pipeline from start to finish across projects. From 5040a486344a121ac901c0d6a12f1dc845e71803 Mon Sep 17 00:00:00 2001 From: "Leona B. 
Campbell" <3880403+runleonarun@users.noreply.github.com> Date: Fri, 5 Jan 2024 14:04:06 -0800 Subject: [PATCH 25/31] Update website/docs/best-practices/how-we-mesh/mesh-4-faqs.md Co-authored-by: Matt Shaver <60105315+matthewshaver@users.noreply.github.com> --- website/docs/best-practices/how-we-mesh/mesh-4-faqs.md | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-) diff --git a/website/docs/best-practices/how-we-mesh/mesh-4-faqs.md b/website/docs/best-practices/how-we-mesh/mesh-4-faqs.md index c036800c33b..e9b0c21dcef 100644 --- a/website/docs/best-practices/how-we-mesh/mesh-4-faqs.md +++ b/website/docs/best-practices/how-we-mesh/mesh-4-faqs.md @@ -200,7 +200,7 @@ Private or protected models require a user to have read-only access on the speci There’s model-level access within dbt; role-based access for users and groups in dbt Cloud; and access to actual underlying data within the data platform. -First things first: access to underlying data is always defined and enforced by the underlying data platform (for example, BigQuery, Databricks, Redshift, Snowflake, Starburst.) This access is managed by executing “DCL statements” (namely `grant`). dbt [makes it easy to configure `grants` on models](/reference/resource-configs/grants), which provision data access for other roles/users/groups in the data warehouse. However, dbt does **not** automatically define or coordinate those grants unless they are configured explicitly. It’s possible your organization prefers to use a separate system for managing data warehouse permissions. +First things first: access to underlying data is always defined and enforced by the underlying data platform (for example, BigQuery, Databricks, Redshift, Snowflake, Starburst, etc.) This access is managed by executing “DCL statements” (namely `grant`). dbt makes it easy to [configure `grants` on models](/reference/resource-configs/grants), which provision data access for other roles/users/groups in the data warehouse. 
However, dbt does _not_ automatically define or coordinate those grants unless they are configured explicitly. Refer to your organization's system for managing data warehouse permissions. [dbt Cloud Enterprise plans](https://www.getdbt.com/pricing) allow a system of role-based access that manages granular permissions for users and user groups. In this way, you can control which users can see or edit all aspects of a dbt Cloud project. A user’s access on a dbt Cloud projects also informs whether they can “explore” that project in detail. Roles, users, and groups are defined within the dbt Cloud application, via the UI or by integrating with an identity provider. From 8808f50caa13385575091c8c7034f835b91f22df Mon Sep 17 00:00:00 2001 From: "Leona B. Campbell" <3880403+runleonarun@users.noreply.github.com> Date: Fri, 5 Jan 2024 14:04:31 -0800 Subject: [PATCH 26/31] Update website/docs/best-practices/how-we-mesh/mesh-4-faqs.md Co-authored-by: Matt Shaver <60105315+matthewshaver@users.noreply.github.com> --- website/docs/best-practices/how-we-mesh/mesh-4-faqs.md | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-) diff --git a/website/docs/best-practices/how-we-mesh/mesh-4-faqs.md b/website/docs/best-practices/how-we-mesh/mesh-4-faqs.md index e9b0c21dcef..41fa1925bd7 100644 --- a/website/docs/best-practices/how-we-mesh/mesh-4-faqs.md +++ b/website/docs/best-practices/how-we-mesh/mesh-4-faqs.md @@ -202,7 +202,7 @@ There’s model-level access within dbt; role-based access for users and groups First things first: access to underlying data is always defined and enforced by the underlying data platform (for example, BigQuery, Databricks, Redshift, Snowflake, Starburst, etc.) This access is managed by executing “DCL statements” (namely `grant`). dbt makes it easy to [configure `grants` on models](/reference/resource-configs/grants), which provision data access for other roles/users/groups in the data warehouse. 
However, dbt does _not_ automatically define or coordinate those grants unless they are configured explicitly. Refer to your organization's system for managing data warehouse permissions. -[dbt Cloud Enterprise plans](https://www.getdbt.com/pricing) allow a system of role-based access that manages granular permissions for users and user groups. In this way, you can control which users can see or edit all aspects of a dbt Cloud project. A user’s access on a dbt Cloud projects also informs whether they can “explore” that project in detail. Roles, users, and groups are defined within the dbt Cloud application, via the UI or by integrating with an identity provider. +[dbt Cloud Enterprise plans](https://www.getdbt.com/pricing) support [role-based access control (RBAC)](/docs/cloud/manage-access/enterprise-permissions#how-to-set-up-rbac-groups-in-dbt-cloud) that manages granular permissions for users and user groups. You can control which users can see or edit all aspects of a dbt Cloud project. A user’s access to dbt Cloud projects also determines whether they can “explore” that project in detail. Roles, users, and groups are defined within the dbt Cloud application via the UI or by integrating with an identity provider. [Model access](/docs/collaborate/govern/model-access) is about defining where models can be **referenced.** It also informs the discoverability of those projects within dbt Explorer. Model `access` is defined in code, just like any other model configuration (`materialized`, `tags`, etc). From 0b36de4de2e5b58bea6aab2bee069d8cdc9897ea Mon Sep 17 00:00:00 2001 From: "Leona B. 
Campbell" <3880403+runleonarun@users.noreply.github.com> Date: Fri, 5 Jan 2024 14:04:41 -0800 Subject: [PATCH 27/31] Update website/docs/best-practices/how-we-mesh/mesh-4-faqs.md Co-authored-by: Matt Shaver <60105315+matthewshaver@users.noreply.github.com> --- website/docs/best-practices/how-we-mesh/mesh-4-faqs.md | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-) diff --git a/website/docs/best-practices/how-we-mesh/mesh-4-faqs.md b/website/docs/best-practices/how-we-mesh/mesh-4-faqs.md index 41fa1925bd7..9631579cedf 100644 --- a/website/docs/best-practices/how-we-mesh/mesh-4-faqs.md +++ b/website/docs/best-practices/how-we-mesh/mesh-4-faqs.md @@ -198,7 +198,7 @@ Private or protected models require a user to have read-only access on the speci -There’s model-level access within dbt; role-based access for users and groups in dbt Cloud; and access to actual underlying data within the data platform. +There’s model-level access within dbt, role-based access for users and groups in dbt Cloud, and access to the underlying data within the data platform. First things first: access to underlying data is always defined and enforced by the underlying data platform (for example, BigQuery, Databricks, Redshift, Snowflake, Starburst, etc.) This access is managed by executing “DCL statements” (namely `grant`). dbt makes it easy to [configure `grants` on models](/reference/resource-configs/grants), which provision data access for other roles/users/groups in the data warehouse. However, dbt does _not_ automatically define or coordinate those grants unless they are configured explicitly. Refer to your organization's system for managing data warehouse permissions. From 12619fd7b24caef7112e4b3c6cc35f64f793ee1d Mon Sep 17 00:00:00 2001 From: "Leona B. 
Campbell" <3880403+runleonarun@users.noreply.github.com> Date: Fri, 5 Jan 2024 14:04:50 -0800 Subject: [PATCH 28/31] Update website/docs/best-practices/how-we-mesh/mesh-4-faqs.md Co-authored-by: Matt Shaver <60105315+matthewshaver@users.noreply.github.com> --- website/docs/best-practices/how-we-mesh/mesh-4-faqs.md | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-) diff --git a/website/docs/best-practices/how-we-mesh/mesh-4-faqs.md b/website/docs/best-practices/how-we-mesh/mesh-4-faqs.md index 9631579cedf..c63f46546e9 100644 --- a/website/docs/best-practices/how-we-mesh/mesh-4-faqs.md +++ b/website/docs/best-practices/how-we-mesh/mesh-4-faqs.md @@ -188,7 +188,7 @@ We also expose some of this information in dbt Cloud itself in [jobs](/docs/depl ## Permissions and access - + The existence of projects that have at least one public model will be visible to everyone in the organization with [read-only access](/docs/cloud/manage-access/seats-and-users). From bbb15fd299f9427840af3e66af0a168d18a0374c Mon Sep 17 00:00:00 2001 From: "Leona B. 
Campbell" <3880403+runleonarun@users.noreply.github.com> Date: Fri, 5 Jan 2024 14:05:14 -0800 Subject: [PATCH 29/31] Update website/docs/best-practices/how-we-mesh/mesh-4-faqs.md Co-authored-by: Matt Shaver <60105315+matthewshaver@users.noreply.github.com> --- website/docs/best-practices/how-we-mesh/mesh-4-faqs.md | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-) diff --git a/website/docs/best-practices/how-we-mesh/mesh-4-faqs.md b/website/docs/best-practices/how-we-mesh/mesh-4-faqs.md index c63f46546e9..1773cdf9164 100644 --- a/website/docs/best-practices/how-we-mesh/mesh-4-faqs.md +++ b/website/docs/best-practices/how-we-mesh/mesh-4-faqs.md @@ -204,7 +204,7 @@ First things first: access to underlying data is always defined and enforced by [dbt Cloud Enterprise plans](https://www.getdbt.com/pricing) support [role-based access control (RBAC)](/docs/cloud/manage-access/enterprise-permissions#how-to-set-up-rbac-groups-in-dbt-cloud) that manages granular permissions for users and user groups. You can control which users can see or edit all aspects of a dbt Cloud project. A user’s access to dbt Cloud projects also determines whether they can “explore” that project in detail. Roles, users, and groups are defined within the dbt Cloud application via the UI or by integrating with an identity provider. -[Model access](/docs/collaborate/govern/model-access) is about defining where models can be **referenced.** It also informs the discoverability of those projects within dbt Explorer. Model `access` is defined in code, just like any other model configuration (`materialized`, `tags`, etc). +[Model access](/docs/collaborate/govern/model-access) defines where models can be referenced. It also informs the discoverability of those projects within dbt Explorer. Model `access` is defined in code, just like any other model configuration (`materialized`, `tags`, etc). **Public:** Models with `public` access can be referenced everywhere. 
These are the “data products” of your organization. From 04883ef79c1cad0b4975f23b47a2a680f8b0a610 Mon Sep 17 00:00:00 2001 From: "Leona B. Campbell" <3880403+runleonarun@users.noreply.github.com> Date: Fri, 5 Jan 2024 15:18:09 -0800 Subject: [PATCH 30/31] Apply suggestions from code review Co-authored-by: azzam34 <86269359+azzam34@users.noreply.github.com> Co-authored-by: Matt Shaver <60105315+matthewshaver@users.noreply.github.com> --- .../best-practices/how-we-mesh/mesh-4-faqs.md | 23 ++++++++++--------- 1 file changed, 12 insertions(+), 11 deletions(-) diff --git a/website/docs/best-practices/how-we-mesh/mesh-4-faqs.md b/website/docs/best-practices/how-we-mesh/mesh-4-faqs.md index 1773cdf9164..cd5f5c4e50e 100644 --- a/website/docs/best-practices/how-we-mesh/mesh-4-faqs.md +++ b/website/docs/best-practices/how-we-mesh/mesh-4-faqs.md @@ -140,7 +140,7 @@ No. A contract applies to an entire model, including all columns in the model’ - + No, a [group](/docs/collaborate/govern/model-access#groups) can only be assigned to a single owner. However, the assigned owner can be a _team_, rather than an individual. @@ -208,7 +208,8 @@ First things first: access to underlying data is always defined and enforced by **Public:** Models with `public` access can be referenced everywhere. These are the “data products” of your organization. -**Protected:** Models with `protected` access can only be referenced within the same project. This is the default level of model access. (We are discussing a future extension to `protected` models that will allow for their reference in *specific* downstream projects. Please read [the GitHub issue](https://github.com/dbt-labs/dbt-core/issues/9340), and upvote/comment if you’re interested in this use case.) +**Protected:** Models with `protected` access can only be referenced within the same project. This is the default level of model access. 
+We are discussing a future extension to `protected` models to allow for their reference in _specific_ downstream projects. Please read [the GitHub issue](https://github.com/dbt-labs/dbt-core/issues/9340), and upvote/comment if you’re interested in this use case. **Private:** Model `groups` enable more-granular control over where `private` models can be referenced. By defining a group, and configuring models to belong to that group, you can restrict other models (not in the same group) from referencing any `private` models the group contains. Groups also provide a standard mechanism for defining the `owner` of all resources it contains. @@ -218,9 +219,9 @@ Because dbt does not implicitly coordinate data warehouse `grants` with model-le - + -There is not currently! But this is something we may evaluate for the future. +Not currently! But this is something we may evaluate for the future. @@ -248,7 +249,7 @@ The [dbt Semantic Layer](/docs/use-dbt-semantic-layer/dbt-sl) and dbt Mesh are c The Semantic Layer in dbt Cloud allows teams to centrally define business metrics and dimensions. It ensures consistent and reliable metric definitions across various analytics tools and platforms. -dbt Mesh enables organizations to split their data architecture into multiple domain-specific projects, while retaining the ability to reference “public” models across projects. It is also possible to reference a “public” model from another project for the purpose of defining semantic models and metrics. In this way, your organization can have multiple dbt projects feed into a unified semantic layer, ensuring that metrics and dimensions are consistently defined and understood across these different domains. +dbt Mesh enables organizations to split their data architecture into multiple domain-specific projects, while retaining the ability to reference “public” models across projects. 
It is also possible to reference a “public” model from another project for the purpose of defining semantic models and metrics. Your organization can have multiple dbt projects feed into a unified semantic layer, ensuring that metrics and dimensions are consistently defined and understood across these domains. @@ -270,17 +271,17 @@ The [dbt Cloud CLI](/docs/cloud/cloud-cli-installation) allows users to develop -Yes — your account must be on at least dbt v1.6 to take advantage of [cross-project dependencies](/docs/collaborate/govern/project-dependencies), one of the most crucial underlying capabilities required to implement a dbt Mesh. +Yes, your account must be on [at least dbt v1.6](/docs/dbt-versions/upgrade-core-in-cloud) to take advantage of [cross-project dependencies](/docs/collaborate/govern/project-dependencies), one of the most crucial underlying capabilities required to implement a dbt Mesh. -Not all of them. While dbt Core defines several of the foundational elements for dbt Mesh, dbt Cloud offers an enhanced experience that leverages these elements for scaled collaboration across multiple teams, facilitated by multi-project discovery in dbt Explorer that’s tailored to each user’s individual access. +While dbt Core defines several of the foundational elements for dbt Mesh, dbt Cloud offers an enhanced experience that leverages these elements for scaled collaboration across multiple teams, facilitated by multi-project discovery in dbt Explorer that’s tailored to each user’s access. -Several of the key components that underpin the dbt Mesh pattern — including model contracts, versions, and access modifiers — are defined and implemented in dbt Core. We believe these are components of the core language, which is why their implementations are open source. We want to define a standard pattern that analytics engineers everywhere can adopt, extend, and help us improve. 
+Several key components that underpin the dbt Mesh pattern, including model contracts, versions, and access modifiers, are defined and implemented in dbt Core. We believe these are components of the core language, which is why their implementations are open source. We want to define a standard pattern that analytics engineers everywhere can adopt, extend, and help us improve. -To reference models defined in another project, users can also leverage a longstanding feature of dbt Core: [packages](/docs/build/packages). By importing an upstream project as a package, dbt will import all models defined in that project, which enables the resolution of cross-project references to those models — [optionally restricted](/docs/collaborate/govern/model-access#how-do-i-restrict-access-to-models-defined-in-a-package) to just the models with `public` access. +To reference models defined in another project, users can also leverage [packages](/docs/build/packages), a longstanding feature of dbt Core. By importing an upstream project as a package, dbt will import all models defined in that project, which enables the resolution of cross-project references to those models. They can be [optionally restricted](/docs/collaborate/govern/model-access#how-do-i-restrict-access-to-models-defined-in-a-package) to just the models with `public` access. The major distinction comes with dbt Cloud's metadata service, which is unique to the dbt Cloud platform and allows for the resolution of references to only the public models in a project. This service enables users to take dependencies on upstream projects, and reference just their `public` models, *without* needing to load the full complexity of those upstream projects into their local development environment. @@ -307,8 +308,8 @@ Refer to our developer guide on [How we structure our dbt Mesh projects](https:/ -Let’s say your organization has fewer than 500 models, and fewer than a dozen regular contributors to dbt. 
You’re operating at a scale that’s well served by the monolith (a single project), and the larger pattern of dbt Mesh probably isn’t a good fit. +Let’s say your organization has fewer than 500 models and fewer than a dozen regular contributors to dbt. You're operating at a scale well served by the monolith (a single project), and the larger pattern of dbt Mesh probably won't provide any immediate benefits. -That said, it’s *never too early* to think about how you’re organizing models **within** that project. Use model `groups` to define clear ownership boundaries, and `private` access to restrict purpose-built models from becoming load-bearing blocks in an unrelated section of the DAG. Your future selves will thank you for having defined these interfaces, especially if you reach a scale where it makes sense to “graduate” the interfaces between `groups` into boundaries between projects. +It’s never too early to think about how you’re organizing models _within_ that project. Use model `groups` to define clear ownership boundaries and `private` access to restrict purpose-built models from becoming load-bearing blocks in an unrelated section of the DAG. Your future selves will thank you for defining these interfaces, especially if you reach a scale where it makes sense to “graduate” the interfaces between `groups` into boundaries between projects. From ed47fbb7fb6b2d05e29abc92b39b0cae1d6a3a2c Mon Sep 17 00:00:00 2001 From: "Leona B. 
Campbell" <3880403+runleonarun@users.noreply.github.com> Date: Fri, 5 Jan 2024 15:35:00 -0800 Subject: [PATCH 31/31] Update mesh-4-faqs.md --- website/docs/best-practices/how-we-mesh/mesh-4-faqs.md | 10 ++++++---- 1 file changed, 6 insertions(+), 4 deletions(-) diff --git a/website/docs/best-practices/how-we-mesh/mesh-4-faqs.md b/website/docs/best-practices/how-we-mesh/mesh-4-faqs.md index cd5f5c4e50e..7119a3d90bd 100644 --- a/website/docs/best-practices/how-we-mesh/mesh-4-faqs.md +++ b/website/docs/best-practices/how-we-mesh/mesh-4-faqs.md @@ -10,11 +10,13 @@ dbt Mesh is a new architecture enabled by dbt Cloud. It allows you to better man ## Overview of Mesh + Here are some benefits of implementing dbt Mesh: -1. **Ship data products faster**: With a more modular architecture, teams can make changes rapidly and independently in specific areas without impacting the entire system, leading to faster development cycles. -2. **Improve trust in data:** Adopting dbt Mesh helps ensure that changes in one domain's data models do not unexpectedly break dependencies in other domain areas, leading to a more secure and predictable data environment. -3. **Reduce complexity**: By organizing transformation logic into distinct domains, dbt Mesh reduces the complexity inherent in large, monolithic projects, making them easier to manage and understand. -4. **Improve collaboration**: Teams are able to share and build upon each other's work without duplicating efforts. + +* **Ship data products faster**: With a more modular architecture, teams can make changes rapidly and independently in specific areas without impacting the entire system, leading to faster development cycles. +* **Improve trust in data:** Adopting dbt Mesh helps ensure that changes in one domain's data models do not unexpectedly break dependencies in other domain areas, leading to a more secure and predictable data environment. 
+* **Reduce complexity**: By organizing transformation logic into distinct domains, dbt Mesh reduces the complexity inherent in large, monolithic projects, making them easier to manage and understand. +* **Improve collaboration**: Teams are able to share and build upon each other's work without duplicating efforts. Most importantly, all this can be accomplished without the central data team losing the ability to see lineage across the entire organization, or compromising on governance mechanisms.