diff --git a/contributing/content-style-guide.md b/contributing/content-style-guide.md index 0d2bf243d45..e6fafec4593 100644 --- a/contributing/content-style-guide.md +++ b/contributing/content-style-guide.md @@ -284,7 +284,7 @@ If the list starts getting lengthy and dense, consider presenting the same conte A bulleted list with introductory text: -> A dbt project is a directory of `.sql` and .yml` files. The directory must contain at a minimum: +> A dbt project is a directory of `.sql` and `.yml` files. The directory must contain at a minimum: > > - Models: A model is a single `.sql` file. Each model contains a single `select` statement that either transforms raw data into a dataset that is ready for analytics or, more often, is an intermediate step in such a transformation. > - A project file: A `dbt_project.yml` file, which configures and defines your dbt project. diff --git a/website/blog/2023-12-11-semantic-layer-on-semantic-layer.md b/website/blog/2023-12-11-semantic-layer-on-semantic-layer.md new file mode 100644 index 00000000000..ea77072a6dd --- /dev/null +++ b/website/blog/2023-12-11-semantic-layer-on-semantic-layer.md @@ -0,0 +1,97 @@ +--- +title: "How we built consistent product launch metrics with the dbt Semantic Layer." +description: "We built an end-to-end data pipeline for measuring the launch of the dbt Semantic Layer using the dbt Semantic Layer." +slug: product-analytics-pipeline-with-dbt-semantic-layer + +authors: [jordan_stein] + +tags: [dbt Cloud] +hide_table_of_contents: false + +date: 2023-12-12 +is_featured: false +--- +There’s nothing quite like the feeling of launching a new product. +On launch day emotions can range from excitement, to fear, to accomplishment all in the same hour. +Once the dust settles and the product is in the wild, the next thing the team needs to do is track how the product is doing. +How many users do we have? How is performance looking? What features are customers using? How often? Answering these questions is vital to understanding the success of any product launch. + +At dbt we recently made the [Semantic Layer Generally Available](https://www.getdbt.com/blog/new-dbt-cloud-features-announced-at-coalesce-2023). The Semantic Layer lets teams define business metrics centrally, in dbt, and access them in multiple analytics tools through our semantic layer APIs. +I’m a Product Manager on the Semantic Layer team, and the launch of the Semantic Layer put our team in an interesting, somewhat “meta,” position: we need to understand how a product launch is doing, and the product we just launched is designed to make defining and consuming metrics much more efficient. It’s the perfect opportunity to put the semantic layer through its paces for product analytics. This blog post walks through the end-to-end process we used to set up product analytics for the dbt Semantic Layer using the dbt Semantic Layer. + +## Getting your data ready for metrics + +The first steps to building a product analytics pipeline with the Semantic Layer look the same as just using dbt - it’s all about data transformation. The steps we followed were broadly: + +1. Work with engineering to understand the data sources. In our case, it’s db exports from Semantic Layer Server. +2. Load the data into our warehouse. We use Fivetran and Snowflake. +3. Transform the data into normalized tables with dbt. This step is a classic. dbt’s bread and butter. You probably know the drill by now. 
+

There are [plenty of other great resources](https://docs.getdbt.com/docs/build/projects) on how to accomplish the above steps, so I’m going to skip that in this post and focus on how we built business metrics using the Semantic Layer. Once the data is loaded and modeling is complete, our DAG for the Semantic Layer data looks like the following:



Let’s walk through the DAG from left to right: first, we have raw tables from the Semantic Layer Server loaded into our warehouse; next, staging models where we apply business logic; and then a clean, normalized `fct_semantic_layer_queries` model. Finally, we built a semantic model named `semantic_layer_queries` on top of our normalized fact model. This is a typical DAG for a dbt project that contains semantic objects. Now let’s zoom in to the section of the DAG that contains our semantic layer objects and look in more detail at how we defined our semantic layer product metrics.

## [How we build semantic models and metrics](https://docs.getdbt.com/best-practices/how-we-build-our-metrics/semantic-layer-1-intro)

What [is a semantic model](https://docs.getdbt.com/docs/build/semantic-models)? Put simply, semantic models contain the components we need to build metrics. Semantic models are YAML files that live in your dbt project. They contain metadata about your dbt models in a format that MetricFlow, the query builder that powers the semantic layer, can understand. The DAG below in [dbt Explorer](https://docs.getdbt.com/docs/collaborate/explore-projects) shows the metrics we’ve built off of `semantic_layer_queries`.



Let’s dig into semantic models and metrics a bit more, and explain some of the data modeling decisions we made. First, we needed to decide what model to use as a base for our semantic model. We decided to use `fct_semantic_layer_queries` as our base model because defining a semantic model on top of a normalized fact table gives us maximum flexibility to join to other tables. This increased the number of dimensions available, which means we can answer more questions.

You may wonder: why not just build our metrics on top of raw tables and let MetricFlow figure out the rest? The reality is that you will almost always need to do some form of data modeling to create the data set you want to build your metrics off of. MetricFlow’s job isn’t to do data modeling. The transformation step is done with dbt.

Next, we had to decide what we wanted to put into our semantic models. Semantic models contain [dimensions](https://docs.getdbt.com/docs/build/dimensions), [measures](https://docs.getdbt.com/docs/build/measures), and [entities](https://docs.getdbt.com/docs/build/entities). We took the following approach to add each of these components (a trimmed-down sketch of the resulting semantic model follows this list):

- Dimensions: We included all the relevant dimensions in our semantic model that stakeholders might ask for, like the time a query was created, the query status, and booleans showing if a query contained certain elements like a where filter or multiple metrics.
- Entities: We added entities to our semantic model, like dbt Cloud environment id. Entities function as join keys in semantic models, which means any other semantic models that have a [joinable entity](https://docs.getdbt.com/docs/build/join-logic) can be used when querying metrics.
- Measures: Next, we added measures. Measures define the aggregation you want to run on your data. I think of measures as metric primitives: we’ll use them to build metrics and can reuse them to keep our code [DRY](https://docs.getdbt.com/terms/dry).
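To make this concrete, here’s a trimmed-down sketch of what our semantic model looks like in YAML. The column names are illustrative rather than our exact schema, but the shape is standard MetricFlow config:

```yaml
semantic_models:
  - name: semantic_layer_queries
    description: One record per query issued against the Semantic Layer APIs.
    model: ref('fct_semantic_layer_queries')
    defaults:
      agg_time_dimension: created_at
    entities:
      # entities are the join keys that let metrics pick up dimensions
      # from other semantic models
      - name: query_id
        type: primary
      - name: dbt_cloud_environment
        type: foreign
        expr: dbt_cloud_environment_id
    dimensions:
      # attributes stakeholders slice by
      - name: created_at
        type: time
        type_params:
          time_granularity: day
      - name: query_status
        type: categorical
      - name: has_where_filter
        type: categorical
    measures:
      # aggregations we'll reference when defining metrics
      - name: queries
        agg: count
        expr: query_id
```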
Finally, we reference the measures defined in our semantic model to create metrics. Our initial set of usage metrics are all relatively simple aggregations, for example, the total number of queries run.

```yaml
## Example of a metric definition
metrics:
  - name: queries
    description: The total number of queries run
    type: simple
    label: Semantic Layer Queries
    type_params:
      measure: queries
```

Having our metrics in the semantic layer is powerful in a few ways. Firstly, metric definitions and the generated SQL are centralized, and live in our dbt project, instead of being scattered across BI tools or SQL clients. Secondly, the types of queries I can run are dynamic and flexible. Traditionally, I would materialize a cube or rollup table which needs to contain all the different dimensional slices my users might be curious about. Now, users can join tables and add dimensionality to their metrics queries on the fly at query time, saving our data team cycles of updating and adding new fields to rollup tables. Thirdly, we can expose these metrics to a variety of downstream BI tools so stakeholders in product, finance, or GTM can understand product performance regardless of their technical skills.

Now that we’ve done the pipeline work to set up our metrics for the semantic layer launch, we’re ready to analyze how the launch went!

## Our Finance, Operations and GTM teams are all looking at the same metrics 😊

To query the Semantic Layer, you have two paths: you can query metrics directly through the Semantic Layer APIs or use one of our [first-class integrations](https://docs.getdbt.com/docs/use-dbt-semantic-layer/avail-sl-integrations). Our analytics and product teams are big Hex users, while our operations and finance teams live and breathe Google Sheets, so it’s important for us to have the same metric definitions available in both tools.

The leg work of building our pipeline and defining metrics is all done, which makes last-mile consumption much easier. First, we set up a launch dashboard in Hex as the source of truth for semantic layer product metrics. This dashboard is used by cross-functional partners like marketing, sales, and the executive team to easily check product and usage metrics like total semantic layer queries or weekly active semantic layer users. To set up our Hex connection, we simply enter a few details from our dbt Cloud environment and then we can work with metrics directly in Hex notebooks. We can use the JDBC interface, or use Hex’s GUI metric builder to build reports. We run all our weekly business reviews (WBRs) off this dashboard, which allows us to spot trends in consumption and react quickly to changes in our business.



On the finance and operations side, product usage data is crucial to making informed pricing decisions. All our pricing models are created in spreadsheets, so we leverage the Google Sheets integration to give those teams access to consistent data sets without the need to download CSVs from the Hex dashboard. This lets the Pricing team add dimensional slices, like tier and company size, to the data in a self-serve manner, and it allows our finance team to iteratively build financial models and be more self-sufficient in pulling data, instead of relying on data team resources.
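Whichever tool issues the request, the query shape is the same under the hood. As a flavor of what the JDBC interface accepts, here’s a sketch using the `queries` metric defined earlier; the weekly grouping is illustrative, not our exact production report:

```sql
-- MetricFlow compiles this into SQL against the warehouse;
-- we only name the metric and how to group it
select *
from {{
  semantic_layer.query(
    metrics=['queries'],
    group_by=[Dimension('metric_time').grain('week')]
  )
}}
```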

As a former data scientist and data engineer, I personally think this is a huge improvement over the approach I would have used without the semantic layer. My old approach would have been to materialize One Big Table with all the numeric and categorical columns I needed for analysis, then write a ton of SQL in Hex or various notebooks to create reports for stakeholders. Inevitably, I’d be signing up for more development cycles to update the pipeline whenever a new dimension needed to be added or the data needed to be aggregated in a slightly different way. From a data team management perspective, using a central semantic layer saves data analysts cycles since users can more easily self-serve. At every company I’ve ever worked at, data analysts are in high demand, with more requests than they can reasonably accomplish. This means any time a stakeholder can self-serve their data without pulling us in is a huge win.

## The Result: Consistent, Governed Metrics

And just like that, we have an end-to-end pipeline for product analytics on the dbt Semantic Layer using the dbt Semantic Layer 🤯. Part of the foundational work to build this pipeline will be familiar to you, like building out a normalized fact table using dbt. Hopefully walking through the next step of adding semantic models and metrics on top of those dbt models gave you some ideas about how you can use the semantic layer for your team. Having launch metrics defined in dbt made keeping the entire organization up to date on product adoption and performance much easier. Instead of a rollup table or static materialized cubes, we added flexible metrics without rewriting logic in SQL or adding additional tables to the end of our DAG.

The result is access to consistent and governed metrics in the tools our stakeholders are already using to do their jobs. We are able to keep the entire organization aligned and give them access to the consistent, accurate data they need to do their part to make the semantic layer product successful. Thanks for reading! If you’re thinking of using the semantic layer or have questions, we’re always happy to keep the conversation going in the [dbt Community Slack](https://www.getdbt.com/community/join-the-community). Drop us a note in #dbt-cloud-semantic-layer. We’d love to hear from you!
diff --git a/website/blog/authors.yml b/website/blog/authors.yml index 31d69824ed4..cd2bd162935 100644 --- a/website/blog/authors.yml +++ b/website/blog/authors.yml @@ -287,6 +287,14 @@ jonathan_natkins: url: https://twitter.com/nattyice name: Jon "Natty" Natkins organization: dbt Labs +jordan_stein: + image_url: /img/blog/authors/jordan.jpeg + job_title: Product Manager + links: + - icon: fa-linkedin + url: https://www.linkedin.com/in/jstein5/ + name: Jordan Stein + organization: dbt Labs josh_fell: image_url: /img/blog/authors/josh-fell.jpeg diff --git a/website/docs/best-practices/dbt-unity-catalog-best-practices.md b/website/docs/best-practices/dbt-unity-catalog-best-practices.md index 89153fe1b86..a55e1d121af 100644 --- a/website/docs/best-practices/dbt-unity-catalog-best-practices.md +++ b/website/docs/best-practices/dbt-unity-catalog-best-practices.md @@ -49,9 +49,9 @@ The **prod** service principal should have “read” access to raw source data, | | Source Data | Development catalog | Production catalog | Test catalog | | --- | --- | --- | --- | --- | -| developers | use | use, create table & create view | use or none | none | -| production service principal | use | none | use, create table & create view | none | -| Test service principal | use | none | none | use, create table & create view | +| developers | use | use, create schema, table, & view | use or none | none | +| production service principal | use | none | use, create schema, table & view | none | +| Test service principal | use | none | none | use, create schema, table & view | ## Next steps diff --git a/website/docs/best-practices/how-we-build-our-metrics/semantic-layer-1-intro.md b/website/docs/best-practices/how-we-build-our-metrics/semantic-layer-1-intro.md index 19c6717063c..ee3d4262882 100644 --- a/website/docs/best-practices/how-we-build-our-metrics/semantic-layer-1-intro.md +++ b/website/docs/best-practices/how-we-build-our-metrics/semantic-layer-1-intro.md @@ -4,10 +4,6 @@ description: Getting started with the dbt and MetricFlow hoverSnippet: Learn how to get started with the dbt and MetricFlow --- -:::tip -**This is a guide for a beta product.** We anticipate this guide will evolve alongside the Semantic Layer through community collaboration. We welcome discussions, ideas, issues, and contributions to refining best practices. -::: - Flying cars, hoverboards, and true self-service analytics: this is the future we were promised. The first two might still be a few years out, but real self-service analytics is here today. With dbt Cloud's Semantic Layer, you can resolve the tension between accuracy and flexibility that has hampered analytics tools for years, empowering everybody in your organization to explore a shared reality of metrics. Best of all for analytics engineers, building with these new tools will significantly [DRY](https://docs.getdbt.com/terms/dry) up and simplify your codebase. As you'll see, the deep interaction between your dbt models and the Semantic Layer make your dbt project the ideal place to craft your metrics. 
## Learning goals diff --git a/website/docs/best-practices/materializations/materializations-guide-4-incremental-models.md b/website/docs/best-practices/materializations/materializations-guide-4-incremental-models.md index 71b24ef58f2..a195812ff1c 100644 --- a/website/docs/best-practices/materializations/materializations-guide-4-incremental-models.md +++ b/website/docs/best-practices/materializations/materializations-guide-4-incremental-models.md @@ -7,7 +7,7 @@ displayText: Materializations best practices hoverSnippet: Read this guide to understand the incremental models you can create in dbt. --- -So far we’ve looked at tables and views, which map to the traditional objects in the data warehouse. As mentioned earlier, incremental models are a little different. This where we start to deviate from this pattern with more powerful and complex materializations. +So far we’ve looked at tables and views, which map to the traditional objects in the data warehouse. As mentioned earlier, incremental models are a little different. This is where we start to deviate from this pattern with more powerful and complex materializations. - 📚 **Incremental models generate tables.** They physically persist the data itself to the warehouse, just piece by piece. What’s different is **how we build that table**. - 💅 **Only apply our transformations to rows of data with new or updated information**, this maximizes efficiency. @@ -53,7 +53,7 @@ where updated_at > (select max(updated_at) from {{ this }}) ``` -Let’s break down that `where` clause a bit, because this where the action is with incremental models. Stepping through the code **_right-to-left_** we: +Let’s break down that `where` clause a bit, because this is where the action is with incremental models. Stepping through the code **_right-to-left_** we: 1. Get our **cutoff.** 1. Select the `max(updated_at)` timestamp — the **most recent record** @@ -138,7 +138,7 @@ where {% endif %} ``` -Fantastic! We’ve got a working incremental model. On our first run, when there is no corresponding table in the warehouse, `is_incremental` will evaluate to false and we’ll capture the entire table. On subsequent runs is it will evaluate to true and we’ll apply our filter logic, capturing only the newer data. +Fantastic! We’ve got a working incremental model. On our first run, when there is no corresponding table in the warehouse, `is_incremental` will evaluate to false and we’ll capture the entire table. On subsequent runs it will evaluate to true and we’ll apply our filter logic, capturing only the newer data. ### Late arriving facts diff --git a/website/docs/community/resources/oss-expectations.md b/website/docs/community/resources/oss-expectations.md index 9c916de1240..a57df7fe67f 100644 --- a/website/docs/community/resources/oss-expectations.md +++ b/website/docs/community/resources/oss-expectations.md @@ -94,6 +94,8 @@ PRs are your surest way to make the change you want to see in dbt / packages / d **Every PR should be associated with an issue.** Why? Before you spend a lot of time working on a contribution, we want to make sure that your proposal will be accepted. You should open an issue first, describing your desired outcome and outlining your planned change. If you've found an older issue that's already open, comment on it with an outline for your planned implementation. Exception to this rule: If you're just opening a PR for a cosmetic fix, such as a typo in documentation, an issue isn't needed. 
+**PRs should include robust testing.** Comprehensive testing within pull requests is crucial for the stability of our project. By prioritizing robust testing, we ensure the reliability of our codebase, minimize unforeseen issues, and safeguard against potential regressions. We understand that creating thorough tests often requires significant effort, and your dedication to this process greatly contributes to the project's overall reliability. Thank you for your commitment to maintaining the integrity of our codebase!
+
**Our goal is to review most new PRs within 7 days.** The first review will include some high-level comments about the implementation, including (at a high level) whether it’s something we think suitable to merge. Depending on the scope of the PR, the first review may include line-level code suggestions, or we may delay specific code review until the PR is more finalized / until we have more time.

**Automation that can help us:** Many repositories have a template for pull request descriptions, which will include a checklist that must be completed before the PR can be merged. You don’t have to do all of these things to get an initial PR, but they definitely help. Those may include things like:
diff --git a/website/docs/community/spotlight/alison-stanton.md b/website/docs/community/spotlight/alison-stanton.md
index 2054f78b0b7..fd4a1796411 100644
--- a/website/docs/community/spotlight/alison-stanton.md
+++ b/website/docs/community/spotlight/alison-stanton.md
@@ -17,6 +17,7 @@ socialLinks:
   link: https://github.com/alison985/
 dateCreated: 2023-11-07
 hide_table_of_contents: true
+communityAward: true
 ---

 ## When did you join the dbt community and in what way has it impacted your career?
diff --git a/website/docs/community/spotlight/bruno-de-lima.md b/website/docs/community/spotlight/bruno-de-lima.md
index 0365ee6c6a8..3cce6135ae0 100644
--- a/website/docs/community/spotlight/bruno-de-lima.md
+++ b/website/docs/community/spotlight/bruno-de-lima.md
@@ -20,6 +20,7 @@ socialLinks:
   link: https://medium.com/@bruno.szdl
 dateCreated: 2023-11-05
 hide_table_of_contents: true
+communityAward: true
 ---

 ## When did you join the dbt community and in what way has it impacted your career?
diff --git a/website/docs/community/spotlight/dakota-kelley.md b/website/docs/community/spotlight/dakota-kelley.md
index 57834d9cdff..85b79f0e85a 100644
--- a/website/docs/community/spotlight/dakota-kelley.md
+++ b/website/docs/community/spotlight/dakota-kelley.md
@@ -15,6 +15,7 @@ socialLinks:
   link: https://www.linkedin.com/in/dakota-kelley/
 dateCreated: 2023-11-08
 hide_table_of_contents: true
+communityAward: true
 ---

 ## When did you join the dbt community and in what way has it impacted your career?
diff --git a/website/docs/community/spotlight/fabiyi-opeyemi.md b/website/docs/community/spotlight/fabiyi-opeyemi.md
index f67ff4aaefc..67efd90e1c5 100644
--- a/website/docs/community/spotlight/fabiyi-opeyemi.md
+++ b/website/docs/community/spotlight/fabiyi-opeyemi.md
@@ -18,6 +18,7 @@ socialLinks:
   link: https://www.linkedin.com/in/opeyemifabiyi/
 dateCreated: 2023-11-06
 hide_table_of_contents: true
+communityAward: true
 ---

 ## When did you join the dbt community and in what way has it impacted your career?
diff --git a/website/docs/community/spotlight/josh-devlin.md b/website/docs/community/spotlight/josh-devlin.md index d8a9b91c282..6036105940b 100644 --- a/website/docs/community/spotlight/josh-devlin.md +++ b/website/docs/community/spotlight/josh-devlin.md @@ -23,6 +23,7 @@ socialLinks: link: https://www.linkedin.com/in/josh-devlin/ dateCreated: 2023-11-10 hide_table_of_contents: true +communityAward: true --- ## When did you join the dbt community and in what way has it impacted your career? diff --git a/website/docs/community/spotlight/karen-hsieh.md b/website/docs/community/spotlight/karen-hsieh.md index 5147f39ce59..22d6915baf7 100644 --- a/website/docs/community/spotlight/karen-hsieh.md +++ b/website/docs/community/spotlight/karen-hsieh.md @@ -24,6 +24,7 @@ socialLinks: link: https://medium.com/@ijacwei dateCreated: 2023-11-04 hide_table_of_contents: true +communityAward: true --- ## When did you join the dbt community and in what way has it impacted your career? diff --git a/website/docs/community/spotlight/oliver-cramer.md b/website/docs/community/spotlight/oliver-cramer.md index bfd62db0908..7e9974a8a2c 100644 --- a/website/docs/community/spotlight/oliver-cramer.md +++ b/website/docs/community/spotlight/oliver-cramer.md @@ -16,6 +16,7 @@ socialLinks: link: https://www.linkedin.com/in/oliver-cramer/ dateCreated: 2023-11-02 hide_table_of_contents: true +communityAward: true --- ## When did you join the dbt community and in what way has it impacted your career? diff --git a/website/docs/community/spotlight/sam-debruyn.md b/website/docs/community/spotlight/sam-debruyn.md index 166adf58b09..24ce7aa1b15 100644 --- a/website/docs/community/spotlight/sam-debruyn.md +++ b/website/docs/community/spotlight/sam-debruyn.md @@ -18,6 +18,7 @@ socialLinks: link: https://debruyn.dev/ dateCreated: 2023-11-03 hide_table_of_contents: true +communityAward: true --- ## When did you join the dbt community and in what way has it impacted your career? diff --git a/website/docs/community/spotlight/stacy-lo.md b/website/docs/community/spotlight/stacy-lo.md index f0b70fcc225..23a5491dd18 100644 --- a/website/docs/community/spotlight/stacy-lo.md +++ b/website/docs/community/spotlight/stacy-lo.md @@ -17,6 +17,7 @@ socialLinks: link: https://www.linkedin.com/in/olycats/ dateCreated: 2023-11-01 hide_table_of_contents: true +communityAward: true --- ## When did you join the dbt community and in what way has it impacted your career? diff --git a/website/docs/community/spotlight/sydney-burns.md b/website/docs/community/spotlight/sydney-burns.md index ecebd6cdec0..25278ac6ecf 100644 --- a/website/docs/community/spotlight/sydney-burns.md +++ b/website/docs/community/spotlight/sydney-burns.md @@ -15,6 +15,7 @@ socialLinks: link: https://www.linkedin.com/in/sydneyeburns/ dateCreated: 2023-11-09 hide_table_of_contents: true +communityAward: true --- ## When did you join the dbt community and in what way has it impacted your career? diff --git a/website/docs/docs/build/dimensions.md b/website/docs/docs/build/dimensions.md index 683ff730d3c..3c4edd9aef0 100644 --- a/website/docs/docs/build/dimensions.md +++ b/website/docs/docs/build/dimensions.md @@ -252,7 +252,8 @@ This example shows how to create slowly changing dimensions (SCD) using a semant | 333 | 2 | 2020-08-19 | 2021-10-22| | 333 | 3 | 2021-10-22 | 2048-01-01| -Take note of the extra arguments under `validity_params`: `is_start` and `is_end`. 
These arguments indicate the columns in the SCD table that contain the start and end dates for each tier (or beginning or ending timestamp column for a dimensional value). + +The `validity_params` include two important arguments — `is_start` and `is_end`. These specify the columns in the SCD table that mark the start and end dates (or timestamps) for each tier or dimension. Additionally, the entity is tagged as `natural` to differentiate it from a `primary` entity. In a `primary` entity, each entity value has one row. In contrast, a `natural` entity has one row for each combination of entity value and its validity period. ```yaml semantic_models: @@ -280,9 +281,11 @@ semantic_models: - name: tier type: categorical + primary_entity: sales_person + entities: - name: sales_person - type: primary + type: natural expr: sales_person_id ``` diff --git a/website/docs/docs/build/incremental-models.md b/website/docs/docs/build/incremental-models.md index 3a597499f04..01a392c12fe 100644 --- a/website/docs/docs/build/incremental-models.md +++ b/website/docs/docs/build/incremental-models.md @@ -255,10 +255,10 @@ The `merge` strategy is available in dbt-postgres and dbt-redshift beginning in | [dbt-postgres](/reference/resource-configs/postgres-configs#incremental-materialization-strategies) | `append` | `delete+insert` | | [dbt-redshift](/reference/resource-configs/redshift-configs#incremental-materialization-strategies) | `append` | `delete+insert` | | [dbt-bigquery](/reference/resource-configs/bigquery-configs#merge-behavior-incremental-models) | `merge` | `insert_overwrite` | -| [dbt-spark](/reference/resource-configs/spark-configs#incremental-models) | `append` | `merge` (Delta only) `insert_overwrite` | -| [dbt-databricks](/reference/resource-configs/databricks-configs#incremental-models) | `append` | `merge` (Delta only) `insert_overwrite` | +| [dbt-spark](/reference/resource-configs/spark-configs#incremental-models) | `append` | `merge`, `insert_overwrite` | +| [dbt-databricks](/reference/resource-configs/databricks-configs#incremental-models) | `merge` | `append`, `insert_overwrite` | | [dbt-snowflake](/reference/resource-configs/snowflake-configs#merge-behavior-incremental-models) | `merge` | `append`, `delete+insert` | -| [dbt-trino](/reference/resource-configs/trino-configs#incremental) | `append` | `merge` `delete+insert` | +| [dbt-trino](/reference/resource-configs/trino-configs#incremental) | `append` | `merge`, `delete+insert` | @@ -267,13 +267,13 @@ The `merge` strategy is available in dbt-postgres and dbt-redshift beginning in | data platform adapter | default strategy | additional supported strategies | | :----------------- | :----------------| : ---------------------------------- | -| [dbt-postgres](/reference/resource-configs/postgres-configs#incremental-materialization-strategies) | `append` | `merge` , `delete+insert` | -| [dbt-redshift](/reference/resource-configs/redshift-configs#incremental-materialization-strategies) | `append` | `merge`, `delete+insert` | +| [dbt-postgres](/reference/resource-configs/postgres-configs#incremental-materialization-strategies) | `append` | `merge` , `delete+insert` | +| [dbt-redshift](/reference/resource-configs/redshift-configs#incremental-materialization-strategies) | `append` | `merge`, `delete+insert` | | [dbt-bigquery](/reference/resource-configs/bigquery-configs#merge-behavior-incremental-models) | `merge` | `insert_overwrite` | -| [dbt-spark](/reference/resource-configs/spark-configs#incremental-models) | `append` | `merge` (Delta only) 
`insert_overwrite` | -| [dbt-databricks](/reference/resource-configs/databricks-configs#incremental-models) | `append` | `merge` (Delta only) `insert_overwrite` | +| [dbt-spark](/reference/resource-configs/spark-configs#incremental-models) | `append` | `merge`, `insert_overwrite` | +| [dbt-databricks](/reference/resource-configs/databricks-configs#incremental-models) | `merge` | `append`, `insert_overwrite` | | [dbt-snowflake](/reference/resource-configs/snowflake-configs#merge-behavior-incremental-models) | `merge` | `append`, `delete+insert` | -| [dbt-trino](/reference/resource-configs/trino-configs#incremental) | `append` | `merge` `delete+insert` | +| [dbt-trino](/reference/resource-configs/trino-configs#incremental) | `append` | `merge`, `delete+insert` | diff --git a/website/docs/docs/build/join-logic.md b/website/docs/docs/build/join-logic.md index 9039822c9fd..29b9d101a59 100644 --- a/website/docs/docs/build/join-logic.md +++ b/website/docs/docs/build/join-logic.md @@ -84,10 +84,6 @@ mf query --metrics average_purchase_price --dimensions metric_time,user_id__type ## Multi-hop joins -:::info -This feature is currently in development and not currently available. -::: - MetricFlow allows users to join measures and dimensions across a graph of entities, which we refer to as a 'multi-hop join.' This is because users can move from one table to another like a 'hop' within a graph. Here's an example schema for reference: @@ -134,9 +130,6 @@ semantic_models: ### Query multi-hop joins -:::info -This feature is currently in development and not currently available. -::: To query dimensions _without_ a multi-hop join involved, you can use the fully qualified dimension name with the syntax entity double underscore (dunder) dimension, like `entity__dimension`. diff --git a/website/docs/docs/build/metricflow-commands.md b/website/docs/docs/build/metricflow-commands.md index 7e535e4ea62..e3bb93da964 100644 --- a/website/docs/docs/build/metricflow-commands.md +++ b/website/docs/docs/build/metricflow-commands.md @@ -556,3 +556,8 @@ Keep in mind that modifying your shell configuration files can have an impact on +
+Why is my query limited to 100 rows in the dbt Cloud CLI?
The default limit for queries issued from the dbt Cloud CLI is 100 rows. We set this default to prevent returning unnecessarily large data sets, as the dbt Cloud CLI is typically used to query the dbt Semantic Layer during the development process, not for production reporting or to access large data sets. For most workflows, you only need to return a subset of the data.

+However, you can change this limit if needed by setting the `--limit` option in your query. For example, to return 1000 rows, you can run `dbt sl list metrics --limit 1000`.
+
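+The same flag applies to metric queries issued from the CLI. Here's a sketch with an illustrative metric and grouping:
+
+```shell
+# return up to 1,000 rows from a metrics query instead of the default 100
+dbt sl query --metrics queries --group-by metric_time --limit 1000
+```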
diff --git a/website/docs/docs/build/semantic-models.md b/website/docs/docs/build/semantic-models.md
index 09f808d7a17..5c6883cdcee 100644
--- a/website/docs/docs/build/semantic-models.md
+++ b/website/docs/docs/build/semantic-models.md
@@ -43,7 +43,7 @@ semantic_models:
   - name: the_name_of_the_semantic_model ## Required
     description: same as always ## Optional
     model: ref('some_model') ## Required
-    default: ## Required
+    defaults: ## Required
       agg_time_dimension: dimension_name ## Required if the model contains dimensions
     entities: ## Required - see more information in entities
diff --git a/website/docs/docs/cloud/about-cloud/regions-ip-addresses.md b/website/docs/docs/cloud/about-cloud/regions-ip-addresses.md
index cc1c2531f56..119201b389d 100644
--- a/website/docs/docs/cloud/about-cloud/regions-ip-addresses.md
+++ b/website/docs/docs/cloud/about-cloud/regions-ip-addresses.md
@@ -11,8 +11,8 @@ dbt Cloud is [hosted](/docs/cloud/about-cloud/architecture) in multiple regions

| Region | Location | Access URL | IP addresses | Developer plan | Team plan | Enterprise plan |
|--------|----------|------------|--------------|----------------|-----------|-----------------|
-| North America multi-tenant [^1] | AWS us-east-1 (N. Virginia) | cloud.getdbt.com | 52.45.144.63 <br /> 54.81.134.249 <br /> 52.22.161.231 | ✅ | ✅ | ✅ |
-| North America Cell 1 [^1] | AWS us-east-1 (N. Virginia) | {account prefix}.us1.dbt.com | [Located in Account Settings](#locating-your-dbt-cloud-ip-addresses) | ❌ | ❌ | ✅ |
+| North America multi-tenant [^1] | AWS us-east-1 (N. Virginia) | cloud.getdbt.com | 52.45.144.63 <br /> 54.81.134.249 <br /> 52.22.161.231 <br /> 52.3.77.232 <br /> 3.214.191.130 <br /> 34.233.79.135 | ✅ | ✅ | ✅ |
+| North America Cell 1 [^1] | AWS us-east-1 (N. Virginia) | {account prefix}.us1.dbt.com | 52.45.144.63 <br /> 54.81.134.249 <br /> 52.22.161.231 <br /> 52.3.77.232 <br /> 3.214.191.130 <br /> 34.233.79.135 | ✅ | ❌ | ✅ |
| EMEA [^1] | AWS eu-central-1 (Frankfurt) | emea.dbt.com | 3.123.45.39 <br /> 3.126.140.248 <br /> 3.72.153.148 | ❌ | ❌ | ✅ |
| APAC [^1] | AWS ap-southeast-2 (Sydney) | au.dbt.com | 52.65.89.235 <br /> 3.106.40.33 <br /> 13.239.155.206 | ❌ | ❌ | ✅ |
| Virtual Private dbt or Single tenant | Customized | Customized | Ask [Support](/community/resources/getting-help#dbt-cloud-support) for your IPs | ❌ | ❌ | ✅ |
diff --git a/website/docs/docs/cloud/cloud-cli-installation.md b/website/docs/docs/cloud/cloud-cli-installation.md
index f3294477611..8d2196696aa 100644
--- a/website/docs/docs/cloud/cloud-cli-installation.md
+++ b/website/docs/docs/cloud/cloud-cli-installation.md
@@ -26,7 +26,7 @@ dbt commands are run against dbt Cloud's infrastructure and benefit from:

The dbt Cloud CLI is available in all [deployment regions](/docs/cloud/about-cloud/regions-ip-addresses) and for both multi-tenant and single-tenant accounts (Azure single-tenant not supported at this time).

- Ensure you are using dbt version 1.5 or higher. Refer to [dbt Cloud versions](/docs/dbt-versions/upgrade-core-in-cloud) to upgrade.
-- Note that SSH tunneling for [Postgres and Redshift](/docs/cloud/connect-data-platform/connect-redshift-postgresql-alloydb) connections and [Single sign-on (SSO)](/docs/cloud/manage-access/sso-overview) doesn't support the dbt Cloud CLI yet.
+- Note that SSH tunneling for [Postgres and Redshift](/docs/cloud/connect-data-platform/connect-redshift-postgresql-alloydb) connections doesn't support the dbt Cloud CLI yet.

## Install dbt Cloud CLI
diff --git a/website/docs/docs/cloud/manage-access/audit-log.md b/website/docs/docs/cloud/manage-access/audit-log.md
index b90bceef570..774400529e9 100644
--- a/website/docs/docs/cloud/manage-access/audit-log.md
+++ b/website/docs/docs/cloud/manage-access/audit-log.md
@@ -34,7 +34,7 @@ On the audit log page, you will see a list of various events and their associate

Click the event card to see the details about the activity that triggered the event. This view provides important details, including when it happened and what type of event was triggered. For example, if someone changes the settings for a job, you can use the event details to see which job was changed (type of event: `v1.events.job_definition.Changed`), by whom (person who triggered the event: `actor`), and when (time it was triggered: `created_at_utc`). For types of events and their descriptions, see [Events in audit log](#events-in-audit-log).

-The event details provides the key factors of an event:
+The event details provide the key factors of an event:

| Name | Description |
| -------------------- | --------------------------------------------- |
@@ -160,16 +160,22 @@ The audit log supports various events for different objects in dbt Cloud. You wi

You can search the audit log to find a specific event or actor, which is limited to the ones listed in [Events in audit log](#events-in-audit-log). The audit log successfully lists historical events spanning the last 90 days. You can search for an actor or event using the search bar, and then narrow your results using the time window.

-
+

## Exporting logs

You can use the audit log to export all historical audit results for security, compliance, and analysis purposes:

-- For events within 90 days — dbt Cloud will automatically display the 90-day selectable date range. Select **Export Selection** to download a CSV file of all the events that occurred in your organization within 90 days.
-- For events beyond 90 days — Select **Export All**. The Account Admin will receive an email link to download a CSV file of all the events that occurred in your organization.
+- **For events within 90 days** — dbt Cloud will automatically display the 90-day selectable date range.
Select **Export Selection** to download a CSV file of all the events that occurred in your organization within 90 days.

-

+- **For events beyond 90 days** — Select **Export All**. The Account Admin will receive an email link to download a CSV file of all the events that occurred in your organization.
+
+### Azure Single-tenant
+
+For users deployed in [Azure single tenant](/docs/cloud/about-cloud/tenancy), while the **Export All** button isn't available, you can conveniently use specific APIs to access all events:
+
+- [Get recent audit log events CSV](/dbt-cloud/api-v3#/operations/Get%20Recent%20Audit%20Log%20Events%20CSV) — This API returns all events in a single CSV without pagination.
+- [List recent audit log events](/dbt-cloud/api-v3#/operations/List%20Recent%20Audit%20Log%20Events) — This API returns a limited number of events at a time, which means you will need to paginate the results.
diff --git a/website/docs/docs/cloud/manage-access/sso-overview.md b/website/docs/docs/cloud/manage-access/sso-overview.md
index f613df7907e..b4954955c8c 100644
--- a/website/docs/docs/cloud/manage-access/sso-overview.md
+++ b/website/docs/docs/cloud/manage-access/sso-overview.md
@@ -57,8 +57,9 @@ Non-admin users that currently login with a password will no longer be able to d
### Security best practices

There are a few scenarios that might require you to log in with a password. We recommend these security best-practices for the most common scenarios:
-* **Onboarding partners and contractors** - We highly recommend that you add partners and contractors to your Identity Provider. IdPs like Okta and Azure Active Directory (AAD) offer capabilities explicitly for temporary employees. We highly recommend that you reach out to your IT team to provision an SSO license for these situations. Using an IdP highly secure, reduces any breach risk, and significantly increases the security posture of your dbt Cloud environment.
-* **Identity Provider is down -** Account admins will continue to be able to log in with a password which would allow them to work with your Identity Provider to troubleshoot the problem.
+* **Onboarding partners and contractors** — We highly recommend that you add partners and contractors to your Identity Provider. IdPs like Okta and Azure Active Directory (AAD) offer capabilities explicitly for temporary employees. We highly recommend that you reach out to your IT team to provision an SSO license for these situations. Using an IdP is highly secure, reduces breach risk, and significantly increases the security posture of your dbt Cloud environment.
+* **Identity Provider is down** — Account admins will continue to be able to log in with a password, which would allow them to work with your Identity Provider to troubleshoot the problem.
+* **Offboarding admins** — When offboarding admins, revoke access to dbt Cloud by deleting the user from your environment; otherwise, they can continue to use username/password credentials to log in.

### Next steps for non-admin users currently logging in with passwords

If you have any non-admin users logging into dbt Cloud with a password today:

1. Ensure that all users have a user account in your identity provider and are assigned dbt Cloud so they won’t lose access.
2. Alert all dbt Cloud users that they won’t be able to use a password for logging in anymore unless they are already an Admin with a password.
3.
We **DO NOT** recommend promoting any users to Admins just to preserve password-based logins because you will reduce the security of your dbt Cloud environment.
-**
+
+
diff --git a/website/docs/docs/cloud/secure/about-privatelink.md b/website/docs/docs/cloud/secure/about-privatelink.md
index b31e4c08a26..2134ab25cfe 100644
--- a/website/docs/docs/cloud/secure/about-privatelink.md
+++ b/website/docs/docs/cloud/secure/about-privatelink.md
@@ -23,3 +23,4 @@ dbt Cloud supports the following data platforms for use with the PrivateLink fea
- [Databricks](/docs/cloud/secure/databricks-privatelink)
- [Redshift](/docs/cloud/secure/redshift-privatelink)
- [Postgres](/docs/cloud/secure/postgres-privatelink)
+- [VCS](/docs/cloud/secure/vcs-privatelink)
diff --git a/website/docs/docs/collaborate/explore-projects.md b/website/docs/docs/collaborate/explore-projects.md
index 05326016fab..78fe6f45cc7 100644
--- a/website/docs/docs/collaborate/explore-projects.md
+++ b/website/docs/docs/collaborate/explore-projects.md
@@ -2,7 +2,7 @@
title: "Explore your dbt projects"
sidebar_label: "Explore dbt projects"
description: "Learn about dbt Explorer and how to interact with it to understand, improve, and leverage your data pipelines."
-pagination_next: "docs/collaborate/explore-multiple-projects"
+pagination_next: "docs/collaborate/model-performance"
pagination_prev: null
---
@@ -36,7 +36,7 @@ For a richer experience with dbt Explorer, you must:
- Run [dbt source freshness](/reference/commands/source#dbt-source-freshness) within a job in the environment to view source freshness data.
- Run [dbt snapshot](/reference/commands/snapshot) or [dbt build](/reference/commands/build) within a job in the environment to view snapshot details.

-Richer and more timely metadata will become available as dbt, the Discovery API, and the underlying dbt Cloud platform evolves.
+Richer and more timely metadata will become available as dbt Core, the Discovery API, and the underlying dbt Cloud platform evolve.

## Explore your project's lineage graph {#project-lineage}

@@ -46,6 +46,8 @@ If you don't see the project lineage graph immediately, click **Render Lineage**

The nodes in the lineage graph represent the project’s resources and the edges represent the relationships between the nodes. Nodes are color-coded and include iconography according to their resource type.

+By default, dbt Explorer shows the project's [applied state](/docs/dbt-cloud-apis/project-state#definition-logical-vs-applied-state-of-dbt-nodes) lineage. That is, it shows models that have been successfully built and are available to query, not just the models defined in the project.
+
To explore the lineage graphs of tests and macros, view [their resource details pages](#view-resource-details). By default, dbt Explorer excludes these resources from the full lineage graph unless a search query returns them as results.

To interact with the full lineage graph, you can:
diff --git a/website/docs/docs/collaborate/model-performance.md b/website/docs/docs/collaborate/model-performance.md
new file mode 100644
index 00000000000..7ef675b4e1e
--- /dev/null
+++ b/website/docs/docs/collaborate/model-performance.md
@@ -0,0 +1,41 @@
+---
+title: "Model performance"
+sidebar_label: "Model performance"
+description: "Learn how dbt Explorer uses run metadata to analyze model performance and quality."
+---
+
+dbt Explorer provides metadata on dbt Cloud runs for in-depth model performance and quality analysis.
This feature assists in reducing infrastructure costs and saving time for data teams by highlighting where to fine-tune projects and deployments — such as model refactoring or job configuration adjustments.
+
+
+
+:::tip Beta
+
+The model performance beta feature is now available in dbt Explorer! Check it out!
+:::
+
+## The Performance overview page
+
+You can pinpoint areas for performance enhancement by using the Performance overview page. This page presents a comprehensive analysis across all project models and displays the longest-running models, those most frequently executed, and the ones with the highest failure rates during runs/tests. Data can be segmented by environment and job type, which can offer insights into:
+
+- Most executed models (total count).
+- Models with the longest execution time (average duration).
+- Models with the most failures, detailing run failures (percentage and count) and test failures (percentage and count).
+
+Each data point links to individual models in Explorer.
+
+
+
+You can view historical metadata for up to the past three months. Select the time horizon using the filter, which defaults to a two-week lookback.
+
+
+
+## The Model performance tab
+
+You can view trends in execution times, counts, and failures by using the Model performance tab for historical performance analysis. Daily execution data includes:
+
+- Average model execution time.
+- Model execution counts, including failures/errors (total sum).
+
+Clicking on a data point reveals a table listing all job runs for that day, with each row providing a direct link to the details of a specific run.
+
+
\ No newline at end of file
diff --git a/website/docs/docs/collaborate/project-recommendations.md b/website/docs/docs/collaborate/project-recommendations.md
new file mode 100644
index 00000000000..e6263a875fc
--- /dev/null
+++ b/website/docs/docs/collaborate/project-recommendations.md
@@ -0,0 +1,50 @@
+---
+title: "Project recommendations"
+sidebar_label: "Project recommendations"
+description: "dbt Explorer provides recommendations you can act on to improve the quality of your dbt project."
+---
+
+:::tip Beta
+
+The project recommendations beta feature is now available in dbt Explorer! Check it out!
+
+:::
+
+dbt Explorer provides recommendations about your project from the `dbt_project_evaluator` [package](https://hub.getdbt.com/dbt-labs/dbt_project_evaluator/latest/) using metadata from the Discovery API.
+
+Explorer also offers a global view, showing all the recommendations across the project for easy sorting and summarizing.
+
+These recommendations provide insight into how you can build a well-documented, well-tested, and well-built project, leading to less confusion and more trust.
+
+The Recommendations overview page includes two top-level metrics measuring the test and documentation coverage of the models in your project.
+
+- **Model test coverage** — The percent of models in your project (models not from a package or imported via dbt Mesh) with at least one dbt test configured on them.
+- **Model documentation coverage** — The percent of models in your project (models not from a package or imported via dbt Mesh) with a description.
+ + + +## List of rules + +| Category | Name | Description | Package Docs Link | +| --- | --- | --- | --- | +| Modeling | Direct Join to Source | Model that joins both a model and source, indicating a missing staging model | [GitHub](https://dbt-labs.github.io/dbt-project-evaluator/0.8/rules/modeling/#direct-join-to-source) | +| Modeling | Duplicate Sources | More than one source node corresponds to the same data warehouse relation | [GitHub](https://dbt-labs.github.io/dbt-project-evaluator/0.8/rules/modeling/#duplicate-sources) | +| Modeling | Multiple Sources Joined | Models with more than one source parent, indicating lack of staging models | [GitHub](https://dbt-labs.github.io/dbt-project-evaluator/0.8/rules/modeling/#multiple-sources-joined) | +| Modeling | Root Model | Models with no parents, indicating potential hardcoded references and need for sources | [GitHub](https://dbt-labs.github.io/dbt-project-evaluator/0.8/rules/modeling/#root-models) | +| Modeling | Source Fanout | Sources with more than one model child, indicating a need for staging models | [GitHub](https://dbt-labs.github.io/dbt-project-evaluator/0.8/rules/modeling/#source-fanout) | +| Modeling | Unused Source | Sources that are not referenced by any resource | [GitHub](https://dbt-labs.github.io/dbt-project-evaluator/0.8/rules/modeling/#unused-sources) | +| Performance | Exposure Dependent on View | Exposures with at least one model parent materialized as a view, indicating potential query performance issues | [GitHub](https://dbt-labs.github.io/dbt-project-evaluator/0.8/rules/performance/#exposure-parents-materializations) | +| Testing | Missing Primary Key Test | Models with insufficient testing on the grain of the model. | [GitHub](https://dbt-labs.github.io/dbt-project-evaluator/0.8/rules/testing/#missing-primary-key-tests) | +| Documentation | Undocumented Models | Models without a model-level description | [GitHub](https://dbt-labs.github.io/dbt-project-evaluator/0.8/rules/documentation/#undocumented-models) | +| Documentation | Undocumented Source | Sources (collections of source tables) without descriptions | [GitHub](https://dbt-labs.github.io/dbt-project-evaluator/0.8/rules/documentation/#undocumented-sources) | +| Documentation | Undocumented Source Tables | Source tables without descriptions | [GitHub](https://dbt-labs.github.io/dbt-project-evaluator/0.8/rules/documentation/#undocumented-source-tables) | +| Governance | Public Model Missing Contract | Models with public access that do not have a model contract to ensure the data types | [GitHub](https://dbt-labs.github.io/dbt-project-evaluator/0.8/rules/governance/#public-models-without-contracts) | + + +## The Recommendations tab + +Models, sources and exposures each also have a Recommendations tab on their resource details page, with the specific recommendations that correspond to that resource: + + + + diff --git a/website/docs/docs/core/connect-data-platform/infer-setup.md b/website/docs/docs/core/connect-data-platform/infer-setup.md index 7642c553cc4..c04fba59a56 100644 --- a/website/docs/docs/core/connect-data-platform/infer-setup.md +++ b/website/docs/docs/core/connect-data-platform/infer-setup.md @@ -12,16 +12,21 @@ meta: slack_channel_name: n/a slack_channel_link: platform_name: 'Infer' - config_page: '/reference/resource-configs/no-configs' + config_page: '/reference/resource-configs/infer-configs' min_supported_version: n/a --- +:::info Vendor-supported plugin + +Certain core functionality may vary. 
If you would like to report a bug, request a feature, or contribute, you can check out the linked repository and open an issue.
+
+:::
+
import SetUpPages from '/snippets/_setup-pages-intro.md';

-

## Connecting to Infer with **dbt-infer**

Infer allows you to perform advanced ML Analytics within SQL as if native to your data warehouse.
you can build advanced analysis for any business use case.
Read more about SQL-inf and Infer in the [Infer documentation](https://docs.getinfer.io/).

The `dbt-infer` package allows you to use SQL-inf easily within your dbt models.
-You can read more about the `dbt-infer` package itself and how it connecst to Infer in the [dbt-infer documentation](https://dbt.getinfer.io/).
+You can read more about the `dbt-infer` package itself and how it connects to Infer in the [dbt-infer documentation](https://dbt.getinfer.io/).
+
+The dbt-infer adapter is maintained via PyPI and installed with pip.
+To install the latest dbt-infer package, simply run the following within the same shell as you run dbt:
+```shell
+pip install dbt-infer
+```
+
+Versioning of dbt-infer follows the standard dbt versioning scheme, meaning that if you are using dbt 1.2, the corresponding dbt-infer version will be named 1.2.x, where x is the latest minor version number.

Before using SQL-inf in your dbt models you need to set up an Infer account and generate an API key for the connection.
-You can read how to do that in the [Getting Started Guide](https://dbt.getinfer.io/docs/getting_started#sign-up-to-infer).
+You can read how to do that in the [Getting Started Guide](https://docs.getinfer.io/docs/reference/integrations/dbt).

The profile configuration in `profiles.yml` for `dbt-infer` should look something like this:

@@ -101,10 +114,10 @@ as native SQL functions.

Infer supports a number of SQL-inf commands, including `PREDICT`, `EXPLAIN`, `CLUSTER`, `SIMILAR_TO`, `TOPICS`, `SENTIMENT`.

-You can read more about SQL-inf and the commands it supports in the [SQL-inf Reference Guide](https://docs.getinfer.io/docs/reference).
+You can read more about SQL-inf and the commands it supports in the [SQL-inf Reference Guide](https://docs.getinfer.io/docs/category/commands).

To get you started we will give a brief example here of what such a model might look like.
-You can find other more complex examples on the [dbt-infer examples page](https://dbt.getinfer.io/docs/examples).
+You can find other more complex examples in the [dbt-infer examples repo](https://github.com/inferlabs/dbt-infer-examples).

In our simple example, we will show how to use a previous model 'user_features' to predict churn by predicting the column `has_churned`.
diff --git a/website/docs/docs/dbt-cloud-apis/schema-discovery-job.mdx b/website/docs/docs/dbt-cloud-apis/schema-discovery-job.mdx
index 8b02c5601ad..faebcd9ec2c 100644
--- a/website/docs/docs/dbt-cloud-apis/schema-discovery-job.mdx
+++ b/website/docs/docs/dbt-cloud-apis/schema-discovery-job.mdx
@@ -61,4 +61,4 @@ query JobQueryExample {

### Fields

When querying a `job`, you can use the following fields.
- + diff --git a/website/docs/docs/dbt-cloud-apis/schema.jsx b/website/docs/docs/dbt-cloud-apis/schema.jsx index 31568671573..9ac4656c984 100644 --- a/website/docs/docs/dbt-cloud-apis/schema.jsx +++ b/website/docs/docs/dbt-cloud-apis/schema.jsx @@ -173,7 +173,7 @@ export const NodeArgsTable = ({ parent, name, useBetaAPI }) => { ) } -export const SchemaTable = ({ nodeName, useBetaAPI }) => { +export const SchemaTable = ({ nodeName, useBetaAPI, exclude = [] }) => { const [data, setData] = useState(null) useEffect(() => { const fetchData = () => { @@ -255,6 +255,7 @@ export const SchemaTable = ({ nodeName, useBetaAPI }) => { {data.data.__type.fields.map(function ({ name, description, type }) { + if (exclude.includes(name)) return; return ( {name} diff --git a/website/docs/docs/dbt-cloud-apis/service-tokens.md b/website/docs/docs/dbt-cloud-apis/service-tokens.md index f1369711d2b..b0b5fbd6cfe 100644 --- a/website/docs/docs/dbt-cloud-apis/service-tokens.md +++ b/website/docs/docs/dbt-cloud-apis/service-tokens.md @@ -51,7 +51,7 @@ Job admin service tokens can authorize requests for viewing, editing, and creati Member service tokens can authorize requests for viewing and editing resources, triggering runs, and inviting members to the account. Tokens assigned the Member permission set will have the same permissions as a Member user. For more information about Member users, see "[Self-service permissions](/docs/cloud/manage-access/self-service-permissions)". **Read-only**
-Read-only service tokens can authorize requests for viewing a read-only dashboard, viewing generated documentation, and viewing source freshness reports.
+Read-only service tokens can authorize requests for viewing a read-only dashboard, viewing generated documentation, and viewing source freshness reports. This token can access and retrieve account-level information endpoints on the [Admin API](/docs/dbt-cloud-apis/admin-cloud-api) and authorize requests to the [Discovery API](/docs/dbt-cloud-apis/discovery-api).

### Enterprise plans using service account tokens
diff --git a/website/docs/docs/dbt-support.md b/website/docs/docs/dbt-support.md
index 40968b9d763..84bf92482c5 100644
--- a/website/docs/docs/dbt-support.md
+++ b/website/docs/docs/dbt-support.md
@@ -17,7 +17,9 @@ If you're developing on the command line (CLI) and have questions or need some h

## dbt Cloud support

-The global dbt Support team is available to dbt Cloud customers by email or in-product live chat. We want to help you work through implementing and utilizing dbt Cloud at your organization. Have a question you can't find an answer to in [our docs](https://docs.getdbt.com/) or [the Community Forum](https://discourse.getdbt.com/)? Our Support team is here to `dbt help` you!
+The global dbt Support team is available to dbt Cloud customers by [email](mailto:support@getdbt.com) or using the in-product live chat (💬).
+
+We want to help you work through implementing and utilizing dbt Cloud at your organization. Have a question you can't find an answer to in [our docs](https://docs.getdbt.com/) or [the Community Forum](https://discourse.getdbt.com/)? Our Support team is here to `dbt help` you!

- **Enterprise plans** — Priority [support](#severity-level-for-enterprise-support), options for custom support coverage hours, implementation assistance, dedicated management, and dbt Labs security reviews depending on price point.
- **Developer and Team plans** — 24x5 support (no service level agreement (SLA); [contact Sales](https://www.getdbt.com/pricing/) for Enterprise plan inquiries).
diff --git a/website/docs/docs/dbt-versions/core-upgrade/11-Older versions/upgrading-to-0-14-0.md b/website/docs/docs/dbt-versions/core-upgrade/11-Older versions/upgrading-to-0-14-0.md
index 036a9a2aedf..48aa14a42e5 100644
--- a/website/docs/docs/dbt-versions/core-upgrade/11-Older versions/upgrading-to-0-14-0.md
+++ b/website/docs/docs/dbt-versions/core-upgrade/11-Older versions/upgrading-to-0-14-0.md
@@ -189,7 +189,7 @@ models:

-**Configuring the `incremental_strategy for a single model:**
+**Configuring the `incremental_strategy` for a single model:**

diff --git a/website/docs/docs/dbt-versions/release-notes/74-Dec-2023/external-attributes.md b/website/docs/docs/dbt-versions/release-notes/74-Dec-2023/external-attributes.md
new file mode 100644
index 00000000000..25791b66fb1
--- /dev/null
+++ b/website/docs/docs/dbt-versions/release-notes/74-Dec-2023/external-attributes.md
@@ -0,0 +1,16 @@
+---
+title: "Update: Extended attributes is GA"
+description: "December 2023: The extended attributes feature is now GA in dbt Cloud. It enables you to override dbt adapter YAML attributes at the environment level."
+sidebar_label: "Update: Extended attributes is GA"
+sidebar_position: 10
+tags: [Dec-2023]
+date: 2023-12-06
+---
+
+The extended attributes feature in dbt Cloud is now GA! It allows for an environment-level override on any YAML attribute that a dbt adapter accepts in its `profiles.yml`. You can provide a YAML snippet to add or replace any [profile](/docs/core/connect-data-platform/profiles.yml) value.
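+
+A minimal sketch of such a snippet. The keys here are illustrative; any attribute your adapter accepts in `profiles.yml` works the same way:
+
+```yaml
+# override the connection's schema and thread count for this environment only
+schema: dbt_cloud_analytics
+threads: 8
+```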
You can provide a YAML snippet to add or replace any [profile](/docs/core/connect-data-platform/profiles.yml) value. + +To learn more, refer to [Extended attributes](/docs/dbt-cloud-environments#extended-attributes). + +The **Extended Attributes** text box is available from your environment's settings page: + + diff --git a/website/docs/docs/dbt-versions/release-notes/02-Nov-2023/explorer-updates-rn.md b/website/docs/docs/dbt-versions/release-notes/75-Nov-2023/explorer-updates-rn.md similarity index 100% rename from website/docs/docs/dbt-versions/release-notes/02-Nov-2023/explorer-updates-rn.md rename to website/docs/docs/dbt-versions/release-notes/75-Nov-2023/explorer-updates-rn.md diff --git a/website/docs/docs/dbt-versions/release-notes/02-Nov-2023/job-notifications-rn.md b/website/docs/docs/dbt-versions/release-notes/75-Nov-2023/job-notifications-rn.md similarity index 100% rename from website/docs/docs/dbt-versions/release-notes/02-Nov-2023/job-notifications-rn.md rename to website/docs/docs/dbt-versions/release-notes/75-Nov-2023/job-notifications-rn.md diff --git a/website/docs/docs/dbt-versions/release-notes/02-Nov-2023/microsoft-fabric-support-rn.md b/website/docs/docs/dbt-versions/release-notes/75-Nov-2023/microsoft-fabric-support-rn.md similarity index 100% rename from website/docs/docs/dbt-versions/release-notes/02-Nov-2023/microsoft-fabric-support-rn.md rename to website/docs/docs/dbt-versions/release-notes/75-Nov-2023/microsoft-fabric-support-rn.md diff --git a/website/docs/docs/dbt-versions/release-notes/02-Nov-2023/repo-caching.md b/website/docs/docs/dbt-versions/release-notes/75-Nov-2023/repo-caching.md similarity index 100% rename from website/docs/docs/dbt-versions/release-notes/02-Nov-2023/repo-caching.md rename to website/docs/docs/dbt-versions/release-notes/75-Nov-2023/repo-caching.md diff --git a/website/docs/docs/dbt-versions/release-notes/03-Oct-2023/api-v2v3-limit.md b/website/docs/docs/dbt-versions/release-notes/76-Oct-2023/api-v2v3-limit.md similarity index 100% rename from website/docs/docs/dbt-versions/release-notes/03-Oct-2023/api-v2v3-limit.md rename to website/docs/docs/dbt-versions/release-notes/76-Oct-2023/api-v2v3-limit.md diff --git a/website/docs/docs/dbt-versions/release-notes/03-Oct-2023/cloud-cli-pp.md b/website/docs/docs/dbt-versions/release-notes/76-Oct-2023/cloud-cli-pp.md similarity index 100% rename from website/docs/docs/dbt-versions/release-notes/03-Oct-2023/cloud-cli-pp.md rename to website/docs/docs/dbt-versions/release-notes/76-Oct-2023/cloud-cli-pp.md diff --git a/website/docs/docs/dbt-versions/release-notes/03-Oct-2023/custom-branch-fix-rn.md b/website/docs/docs/dbt-versions/release-notes/76-Oct-2023/custom-branch-fix-rn.md similarity index 100% rename from website/docs/docs/dbt-versions/release-notes/03-Oct-2023/custom-branch-fix-rn.md rename to website/docs/docs/dbt-versions/release-notes/76-Oct-2023/custom-branch-fix-rn.md diff --git a/website/docs/docs/dbt-versions/release-notes/03-Oct-2023/dbt-deps-auto-install.md b/website/docs/docs/dbt-versions/release-notes/76-Oct-2023/dbt-deps-auto-install.md similarity index 100% rename from website/docs/docs/dbt-versions/release-notes/03-Oct-2023/dbt-deps-auto-install.md rename to website/docs/docs/dbt-versions/release-notes/76-Oct-2023/dbt-deps-auto-install.md diff --git a/website/docs/docs/dbt-versions/release-notes/03-Oct-2023/explorer-public-preview-rn.md b/website/docs/docs/dbt-versions/release-notes/76-Oct-2023/explorer-public-preview-rn.md similarity index 100% rename from
website/docs/docs/dbt-versions/release-notes/03-Oct-2023/explorer-public-preview-rn.md rename to website/docs/docs/dbt-versions/release-notes/76-Oct-2023/explorer-public-preview-rn.md diff --git a/website/docs/docs/dbt-versions/release-notes/03-Oct-2023/native-retry-support-rn.md b/website/docs/docs/dbt-versions/release-notes/76-Oct-2023/native-retry-support-rn.md similarity index 100% rename from website/docs/docs/dbt-versions/release-notes/03-Oct-2023/native-retry-support-rn.md rename to website/docs/docs/dbt-versions/release-notes/76-Oct-2023/native-retry-support-rn.md diff --git a/website/docs/docs/dbt-versions/release-notes/03-Oct-2023/product-docs-sept-rn.md b/website/docs/docs/dbt-versions/release-notes/76-Oct-2023/product-docs-sept-rn.md similarity index 100% rename from website/docs/docs/dbt-versions/release-notes/03-Oct-2023/product-docs-sept-rn.md rename to website/docs/docs/dbt-versions/release-notes/76-Oct-2023/product-docs-sept-rn.md diff --git a/website/docs/docs/dbt-versions/release-notes/03-Oct-2023/sl-ga.md b/website/docs/docs/dbt-versions/release-notes/76-Oct-2023/sl-ga.md similarity index 100% rename from website/docs/docs/dbt-versions/release-notes/03-Oct-2023/sl-ga.md rename to website/docs/docs/dbt-versions/release-notes/76-Oct-2023/sl-ga.md diff --git a/website/docs/docs/dbt-versions/release-notes/04-Sept-2023/ci-updates-phase2-rn.md b/website/docs/docs/dbt-versions/release-notes/77-Sept-2023/ci-updates-phase2-rn.md similarity index 100% rename from website/docs/docs/dbt-versions/release-notes/04-Sept-2023/ci-updates-phase2-rn.md rename to website/docs/docs/dbt-versions/release-notes/77-Sept-2023/ci-updates-phase2-rn.md diff --git a/website/docs/docs/dbt-versions/release-notes/04-Sept-2023/ci-updates-phase3-rn.md b/website/docs/docs/dbt-versions/release-notes/77-Sept-2023/ci-updates-phase3-rn.md similarity index 100% rename from website/docs/docs/dbt-versions/release-notes/04-Sept-2023/ci-updates-phase3-rn.md rename to website/docs/docs/dbt-versions/release-notes/77-Sept-2023/ci-updates-phase3-rn.md diff --git a/website/docs/docs/dbt-versions/release-notes/04-Sept-2023/product-docs-summer-rn.md b/website/docs/docs/dbt-versions/release-notes/77-Sept-2023/product-docs-summer-rn.md similarity index 100% rename from website/docs/docs/dbt-versions/release-notes/04-Sept-2023/product-docs-summer-rn.md rename to website/docs/docs/dbt-versions/release-notes/77-Sept-2023/product-docs-summer-rn.md diff --git a/website/docs/docs/dbt-versions/release-notes/04-Sept-2023/removing-prerelease-versions.md b/website/docs/docs/dbt-versions/release-notes/77-Sept-2023/removing-prerelease-versions.md similarity index 100% rename from website/docs/docs/dbt-versions/release-notes/04-Sept-2023/removing-prerelease-versions.md rename to website/docs/docs/dbt-versions/release-notes/77-Sept-2023/removing-prerelease-versions.md diff --git a/website/docs/docs/dbt-versions/release-notes/05-Aug-2023/deprecation-endpoints-discovery.md b/website/docs/docs/dbt-versions/release-notes/78-Aug-2023/deprecation-endpoints-discovery.md similarity index 100% rename from website/docs/docs/dbt-versions/release-notes/05-Aug-2023/deprecation-endpoints-discovery.md rename to website/docs/docs/dbt-versions/release-notes/78-Aug-2023/deprecation-endpoints-discovery.md diff --git a/website/docs/docs/dbt-versions/release-notes/05-Aug-2023/ide-v1.2.md b/website/docs/docs/dbt-versions/release-notes/78-Aug-2023/ide-v1.2.md similarity index 100% rename from 
website/docs/docs/dbt-versions/release-notes/05-Aug-2023/ide-v1.2.md rename to website/docs/docs/dbt-versions/release-notes/78-Aug-2023/ide-v1.2.md diff --git a/website/docs/docs/dbt-versions/release-notes/05-Aug-2023/sl-revamp-beta.md b/website/docs/docs/dbt-versions/release-notes/78-Aug-2023/sl-revamp-beta.md similarity index 100% rename from website/docs/docs/dbt-versions/release-notes/05-Aug-2023/sl-revamp-beta.md rename to website/docs/docs/dbt-versions/release-notes/78-Aug-2023/sl-revamp-beta.md diff --git a/website/docs/docs/dbt-versions/release-notes/06-July-2023/faster-run.md b/website/docs/docs/dbt-versions/release-notes/79-July-2023/faster-run.md similarity index 100% rename from website/docs/docs/dbt-versions/release-notes/06-July-2023/faster-run.md rename to website/docs/docs/dbt-versions/release-notes/79-July-2023/faster-run.md diff --git a/website/docs/docs/dbt-versions/release-notes/07-June-2023/admin-api-rn.md b/website/docs/docs/dbt-versions/release-notes/80-June-2023/admin-api-rn.md similarity index 100% rename from website/docs/docs/dbt-versions/release-notes/07-June-2023/admin-api-rn.md rename to website/docs/docs/dbt-versions/release-notes/80-June-2023/admin-api-rn.md diff --git a/website/docs/docs/dbt-versions/release-notes/07-June-2023/ci-updates-phase1-rn.md b/website/docs/docs/dbt-versions/release-notes/80-June-2023/ci-updates-phase1-rn.md similarity index 100% rename from website/docs/docs/dbt-versions/release-notes/07-June-2023/ci-updates-phase1-rn.md rename to website/docs/docs/dbt-versions/release-notes/80-June-2023/ci-updates-phase1-rn.md diff --git a/website/docs/docs/dbt-versions/release-notes/07-June-2023/lint-format-rn.md b/website/docs/docs/dbt-versions/release-notes/80-June-2023/lint-format-rn.md similarity index 100% rename from website/docs/docs/dbt-versions/release-notes/07-June-2023/lint-format-rn.md rename to website/docs/docs/dbt-versions/release-notes/80-June-2023/lint-format-rn.md diff --git a/website/docs/docs/dbt-versions/release-notes/07-June-2023/product-docs-jun.md b/website/docs/docs/dbt-versions/release-notes/80-June-2023/product-docs-jun.md similarity index 100% rename from website/docs/docs/dbt-versions/release-notes/07-June-2023/product-docs-jun.md rename to website/docs/docs/dbt-versions/release-notes/80-June-2023/product-docs-jun.md diff --git a/website/docs/docs/dbt-versions/release-notes/08-May-2023/discovery-api-public-preview.md b/website/docs/docs/dbt-versions/release-notes/81-May-2023/discovery-api-public-preview.md similarity index 100% rename from website/docs/docs/dbt-versions/release-notes/08-May-2023/discovery-api-public-preview.md rename to website/docs/docs/dbt-versions/release-notes/81-May-2023/discovery-api-public-preview.md diff --git a/website/docs/docs/dbt-versions/release-notes/08-May-2023/may-ide-updates.md b/website/docs/docs/dbt-versions/release-notes/81-May-2023/may-ide-updates.md similarity index 100% rename from website/docs/docs/dbt-versions/release-notes/08-May-2023/may-ide-updates.md rename to website/docs/docs/dbt-versions/release-notes/81-May-2023/may-ide-updates.md diff --git a/website/docs/docs/dbt-versions/release-notes/08-May-2023/product-docs-may.md b/website/docs/docs/dbt-versions/release-notes/81-May-2023/product-docs-may.md similarity index 100% rename from website/docs/docs/dbt-versions/release-notes/08-May-2023/product-docs-may.md rename to website/docs/docs/dbt-versions/release-notes/81-May-2023/product-docs-may.md diff --git 
a/website/docs/docs/dbt-versions/release-notes/08-May-2023/run-details-and-logs-improvements.md b/website/docs/docs/dbt-versions/release-notes/81-May-2023/run-details-and-logs-improvements.md similarity index 100% rename from website/docs/docs/dbt-versions/release-notes/08-May-2023/run-details-and-logs-improvements.md rename to website/docs/docs/dbt-versions/release-notes/81-May-2023/run-details-and-logs-improvements.md diff --git a/website/docs/docs/dbt-versions/release-notes/08-May-2023/run-history-endpoint.md b/website/docs/docs/dbt-versions/release-notes/81-May-2023/run-history-endpoint.md similarity index 100% rename from website/docs/docs/dbt-versions/release-notes/08-May-2023/run-history-endpoint.md rename to website/docs/docs/dbt-versions/release-notes/81-May-2023/run-history-endpoint.md diff --git a/website/docs/docs/dbt-versions/release-notes/08-May-2023/run-history-improvements.md b/website/docs/docs/dbt-versions/release-notes/81-May-2023/run-history-improvements.md similarity index 100% rename from website/docs/docs/dbt-versions/release-notes/08-May-2023/run-history-improvements.md rename to website/docs/docs/dbt-versions/release-notes/81-May-2023/run-history-improvements.md diff --git a/website/docs/docs/dbt-versions/release-notes/09-April-2023/api-endpoint-restriction.md b/website/docs/docs/dbt-versions/release-notes/82-April-2023/api-endpoint-restriction.md similarity index 100% rename from website/docs/docs/dbt-versions/release-notes/09-April-2023/api-endpoint-restriction.md rename to website/docs/docs/dbt-versions/release-notes/82-April-2023/api-endpoint-restriction.md diff --git a/website/docs/docs/dbt-versions/release-notes/09-April-2023/apr-ide-updates.md b/website/docs/docs/dbt-versions/release-notes/82-April-2023/apr-ide-updates.md similarity index 100% rename from website/docs/docs/dbt-versions/release-notes/09-April-2023/apr-ide-updates.md rename to website/docs/docs/dbt-versions/release-notes/82-April-2023/apr-ide-updates.md diff --git a/website/docs/docs/dbt-versions/release-notes/09-April-2023/product-docs.md b/website/docs/docs/dbt-versions/release-notes/82-April-2023/product-docs.md similarity index 100% rename from website/docs/docs/dbt-versions/release-notes/09-April-2023/product-docs.md rename to website/docs/docs/dbt-versions/release-notes/82-April-2023/product-docs.md diff --git a/website/docs/docs/dbt-versions/release-notes/09-April-2023/scheduler-optimized.md b/website/docs/docs/dbt-versions/release-notes/82-April-2023/scheduler-optimized.md similarity index 100% rename from website/docs/docs/dbt-versions/release-notes/09-April-2023/scheduler-optimized.md rename to website/docs/docs/dbt-versions/release-notes/82-April-2023/scheduler-optimized.md diff --git a/website/docs/docs/dbt-versions/release-notes/09-April-2023/starburst-trino-ga.md b/website/docs/docs/dbt-versions/release-notes/82-April-2023/starburst-trino-ga.md similarity index 100% rename from website/docs/docs/dbt-versions/release-notes/09-April-2023/starburst-trino-ga.md rename to website/docs/docs/dbt-versions/release-notes/82-April-2023/starburst-trino-ga.md diff --git a/website/docs/docs/dbt-versions/release-notes/10-Mar-2023/1.0-deprecation.md b/website/docs/docs/dbt-versions/release-notes/83-Mar-2023/1.0-deprecation.md similarity index 100% rename from website/docs/docs/dbt-versions/release-notes/10-Mar-2023/1.0-deprecation.md rename to website/docs/docs/dbt-versions/release-notes/83-Mar-2023/1.0-deprecation.md diff --git 
a/website/docs/docs/dbt-versions/release-notes/10-Mar-2023/apiv2-limit.md b/website/docs/docs/dbt-versions/release-notes/83-Mar-2023/apiv2-limit.md similarity index 100% rename from website/docs/docs/dbt-versions/release-notes/10-Mar-2023/apiv2-limit.md rename to website/docs/docs/dbt-versions/release-notes/83-Mar-2023/apiv2-limit.md diff --git a/website/docs/docs/dbt-versions/release-notes/10-Mar-2023/mar-ide-updates.md b/website/docs/docs/dbt-versions/release-notes/83-Mar-2023/mar-ide-updates.md similarity index 100% rename from website/docs/docs/dbt-versions/release-notes/10-Mar-2023/mar-ide-updates.md rename to website/docs/docs/dbt-versions/release-notes/83-Mar-2023/mar-ide-updates.md diff --git a/website/docs/docs/dbt-versions/release-notes/10-Mar-2023/public-preview-trino-in-dbt-cloud.md b/website/docs/docs/dbt-versions/release-notes/83-Mar-2023/public-preview-trino-in-dbt-cloud.md similarity index 100% rename from website/docs/docs/dbt-versions/release-notes/10-Mar-2023/public-preview-trino-in-dbt-cloud.md rename to website/docs/docs/dbt-versions/release-notes/83-Mar-2023/public-preview-trino-in-dbt-cloud.md diff --git a/website/docs/docs/dbt-versions/release-notes/11-Feb-2023/feb-ide-updates.md b/website/docs/docs/dbt-versions/release-notes/84-Feb-2023/feb-ide-updates.md similarity index 100% rename from website/docs/docs/dbt-versions/release-notes/11-Feb-2023/feb-ide-updates.md rename to website/docs/docs/dbt-versions/release-notes/84-Feb-2023/feb-ide-updates.md diff --git a/website/docs/docs/dbt-versions/release-notes/11-Feb-2023/no-partial-parse-config.md b/website/docs/docs/dbt-versions/release-notes/84-Feb-2023/no-partial-parse-config.md similarity index 100% rename from website/docs/docs/dbt-versions/release-notes/11-Feb-2023/no-partial-parse-config.md rename to website/docs/docs/dbt-versions/release-notes/84-Feb-2023/no-partial-parse-config.md diff --git a/website/docs/docs/dbt-versions/release-notes/12-Jan-2023/ide-updates.md b/website/docs/docs/dbt-versions/release-notes/85-Jan-2023/ide-updates.md similarity index 100% rename from website/docs/docs/dbt-versions/release-notes/12-Jan-2023/ide-updates.md rename to website/docs/docs/dbt-versions/release-notes/85-Jan-2023/ide-updates.md diff --git a/website/docs/docs/dbt-versions/release-notes/23-Dec-2022/default-thread-value.md b/website/docs/docs/dbt-versions/release-notes/86-Dec-2022/default-thread-value.md similarity index 100% rename from website/docs/docs/dbt-versions/release-notes/23-Dec-2022/default-thread-value.md rename to website/docs/docs/dbt-versions/release-notes/86-Dec-2022/default-thread-value.md diff --git a/website/docs/docs/dbt-versions/release-notes/23-Dec-2022/new-jobs-default-as-off.md b/website/docs/docs/dbt-versions/release-notes/86-Dec-2022/new-jobs-default-as-off.md similarity index 100% rename from website/docs/docs/dbt-versions/release-notes/23-Dec-2022/new-jobs-default-as-off.md rename to website/docs/docs/dbt-versions/release-notes/86-Dec-2022/new-jobs-default-as-off.md diff --git a/website/docs/docs/dbt-versions/release-notes/23-Dec-2022/private-packages-clone-git-token.md b/website/docs/docs/dbt-versions/release-notes/86-Dec-2022/private-packages-clone-git-token.md similarity index 100% rename from website/docs/docs/dbt-versions/release-notes/23-Dec-2022/private-packages-clone-git-token.md rename to website/docs/docs/dbt-versions/release-notes/86-Dec-2022/private-packages-clone-git-token.md diff --git 
a/website/docs/docs/dbt-versions/release-notes/24-Nov-2022/dbt-databricks-unity-catalog-support.md b/website/docs/docs/dbt-versions/release-notes/87-Nov-2022/dbt-databricks-unity-catalog-support.md similarity index 100% rename from website/docs/docs/dbt-versions/release-notes/24-Nov-2022/dbt-databricks-unity-catalog-support.md rename to website/docs/docs/dbt-versions/release-notes/87-Nov-2022/dbt-databricks-unity-catalog-support.md diff --git a/website/docs/docs/dbt-versions/release-notes/24-Nov-2022/ide-features-ide-deprecation.md b/website/docs/docs/dbt-versions/release-notes/87-Nov-2022/ide-features-ide-deprecation.md similarity index 100% rename from website/docs/docs/dbt-versions/release-notes/24-Nov-2022/ide-features-ide-deprecation.md rename to website/docs/docs/dbt-versions/release-notes/87-Nov-2022/ide-features-ide-deprecation.md diff --git a/website/docs/docs/dbt-versions/release-notes/25-Oct-2022/cloud-integration-azure.md b/website/docs/docs/dbt-versions/release-notes/88-Oct-2022/cloud-integration-azure.md similarity index 100% rename from website/docs/docs/dbt-versions/release-notes/25-Oct-2022/cloud-integration-azure.md rename to website/docs/docs/dbt-versions/release-notes/88-Oct-2022/cloud-integration-azure.md diff --git a/website/docs/docs/dbt-versions/release-notes/25-Oct-2022/new-ide-launch.md b/website/docs/docs/dbt-versions/release-notes/88-Oct-2022/new-ide-launch.md similarity index 100% rename from website/docs/docs/dbt-versions/release-notes/25-Oct-2022/new-ide-launch.md rename to website/docs/docs/dbt-versions/release-notes/88-Oct-2022/new-ide-launch.md diff --git a/website/docs/docs/dbt-versions/release-notes/26-Sept-2022/liststeps-endpoint-deprecation.md b/website/docs/docs/dbt-versions/release-notes/89-Sept-2022/liststeps-endpoint-deprecation.md similarity index 100% rename from website/docs/docs/dbt-versions/release-notes/26-Sept-2022/liststeps-endpoint-deprecation.md rename to website/docs/docs/dbt-versions/release-notes/89-Sept-2022/liststeps-endpoint-deprecation.md diff --git a/website/docs/docs/dbt-versions/release-notes/26-Sept-2022/metadata-api-data-retention-limits.md b/website/docs/docs/dbt-versions/release-notes/89-Sept-2022/metadata-api-data-retention-limits.md similarity index 100% rename from website/docs/docs/dbt-versions/release-notes/26-Sept-2022/metadata-api-data-retention-limits.md rename to website/docs/docs/dbt-versions/release-notes/89-Sept-2022/metadata-api-data-retention-limits.md diff --git a/website/docs/docs/dbt-versions/release-notes/27-Aug-2022/ide-improvement-beta.md b/website/docs/docs/dbt-versions/release-notes/91-Aug-2022/ide-improvement-beta.md similarity index 100% rename from website/docs/docs/dbt-versions/release-notes/27-Aug-2022/ide-improvement-beta.md rename to website/docs/docs/dbt-versions/release-notes/91-Aug-2022/ide-improvement-beta.md diff --git a/website/docs/docs/dbt-versions/release-notes/27-Aug-2022/support-redshift-ra3.md b/website/docs/docs/dbt-versions/release-notes/91-Aug-2022/support-redshift-ra3.md similarity index 100% rename from website/docs/docs/dbt-versions/release-notes/27-Aug-2022/support-redshift-ra3.md rename to website/docs/docs/dbt-versions/release-notes/91-Aug-2022/support-redshift-ra3.md diff --git a/website/docs/docs/dbt-versions/release-notes/28-July-2022/render-lineage-feature.md b/website/docs/docs/dbt-versions/release-notes/92-July-2022/render-lineage-feature.md similarity index 100% rename from website/docs/docs/dbt-versions/release-notes/28-July-2022/render-lineage-feature.md rename to 
website/docs/docs/dbt-versions/release-notes/92-July-2022/render-lineage-feature.md diff --git a/website/docs/docs/dbt-versions/release-notes/29-May-2022/gitlab-auth.md b/website/docs/docs/dbt-versions/release-notes/93-May-2022/gitlab-auth.md similarity index 100% rename from website/docs/docs/dbt-versions/release-notes/29-May-2022/gitlab-auth.md rename to website/docs/docs/dbt-versions/release-notes/93-May-2022/gitlab-auth.md diff --git a/website/docs/docs/dbt-versions/release-notes/30-April-2022/audit-log.md b/website/docs/docs/dbt-versions/release-notes/94-April-2022/audit-log.md similarity index 100% rename from website/docs/docs/dbt-versions/release-notes/30-April-2022/audit-log.md rename to website/docs/docs/dbt-versions/release-notes/94-April-2022/audit-log.md diff --git a/website/docs/docs/dbt-versions/release-notes/30-April-2022/credentials-saved.md b/website/docs/docs/dbt-versions/release-notes/94-April-2022/credentials-saved.md similarity index 100% rename from website/docs/docs/dbt-versions/release-notes/30-April-2022/credentials-saved.md rename to website/docs/docs/dbt-versions/release-notes/94-April-2022/credentials-saved.md diff --git a/website/docs/docs/dbt-versions/release-notes/30-April-2022/email-verification.md b/website/docs/docs/dbt-versions/release-notes/94-April-2022/email-verification.md similarity index 100% rename from website/docs/docs/dbt-versions/release-notes/30-April-2022/email-verification.md rename to website/docs/docs/dbt-versions/release-notes/94-April-2022/email-verification.md diff --git a/website/docs/docs/dbt-versions/release-notes/30-April-2022/scheduler-improvements.md b/website/docs/docs/dbt-versions/release-notes/94-April-2022/scheduler-improvements.md similarity index 100% rename from website/docs/docs/dbt-versions/release-notes/30-April-2022/scheduler-improvements.md rename to website/docs/docs/dbt-versions/release-notes/94-April-2022/scheduler-improvements.md diff --git a/website/docs/docs/dbt-versions/release-notes/31-March-2022/ide-timeout-message.md b/website/docs/docs/dbt-versions/release-notes/95-March-2022/ide-timeout-message.md similarity index 100% rename from website/docs/docs/dbt-versions/release-notes/31-March-2022/ide-timeout-message.md rename to website/docs/docs/dbt-versions/release-notes/95-March-2022/ide-timeout-message.md diff --git a/website/docs/docs/dbt-versions/release-notes/31-March-2022/prep-and-waiting-time.md b/website/docs/docs/dbt-versions/release-notes/95-March-2022/prep-and-waiting-time.md similarity index 100% rename from website/docs/docs/dbt-versions/release-notes/31-March-2022/prep-and-waiting-time.md rename to website/docs/docs/dbt-versions/release-notes/95-March-2022/prep-and-waiting-time.md diff --git a/website/docs/docs/dbt-versions/release-notes/32-February-2022/DAG-updates-more.md b/website/docs/docs/dbt-versions/release-notes/96-February-2022/DAG-updates-more.md similarity index 100% rename from website/docs/docs/dbt-versions/release-notes/32-February-2022/DAG-updates-more.md rename to website/docs/docs/dbt-versions/release-notes/96-February-2022/DAG-updates-more.md diff --git a/website/docs/docs/dbt-versions/release-notes/32-February-2022/service-tokens-more.md b/website/docs/docs/dbt-versions/release-notes/96-February-2022/service-tokens-more.md similarity index 100% rename from website/docs/docs/dbt-versions/release-notes/32-February-2022/service-tokens-more.md rename to website/docs/docs/dbt-versions/release-notes/96-February-2022/service-tokens-more.md diff --git 
a/website/docs/docs/dbt-versions/release-notes/33-January-2022/IDE-autocomplete-more.md b/website/docs/docs/dbt-versions/release-notes/97-January-2022/IDE-autocomplete-more.md similarity index 100% rename from website/docs/docs/dbt-versions/release-notes/33-January-2022/IDE-autocomplete-more.md rename to website/docs/docs/dbt-versions/release-notes/97-January-2022/IDE-autocomplete-more.md diff --git a/website/docs/docs/dbt-versions/release-notes/33-January-2022/model-timing-more.md b/website/docs/docs/dbt-versions/release-notes/97-January-2022/model-timing-more.md similarity index 100% rename from website/docs/docs/dbt-versions/release-notes/33-January-2022/model-timing-more.md rename to website/docs/docs/dbt-versions/release-notes/97-January-2022/model-timing-more.md diff --git a/website/docs/docs/dbt-versions/release-notes/34-dbt-cloud-changelog-2021.md b/website/docs/docs/dbt-versions/release-notes/98-dbt-cloud-changelog-2021.md similarity index 100% rename from website/docs/docs/dbt-versions/release-notes/34-dbt-cloud-changelog-2021.md rename to website/docs/docs/dbt-versions/release-notes/98-dbt-cloud-changelog-2021.md diff --git a/website/docs/docs/dbt-versions/release-notes/35-dbt-cloud-changelog-2019-2020.md b/website/docs/docs/dbt-versions/release-notes/99-dbt-cloud-changelog-2019-2020.md similarity index 100% rename from website/docs/docs/dbt-versions/release-notes/35-dbt-cloud-changelog-2019-2020.md rename to website/docs/docs/dbt-versions/release-notes/99-dbt-cloud-changelog-2019-2020.md diff --git a/website/docs/docs/deploy/deploy-jobs.md b/website/docs/docs/deploy/deploy-jobs.md index e43020bf66e..cee6e245359 100644 --- a/website/docs/docs/deploy/deploy-jobs.md +++ b/website/docs/docs/deploy/deploy-jobs.md @@ -84,15 +84,22 @@ To fully customize the scheduling of your job, choose the **Custom cron schedule Use tools such as [crontab.guru](https://crontab.guru/) to generate the correct cron syntax. This tool allows you to input cron snippets and returns their plain English translations. -Refer to the following example snippets: +Here are examples of cron job schedules. The dbt Cloud job scheduler supports using `L` to schedule jobs on the last day of the month: -- `0 * * * *`: Every hour, at minute 0 -- `*/5 * * * *`: Every 5 minutes -- `5 4 * * *`: At exactly 4:05 AM UTC -- `30 */4 * * *`: At minute 30 past every 4th hour (e.g. 4:30AM, 8:30AM, 12:30PM, etc., all UTC) -- `0 0 */2 * *`: At midnight UTC every other day +- `0 * * * *`: Every hour, at minute 0. +- `*/5 * * * *`: Every 5 minutes. +- `5 4 * * *`: At exactly 4:05 AM UTC. +- `30 */4 * * *`: At minute 30 past every 4th hour (such as 4:30 AM, 8:30 AM, 12:30 PM, and so on, all UTC). +- `0 0 */2 * *`: At 12:00 AM (midnight) UTC every other day. - `0 0 * * 1`: At midnight UTC every Monday. +- `0 0 L * *`: At 12:00 AM (midnight), on the last day of the month. +- `0 0 L 1,2,3,4,5,6,8,9,10,11,12 *`: At 12:00 AM, on the last day of the month, only in January, February, March, April, May, June, August, September, October, November, and December. +- `0 0 L 7 *`: At 12:00 AM, on the last day of the month, only in July. +- `0 0 L * FRI,SAT`: At 12:00 AM, on the last day of the month, and on Friday and Saturday. +- `0 12 L * *`: At 12:00 PM (afternoon), on the last day of the month. +- `0 7 L * 5`: At 07:00 AM, on the last day of the month, and on Friday. +- `30 14 L * *`: At 02:30 PM, on the last day of the month. 
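If you want to sanity-check an expression before saving it, the sketch below uses the third-party Python package `croniter` — an assumption on our part, not dbt Cloud tooling — to preview upcoming run times. Note that nonstandard tokens such as `L` aren't supported by every cron library, so validate those against the job scheduler itself:

```python
# Preview the next few run times for a cron schedule (assumes `pip install croniter`).
from datetime import datetime, timezone

from croniter import croniter

schedule = "30 */4 * * *"  # minute 30 past every 4th hour, in UTC

print(croniter.is_valid(schedule))  # True for standard five-field syntax
runs = croniter(schedule, datetime(2024, 1, 1, tzinfo=timezone.utc))
for _ in range(3):
    print(runs.get_next(datetime))  # 00:30, 04:30, 08:30 UTC on 2024-01-01
```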
## Related docs diff --git a/website/docs/docs/deploy/retry-jobs.md b/website/docs/docs/deploy/retry-jobs.md index ea616121f38..beefb35379e 100644 --- a/website/docs/docs/deploy/retry-jobs.md +++ b/website/docs/docs/deploy/retry-jobs.md @@ -26,7 +26,7 @@ If your dbt job run completed with a status of **Error**, you can rerun it from ## Related content -- [Retry a failed run for a job](/dbt-cloud/api-v2#/operations/Retry%20a%20failed%20run%20for%20a%20job) API endpoint +- [Retry a failed run for a job](/dbt-cloud/api-v2#/operations/Retry%20Failed%20Job) API endpoint - [Run visibility](/docs/deploy/run-visibility) - [Jobs](/docs/deploy/jobs) -- [Job commands](/docs/deploy/job-commands) \ No newline at end of file +- [Job commands](/docs/deploy/job-commands) diff --git a/website/docs/docs/use-dbt-semantic-layer/gsheets.md b/website/docs/docs/use-dbt-semantic-layer/gsheets.md index 9d5a7c105ae..d7525fa7b26 100644 --- a/website/docs/docs/use-dbt-semantic-layer/gsheets.md +++ b/website/docs/docs/use-dbt-semantic-layer/gsheets.md @@ -17,6 +17,8 @@ The dbt Semantic Layer offers a seamless integration with Google Sheets through - You have a Google account with access to Google Sheets. - You can install Google add-ons. - You have a dbt Cloud Environment ID and a [service token](/docs/dbt-cloud-apis/service-tokens) to authenticate with from a dbt Cloud account. +- You must have a dbt Cloud Team or Enterprise [account](https://www.getdbt.com/pricing). This integration is suitable for both Multi-tenant and Single-tenant deployments. + - Single-tenant accounts should contact their account representative for necessary setup and enablement. ## Installing the add-on diff --git a/website/docs/docs/use-dbt-semantic-layer/quickstart-sl.md b/website/docs/docs/use-dbt-semantic-layer/quickstart-sl.md index 13a119da9a7..62437f4ecd6 100644 --- a/website/docs/docs/use-dbt-semantic-layer/quickstart-sl.md +++ b/website/docs/docs/use-dbt-semantic-layer/quickstart-sl.md @@ -26,7 +26,7 @@ MetricFlow, a powerful component of the dbt Semantic Layer, simplifies the creat Use this guide to fully experience the power of the universal dbt Semantic Layer. Here are the following steps you'll take: - [Create a semantic model](#create-a-semantic-model) in dbt Cloud using MetricFlow -- [Define metrics](#define-metrics) in dbt Cloud using MetricFlow +- [Define metrics](#define-metrics) in dbt using MetricFlow - [Test and query metrics](#test-and-query-metrics) with MetricFlow - [Run a production job](#run-a-production-job) in dbt Cloud - [Set up dbt Semantic Layer](#setup) in dbt Cloud diff --git a/website/docs/docs/use-dbt-semantic-layer/tableau.md b/website/docs/docs/use-dbt-semantic-layer/tableau.md index a5c1b6edd04..0f12a75f468 100644 --- a/website/docs/docs/use-dbt-semantic-layer/tableau.md +++ b/website/docs/docs/use-dbt-semantic-layer/tableau.md @@ -21,7 +21,8 @@ This integration provides a live connection to the dbt Semantic Layer through Ta - Note that Tableau Online does not currently support custom connectors natively. If you use Tableau Online, you will only be able to access the connector in Tableau Desktop. - Log in to Tableau Desktop (with Online or Server credentials) or a license to Tableau Server - You need your dbt Cloud host, [Environment ID](/docs/use-dbt-semantic-layer/setup-sl#set-up-dbt-semantic-layer) and [service token](/docs/dbt-cloud-apis/service-tokens) to log in. This account should be set up with the dbt Semantic Layer.
-- You must have a dbt Cloud Team or Enterprise [account](https://www.getdbt.com/pricing) and multi-tenant [deployment](/docs/cloud/about-cloud/regions-ip-addresses). (Single-Tenant coming soon) +- You must have a dbt Cloud Team or Enterprise [account](https://www.getdbt.com/pricing). This integration is suitable for both Multi-tenant and Single-tenant deployments. + - Single-tenant accounts should contact their account representative for necessary setup and enablement. ## Installing the Connector diff --git a/website/docs/guides/airflow-and-dbt-cloud.md b/website/docs/guides/airflow-and-dbt-cloud.md index a3ff59af14e..e7f754ef02d 100644 --- a/website/docs/guides/airflow-and-dbt-cloud.md +++ b/website/docs/guides/airflow-and-dbt-cloud.md @@ -11,50 +11,29 @@ recently_updated: true ## Introduction -In some cases, [Airflow](https://airflow.apache.org/) may be the preferred orchestrator for your organization over working fully within dbt Cloud. There are a few reasons your team might be considering using Airflow to orchestrate your dbt jobs: - -- Your team is already using Airflow to orchestrate other processes -- Your team needs to ensure that a [dbt job](https://docs.getdbt.com/docs/dbt-cloud/cloud-overview#schedule-and-run-dbt-jobs-in-production) kicks off before or after another process outside of dbt Cloud -- Your team needs flexibility to manage more complex scheduling, such as kicking off one dbt job only after another has completed -- Your team wants to own their own orchestration solution -- You need code to work right now without starting from scratch - -### Prerequisites - -- [dbt Cloud Teams or Enterprise account](https://www.getdbt.com/pricing/) (with [admin access](https://docs.getdbt.com/docs/cloud/manage-access/enterprise-permissions)) in order to create a service token. Permissions for service tokens can be found [here](https://docs.getdbt.com/docs/dbt-cloud-apis/service-tokens#permissions-for-service-account-tokens). -- A [free Docker account](https://hub.docker.com/signup) in order to sign in to Docker Desktop, which will be installed in the initial setup. -- A local digital scratchpad for temporarily copy-pasting API keys and URLs - -### Airflow + dbt Core - -There are [so many great examples](https://gitlab.com/gitlab-data/analytics/-/blob/master/dags/transformation/dbt_snowplow_backfill.py) from GitLab through their open source data engineering work. This is especially appropriate if you are well-versed in Kubernetes, CI/CD, and docker task management when building your airflow pipelines. If this is you and your team, you’re in good hands reading through more details [here](https://about.gitlab.com/handbook/business-technology/data-team/platform/infrastructure/#airflow) and [here](https://about.gitlab.com/handbook/business-technology/data-team/platform/dbt-guide/). - -### Airflow + dbt Cloud API w/Custom Scripts - -This has served as a bridge until the fabled Astronomer + dbt Labs-built dbt Cloud provider became generally available [here](https://registry.astronomer.io/providers/dbt%20Cloud/versions/latest).
- -There are many different permutations of this over time: - -- [Custom Python Scripts](https://github.com/sungchun12/airflow-dbt-cloud/blob/main/archive/dbt_cloud_example.py): This is an airflow DAG based on [custom python API utilities](https://github.com/sungchun12/airflow-dbt-cloud/blob/main/archive/dbt_cloud_utils.py) -- [Make API requests directly through the BashOperator based on the docs](https://docs.getdbt.com/dbt-cloud/api-v2-legacy#operation/triggerRun): You can make cURL requests to invoke dbt Cloud to do what you want -- For more options, check out the [official dbt Docs](/docs/deploy/deployments#airflow) on the various ways teams are running dbt in airflow -These solutions are great, but can be difficult to trust as your team grows and management for things like: testing, job definitions, secrets, and pipelines increase past your team’s capacity. Roles become blurry (or were never clearly defined at the start!). Both data and analytics engineers start digging through custom logging within each other’s workflows to make heads or tails of where and what the issue really is. Not to mention that when the issue is found, it can be even harder to decide on the best path forward for safely implementing fixes. This complex workflow and unclear delineation on process management results in a lot of misunderstandings and wasted time just trying to get the process to work smoothly! +Many organizations already use [Airflow](https://airflow.apache.org/) to orchestrate their data workflows. dbt Cloud works great with Airflow, letting you execute your dbt code in dbt Cloud while keeping orchestration duties with Airflow. This ensures your project's metadata (important for tools like dbt Explorer) is available and up-to-date, while still enabling you to use Airflow for general tasks such as: +- Scheduling other processes outside of dbt runs +- Ensuring that a [dbt job](/docs/deploy/job-scheduler) kicks off before or after another process outside of dbt Cloud +- Triggering a dbt job only after another has completed In this guide, you'll learn how to: -1. Creating a working local Airflow environment -2. Invoking a dbt Cloud job with Airflow (with proof!) -3. Reusing tested and trusted Airflow code for your specific use cases +1. Create a working local Airflow environment +2. Invoke a dbt Cloud job with Airflow +3. Reuse tested and trusted Airflow code for your specific use cases You’ll also gain a better understanding of how this will: - Reduce the cognitive load when building and maintaining pipelines - Avoid dependency hell (think: `pip install` conflicts) -- Implement better recoveries from failures -- Define clearer workflows so that data and analytics engineers work better, together ♥️ +- Define a clearer handoff of workflows between data engineers and analytics engineers + +## Prerequisites +- [dbt Cloud Teams or Enterprise account](https://www.getdbt.com/pricing/) (with [admin access](/docs/cloud/manage-access/enterprise-permissions)) in order to create a service token. Permissions for service tokens can be found [here](/docs/dbt-cloud-apis/service-tokens#permissions-for-service-account-tokens). +- A [free Docker account](https://hub.docker.com/signup) in order to sign in to Docker Desktop, which will be installed in the initial setup. +- A local digital scratchpad for temporarily copy-pasting API keys and URLs 🙌 Let’s get started!
🙌 @@ -72,7 +51,7 @@ brew install astro ## Install and start Docker Desktop -Docker allows us to spin up an environment with all the apps and dependencies we need for the example. +Docker allows us to spin up an environment with all the apps and dependencies we need for this guide. Follow the instructions [here](https://docs.docker.com/desktop/) to install Docker desktop for your own operating system. Once Docker is installed, ensure you have it up and running for the next steps. @@ -80,7 +59,7 @@ Follow the instructions [here](https://docs.docker.com/desktop/) to install Dock ## Clone the airflow-dbt-cloud repository -Open your terminal and clone the [airflow-dbt-cloud repository](https://github.com/sungchun12/airflow-dbt-cloud.git). This contains example Airflow DAGs that you’ll use to orchestrate your dbt Cloud job. Once cloned, navigate into the `airflow-dbt-cloud` project. +Open your terminal and clone the [airflow-dbt-cloud repository](https://github.com/sungchun12/airflow-dbt-cloud). This contains example Airflow DAGs that you’ll use to orchestrate your dbt Cloud job. Once cloned, navigate into the `airflow-dbt-cloud` project. ```bash git clone https://github.com/sungchun12/airflow-dbt-cloud.git @@ -91,12 +70,9 @@ cd airflow-dbt-cloud ## Start the Docker container -You can initialize an Astronomer project in an empty local directory using a Docker container, and then run your project locally using the `start` command. - -1. Run the following commands to initialize your project and start your local Airflow deployment: +1. From the `airflow-dbt-cloud` directory you cloned and opened in the prior step, run the following command to start your local Airflow deployment: ```bash - astro dev init astro dev start ``` @@ -110,10 +86,10 @@ You can initialize an Astronomer project in an empty local directory using a Doc Airflow Webserver: http://localhost:8080 Postgres Database: localhost:5432/postgres The default Airflow UI credentials are: admin:admin - The default Postrgres DB credentials are: postgres:postgres + The default Postgres DB credentials are: postgres:postgres ``` -2. Open the Airflow interface. Launch your web browser and navigate to the address for the **Airflow Webserver** from your output in Step 1. +2. Open the Airflow interface. Launch your web browser and navigate to the address for the **Airflow Webserver** from your output above (for us, `http://localhost:8080`). This will take you to your local instance of Airflow. You’ll need to log in with the **default credentials**: @@ -126,15 +102,15 @@ You can initialize an Astronomer project in an empty local directory using a Doc ## Create a dbt Cloud service token -Create a service token from within dbt Cloud using the instructions [found here](https://docs.getdbt.com/docs/dbt-cloud-apis/service-tokens). Ensure that you save a copy of the token, as you won’t be able to access this later. In this example we use `Account Admin`, but you can also use `Job Admin` instead for token permissions. +[Create a service token](/docs/dbt-cloud-apis/service-tokens) with `Job Admin` privileges from within dbt Cloud. Ensure that you save a copy of the token, as you won’t be able to access this later. ## Create a dbt Cloud job -In your dbt Cloud account create a job, paying special attention to the information in the bullets below. Additional information for creating a dbt Cloud job can be found [here](/guides/bigquery). 
+[Create a job in your dbt Cloud account](/docs/deploy/deploy-jobs#create-and-schedule-jobs), paying special attention to the information in the bullets below. -- Configure the job with the commands that you want to include when this job kicks off, as Airflow will be referring to the job’s configurations for this rather than being explicitly coded in the Airflow DAG. This job will run a set of commands rather than a single command. +- Configure the job with the full commands that you want to include when this job kicks off. This sample code has Airflow triggering the dbt Cloud job and all of its commands, instead of explicitly identifying individual models to run from inside of Airflow. - Ensure that the schedule is turned **off** since we’ll be using Airflow to kick things off. - Once you hit `save` on the job, make sure you copy the URL and save it for referencing later. The url will look similar to this: @@ -144,77 +120,59 @@ https://cloud.getdbt.com/#/accounts/{account_id}/projects/{project_id}/jobs/{job -## Add your dbt Cloud API token as a secure connection - - +## Connect dbt Cloud to Airflow -Now you have all the working pieces to get up and running with Airflow + dbt Cloud. Let’s dive into make this all work together. We will **set up a connection** and **run a DAG in Airflow** that kicks off a dbt Cloud job. +Now you have all the working pieces to get up and running with Airflow + dbt Cloud. It's time to **set up a connection** and **run a DAG in Airflow** that kicks off a dbt Cloud job. -1. Navigate to Admin and click on **Connections** +1. From the Airflow interface, navigate to Admin and click on **Connections** ![Airflow connections menu](/img/guides/orchestration/airflow-and-dbt-cloud/airflow-connections-menu.png) 2. Click on the `+` sign to add a new connection, then click on the drop down to search for the dbt Cloud Connection Type - ![Create connection](/img/guides/orchestration/airflow-and-dbt-cloud/create-connection.png) - ![Connection type](/img/guides/orchestration/airflow-and-dbt-cloud/connection-type.png) 3. Add in your connection details and your default dbt Cloud account id. This is found in your dbt Cloud URL after the accounts route section (`/accounts/{YOUR_ACCOUNT_ID}`), for example the account with id 16173 would see this in their URL: `https://cloud.getdbt.com/#/accounts/16173/projects/36467/jobs/65767/` -![https://lh3.googleusercontent.com/sRxe5xbv_LYhIKblc7eiY7AmByr1OibOac2_fIe54rpU3TBGwjMpdi_j0EPEFzM1_gNQXry7Jsm8aVw9wQBSNs1I6Cyzpvijaj0VGwSnmVf3OEV8Hv5EPOQHrwQgK2RhNBdyBxN2](https://lh3.googleusercontent.com/sRxe5xbv_LYhIKblc7eiY7AmByr1OibOac2_fIe54rpU3TBGwjMpdi_j0EPEFzM1_gNQXry7Jsm8aVw9wQBSNs1I6Cyzpvijaj0VGwSnmVf3OEV8Hv5EPOQHrwQgK2RhNBdyBxN2) - -## Add your `job_id` and `account_id` config details to the python file + ![Connection type](/img/guides/orchestration/airflow-and-dbt-cloud/connection-type-configured.png) - Add your `job_id` and `account_id` config details to the python file: [dbt_cloud_provider_eltml.py](https://github.com/sungchun12/airflow-dbt-cloud/blob/main/dags/dbt_cloud_provider_eltml.py). +## Update the placeholders in the sample code -1. You’ll find these details within the dbt Cloud job URL, see the comments in the code snippet below for an example. + Add your `account_id` and `job_id` to the python file [dbt_cloud_provider_eltml.py](https://github.com/sungchun12/airflow-dbt-cloud/blob/main/dags/dbt_cloud_provider_eltml.py). 
- ```python - # dbt Cloud Job URL: https://cloud.getdbt.com/#/accounts/16173/projects/36467/jobs/65767/ - # account_id: 16173 - #job_id: 65767 +Both IDs are included inside of the dbt Cloud job URL as shown in the following snippets: - # line 28 - default_args={"dbt_cloud_conn_id": "dbt_cloud", "account_id": 16173}, +```python +# For the dbt Cloud Job URL https://cloud.getdbt.com/#/accounts/16173/projects/36467/jobs/65767/ +# The account_id is 16173 - trigger_dbt_cloud_job_run = DbtCloudRunJobOperator( - task_id="trigger_dbt_cloud_job_run", - job_id=65767, # line 39 - check_interval=10, - timeout=300, - ) - ``` - -2. Turn on the DAG and verify the job succeeded after running. Note: screenshots taken from different job runs, but the user experience is consistent. - - ![https://lh6.googleusercontent.com/p8AqQRy0UGVLjDGPmcuGYmQ_BRodyL0Zis-eQgSmp69EHbKW51o4S-bCl1fXHlOmwpYEBxD0A-O1Q1hwt-VDVMO1wWH-AIeaoelBx06JXRJ0m1OcHaPpFKH0xDiduIhNlQhhbLiy](https://lh6.googleusercontent.com/p8AqQRy0UGVLjDGPmcuGYmQ_BRodyL0Zis-eQgSmp69EHbKW51o4S-bCl1fXHlOmwpYEBxD0A-O1Q1hwt-VDVMO1wWH-AIeaoelBx06JXRJ0m1OcHaPpFKH0xDiduIhNlQhhbLiy) - - ![Airflow DAG](/img/guides/orchestration/airflow-and-dbt-cloud/airflow-dag.png) - - ![Task run instance](/img/guides/orchestration/airflow-and-dbt-cloud/task-run-instance.png) - - ![https://lh6.googleusercontent.com/S9QdGhLAdioZ3x634CChugsJRiSVtTTd5CTXbRL8ADA6nSbAlNn4zV0jb3aC946c8SGi9FRTfyTFXqjcM-EBrJNK5hQ0HHAsR5Fj7NbdGoUfBI7xFmgeoPqnoYpjyZzRZlXkjtxS](https://lh6.googleusercontent.com/S9QdGhLAdioZ3x634CChugsJRiSVtTTd5CTXbRL8ADA6nSbAlNn4zV0jb3aC946c8SGi9FRTfyTFXqjcM-EBrJNK5hQ0HHAsR5Fj7NbdGoUfBI7xFmgeoPqnoYpjyZzRZlXkjtxS) - -## How do I rerun the dbt Cloud job and downstream tasks in my pipeline? - -If you have worked with dbt Cloud before, you have likely encountered cases where a job fails. In those cases, you have likely logged into dbt Cloud, investigated the error, and then manually restarted the job. - -This section of the guide will show you how to restart the job directly from Airflow. This will specifically run *just* the `trigger_dbt_cloud_job_run` and downstream tasks of the Airflow DAG and not the entire DAG. If only the transformation step fails, you don’t need to re-run the extract and load processes. Let’s jump into how to do that in Airflow. - -1. Click on the task +# Update line 28 +default_args={"dbt_cloud_conn_id": "dbt_cloud", "account_id": 16173}, +``` - ![Task DAG view](/img/guides/orchestration/airflow-and-dbt-cloud/task-dag-view.png) +```python +# For the dbt Cloud Job URL https://cloud.getdbt.com/#/accounts/16173/projects/36467/jobs/65767/ +# The job_id is 65767 + +# Update line 39 +trigger_dbt_cloud_job_run = DbtCloudRunJobOperator( + task_id="trigger_dbt_cloud_job_run", + job_id=65767, + check_interval=10, + timeout=300, + ) +``` -2. Clear the task instance + - ![Clear task instance](/img/guides/orchestration/airflow-and-dbt-cloud/clear-task-instance.png) +## Run the Airflow DAG - ![Approve clearing](/img/guides/orchestration/airflow-and-dbt-cloud/approve-clearing.png) +Turn on the DAG and trigger it to run. Verify the job succeeded after running. -3. Watch it rerun in real time +![Airflow DAG](/img/guides/orchestration/airflow-and-dbt-cloud/airflow-dag.png) - ![Re-run](/img/guides/orchestration/airflow-and-dbt-cloud/re-run.png) +Click Monitor Job Run to open the run details in dbt Cloud. 
+![Task run instance](/img/guides/orchestration/airflow-and-dbt-cloud/task-run-instance.png) ## Cleaning up @@ -224,9 +182,9 @@ At the end of this guide, make sure you shut down your docker container. When y $ astrocloud dev stop [+] Running 3/3 - ⠿ Container airflow-dbt-cloud_e3fe3c-webserver-1 Stopped 7.5s - ⠿ Container airflow-dbt-cloud_e3fe3c-scheduler-1 Stopped 3.3s - ⠿ Container airflow-dbt-cloud_e3fe3c-postgres-1 Stopped 0.3s + ⠿ Container airflow-dbt-cloud_e3fe3c-webserver-1 Stopped 7.5s + ⠿ Container airflow-dbt-cloud_e3fe3c-scheduler-1 Stopped 3.3s + ⠿ Container airflow-dbt-cloud_e3fe3c-postgres-1 Stopped 0.3s ``` To verify that the deployment has stopped, use the following command: @@ -244,37 +202,29 @@ airflow-dbt-cloud_e3fe3c-scheduler-1 exited airflow-dbt-cloud_e3fe3c-postgres-1 exited ``` - + ## Frequently asked questions ### How can we run specific subsections of the dbt DAG in Airflow? -Because of the way we configured the dbt Cloud job to run in Airflow, you can leave this job to your analytics engineers to define in the job configurations from dbt Cloud. If, for example, we need to run hourly-tagged models every hour and daily-tagged models daily, we can create jobs like `Hourly Run` or `Daily Run` and utilize the commands `dbt run -s tag:hourly` and `dbt run -s tag:daily` within each, respectively. We only need to grab our dbt Cloud `account` and `job id`, configure it in an Airflow DAG with the code provided, and then we can be on your way. See more node selection options: [here](/reference/node-selection/syntax) - -### How can I re-run models from the point of failure? - -You may want to parse the dbt DAG in Airflow to get the benefit of re-running from the point of failure. However, when you have hundreds of models in your DAG expanded out, it becomes useless for diagnosis and rerunning due to the overhead that comes along with creating an expansive Airflow DAG. +Because the Airflow DAG references dbt Cloud jobs, your analytics engineers can take responsibility for configuring the jobs in dbt Cloud. -You can’t re-run from failure natively in dbt Cloud today (feature coming!), but you can use a custom rerun parser. +For example, to run some models hourly and others daily, there will be jobs like `Hourly Run` or `Daily Run` using the commands `dbt run --select tag:hourly` and `dbt run --select tag:daily` respectively. Once configured in dbt Cloud, these can be added as steps in an Airflow DAG as shown in this guide. Refer to our full [node selection syntax docs here](/reference/node-selection/syntax). -Using a simple python script coupled with the dbt Cloud provider, you can: - -- Avoid managing artifacts in a separate storage bucket(dbt Cloud does this for you) -- Avoid building your own parsing logic -- Get clear logs on what models you're rerunning in dbt Cloud (without hard coding step override commands) - -Watch the video below to see how it works! +### How can I re-run models from the point of failure? - +You can trigger re-run from point of failure with the `rerun` API endpoint. See the docs on [retrying jobs](/docs/deploy/retry-jobs) for more information. ### Should Airflow run one big dbt job or many dbt jobs? -Overall we recommend being as purposeful and minimalistic as you can. This is because dbt manages all of the dependencies between models and the orchestration of running those dependencies in order, which in turn has benefits in terms of warehouse processing efforts. 
+dbt jobs are most effective when a build command contains as many models at once as is practical. This is because dbt manages the dependencies between models and coordinates running them in order, which ensures that your jobs can run in a highly parallelized fashion. It also streamlines the debugging process when a model fails and enables re-run from point of failure. + +As an explicit example, it's not recommended to have a dbt job for every single node in your DAG. Try combining your steps according to desired run frequency, or grouping by department (finance, marketing, customer success...) instead. ### We want to kick off our dbt jobs after our ingestion tool (such as Fivetran) / data pipelines are done loading data. Any best practices around that? -Our friends at Astronomer answer this question with this example: [here](https://registry.astronomer.io/dags/fivetran-dbt-cloud-census) +Astronomer's DAG registry has a sample workflow combining Fivetran, dbt Cloud and Census [here](https://registry.astronomer.io/dags/fivetran-dbt_cloud-census/versions/3.0.0). ### How do you set up a CI/CD workflow with Airflow? @@ -285,12 +235,12 @@ Check out these two resources for accomplishing your own CI/CD pipeline: ### Can dbt dynamically create tasks in the DAG like Airflow can? -We prefer to keep models bundled vs. unbundled. You can go this route, but if you have hundreds of dbt models, it’s more effective to let the dbt Cloud job handle the models and dependencies. Bundling provides the solution to clear observability when things go wrong - we've seen more success in having the ability to clearly see issues in a bundled dbt Cloud job than combing through the nodes of an expansive Airflow DAG. If you still have a use case for this level of control though, our friends at Astronomer answer this question [here](https://www.astronomer.io/blog/airflow-dbt-1/)! +As discussed above, we prefer to keep jobs bundled together and containing as many nodes as are necessary. If you must run nodes one at a time for some reason, then review [this article](https://www.astronomer.io/blog/airflow-dbt-1/) for some pointers. -### Can you trigger notifications if a dbt job fails with Airflow? Is there any way to access the status of the dbt Job to do that? +### Can you trigger notifications if a dbt job fails with Airflow? -Yes, either through [Airflow's email/slack](https://www.astronomer.io/guides/error-notifications-in-airflow/) functionality by itself or combined with [dbt Cloud's notifications](/docs/deploy/job-notifications), which support email and slack notifications. +Yes, either through [Airflow's email/slack](https://www.astronomer.io/guides/error-notifications-in-airflow/) functionality, or [dbt Cloud's notifications](/docs/deploy/job-notifications), which support email and Slack notifications. You could also create a [webhook](/docs/deploy/webhooks). -### Are there decision criteria for how to best work with dbt Cloud and airflow? +### How should I plan my dbt Cloud + Airflow implementation? -Check out this deep dive into planning your dbt Cloud + Airflow implementation [here](https://www.youtube.com/watch?v=n7IIThR8hGk)! +Check out [this recording](https://www.youtube.com/watch?v=n7IIThR8hGk) of a dbt meetup for some tips. 
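To tie the steps above together, here is what a minimal version of the referenced DAG might look like. Treat this as a sketch rather than the repository's exact code: it assumes the `apache-airflow-providers-dbt-cloud` package is installed and an Airflow connection named `dbt_cloud` exists, and it reuses the example account and job IDs from earlier in the guide:

```python
from datetime import datetime

from airflow import DAG
from airflow.providers.dbt.cloud.operators.dbt import DbtCloudRunJobOperator

with DAG(
    dag_id="dbt_cloud_job",
    start_date=datetime(2024, 1, 1),
    schedule_interval=None,  # triggered manually or by upstream tasks, not on a cron
    catchup=False,
    # account_id comes from your dbt Cloud job URL, as described earlier
    default_args={"dbt_cloud_conn_id": "dbt_cloud", "account_id": 16173},
):
    DbtCloudRunJobOperator(
        task_id="trigger_dbt_cloud_job_run",
        job_id=65767,       # also parsed from the job URL
        check_interval=10,  # poll the run status every 10 seconds
        timeout=300,        # fail the task if the run exceeds 5 minutes
    )
```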
diff --git a/website/docs/guides/bigquery-qs.md b/website/docs/guides/bigquery-qs.md index c1f632f0621..9cf2447fa52 100644 --- a/website/docs/guides/bigquery-qs.md +++ b/website/docs/guides/bigquery-qs.md @@ -78,7 +78,7 @@ In order to let dbt connect to your warehouse, you'll need to generate a keyfile - Click **Next** to create a new service account. 2. Create a service account for your new project from the [Service accounts page](https://console.cloud.google.com/projectselector2/iam-admin/serviceaccounts?supportedpurview=project). For more information, refer to [Create a service account](https://developers.google.com/workspace/guides/create-credentials#create_a_service_account) in the Google Cloud docs. As an example for this guide, you can: - Type `dbt-user` as the **Service account name** - - From the **Select a role** dropdown, choose **BigQuery Admin** and click **Continue** + - From the **Select a role** dropdown, choose **BigQuery Job User** and **BigQuery Data Editor** roles and click **Continue** - Leave the **Grant users access to this service account** fields blank - Click **Done** 3. Create a service account key for your new project from the [Service accounts page](https://console.cloud.google.com/iam-admin/serviceaccounts?walkthrough_id=iam--create-service-account-keys&start_index=1#step_index=1). For more information, refer to [Create a service account key](https://cloud.google.com/iam/docs/creating-managing-service-account-keys#creating) in the Google Cloud docs. When downloading the JSON file, make sure to use a filename you can easily remember. For example, `dbt-user-creds.json`. For security reasons, dbt Labs recommends that you protect this JSON file like you would your identity credentials; for example, don't check the JSON file into your version control software. diff --git a/website/docs/guides/manual-install-qs.md b/website/docs/guides/manual-install-qs.md index c74d30db51c..e9c1af259ac 100644 --- a/website/docs/guides/manual-install-qs.md +++ b/website/docs/guides/manual-install-qs.md @@ -16,7 +16,7 @@ When you use dbt Core to work with dbt, you will be editing files locally using * To use dbt Core, it's important that you know some basics of the Terminal. In particular, you should understand `cd`, `ls` and `pwd` to navigate through the directory structure of your computer easily. * Install dbt Core using the [installation instructions](/docs/core/installation-overview) for your operating system. -* Complete [Setting up (in BigQuery)](/guides/bigquery?step=2) and [Loading data (BigQuery)](/guides/bigquery?step=3). +* Complete the appropriate Setting up and Loading data steps in the Quickstart for dbt Cloud series. For example, for BigQuery, complete [Setting up (in BigQuery)](/guides/bigquery?step=2) and [Loading data (BigQuery)](/guides/bigquery?step=3). * [Create a GitHub account](https://github.com/join) if you don't already have one. ### Create a starter project diff --git a/website/docs/guides/set-up-your-databricks-dbt-project.md b/website/docs/guides/set-up-your-databricks-dbt-project.md index c17c6a1f99e..b2988f36589 100644 --- a/website/docs/guides/set-up-your-databricks-dbt-project.md +++ b/website/docs/guides/set-up-your-databricks-dbt-project.md @@ -62,7 +62,7 @@ Let’s [create a Databricks SQL warehouse](https://docs.databricks.com/sql/admi 5. Click *Create* 6. Configure warehouse permissions to ensure our service principal and developer have the right access.
-We are not covering python in this post but if you want to learn more, check out these [docs](https://docs.getdbt.com/docs/build/python-models#specific-data-platforms). Depending on your workload, you may wish to create a larger SQL Warehouse for production workflows while having a smaller development SQL Warehouse (if you’re not using Serverless SQL Warehouses).
+We are not covering Python in this post, but if you want to learn more, check out these [docs](https://docs.getdbt.com/docs/build/python-models#specific-data-platforms). Depending on your workload, you may wish to create a larger SQL Warehouse for production workflows while having a smaller development SQL Warehouse (if you’re not using Serverless SQL Warehouses). As your project grows, you might want to apply [compute-per-model configurations](/reference/resource-configs/databricks-configs#specifying-the-compute-for-models).

## Configure your dbt project
diff --git a/website/docs/reference/artifacts/dbt-artifacts.md b/website/docs/reference/artifacts/dbt-artifacts.md
index 859fde7c908..31525777500 100644
--- a/website/docs/reference/artifacts/dbt-artifacts.md
+++ b/website/docs/reference/artifacts/dbt-artifacts.md
@@ -48,3 +48,6 @@ In the manifest, the `metadata` may also include:
 #### Notes:
 - The structure of dbt artifacts is canonized by [JSON schemas](https://json-schema.org/), which are hosted at **schemas.getdbt.com**.
 - Artifact versions may change in any minor version of dbt (`v1.x.0`). Each artifact is versioned independently.
+
+## Related docs
+- [Other artifacts](/reference/artifacts/other-artifacts), such as the `index.html` or `graph_summary.json` files.
diff --git a/website/docs/reference/artifacts/other-artifacts.md b/website/docs/reference/artifacts/other-artifacts.md
index 205bdfc1a14..60050a6be66 100644
--- a/website/docs/reference/artifacts/other-artifacts.md
+++ b/website/docs/reference/artifacts/other-artifacts.md
@@ -21,7 +21,7 @@ This file is used to store a compressed representation of files dbt has parsed.

 **Produced by:** commands supporting [node selection](/reference/node-selection/syntax)

-Stores the networkx representation of the dbt resource DAG.
+Stores the network representation of the dbt resource DAG.

### graph_summary.json
diff --git a/website/docs/reference/dbt-jinja-functions/config.md b/website/docs/reference/dbt-jinja-functions/config.md
index c2fc8f96e5b..3903c82eef7 100644
--- a/website/docs/reference/dbt-jinja-functions/config.md
+++ b/website/docs/reference/dbt-jinja-functions/config.md
@@ -25,8 +25,6 @@ is responsible for handling model code that looks like this:
 ```
 Review [Model configurations](/reference/model-configs) for examples and more information on valid arguments.

-https://docs.getdbt.com/reference/model-configs
-
## config.get

__Args__:
diff --git a/website/docs/reference/dbt-jinja-functions/return.md b/website/docs/reference/dbt-jinja-functions/return.md
index 43bbddfa2d1..d2069bc9254 100644
--- a/website/docs/reference/dbt-jinja-functions/return.md
+++ b/website/docs/reference/dbt-jinja-functions/return.md
@@ -1,6 +1,6 @@
---
 title: "About return function"
-sidebar_variable: "return"
+sidebar_label: "return"
 id: "return"
 description: "Read this guide to understand the return Jinja function in dbt."
--- diff --git a/website/docs/reference/node-selection/defer.md b/website/docs/reference/node-selection/defer.md index 03c3b2aac12..81a0f4a0328 100644 --- a/website/docs/reference/node-selection/defer.md +++ b/website/docs/reference/node-selection/defer.md @@ -234,3 +234,8 @@ dbt will check to see if `dev_alice.model_a` exists. If it doesn't exist, dbt wi + +## Related docs + +- [Using defer in dbt Cloud](/docs/cloud/about-cloud-develop-defer) + diff --git a/website/docs/reference/node-selection/syntax.md b/website/docs/reference/node-selection/syntax.md index d0ea4a9acd8..22946903b7d 100644 --- a/website/docs/reference/node-selection/syntax.md +++ b/website/docs/reference/node-selection/syntax.md @@ -35,7 +35,7 @@ To follow [POSIX standards](https://pubs.opengroup.org/onlinepubs/9699919799/bas 3. dbt now has a list of still-selected resources of varying types. As a final step, it tosses away any resource that does not match the resource type of the current task. (Only seeds are kept for `dbt seed`, only models for `dbt run`, only tests for `dbt test`, and so on.) -### Shorthand +## Shorthand Select resources to build (run, test, seed, snapshot) or check freshness: `--select`, `-s` @@ -43,6 +43,9 @@ Select resources to build (run, test, seed, snapshot) or check freshness: `--sel By default, `dbt run` will execute _all_ of the models in the dependency graph. During development (and deployment), it is useful to specify only a subset of models to run. Use the `--select` flag with `dbt run` to select a subset of models to run. Note that the following arguments (`--select`, `--exclude`, and `--selector`) also apply to other dbt tasks, such as `test` and `build`. + + + The `--select` flag accepts one or more arguments. Each argument can be one of: 1. a package name @@ -52,8 +55,7 @@ The `--select` flag accepts one or more arguments. Each argument can be one of: Examples: - - ```bash +```bash dbt run --select "my_dbt_project_name" # runs all models in your project dbt run --select "my_dbt_model" # runs a specific model dbt run --select "path.to.my.models" # runs all models in a specific directory @@ -61,14 +63,30 @@ dbt run --select "my_package.some_model" # run a specific model in a specific pa dbt run --select "tag:nightly" # run models with the "nightly" tag dbt run --select "path/to/models" # run models contained in path/to/models dbt run --select "path/to/my_model.sql" # run a specific model by its path - ``` +``` -dbt supports a shorthand language for defining subsets of nodes. This language uses the characters `+`, `@`, `*`, and `,`. + + - ```bash +dbt supports a shorthand language for defining subsets of nodes. 
This language uses the following characters:
+
+- plus operator [(`+`)](/reference/node-selection/graph-operators#the-plus-operator)
+- at operator [(`@`)](/reference/node-selection/graph-operators#the-at-operator)
+- asterisk operator (`*`)
+- comma operator (`,`)
+
+Examples:
+
+```bash
 # multiple arguments can be provided to --select
-  dbt run --select "my_first_model my_second_model"
+dbt run --select "my_first_model my_second_model"
+
+# select my_model and all of its children
+dbt run --select "my_model+"
+
+# select my_model, its children, and the parents of its children
+dbt run --select "@my_model"

 # these arguments can be projects, models, directory paths, tags, or sources
 dbt run --select "tag:nightly my_model finance.base.*"
@@ -77,6 +95,10 @@ dbt run --select "tag:nightly my_model finance.base.*"
 dbt run --select "path:marts/finance,tag:nightly,config.materialized:table"
 ```

+
+
+
 As your selection logic gets more complex, and becomes unwieldy to type out as command-line arguments, consider using a [yaml selector](/reference/node-selection/yaml-selectors). You can use a predefined definition with the `--selector` flag. Note that when you're using `--selector`, most other flags (namely `--select` and `--exclude`) will be ignored.
diff --git a/website/docs/reference/resource-configs/access.md b/website/docs/reference/resource-configs/access.md
index da50e48d2f0..fcde93647a1 100644
--- a/website/docs/reference/resource-configs/access.md
+++ b/website/docs/reference/resource-configs/access.md
@@ -27,7 +27,7 @@ You can apply access modifiers in config files, including `the dbt_project.yml`,

 There are multiple approaches to configuring access:

-In the model configs of `dbt_project.yml``:
+In the model configs of `dbt_project.yml`:

 ```yaml
 models:
diff --git a/website/docs/reference/resource-configs/bigquery-configs.md b/website/docs/reference/resource-configs/bigquery-configs.md
index ffbaa37c059..d3497a02caf 100644
--- a/website/docs/reference/resource-configs/bigquery-configs.md
+++ b/website/docs/reference/resource-configs/bigquery-configs.md
@@ -718,3 +718,188 @@ Views with this configuration will be able to select from objects in `project_1.

 The `grant_access_to` config is not thread-safe when multiple views need to be authorized for the same dataset. The initial `dbt run` operation after a new `grant_access_to` config is added should therefore be executed in a single thread. Subsequent runs using the same configuration will not attempt to re-apply existing access grants, and can make use of multiple threads.
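To make the `grant_access_to` guidance above concrete, here is a minimal sketch of the config in `dbt_project.yml`; the `project_2` and `dataset_2` names are hypothetical, and the single-threading note applies only to the first run after this block is added:

```yaml
# dbt_project.yml — a sketch with hypothetical project/dataset names
models:
  my_project:
    marts:
      +materialized: view
      +grant_access_to:
        - project: project_2
          dataset: dataset_2
```

For that first run, something like `dbt run --threads 1 --select marts` keeps the authorization calls serialized; subsequent runs can return to your normal thread count.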
+
+
+## Materialized views
+
+The BigQuery adapter supports [materialized views](https://cloud.google.com/bigquery/docs/materialized-views-intro)
+with the following configuration parameters:
+
+| Parameter                                                    | Type                            | Required | Default | Change Monitoring Support |
+|--------------------------------------------------------------|---------------------------------|----------|---------|---------------------------|
+| `on_configuration_change`                                    | `<string>`                      | no       | `apply` | n/a                       |
+| [`cluster_by`](#clustering-clause)                           | `[<string>]`                    | no       | `none`  | drop/create               |
+| [`partition_by`](#partition-clause)                          | `{<dictionary>}`                | no       | `none`  | drop/create               |
+| [`enable_refresh`](#auto-refresh)                            | `<boolean>`                     | no       | `true`  | alter                     |
+| [`refresh_interval_minutes`](#auto-refresh)                  | `<float>`                       | no       | `30`    | alter                     |
+| [`max_staleness`](#auto-refresh) (in Preview)                | `<interval>`                    | no       | `none`  | alter                     |
+| [`description`](/reference/resource-properties/description)  | `<string>`                      | no       | `none`  | alter                     |
+| [`labels`](#specifying-labels)                               | `{<label-name>: <label-value>}` | no       | `none`  | alter                     |
+| [`hours_to_expiration`](#controlling-table-expiration)       | `<float>`                       | no       | `none`  | alter                     |
+| [`kms_key_name`](#using-kms-encryption)                      | `<string>`                      | no       | `none`  | alter                     |
+
+
+
+```yaml
+models:
+  [<resource-path>](/reference/resource-configs/resource-path):
+    [+](/reference/resource-configs/plus-prefix)[materialized](/reference/resource-configs/materialized): materialized_view
+    [+](/reference/resource-configs/plus-prefix)on_configuration_change: apply | continue | fail
+    [+](/reference/resource-configs/plus-prefix)[cluster_by](#clustering-clause): <field-name> | [<field-name>]
+    [+](/reference/resource-configs/plus-prefix)[partition_by](#partition-clause):
+      - field: <field-name>
+      - data_type: timestamp | date | datetime | int64
+      # only if `data_type` is not 'int64'
+      - granularity: hour | day | month | year
+      # only if `data_type` is 'int64'
+      - range:
+        - start: <integer>
+        - end: <integer>
+        - interval: <integer>
+    [+](/reference/resource-configs/plus-prefix)[enable_refresh](#auto-refresh): true | false
+    [+](/reference/resource-configs/plus-prefix)[refresh_interval_minutes](#auto-refresh): <float>
+    [+](/reference/resource-configs/plus-prefix)[max_staleness](#auto-refresh): <interval>
+    [+](/reference/resource-configs/plus-prefix)[description](/reference/resource-properties/description): <string>
+    [+](/reference/resource-configs/plus-prefix)[labels](#specifying-labels): {<label-name>: <label-value>}
+    [+](/reference/resource-configs/plus-prefix)[hours_to_expiration](#controlling-table-expiration): <float>
+    [+](/reference/resource-configs/plus-prefix)[kms_key_name](#using-kms-encryption): <string>
+```
+
+
+
+```yaml
+version: 2
+
+models:
+  - name: [<model-name>]
+    config:
+      [materialized](/reference/resource-configs/materialized): materialized_view
+      on_configuration_change: apply | continue | fail
+      [cluster_by](#clustering-clause): <field-name> | [<field-name>]
+      [partition_by](#partition-clause):
+        - field: <field-name>
+        - data_type: timestamp | date | datetime | int64
+        # only if `data_type` is not 'int64'
+        - granularity: hour | day | month | year
+        # only if `data_type` is 'int64'
+        - range:
+          - start: <integer>
+          - end: <integer>
+          - interval: <integer>
+      [enable_refresh](#auto-refresh): true | false
+      [refresh_interval_minutes](#auto-refresh): <float>
+      [max_staleness](#auto-refresh): <interval>
+      [description](/reference/resource-properties/description): <string>
+      [labels](#specifying-labels): {<label-name>: <label-value>}
+      [hours_to_expiration](#controlling-table-expiration): <float>
+      [kms_key_name](#using-kms-encryption): <string>
+```
+
+
+
+```jinja
+{{ config(
+    [materialized](/reference/resource-configs/materialized)='materialized_view',
+    on_configuration_change="apply" | "continue" | "fail",
+    [cluster_by](#clustering-clause)="<field-name>" | ["<field-name>"],
+    [partition_by](#partition-clause)={
+        "field": "<field-name>",
+        "data_type": "timestamp" | "date" | "datetime" | "int64",
+
+        # only if `data_type` is not 'int64'
+        "granularity": "hour" | "day" | "month" | "year",
+
+        # only if `data_type` is 'int64'
+        "range": {
+            "start": <integer>,
+            "end": <integer>,
+            "interval": <integer>,
+        }
+    },
+
+    # auto-refresh options
+    [enable_refresh](#auto-refresh)=true | false,
+    [refresh_interval_minutes](#auto-refresh)=<float>,
+    [max_staleness](#auto-refresh)="<interval>",
+
+    # additional options
+    [description](/reference/resource-properties/description)="<description>",
+    [labels](#specifying-labels)={
+        "<label-name>": "<label-value>",
+    },
+    [hours_to_expiration](#controlling-table-expiration)=<float>,
+    [kms_key_name](#using-kms-encryption)="<path-to-key>",
+) }}
+```
+
+
+
+Many of these parameters correspond to their table counterparts and have been linked above.
+The set of parameters unique to materialized views covers [auto-refresh functionality](#auto-refresh).
+
+Find more information about these parameters in the BigQuery docs:
+- [CREATE MATERIALIZED VIEW statement](https://cloud.google.com/bigquery/docs/reference/standard-sql/data-definition-language#create_materialized_view_statement)
+- [materialized_view_option_list](https://cloud.google.com/bigquery/docs/reference/standard-sql/data-definition-language#materialized_view_option_list)
+
+### Auto-refresh
+
+| Parameter                    | Type         | Required | Default | Change Monitoring Support |
+|------------------------------|--------------|----------|---------|---------------------------|
+| `enable_refresh`             | `<boolean>`  | no       | `true`  | alter                     |
+| `refresh_interval_minutes`   | `<float>`    | no       | `30`    | alter                     |
+| `max_staleness` (in Preview) | `<interval>` | no       | `none`  | alter                     |
+
+BigQuery supports [automatic refresh](https://cloud.google.com/bigquery/docs/materialized-views-manage#automatic_refresh) configuration for materialized views.
+By default, a materialized view will automatically refresh within 5 minutes of changes in the base table, but not more frequently than once every 30 minutes.
+BigQuery officially supports configuring only the frequency (the "once every 30 minutes" part);
+however, a feature in preview also allows configuring the staleness (the "5 minutes" part).
+dbt will monitor these parameters for changes and apply them using an `ALTER` statement.
+
+Find more information about these parameters in the BigQuery docs:
+- [materialized_view_option_list](https://cloud.google.com/bigquery/docs/reference/standard-sql/data-definition-language#materialized_view_option_list)
+- [max_staleness](https://cloud.google.com/bigquery/docs/materialized-views-create#max_staleness)
+
+### Limitations
+
+As with most data platforms, there are limitations associated with materialized views. Some worth noting include:
+
+- Materialized view SQL has a [limited feature set](https://cloud.google.com/bigquery/docs/materialized-views-create#supported-mvs).
+- Materialized view SQL cannot be updated; the materialized view must go through a `--full-refresh` (DROP/CREATE).
+- The `partition_by` clause on a materialized view must match that of the underlying base table.
+- While materialized views can have descriptions, materialized view *columns* cannot.
+- Recreating/dropping the base table requires recreating/dropping the materialized view.
+
+Find more information about materialized view limitations in Google's BigQuery [docs](https://cloud.google.com/bigquery/docs/materialized-views-intro#limitations).
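As a concrete illustration of the templates above, here is a minimal sketch of a property-file entry for a hypothetical model named `mv_recent_orders`; the column names are invented, and note that `partition_by` is passed as a single mapping:

```yaml
# models/marts/_marts.yml — a sketch; model and column names are hypothetical
version: 2

models:
  - name: mv_recent_orders
    config:
      materialized: materialized_view
      on_configuration_change: apply
      cluster_by: customer_id
      partition_by:
        field: ordered_at
        data_type: timestamp
        granularity: day
      enable_refresh: true
      refresh_interval_minutes: 60
```

Per the table above, a later edit to `refresh_interval_minutes` would be applied with an `ALTER`, while a `cluster_by` change would trigger a drop/create.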
+
+
diff --git a/website/docs/reference/resource-configs/databricks-configs.md b/website/docs/reference/resource-configs/databricks-configs.md
index a3b00177967..fb3c9e7f5c3 100644
--- a/website/docs/reference/resource-configs/databricks-configs.md
+++ b/website/docs/reference/resource-configs/databricks-configs.md
@@ -361,6 +361,191 @@ insert into analytics.replace_where_incremental
+
+
+## Selecting compute per model
+
+Beginning in version 1.7.2, you can assign which compute resource to use on a per-model basis.
+For SQL models, you can select a SQL Warehouse (serverless or provisioned) or an all-purpose cluster.
+For details on how this feature interacts with Python models, see [Specifying compute for Python models](#specifying-compute-for-python-models).
+
+:::note
+
+This is an optional setting. If you do not configure this as shown below, we will default to the compute specified by `http_path` in the top level of the output section in your profile.
+This is also the compute that will be used for tasks not associated with a particular model, such as gathering metadata for all tables in a schema.
+
+:::
+
+
+To take advantage of this capability, you will need to add compute blocks to your profile:
+
+
+
+```yaml
+
+<profile-name>:
+  target: <target-name> # this is the default target
+  outputs:
+    <target-name>:
+      type: databricks
+      catalog: [optional catalog name if you are using Unity Catalog]
+      schema: [schema name] # Required
+      host: [yourorg.databrickshost.com] # Required
+
+      ### This path is used as the default compute
+      http_path: ['/sql/your/http/path'] # Required
+
+      ### New compute section
+      compute:
+
+        ### Name that you will use to refer to an alternate compute
+        Compute1:
+          http_path: ['/sql/your/http/path'] # Required of each alternate compute
+
+        ### A third named compute, use whatever name you like
+        Compute2:
+          http_path: ['/some/other/path'] # Required of each alternate compute
+    ...
+
+    <target-name>: # additional targets
+      ...
+      ### For each target, you need to define the same compute,
+      ### but you can specify different paths
+      compute:
+
+        ### Name that you will use to refer to an alternate compute
+        Compute1:
+          http_path: ['/sql/your/http/path'] # Required of each alternate compute
+
+        ### A third named compute, use whatever name you like
+        Compute2:
+          http_path: ['/some/other/path'] # Required of each alternate compute
+    ...
+
+```
+
+
+
+The new compute section is a map of user-chosen names to objects with an `http_path` property.
+Each compute is keyed by a name which is used in the model definition/configuration to indicate which compute you wish to use for that model/selection of models.
+We recommend choosing a name that is easily recognized as the compute resources you're using, such as the name of the compute resource inside the Databricks UI.
+
+:::note
+
+You need to use the same set of names for compute across your outputs, though you may supply different `http_path`s, allowing you to use different computes in different deployment scenarios.
+
+:::
+
+To configure this inside of dbt Cloud, use the [extended attributes feature](/docs/dbt-cloud-environments#extended-attributes-) on the desired environments:
+
+```yaml
+
+compute:
+  Compute1:
+    http_path: ['/some/other/path']
+  Compute2:
+    http_path: ['/some/other/path']
+
+```
+
+### Specifying the compute for models
+
+As with many other configuration options, you can specify the compute for a model in multiple ways, using `databricks_compute`.
+In your `dbt_project.yml`, the selected compute can be specified for all the models in a given directory:
+
+
+
+```yaml
+
+...
+
+models:
+  +databricks_compute: "Compute1" # use the `Compute1` warehouse/cluster for all models in the project...
+  my_project:
+    clickstream:
+      +databricks_compute: "Compute2" # ...except for the models in the `clickstream` folder, which will use `Compute2`.
+
+snapshots:
+  +databricks_compute: "Compute1" # all Snapshot models are configured to use `Compute1`.
+
+```
+
+
+
+For an individual model, the compute can be specified in the model config in your schema file.
+
+
+
+```yaml
+
+models:
+  - name: table_model
+    config:
+      databricks_compute: Compute1
+    columns:
+      - name: id
+        data_type: int
+
+```
+
+
+
+Alternatively, the warehouse can be specified in the config block of a model's SQL file.
+
+
+
+```sql
+
+{{
+  config(
+    materialized='table',
+    databricks_compute='Compute1'
+  )
+}}
+select * from {{ ref('seed') }}
+
+```
+
+
+
+To validate that the specified compute is being used, look for lines in your dbt.log like:
+
+```
+Databricks adapter ... using default compute resource.
+```
+
+or
+
+```
+Databricks adapter ... using compute resource <name of compute>.
+```
+
+### Specifying compute for Python models
+
+Materializing a Python model requires execution of SQL as well as Python.
+Specifically, if your Python model is incremental, the current execution pattern involves executing Python to create a staging table that is then merged into your target table using SQL.
+The Python code needs to run on an all-purpose cluster, while the SQL code can run on an all-purpose cluster or a SQL Warehouse.
+When you specify your `databricks_compute` for a Python model, you are currently only specifying which compute to use when running the model-specific SQL.
+If you wish to use a different compute for executing the Python itself, you must specify an alternate `http_path` in the config for the model. Please note that declaring a separate SQL compute and a Python compute for your Python dbt models is optional. If you wish to do this:
+
+
+
+```python
+
+def model(dbt, session):
+    dbt.config(
+      http_path="sql/protocolv1/..."
+    )
+
+```
+
+
+
+If your default compute is a SQL Warehouse, you will need to specify an all-purpose cluster `http_path` in this way.
+
+
## Persisting model descriptions
diff --git a/website/docs/reference/resource-configs/infer-configs.md b/website/docs/reference/resource-configs/infer-configs.md
new file mode 100644
index 00000000000..c4837806935
--- /dev/null
+++ b/website/docs/reference/resource-configs/infer-configs.md
@@ -0,0 +1,39 @@
+---
+title: "Infer configurations"
+description: "Read this guide to understand how to configure Infer with dbt."
+id: "infer-configs"
+---
+
+
+## Authentication
+
+To connect to Infer from your dbt instance, you need to set up a correct profile in your `profiles.yml`.
+
+It should look like this:
+
+
+
+```yaml
+<profile-name>:
+  target: <target-name>
+  outputs:
+    <target-name>:
+      type: infer
+      url: "<infer-api-endpoint>"
+      username: "<infer-api-username>"
+      apikey: "<infer-apikey>"
+      data_config:
+        [configuration for your underlying data warehouse]
+```
+
+
+
+### Description of Infer Profile Fields
+
+| Field         | Required | Description |
+|---------------|----------|-------------|
+| `type`        | Yes      | Must be set to `infer`. This must be included either in `profiles.yml` or in the `dbt_project.yml` file. |
+| `url`         | Yes      | The host name of the Infer server to connect to. Typically this is `https://app.getinfer.io`. |
+| `username`    | Yes      | Your Infer username - the one you use to log in. |
+| `apikey`      | Yes      | Your Infer API key. |
+| `data_config` | Yes      | The configuration for your underlying data warehouse. This follows the format of your data warehouse adapter's configuration. |
diff --git a/website/docs/reference/resource-configs/postgres-configs.md b/website/docs/reference/resource-configs/postgres-configs.md
index 97a695ee12e..8465a5cbb31 100644
--- a/website/docs/reference/resource-configs/postgres-configs.md
+++ b/website/docs/reference/resource-configs/postgres-configs.md
@@ -108,21 +108,100 @@ models:

 ## Materialized views

-The Postgres adapter supports [materialized views](https://www.postgresql.org/docs/current/rules-materializedviews.html).
-Indexes are the only configuration that is specific to `dbt-postgres`.
-The remaining configuration follows the general [materialized view](/docs/build/materializations#materialized-view) configuration.
-There are also some limitations that we hope to address in the next version.
+The Postgres adapter supports [materialized views](https://www.postgresql.org/docs/current/rules-materializedviews.html)
+with the following configuration parameters:

-### Monitored configuration changes
+| Parameter                 | Type               | Required | Default | Change Monitoring Support |
+|---------------------------|--------------------|----------|---------|---------------------------|
-The settings below are monitored for changes applicable to `on_configuration_change`.
+| `on_configuration_change` | `<string>`         | no       | `apply` | n/a                       |
-#### Indexes
+| [`indexes`](#indexes)     | `[{<dictionary>}]` | no       | `none`  | alter                     |
-Index changes (`CREATE`, `DROP`) can be applied without the need to rebuild the materialized view.
-This differs from a table model, where the table needs to be dropped and re-created to update the indexes.
-If the `indexes` portion of the `config` block is updated, the changes will be detected and applied
-directly to the materialized view in place.
+
+
+
+```yaml
+models:
+  [<resource-path>](/reference/resource-configs/resource-path):
+    [+](/reference/resource-configs/plus-prefix)[materialized](/reference/resource-configs/materialized): materialized_view
+    [+](/reference/resource-configs/plus-prefix)on_configuration_change: apply | continue | fail
+    [+](/reference/resource-configs/plus-prefix)[indexes](#indexes):
+      - columns: [<column-name>]
+        unique: true | false
+        type: hash | btree
+```
+
+
+
+```yaml
+version: 2
+
+models:
+  - name: [<model-name>]
+    config:
+      [materialized](/reference/resource-configs/materialized): materialized_view
+      on_configuration_change: apply | continue | fail
+      [indexes](#indexes):
+        - columns: [<column-name>]
+          unique: true | false
+          type: hash | btree
+```
+
+
+
+```jinja
+{{ config(
+    [materialized](/reference/resource-configs/materialized)="materialized_view",
+    on_configuration_change="apply" | "continue" | "fail",
+    [indexes](#indexes)=[
+      {
+        "columns": ["<column-name>"],
+        "unique": true | false,
+        "type": "hash" | "btree",
+      }
+    ]
+) }}
+```
+
+
+
+The [`indexes`](#indexes) parameter corresponds to that of a table, as explained above.
+It's worth noting that, unlike tables, dbt monitors this parameter for changes and applies the changes without dropping the materialized view.
+This happens via a `DROP/CREATE` of the indexes, which can be thought of as an `ALTER` of the materialized view.
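For a concrete version of the pattern just described, here is a minimal sketch with a hypothetical model and column, where only the `indexes` list would ever be altered in place:

```yaml
# models/_models.yml — a sketch; `mv_daily_signups` and `signup_date` are hypothetical
version: 2

models:
  - name: mv_daily_signups
    config:
      materialized: materialized_view
      on_configuration_change: apply
      indexes:
        - columns: [signup_date]
          unique: false
          type: btree
```

Adding or removing an entry in `indexes` on a later run would drop/create just those indexes rather than rebuilding the materialized view.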
+
+Find more information about materialized view parameters in the Postgres docs:
+- [CREATE MATERIALIZED VIEW](https://www.postgresql.org/docs/current/sql-creatematerializedview.html)
+
+

### Limitations

@@ -138,3 +217,5 @@ If the user changes the model's config to `materialized="materialized_view"`, th

 The solution is to execute `DROP TABLE my_model` on the data warehouse before trying the model again.
+
+
diff --git a/website/docs/reference/resource-configs/redshift-configs.md b/website/docs/reference/resource-configs/redshift-configs.md
index 9bd127a1e1a..b559c0451b0 100644
--- a/website/docs/reference/resource-configs/redshift-configs.md
+++ b/website/docs/reference/resource-configs/redshift-configs.md
@@ -111,40 +111,138 @@ models:

 ## Materialized views

-The Redshift adapter supports [materialized views](https://docs.aws.amazon.com/redshift/latest/dg/materialized-view-overview.html).
-Redshift-specific configuration includes the typical `dist`, `sort_type`, `sort`, and `backup`.
-For materialized views, there is also the `auto_refresh` setting, which allows Redshift to [automatically refresh](https://docs.aws.amazon.com/redshift/latest/dg/materialized-view-refresh.html) the materialized view for you.
-The remaining configuration follows the general [materialized view](/docs/build/materializations#Materialized-View) configuration.
-There are also some limitations that we hope to address in the next version.
+The Redshift adapter supports [materialized views](https://docs.aws.amazon.com/redshift/latest/dg/materialized-view-overview.html)
+with the following configuration parameters:
+
+| Parameter                                 | Type         | Required | Default                                         | Change Monitoring Support |
+|-------------------------------------------|--------------|----------|-------------------------------------------------|---------------------------|
+| `on_configuration_change`                 | `<string>`   | no       | `apply`                                         | n/a                       |
+| [`dist`](#using-sortkey-and-distkey)      | `<string>`   | no       | `even`                                          | drop/create               |
+| [`sort`](#using-sortkey-and-distkey)      | `[<string>]` | no       | `none`                                          | drop/create               |
+| [`sort_type`](#using-sortkey-and-distkey) | `<string>`   | no       | `auto` if no `sort` <br /> `compound` if `sort` | drop/create               |
+| [`auto_refresh`](#auto-refresh)           | `<boolean>`  | no       | `false`                                         | alter                     |
+| [`backup`](#backup)                       | `<boolean>`  | no       | `true`                                          | n/a                       |
+
+
+
+```yaml
+models:
+  [<resource-path>](/reference/resource-configs/resource-path):
+    [+](/reference/resource-configs/plus-prefix)[materialized](/reference/resource-configs/materialized): materialized_view
+    [+](/reference/resource-configs/plus-prefix)on_configuration_change: apply | continue | fail
+    [+](/reference/resource-configs/plus-prefix)[dist](#using-sortkey-and-distkey): all | auto | even | <field-name>
+    [+](/reference/resource-configs/plus-prefix)[sort](#using-sortkey-and-distkey): <field-name> | [<field-name>]
+    [+](/reference/resource-configs/plus-prefix)[sort_type](#using-sortkey-and-distkey): auto | compound | interleaved
+    [+](/reference/resource-configs/plus-prefix)[auto_refresh](#auto-refresh): true | false
+    [+](/reference/resource-configs/plus-prefix)[backup](#backup): true | false
+```
+
+
+
-### Monitored configuration changes
-The settings below are monitored for changes applicable to `on_configuration_change`.
+
-#### Dist
+
-Changes to `dist` will result in a full refresh of the existing materialized view (applied at the time of the next `dbt run` of the model). Redshift requires a materialized view to be
-dropped and recreated to apply a change to the `distkey` or `diststyle`.
+```yaml
+version: 2
-#### Sort type, sort
+models:
+  - name: [<model-name>]
+    config:
+      [materialized](/reference/resource-configs/materialized): materialized_view
+      on_configuration_change: apply | continue | fail
+      [dist](#using-sortkey-and-distkey): all | auto | even | <field-name>
+      [sort](#using-sortkey-and-distkey): <field-name> | [<field-name>]
+      [sort_type](#using-sortkey-and-distkey): auto | compound | interleaved
+      [auto_refresh](#auto-refresh): true | false
+      [backup](#backup): true | false
+```
+
-Changes to `sort_type` or `sort` will result in a full refresh. Redshift requires a materialized
-view to be dropped and recreated to apply a change to the `sortkey` or `sortstyle`.
+
+
+
+```jinja
+{{ config(
+    [materialized](/reference/resource-configs/materialized)="materialized_view",
+    on_configuration_change="apply" | "continue" | "fail",
+    [dist](#using-sortkey-and-distkey)="all" | "auto" | "even" | "<field-name>",
+    [sort](#using-sortkey-and-distkey)=["<field-name>"],
+    [sort_type](#using-sortkey-and-distkey)="auto" | "compound" | "interleaved",
+    [auto_refresh](#auto-refresh)=true | false,
+    [backup](#backup)=true | false,
+) }}
+```
+
+
+
+Many of these parameters correspond to their table counterparts and have been linked above.
+The parameters unique to materialized views are the [auto-refresh](#auto-refresh) and [backup](#backup) functionality, which are covered below.
+
+Find more information about the [CREATE MATERIALIZED VIEW](https://docs.aws.amazon.com/redshift/latest/dg/materialized-view-create-sql-command.html) parameters in the Redshift docs.
+
+#### Auto-refresh
+
+| Parameter      | Type        | Required | Default | Change Monitoring Support |
+|----------------|-------------|----------|---------|---------------------------|
+| `auto_refresh` | `<boolean>` | no       | `false` | alter                     |
+
+Redshift supports [automatic refresh](https://docs.aws.amazon.com/redshift/latest/dg/materialized-view-refresh.html#materialized-view-auto-refresh) configuration for materialized views.
+By default, a materialized view does not automatically refresh.
+dbt monitors this parameter for changes and applies them using an `ALTER` statement.
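To ground the Redshift template above, here is a minimal sketch of a property-file entry; the model name and sort/dist choices are hypothetical:

```yaml
# models/_models.yml — a sketch; `mv_site_traffic` and `event_time` are hypothetical
version: 2

models:
  - name: mv_site_traffic
    config:
      materialized: materialized_view
      on_configuration_change: apply
      dist: even
      sort: [event_time]
      sort_type: compound
      auto_refresh: true
```

Under this sketch, toggling `auto_refresh` later would be applied with an `ALTER`, while changing `sort` or `dist` would force a drop/create, matching the monitoring table above.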
+
+Learn more about the [parameters](https://docs.aws.amazon.com/redshift/latest/dg/materialized-view-create-sql-command.html#mv_CREATE_MATERIALIZED_VIEW-parameters) in the Redshift docs.

#### Backup

-Changes to `backup` will result in a full refresh. Redshift requires a materialized
-view to be dropped and recreated to apply a change to the `backup` setting.
+| Parameter | Type        | Required | Default | Change Monitoring Support |
+|-----------|-------------|----------|---------|---------------------------|
+| `backup`  | `<boolean>` | no       | `true`  | n/a                       |

-#### Auto refresh
+Redshift supports [backup](https://docs.aws.amazon.com/redshift/latest/mgmt/working-with-snapshots.html) configuration of clusters at the object level.
+This parameter identifies if the materialized view should be backed up as part of the cluster snapshot.
+By default, a materialized view will be backed up during a cluster snapshot.
+dbt cannot monitor this parameter as it is not queryable within Redshift.
+If the value is changed, the materialized view will need to go through a `--full-refresh` in order to set it.

-The `auto_refresh` setting can be updated via an `ALTER` statement. This setting effectively toggles
-automatic refreshes on or off. The default setting for this config is off (`False`). If this
-is the only configuration change for the materialized view, dbt will choose to apply
-an `ALTER` statement instead of issuing a full refresh,
+Learn more about the [parameters](https://docs.aws.amazon.com/redshift/latest/dg/materialized-view-create-sql-command.html#mv_CREATE_MATERIALIZED_VIEW-parameters) in the Redshift docs.

### Limitations

+As with most data platforms, there are limitations associated with materialized views. Some worth noting include:
+
+- Materialized views cannot reference views, temporary tables, user-defined functions, or late-binding tables.
+- Auto-refresh cannot be used if the materialized view references mutable functions, external schemas, or another materialized view.
+
+Find more information about materialized view limitations in Redshift's [docs](https://docs.aws.amazon.com/redshift/latest/dg/materialized-view-create-sql-command.html#mv_CREATE_MATERIALIZED_VIEW-limitations).
+
+
+
#### Changing materialization from "materialized_view" to "table" or "view"

Swapping a materialized view to a table or view is not supported.
@@ -157,3 +255,5 @@ If the user changes the model's config to `materialized="table"`, they will get

The workaround is to execute `DROP MATERIALIZED VIEW my_mv CASCADE` on the data warehouse before trying the model again.
+
+
diff --git a/website/docs/reference/resource-configs/snowflake-configs.md b/website/docs/reference/resource-configs/snowflake-configs.md
index 30c7966ec68..aa2bf370df6 100644
--- a/website/docs/reference/resource-configs/snowflake-configs.md
+++ b/website/docs/reference/resource-configs/snowflake-configs.md
@@ -346,82 +346,122 @@ In the configuration format for the model SQL file:

 ## Dynamic tables

-The Snowflake adapter supports [dynamic tables](https://docs.snowflake.com/en/sql-reference/sql/create-dynamic-table).
+The Snowflake adapter supports [dynamic tables](https://docs.snowflake.com/en/user-guide/dynamic-tables-about).
 This materialization is specific to Snowflake, which means that any model configuration that
 would normally come along for the ride from `dbt-core` (e.g. as with a `view`) may not be available
 for dynamic tables. This gap will decrease in future patches and versions.
While this materialization is specific to Snowflake, it very much follows the implementation
of [materialized views](/docs/build/materializations#Materialized-View).
In particular, dynamic tables have access to the `on_configuration_change` setting.
-There are also some limitations that we hope to address in the next version.
+Dynamic tables are supported with the following configuration parameters:

-### Parameters
+| Parameter                                                | Type       | Required | Default | Change Monitoring Support |
+|----------------------------------------------------------|------------|----------|---------|---------------------------|
+| `on_configuration_change`                                | `<string>` | no       | `apply` | n/a                       |
-Dynamic tables in `dbt-snowflake` require the following parameters:
+| [`target_lag`](#target-lag)                              | `<string>` | yes      |         | alter                     |
-- `target_lag`
+| [`snowflake_warehouse`](#configuring-virtual-warehouses) | `<string>` | yes      |         | alter                     |
-- `snowflake_warehouse`
-- `on_configuration_change`
+
+
-To learn more about each parameter and what values it can take, see
-the Snowflake docs page: [`CREATE DYNAMIC TABLE: Parameters`](https://docs.snowflake.com/en/sql-reference/sql/create-dynamic-table).

-### Usage
+
-You can create a dynamic table by editing _one_ of these files:
+
-- the SQL file for your model
-- the `dbt_project.yml` configuration file
+```yaml
+models:
+  [<resource-path>](/reference/resource-configs/resource-path):
+    [+](/reference/resource-configs/plus-prefix)[materialized](/reference/resource-configs/materialized): dynamic_table
+    [+](/reference/resource-configs/plus-prefix)on_configuration_change: apply | continue | fail
+    [+](/reference/resource-configs/plus-prefix)[target_lag](#target-lag): downstream | <time-delta>
+    [+](/reference/resource-configs/plus-prefix)[snowflake_warehouse](#configuring-virtual-warehouses): <warehouse-name>
+```

-The following examples create a dynamic table:
-
+
-```sql
-{{ config(
-    materialized = 'dynamic_table',
-    snowflake_warehouse = 'snowflake_warehouse',
-    target_lag = '10 minutes',
-) }}
-```
-
+
-
+
 ```yaml
+version: 2
+
 models:
-  path:
-    materialized: dynamic_table
-    snowflake_warehouse: snowflake_warehouse
-    target_lag: '10 minutes'
+  - name: [<model-name>]
+    config:
+      [materialized](/reference/resource-configs/materialized): dynamic_table
+      on_configuration_change: apply | continue | fail
+      [target_lag](#target-lag): downstream | <time-delta>
+      [snowflake_warehouse](#configuring-virtual-warehouses): <warehouse-name>
 ```

-### Monitored configuration changes
+
-The settings below are monitored for changes applicable to `on_configuration_change`.

-#### Target lag
+
-Changes to `target_lag` can be applied by running an `ALTER` statement. Refreshing is essentially
-always on for dynamic tables; this setting changes how frequently the dynamic table is updated.
+
-#### Warehouse
+```jinja
+{{ config(
+    [materialized](/reference/resource-configs/materialized)="dynamic_table",
+    on_configuration_change="apply" | "continue" | "fail",
+    [target_lag](#target-lag)="downstream" | "<num> seconds | minutes | hours | days",
+    [snowflake_warehouse](#configuring-virtual-warehouses)="<warehouse-name>",
+) }}
+```
+
+
+
+Find more information about these parameters in Snowflake's [docs](https://docs.snowflake.com/en/sql-reference/sql/create-dynamic-table):

-Changes to `snowflake_warehouse` can be applied via an `ALTER` statement.
+### Target lag
+
+Snowflake allows two configuration scenarios for scheduling automatic refreshes:
+- **Time-based** — Provide a value of the form `<num> { seconds | minutes | hours | days }`. For example, if the dynamic table needs to be updated every 30 minutes, use `target_lag='30 minutes'`.
+- **Downstream** — Applicable when the dynamic table is referenced by other dynamic tables. In this scenario, `target_lag='downstream'` allows for refreshes to be controlled at the target, instead of at each layer.
+
+Find more information about `target_lag` in Snowflake's [docs](https://docs.snowflake.com/en/user-guide/dynamic-tables-refresh#understanding-target-lag).

### Limitations

+As with materialized views on most data platforms, there are limitations associated with dynamic tables. Some worth noting include:
+
+- Dynamic table SQL has a [limited feature set](https://docs.snowflake.com/en/user-guide/dynamic-tables-tasks-create#query-constructs-not-currently-supported-in-dynamic-tables).
+- Dynamic table SQL cannot be updated; the dynamic table must go through a `--full-refresh` (DROP/CREATE).
+- Dynamic tables cannot be downstream from: materialized views, external tables, streams.
+- Dynamic tables cannot reference a view that is downstream from another dynamic table.
+
+Find more information about dynamic table limitations in Snowflake's [docs](https://docs.snowflake.com/en/user-guide/dynamic-tables-tasks-create#dynamic-table-limitations-and-supported-functions).
+
+
+
#### Changing materialization to and from "dynamic_table"

-Swapping an already materialized model to be a dynamic table and vice versa.
-The workaround is manually dropping the existing materialization in the data warehouse prior to calling `dbt run`.
-Normally, re-running with the `--full-refresh` flag would resolve this, but not in this case.
-This would only need to be done once as the existing object would then be a dynamic table.
+Version `1.6.x` does not support altering the materialization from a non-dynamic table to a dynamic table and vice versa.
+Re-running with the `--full-refresh` flag does not resolve this either.
+The workaround is manually dropping the existing model in the warehouse prior to calling `dbt run`.
+This only needs to be done once for the conversion.

For example, assume the example model below, `my_model`, has already been materialized to the underlying data platform via `dbt run`.
-If the user changes the model's config to `materialized="dynamic_table"`, they will get an error.
+If the model config is updated to `materialized="dynamic_table"`, dbt will return an error.
 The workaround is to execute `DROP TABLE my_model` on the data warehouse before trying the model again.

@@ -429,7 +469,7 @@ The workaround is to execute `DROP TABLE my_model` on the data warehouse before

 ```yaml
 {{ config(
-    materialized="table" # or any model type eg view, incremental
+    materialized="table" # or any model type (e.g. view, incremental)
 ) }}
 ```

@@ -437,3 +477,5 @@ The workaround is to execute `DROP TABLE my_model` on the data warehouse before
+
+
diff --git a/website/docs/terms/data-wrangling.md b/website/docs/terms/data-wrangling.md
index 58034fe8e91..b164855ff9b 100644
--- a/website/docs/terms/data-wrangling.md
+++ b/website/docs/terms/data-wrangling.md
@@ -51,7 +51,7 @@ The cleaning stage involves using different functions so that the values in your

 - Removing appropriate duplicates or nulls you found in the discovery process
 - Eliminating unnecessary characters or spaces from values

-Certain cleaning steps, like removing rows with null values, are helpful to do at the beginning of the process because removing nulls and duplicates from the start can increase the performance of your downstream models. In the cleaning step, it’s important to follow a standard for your transformations here. This means you should be following a consistent naming convention for your columns (especially for your primary keys) and casting to the same timezone and datatypes throughout your models. Examples include making sure all dates are in UTC time rather than source timezone-specific, all string in either lower or upper case, etc.
+Certain cleaning steps, like removing rows with null values, are helpful to do at the beginning of the process because removing nulls and duplicates from the start can increase the performance of your downstream models. In the cleaning step, it’s important to follow a standard for your transformations. This means you should follow a consistent naming convention for your columns (especially for your primary keys) and cast to the same timezone and datatypes throughout your models. Examples include making sure all dates are in UTC time rather than source timezone-specific, all strings are in either lower or upper case, etc.

:::tip dbt to the rescue!
If you're struggling to do all the cleaning on your own, remember that dbt packages ([dbt expectations](https://github.com/calogica/dbt-expectations), [dbt_utils](https://hub.getdbt.com/dbt-labs/dbt_utils/latest/), and [re_data](https://www.getre.io/)) and their macros are also available to help you clean up your data.
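To use the packages mentioned in the tip above, they first need to be declared in your project. Here is a minimal sketch of a `packages.yml`; the version ranges are illustrative, so pin whatever matches your dbt version:

```yaml
# packages.yml — a sketch; version ranges are illustrative
packages:
  - package: dbt-labs/dbt_utils
    version: [">=1.0.0", "<2.0.0"]
  - package: calogica/dbt_expectations
    version: [">=0.10.0", "<0.11.0"]
```

Running `dbt deps` afterward installs the packages and makes their cleaning macros and tests available in your models.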
diff --git a/website/sidebars.js b/website/sidebars.js
index 4653b028ef9..ba00ba582b2 100644
--- a/website/sidebars.js
+++ b/website/sidebars.js
@@ -424,6 +424,8 @@ const sidebarSettings = {
       link: { type: "doc", id: "docs/collaborate/explore-projects" },
       items: [
         "docs/collaborate/explore-projects",
+        "docs/collaborate/model-performance",
+        "docs/collaborate/project-recommendations",
         "docs/collaborate/explore-multiple-projects",
       ],
     },
@@ -721,6 +723,7 @@ const sidebarSettings = {
         "reference/resource-configs/oracle-configs",
         "reference/resource-configs/upsolver-configs",
         "reference/resource-configs/starrocks-configs",
+        "reference/resource-configs/infer-configs",
       ],
     },
     {
diff --git a/website/snippets/_cloud-environments-info.md b/website/snippets/_cloud-environments-info.md
index 4e1cba64e00..6e096b83750 100644
--- a/website/snippets/_cloud-environments-info.md
+++ b/website/snippets/_cloud-environments-info.md
@@ -62,7 +62,7 @@ By default, all environments will use the default branch in your repository (usu

 For more info, check out this [FAQ page on this topic](/faqs/Environments/custom-branch-settings)!

-### Extended attributes
+### Extended attributes

:::note Extended attributes are retrieved and applied only at runtime when `profiles.yml` is requested for a specific Cloud run. Extended attributes are currently _not_ taken into consideration for Cloud-specific features such as PrivateLink or SSH Tunneling that do not rely on `profiles.yml` values.
diff --git a/website/snippets/_new-sl-setup.md b/website/snippets/_new-sl-setup.md
index 3cb6e09eb4c..18e75c3278d 100644
--- a/website/snippets/_new-sl-setup.md
+++ b/website/snippets/_new-sl-setup.md
@@ -1,6 +1,7 @@
 You can set up the dbt Semantic Layer in dbt Cloud at the environment and project level. Before you begin:

-- You must have a dbt Cloud Team or Enterprise [multi-tenant](/docs/cloud/about-cloud/regions-ip-addresses) deployment. Single-tenant coming soon.
+- You must have a dbt Cloud Team or Enterprise account. This applies to both Multi-tenant and Single-tenant deployments.
+  - Single-tenant accounts should contact their account representative for the necessary setup and enablement.
 - You must be part of the Owner group, and have the correct [license](/docs/cloud/manage-access/seats-and-users) and [permissions](/docs/cloud/manage-access/self-service-permissions) to configure the Semantic Layer:
     * Enterprise plan — Developer license with Account Admin permissions. Or Owner with a Developer license, assigned Project Creator, Database Admin, or Admin permissions.
     * Team plan — Owner with a Developer license.
diff --git a/website/snippets/_sl-connect-and-query-api.md b/website/snippets/_sl-connect-and-query-api.md
index 429f41c3bf6..f7f1d2add24 100644
--- a/website/snippets/_sl-connect-and-query-api.md
+++ b/website/snippets/_sl-connect-and-query-api.md
@@ -1,10 +1,8 @@
 You can query your metrics in a JDBC-enabled tool or use existing first-class integrations with the dbt Semantic Layer.

-You must have a dbt Cloud Team or Enterprise [multi-tenant](/docs/cloud/about-cloud/regions-ip-addresses) deployment. Single-tenant coming soon.
-
+- You must have a dbt Cloud Team or Enterprise account. This applies to both Multi-tenant and Single-tenant deployments.
+  - Single-tenant accounts should contact their account representative for the necessary setup and enablement.
 - To learn how to use the JDBC or GraphQL API and what tools you can query it with, refer to [dbt Semantic Layer APIs](/docs/dbt-cloud-apis/sl-api-overview).
- * To authenticate, you need to [generate a service token](/docs/dbt-cloud-apis/service-tokens) with Semantic Layer Only and Metadata Only permissions.
 * Refer to the [SQL query syntax](/docs/dbt-cloud-apis/sl-jdbc#querying-the-api-for-metric-metadata) to query metrics using the API.
-
- To learn more about the sophisticated integrations that connect to the dbt Semantic Layer, refer to [Available integrations](/docs/use-dbt-semantic-layer/avail-sl-integrations) for more info.
diff --git a/website/snippets/_sl-faqs.md b/website/snippets/_sl-faqs.md
index 5bc556ae00a..def8f3837f6 100644
@@ -10,6 +10,11 @@

 As we refine MetricFlow’s API layers, some users may find it easier to set up their own custom service layers for managing query requests. This is not currently recommended, as the API boundaries around MetricFlow are not sufficiently well-defined for broad-based community use

+- **Why is my query limited to 100 rows in the dbt Cloud CLI?**
+- The default `limit` for queries issued from the dbt Cloud CLI is 100 rows. We set this default to prevent returning unnecessarily large data sets, as the dbt Cloud CLI is typically used to query the dbt Semantic Layer during the development process, not for production reporting or to access large data sets. For most workflows, you only need to return a subset of the data.
+
+  However, you can change this limit if needed by setting the `--limit` option in your query. For example, to return 1000 rows, you can run `dbt sl list metrics --limit 1000`.
+
- **Can I reference MetricFlow queries inside dbt models?**
- dbt relies on Jinja macros to compile SQL, while MetricFlow is Python-based and does direct SQL rendering targeting a specific dialect. MetricFlow does not support pass-through rendering of Jinja macros, so we can’t easily reference MetricFlow queries inside of dbt models.
diff --git a/website/snippets/_sl-plan-info.md b/website/snippets/_sl-plan-info.md
index 083ab2209bc..fe4e6024226 100644
--- a/website/snippets/_sl-plan-info.md
+++ b/website/snippets/_sl-plan-info.md
@@ -1,2 +1,2 @@
-To define and query metrics with the {props.product}, you must be on a {props.plan} multi-tenant plan .


+To define and query metrics with the {props.product}, you must be on a {props.plan} account. This applies to both Multi-tenant and Single-tenant accounts. Note: Single-tenant accounts should contact their account representative for the necessary setup and enablement.

diff --git a/website/snippets/_v2-sl-prerequisites.md b/website/snippets/_v2-sl-prerequisites.md
index c80db4d1c8f..eb8b5fc27e4 100644
--- a/website/snippets/_v2-sl-prerequisites.md
+++ b/website/snippets/_v2-sl-prerequisites.md
@@ -1,15 +1,16 @@
-- Have a dbt Cloud Team or Enterprise [multi-tenant](/docs/cloud/about-cloud/regions-ip-addresses) deployment. Single-tenant coming soon.
-- Have both your production and development environments running dbt version 1.6 or higher. Refer to [upgrade in dbt Cloud](/docs/dbt-versions/upgrade-core-in-cloud) for more info.
+- Have a dbt Cloud Team or Enterprise account. This applies to both Multi-tenant and Single-tenant deployments.
+  - Note: Single-tenant accounts should contact their account representative for the necessary setup and enablement.
+- Have both your production and development environments running [dbt version 1.6 or higher](/docs/dbt-versions/upgrade-core-in-cloud).
 - Use Snowflake, BigQuery, Databricks, or Redshift.
 - Create a successful run in the environment where you configure the Semantic Layer.
   - **Note:** Semantic Layer currently supports the Deployment environment for querying. (_development querying experience coming soon_)
 - Set up the [Semantic Layer API](/docs/dbt-cloud-apis/sl-api-overview) in the integrated tool to import metric definitions.
-  - To access the API and query metrics in downstream tools, you must have a dbt Cloud [Team or Enterprise](https://www.getdbt.com/pricing/) account. dbt Core or Developer accounts can define metrics but won't be able to dynamically query them.
+ - dbt Core or Developer accounts can define metrics but won't be able to dynamically query them.
- Understand [MetricFlow's](/docs/build/about-metricflow) key concepts, which power the latest dbt Semantic Layer.
-- Note that SSH tunneling for [Postgres and Redshift](/docs/cloud/connect-data-platform/connect-redshift-postgresql-alloydb) connections, [PrivateLink](/docs/cloud/secure/about-privatelink), and [Single sign-on (SSO)](/docs/cloud/manage-access/sso-overview) isn't supported yet.
+- Note that SSH tunneling for [Postgres and Redshift](/docs/cloud/connect-data-platform/connect-redshift-postgresql-alloydb) connections, [PrivateLink](/docs/cloud/secure/about-privatelink), and [Single sign-on (SSO)](/docs/cloud/manage-access/sso-overview) aren't supported in the dbt Semantic Layer yet.
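Since the prerequisites above mention defining metrics, here is a minimal sketch of what a MetricFlow metric definition looks like in a project YAML file. The metric name and the `order_count` measure are hypothetical and assume a matching semantic model already exposes that measure:

```yaml
# models/marts/sem_orders.yml — a sketch; assumes a semantic model with an `order_count` measure
metrics:
  - name: order_count
    description: "Count of completed orders"
    label: Order count
    type: simple
    type_params:
      measure: order_count
```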
diff --git a/website/src/components/communitySpotlightCard/index.js b/website/src/components/communitySpotlightCard/index.js index 08707a93dd4..122edee8f06 100644 --- a/website/src/components/communitySpotlightCard/index.js +++ b/website/src/components/communitySpotlightCard/index.js @@ -1,5 +1,6 @@ import React from 'react' import Link from '@docusaurus/Link'; +import Head from "@docusaurus/Head"; import styles from './styles.module.css'; import imageCacheWrapper from '../../../functions/image-cache-wrapper'; @@ -47,24 +48,45 @@ function CommunitySpotlightCard({ frontMatter, isSpotlightMember = false }) { jobTitle, companyName, organization, - socialLinks + socialLinks, + communityAward } = frontMatter - return ( - + // Get meta description text + const metaDescription = stripHtml(description) + + return ( + + {isSpotlightMember && metaDescription ? ( + + + + + ) : null} + {communityAward ? ( +
+ Community Award Recipient +
+ ) : null} {image && (
{id && isSpotlightMember ? ( - {title} + {title} ) : ( - - {title} + + {title} )}
@@ -72,19 +94,26 @@ function CommunitySpotlightCard({ frontMatter, isSpotlightMember = false }) {
{!isSpotlightMember && id ? (

- {title} + + {title} +

- ) : ( + ) : (

{title}

)} - {pronouns &&
{pronouns}
} - + {pronouns && ( +
{pronouns}
+ )} + {isSpotlightMember && (
{(jobTitle || companyName) && (
{jobTitle && jobTitle} - {jobTitle && companyName && ', '} + {jobTitle && companyName && ", "} {companyName && companyName}
)} @@ -101,7 +130,10 @@ function CommunitySpotlightCard({ frontMatter, isSpotlightMember = false }) {
)} {description && !isSpotlightMember && ( -

+

)} {socialLinks && isSpotlightMember && socialLinks?.length > 0 && (

@@ -109,8 +141,15 @@ function CommunitySpotlightCard({ frontMatter, isSpotlightMember = false }) { <> {item?.name && item?.link && ( <> - {i !== 0 && ' | '} - {item.name} + {i !== 0 && " | "} + + {item.name} + )} @@ -118,29 +157,33 @@ function CommunitySpotlightCard({ frontMatter, isSpotlightMember = false }) {
)} {id && !isSpotlightMember && ( - Read More + > + Read More + )}
{description && isSpotlightMember && (

About

-

- +

)}
- ) + ); } -// Truncate text +// Truncate description text for community member cards function truncateText(str) { // Max length of string let maxLength = 300 - // Check if anchor link starts within first 300 characters + // Check if anchor link starts within maxLength let hasLinks = false if(str.substring(0, maxLength - 3).match(/(?:]+)>)/gi, "") + + // Strip new lines and return 130 character substring for description + const updatedDesc = strippedHtml + ?.substring(0, maxLength) + ?.replace(/(\r\n|\r|\n)/g, ""); + + return desc?.length > maxLength ? `${updatedDesc}...` : updatedDesc +} + export default CommunitySpotlightCard diff --git a/website/src/components/communitySpotlightCard/styles.module.css b/website/src/components/communitySpotlightCard/styles.module.css index 253a561ebea..5df85c8a4cc 100644 --- a/website/src/components/communitySpotlightCard/styles.module.css +++ b/website/src/components/communitySpotlightCard/styles.module.css @@ -15,6 +15,19 @@ header.spotlightMemberCard { div.spotlightMemberCard { margin-bottom: 2.5rem; } +.spotlightMemberCard .awardBadge { + flex: 0 0 100%; + margin-bottom: .5rem; +} +.spotlightMemberCard .awardBadge span { + max-width: fit-content; + color: #fff; + background: var(--ifm-color-primary); + display: block; + border-radius: 1rem; + padding: 5px 10px; + font-size: .7rem; +} .spotlightMemberCard .spotlightMemberImgContainer { flex: 0 0 100%; } @@ -81,6 +94,9 @@ div.spotlightMemberCard { margin-bottom: 0; padding-left: 0; } + .spotlightMemberCard .awardBadge span { + font-size: .8rem; + } .spotlightMemberCard .spotlightMemberImgContainer { flex: 0 0 346px; margin-right: 2rem; @@ -100,7 +116,3 @@ div.spotlightMemberCard { line-height: 2rem; } } - - - - diff --git a/website/src/components/communitySpotlightList/index.js b/website/src/components/communitySpotlightList/index.js index 6885f5ff2ac..ed0dbf6d653 100644 --- a/website/src/components/communitySpotlightList/index.js +++ b/website/src/components/communitySpotlightList/index.js @@ -36,6 +36,7 @@ function CommunitySpotlightList({ spotlightData }) { {metaTitle} +