diff --git a/.github/pull_request_template.md b/.github/pull_request_template.md index 870dadcd183..d2bb72552bd 100644 --- a/.github/pull_request_template.md +++ b/.github/pull_request_template.md @@ -9,6 +9,7 @@ To learn more about the writing conventions used in the dbt Labs docs, see the [ - [ ] I have reviewed the [Content style guide](https://github.com/dbt-labs/docs.getdbt.com/blob/current/contributing/content-style-guide.md) so my content adheres to these guidelines. - [ ] The topic I'm writing about is for specific dbt version(s) and I have versioned it according to the [version a whole page](https://github.com/dbt-labs/docs.getdbt.com/blob/current/contributing/single-sourcing-content.md#adding-a-new-version) and/or [version a block of content](https://github.com/dbt-labs/docs.getdbt.com/blob/current/contributing/single-sourcing-content.md#versioning-blocks-of-content) guidelines. - [ ] I have added checklist item(s) to this list for anything that needs to happen before this PR is merged, such as "needs technical review" or "change base branch." +- [ ] The content in this PR requires a dbt release note, so I added one to the [release notes page](https://docs.getdbt.com/docs/dbt-versions/dbt-cloud-release-notes). -Note: The **Defer to staging/production** [toggle](/docs/cloud/about-cloud-develop-defer#defer-in-the-dbt-cloud-ide) button doesn't apply when running Semantic Layer commands in the dbt Cloud IDE. To use defer for Semantic layer commands in the IDE, toggle the button on and manually add the `--defer` flag to the command. This is a temporary workaround and will be available soon. You can install [MetricFlow](https://github.com/dbt-labs/metricflow#getting-started) from [PyPI](https://pypi.org/project/dbt-metricflow/). You need to use `pip` to install MetricFlow on Windows or Linux operating systems: + + 1. Create or activate your virtual environment `python -m venv venv` 2. Run `pip install dbt-metricflow` * You can install MetricFlow using PyPI as an extension of your dbt adapter in the command line. To install the adapter, run `python -m pip install "dbt-metricflow[your_adapter_name]"` and add the adapter name at the end of the command. For example, for a Snowflake adapter run `python -m pip install "dbt-metricflow[snowflake]"` + + + + +1. Create or activate your virtual environment `python -m venv venv` +2. Run `pip install dbt-metricflow` + * You can install MetricFlow using PyPI as an extension of your dbt adapter in the command line. To install the adapter, run `python -m pip install "dbt-metricflow[adapter_package_name]"` and add the adapter name at the end of the command. For example, for a Snowflake adapter run `python -m pip install "dbt-metricflow[dbt-snowflake]"` + + + **Note**, you'll need to manage versioning between dbt Core, your adapter, and MetricFlow. Something to note, MetricFlow `mf` commands return an error if you have a Metafont latex package installed. To run `mf` commands, uninstall the package.
diff --git a/website/docs/docs/build/metricflow-time-spine.md b/website/docs/docs/build/metricflow-time-spine.md index e932fb36f53..50d1d68d0bd 100644 --- a/website/docs/docs/build/metricflow-time-spine.md +++ b/website/docs/docs/build/metricflow-time-spine.md @@ -124,42 +124,6 @@ For an example project, refer to our [Jaffle shop](https://github.com/dbt-labs/j - - -```sql -{{ - config( - materialized = 'table', - ) -}} - -with days as ( - - {{ - dbt_utils.date_spine( - 'day', - "to_date('01/01/2000','mm/dd/yyyy')", - "to_date('01/01/2025','mm/dd/yyyy')" - ) - }} - -), - -final as ( - select cast(date_day as date) as date_day - from days -) - -select * from final --- filter the time spine to a specific range -where date_day > dateadd(year, -4, current_timestamp()) -and date_hour < dateadd(day, 30, current_timestamp()) -``` - - - - - ```sql {{ config( @@ -189,42 +153,9 @@ where date_day > dateadd(year, -4, current_timestamp()) and date_hour < dateadd(day, 30, current_timestamp()) ``` - - ### Daily (BigQuery) Use this model if you're using BigQuery. BigQuery supports `DATE()` instead of `TO_DATE()`: - - - - -```sql -{{config(materialized='table')}} -with days as ( - {{dbt_utils.date_spine( - 'day', - "DATE(2000,01,01)", - "DATE(2025,01,01)" - ) - }} -), - -final as ( - select cast(date_day as date) as date_day - from days -) - -select * -from final --- filter the time spine to a specific range -where date_day > dateadd(year, -4, current_timestamp()) -and date_hour < dateadd(day, 30, current_timestamp()) -``` - - - - - @@ -253,7 +184,6 @@ and date_hour < dateadd(day, 30, current_timestamp()) ``` - @@ -306,42 +236,6 @@ To create this table, you need to create a model in your dbt project called `met ### Daily - - - -```sql -{{ - config( - materialized = 'table', - ) -}} - -with days as ( - - {{ - dbt_utils.date_spine( - 'day', - "to_date('01/01/2000','mm/dd/yyyy')", - "to_date('01/01/2025','mm/dd/yyyy')" - ) - }} - -), - -final as ( - select cast(date_day as date) as date_day - from days -) - -select * from final --- filter the time spine to a specific range -where date_day > dateadd(year, -4, current_timestamp()) -and date_hour < dateadd(day, 30, current_timestamp()) -``` - - - - @@ -375,43 +269,11 @@ and date_hour < dateadd(day, 30, current_timestamp()) ``` - ### Daily (BigQuery) Use this model if you're using BigQuery. BigQuery supports `DATE()` instead of `TO_DATE()`: - - - - -```sql -{{config(materialized='table')}} -with days as ( - {{dbt_utils.date_spine( - 'day', - "DATE(2000,01,01)", - "DATE(2025,01,01)" - ) - }} -), - -final as ( - select cast(date_day as date) as date_day - from days -) - -select * -from final --- filter the time spine to a specific range -where date_day > dateadd(year, -4, current_timestamp()) -and date_hour < dateadd(day, 30, current_timestamp()) -``` - - - - - ```sql @@ -438,7 +300,6 @@ and date_hour < dateadd(day, 30, current_timestamp()) ``` - You only need to include the `date_day` column in the table. MetricFlow can handle broader levels of detail, but finer grains are only supported in versions 1.9 and higher. @@ -463,9 +324,22 @@ For example, if you use a custom calendar in your organization, such as a fiscal - This is useful for calculating metrics based on a custom calendar, such as fiscal quarters or weeks. - Use the `custom_granularities` key to define a non-standard time period for querying data, such as a `retail_month` or `fiscal_week`, instead of standard options like `day`, `month`, or `year`. 
-- Ensure the the `standard_granularity_column` is a date time type. - This feature provides more control over how time-based metrics are calculated. + + +When working with custom calendars in MetricFlow, it's important to ensure: + +- Consistent data types — Both your dimension column and the time spine column should use the same data type to allow accurate comparisons. Functions like `DATE_TRUNC` don't change the data type of the input in some databases (like Snowflake). Using different data types can lead to mismatches and inaccurate results. + + We recommend using `DATETIME` or `TIMESTAMP` data types for your time dimensions and time spine, as they support all granularities. The `DATE` data type may not support smaller granularities like hours or minutes. + +- Time zones — MetricFlow currently doesn't perform any timezone manipulation. When working with timezone-aware data, inconsistent time zones may lead to unexpected results during aggregations and comparisons. + +For example, if your time spine column is `TIMESTAMP` type and your dimension column is `DATE` type, comparisons between these columns might not work as intended. To fix this, convert your `DATE` column to `TIMESTAMP`, or make sure both columns are the same data type. + + + ### Add custom granularities To add custom granularities, the Semantic Layer supports custom calendar configurations that allow users to query data using non-standard time periods like `fiscal_year` or `retail_month`. You can define these custom granularities (all lowercased) by modifying your model's YAML configuration like this: diff --git a/website/docs/docs/build/packages.md b/website/docs/docs/build/packages.md index 0b69d10cee6..49cd7e00b1c 100644 --- a/website/docs/docs/build/packages.md +++ b/website/docs/docs/build/packages.md @@ -20,9 +20,10 @@ In dbt, libraries like these are called _packages_. dbt's packages are so powerf * Models to understand [Redshift](https://hub.getdbt.com/dbt-labs/redshift/latest/) privileges. * Macros to work with data loaded by [Stitch](https://hub.getdbt.com/dbt-labs/stitch_utils/latest/). -dbt _packages_ are in fact standalone dbt projects, with models and macros that tackle a specific problem area. As a dbt user, by adding a package to your project, the package's models and macros will become part of your own project. This means: +dbt _packages_ are in fact standalone dbt projects, with models, macros, and other resources that tackle a specific problem area. As a dbt user, by adding a package to your project, all of the package's resources will become part of your own project. This means: * Models in the package will be materialized when you `dbt run`. * You can use `ref` in your own models to refer to models from the package. +* You can use `source` to refer to sources in the package. * You can use macros in the package in your own project. * It's important to note that defining and installing dbt packages is different from [defining and installing Python packages](/docs/build/python-models#using-pypi-packages) @@ -82,11 +83,7 @@ packages: version: [">=0.7.0", "<0.8.0"] ``` - - -Beginning in v1.7, `dbt deps` "pins" each package by default. See ["Pinning packages"](#pinning-packages) for details. - - +`dbt deps` "pins" each package by default. See ["Pinning packages"](#pinning-packages) for details. Where possible, we recommend installing packages via dbt Hub, since this allows dbt to handle duplicate dependencies. 
This is helpful in situations such as: * Your project uses both the dbt-utils and Snowplow packages, and the Snowplow package _also_ uses the dbt-utils package. @@ -145,18 +142,8 @@ packages: revision: 4e28d6da126e2940d17f697de783a717f2503188 ``` - - -We **strongly recommend** ["pinning" your packages](#pinning-packages) to a specific release by specifying a release name. - - - - - By default, `dbt deps` "pins" each package. See ["Pinning packages"](#pinning-packages) for details. - - ### Internally hosted tarball URL Some organizations have security requirements to pull resources only from internal services. To address the need to install packages from hosted environments such as Artifactory or cloud storage buckets, dbt Core enables you to install packages from internally-hosted tarball URLs. @@ -318,18 +305,6 @@ When you remove a package from your `packages.yml` file, it isn't automatically ### Pinning packages - - -We **strongly recommend** "pinning" your package to a specific release by specifying a tagged release name or a specific commit hash. - -If you do not provide a revision, or if you use the main branch, then any updates to the package will be incorporated into your project the next time you run `dbt deps`. While we generally try to avoid making breaking changes to these packages, they are sometimes unavoidable. Pinning a package revision helps prevent your code from changing without your explicit approval. - -To find the latest release for a package, navigate to the `Releases` tab in the relevant GitHub repository. For example, you can find all of the releases for the dbt-utils package [here](https://github.com/dbt-labs/dbt-utils/releases). - - - - - Beginning with v1.7, running [`dbt deps`](/reference/commands/deps) "pins" each package by creating or updating the `package-lock.yml` file in the _project_root_ where `packages.yml` is recorded. - The `package-lock.yml` file contains a record of all packages installed. @@ -337,8 +312,6 @@ Beginning with v1.7, running [`dbt deps`](/reference/commands/deps) "pins" each For example, if you use a branch name, the `package-lock.yml` file pins to the head commit. If you use a version range, it pins to the latest release. In either case, subsequent commits or versions will **not** be installed. To get new commits or versions, run `dbt deps --upgrade` or add `package-lock.yml` to your .gitignore file. - - As of v0.14.0, dbt will warn you if you install a package using the `git` syntax without specifying a revision (see below). ### Configuring packages diff --git a/website/docs/docs/build/saved-queries.md b/website/docs/docs/build/saved-queries.md index 649885f9506..ed56d13dcc9 100644 --- a/website/docs/docs/build/saved-queries.md +++ b/website/docs/docs/build/saved-queries.md @@ -154,8 +154,6 @@ saved_queries: - - #### Project-level saved queries To enable saved queries at the project level, you can set the `saved-queries` configuration in the [`dbt_project.yml` file](/reference/dbt_project.yml). This saves you time in configuring saved queries in each file: @@ -171,7 +169,6 @@ saved-queries: For more information on `dbt_project.yml` and config naming conventions, see the [dbt_project.yml reference page](/reference/dbt_project.yml#naming-convention). - To build `saved_queries`, use the [`--resource-type` flag](/reference/global-configs/resource-type) and run the command `dbt build --resource-type saved_query`. 
diff --git a/website/docs/docs/build/semantic-models.md b/website/docs/docs/build/semantic-models.md index d683d7cd020..609d7f1ff8d 100644 --- a/website/docs/docs/build/semantic-models.md +++ b/website/docs/docs/build/semantic-models.md @@ -119,8 +119,6 @@ semantic_models: type: categorical ``` - - Semantic models support [`meta`](/reference/resource-configs/meta), [`group`](/reference/resource-configs/group), and [`enabled`](/reference/resource-configs/enabled) [`config`](/reference/resource-properties/config) property in either the schema file or at the project level: - Semantic model config in `models/semantic.yml`: @@ -148,8 +146,6 @@ Semantic models support [`meta`](/reference/resource-configs/meta), [`group`](/r For more information on `dbt_project.yml` and config naming conventions, see the [dbt_project.yml reference page](/reference/dbt_project.yml#naming-convention). - - ### Name Define the name of the semantic model. You must define a unique name for the semantic model. The semantic graph will use this name to identify the model, and you can update it at any time. Avoid using double underscores (\_\_) in the name as they're not supported. diff --git a/website/docs/docs/cloud-integrations/configure-auto-exposures.md b/website/docs/docs/cloud-integrations/configure-auto-exposures.md index 4574d69c164..42e36e572b3 100644 --- a/website/docs/docs/cloud-integrations/configure-auto-exposures.md +++ b/website/docs/docs/cloud-integrations/configure-auto-exposures.md @@ -25,18 +25,17 @@ To access the features, you should meet the following: 3. You have set up a [production](/docs/deploy/deploy-environments#set-as-production-environment) deployment environment for each project you want to explore, with at least one successful job run. 4. You have [admin permissions](/docs/cloud/manage-access/enterprise-permissions) in dbt Cloud to edit project settings or production environment settings. 5. Use Tableau as your BI tool and enable metadata permissions or work with an admin to do so. Compatible with Tableau Cloud or Tableau Server with the Metadata API enabled. -6. Run a production job _after_ saving the Tableau collections. ## Set up in Tableau This section of the document explains the steps you need to set up the auto-exposures integration with Tableau. Once you've set this up in Tableau and dbt Cloud, you can view the [auto-exposures](/docs/collaborate/auto-exposures#view-auto-exposures-in-dbt-explorer) in dbt Explorer. -To set up [personal access tokens (PATs)](/docs/dbt-cloud-apis/user-tokens#using-the-new-personal-access-tokens) needed for auto exposures, ask a site admin to configure it for the account. +To set up [personal access tokens (PATs)](https://help.tableau.com/current/server/en-us/security_personal_access_tokens.htm) needed for auto exposures, ask a site admin to configure it for the account. 1. Ensure you or a site admin enables PATs for the account in Tableau. -2. Create a PAT that you can add to dbt Cloud to pull in Tableau metadata for auto exposures. +2. Create a PAT that you can add to dbt Cloud to pull in Tableau metadata for auto exposures. Ensure the user creating the PAT has access to collections/folders, as the PAT only grants access matching the creator's existing privileges. 3. Copy the **Secret** and the **Token name** and enter them in dbt Cloud. The secret is only displayed once, so store it in a safe location (like a password manager). 
@@ -63,7 +62,6 @@ To set up [personal access tokens (PATs)](/docs/dbt-cloud-apis/user-tokens#using dbt Cloud automatically imports and syncs any workbook within the selected collections. New additions to the collections will be added to the lineage in dbt Cloud during the next automatic sync (usually once per day). 5. Click **Save**. -6. Run a production job _after_ saving the Tableau collections. dbt Cloud imports everything in the collection(s) and you can continue to view them in Explorer. For more information on how to view and use auto-exposures, refer to [View auto-exposures from dbt Explorer](/docs/collaborate/auto-exposures) page. @@ -72,5 +70,5 @@ dbt Cloud imports everything in the collection(s) and you can continue to view t ## Refresh auto-exposures in jobs :::info Coming soon -Soon, you’ll also be able to use auto-exposures trigger refresh of the data used in your Tableau dashboards from within dbt Cloud. Stay tuned for more on this soon! +Soon, you’ll also be able to use auto-exposures to trigger the refresh of the data used in your Tableau dashboards from within dbt Cloud. Stay tuned for more on this soon! ::: diff --git a/website/docs/docs/cloud-integrations/overview.md b/website/docs/docs/cloud-integrations/overview.md index d925e3e52a7..8334632a7f8 100644 --- a/website/docs/docs/cloud-integrations/overview.md +++ b/website/docs/docs/cloud-integrations/overview.md @@ -13,7 +13,7 @@ Many data applications integrate with dbt Cloud, enabling you to leverage the po
diff --git a/website/docs/docs/cloud/about-cloud-develop-defer.md b/website/docs/docs/cloud/about-cloud-develop-defer.md index 3ee5ac71666..fc55edf8a38 100644 --- a/website/docs/docs/cloud/about-cloud-develop-defer.md +++ b/website/docs/docs/cloud/about-cloud-develop-defer.md @@ -40,9 +40,6 @@ To enable defer in the dbt Cloud IDE, toggle the **Defer to production** button For example, if you were to start developing on a new branch with [nothing in your development schema](/reference/node-selection/defer#usage), edit a single model, and run `dbt build -s state:modified` — only the edited model would run. Any `{{ ref() }}` functions will point to the production location of the referenced models. - -Note: The **Defer to staging/production** toggle button doesn't apply when running [dbt Semantic Layer commands](/docs/build/metricflow-commands) in the dbt Cloud IDE. To use defer for Semantic layer commands in the IDE, toggle the button on and manually add the `--defer` flag to the command. This is a temporary workaround and will be available soon. - ### Defer in dbt Cloud CLI diff --git a/website/docs/docs/cloud/account-settings.md b/website/docs/docs/cloud/account-settings.md index 3b2632c8747..aaad9b28e5c 100644 --- a/website/docs/docs/cloud/account-settings.md +++ b/website/docs/docs/cloud/account-settings.md @@ -45,6 +45,6 @@ To use, select the **Enable partial parsing between deployment runs** option fro To use Advanced CI features, your dbt Cloud account must have access to them. Ask your dbt Cloud administrator to enable Advanced CI features on your account, which they can do by choosing the **Enable account access to Advanced CI** option from the account settings. -Once enabled, the **Run compare changes** option becomes available in the CI job settings for you to select. +Once enabled, the **dbt compare** option becomes available in the CI job settings for you to select. - \ No newline at end of file + diff --git a/website/docs/docs/cloud/connect-data-platform/about-connections.md b/website/docs/docs/cloud/connect-data-platform/about-connections.md index 6f2f140b724..89dd13808ec 100644 --- a/website/docs/docs/cloud/connect-data-platform/about-connections.md +++ b/website/docs/docs/cloud/connect-data-platform/about-connections.md @@ -20,9 +20,12 @@ dbt Cloud can connect with a variety of data platform providers including: - [Starburst or Trino](/docs/cloud/connect-data-platform/connect-starburst-trino) - [Teradata](/docs/cloud/connect-data-platform/connect-teradata) -You can connect to your database in dbt Cloud by clicking the gear in the top right and selecting **Account Settings**. From the Account Settings page, click **+ New Project**. +To connect to your database in dbt Cloud: - +1. Click your account name at the bottom of the left-side menu and click **Account settings** +2. Select **Projects** from the top left, and from there click **New Project** + + These connection instructions provide the basic fields required for configuring a data platform connection in dbt Cloud. For more detailed guides, which include demo project data, read our [Quickstart guides](https://docs.getdbt.com/guides) @@ -41,7 +44,7 @@ Connections created with APIs before this change cannot be accessed with the [la Warehouse connections are an account-level resource. As such you can find them under **Accounts Settings** > **Connections**: - + Warehouse connections can be re-used across projects. 
If multiple projects all connect to the same warehouse, you should re-use the same connection to streamline your management operations. Connections are assigned to a project via an [environment](/docs/dbt-cloud-environments). diff --git a/website/docs/docs/cloud/connect-data-platform/connect-amazon-athena.md b/website/docs/docs/cloud/connect-data-platform/connect-amazon-athena.md index 0b2f844ccac..f1009f61274 100644 --- a/website/docs/docs/cloud/connect-data-platform/connect-amazon-athena.md +++ b/website/docs/docs/cloud/connect-data-platform/connect-amazon-athena.md @@ -5,7 +5,7 @@ description: "Configure the Amazon Athena data platform connection in dbt Cloud. sidebar_label: "Connect Amazon Athena" --- -# Connect Amazon Athena +# Connect Amazon Athena Your environment(s) must be on ["Versionless"](/docs/dbt-versions/versionless-cloud) to use the Amazon Athena connection. diff --git a/website/docs/docs/cloud/connect-data-platform/connnect-bigquery.md b/website/docs/docs/cloud/connect-data-platform/connnect-bigquery.md index 0243bc619b1..1ce9712ab91 100644 --- a/website/docs/docs/cloud/connect-data-platform/connnect-bigquery.md +++ b/website/docs/docs/cloud/connect-data-platform/connnect-bigquery.md @@ -52,6 +52,123 @@ As an end user, if your organization has set up BigQuery OAuth, you can link a p To learn how to optimize performance with data platform-specific configurations in dbt Cloud, refer to [BigQuery-specific configuration](/reference/resource-configs/bigquery-configs). +### Optional configurations + +In BigQuery, optional configurations let you tailor settings for tasks such as query priority, dataset location, job timeout, and more. These options give you greater control over how BigQuery functions behind the scenes to meet your requirements. + +To customize your optional configurations in dbt Cloud: + +1. Click your name at the bottom left-hand side bar menu in dbt Cloud +2. Select **Your profile** from the menu +3. From there, click **Projects** and select your BigQuery project +5. Go to **Development Connection** and select BigQuery +6. Click **Edit** and then scroll down to **Optional settings** + + + +The following are the optional configurations you can set in dbt Cloud: + +| Configuration |
Information | Type | Example
| +|---------------------------|-----------------------------------------|---------|--------------------| +| [Priority](#priority) | Sets the priority for BigQuery jobs (either `interactive` or queued for `batch` processing) | String | `batch` or `interactive` | +| [Retries](#retries) | Specifies the number of retries for failed jobs due to temporary issues | Integer | `3` | +| [Location](#location) | Location for creating new datasets | String | `US`, `EU`, `us-west2` | +| [Maximum bytes billed](#maximum-bytes-billed) | Limits the maximum number of bytes that can be billed for a query | Integer | `1000000000` | +| [Execution project](#execution-project) | Specifies the project ID to bill for query execution | String | `my-project-id` | +| [Impersonate service account](#impersonate-service-account) | Allows users authenticated locally to access BigQuery resources under a specified service account | String | `service-account@project.iam.gserviceaccount.com` | +| [Job retry deadline seconds](#job-retry-deadline-seconds) | Sets the total number of seconds BigQuery will attempt to retry a job if it fails | Integer | `600` | +| [Job creation timeout seconds](#job-creation-timeout-seconds) | Specifies the maximum timeout for the job creation step | Integer | `120` | +| [Google cloud storage-bucket](#google-cloud-storage-bucket) | Location for storing objects in Google Cloud Storage | String | `my-bucket` | +| [Dataproc region](#dataproc-region) | Specifies the cloud region for running data processing jobs | String | `US`, `EU`, `asia-northeast1` | +| [Dataproc cluster name](#dataproc-cluster-name) | Assigns a unique identifier to a group of virtual machines in Dataproc | String | `my-cluster` | + + + + +The `priority` for the BigQuery jobs that dbt executes can be configured with the `priority` configuration in your BigQuery profile. The priority field can be set to one of `batch` or `interactive`. For more information on query priority, consult the [BigQuery documentation](https://cloud.google.com/bigquery/docs/running-queries). + + + + + +Retries in BigQuery help to ensure that jobs complete successfully by trying again after temporary failures, making your operations more robust and reliable. + + + + + +The `location` of BigQuery datasets can be set using the `location` setting in a BigQuery profile. As per the [BigQuery documentation](https://cloud.google.com/bigquery/docs/locations), `location` may be either a multi-regional location (for example, `EU`, `US`), or a regional location (like `us-west2`). + + + + + +When a `maximum_bytes_billed` value is configured for a BigQuery profile, that allows you to limit how much data your query can process. It’s a safeguard to prevent your query from accidentally processing more data than you expect, which could lead to higher costs. Queries executed by dbt will fail if they exceed the configured maximum bytes threshhold. This configuration should be supplied as an integer number of bytes. + +If your `maximum_bytes_billed` is 1000000000, you would enter that value in the `maximum_bytes_billed` field in dbt cloud. + + + + + + +By default, dbt will use the specified `project`/`database` as both: + +1. The location to materialize resources (models, seeds, snapshots, and so on), unless they specify a custom project/database config +2. The GCP project that receives the bill for query costs or slot usage + +Optionally, you may specify an execution project to bill for query execution, instead of the project/database where you materialize most resources. 
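
For reference, the settings in the table above correspond to connection options that, outside of dbt Cloud, are typically written as keys in a dbt Core `profiles.yml` target for BigQuery. The following is a minimal, hypothetical sketch only: the project names, bucket, and cluster values are placeholders, and you should confirm each key against the dbt-bigquery profile documentation for your version.

```yaml
# Hypothetical profiles.yml target illustrating the optional BigQuery settings
# described above. All identifiers and values are placeholders.
my_bigquery_profile:
  target: dev
  outputs:
    dev:
      type: bigquery
      method: oauth
      project: analytics-project            # where dbt materializes resources
      dataset: dbt_dev
      priority: interactive                  # or batch
      retries: 3                             # retry jobs that fail for temporary reasons
      location: US                           # dataset location
      maximum_bytes_billed: 1000000000       # fail queries that would bill more than this
      execution_project: billing-project-id  # bill query execution to a different project
      job_retry_deadline_seconds: 600
      job_creation_timeout_seconds: 120
      gcs_bucket: my-bucket                  # used when running Python models on Dataproc
      dataproc_region: us-west2
      dataproc_cluster_name: my-cluster
```

In dbt Cloud, you enter the same values in the optional settings fields described above rather than editing a profile file.
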
+ + + + + +This feature allows users authenticating using local OAuth to access BigQuery resources based on the permissions of a service account. + +For a general overview of this process, see the official docs for [Creating Short-lived Service Account Credentials](https://cloud.google.com/iam/docs/create-short-lived-credentials-direct). + + + + + +Job retry deadline seconds is the maximum amount of time BigQuery will spend retrying a job before it gives up. + + + + + +Job creation timeout seconds is the maximum time BigQuery will wait to start the job. If the job doesn’t start within that time, it times out. + + + +#### Run dbt python models on Google Cloud Platform + +import BigQueryDataproc from '/snippets/_bigquery-dataproc.md'; + + + + + +Everything you store in Cloud Storage must be placed inside a [bucket](https://cloud.google.com/storage/docs/buckets). Buckets help you organize your data and manage access to it. + + + + + +A designated location in the cloud where you can run your data processing jobs efficiently. This region must match the location of your BigQuery dataset if you want to use Dataproc with BigQuery to ensure data doesn't move across regions, which can be inefficient and costly. + +For more information on [Dataproc regions](https://cloud.google.com/bigquery/docs/locations), refer to the BigQuery documentation. + + + + + +A unique label you give to your group of virtual machines to help you identify and manage your data processing tasks in the cloud. When you integrate Dataproc with BigQuery, you need to provide the cluster name so BigQuery knows which specific set of resources (the cluster) to use for running the data jobs. + +Have a look at [Dataproc's document on Create a cluster](https://cloud.google.com/dataproc/docs/guides/create-cluster) for an overview on how clusters work. + + + ### Account level connections and credential management You can re-use connections across multiple projects with [global connections](/docs/cloud/connect-data-platform/about-connections#migration-from-project-level-connections-to-account-level-connections). Connections are attached at the environment level (formerly project level), so you can utilize multiple connections inside of a single project (to handle dev, staging, production, etc.). @@ -147,3 +264,7 @@ For a project, you will first create an environment variable to store the secret "extended_attributes_id": FFFFF }' ``` + + + + diff --git a/website/docs/docs/cloud/dbt-cloud-ide/develop-in-the-cloud.md b/website/docs/docs/cloud/dbt-cloud-ide/develop-in-the-cloud.md index 398b0cff2a1..c9d2cbbad30 100644 --- a/website/docs/docs/cloud/dbt-cloud-ide/develop-in-the-cloud.md +++ b/website/docs/docs/cloud/dbt-cloud-ide/develop-in-the-cloud.md @@ -53,7 +53,7 @@ To understand how to navigate the IDE and its user interface elements, refer to | Feature | Description | |---|---| | [**Keyboard shortcuts**](/docs/cloud/dbt-cloud-ide/keyboard-shortcuts) | You can access a variety of [commands and actions](/docs/cloud/dbt-cloud-ide/keyboard-shortcuts) in the IDE by choosing the appropriate keyboard shortcut. Use the shortcuts for common tasks like building modified models or resuming builds from the last failure. | -| **IDE version control** | The IDE version control section and git button allow you to apply the concept of [version control](/docs/collaborate/git/version-control-basics) to your project directly into the IDE.

- Create or change branches, execute git commands using the git button.
- Commit or revert individual files by right-clicking the edited file
- [Resolve merge conflicts](/docs/collaborate/git/merge-conflicts)
- Link to the repo directly by clicking the branch name
- Edit, format, or lint files and execute dbt commands in your primary protected branch, and commit to a new branch.
- Use Git diff view to view what has been changed in a file before you make a pull request.
- From dbt version 1.6 and higher, use the **Prune branches** [button](/docs/cloud/dbt-cloud-ide/ide-user-interface#prune-branches-modal) to delete local branches that have been deleted from the remote repository, keeping your branch management tidy. | +| **IDE version control** | The IDE version control section and git button allow you to apply the concept of [version control](/docs/collaborate/git/version-control-basics) to your project directly into the IDE.

- Create or change branches and execute git commands using the git button.
- Commit or revert individual files by right-clicking the edited file
- [Resolve merge conflicts](/docs/collaborate/git/merge-conflicts)
- Link to the repo directly by clicking the branch name
- Edit, format, or lint files and execute dbt commands in your primary protected branch, and commit to a new branch.
- Use Git diff view to view what has been changed in a file before you make a pull request.
- Use the **Prune branches** [button](/docs/cloud/dbt-cloud-ide/ide-user-interface#prune-branches-modal) (dbt v1.6 and higher) to delete local branches that have been deleted from the remote repository, keeping your branch management tidy.
- Sign your [git commits](/docs/cloud/dbt-cloud-ide/git-commit-signing) to mark them as 'Verified'. | | **Preview and Compile button** | You can [compile or preview](/docs/cloud/dbt-cloud-ide/ide-user-interface#console-section) code, a snippet of dbt code, or one of your dbt models after editing and saving. | | [**dbt Copilot**](/docs/cloud/dbt-copilot) | A powerful AI engine that can generate documentation, tests, and semantic models for your dbt SQL models. Available for dbt Cloud Enterprise plans. | | **Build, test, and run button** | Build, test, and run your project with a button click or by using the Cloud IDE command bar. diff --git a/website/docs/docs/cloud/dbt-cloud-ide/git-commit-signing.md b/website/docs/docs/cloud/dbt-cloud-ide/git-commit-signing.md new file mode 100644 index 00000000000..afaa0751669 --- /dev/null +++ b/website/docs/docs/cloud/dbt-cloud-ide/git-commit-signing.md @@ -0,0 +1,80 @@ +--- +title: "Git commit signing" +description: "Learn how to sign your Git commits when using the IDE for development." +sidebar_label: Git commit signing +--- + +# Git commit signing + +To prevent impersonation and enhance security, you can sign your Git commits before pushing them to your repository. Using your signature, a Git provider can cryptographically verify a commit and mark it as "verified", providing increased confidence about its origin. + +You can configure dbt Cloud to sign your Git commits when using the IDE for development. To set up, enable the feature in dbt Cloud, follow the flow to generate a keypair, and upload the public key to your Git provider to use for signature verification. + + +## Prerequisites + +- GitHub or GitLab is your Git provider. Currently, Azure DevOps is not supported. +- You have a dbt Cloud account on the [Enterprise plan](https://www.getdbt.com/pricing/). + +## Generate GPG keypair in dbt Cloud + +To generate a GPG keypair in dbt Cloud, follow these steps: +1. Go to your **Personal profile** page in dbt Cloud. +2. Navigate to **Signed Commits** section. +3. Enable the **Sign commits originating from this user** toggle. +4. This will generate a GPG keypair. The private key will be used to sign all future Git commits. The public key will be displayed, allowing you to upload it to your Git provider. + + + +## Upload public key to Git provider + +To upload the public key to your Git provider, follow the detailed documentation provided by the supported Git provider: + +- [GitHub instructions](https://docs.github.com/en/authentication/managing-commit-signature-verification/adding-a-gpg-key-to-your-github-account) +- [GitLab instructions](https://docs.gitlab.com/ee/user/project/repository/signed_commits/gpg.html) + +Once you have uploaded the public key to your Git provider, your Git commits will be marked as "Verified" after you push the changes to the repository. + + + +## Considerations + +- The GPG keypair is tied to the user, not a specific account. There is a 1:1 relationship between the user and keypair. The same key will be used for signing commits on any accounts the user is a member of. +- The GPG keypair generated in dbt Cloud is linked to the email address associated with your account at the time of keypair creation. This email identifies the author of signed commits. +- For your Git commits to be marked as "verified", your dbt Cloud email address must be a verified email address with your Git provider. The Git provider (such as, GitHub, GitLab) checks that the commit's signed email matches a verified email in your Git provider account. 
If they don’t match, the commit won't be marked as "verified." +- Keep your dbt Cloud email and Git provider's verified email in sync to avoid verification issues. If you change your dbt Cloud email address: + - Generate a new GPG keypair with the updated email, following the [steps mentioned earlier](/docs/cloud/dbt-cloud-ide/git-commit-signing#generate-gpg-keypair-in-dbt-cloud). + - Add and verify the new email in your Git provider. + + + +## FAQs + + + + + +If you delete your GPG keypair in dbt Cloud, your Git commits will no longer be signed. You can generate a new GPG keypair by following the [steps mentioned earlier](/docs/cloud/dbt-cloud-ide/git-commit-signing#generate-gpg-keypair-in-dbt-cloud). + + + + +GitHub and GitLab support commit signing, while Azure DevOps does not. Commit signing is a [git feature](https://git-scm.com/book/ms/v2/Git-Tools-Signing-Your-Work), and is independent of any specific provider. However, not all providers support the upload of public keys, or the display of verification badges on commits. + + + + + +If your Git Provider does not explicitly support the uploading of public GPG keys, then +commits will still be signed using the private key, but no verification information will +be displayed by the provider. + + + + + +If your Git provider is configured to enforce commit verification, then unsigned commits +will be rejected. To avoid this, ensure that you have followed all previous steps to generate +a keypair, and uploaded the public key to the provider. + + diff --git a/website/docs/docs/cloud/dbt-cloud-ide/ide-user-interface.md b/website/docs/docs/cloud/dbt-cloud-ide/ide-user-interface.md index 8d80483485c..36c6cc898dc 100644 --- a/website/docs/docs/cloud/dbt-cloud-ide/ide-user-interface.md +++ b/website/docs/docs/cloud/dbt-cloud-ide/ide-user-interface.md @@ -35,7 +35,7 @@ The IDE streamlines your workflow, and features a popular user interface layout * Added (A) — The IDE detects added files * Deleted (D) — The IDE detects deleted files. - + 5. **Command bar —** The Command bar, located in the lower left of the IDE, is used to invoke [dbt commands](/reference/dbt-commands). When a command is invoked, the associated logs are shown in the Invocation History Drawer. @@ -107,15 +107,19 @@ Starting from dbt v1.6 or higher, when you save changes to a model, you can comp 3. **Build button —** The build button allows users to quickly access dbt commands related to the active model in the File Editor. The available commands include dbt build, dbt test, and dbt run, with options to include only the current resource, the resource and its upstream dependencies, the resource, and its downstream dependencies, or the resource with all dependencies. This menu is available for all executable nodes. -4. **Format button —** The editor has a **Format** button that can reformat the contents of your files. For SQL files, it uses either `sqlfmt` or `sqlfluff`, and for Python files, it uses `black`. +4. **Lint button** — The **Lint** button runs the [linter](/docs/cloud/dbt-cloud-ide/lint-format) on the active file in the File Editor. The linter checks for syntax errors and style issues in your code and displays the results in the **Code quality** tab. -5. **Results tab —** The Results console tab displays the most recent Preview results in tabular format. +5. **dbt Copilot** — [dbt Copilot](/docs/cloud/dbt-copilot) is a powerful artificial intelligence engine that can generate documentation, tests, and semantic models for you. 
dbt Copilot is available in the IDE for Enterprise plans. + +6. **Results tab —** The Results console tab displays the most recent Preview results in tabular format. -6. **Compiled Code tab —** The Compile button triggers a compile invocation that generates compiled code, which is displayed in the Compiled Code tab. +7. **Code quality tab** — The Code Quality tab displays the results of the linter on the active file in the File Editor. It allows you to view code errors, provides code quality visibility and management, and displays the SQLFluff version used. + +8. **Compiled Code tab —** The Compile generates the compiled code when the Compile button is executed. The Compiled Code tab displays the compiled SQL code for the active file in the File Editor. -7. **Lineage tab —** The Lineage tab in the File Editor displays the active model's lineage or . By default, it shows two degrees of lineage in both directions (`2+model_name+2`), however, you can change it to +model+ (full DAG). +9. **Lineage tab —** The Lineage tab in the File Editor displays the active model's lineage or . By default, it shows two degrees of lineage in both directions (`2+model_name+2`), however, you can change it to +model+ (full DAG). To use the lineage: - Double-click a node in the DAG to open that file in a new tab - Expand or shrink the DAG using node selection syntax. - Note, the `--exclude` flag isn't supported. @@ -158,11 +162,11 @@ Use menus and modals to interact with IDE and access useful options to help your - #### File Search You can easily search for and navigate between files using the File Navigation menu, which can be accessed by pressing Command-O or Control-O or clicking on the 🔍 icon in the File Explorer. - + - #### Global Command Palette The Global Command Palette provides helpful shortcuts to interact with the IDE, such as git actions, specialized dbt commands, and compile, and preview actions, among others. To open the menu, use Command-P or Control-P. - + - #### IDE Status modal The IDE Status modal shows the current error message and debug logs for the server. This also contains an option to restart the IDE. Open this by clicking on the IDE Status button. @@ -193,7 +197,7 @@ Use menus and modals to interact with IDE and access useful options to help your * Toggling between dark or light mode for a better viewing experience * Restarting the IDE - * Fully recloning your repository to refresh your git state and view status details + * Rollback your repo to remote, to refresh your git state and view status details * Viewing status details, including the IDE Status modal. - + diff --git a/website/docs/docs/cloud/git/authenticate-azure.md b/website/docs/docs/cloud/git/authenticate-azure.md index 42028bf993b..5278c134f72 100644 --- a/website/docs/docs/cloud/git/authenticate-azure.md +++ b/website/docs/docs/cloud/git/authenticate-azure.md @@ -13,9 +13,9 @@ If you use the dbt Cloud IDE or dbt Cloud CLI to collaborate on your team's Azur Connect your dbt Cloud profile to Azure DevOps using OAuth: -1. Click the gear icon at the top right and select **Profile settings**. -2. Click **Linked Accounts**. -3. Next to Azure DevOps, click **Link**. +1. Click your account name at the bottom of the left-side menu and click **Account settings** +2. Scroll down to **Your profile** and select **Personal profile**. +3. Go to the **Linked accounts** section in the middle of the page. 4. Once you're redirected to Azure DevOps, sign into your account. 
diff --git a/website/docs/docs/cloud/manage-access/audit-log.md b/website/docs/docs/cloud/manage-access/audit-log.md index 9c80adaf2f8..4d07afe2cde 100644 --- a/website/docs/docs/cloud/manage-access/audit-log.md +++ b/website/docs/docs/cloud/manage-access/audit-log.md @@ -32,7 +32,7 @@ On the audit log page, you will see a list of various events and their associate ### Event details -Click the event card to see the details about the activity that triggered the event. This view provides important details, including when it happened and what type of event was triggered. For example, if someone changes the settings for a job, you can use the event details to see which job was changed (type of event: `job_definition.Changed`), by whom (person who triggered the event: `actor`), and when (time it was triggered: `created_at_utc`). For types of events and their descriptions, see [Events in audit log](#events-in-audit-log). +Click the event card to see the details about the activity that triggered the event. This view provides important details, including when it happened and what type of event was triggered. For example, if someone changes the settings for a job, you can use the event details to see which job was changed (type of event: `job_definition.Changed`), by whom (person who triggered the event: `actor`), and when (time it was triggered: `created_at_utc`). For types of events and their descriptions, see [Events in audit log](#audit-log-events). The event details provide the key factors of an event: diff --git a/website/docs/docs/cloud/manage-access/invite-users.md b/website/docs/docs/cloud/manage-access/invite-users.md index c82e15fd48f..0922b4dc991 100644 --- a/website/docs/docs/cloud/manage-access/invite-users.md +++ b/website/docs/docs/cloud/manage-access/invite-users.md @@ -17,19 +17,16 @@ You must have proper permissions to invite new users: ## Invite new users -1. In your dbt Cloud account, select the gear menu in the upper right corner and then select **Account Settings**. -2. From the left sidebar, select **Users**. - - - -3. Click on **Invite Users**. +1. In your dbt Cloud account, select your account name in the bottom left corner. Then select **Account settings**. +2. Under **Settings**, select **Users**. +3. Click on **Invite users**. -4. In the **Email Addresses** field, enter the email addresses of the users you would like to invite separated by comma, semicolon, or a new line. +4. In the **Email Addresses** field, enter the email addresses of the users you want to invite separated by a comma, semicolon, or a new line. 5. Select the license type for the batch of users from the **License** dropdown. -6. Select the group(s) you would like the invitees to belong to. -7. Click **Send Invitations**. +6. Select the group(s) you want the invitees to belong to. +7. Click **Send invitations**. - If the list of invitees exceeds the number of licenses your account has available, you will receive a warning when you click **Send Invitations** and the invitations will not be sent. diff --git a/website/docs/docs/cloud/manage-access/mfa.md b/website/docs/docs/cloud/manage-access/mfa.md index a06251e6468..bcddc04f072 100644 --- a/website/docs/docs/cloud/manage-access/mfa.md +++ b/website/docs/docs/cloud/manage-access/mfa.md @@ -7,6 +7,13 @@ sidebar: null # Multi-factor authentication +:::important + + +dbt Cloud enforces multi-factor authentication (MFA) for all users with username and password credentials. 
If MFA is not set up, you will see a notification bar prompting you to configure one of the supported methods when you log in. If you do not, you will have to configure MFA upon subsequent logins, or you will be unable to access dbt Cloud. + +::: + dbt Cloud provides multiple options for multi-factor authentication (MFA). MFA provides an additional layer of security to username and password logins for Developer and Team plan accounts. The available MFA methods are: - SMS verification code (US-based phone numbers only) diff --git a/website/docs/docs/cloud/migration.md b/website/docs/docs/cloud/migration.md index 76d881e7389..2665b8f6a97 100644 --- a/website/docs/docs/cloud/migration.md +++ b/website/docs/docs/cloud/migration.md @@ -15,6 +15,13 @@ Your account will be automatically migrated on or after its scheduled date. Howe ## Recommended actions +:::info Rescheduling your migration + +If you're on the dbt Cloud Enterprise tier, you can postpone your account migration by up to 45 days. To reschedule your migration, navigate to **Account Settings** → **Migration guide**. + +For help, contact the dbt Support Team at [support@getdbt.com](mailto:support@getdbt.com). +::: + We highly recommended you take these actions: - Ensure pending user invitations are accepted or note outstanding invitations. Pending user invitations might be voided during the migration. You can resend user invitations after the migration is complete. diff --git a/website/docs/docs/cloud/secure/about-privatelink.md b/website/docs/docs/cloud/secure/about-privatelink.md index 731cef3f019..f19790fd708 100644 --- a/website/docs/docs/cloud/secure/about-privatelink.md +++ b/website/docs/docs/cloud/secure/about-privatelink.md @@ -7,10 +7,13 @@ sidebar_label: "About PrivateLink" import SetUpPages from '/snippets/_available-tiers-privatelink.md'; import PrivateLinkHostnameWarning from '/snippets/_privatelink-hostname-restriction.md'; +import CloudProviders from '/snippets/_privatelink-across-providers.md'; -PrivateLink enables a private connection from any dbt Cloud Multi-Tenant environment to your data platform hosted on AWS using [AWS PrivateLink](https://aws.amazon.com/privatelink/) technology. PrivateLink allows dbt Cloud customers to meet security and compliance controls as it allows connectivity between dbt Cloud and your data platform without traversing the public internet. This feature is supported in most regions across NA, Europe, and Asia, but [contact us](https://www.getdbt.com/contact/) if you have questions about availability. +PrivateLink enables a private connection from any dbt Cloud Multi-Tenant environment to your data platform hosted on a cloud provider, such as [AWS](https://aws.amazon.com/privatelink/) or [Azure](https://azure.microsoft.com/en-us/products/private-link), using that provider’s PrivateLink technology. PrivateLink allows dbt Cloud customers to meet security and compliance controls as it allows connectivity between dbt Cloud and your data platform without traversing the public internet. This feature is supported in most regions across NA, Europe, and Asia, but [contact us](https://www.getdbt.com/contact/) if you have questions about availability. 
+ + ### Cross-region PrivateLink diff --git a/website/docs/docs/cloud/secure/databricks-privatelink.md b/website/docs/docs/cloud/secure/databricks-privatelink.md index a02683e1269..d754f2b76c4 100644 --- a/website/docs/docs/cloud/secure/databricks-privatelink.md +++ b/website/docs/docs/cloud/secure/databricks-privatelink.md @@ -8,11 +8,14 @@ pagination_next: null import SetUpPages from '/snippets/_available-tiers-privatelink.md'; import PrivateLinkSLA from '/snippets/_PrivateLink-SLA.md'; +import CloudProviders from '/snippets/_privatelink-across-providers.md'; The following steps will walk you through the setup of a Databricks AWS PrivateLink or Azure Private Link endpoint in the dbt Cloud multi-tenant environment. + + ## Configure AWS PrivateLink 1. Locate your [Databricks instance name](https://docs.databricks.com/en/workspace/workspace-details.html#workspace-instance-names-urls-and-ids) diff --git a/website/docs/docs/cloud/secure/postgres-privatelink.md b/website/docs/docs/cloud/secure/postgres-privatelink.md index 76b7774fcec..4d670354686 100644 --- a/website/docs/docs/cloud/secure/postgres-privatelink.md +++ b/website/docs/docs/cloud/secure/postgres-privatelink.md @@ -7,11 +7,14 @@ sidebar_label: "PrivateLink for Postgres" import SetUpPages from '/snippets/_available-tiers-privatelink.md'; import PrivateLinkTroubleshooting from '/snippets/_privatelink-troubleshooting.md'; import PrivateLinkCrossZone from '/snippets/_privatelink-cross-zone-load-balancing.md'; +import CloudProviders from '/snippets/_privatelink-across-providers.md'; A Postgres database, hosted either in AWS or in a properly connected on-prem data center, can be accessed through a private network connection using AWS Interface-type PrivateLink. The type of Target Group connected to the Network Load Balancer (NLB) may vary based on the location and type of Postgres instance being connected, as explained in the following steps. + + ## Configuring Postgres interface-type PrivateLink ### 1. Provision AWS resources @@ -96,4 +99,4 @@ Once dbt Cloud support completes the configuration, you can start creating new c 4. Configure the remaining data platform details. 5. Test your connection and save it. - \ No newline at end of file + diff --git a/website/docs/docs/cloud/secure/redshift-privatelink.md b/website/docs/docs/cloud/secure/redshift-privatelink.md index 16d14badc05..75924cf76a9 100644 --- a/website/docs/docs/cloud/secure/redshift-privatelink.md +++ b/website/docs/docs/cloud/secure/redshift-privatelink.md @@ -8,6 +8,7 @@ sidebar_label: "PrivateLink for Redshift" import SetUpPages from '/snippets/_available-tiers-privatelink.md'; import PrivateLinkTroubleshooting from '/snippets/_privatelink-troubleshooting.md'; import PrivateLinkCrossZone from '/snippets/_privatelink-cross-zone-load-balancing.md'; +import CloudProviders from '/snippets/_privatelink-across-providers.md'; @@ -17,6 +18,8 @@ AWS provides two different ways to create a PrivateLink VPC endpoint for a Redsh dbt Cloud supports both types of endpoints, but there are a number of [considerations](https://docs.aws.amazon.com/redshift/latest/mgmt/managing-cluster-cross-vpc.html#managing-cluster-cross-vpc-considerations) to take into account when deciding which endpoint type to use. Redshift-managed provides a far simpler setup with no additional cost, which might make it the preferred option for many, but may not be an option in all environments. Based on these criteria, you will need to determine which is the right type for your system. 
Follow the instructions from the section below that corresponds to your chosen endpoint type. + + :::note Redshift Serverless While Redshift Serverless does support Redshift-managed type VPC endpoints, this functionality is not currently available across AWS accounts. Due to this limitation, an Interface-type VPC endpoint service must be used for Redshift Serverless cluster PrivateLink connectivity from dbt Cloud. ::: @@ -125,4 +128,4 @@ Once dbt Cloud support completes the configuration, you can start creating new c 4. Configure the remaining data platform details. 5. Test your connection and save it. - \ No newline at end of file + diff --git a/website/docs/docs/cloud/secure/snowflake-privatelink.md b/website/docs/docs/cloud/secure/snowflake-privatelink.md index c6775be2444..b943791292f 100644 --- a/website/docs/docs/cloud/secure/snowflake-privatelink.md +++ b/website/docs/docs/cloud/secure/snowflake-privatelink.md @@ -6,11 +6,14 @@ sidebar_label: "PrivateLink for Snowflake" --- import SetUpPages from '/snippets/_available-tiers-privatelink.md'; +import CloudProviders from '/snippets/_privatelink-across-providers.md'; The following steps walk you through the setup of a Snowflake AWS PrivateLink or Azure Private Link endpoint in a dbt Cloud multi-tenant environment. + + :::note Snowflake SSO with PrivateLink Users connecting to Snowflake using SSO over a PrivateLink connection from dbt Cloud will also require access to a PrivateLink endpoint from their local workstation. diff --git a/website/docs/docs/collaborate/data-tile.md b/website/docs/docs/collaborate/data-tile.md index 70318922d68..1d5b26e26b7 100644 --- a/website/docs/docs/collaborate/data-tile.md +++ b/website/docs/docs/collaborate/data-tile.md @@ -92,7 +92,7 @@ Follow these steps to embed the data health tile in PowerBI: ```html Website = - "" + "" ``` @@ -120,12 +120,34 @@ Follow these steps to embed the data health tile in Tableau: 3. Insert a **Web Page** object. 4. Insert the URL and click **Ok**. - `https://metadata.cloud.getdbt.com/exposure-tile?uniqueId=exposure.snowflake_tpcds_sales_spoke.customer360_test&environmentType=production&environmentId=220370&token=` - + ```html + https://metadata.ACCESS_URL/exposure-tile?uniqueId=exposure.EXPOSURE_NAME&environmentType=production&environmentId=220370&token= + ``` + *Note, replace the placeholders with your actual values.* 5. You should now see the data health tile embedded in your Tableau dashboard. + + + +Follow these steps to embed the data health tile in Sigma: + + + +1. Create a dashboard in Sigma and connect to your database to pull in the data. +2. Ensure you've copied the URL or iFrame snippet available in dbt Explorer's **Data health** section, under the **Embed data health into your dashboard** toggle. +3. Add a new embedded UI element in your Sigma Workbook in the following format: + + ```html + https://metadata.ACCESS_URL/exposure-tile?uniqueId=exposure.EXPOSURE_NAME&environmentType=production&environmentId=ENV_ID_NUMBER&token= + ``` + + *Note, replace the placeholders with your actual values.* +4. You should now see the data health tile embedded in your Sigma dashboard. 
+ + + ## Job-based data health diff --git a/website/docs/docs/collaborate/govern/project-dependencies.md b/website/docs/docs/collaborate/govern/project-dependencies.md index c054d1b27b7..7813e25efcb 100644 --- a/website/docs/docs/collaborate/govern/project-dependencies.md +++ b/website/docs/docs/collaborate/govern/project-dependencies.md @@ -18,7 +18,7 @@ This year, dbt Labs is introducing an expanded notion of `dependencies` across m ## Prerequisites - Available in [dbt Cloud Enterprise](https://www.getdbt.com/pricing). If you have an Enterprise account, you can unlock these features by designating a [public model](/docs/collaborate/govern/model-access) and adding a [cross-project ref](#how-to-write-cross-project-ref). -- Use a supported version of dbt (v1.6, v1.7, or go versionless with "[Versionless](/docs/dbt-versions/upgrade-dbt-version-in-cloud#versionless)") for both the upstream ("producer") project and the downstream ("consumer") project. +- Use a supported version of dbt (v1.6 or newer or go versionless with "[Versionless](/docs/dbt-versions/upgrade-dbt-version-in-cloud#versionless)") for both the upstream ("producer") project and the downstream ("consumer") project. - Define models in an upstream ("producer") project that are configured with [`access: public`](/reference/resource-configs/access). You need at least one successful job run after defining their `access`. - Define a deployment environment in the upstream ("producer") project [that is set to be your Production environment](/docs/deploy/deploy-environments#set-as-production-environment), and ensure it has at least one successful job run in that environment. - If the upstream project has a Staging environment, run a job in that Staging environment to ensure the downstream cross-project ref resolves. diff --git a/website/docs/docs/collaborate/project-recommendations.md b/website/docs/docs/collaborate/project-recommendations.md index 12007c6b88b..c9499579e54 100644 --- a/website/docs/docs/collaborate/project-recommendations.md +++ b/website/docs/docs/collaborate/project-recommendations.md @@ -20,7 +20,7 @@ The Recommendations overview page includes two top-level metrics measuring the t - **Model test coverage** — The percent of models in your project (models not from a package or imported via dbt Mesh) with at least one dbt test configured on them. - **Model documentation coverage** — The percent of models in your project (models not from a package or imported via dbt Mesh) with a description. - + ## List of rules The following table lists the rules currently defined in the `dbt_project_evaluator` [package](https://hub.getdbt.com/dbt-labs/dbt_project_evaluator/latest/). diff --git a/website/docs/docs/core/connect-data-platform/about-core-connections.md b/website/docs/docs/core/connect-data-platform/about-core-connections.md index 461aeea2e87..221f495d054 100644 --- a/website/docs/docs/core/connect-data-platform/about-core-connections.md +++ b/website/docs/docs/core/connect-data-platform/about-core-connections.md @@ -32,8 +32,6 @@ If you're using dbt from the command line (CLI), you'll need a profiles.yml file For detailed info, you can refer to the [Connection profiles](/docs/core/connect-data-platform/connection-profiles). - - ## Adapter features The following table lists the features available for adapters: @@ -55,5 +53,3 @@ For adapters that support it, you can partially build the catalog. This allows t ### Source freshness You can measure source freshness using the warehouse metadata tables on supported adapters. 
This allows for calculating source freshness without using the [`loaded_at_field`](/reference/resource-properties/freshness#loaded_at_field) and without querying the table directly. This is faster and more flexible (though it might sometimes be inaccurate, depending on how the warehouse tracks altered tables). You can override this with the `loaded_at_field` in the [source config](/reference/source-configs). If the adapter doesn't support this, you can still use the `loaded_at_field`. - - diff --git a/website/docs/docs/core/connect-data-platform/bigquery-setup.md b/website/docs/docs/core/connect-data-platform/bigquery-setup.md index eedc3646f89..8b1867ef620 100644 --- a/website/docs/docs/core/connect-data-platform/bigquery-setup.md +++ b/website/docs/docs/core/connect-data-platform/bigquery-setup.md @@ -390,9 +390,9 @@ my-profile: ### Running Python models on Dataproc -To run dbt Python models on GCP, dbt uses companion services, Dataproc and Cloud Storage, that offer tight integrations with BigQuery. You may use an existing Dataproc cluster and Cloud Storage bucket, or create new ones: -- https://cloud.google.com/dataproc/docs/guides/create-cluster -- https://cloud.google.com/storage/docs/creating-buckets +import BigQueryDataproc from '/snippets/_bigquery-dataproc.md'; + + Then, add the bucket name, cluster name, and cluster region to your connection profile: diff --git a/website/docs/docs/core/connect-data-platform/glue-setup.md b/website/docs/docs/core/connect-data-platform/glue-setup.md index f2cf717147a..a074038a87f 100644 --- a/website/docs/docs/core/connect-data-platform/glue-setup.md +++ b/website/docs/docs/core/connect-data-platform/glue-setup.md @@ -175,7 +175,7 @@ Please to update variables between **`<>`**, here are explanations of these argu ### Configuration of the local environment -Because **`dbt`** and **`dbt-glue`** adapters are compatible with Python versions 3.7, 3.8, and 3.9, check the version of Python: +Because **`dbt`** and **`dbt-glue`** adapters are compatible with Python versions 3.9 or higher, check the version of Python: ```bash $ python3 --version diff --git a/website/docs/docs/core/connect-data-platform/spark-setup.md b/website/docs/docs/core/connect-data-platform/spark-setup.md index 01318211c8f..611642e91b7 100644 --- a/website/docs/docs/core/connect-data-platform/spark-setup.md +++ b/website/docs/docs/core/connect-data-platform/spark-setup.md @@ -197,14 +197,9 @@ connect_retries: 3 - - - - ### Server side configuration Spark can be customized using [Application Properties](https://spark.apache.org/docs/latest/configuration.html). Using these properties the execution can be customized, for example, to allocate more memory to the driver process. Also, the Spark SQL runtime can be set through these properties. For example, this allows the user to [set a Spark catalogs](https://spark.apache.org/docs/latest/configuration.html#spark-sql). 
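As a rough, non-authoritative sketch of how server-side properties might be passed from a dbt profile, the snippet below assumes your chosen connection method accepts the `server_side_parameters` field; the host, schema, and property values are placeholders, so check the options for your specific connection method before relying on them:

```yaml
# Illustrative sketch only: passing Spark Application Properties from a profile.
# The target, host, schema, and property values below are placeholders.
spark_profile:
  target: dev
  outputs:
    dev:
      type: spark
      method: thrift                     # assumes a method that supports server-side properties
      host: example-spark-host
      port: 10001
      schema: analytics
      server_side_parameters:
        "spark.driver.memory": "4g"               # allocate more memory to the driver process
        "spark.sql.session.timeZone": "UTC"       # example Spark SQL runtime property
```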
- ## Caveats diff --git a/website/docs/docs/core/connect-data-platform/teradata-setup.md b/website/docs/docs/core/connect-data-platform/teradata-setup.md index df32b07bd0e..7b964b23b3d 100644 --- a/website/docs/docs/core/connect-data-platform/teradata-setup.md +++ b/website/docs/docs/core/connect-data-platform/teradata-setup.md @@ -26,20 +26,17 @@ import SetUpPages from '/snippets/_setup-pages-intro.md'; ## Python compatibility -| Plugin version | Python 3.6 | Python 3.7 | Python 3.8 | Python 3.9 | Python 3.10 | Python 3.11 | -| -------------- | ----------- | ----------- | ----------- | ----------- | ----------- | ------------ | -| 0.19.0.x | ✅ | ✅ | ✅ | ❌ | ❌ | ❌ -| 0.20.0.x | ✅ | ✅ | ✅ | ✅ | ❌ | ❌ -| 0.21.1.x | ✅ | ✅ | ✅ | ✅ | ❌ | ❌ -| 1.0.0.x | ❌ | ✅ | ✅ | ✅ | ❌ | ❌ -|1.1.x.x | ❌ | ✅ | ✅ | ✅ | ✅ | ❌ -|1.2.x.x | ❌ | ✅ | ✅ | ✅ | ✅ | ❌ -|1.3.x.x | ❌ | ✅ | ✅ | ✅ | ✅ | ❌ -|1.4.x.x | ❌ | ✅ | ✅ | ✅ | ✅ | ✅ -|1.5.x | ❌ | ✅ | ✅ | ✅ | ✅ | ✅ -|1.6.x | ❌ | ❌ | ✅ | ✅ | ✅ | ✅ -|1.7.x | ❌ | ❌ | ✅ | ✅ | ✅ | ✅ -|1.8.x | ❌ | ❌ | ✅ | ✅ | ✅ | ✅ +| Plugin version | Python 3.9 | Python 3.10 | Python 3.11 | +| -------------- | ----------- | ----------- | ------------ | +|1.0.0.x | ✅ | ❌ | ❌ +|1.1.x.x | ✅ | ✅ | ❌ +|1.2.x.x | ✅ | ✅ | ❌ +|1.3.x.x | ✅ | ✅ | ❌ +|1.4.x.x | ✅ | ✅ | ✅ +|1.5.x | ✅ | ✅ | ✅ +|1.6.x | ✅ | ✅ | ✅ +|1.7.x | ✅ | ✅ | ✅ +|1.8.x | ✅ | ✅ | ✅ ## dbt dependent packages version compatibility diff --git a/website/docs/docs/dbt-cloud-apis/apis-overview.md b/website/docs/docs/dbt-cloud-apis/apis-overview.md index 055edea72b6..05964ace871 100644 --- a/website/docs/docs/dbt-cloud-apis/apis-overview.md +++ b/website/docs/docs/dbt-cloud-apis/apis-overview.md @@ -20,4 +20,4 @@ If you want to learn more about webhooks, refer to [Webhooks for your jobs](/doc ## How to Access the APIs -dbt Cloud supports two types of API Tokens: [user tokens](/docs/dbt-cloud-apis/user-tokens) and [service account tokens](/docs/dbt-cloud-apis/service-tokens). Requests to the dbt Cloud APIs can be authorized using these tokens. +dbt Cloud supports two types of API Tokens: [personal access tokens](/docs/dbt-cloud-apis/user-tokens) and [service account tokens](/docs/dbt-cloud-apis/service-tokens). Requests to the dbt Cloud APIs can be authorized using these tokens. diff --git a/website/docs/docs/dbt-cloud-apis/authentication.md b/website/docs/docs/dbt-cloud-apis/authentication.md index 8729cc0641d..43a08d84fd7 100644 --- a/website/docs/docs/dbt-cloud-apis/authentication.md +++ b/website/docs/docs/dbt-cloud-apis/authentication.md @@ -8,7 +8,7 @@ pagination_prev: null
@@ -23,9 +23,7 @@ pagination_prev: null ## Types of API access tokens -**User API keys (Legacy):** User API keys were historically the only method available to access dbt Cloud APIs on the user’s behalf. They are scoped to the user and not the account. User API Keys will eventually be deprecated for the more secure personal access tokens. - -**Personal access tokens (New):** Personal access tokens (PATs) are the new, preferred, and secure way of accessing dbt Cloud APIs on behalf of a user. They are more secure than user API Keys. PATs are scoped to an account and can be enhanced with more granularity and control. +**Personal access tokens:** The preferred and secure way of accessing dbt Cloud APIs on behalf of a user. PATs are scoped to an account and can be enhanced with more granularity and control. **Service tokens:** Service tokens are similar to service accounts and are the preferred method to enable access on behalf of the dbt Cloud account. @@ -33,7 +31,7 @@ pagination_prev: null You should use service tokens broadly for any production workflow where you need a service account. You should use PATs only for developmental workflows _or_ dbt Cloud client workflows that require user context. The following examples show you when to use a personal access token (PAT) or a service token: -* **Connecting a partner integration to dbt Cloud** — Some examples include the [dbt Semantic Layer Google Sheets integration](/docs/cloud-integrations/avail-sl-integrations), Hightouch, Datafold, a custom app you’ve created, etc. These types of integrations should use a service token instead of a PAT because service tokens give you visibility, and you can scope them to only what the integration needs and ensure the least privilege. We highly recommend switching to a service token if you’re using a user API key for these integrations today. +* **Connecting a partner integration to dbt Cloud** — Some examples include the [dbt Semantic Layer Google Sheets integration](/docs/cloud-integrations/avail-sl-integrations), Hightouch, Datafold, a custom app you’ve created, etc. These types of integrations should use a service token instead of a PAT because service tokens give you visibility, and you can scope them to only what the integration needs and ensure the least privilege. We highly recommend switching to a service token if you’re using a personal access token for these integrations today. * **Production Terraform** — Use a service token since this is a production workflow and is acting as a service account and not a user account. * **Cloud CLI** — Use a PAT since the dbt Cloud CLI works within the context of a user (the user is making the requests and has to operate within the context of their user account). * **Testing a custom script and staging Terraform or Postman** — We recommend using a PAT as this is a developmental workflow and is scoped to the user making the changes. When you push this script or Terraform into production, use a service token instead.
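As a quick, non-authoritative sketch of how either token type is used, a request to the dbt Cloud Administrative API passes the token in the `Authorization` header; the host, account ID, and endpoint below are placeholders, so confirm the exact access URL and header scheme for your plan and region in the API reference:

```bash
# Illustrative only: list jobs in an account using a service token or PAT.
# Replace cloud.getdbt.com with your access URL and 12345 with your account ID.
export DBT_CLOUD_API_TOKEN="<your-token>"

curl --request GET \
  --url "https://cloud.getdbt.com/api/v2/accounts/12345/jobs/" \
  --header "Authorization: Token ${DBT_CLOUD_API_TOKEN}" \
  --header "Content-Type: application/json"
```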
diff --git a/website/docs/docs/dbt-cloud-apis/discovery-api.md b/website/docs/docs/dbt-cloud-apis/discovery-api.md index ca84347ffad..db6819a5e09 100644 --- a/website/docs/docs/dbt-cloud-apis/discovery-api.md +++ b/website/docs/docs/dbt-cloud-apis/discovery-api.md @@ -32,7 +32,7 @@ Use the API to look at historical information like model build time to determine You can use, for example, the [model timing](/docs/deploy/run-visibility#model-timing) tab to help identify and optimize bottlenecks in model builds: - + @@ -50,7 +50,7 @@ Use the API to find and understand dbt assets in integrated tools using informat Data producers must manage and organize data for stakeholders, while data consumers need to quickly and confidently analyze data on a large scale to make informed decisions that improve business outcomes and reduce organizational overhead. The API is useful for discovery data experiences in catalogs, analytics, apps, and machine learning (ML) tools. It can help you understand the origin and meaning of datasets for your analysis. - + @@ -65,7 +65,6 @@ Use the API to review who developed the models and who uses them to help establi Use the API to review dataset changes and uses by examining exposures, lineage, and dependencies. From the investigation, you can learn how to define and build more effective dbt projects. For more details, refer to [Development](/docs/dbt-cloud-apis/discovery-use-cases-and-examples#development). - diff --git a/website/docs/docs/dbt-cloud-apis/service-tokens.md b/website/docs/docs/dbt-cloud-apis/service-tokens.md index 93a89922456..a077b230c28 100644 --- a/website/docs/docs/dbt-cloud-apis/service-tokens.md +++ b/website/docs/docs/dbt-cloud-apis/service-tokens.md @@ -12,7 +12,7 @@ If you have service tokens created on or before July 18, 2023, please read [this ::: -Service account tokens enable you to securely authenticate with the dbt Cloud API by assigning each token a narrow set of permissions that more precisely manages access to the API. While similar to [User API tokens](user-tokens), service account tokens belong to an account rather than a user. +Service account tokens enable you to securely authenticate with the dbt Cloud API by assigning each token a narrow set of permissions that more precisely manages access to the API. While similar to [personal access tokens](user-tokens), service account tokens belong to an account rather than a user. You can use service account tokens for system-level integrations that do not run on behalf of any one user. Assign any permission sets available in dbt Cloud to your service account token, which can vary slightly depending on your plan: @@ -39,8 +39,8 @@ You can assign service account tokens to any permission set available in dbt Clo The following permissions can be assigned to a service account token on a Team plan. Refer to [Enterprise permissions](/docs/cloud/manage-access/enterprise-permissions) for more information about these roles. - Account Admin — Account Admin service tokens have full `read + write` access to an account, so please use them with caution. A Team plan refers to this permission set as an "Owner role." 
+- Billing Admin - Job Admin -- Job Runner - Metadata Only - Member - Read-only @@ -59,6 +59,7 @@ Refer to [Enterprise permissions](/docs/cloud/manage-access/enterprise-permissions) for more information about these roles. - Developer - Git Admin - Job Admin +- Job Runner - Job Viewer - Manage marketplace apps - Metadata Only diff --git a/website/docs/docs/dbt-cloud-apis/sl-jdbc.md index 3a9832dd706..9178d1e6592 100644 --- a/website/docs/docs/dbt-cloud-apis/sl-jdbc.md +++ b/website/docs/docs/dbt-cloud-apis/sl-jdbc.md @@ -56,7 +56,7 @@ The Semantic Layer JDBC API has built-in metadata calls which can provide a user Expand the following toggles for examples and metadata commands: - + You can use this query to fetch all defined metrics in your dbt project: @@ -65,9 +65,9 @@ select * from {{ semantic_layer.metrics() }} ``` - + - + You can use this query to fetch all dimensions for a metric. @@ -77,9 +77,9 @@ Note, metrics is a required argument that lists one or multiple metrics in it. select * from {{ semantic_layer.dimensions(metrics=['food_order_amount'])}} ``` - + - + You can use this query to fetch dimension values for one or multiple metrics and a single dimension. @@ -89,9 +89,9 @@ Note, metrics is a required argument that lists one or multiple metrics, and a s select * from {{ semantic_layer.dimension_values(metrics=['food_order_amount'], group_by=['customer__customer_name'])}} ``` - + - + You can use this query to fetch queryable granularities for a list of metrics. @@ -103,9 +103,9 @@ select * from {{ semantic_layer.queryable_granularities(metrics=['food_order_amount', 'order_gross_profit'])}} ``` - + - + You can use this query to fetch available metrics given dimensions. This command is essentially the opposite of getting dimensions given a list of metrics. @@ -117,9 +117,9 @@ select * from {{ }} ``` - + - + You can use this example query to fetch available granularities for all time dimensions (the similar queryable granularities API call only returns granularities for the primary time dimensions for metrics). @@ -133,9 +133,9 @@ select NAME, QUERYABLE_GRANULARITIES from {{ }} ``` - + - + It may be useful in your application to expose the names of the time dimensions that represent metric_time or the common thread across all metrics. @@ -147,9 +147,44 @@ select * from {{ }} ``` - + + + + +You can filter your metrics to include only those that contain a specific substring (a sequence of characters within a larger string). Use the `search` argument to specify the substring you want to match. + +```sql +select * from {{ semantic_layer.metrics(search='order') }} +``` + +If no substring is provided, the query returns all metrics. - + + + + +When you don't want to return the full result set from a metadata call, you can paginate the results for both `semantic_layer.metrics()` and `semantic_layer.dimensions()` calls using the `page_size` and `page_number` parameters. + +- `page_size`: This optional parameter sets the number of records per page. If left as `None`, there is no page limit. +- `page_number`: This optional parameter specifies the page number to retrieve. Defaults to `1` (the first page) if not specified.
+ +Examples: + +```sql +-- Retrieves the 5th page with a page size of 10 metrics +select * from {{ semantic_layer.metrics(page_size=10, page_number=5) }} + +-- Retrieves the 1st page with a page size of 10 metrics +select * from {{ semantic_layer.metrics(page_size=10) }} + +-- Retrieves all metrics without pagination +select * from {{ semantic_layer.metrics() }} +``` + +You can use the same pagination parameters for `semantic_layer.dimensions(...)`. + + + You can use this example query to list all available saved queries in your dbt project. @@ -165,7 +200,7 @@ select * from semantic_layer.saved_queries() | NAME | DESCRIPTION | LABEL | METRICS | GROUP_BY | WHERE_FILTER | ``` - + - - The run_results.json includes three attributes related to the `applied` state that complement `unique_id`: - `compiled`: Boolean entry of the node compilation status (`False` after parsing, but `True` after compiling). @@ -195,5 +193,3 @@ Here's a printed snippet from the `run_results.json`: } ], ``` - - diff --git a/website/docs/reference/commands/cmd-docs.md b/website/docs/reference/commands/cmd-docs.md index f20da08a4ae..03e11ae89f0 100644 --- a/website/docs/reference/commands/cmd-docs.md +++ b/website/docs/reference/commands/cmd-docs.md @@ -20,8 +20,6 @@ The command is responsible for generating your project's documentation website b dbt docs generate ``` - - Use the `--select` argument to limit the nodes included within `catalog.json`. When this flag is provided, step (3) will be restricted to the selected nodes. All other nodes will be excluded. Step (2) is unaffected. **Example**: @@ -30,8 +28,6 @@ Use the `--select` argument to limit the nodes included within `catalog.json`. W dbt docs generate --select +orders ``` - - Use the `--no-compile` argument to skip re-compilation. When this flag is provided, `dbt docs generate` will skip step (2) described above. **Example**: diff --git a/website/docs/reference/commands/deps.md b/website/docs/reference/commands/deps.md index 85c103e6337..0cb8e50f7a6 100644 --- a/website/docs/reference/commands/deps.md +++ b/website/docs/reference/commands/deps.md @@ -58,8 +58,6 @@ Updates available for packages: ['tailsdotcom/dbt_artifacts', 'dbt-labs/snowplow Update your versions in packages.yml, then run dbt deps ``` - - ## Predictable package installs Starting in dbt Core v1.7, dbt generates a `package-lock.yml` file in the root of your project. This contains the complete set of resolved packages based on the `packages` configuration in `dependencies.yml` or `packages.yml`. Each subsequent invocation of `dbt deps` will install from the _locked_ set of packages specified in this file. Storing the complete set of required packages (with pinned versions) in version-controlled code ensures predictable installs in production and consistency across all developers and environments. @@ -97,5 +95,3 @@ dbt deps --add-package https://github.com/fivetran/dbt_amplitude@v0.3.0 --source # add package from local dbt deps --add-package /opt/dbt/redshift --source local ``` - - diff --git a/website/docs/reference/commands/init.md b/website/docs/reference/commands/init.md index 8945eb823db..112fff63a38 100644 --- a/website/docs/reference/commands/init.md +++ b/website/docs/reference/commands/init.md @@ -17,15 +17,10 @@ Then, it will: - Create a new folder with your project name and sample files, enough to get you started with dbt - Create a connection profile on your local machine. The default location is `~/.dbt/profiles.yml`. 
Read more in [configuring your profile](/docs/core/connect-data-platform/connection-profiles). - - When using `dbt init` to initialize your project, include the `--profile` flag to specify an existing `profiles.yml` as the `profile:` key to use instead of creating a new one. For example, `dbt init --profile profile_name`. - - If the profile does not exist in `profiles.yml` or the command is run inside an existing project, the command raises an error. - ## Existing project diff --git a/website/docs/reference/dbt-commands.md b/website/docs/reference/dbt-commands.md index 8386cf61731..ca9a7725eb2 100644 --- a/website/docs/reference/dbt-commands.md +++ b/website/docs/reference/dbt-commands.md @@ -11,7 +11,7 @@ A key distinction with the tools mentioned, is that dbt Cloud CLI and IDE are de ## Parallel execution -dbt Cloud allows for parallel execution of commands, enhancing efficiency without compromising data integrity. This enables you to run multiple commands at the same time, however it's important to understand which commands can be run in parallel and which can't. +dbt Cloud allows for concurrent execution of commands, enhancing efficiency without compromising data integrity. This enables you to run multiple commands at the same time. However, it's important to understand which commands can be run in parallel and which can't. In contrast, [`dbt-core` _doesn't_ support](/reference/programmatic-invocations#parallel-execution-not-supported) safe parallel execution for multiple invocations in the same process, and requires users to manage concurrency manually to ensure data integrity and system stability. diff --git a/website/docs/reference/dbt_project.yml.md b/website/docs/reference/dbt_project.yml.md index e7cd5bbeb79..1bb9dd2cf9c 100644 --- a/website/docs/reference/dbt_project.yml.md +++ b/website/docs/reference/dbt_project.yml.md @@ -14,8 +14,6 @@ Every [dbt project](/docs/build/projects) needs a `dbt_project.yml` file — thi The following example is a list of all available configurations in the `dbt_project.yml` file: - - ```yml @@ -94,77 +92,6 @@ vars: ``` - - - - - - -```yml -[name](/reference/project-configs/name): string - -[config-version](/reference/project-configs/config-version): 2 -[version](/reference/project-configs/version): version - -[profile](/reference/project-configs/profile): profilename - -[model-paths](/reference/project-configs/model-paths): [directorypath] -[seed-paths](/reference/project-configs/seed-paths): [directorypath] -[test-paths](/reference/project-configs/test-paths): [directorypath] -[analysis-paths](/reference/project-configs/analysis-paths): [directorypath] -[macro-paths](/reference/project-configs/macro-paths): [directorypath] -[snapshot-paths](/reference/project-configs/snapshot-paths): [directorypath] -[docs-paths](/reference/project-configs/docs-paths): [directorypath] -[asset-paths](/reference/project-configs/asset-paths): [directorypath] - -[packages-install-path](/reference/project-configs/packages-install-path): directorypath - -[clean-targets](/reference/project-configs/clean-targets): [directorypath] - -[query-comment](/reference/project-configs/query-comment): string - -[require-dbt-version](/reference/project-configs/require-dbt-version): version-range | [version-range] - -[dbt-cloud](/docs/cloud/cloud-cli-installation): - [project-id](/docs/cloud/configure-cloud-cli#configure-the-dbt-cloud-cli): project_id # Required - [defer-env-id](/docs/cloud/about-cloud-develop-defer#defer-in-dbt-cloud-cli): environment_id # Optional - 
-[quoting](/reference/project-configs/quoting): - database: true | false - schema: true | false - identifier: true | false - -models: - [](/reference/model-configs) - -seeds: - [](/reference/seed-configs) - -snapshots: - [](/reference/snapshot-configs) - -sources: - [](source-configs) - -tests: - [](/reference/data-test-configs) - -vars: - [](/docs/build/project-variables) - -[on-run-start](/reference/project-configs/on-run-start-on-run-end): sql-statement | [sql-statement] -[on-run-end](/reference/project-configs/on-run-start-on-run-end): sql-statement | [sql-statement] - -[dispatch](/reference/project-configs/dispatch-config): - - macro_namespace: packagename - search_order: [packagename] - -[restrict-access](/docs/collaborate/govern/model-access): true | false - -``` - - - ## Naming convention diff --git a/website/docs/reference/global-configs/adapter-behavior-changes.md b/website/docs/reference/global-configs/adapter-behavior-changes.md index bd0ba9f7404..a755f8cfe50 100644 --- a/website/docs/reference/global-configs/adapter-behavior-changes.md +++ b/website/docs/reference/global-configs/adapter-behavior-changes.md @@ -14,10 +14,17 @@ Some adapters can display behavior changes when certain flags are enabled. The f
+ + + -
\ No newline at end of file +
diff --git a/website/docs/reference/global-configs/behavior-changes.md b/website/docs/reference/global-configs/behavior-changes.md index ae109b8f7c7..299674ae9c1 100644 --- a/website/docs/reference/global-configs/behavior-changes.md +++ b/website/docs/reference/global-configs/behavior-changes.md @@ -4,6 +4,8 @@ id: "behavior-changes" sidebar: "Behavior changes" --- +import StateModified from '/snippets/_state-modified-compare.md'; + Most flags exist to configure runtime behaviors with multiple valid choices. The right choice may vary based on the environment, user preference, or the specific invocation. Another category of flags provides existing projects with a migration window for runtime behaviors that are changing in newer releases of dbt. These flags help us achieve a balance between these goals, which can otherwise be in tension, by: @@ -56,6 +58,7 @@ flags: require_model_names_without_spaces: False source_freshness_run_project_hooks: False restrict_direct_pg_catalog_access: False + require_yaml_configuration_for_mf_time_spines: False ``` @@ -64,12 +67,13 @@ When we use dbt Cloud in the following table, we're referring to accounts that h | Flag | dbt Cloud: Intro | dbt Cloud: Maturity | dbt Core: Intro | dbt Core: Maturity | |-----------------------------------------------------------------|------------------|---------------------|-----------------|--------------------| -| require_explicit_package_overrides_for_builtin_materializations | 2024.04 | 2024.06 | 1.6.14, 1.7.14 | 1.8.0 | -| require_resource_names_without_spaces | 2024.05 | TBD* | 1.8.0 | 1.9.0 | -| source_freshness_run_project_hooks | 2024.03 | TBD* | 1.8.0 | 1.9.0 | +| [require_explicit_package_overrides_for_builtin_materializations](#package-override-for-built-in-materialization) | 2024.04 | 2024.06 | 1.6.14, 1.7.14 | 1.8.0 | +| [require_resource_names_without_spaces](#no-spaces-in-resource-names) | 2024.05 | TBD* | 1.8.0 | 1.9.0 | +| [source_freshness_run_project_hooks](#project-hooks-with-source-freshness) | 2024.03 | TBD* | 1.8.0 | 1.9.0 | | [Redshift] [restrict_direct_pg_catalog_access](/reference/global-configs/redshift-changes#the-restrict_direct_pg_catalog_access-flag) | 2024.09 | TBD* | dbt-redshift v1.9.0 | 1.9.0 | -| skip_nodes_if_on_run_start_fails | 2024.10 | TBD* | 1.9.0 | TBD* | -| state_modified_compare_more_unrendered_values | 2024.10 | TBD* | 1.9.0 | TBD* | +| [skip_nodes_if_on_run_start_fails](#failures-in-on-run-start-hooks) | 2024.10 | TBD* | 1.9.0 | TBD* | +| [state_modified_compare_more_unrendered_values](#source-definitions-for-state) | 2024.10 | TBD* | 1.9.0 | TBD* | +| [require_yaml_configuration_for_mf_time_spines](#metricflow-time-spine-yaml) | 2024.10 | TBD* | 1.9.0 | TBD* | When the dbt Cloud Maturity is "TBD," it means we have not yet determined the exact date when these flags' default values will change. Affected users will see deprecation warnings in the meantime, and they will receive emails providing advance warning ahead of the maturity date. In the meantime, if you are seeing a deprecation warning, you can either: - Migrate your project to support the new behavior, and then set the flag to `True` to stop seeing the warnings. @@ -83,13 +87,18 @@ Set the `skip_nodes_if_on_run_start_fails` flag to `True` to skip all selected r ### Source definitions for state:modified +:::info + + + +::: + The flag is `False` by default. 
Set `state_modified_compare_more_unrendered_values` to `True` to reduce false positives during `state:modified` checks (especially when configs differ by target environment like `prod` vs. `dev`). Setting the flag to `True` changes the `state:modified` comparison from using rendered values to unrendered values instead. It accomplishes this by persisting `unrendered_config` during model parsing and `unrendered_database` and `unrendered_schema` configs during source parsing. - ### Package override for built-in materialization Setting the `require_explicit_package_overrides_for_builtin_materializations` flag to `True` prevents this automatic override. @@ -136,7 +145,7 @@ The names of dbt resources (models, sources, etc) should contain letters, number Set the `source_freshness_run_project_hooks` flag to `True` to include "project hooks" ([`on-run-start` / `on-run-end`](/reference/project-configs/on-run-start-on-run-end)) in the `dbt source freshness` command execution. -If you have specific project [`on-run-start` / `on-run-end`](/reference/project-configs/on-run-start-on-run-end) hooks that should not run before/after `source freshness` command, you can add a conditional check to those hooks: +If you have specific project [`on-run-start` / `on-run-end`](/reference/project-configs/on-run-start-on-run-end) hooks that should not run before/after the `source freshness` command, you can add a conditional check to those hooks: @@ -145,3 +154,13 @@ on-run-start: - '{{ ... if flags.WHICH != 'freshness' }}' ``` + + +### MetricFlow time spine YAML +The `require_yaml_configuration_for_mf_time_spines` flag is set to `False` by default. + +In previous versions (dbt Core 1.8 and earlier), the MetricFlow time spine configuration was stored in a `metricflow_time_spine.sql` file. + +When the flag is set to `False`, dbt will continue to support the SQL file configuration. When the flag is set to `True`, dbt will raise a deprecation warning if it detects a MetricFlow time spine configured in a SQL file. + +The MetricFlow YAML file should have the `time_spine:` field. Refer to [MetricFlow time spine](/docs/build/metricflow-time-spine) for more details. diff --git a/website/docs/reference/global-configs/databricks-changes.md new file mode 100644 index 00000000000..ca24b822ae5 --- /dev/null +++ b/website/docs/reference/global-configs/databricks-changes.md @@ -0,0 +1,26 @@ +--- +title: "Databricks adapter behavior changes" +id: "databricks-changes" +sidebar: "Databricks" +--- + +The following are the current [behavior change flags](/docs/reference/global-configs/behavior-changes.md#behavior-change-flags) that are specific to `dbt-databricks`: + +| Flag | `dbt-databricks`: Intro | `dbt-databricks`: Maturity | +| ----------------------------- | ----------------------- | -------------------------- | +| `use_info_schema_for_columns` | 1.9.0 | TBD | +| `use_user_folder_for_python` | 1.9.0 | TBD | + +### Use information schema for columns + +The `use_info_schema_for_columns` flag is `False` by default. + +Setting this flag to `True` will use `information_schema` rather than `describe extended` to get column metadata for Unity Catalog tables. This setting helps you avoid issues where `describe extended` truncates information when the type is a complex struct.
However, this setting is not yet the default behavior, as there are performance impacts due to a Databricks metadata limitation: dbt needs to run `REPAIR TABLE {{relation}} SYNC METADATA` before querying to ensure the `information_schema` is complete. + +This flag will become the default behavior when this additional query is no longer needed. + +### Use user's folder for Python model notebooks + +The `use_user_folder_for_python` flag is `False` by default and results in writing uploaded python model notebooks to `/Shared/dbt_python_models/{{schema}}/`. Setting this flag to `True` will write notebooks to `/Users/{{current user}}/{{catalog}}/{{schema}}/`. Writing to the `Shared` folder is deprecated by Databricks as it does not align with governance best practices. + +We plan to promote this flag to maturity in v1.10.0. diff --git a/website/docs/reference/global-configs/resource-type.md index 9e6ec82df06..431b6c049cb 100644 --- a/website/docs/reference/global-configs/resource-type.md +++ b/website/docs/reference/global-configs/resource-type.md @@ -24,20 +24,7 @@ The `--exclude-resource-type` flag is only available in dbt version 1.8 and high The available resource types are: - - -- [`analysis`](/docs/build/analyses) -- [`exposure`](/docs/build/exposures) -- [`metric`](/docs/build/metrics-overview) -- [`model`](/docs/build/models) -- [`seed`](/docs/build/seeds) -- [`snapshot`](/docs/build/snapshots) -- [`source`](/docs/build/sources) -- [`test`](/docs/build/data-tests) - - - - + - [`analysis`](/docs/build/analyses) - [`exposure`](/docs/build/exposures) @@ -82,7 +69,6 @@ Instead of targeting specific resources, use the `--resource-flag` or `--exclude - - In this example, run the following command to include _all_ saved queries with the `--resource-type` flag: @@ -94,8 +80,6 @@ Instead of targeting specific resources, use the `--resource-flag` or `--exclude - - - In this example, use the following command to exclude _all_ unit tests from your dbt build process. Note that the `--exclude-resource-type` flag is only available in dbt version 1.8 and higher:
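The command itself is not shown above; assuming `unit_test` is the resource type name in your dbt version, it would likely look like this:

```bash
# Build everything except unit tests (requires dbt v1.8 or higher)
dbt build --exclude-resource-type unit_test
```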
Below are a handful of considerations when setting up automated jobs that leverage state comparison. ### Seeds @@ -48,6 +50,8 @@ dbt test -s "state:modified" --exclude "test_name:relationships" To reduce false positives during `state:modified` selection due to env-aware logic, you can set the `state_modified_compare_more_unrendered_values` [behavior flag](/reference/global-configs/behavior-changes#behavior-change-flags) to `True`. + + diff --git a/website/docs/reference/programmatic-invocations.md b/website/docs/reference/programmatic-invocations.md index 09e41b1789f..61250e6debb 100644 --- a/website/docs/reference/programmatic-invocations.md +++ b/website/docs/reference/programmatic-invocations.md @@ -25,9 +25,9 @@ for r in res.result: ## Parallel execution not supported -[`dbt-core`](https://pypi.org/project/dbt-core/) doesn't support [safe parallel execution](/reference/dbt-commands#parallel-execution) for multiple invocations in the same process. This means it's not safe to run multiple dbt commands at the same time. It's officially discouraged and requires a wrapping process to handle sub-processes. This is because: +[`dbt-core`](https://pypi.org/project/dbt-core/) doesn't support [safe parallel execution](/reference/dbt-commands#parallel-execution) for multiple invocations in the same process. This means it's not safe to run multiple dbt commands concurrently. It's officially discouraged and requires a wrapping process to handle sub-processes. This is because: -- Running simultaneous commands can unexpectedly interact with the data platform. For example, running `dbt run` and `dbt build` for the same models simultaneously could lead to unpredictable results. +- Running concurrent commands can unexpectedly interact with the data platform. For example, running `dbt run` and `dbt build` for the same models simultaneously could lead to unpredictable results. - Each `dbt-core` command interacts with global Python variables. To ensure safe operation, commands need to be executed in separate processes, which can be achieved using methods like spawning processes or using tools like Celery. To run [safe parallel execution](/reference/dbt-commands#available-commands), you can use the [dbt Cloud CLI](/docs/cloud/cloud-cli-installation) or [dbt Cloud IDE](/docs/cloud/dbt-cloud-ide/develop-in-the-cloud), both of which does that additional work to manage concurrency (multiple processes) on your behalf. diff --git a/website/docs/reference/resource-configs/access.md b/website/docs/reference/resource-configs/access.md index 0f67a454344..c73e09dd639 100644 --- a/website/docs/reference/resource-configs/access.md +++ b/website/docs/reference/resource-configs/access.md @@ -15,14 +15,6 @@ models: - - -Access modifiers may be applied to models one-by-one in YAML properties. In v1.5 and v1.6, you are unable to configure `access` for multiple models at once. Upgrade to v1.7 for additional configuration options. A group or subfolder contains models with varying access levels, so when you designate a model with `access: public`, make sure you intend for this behavior. - - - - - You can apply access modifiers in config files, including the `dbt_project.yml`, or to models one-by-one in `properties.yml`. Applying access configs to a subfolder modifies the default for all models in that subfolder, so make sure you intend for this behavior. 
When setting individual model access, a group or subfolder might contain a variety of access levels, so when you designate a model with `access: public` make sure you intend for this behavior. There are multiple approaches to configuring access: @@ -83,8 +75,6 @@ There are multiple approaches to configuring access: ``` - - After you define `access`, rerun a production job to apply the change. ## Definition diff --git a/website/docs/reference/resource-configs/bigquery-configs.md b/website/docs/reference/resource-configs/bigquery-configs.md index b943f114861..9dd39c936b6 100644 --- a/website/docs/reference/resource-configs/bigquery-configs.md +++ b/website/docs/reference/resource-configs/bigquery-configs.md @@ -710,8 +710,6 @@ models: Views with this configuration will be able to select from objects in `project_1.dataset_1` and `project_2.dataset_2`, even when they are located elsewhere and queried by users who do not otherwise have access to `project_1.dataset_1` and `project_2.dataset_2`. - - ## Materialized views The BigQuery adapter supports [materialized views](https://cloud.google.com/bigquery/docs/materialized-views-intro) @@ -894,10 +892,6 @@ As with most data platforms, there are limitations associated with materialized Find more information about materialized view limitations in Google's BigQuery [docs](https://cloud.google.com/bigquery/docs/materialized-views-intro#limitations). - - - - ## Python models The BigQuery adapter supports Python models with the following additional configuration parameters: @@ -914,4 +908,3 @@ By default, this is set to `True` to support the default `intermediate_format` o ### The `intermediate_format` parameter The `intermediate_format` parameter specifies which file format to use when writing records to a table. The default is `parquet`. - diff --git a/website/docs/reference/resource-configs/contract.md b/website/docs/reference/resource-configs/contract.md index 2f52fc26e1f..fb25076b0d9 100644 --- a/website/docs/reference/resource-configs/contract.md +++ b/website/docs/reference/resource-configs/contract.md @@ -16,14 +16,6 @@ This is to ensure that the people querying your model downstream—both inside a ## Data type aliasing - - -The `data_type` defined in your YAML file must match a data type your data platform recognizes. dbt does not do any type aliasing itself. If your data platform recognizes both `int` and `integer` as corresponding to the same type, then they will return a match. - - - - - dbt uses built-in type aliasing for the `data_type` defined in your YAML. For example, you can specify `string` in your contract, and on Postgres/Redshift, dbt will convert it to `text`. If dbt doesn't recognize the `data_type` name among its known aliases, it will pass it through as-is. This is enabled by default, but you can opt-out by setting `alias_types` to `false`. Example for disabling: @@ -42,7 +34,6 @@ models: ``` - ## Size, precision, and scale diff --git a/website/docs/reference/resource-configs/databricks-configs.md b/website/docs/reference/resource-configs/databricks-configs.md index 9e1b3282801..f807b1c0d88 100644 --- a/website/docs/reference/resource-configs/databricks-configs.md +++ b/website/docs/reference/resource-configs/databricks-configs.md @@ -7,23 +7,7 @@ id: "databricks-configs" When materializing a model as `table`, you may include several optional configs that are specific to the dbt-databricks plugin, in addition to the standard [model configs](/reference/model-configs). - - - -| Option | Description | Required? 
| Model Support | Example | -|---------------------|--------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------|-------------------------------------------|---------------|--------------------------| -| file_format | The file format to use when creating tables (`parquet`, `delta`, `hudi`, `csv`, `json`, `text`, `jdbc`, `orc`, `hive` or `libsvm`). | Optional | SQL, Python | `delta` | -| location_root | The created table uses the specified directory to store its data. The table alias is appended to it. | Optional | SQL, Python | `/mnt/root` | -| partition_by | Partition the created table by the specified columns. A directory is created for each partition. | Optional | SQL, Python | `date_day` | -| liquid_clustered_by | Cluster the created table by the specified columns. Clustering method is based on [Delta's Liquid Clustering feature](https://docs.databricks.com/en/delta/clustering.html). Available since dbt-databricks 1.6.2. | Optional | SQL | `date_day` | -| clustered_by | Each partition in the created table will be split into a fixed number of buckets by the specified columns. | Optional | SQL, Python | `country_code` | -| buckets | The number of buckets to create while clustering. | Required if `clustered_by` is specified | SQL, Python | `8` | -| tblproperties | [Tblproperties](https://docs.databricks.com/en/sql/language-manual/sql-ref-syntax-ddl-tblproperties.html) to be set on the created table. | Optional | SQL | `{'this.is.my.key': 12}` | -| compression | Set the compression algorithm. | Optional | SQL, Python | `zstd` | - - - - + | Option | Description | Required? | Model Support | Example | @@ -42,7 +26,7 @@ We do not yet have a PySpark API to set tblproperties at table creation, so this - + 1.8 introduces support for [Tags](https://docs.databricks.com/en/data-governance/unity-catalog/tags.html) at the table level, in addition to all table configuration supported in 1.7. @@ -51,7 +35,7 @@ We do not yet have a PySpark API to set tblproperties at table creation, so this | file_format | The file format to use when creating tables (`parquet`, `delta`, `hudi`, `csv`, `json`, `text`, `jdbc`, `orc`, `hive` or `libsvm`). | Optional | SQL, Python | `delta` | | location_root | The created table uses the specified directory to store its data. The table alias is appended to it. | Optional | SQL, Python | `/mnt/root` | | partition_by | Partition the created table by the specified columns. A directory is created for each partition. | Optional | SQL, Python | `date_day` | -| liquid_clustered_by | Cluster the created table by the specified columns. Clustering method is based on [Delta's Liquid Clustering feature](https://docs.databricks.com/en/delta/clustering.html). Available since dbt-databricks 1.6.2. | Optional | SQL | `date_day` | +| liquid_clustered_by | Cluster the created table by the specified columns. Clustering method is based on [Delta's Liquid Clustering feature](https://docs.databricks.com/en/delta/clustering.html). Available since dbt-databricks 1.6.2. | Optional | SQL, Python | `date_day` | | clustered_by | Each partition in the created table will be split into a fixed number of buckets by the specified columns. 
| Optional | SQL, Python | `country_code` | | buckets | The number of buckets to create while clustering | Required if `clustered_by` is specified | SQL, Python | `8` | | tblproperties | [Tblproperties](https://docs.databricks.com/en/sql/language-manual/sql-ref-syntax-ddl-tblproperties.html) to be set on the created table | Optional | SQL, Python* | `{'this.is.my.key': 12}` | | databricks_tags | [Tags](https://docs.databricks.com/en/data-governance/unity-catalog/tags.html) to be set on the created table | Optional | SQL++, Python++ | `{'my_tag': 'my_value'}` | | compression | Set the compression algorithm. | Optional | SQL, Python | `zstd` | @@ -65,6 +49,131 @@ We do not yet have a PySpark API to set tblproperties at table creation, so this + + +dbt Core v1.9 and Versionless dbt Cloud add support for `table_format: iceberg`, in addition to all previous table configurations supported in 1.8. + +| Option | Description | Required? | Model Support | Example | +|---------------------|-----------------------------|-------------------------------------------|-----------------|--------------------------| +| table_format | Whether or not to provision [Iceberg](https://docs.databricks.com/en/delta/uniform.html) compatibility for the materialization | Optional | SQL, Python | `iceberg` | +| file_format+ | The file format to use when creating tables (`parquet`, `delta`, `hudi`, `csv`, `json`, `text`, `jdbc`, `orc`, `hive` or `libsvm`). | Optional | SQL, Python | `delta` | +| location_root | The created table uses the specified directory to store its data. The table alias is appended to it. | Optional | SQL, Python | `/mnt/root` | +| partition_by | Partition the created table by the specified columns. A directory is created for each partition. | Optional | SQL, Python | `date_day` | +| liquid_clustered_by | Cluster the created table by the specified columns. Clustering method is based on [Delta's Liquid Clustering feature](https://docs.databricks.com/en/delta/clustering.html). Available since dbt-databricks 1.6.2. | Optional | SQL, Python | `date_day` | +| clustered_by | Each partition in the created table will be split into a fixed number of buckets by the specified columns. | Optional | SQL, Python | `country_code` | +| buckets | The number of buckets to create while clustering | Required if `clustered_by` is specified | SQL, Python | `8` | +| tblproperties | [Tblproperties](https://docs.databricks.com/en/sql/language-manual/sql-ref-syntax-ddl-tblproperties.html) to be set on the created table | Optional | SQL, Python* | `{'this.is.my.key': 12}` | +| databricks_tags | [Tags](https://docs.databricks.com/en/data-governance/unity-catalog/tags.html) to be set on the created table | Optional | SQL++, Python++ | `{'my_tag': 'my_value'}` | +| compression | Set the compression algorithm. | Optional | SQL, Python | `zstd` | + +\* We do not yet have a PySpark API to set tblproperties at table creation, so this feature is primarily to allow users to annotate their python-derived tables with tblproperties. +\+ When `table_format` is `iceberg`, `file_format` must be `delta`. +\++ `databricks_tags` are currently only supported at the table level, and applied via `ALTER` statements.
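To make the Iceberg options above concrete, here is a minimal sketch of a model config that uses them under the constraints noted in the footnotes; the model body and the upstream `stg_orders` reference are placeholders rather than part of the documented example:

```sql
{{
  config(
    materialized = 'table',
    table_format = 'iceberg',   -- provision Iceberg (UniForm) compatibility
    file_format = 'delta',      -- must stay 'delta' when table_format is 'iceberg'
    partition_by = 'date_day'
  )
}}

select
    order_id,
    date_day,
    order_total
from {{ ref('stg_orders') }}  -- placeholder upstream model
```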
+ + + + + +### Python submission methods + +In dbt v1.9 and higher, or in [Versionless](/docs/dbt-versions/versionless-cloud) dbt Cloud, you can use these four options for `submission_method`: + +* `all_purpose_cluster`: Executes the python model either directly using the [command api](https://docs.databricks.com/api/workspace/commandexecution) or by uploading a notebook and creating a one-off job run +* `job_cluster`: Creates a new job cluster to execute an uploaded notebook as a one-off job run +* `serverless_cluster`: Uses a [serverless cluster](https://docs.databricks.com/en/jobs/run-serverless-jobs.html) to execute an uploaded notebook as a one-off job run +* `workflow_job`: Creates/updates a reusable workflow and uploaded notebook, for execution on all-purpose, job, or serverless clusters. + :::caution + This approach gives you maximum flexibility, but will create persistent artifacts in Databricks (the workflow) that users could run outside of dbt. + ::: + +We are currently in a transitionary period where there is a disconnect between old submission methods (which were grouped by compute), and the logically distinct submission methods (command, job run, workflow). + +As such, the supported config matrix is somewhat complicated: + +| Config | Use | Default | `all_purpose_cluster`* | `job_cluster` | `serverless_cluster` | `workflow_job` | +| --------------------- | -------------------------------------------------------------------- | ------------------ | ---------------------- | ------------- | -------------------- | -------------- | +| `create_notebook` | if false, use Command API, otherwise upload notebook and use job run | `false` | ✅ | ❌ | ❌ | ❌ | +| `timeout` | maximum time to wait for command/job to run | `0` (No timeout) | ✅ | ✅ | ✅ | ✅ | +| `job_cluster_config` | configures a [new cluster](https://docs.databricks.com/api/workspace/jobs/submit#tasks-new_cluster) for running the model | `{}` | ❌ | ✅ | ❌ | ✅ | +| `access_control_list` | directly configures [access control](https://docs.databricks.com/api/workspace/jobs/submit#access_control_list) for the job | `{}` | ✅ | ✅ | ✅ | ✅ | +| `packages` | list of packages to install on the executing cluster | `[]` | ✅ | ✅ | ✅ | ✅ | +| `index_url` | url to install `packages` from | `None` (uses pypi) | ✅ | ✅ | ✅ | ✅ | +| `additional_libs` | directly configures [libraries](https://docs.databricks.com/api/workspace/jobs/submit#tasks-libraries) | `[]` | ✅ | ✅ | ✅ | ✅ | +| `python_job_config` | additional configuration for jobs/workflows (see table below) | `{}` | ✅ | ✅ | ✅ | ✅ | +| `cluster_id` | id of existing all purpose cluster to execute against | `None` | ✅ | ❌ | ❌ | ✅ | +| `http_path` | path to existing all purpose cluster to execute against | `None` | ✅ | ❌ | ❌ | ❌ | + +\* Only `timeout` and `cluster_id`/`http_path` are supported when `create_notebook` is false + +With the introduction of the `workflow_job` submission method, we chose to segregate further configuration of the python model submission under a top level configuration named `python_job_config`. This keeps configuration options for jobs and workflows namespaced in such a way that they do not interfere with other model config, allowing us to be much more flexible with what is supported for job execution. + +The support matrix for this feature is divided into `workflow_job` and all others (assuming `all_purpose_cluster` with `create_notebook`==true). 
+Each config option listed must be nested under `python_job_config`: + +| Config | Use | Default | `workflow_job` | All others | +| -------------------------- | ----------------------------------------------------------------------------------------------------------------------- | ------- | -------------- | ---------- | +| `name` | The name to give (or used to look up) the created workflow | `None` | ✅ | ❌ | +| `grants` | A simplified way to specify access control for the workflow | `{}` | ✅ | ✅ | +| `existing_job_id` | Id to use to look up the created workflow (in place of `name`) | `None` | ✅ | ❌ | +| `post_hook_tasks` | [Tasks](https://docs.databricks.com/api/workspace/jobs/create#tasks) to include after the model notebook execution | `[]` | ✅ | ❌ | +| `additional_task_settings` | Additional [task config](https://docs.databricks.com/api/workspace/jobs/create#tasks) to include in the model task | `{}` | ✅ | ❌ | +| [Other job run settings](https://docs.databricks.com/api/workspace/jobs/submit) | Config will be copied into the request, outside of the model task | `None` | ❌ | ✅ | +| [Other workflow settings](https://docs.databricks.com/api/workspace/jobs/create) | Config will be copied into the request, outside of the model task | `None` | ✅ | ❌ | + +This example uses the new configuration options in the previous table: + + + +```yaml +models: + - name: my_model + config: + submission_method: workflow_job + + # Define a job cluster to create for running this workflow + # Alternately, could specify cluster_id to use an existing cluster, or provide neither to use a serverless cluster + job_cluster_config: + spark_version: "15.3.x-scala2.12" + node_type_id: "rd-fleet.2xlarge" + runtime_engine: "{{ var('job_cluster_defaults.runtime_engine') }}" + data_security_mode: "{{ var('job_cluster_defaults.data_security_mode') }}" + autoscale: { "min_workers": 1, "max_workers": 4 } + + python_job_config: + # These settings are passed in, as is, to the request + email_notifications: { on_failure: ["me@example.com"] } + max_retries: 2 + + name: my_workflow_name + + # Override settings for your model's dbt task. For instance, you can + # change the task key + additional_task_settings: { "task_key": "my_dbt_task" } + + # Define tasks to run before/after the model + # This example assumes you have already uploaded a notebook to /my_notebook_path to perform optimize and vacuum + post_hook_tasks: + [ + { + "depends_on": [{ "task_key": "my_dbt_task" }], + "task_key": "OPTIMIZE_AND_VACUUM", + "notebook_task": + { "notebook_path": "/my_notebook_path", "source": "WORKSPACE" }, + }, + ] + + # Simplified structure, rather than having to specify permission separately for each user + grants: + view: [{ "group_name": "marketing-team" }] + run: [{ "user_name": "other_user@example.com" }] + manage: [] +``` + + + + + + ## Incremental models dbt-databricks plugin leans heavily on the [`incremental_strategy` config](/docs/build/incremental-strategy). This config tells the incremental materialization how to build models in runs beyond their first. It can be set to one of four values: @@ -75,6 +184,23 @@ dbt-databricks plugin leans heavily on the [`incremental_strategy` config](/docs Each of these strategies has its pros and cons, which we'll discuss below. As with any model config, `incremental_strategy` may be specified in `dbt_project.yml` or within a model file's `config()` block. + + + + +## Incremental models + +dbt-databricks plugin leans heavily on the [`incremental_strategy` config](/docs/build/incremental-strategy). 
This config tells the incremental materialization how to build models in runs beyond their first. It can be set to one of five values: + - **`append`**: Insert new records without updating or overwriting any existing data. + - **`insert_overwrite`**: If `partition_by` is specified, overwrite partitions in the table with new data. If no `partition_by` is specified, overwrite the entire table with new data. + - **`merge`** (default; Delta and Hudi file format only): Match records based on a `unique_key`, updating old records, and inserting new ones. (If no `unique_key` is specified, all new data is inserted, similar to `append`.) + - **`replace_where`** (Delta file format only): Match records based on `incremental_predicates`, replacing all records that match the predicates from the existing table with records matching the predicates from the new data. (If no `incremental_predicates` are specified, all new data is inserted, similar to `append`.) + - **`microbatch`** (Delta file format only): Implements the [microbatch strategy](/docs/build/incremental-microbatch) using `replace_where` with predicates generated based on `event_time`. + +Each of these strategies has its pros and cons, which we'll discuss below. As with any model config, `incremental_strategy` may be specified in `dbt_project.yml` or within a model file's `config()` block. + + + ### The `append` strategy Following the `append` strategy, dbt will perform an `insert into` statement with all new data. The appeal of this strategy is that it is straightforward and functional across all platforms, file types, connection methods, and Apache Spark versions. However, this strategy _cannot_ update, overwrite, or delete existing data, so it is likely to insert duplicate records for many data sources. @@ -221,7 +347,7 @@ The `merge` incremental strategy requires: - Databricks Runtime 5.1 and above for delta file format - Apache Spark for hudi file format -dbt will run an [atomic `merge` statement](https://docs.databricks.com/spark/latest/spark-sql/language-manual/merge-into.html) which looks nearly identical to the default merge behavior on Snowflake and BigQuery. If a `unique_key` is specified (recommended), dbt will update old records with values from new records that match on the key column. If a `unique_key` is not specified, dbt will forgo match criteria and simply insert all new records (similar to `append` strategy). +The Databricks adapter will run an [atomic `merge` statement](https://docs.databricks.com/spark/latest/spark-sql/language-manual/merge-into.html) similar to the default merge behavior on Snowflake and BigQuery. If a `unique_key` is specified (recommended), dbt will update old records with values from new records that match on the key column. If a `unique_key` is not specified, dbt will forgo match criteria and simply insert all new records (similar to `append` strategy). Specifying `merge` as the incremental strategy is optional since it's the default strategy used when none is specified. @@ -302,6 +428,123 @@ merge into analytics.merge_incremental as DBT_INTERNAL_DEST + + +Beginning with 1.9, `merge` behavior can be modified with the following additional configuration options: + +- `target_alias`, `source_alias`: Aliases for the target and source to allow you to describe your merge conditions more naturally. These default to `DBT_INTERNAL_DEST` and `DBT_INTERNAL_SOURCE`, respectively. +- `skip_matched_step`: If set to `true`, the 'matched' clause of the merge statement will not be included.
### The `append` strategy

Following the `append` strategy, dbt will perform an `insert into` statement with all new data. The appeal of this strategy is that it is straightforward and functional across all platforms, file types, connection methods, and Apache Spark versions. However, this strategy _cannot_ update, overwrite, or delete existing data, so it is likely to insert duplicate records for many data sources.

@@ -221,7 +347,7 @@ The `merge` incremental strategy requires:
- Databricks Runtime 5.1 and above for delta file format
- Apache Spark for hudi file format

-dbt will run an [atomic `merge` statement](https://docs.databricks.com/spark/latest/spark-sql/language-manual/merge-into.html) which looks nearly identical to the default merge behavior on Snowflake and BigQuery. If a `unique_key` is specified (recommended), dbt will update old records with values from new records that match on the key column. If a `unique_key` is not specified, dbt will forgo match criteria and simply insert all new records (similar to `append` strategy).
+The Databricks adapter will run an [atomic `merge` statement](https://docs.databricks.com/spark/latest/spark-sql/language-manual/merge-into.html) similar to the default merge behavior on Snowflake and BigQuery. If a `unique_key` is specified (recommended), dbt will update old records with values from new records that match on the key column. If a `unique_key` is not specified, dbt will forgo match criteria and simply insert all new records (similar to the `append` strategy).

Specifying `merge` as the incremental strategy is optional since it's the default strategy used when none is specified.

@@ -302,6 +428,123 @@ merge into analytics.merge_incremental as DBT_INTERNAL_DEST

+
+
+Beginning with 1.9, `merge` behavior can be modified with the following additional configuration options:
+
+- `target_alias`, `source_alias`: Aliases for the target and source to allow you to describe your merge conditions more naturally. These default to `DBT_INTERNAL_DEST` and `DBT_INTERNAL_SOURCE`, respectively.
+- `skip_matched_step`: If set to `true`, the 'matched' clause of the merge statement will not be included.
+- `skip_not_matched_step`: If set to `true`, the 'not matched' clause will not be included.
+- `matched_condition`: Condition to apply to the `WHEN MATCHED` clause. You should use the `target_alias` and `source_alias` to write a conditional expression, such as `DBT_INTERNAL_DEST.col1 = hash(DBT_INTERNAL_SOURCE.col2, DBT_INTERNAL_SOURCE.col3)`. This condition further restricts the matched set of rows.
+- `not_matched_condition`: Condition to apply to the `WHEN NOT MATCHED [BY TARGET]` clause. This condition further restricts the set of source rows that have no match in the target and that will be inserted into the merged table.
+- `not_matched_by_source_condition`: Condition used to further filter the `WHEN NOT MATCHED BY SOURCE` clause. Only used in conjunction with `not_matched_by_source_action: delete`.
+- `not_matched_by_source_action`: If set to `delete`, a `DELETE` clause is added to the merge statement for `WHEN NOT MATCHED BY SOURCE`.
+- `merge_with_schema_evolution`: If set to `true`, the merge statement includes the `WITH SCHEMA EVOLUTION` clause.
+
+For more details on the meaning of each merge clause, please see [the Databricks documentation](https://docs.databricks.com/en/sql/language-manual/delta-merge-into.html).
+
+The following is an example demonstrating the use of these new options:
+
+
+
+```sql
+{{ config(
+    materialized = 'incremental',
+    unique_key = 'id',
+    incremental_strategy='merge',
+    target_alias='t',
+    source_alias='s',
+    matched_condition='t.tech_change_ts < s.tech_change_ts',
+    not_matched_condition='s.attr1 IS NOT NULL',
+    not_matched_by_source_condition='t.tech_change_ts < current_timestamp()',
+    not_matched_by_source_action='delete',
+    merge_with_schema_evolution=true
+) }}
+
+select
+    id,
+    attr1,
+    attr2,
+    tech_change_ts
+from
+    {{ ref('source_table') }} as s
+```
+
+
+The generated merge statement looks like this:
+
+```sql
+create temporary view merge_incremental__dbt_tmp as
+
+    select
+        id,
+        attr1,
+        attr2,
+        tech_change_ts
+    from upstream.source_table
+;
+
+merge
+    with schema evolution
+into
+    target_table as t
+using (
+    select
+        id,
+        attr1,
+        attr2,
+        tech_change_ts
+    from
+        source_table as s
+)
+on
+    t.id <=> s.id
+when matched
+    and t.tech_change_ts < s.tech_change_ts
+    then update set
+        id = s.id,
+        attr1 = s.attr1,
+        attr2 = s.attr2,
+        tech_change_ts = s.tech_change_ts
+
+when not matched
+    and s.attr1 IS NOT NULL
+    then insert (
+        id,
+        attr1,
+        attr2,
+        tech_change_ts
+    ) values (
+        s.id,
+        s.attr1,
+        s.attr2,
+        s.tech_change_ts
+    )
+
+when not matched by source
+    and t.tech_change_ts < current_timestamp()
+    then delete
+```
+
+
+
+
### The `replace_where` strategy

The `replace_where` incremental strategy requires:
@@ -391,7 +634,83 @@ insert into analytics.replace_where_incremental

-
+
+
+### The `microbatch` strategy
+
+The Databricks adapter implements the `microbatch` strategy using `replace_where`. Note the requirements and caution statements for `replace_where` above. For more information about this strategy, see the [microbatch reference page](/docs/build/incremental-microbatch).
+
+In the example below, the upstream table `events` has been annotated with an `event_time` column called `ts` in its schema file.
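That schema file isn't part of this changeset; a minimal sketch of the annotation (assuming `events` is a model in the same project) could look like the following, with the incremental model that consumes it shown just after:

```yaml
models:
  - name: events
    config:
      # Column dbt uses to slice this model's data into event-time batches
      event_time: ts
```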
+
+
+
+```sql
+-- 'date' is used as the event_time grain for this microbatch table
+{{ config(
+    materialized='incremental',
+    file_format='delta',
+    incremental_strategy='microbatch',
+    event_time='date'
+) }}
+
+with new_events as (
+
+    select * from {{ ref('events') }}
+
+)
+
+select
+    user_id,
+    date,
+    count(*) as visits
+
+from new_events
+group by 1, 2
+```
+
+
+The generated SQL for one batch looks like this:
+
+```sql
+create temporary view replace_where__dbt_tmp as
+
+    with new_events as (
+
+        select * from (select * from analytics.events where ts >= '2024-10-01' and ts < '2024-10-02')
+
+    )
+
+    select
+        user_id,
+        date,
+        count(*) as visits
+    from new_events
+    group by 1, 2
+;
+
+insert into analytics.replace_where_incremental
+    replace where CAST(date as TIMESTAMP) >= '2024-10-01' and CAST(date as TIMESTAMP) < '2024-10-02'
+    table `replace_where__dbt_tmp`
+```
+
+
+
+
## Selecting compute per model

@@ -556,9 +875,15 @@ Databricks adapter ... using compute resource .

Materializing a python model requires execution of SQL as well as python.
Specifically, if your python model is incremental, the current execution pattern involves executing python to create a staging table that is then merged into your target table using SQL.
+
The python code needs to run on an all purpose cluster, while the SQL code can run on an all purpose cluster or a SQL Warehouse.
+
+
+The python code needs to run on an all purpose cluster (or serverless cluster, see [Python Submission Methods](#python-submission-methods)), while the SQL code can run on an all purpose cluster or a SQL Warehouse.
+
When you specify your `databricks_compute` for a python model, you are currently only specifying which compute to use when running the model-specific SQL.
-If you wish to use a different compute for executing the python itself, you must specify an alternate `http_path` in the config for the model. Please note that declaring a separate SQL compute and a python compute for your python dbt models is optional. If you wish to do this:
+If you wish to use a different compute for executing the python itself, you must specify an alternate compute in the config for the model.
+For example:

@@ -575,8 +900,6 @@ def model(dbt, session):

If your default compute is a SQL Warehouse, you will need to specify an all purpose cluster `http_path` in this way.

-
-
## Persisting model descriptions

Relation-level docs persistence is supported in dbt v0.17.0. For more
@@ -788,9 +1111,5 @@ One application of this feature is making `delta` tables compatible with `iceber
    )
}}
```
-
-
`tblproperties` can be specified for python models, but they will be applied via an `ALTER` statement after table creation. This is due to a limitation in PySpark.
-
-
-
diff --git a/website/docs/reference/resource-configs/enabled.md b/website/docs/reference/resource-configs/enabled.md
index febf1e50c88..b74d7250907 100644
--- a/website/docs/reference/resource-configs/enabled.md
+++ b/website/docs/reference/resource-configs/enabled.md
@@ -230,14 +230,6 @@ exposures:

-
-
-Support for disabling semantic models has been added in dbt Core v1.7
-
-
-
-
-

```yaml
@@ -259,20 +251,10 @@ semantic_models:

-
-
-
-
-Support for disabling saved queries has been added in dbt Core v1.7.
- - - - - ```yaml @@ -294,8 +276,6 @@ saved_queries: - - diff --git a/website/docs/reference/resource-configs/group.md b/website/docs/reference/resource-configs/group.md index 717d7de89f5..cd0ad2683f5 100644 --- a/website/docs/reference/resource-configs/group.md +++ b/website/docs/reference/resource-configs/group.md @@ -218,14 +218,6 @@ metrics: - - -Support for grouping semantic models has been added in dbt Core v1.7. - - - - - ```yaml @@ -247,20 +239,10 @@ semantic_models: - - - - -Support for grouping saved queries has been added in dbt Core v1.7. - - - - - ```yaml @@ -282,8 +264,6 @@ saved_queries: - - diff --git a/website/docs/reference/resource-configs/meta.md b/website/docs/reference/resource-configs/meta.md index 2bcccdd4141..53a4f77184e 100644 --- a/website/docs/reference/resource-configs/meta.md +++ b/website/docs/reference/resource-configs/meta.md @@ -179,14 +179,6 @@ exposures: - - -Support for grouping semantic models was added in dbt Core v1.7 - - - - - ```yml @@ -201,8 +193,6 @@ semantic_models: The `meta` config can also be defined under the `semantic-models` config block in `dbt_project.yml`. See [configs and properties](/reference/configs-and-properties) for details. - - @@ -249,14 +239,6 @@ metrics: - - -Support for saved queries has been added in dbt Core v1.7. - - - - - ```yml @@ -268,8 +250,6 @@ saved_queries: - - diff --git a/website/docs/reference/resource-configs/postgres-configs.md b/website/docs/reference/resource-configs/postgres-configs.md index 07cfc938f1c..f2bf90a93c0 100644 --- a/website/docs/reference/resource-configs/postgres-configs.md +++ b/website/docs/reference/resource-configs/postgres-configs.md @@ -185,20 +185,3 @@ It's worth noting that, unlike tables, dbt monitors this parameter for changes a This happens via a `DROP/CREATE` of the indexes, which can be thought of as an `ALTER` of the materialized view. Learn more about these parameters in Postgres's [docs](https://www.postgresql.org/docs/current/sql-creatematerializedview.html). - - - -### Limitations - -#### Changing materialization to and from "materialized_view" - -Swapping an already materialized model to a materialized view, and vice versa, is not supported. -The workaround is to manually drop the existing materialization in the data warehouse prior to calling `dbt run`. -Running with `--full-refresh` flag will not work to drop the existing table or view and create the materialized view (and vice versa). -This would only need to be done once as the existing object would then be a materialized view. - -For example,`my_model`, has already been materialized as a table in the underlying data platform via `dbt run`. -If the user changes the model's config to `materialized="materialized_view"`, they will get an error. -The solution is to execute `DROP TABLE my_model` on the data warehouse before trying the model again. - - diff --git a/website/docs/reference/resource-configs/redshift-configs.md b/website/docs/reference/resource-configs/redshift-configs.md index e7149ae484e..b033cd6267e 100644 --- a/website/docs/reference/resource-configs/redshift-configs.md +++ b/website/docs/reference/resource-configs/redshift-configs.md @@ -230,21 +230,6 @@ As with most data platforms, there are limitations associated with materialized Find more information about materialized view limitations in Redshift's [docs](https://docs.aws.amazon.com/redshift/latest/dg/materialized-view-create-sql-command.html#mv_CREATE_MATERIALIZED_VIEW-limitations). 
- - -#### Changing materialization from "materialized_view" to "table" or "view" - -Swapping a materialized view to a table or view is not supported. -You must manually drop the existing materialized view in the data warehouse before calling `dbt run`. -Normally, re-running with the `--full-refresh` flag would resolve this, but not in this case. -This would only need to be done once as the existing object would then be a materialized view. - -For example, assume that a materialized view, `my_mv.sql`, has already been materialized to the underlying data platform via `dbt run`. -If the user changes the model's config to `materialized="table"`, they will get an error. -The workaround is to execute `DROP MATERIALIZED VIEW my_mv CASCADE` on the data warehouse before trying the model again. - - - ## Unit test limitations diff --git a/website/docs/reference/resource-configs/snowflake-configs.md b/website/docs/reference/resource-configs/snowflake-configs.md index faf2d0c5e16..b95b79241ba 100644 --- a/website/docs/reference/resource-configs/snowflake-configs.md +++ b/website/docs/reference/resource-configs/snowflake-configs.md @@ -299,7 +299,7 @@ Snowflake allows two configuration scenarios for scheduling automatic refreshes: - **Time-based** — Provide a value of the form ` { seconds | minutes | hours | days }`. For example, if the dynamic table needs to be updated every 30 minutes, use `target_lag='30 minutes'`. - **Downstream** — Applicable when the dynamic table is referenced by other dynamic tables. In this scenario, `target_lag='downstream'` allows for refreshes to be controlled at the target, instead of at each layer. -Learn more about `target_lag` in Snowflake's [docs](https://docs.snowflake.com/en/user-guide/dynamic-tables-refresh#understanding-target-lag). +Learn more about `target_lag` in Snowflake's [docs](https://docs.snowflake.com/en/user-guide/dynamic-tables-refresh#understanding-target-lag). Please note that Snowflake supports a target lag of 1 minute or longer. @@ -337,33 +337,6 @@ For dbt limitations, these dbt features are not supported: - [Model contracts](/docs/collaborate/govern/model-contracts) - [Copy grants configuration](/reference/resource-configs/snowflake-configs#copying-grants) - - -#### Changing materialization to and from "dynamic_table" - -Version `1.6.x` does not support altering the materialization from a non-dynamic table be a dynamic table and vice versa. -Re-running with the `--full-refresh` does not resolve this either. -The workaround is manually dropping the existing model in the warehouse prior to calling `dbt run`. -This only needs to be done once for the conversion. - -For example, assume for the example model below, `my_model`, has already been materialized to the underlying data platform via `dbt run`. -If the model config is updated to `materialized="dynamic_table"`, dbt will return an error. -The workaround is to execute `DROP TABLE my_model` on the data warehouse before trying the model again. - - - -```yaml - -{{ config( - materialized="table" # or any model type (e.g. view, incremental) -) }} - -``` - - - - - ## Temporary tables Incremental table merges for Snowflake prefer to utilize a `view` rather than a `temporary table`. The reasoning is to avoid the database write step that a temporary table would initiate and save compile time. 
diff --git a/website/docs/reference/resource-properties/config.md b/website/docs/reference/resource-properties/config.md index 8190c7dd8ca..1e1867dda04 100644 --- a/website/docs/reference/resource-properties/config.md +++ b/website/docs/reference/resource-properties/config.md @@ -170,14 +170,6 @@ exposures: - - -Support for the `config` property on `semantic_models` was added in dbt Core v1.7 - - - - - ```yml @@ -193,20 +185,10 @@ semantic_models: - - - - -Support for the `config` property on `saved queries` was added in dbt Core v1.7. - - - - - ```yml @@ -226,8 +208,6 @@ saved-queries: - - diff --git a/website/docs/reference/resource-properties/deprecation_date.md b/website/docs/reference/resource-properties/deprecation_date.md index 70f150dc465..501fdc30237 100644 --- a/website/docs/reference/resource-properties/deprecation_date.md +++ b/website/docs/reference/resource-properties/deprecation_date.md @@ -55,7 +55,7 @@ Additionally, [`WARN_ERROR_OPTIONS`](/reference/global-configs/warnings) gives a | `DeprecatedReference` | Referencing a model with a past deprecation date | Producer and consumers | | `UpcomingReferenceDeprecation` | Referencing a model with a future deprecation date | Producer and consumers | -** Example ** +**Example** Example output for an `UpcomingReferenceDeprecation` warning: ``` diff --git a/website/docs/reference/resource-properties/description.md b/website/docs/reference/resource-properties/description.md index 6f32f75efa4..cf7b2b29a5a 100644 --- a/website/docs/reference/resource-properties/description.md +++ b/website/docs/reference/resource-properties/description.md @@ -13,7 +13,7 @@ description: "This guide explains how to use the description key to add YAML des { label: 'Snapshots', value: 'snapshots', }, { label: 'Analyses', value: 'analyses', }, { label: 'Macros', value: 'macros', }, - { label: 'Singular data tests', value: 'singular_data_tests', }, + { label: 'Data tests', value: 'data_tests', }, ] }> @@ -146,17 +146,17 @@ macros: - + - + ```yml version: 2 data_tests: - - name: singular_data_test_name + - name: data_test_name description: markdown_string ``` @@ -167,13 +167,12 @@ data_tests: -The `description` property is available for singular data tests beginning in dbt v1.9. +The `description` property is available for generic and singular data tests beginning in dbt v1.9. - ## Definition diff --git a/website/docs/reference/resource-properties/freshness.md b/website/docs/reference/resource-properties/freshness.md index 03037e7b681..d68dee4fade 100644 --- a/website/docs/reference/resource-properties/freshness.md +++ b/website/docs/reference/resource-properties/freshness.md @@ -37,8 +37,6 @@ A freshness block is used to define the acceptable amount of time between the mo In the `freshness` block, one or both of `warn_after` and `error_after` can be provided. If neither is provided, then dbt will not calculate freshness snapshots for the tables in this source. - - In most cases, the `loaded_at_field` is required. Some adapters support calculating source freshness from the warehouse metadata tables and can exclude the `loaded_at_field`. If a source has a `freshness:` block, dbt will attempt to calculate freshness for that source: @@ -62,29 +60,9 @@ To exclude a source from freshness calculations, you have two options: - Don't add a `freshness:` block. - Explicitly set `freshness: null`. - - - - -Additionally, the `loaded_at_field` is required to calculate freshness for a table. 
If a `loaded_at_field` is not provided, then dbt will not calculate freshness for the table. - -Freshness blocks are applied hierarchically: -- A `freshness` and `loaded_at_field` property added to a source will be applied to all tables defined in that source -- A `freshness` and `loaded_at_field` property added to a source _table_ will override any properties applied to the source. - -This is useful when all of the tables in a source have the same `loaded_at_field`, as is often the case. - - ## loaded_at_field - -(Optional on adapters that support pulling freshness from warehouse metadata tables, required otherwise.) - - - -(Required) - - +Optional on adapters that support pulling freshness from warehouse metadata tables, required otherwise.

A column name (or expression) that returns a timestamp indicating freshness. If using a date field, you may have to cast it to a timestamp: diff --git a/website/docs/reference/snapshot-configs.md b/website/docs/reference/snapshot-configs.md index e867747dc96..144ecafde9d 100644 --- a/website/docs/reference/snapshot-configs.md +++ b/website/docs/reference/snapshot-configs.md @@ -57,7 +57,7 @@ snapshots: [+](/reference/resource-configs/plus-prefix)[strategy](/reference/resource-configs/strategy): timestamp | check [+](/reference/resource-configs/plus-prefix)[updated_at](/reference/resource-configs/updated_at): [+](/reference/resource-configs/plus-prefix)[check_cols](/reference/resource-configs/check_cols): [] | all - + [+](/reference/resource-configs/plus-prefix)[invalidate_hard_deletes](/reference/resource-configs/invalidate_hard_deletes) : true | false ``` @@ -79,7 +79,7 @@ snapshots: [+](/reference/resource-configs/plus-prefix)[updated_at](/reference/resource-configs/updated_at): [+](/reference/resource-configs/plus-prefix)[check_cols](/reference/resource-configs/check_cols): [] | all [+](/reference/resource-configs/plus-prefix)[snapshot_meta_column_names](/reference/resource-configs/snapshot_meta_column_names): {} - + [+](/reference/resource-configs/plus-prefix)[invalidate_hard_deletes](/reference/resource-configs/invalidate_hard_deletes) : true | false ``` @@ -113,7 +113,7 @@ snapshots: [updated_at](/reference/resource-configs/updated_at): [check_cols](/reference/resource-configs/check_cols): [] | all [snapshot_meta_column_names](/reference/resource-configs/snapshot_meta_column_names): {} - + [invalidate_hard_deletes](/reference/resource-configs/invalidate_hard_deletes) : true | false ``` @@ -125,7 +125,7 @@ snapshots: -Configurations can be applied to snapshots using [YAML syntax](/docs/build/snapshots), available in Versionless and dbt v1.9 and higher, in the the `snapshot` directory file. +Configurations can be applied to snapshots using the [YAML syntax](/docs/build/snapshots), available in Versionless and dbt v1.9 and higher, in the `snapshot` directory file. @@ -140,7 +140,7 @@ Configurations can be applied to snapshots using [YAML syntax](/docs/build/snaps [strategy](/reference/resource-configs/strategy)="timestamp" | "check", [updated_at](/reference/resource-configs/updated_at)="", [check_cols](/reference/resource-configs/check_cols)=[""] | "all" - [snapshot_meta_column_names](/reference/resource-configs/snapshot_meta_column_names)={} + [invalidate_hard_deletes](/reference/resource-configs/invalidate_hard_deletes) : true | false ) }} ``` @@ -236,7 +236,7 @@ snapshots: -Configurations can be applied to snapshots using [YAML syntax](/docs/build/snapshots), available in Versionless and dbt v1.9 and higher, in the the `snapshot` directory file. +Configurations can be applied to snapshots using [YAML syntax](/docs/build/snapshots), available in Versionless and dbt v1.9 and higher, in the `snapshot` directory file. diff --git a/website/docusaurus.config.js b/website/docusaurus.config.js index b68e2e8ec5c..9fe63267d61 100644 --- a/website/docusaurus.config.js +++ b/website/docusaurus.config.js @@ -209,7 +209,7 @@ var siteSettings = { >
diff --git a/website/src/components/hero/styles.module.css b/website/src/components/hero/styles.module.css index f596b53762a..67d3c8c5d68 100644 --- a/website/src/components/hero/styles.module.css +++ b/website/src/components/hero/styles.module.css @@ -49,3 +49,34 @@ width: 60%; } } + +.callToActionsTitle { + font-weight: bold; + margin-top: 20px; + margin-bottom: 20px; + font-size: 1.25rem; + display: block; +} + +.callToActions { + display: flex; + flex-flow: wrap; + gap: 0.8rem; + justify-content: center; +} + +.callToAction { + outline: #fff solid 1px; + border-radius: 4px; + padding: 0 12px; + color: #fff; + transition: all .2s; + cursor: pointer; +} + +.callToAction:hover, .callToAction:active, .callToAction:focus { + text-decoration: none; + outline: rgb(4, 115, 119) solid 1px; + background-color: rgb(4, 115, 119); + color: #fff; +} diff --git a/website/src/components/quickstartGuideList/index.js b/website/src/components/quickstartGuideList/index.js index 0f4b5764340..2b87ae3e4a1 100644 --- a/website/src/components/quickstartGuideList/index.js +++ b/website/src/components/quickstartGuideList/index.js @@ -61,7 +61,7 @@ function QuickstartList({ quickstartData }) { // Update the URL with the new search parameters history.replace({ search: params.toString() }); -}; + }; // Handle all filters const handleDataFilter = () => { @@ -98,6 +98,30 @@ function QuickstartList({ quickstartData }) { handleDataFilter(); }, [selectedTags, selectedLevel, searchInput]); // Added searchInput to dependency array + // Set the featured guides that will show as CTAs in the hero section + // The value of the tag must match a tag in the frontmatter of the guides in order for the filter to apply after clicking + const heroCTAs = [ + { + title: 'Quickstart guides', + value: 'Quickstart' + }, + { + title: 'Use Jinja to improve your SQL code', + value: 'Jinja' + }, + { + title: 'Orchestration', + value: 'Orchestration' + }, + ]; + + // Function to handle CTA clicks + const handleCallToActionClick = (value) => { + const params = new URLSearchParams(location.search); + params.set('tags', value); + history.replace({ search: params.toString() }); + }; + return ( @@ -111,6 +135,13 @@ function QuickstartList({ quickstartData }) { showGraphic={false} customStyles={{ marginBottom: 0 }} classNames={styles.quickstartHero} + callToActions={heroCTAs.map(guide => ({ + title: guide.title, + href: guide.href, + onClick: () => handleCallToActionClick(guide.value), + newTab: guide.newTab + }))} + callToActionsTitle={'Popular guides'} />
@@ -135,7 +166,7 @@ function QuickstartList({ quickstartData }) {
- ) + ); } export default QuickstartList; diff --git a/website/src/theme/DocRoot/Layout/Main/index.js b/website/src/theme/DocRoot/Layout/Main/index.js index a8c9d449b82..154c3cbfab6 100644 --- a/website/src/theme/DocRoot/Layout/Main/index.js +++ b/website/src/theme/DocRoot/Layout/Main/index.js @@ -89,7 +89,7 @@ export default function DocRootLayoutMain({ if (new Date() > new Date(EOLDate)) { setEOLData({ showEOLBanner: true, - EOLBannerText: `This version of dbt Core is no longer supported. There will be no more patches or security fixes. For improved performance, security, and features, upgrade to the latest stable version.`, + EOLBannerText: `This version of dbt Core is no longer supported. There will be no more patches or security fixes. For improved performance, security, and features, upgrade to the latest stable version. Some dbt Cloud customers might have an extended critical support window. `, }); } else if (new Date() > threeMonths) { setEOLData({ diff --git a/website/static/img/best-practices/materializations/model-timing-diagram.png b/website/static/img/best-practices/materializations/model-timing-diagram.png index 75aaf17123f..6dc85a01f1a 100644 Binary files a/website/static/img/best-practices/materializations/model-timing-diagram.png and b/website/static/img/best-practices/materializations/model-timing-diagram.png differ diff --git a/website/static/img/bigquery/bigquery-optional-config.png b/website/static/img/bigquery/bigquery-optional-config.png new file mode 100644 index 00000000000..ba9dba2afac Binary files /dev/null and b/website/static/img/bigquery/bigquery-optional-config.png differ diff --git a/website/static/img/community/spotlight/bruno-souza-de-lima-newimage.jpg b/website/static/img/community/spotlight/bruno-souza-de-lima-newimage.jpg new file mode 100644 index 00000000000..4bcee8d5acc Binary files /dev/null and b/website/static/img/community/spotlight/bruno-souza-de-lima-newimage.jpg differ diff --git a/website/static/img/community/spotlight/christophe-oudar.jpg b/website/static/img/community/spotlight/christophe-oudar.jpg new file mode 100644 index 00000000000..11f31a6a4bd Binary files /dev/null and b/website/static/img/community/spotlight/christophe-oudar.jpg differ diff --git a/website/static/img/community/spotlight/dbt-athena-groupheadshot.png b/website/static/img/community/spotlight/dbt-athena-groupheadshot.png new file mode 100644 index 00000000000..bc4e03fa723 Binary files /dev/null and b/website/static/img/community/spotlight/dbt-athena-groupheadshot.png differ diff --git a/website/static/img/community/spotlight/jenna-jordan.jpg b/website/static/img/community/spotlight/jenna-jordan.jpg new file mode 100644 index 00000000000..527bafb469f Binary files /dev/null and b/website/static/img/community/spotlight/jenna-jordan.jpg differ diff --git a/website/static/img/community/spotlight/mike-stanley.jpg b/website/static/img/community/spotlight/mike-stanley.jpg new file mode 100644 index 00000000000..df1c2e98ddf Binary files /dev/null and b/website/static/img/community/spotlight/mike-stanley.jpg differ diff --git a/website/static/img/community/spotlight/ruth-onyekwe.jpeg b/website/static/img/community/spotlight/ruth-onyekwe.jpeg new file mode 100644 index 00000000000..92c470184b1 Binary files /dev/null and b/website/static/img/community/spotlight/ruth-onyekwe.jpeg differ diff --git a/website/static/img/dbt-env.png b/website/static/img/dbt-env.png new file mode 100644 index 00000000000..d4cf58d7824 Binary files /dev/null and b/website/static/img/dbt-env.png differ 
diff --git a/website/static/img/docs/cloud-integrations/example-snowflake-native-app-service-token.png b/website/static/img/docs/cloud-integrations/example-snowflake-native-app-service-token.png index 7e4c7ab99da..930182969c2 100644 Binary files a/website/static/img/docs/cloud-integrations/example-snowflake-native-app-service-token.png and b/website/static/img/docs/cloud-integrations/example-snowflake-native-app-service-token.png differ diff --git a/website/static/img/docs/collaborate/dbt-explorer/example-model-details.png b/website/static/img/docs/collaborate/dbt-explorer/example-model-details.png index 9ceee1b3a23..a46f4d4ac5e 100644 Binary files a/website/static/img/docs/collaborate/dbt-explorer/example-model-details.png and b/website/static/img/docs/collaborate/dbt-explorer/example-model-details.png differ diff --git a/website/static/img/docs/collaborate/dbt-explorer/example-recommendations-tab.png b/website/static/img/docs/collaborate/dbt-explorer/example-recommendations-tab.png index 493930c35db..004019bfa54 100644 Binary files a/website/static/img/docs/collaborate/dbt-explorer/example-recommendations-tab.png and b/website/static/img/docs/collaborate/dbt-explorer/example-recommendations-tab.png differ diff --git a/website/static/img/docs/collaborate/dbt-explorer/sigma-example.jpg b/website/static/img/docs/collaborate/dbt-explorer/sigma-example.jpg new file mode 100644 index 00000000000..b1aa4533e08 Binary files /dev/null and b/website/static/img/docs/collaborate/dbt-explorer/sigma-example.jpg differ diff --git a/website/static/img/docs/connect-data-platform/choose-a-connection.png b/website/static/img/docs/connect-data-platform/choose-a-connection.png new file mode 100644 index 00000000000..cf8d106dd59 Binary files /dev/null and b/website/static/img/docs/connect-data-platform/choose-a-connection.png differ diff --git a/website/static/img/docs/connect-data-platform/connection-list.png b/website/static/img/docs/connect-data-platform/connection-list.png new file mode 100644 index 00000000000..c499e9baeba Binary files /dev/null and b/website/static/img/docs/connect-data-platform/connection-list.png differ diff --git a/website/static/img/docs/dbt-cloud/cloud-configuring-dbt-cloud/managed-repo.png b/website/static/img/docs/dbt-cloud/cloud-configuring-dbt-cloud/managed-repo.png index e2014cf3607..d5850f1bed1 100644 Binary files a/website/static/img/docs/dbt-cloud/cloud-configuring-dbt-cloud/managed-repo.png and b/website/static/img/docs/dbt-cloud/cloud-configuring-dbt-cloud/managed-repo.png differ diff --git a/website/static/img/docs/dbt-cloud/cloud-ide/editor-tab-menu-with-save.jpg b/website/static/img/docs/dbt-cloud/cloud-ide/editor-tab-menu-with-save.jpg index 73551cbcaa7..deca4bedc43 100644 Binary files a/website/static/img/docs/dbt-cloud/cloud-ide/editor-tab-menu-with-save.jpg and b/website/static/img/docs/dbt-cloud/cloud-ide/editor-tab-menu-with-save.jpg differ diff --git a/website/static/img/docs/dbt-cloud/cloud-ide/ide-basic-layout.jpg b/website/static/img/docs/dbt-cloud/cloud-ide/ide-basic-layout.jpg index 3960c6a4bff..116644b4764 100644 Binary files a/website/static/img/docs/dbt-cloud/cloud-ide/ide-basic-layout.jpg and b/website/static/img/docs/dbt-cloud/cloud-ide/ide-basic-layout.jpg differ diff --git a/website/static/img/docs/dbt-cloud/cloud-ide/ide-command-bar.jpg b/website/static/img/docs/dbt-cloud/cloud-ide/ide-command-bar.jpg index ba6f8fc22c0..b1d0fd3ec7b 100644 Binary files a/website/static/img/docs/dbt-cloud/cloud-ide/ide-command-bar.jpg and 
b/website/static/img/docs/dbt-cloud/cloud-ide/ide-command-bar.jpg differ diff --git a/website/static/img/docs/dbt-cloud/cloud-ide/ide-console-overview.jpg b/website/static/img/docs/dbt-cloud/cloud-ide/ide-console-overview.jpg index 8212e9e3311..33780cf76f9 100644 Binary files a/website/static/img/docs/dbt-cloud/cloud-ide/ide-console-overview.jpg and b/website/static/img/docs/dbt-cloud/cloud-ide/ide-console-overview.jpg differ diff --git a/website/static/img/docs/dbt-cloud/cloud-ide/ide-editing.jpg b/website/static/img/docs/dbt-cloud/cloud-ide/ide-editing.jpg index 897497efc5b..d35caf29768 100644 Binary files a/website/static/img/docs/dbt-cloud/cloud-ide/ide-editing.jpg and b/website/static/img/docs/dbt-cloud/cloud-ide/ide-editing.jpg differ diff --git a/website/static/img/docs/dbt-cloud/cloud-ide/ide-editor-command-palette-with-save.jpg b/website/static/img/docs/dbt-cloud/cloud-ide/ide-editor-command-palette-with-save.jpg index 25e4f2b32a1..2b50f870251 100644 Binary files a/website/static/img/docs/dbt-cloud/cloud-ide/ide-editor-command-palette-with-save.jpg and b/website/static/img/docs/dbt-cloud/cloud-ide/ide-editor-command-palette-with-save.jpg differ diff --git a/website/static/img/docs/dbt-cloud/cloud-ide/ide-file-search-with-save.jpg b/website/static/img/docs/dbt-cloud/cloud-ide/ide-file-search-with-save.jpg index 9d8e82b98cb..775e1141330 100644 Binary files a/website/static/img/docs/dbt-cloud/cloud-ide/ide-file-search-with-save.jpg and b/website/static/img/docs/dbt-cloud/cloud-ide/ide-file-search-with-save.jpg differ diff --git a/website/static/img/docs/dbt-cloud/cloud-ide/ide-git-diff-view-with-save.jpg b/website/static/img/docs/dbt-cloud/cloud-ide/ide-git-diff-view-with-save.jpg index 777551dc49b..1f92e5a4cb5 100644 Binary files a/website/static/img/docs/dbt-cloud/cloud-ide/ide-git-diff-view-with-save.jpg and b/website/static/img/docs/dbt-cloud/cloud-ide/ide-git-diff-view-with-save.jpg differ diff --git a/website/static/img/docs/dbt-cloud/cloud-ide/ide-global-command-palette-with-save.jpg b/website/static/img/docs/dbt-cloud/cloud-ide/ide-global-command-palette-with-save.jpg index 32ce741269c..d2c86345895 100644 Binary files a/website/static/img/docs/dbt-cloud/cloud-ide/ide-global-command-palette-with-save.jpg and b/website/static/img/docs/dbt-cloud/cloud-ide/ide-global-command-palette-with-save.jpg differ diff --git a/website/static/img/docs/dbt-cloud/cloud-ide/ide-minimap.jpg b/website/static/img/docs/dbt-cloud/cloud-ide/ide-minimap.jpg index 8da575c2034..ca465bf2ec8 100644 Binary files a/website/static/img/docs/dbt-cloud/cloud-ide/ide-minimap.jpg and b/website/static/img/docs/dbt-cloud/cloud-ide/ide-minimap.jpg differ diff --git a/website/static/img/docs/dbt-cloud/cloud-ide/ide-options-menu-with-save.jpg b/website/static/img/docs/dbt-cloud/cloud-ide/ide-options-menu-with-save.jpg index bd57ee514ee..8a968f684bd 100644 Binary files a/website/static/img/docs/dbt-cloud/cloud-ide/ide-options-menu-with-save.jpg and b/website/static/img/docs/dbt-cloud/cloud-ide/ide-options-menu-with-save.jpg differ diff --git a/website/static/img/docs/dbt-cloud/cloud-ide/ide-side-menu.jpg b/website/static/img/docs/dbt-cloud/cloud-ide/ide-side-menu.jpg index 060d273e3f5..71c182c302a 100644 Binary files a/website/static/img/docs/dbt-cloud/cloud-ide/ide-side-menu.jpg and b/website/static/img/docs/dbt-cloud/cloud-ide/ide-side-menu.jpg differ diff --git a/website/static/img/docs/dbt-cloud/cloud-ide/lineage-console-tab.jpg b/website/static/img/docs/dbt-cloud/cloud-ide/lineage-console-tab.jpg index 
cc0a0ffc41b..7d27314408c 100644 Binary files a/website/static/img/docs/dbt-cloud/cloud-ide/lineage-console-tab.jpg and b/website/static/img/docs/dbt-cloud/cloud-ide/lineage-console-tab.jpg differ diff --git a/website/static/img/docs/dbt-cloud/cloud-ide/restart-ide.jpg b/website/static/img/docs/dbt-cloud/cloud-ide/restart-ide.jpg index 98d71403cdd..031ec19227f 100644 Binary files a/website/static/img/docs/dbt-cloud/cloud-ide/restart-ide.jpg and b/website/static/img/docs/dbt-cloud/cloud-ide/restart-ide.jpg differ diff --git a/website/static/img/docs/dbt-cloud/cloud-ide/revert-uncommitted-changes-with-save.jpg b/website/static/img/docs/dbt-cloud/cloud-ide/revert-uncommitted-changes-with-save.jpg index bfd5832c001..7f7f520f5bb 100644 Binary files a/website/static/img/docs/dbt-cloud/cloud-ide/revert-uncommitted-changes-with-save.jpg and b/website/static/img/docs/dbt-cloud/cloud-ide/revert-uncommitted-changes-with-save.jpg differ diff --git a/website/static/img/docs/dbt-cloud/connecting-azure-devops/LinktoAzure.png b/website/static/img/docs/dbt-cloud/connecting-azure-devops/LinktoAzure.png index cd233b3f8e7..6cc30d05c6f 100644 Binary files a/website/static/img/docs/dbt-cloud/connecting-azure-devops/LinktoAzure.png and b/website/static/img/docs/dbt-cloud/connecting-azure-devops/LinktoAzure.png differ diff --git a/website/static/img/docs/dbt-cloud/discovery-api/model-timing.png b/website/static/img/docs/dbt-cloud/discovery-api/model-timing.png new file mode 100644 index 00000000000..3510473a090 Binary files /dev/null and b/website/static/img/docs/dbt-cloud/discovery-api/model-timing.png differ diff --git a/website/static/img/docs/dbt-cloud/example-git-signed-commits-setting.png b/website/static/img/docs/dbt-cloud/example-git-signed-commits-setting.png new file mode 100644 index 00000000000..bf3f8169359 Binary files /dev/null and b/website/static/img/docs/dbt-cloud/example-git-signed-commits-setting.png differ diff --git a/website/static/img/docs/dbt-cloud/git-sign-verified.jpg b/website/static/img/docs/dbt-cloud/git-sign-verified.jpg new file mode 100644 index 00000000000..86fbdd58dc9 Binary files /dev/null and b/website/static/img/docs/dbt-cloud/git-sign-verified.jpg differ diff --git a/website/static/img/docs/dbt-cloud/semantic-layer/sl-concept.png b/website/static/img/docs/dbt-cloud/semantic-layer/sl-concept.png new file mode 100644 index 00000000000..f1b1a252dc6 Binary files /dev/null and b/website/static/img/docs/dbt-cloud/semantic-layer/sl-concept.png differ diff --git a/website/static/img/docs/dbt-cloud/using-dbt-cloud/create-ci-job.png b/website/static/img/docs/dbt-cloud/using-dbt-cloud/create-ci-job.png index 4455d52f1a8..e1c94a74539 100644 Binary files a/website/static/img/docs/dbt-cloud/using-dbt-cloud/create-ci-job.png and b/website/static/img/docs/dbt-cloud/using-dbt-cloud/create-ci-job.png differ