Skip to content

Commit

Permalink
Merge branch 'current' into privatelink
Browse files Browse the repository at this point in the history
  • Loading branch information
matthewshaver authored Dec 2, 2023
2 parents 7c7b4d7 + d9e8412 commit 1802da3
Show file tree
Hide file tree
Showing 47 changed files with 366 additions and 26 deletions.
79 changes: 79 additions & 0 deletions website/docs/best-practices/clone-incremental-models.md
Original file line number Diff line number Diff line change
@@ -0,0 +1,79 @@
---
title: "Clone incremental models as the first step of your CI job"
id: "clone-incremental-models"
description: Learn how to define clone incremental models as the first step of your CI job.
displayText: Clone incremental models as the first step of your CI job
hoverSnippet: Learn how to clone incremental models for CI jobs.
---

Before you begin, you must be aware of a few conditions:
- `dbt clone` is only available with dbt version 1.6 and newer. Refer to our [upgrade guide](/docs/dbt-versions/upgrade-core-in-cloud) for help enabling newer versions in dbt Cloud
- This strategy only works for warehouse that support zero copy cloning (otherwise `dbt clone` will just create pointer views).
- Some teams may want to test that their incremental models run in both incremental mode and full-refresh mode.

Imagine you've created a [Slim CI job](/docs/deploy/continuous-integration) in dbt Cloud and it is configured to:

- Defer to your production environment.
- Run the command `dbt build --select state:modified+` to run and test all of the models you've modified and their downstream dependencies.
- Trigger whenever a developer on your team opens a PR against the main branch.

<Lightbox src="/img/best-practices/slim-ci-job.png" width="70%" title="Example of a slim CI job with the above configurations" />

Now imagine your dbt project looks something like this in the DAG:

<Lightbox src="/img/best-practices/dag-example.png" width="70%" title="Sample project DAG" />

When you open a pull request (PR) that modifies `dim_wizards`, your CI job will kickoff and build _only the modified models and their downstream dependencies_ (in this case, `dim_wizards` and `fct_orders`) into a temporary schema that's unique to your PR.

This build mimics the behavior of what will happen once the PR is merged into the main branch. It ensures you're not introducing breaking changes, without needing to build your entire dbt project.

## What happens when one of the modified models (or one of their downstream dependencies) is an incremental model?

Because your CI job is building modified models into a PR-specific schema, on the first execution of `dbt build --select state:modified+`, the modified incremental model will be built in its entirety _because it does not yet exist in the PR-specific schema_ and [is_incremental will be false](/docs/build/incremental-models#understanding-the-is_incremental-macro). You're running in `full-refresh` mode.

This can be suboptimal because:
- Typically incremental models are your largest datasets, so they take a long time to build in their entirety which can slow down development time and incur high warehouse costs.
- There are situations where a `full-refresh` of the incremental model passes successfully in your CI job but an _incremental_ build of that same table in prod would fail when the PR is merged into main (think schema drift where [on_schema_change](/docs/build/incremental-models#what-if-the-columns-of-my-incremental-model-change) config is set to `fail`)

You can alleviate these problems by zero copy cloning the relevant, pre-exisitng incremental models into your PR-specific schema as the first step of the CI job using the `dbt clone` command. This way, the incremental models already exist in the PR-specific schema when you first execute the command `dbt build --select state:modified+` so the `is_incremental` flag will be `true`.

You'll have two commands for your dbt Cloud CI check to execute:
1. Clone all of the pre-existing incremental models that have been modified or are downstream of another model that has been modified: `dbt clone --select state:modified+,config.materialized:incremental,state:old`
2. Build all of the models that have been modified and their downstream dependencies: `dbt build --select state:modified+`

Because of your first clone step, the incremental models selected in your `dbt build` on the second step will run in incremental mode.

<Lightbox src="/img/best-practices/clone-command.png" width="70%" title="Clone command in the CI config" />

Your CI jobs will run faster, and you're more accurately mimicking the behavior of what will happen once the PR has been merged into main.

### Expansion on "think schema drift" where [on_schema_change](/docs/build/incremental-models#what-if-the-columns-of-my-incremental-model-change) config is set to `fail`" from above

Imagine you have an incremental model `my_incremental_model` with the following config:

```sql

{{
config(
materialized='incremental',
unique_key='unique_id',
on_schema_change='fail'
)
}}

```

Now, let’s say you open up a PR that adds a new column to `my_incremental_model`. In this case:
- An incremental build will fail.
- A `full-refresh` will succeed.

If you have a daily production job that just executes `dbt build` without a `--full-refresh` flag, once the PR is merged into main and the job kicks off, you will get a failure. So the question is - what do you want to happen in CI?
- Do you want to also get a failure in CI, so that you know that once this PR is merged into main you need to immediately execute a `dbt build --full-refresh --select my_incremental_model` in production in order to avoid a failure in prod? This will block your CI check from passing.
- Do you want your CI check to succeed, because once you do run a `full-refresh` for this model in prod you will be in a successful state? This may lead unpleasant surprises if your production job is suddenly failing when you merge this PR into main if you don’t remember you need to execute a `dbt build --full-refresh --select my_incremental_model` in production.

There’s probably no perfect solution here; it’s all just tradeoffs! Our preference would be to have the failing CI job and have to manually override the blocking branch protection rule so that there are no surprises and we can proactively run the appropriate command in production once the PR is merged.

### Expansion on "why `state:old`"

For brand new incremental models, you want them to run in `full-refresh` mode in CI, because they will run in `full-refresh` mode in production when the PR is merged into `main`. They also don't exist yet in the production environment... they're brand new!
If you don't specify this, you won't get an error just a “No relation found in state manifest for…”. So, it technically works without specifying `state:old` but adding `state:old` is more explicit and means it won't even try to clone the brand new incremental models.
2 changes: 1 addition & 1 deletion website/docs/community/resources/getting-help.md
Original file line number Diff line number Diff line change
Expand Up @@ -60,4 +60,4 @@ If you want to receive dbt training, check out our [dbt Learn](https://learn.get
- Billing
- Bug reports related to the web interface

As a rule of thumb, if you are using dbt Cloud, but your problem is related to code within your dbt project, then please follow the above process rather than reaching out to support.
As a rule of thumb, if you are using dbt Cloud, but your problem is related to code within your dbt project, then please follow the above process rather than reaching out to support. Refer to [dbt Cloud support](/docs/dbt-support) for more information.
3 changes: 2 additions & 1 deletion website/docs/docs/build/dimensions.md
Original file line number Diff line number Diff line change
Expand Up @@ -15,7 +15,8 @@ In a data platform, dimensions is part of a larger structure called a semantic m
Groups are defined within semantic models, alongside entities and measures, and correspond to non-aggregatable columns in your dbt model that provides categorical or time-based context. In SQL, dimensions is typically included in the GROUP BY clause.-->

All dimensions require a `name`, `type` and in some cases, an `expr` parameter.
All dimensions require a `name`, `type` and in some cases, an `expr` parameter. The `name` for your dimension must be unique to the semantic model and can not be the same as an existing `entity` or `measure` within that same model.


| Parameter | Description | Type |
| --------- | ----------- | ---- |
Expand Down
2 changes: 1 addition & 1 deletion website/docs/docs/build/entities.md
Original file line number Diff line number Diff line change
Expand Up @@ -8,7 +8,7 @@ tags: [Metrics, Semantic Layer]

Entities are real-world concepts in a business such as customers, transactions, and ad campaigns. We often focus our analyses around specific entities, such as customer churn or annual recurring revenue modeling. We represent entities in our semantic models using id columns that serve as join keys to other semantic models in your semantic graph.

Within a semantic graph, the required parameters for an entity are `name` and `type`. The `name` refers to either the key column name from the underlying data table, or it may serve as an alias with the column name referenced in the `expr` parameter.
Within a semantic graph, the required parameters for an entity are `name` and `type`. The `name` refers to either the key column name from the underlying data table, or it may serve as an alias with the column name referenced in the `expr` parameter. The `name` for your entity must be unique to the semantic model and can not be the same as an existing `measure` or `dimension` within that same model.

Entities can be specified with a single column or multiple columns. Entities (join keys) in a semantic model are identified by their name. Each entity name must be unique within a semantic model, but it doesn't have to be unique across different semantic models.

Expand Down
2 changes: 2 additions & 0 deletions website/docs/docs/build/materializations.md
Original file line number Diff line number Diff line change
Expand Up @@ -14,6 +14,8 @@ pagination_next: "docs/build/incremental-models"
- ephemeral
- materialized view

You can also configure [custom materializations](/guides/create-new-materializations?step=1) in dbt. Custom materializations are a powerful way to extend dbt's functionality to meet your specific needs.


## Configuring materializations
By default, dbt models are materialized as "views". Models can be configured with a different materialization by supplying the `materialized` configuration parameter as shown below.
Expand Down
3 changes: 2 additions & 1 deletion website/docs/docs/build/measures.md
Original file line number Diff line number Diff line change
Expand Up @@ -34,7 +34,8 @@ measures:
When you create a measure, you can either give it a custom name or use the `name` of the data platform column directly. If the `name` of the measure is different from the column name, you need to add an `expr` to specify the column name. The `name` of the measure is used when creating a metric.

Measure names must be **unique** across all semantic models in a project.
Measure names must be unique across all semantic models in a project and can not be the same as an existing `entity` or `dimension` within that same model.


### Description

Expand Down
2 changes: 2 additions & 0 deletions website/docs/docs/cloud/billing.md
Original file line number Diff line number Diff line change
Expand Up @@ -126,6 +126,8 @@ All included successful models built numbers above reflect our most current pric

As an Enterprise customer, you pay annually via invoice, monthly in arrears for additional usage (if applicable), and may benefit from negotiated usage rates. Please refer to your order form or contract for your specific pricing details, or [contact the account team](https://www.getdbt.com/contact-demo) with any questions.

Enterprise plan billing information is not available in the dbt Cloud UI. Changes are handled through your dbt Labs Solutions Architect or account team manager.

### Legacy plans

Customers who purchased the dbt Cloud Team plan before August 11, 2023, remain on a legacy pricing plan as long as your account is in good standing. The legacy pricing plan is based on seats and includes unlimited models, subject to reasonable use.
Expand Down
Original file line number Diff line number Diff line change
Expand Up @@ -3,14 +3,15 @@ title: "About data platform connections"
id: about-connections
description: "Information about data platform connections"
sidebar_label: "About data platform connections"
pagination_next: "docs/cloud/connect-data-platform/connect-starburst-trino"
pagination_next: "docs/cloud/connect-data-platform/connect-microsoft-fabric"
pagination_prev: null
---
dbt Cloud can connect with a variety of data platform providers including:
- [Amazon Redshift](/docs/cloud/connect-data-platform/connect-redshift-postgresql-alloydb)
- [Apache Spark](/docs/cloud/connect-data-platform/connect-apache-spark)
- [Databricks](/docs/cloud/connect-data-platform/connect-databricks)
- [Google BigQuery](/docs/cloud/connect-data-platform/connect-bigquery)
- [Microsoft Fabric](/docs/cloud/connect-data-platform/connect-microsoft-fabric)
- [PostgreSQL](/docs/cloud/connect-data-platform/connect-redshift-postgresql-alloydb)
- [Snowflake](/docs/cloud/connect-data-platform/connect-snowflake)
- [Starburst or Trino](/docs/cloud/connect-data-platform/connect-starburst-trino)
Expand Down
Original file line number Diff line number Diff line change
@@ -0,0 +1,43 @@
---
title: "Connect Microsoft Fabric"
description: "Configure Microsoft Fabric connection."
sidebar_label: "Connect Microsoft Fabric"
---

## Supported authentication methods
The supported authentication methods are:
- Azure Active Directory (Azure AD) service principal
- Azure AD password

SQL password (LDAP) is not supported in Microsoft Fabric Synapse Data Warehouse so you must use Azure AD. This means that to use [Microsoft Fabric](https://www.microsoft.com/en-us/microsoft-fabric) in dbt Cloud, you will need at least one Azure AD service principal to connect dbt Cloud to Fabric, ideally one service principal for each user.

### Active Directory service principal
The following are the required fields for setting up a connection with a Microsoft Fabric using Azure AD service principal authentication.

| Field | Description |
| --- | --- |
| **Server** | The service principal's **host** value for the Fabric test endpoint. |
| **Port** | The port to connect to Microsoft Fabric. You can use `1433` (the default), which is the standard SQL server port number. |
| **Database** | The service principal's **database** value for the Fabric test endpoint. |
| **Authentication** | Choose **Service Principal** from the dropdown. |
| **Tenant ID** | The service principal's **Directory (tenant) ID**. |
| **Client ID** | The service principal's **application (client) ID id**. |
| **Client secret** | The service principal's **client secret** (not the **client secret id**). |


### Active Directory password

The following are the required fields for setting up a connection with a Microsoft Fabric using Azure AD password authentication.

| Field | Description |
| --- | --- |
| **Server** | The server hostname to connect to Microsoft Fabric. |
| **Port** | The server port. You can use `1433` (the default), which is the standard SQL server port number. |
| **Database** | The database name. |
| **Authentication** | Choose **Active Directory Password** from the dropdown. |
| **User** | The AD username. |
| **Password** | The AD username's password. |

## Configuration

To learn how to optimize performance with data platform-specific configurations in dbt Cloud, refer to [Microsoft Fabric DWH configurations](/reference/resource-configs/fabric-configs).
2 changes: 1 addition & 1 deletion website/docs/docs/community-adapters.md
Original file line number Diff line number Diff line change
Expand Up @@ -17,4 +17,4 @@ Community adapters are adapter plugins contributed and maintained by members of
| [TiDB](/docs/core/connect-data-platform/tidb-setup) | [Firebolt](/docs/core/connect-data-platform/firebolt-setup) | [MindsDB](/docs/core/connect-data-platform/mindsdb-setup)
| [Vertica](/docs/core/connect-data-platform/vertica-setup) | [AWS Glue](/docs/core/connect-data-platform/glue-setup) | [MySQL](/docs/core/connect-data-platform/mysql-setup) |
| [Upsolver](/docs/core/connect-data-platform/upsolver-setup) | [Databend Cloud](/docs/core/connect-data-platform/databend-setup) | [fal - Python models](/docs/core/connect-data-platform/fal-setup) |

| [TimescaleDB](https://dbt-timescaledb.debruyn.dev/) | | |
Original file line number Diff line number Diff line change
Expand Up @@ -14,6 +14,7 @@ dbt Core can connect with a variety of data platform providers including:
- [Apache Spark](/docs/core/connect-data-platform/spark-setup)
- [Databricks](/docs/core/connect-data-platform/databricks-setup)
- [Google BigQuery](/docs/core/connect-data-platform/bigquery-setup)
- [Microsoft Fabric](/docs/core/connect-data-platform/fabric-setup)
- [PostgreSQL](/docs/core/connect-data-platform/postgres-setup)
- [Snowflake](/docs/core/connect-data-platform/snowflake-setup)
- [Starburst or Trino](/docs/core/connect-data-platform/trino-setup)
Expand Down
Original file line number Diff line number Diff line change
Expand Up @@ -8,7 +8,7 @@ meta:
github_repo: 'Microsoft/dbt-fabric'
pypi_package: 'dbt-fabric'
min_core_version: '1.4.0'
cloud_support: Not Supported
cloud_support: Supported
platform_name: 'Microsoft Fabric'
config_page: '/reference/resource-configs/fabric-configs'
---
Expand Down
2 changes: 2 additions & 0 deletions website/docs/docs/dbt-cloud-apis/sl-jdbc.md
Original file line number Diff line number Diff line change
Expand Up @@ -352,6 +352,8 @@ semantic_layer.query(metrics=['food_order_amount', 'order_gross_profit'],
## FAQs
<FAQ path="Troubleshooting/sl-alpn-error" />
- **Why do some dimensions use different syntax, like `metric_time` versus `[Dimension('metric_time')`?**<br />
When you select a dimension on its own, such as `metric_time` you can use the shorthand method which doesn't need the “Dimension” syntax. However, when you perform operations on the dimension, such as adding granularity, the object syntax `[Dimension('metric_time')` is required.
Expand Down
Original file line number Diff line number Diff line change
Expand Up @@ -9,7 +9,9 @@ date: 2023-11-28

Public Preview is now available in dbt Cloud for Microsoft Fabric!

To learn more, check out the [Quickstart for dbt Cloud and Microsoft Fabric](/guides/microsoft-fabric?step=1). The guide walks you through:
To learn more, refer to [Connect Microsoft Fabric](/docs/cloud/connect-data-platform/connect-microsoft-fabric) and [Microsoft Fabric DWH configurations](/reference/resource-configs/fabric-configs).

Also, check out the [Quickstart for dbt Cloud and Microsoft Fabric](/guides/microsoft-fabric?step=1). The guide walks you through:

- Loading the Jaffle Shop sample data (provided by dbt Labs) into your Microsoft Fabric warehouse.
- Connecting dbt Cloud to Microsoft Fabric.
Expand Down
Original file line number Diff line number Diff line change
@@ -0,0 +1,14 @@
---
title: "New: Support for Git repository caching"
description: "November 2023: dbt Cloud can cache your project's code (as well as other dbt packages) to ensure runs can begin despite an upstream Git provider's outage."
sidebar_label: "New: Support for Git repository caching"
sidebar_position: 07
tags: [Nov-2023]
date: 2023-11-29
---

Now available for dbt Cloud Enterprise plans is a new option to enable Git repository caching for your job runs. When enabled, dbt Cloud caches your dbt project's Git repository and uses the cached copy instead if there's an outage with the Git provider. This feature improves the reliability and stability of your job runs.

To learn more, refer to [Repo caching](/docs/deploy/deploy-environments#git-repository-caching).

<Lightbox src="/img/docs/deploy/example-repo-caching.png" width="85%" title="Example of the Repository caching option" />
Original file line number Diff line number Diff line change
Expand Up @@ -33,6 +33,7 @@ import AvailIntegrations from '/snippets/_sl-partner-links.md';
- <span><a href="https://docs.getdbt.com/docs/dbt-cloud-apis/sl-api-overview" target="_self">{frontMatter.meta.api_name}</a></span> to learn how to integrate and query your metrics in downstream tools.
- [dbt Semantic Layer API query syntax](/docs/dbt-cloud-apis/sl-jdbc#querying-the-api-for-metric-metadata)
- [Hex dbt Semantic Layer cells](https://learn.hex.tech/docs/logic-cell-types/transform-cells/dbt-metrics-cells) to set up SQL cells in Hex.
- [Resolve 'Failed APN'](/faqs/Troubleshooting/sl-alpn-error) error when connecting to the dbt Semantic Layer.

</VersionBlock>

Expand Down
5 changes: 2 additions & 3 deletions website/docs/docs/use-dbt-semantic-layer/gsheets.md
Original file line number Diff line number Diff line change
Expand Up @@ -54,10 +54,9 @@ To use the filter functionality, choose the [dimension](docs/build/dimensions) y
- For categorical dimensiosn, type in the dimension value you want to filter by (no quotes needed) and press enter.
- Continue adding additional filters as needed with AND and OR. If it's a time dimension, choose the operator and select from the calendar.



**Limited Use Policy Disclosure**

The dbt Semantic Layer for Sheet's use and transfer to any other app of information received from Google APIs will adhere to [Google API Services User Data Policy](https://developers.google.com/terms/api-services-user-data-policy), including the Limited Use requirements.


## FAQs
<FAQ path="Troubleshooting/sl-alpn-error" />
2 changes: 2 additions & 0 deletions website/docs/docs/use-dbt-semantic-layer/sl-architecture.md
Original file line number Diff line number Diff line change
Expand Up @@ -14,6 +14,8 @@ The dbt Semantic Layer allows you to define metrics and use various interfaces t

<Lightbox src="/img/docs/dbt-cloud/semantic-layer/sl-architecture.jpg" width="85%" title="The diagram displays how your data flows using the dbt Semantic Layer and the variety of integration tools it supports."/>



## dbt Semantic Layer components

The dbt Semantic Layer includes the following components:
Expand Down
Loading

0 comments on commit 1802da3

Please sign in to comment.