Merge branch 'current' into dbeatty/reaching-cloud-support
mirnawong1 authored Dec 8, 2023
2 parents d553b0d + 8192a21 commit ef2832d
Showing 94 changed files with 328 additions and 19 deletions.
2 changes: 1 addition & 1 deletion contributing/content-style-guide.md
@@ -284,7 +284,7 @@ If the list starts getting lengthy and dense, consider presenting the same conte

A bulleted list with introductory text:

- > A dbt project is a directory of `.sql` and .yml` files. The directory must contain at a minimum:
+ > A dbt project is a directory of `.sql` and `.yml` files. The directory must contain at a minimum:
>
> - Models: A model is a single `.sql` file. Each model contains a single `select` statement that either transforms raw data into a dataset that is ready for analytics or, more often, is an intermediate step in such a transformation.
> - A project file: A `dbt_project.yml` file, which configures and defines your dbt project.
@@ -7,7 +7,7 @@ displayText: Materializations best practices
hoverSnippet: Read this guide to understand the incremental models you can create in dbt.
---

- So far we’ve looked at tables and views, which map to the traditional objects in the data warehouse. As mentioned earlier, incremental models are a little different. This where we start to deviate from this pattern with more powerful and complex materializations.
+ So far we’ve looked at tables and views, which map to the traditional objects in the data warehouse. As mentioned earlier, incremental models are a little different. This is where we start to deviate from this pattern with more powerful and complex materializations.

- 📚 **Incremental models generate tables.** They physically persist the data itself to the warehouse, just piece by piece. What’s different is **how we build that table**.
- 💅 **Only apply our transformations to rows of data with new or updated information**, which maximizes efficiency.
@@ -53,7 +53,7 @@ where
updated_at > (select max(updated_at) from {{ this }})
```

- Let’s break down that `where` clause a bit, because this where the action is with incremental models. Stepping through the code **_right-to-left_** we:
+ Let’s break down that `where` clause a bit, because this is where the action is with incremental models. Stepping through the code **_right-to-left_** we:

1. Get our **cutoff.**
1. Select the `max(updated_at)` timestamp — the **most recent record**
@@ -138,7 +138,7 @@ where
{% endif %}
```

- Fantastic! We’ve got a working incremental model. On our first run, when there is no corresponding table in the warehouse, `is_incremental` will evaluate to false and we’ll capture the entire table. On subsequent runs is it will evaluate to true and we’ll apply our filter logic, capturing only the newer data.
+ Fantastic! We’ve got a working incremental model. On our first run, when there is no corresponding table in the warehouse, `is_incremental` will evaluate to false and we’ll capture the entire table. On subsequent runs it will evaluate to true and we’ll apply our filter logic, capturing only the newer data.
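
For reference, here is a minimal sketch of how these pieces fit together in one incremental model; the staging model name `stg_orders` is illustrative, not taken from this guide:

```
{{
    config(
        materialized='incremental'
    )
}}

select * from {{ ref('stg_orders') }}

{% if is_incremental() %}

-- only process rows newer than the most recent record already in this table
where updated_at > (select max(updated_at) from {{ this }})

{% endif %}
```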

### Late arriving facts

5 changes: 5 additions & 0 deletions website/docs/docs/build/metricflow-commands.md
@@ -556,3 +556,8 @@ Keep in mind that modifying your shell configuration files can have an impact on
</details>
<details>
<summary>Why is my query limited to 100 rows in the dbt Cloud CLI?</summary>
The default <code>limit</code> for queries issued from the dbt Cloud CLI is 100 rows. We set this default to prevent returning unnecessarily large data sets, as the dbt Cloud CLI is typically used to query the dbt Semantic Layer during the development process, not for production reporting or to access large data sets. For most workflows, you only need to return a subset of the data.<br /><br />
However, you can change this limit if needed by setting the <code>--limit</code> option in your query. For example, to return 1000 rows, you can run <code>dbt sl list metrics --limit 1000</code>. A short usage sketch follows this FAQ.
</details>
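
As a sketch of the same flag on a query, assuming a hypothetical metric named `order_total` grouped by `metric_time` (neither name comes from this page):

```
# return only the first 10 rows for an example metric
dbt sl query --metrics order_total --group-by metric_time --limit 10
```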
2 changes: 1 addition & 1 deletion website/docs/docs/build/semantic-models.md
@@ -43,7 +43,7 @@ semantic_models:
  - name: the_name_of_the_semantic_model ## Required
    description: same as always ## Optional
    model: ref('some_model') ## Required
-   default: ## Required
+   defaults: ## Required
      agg_time_dimension: dimension_name ## Required if the model contains dimensions
    entities: ## Required
      - see more information in entities
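
For context, a minimal sketch of a complete semantic model using the corrected `defaults` key; the model, entity, dimension, and measure names here are illustrative:

```
semantic_models:
  - name: orders
    description: Order fact table at the order grain
    model: ref('fct_orders')
    defaults:
      agg_time_dimension: ordered_at
    entities:
      - name: order_id
        type: primary
    dimensions:
      - name: ordered_at
        type: time
        type_params:
          time_granularity: day
    measures:
      - name: order_total
        agg: sum
```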
8 changes: 5 additions & 3 deletions website/docs/docs/cloud/manage-access/sso-overview.md
@@ -57,8 +57,9 @@ Non-admin users that currently login with a password will no longer be able to d
### Security best practices

There are a few scenarios that might require you to log in with a password. We recommend these security best practices for the most common scenarios:
- * **Onboarding partners and contractors** - We highly recommend that you add partners and contractors to your Identity Provider. IdPs like Okta and Azure Active Directory (AAD) offer capabilities explicitly for temporary employees. We highly recommend that you reach out to your IT team to provision an SSO license for these situations. Using an IdP highly secure, reduces any breach risk, and significantly increases the security posture of your dbt Cloud environment.
- * **Identity Provider is down -** Account admins will continue to be able to log in with a password which would allow them to work with your Identity Provider to troubleshoot the problem.
+ * **Onboarding partners and contractors** &mdash; We highly recommend that you add partners and contractors to your Identity Provider. IdPs like Okta and Azure Active Directory (AAD) offer capabilities explicitly for temporary employees, and we recommend reaching out to your IT team to provision an SSO license for these situations. Using an IdP is highly secure, reduces breach risk, and significantly increases the security posture of your dbt Cloud environment.
+ * **Identity Provider is down** &mdash; Account admins will continue to be able to log in with a password, which allows them to work with your Identity Provider to troubleshoot the problem.
+ * **Offboarding admins** &mdash; When offboarding admins, revoke access to dbt Cloud by deleting the user from your environment; otherwise, they can continue to use username/password credentials to log in.

### Next steps for non-admin users currently logging in with passwords

@@ -67,4 +68,5 @@ If you have any non-admin users logging into dbt Cloud with a password today:
1. Ensure that all users have a user account in your identity provider and are assigned to dbt Cloud so they won’t lose access.
2. Alert all dbt Cloud users that they won’t be able to use a password for logging in anymore unless they are already an Admin with a password.
3. We **DO NOT** recommend promoting any users to Admins just to preserve password-based logins because doing so reduces the security of your dbt Cloud environment.


1 change: 1 addition & 0 deletions website/docs/docs/cloud/secure/about-privatelink.md
@@ -23,3 +23,4 @@ dbt Cloud supports the following data platforms for use with the PrivateLink fea
- [Databricks](/docs/cloud/secure/databricks-privatelink)
- [Redshift](/docs/cloud/secure/redshift-privatelink)
- [Postgres](/docs/cloud/secure/postgres-privatelink)
- [VCS](/docs/cloud/secure/vcs-privatelink)
6 changes: 4 additions & 2 deletions website/docs/docs/collaborate/explore-projects.md
@@ -2,7 +2,7 @@
title: "Explore your dbt projects"
sidebar_label: "Explore dbt projects"
description: "Learn about dbt Explorer and how to interact with it to understand, improve, and leverage your data pipelines."
pagination_next: "docs/collaborate/explore-multiple-projects"
pagination_next: "docs/collaborate/model-performance"
pagination_prev: null
---

@@ -36,7 +36,7 @@ For a richer experience with dbt Explorer, you must:
- Run [dbt source freshness](/reference/commands/source#dbt-source-freshness) within a job in the environment to view source freshness data.
- Run [dbt snapshot](/reference/commands/snapshot) or [dbt build](/reference/commands/build) within a job in the environment to view snapshot details.

- Richer and more timely metadata will become available as dbt, the Discovery API, and the underlying dbt Cloud platform evolves.
+ Richer and more timely metadata will become available as dbt Core, the Discovery API, and the underlying dbt Cloud platform evolve.

## Explore your project's lineage graph {#project-lineage}

@@ -46,6 +46,8 @@ If you don't see the project lineage graph immediately, click **Render Lineage**

The nodes in the lineage graph represent the project’s resources and the edges represent the relationships between the nodes. Nodes are color-coded and include iconography according to their resource type.

By default, dbt Explorer shows the project's [applied state](/docs/dbt-cloud-apis/project-state#definition-logical-vs-applied-state-of-dbt-nodes) lineage. That is, it shows models that have been successfully built and are available to query, not just the models defined in the project.

To explore the lineage graphs of tests and macros, view [their resource details pages](#view-resource-details). By default, dbt Explorer excludes these resources from the full lineage graph unless a search query returns them as results.

To interact with the full lineage graph, you can:
41 changes: 41 additions & 0 deletions website/docs/docs/collaborate/model-performance.md
@@ -0,0 +1,41 @@
---
title: "Model performance"
sidebar_label: "Model performance"
description: "Learn about ."
---

dbt Explorer provides metadata on dbt Cloud runs for in-depth model performance and quality analysis. This feature assists in reducing infrastructure costs and saving time for data teams by highlighting where to fine-tune projects and deployments &mdash; such as model refactoring or job configuration adjustments.

<LoomVideo id='98f33b3b7a374df0b7c04747eae6ef44' />

:::tip Beta

The model performance beta feature is now available in dbt Explorer! Check it out!
:::

## The Performance overview page

You can pinpoint areas for performance enhancement by using the Performance overview page. This page presents a comprehensive analysis across all project models and displays the longest-running models, those most frequently executed, and the ones with the highest failure rates during runs/tests. Data can be segmented by environment and job type, which can offer insights into:

- Most executed models (total count).
- Models with the longest execution time (average duration).
- Models with the most failures, detailing run failures (percentage and count) and test failures (percentage and count).

Each data point links to individual models in Explorer.

<Lightbox src="/img/docs/collaborate/dbt-explorer/example-performance-overview-page.png" width="80%" title="Example of Performance overview page"/>

You can view historical metadata for up to the past three months. Select the time horizon using the filter, which defaults to a two-week lookback.

<Lightbox src="/img/docs/collaborate/dbt-explorer/ex-2-week-default.png" title="Example of dropdown"/>

## The Model performance tab

You can view trends in execution times, counts, and failures by using the Model performance tab for historical performance analysis. Daily execution data includes:

- Average model execution time.
- Model execution counts, including failures/errors (total sum).

Clicking on a data point reveals a table listing all job runs for that day, with each row providing a direct link to the details of a specific run.

<Lightbox src="/img/docs/collaborate/dbt-explorer/example-model-performance-tab.png" title="Example of the Model performance tab"/>
50 changes: 50 additions & 0 deletions website/docs/docs/collaborate/project-recommendations.md
@@ -0,0 +1,50 @@
---
title: "Project recommendations"
sidebar_label: "Project recommendations"
description: "dbt Explorer provides recommendations that you can take to improve the quality of your dbt project."
---

:::tip Beta

The project recommendations beta feature is now available in dbt Explorer! Check it out!

:::

dbt Explorer provides recommendations about your project from the `dbt_project_evaluator` [package](https://hub.getdbt.com/dbt-labs/dbt_project_evaluator/latest/) using metadata from the Discovery API.

Explorer also offers a global view, showing all the recommendations across the project for easy sorting and summarizing.

These recommendations provide insight into how you can build a more well-documented, well-tested, and well-built project, leading to less confusion and more trust.

The Recommendations overview page includes two top-level metrics measuring the test and documentation coverage of the models in your project.

- **Model test coverage** &mdash; The percent of models in your project (models not from a package or imported via dbt Mesh) with at least one dbt test configured on them.
- **Model documentation coverage** &mdash; The percent of models in your project (models not from a package or imported via dbt Mesh) with a description.

<Lightbox src="/img/docs/collaborate/dbt-explorer/example-recommendations-overview.png" width="80%" title="Example of the Recommendations overview page with project metrics and the recommendations for all resources in the project"/>

## List of rules

| Category | Name | Description | Package Docs Link |
| --- | --- | --- | --- |
| Modeling | Direct Join to Source | Model that joins both a model and source, indicating a missing staging model | [GitHub](https://dbt-labs.github.io/dbt-project-evaluator/0.8/rules/modeling/#direct-join-to-source) |
| Modeling | Duplicate Sources | More than one source node corresponds to the same data warehouse relation | [GitHub](https://dbt-labs.github.io/dbt-project-evaluator/0.8/rules/modeling/#duplicate-sources) |
| Modeling | Multiple Sources Joined | Models with more than one source parent, indicating lack of staging models | [GitHub](https://dbt-labs.github.io/dbt-project-evaluator/0.8/rules/modeling/#multiple-sources-joined) |
| Modeling | Root Model | Models with no parents, indicating potential hardcoded references and need for sources | [GitHub](https://dbt-labs.github.io/dbt-project-evaluator/0.8/rules/modeling/#root-models) |
| Modeling | Source Fanout | Sources with more than one model child, indicating a need for staging models | [GitHub](https://dbt-labs.github.io/dbt-project-evaluator/0.8/rules/modeling/#source-fanout) |
| Modeling | Unused Source | Sources that are not referenced by any resource | [GitHub](https://dbt-labs.github.io/dbt-project-evaluator/0.8/rules/modeling/#unused-sources) |
| Performance | Exposure Dependent on View | Exposures with at least one model parent materialized as a view, indicating potential query performance issues | [GitHub](https://dbt-labs.github.io/dbt-project-evaluator/0.8/rules/performance/#exposure-parents-materializations) |
| Testing | Missing Primary Key Test | Models with insufficient testing on the grain of the model | [GitHub](https://dbt-labs.github.io/dbt-project-evaluator/0.8/rules/testing/#missing-primary-key-tests) |
| Documentation | Undocumented Models | Models without a model-level description | [GitHub](https://dbt-labs.github.io/dbt-project-evaluator/0.8/rules/documentation/#undocumented-models) |
| Documentation | Undocumented Source | Sources (collections of source tables) without descriptions | [GitHub](https://dbt-labs.github.io/dbt-project-evaluator/0.8/rules/documentation/#undocumented-sources) |
| Documentation | Undocumented Source Tables | Source tables without descriptions | [GitHub](https://dbt-labs.github.io/dbt-project-evaluator/0.8/rules/documentation/#undocumented-source-tables) |
| Governance | Public Model Missing Contract | Models with public access that do not have a model contract to ensure the data types | [GitHub](https://dbt-labs.github.io/dbt-project-evaluator/0.8/rules/governance/#public-models-without-contracts) |


## The Recommendations tab

Models, sources, and exposures each also have a Recommendations tab on their resource details page, with the specific recommendations that correspond to that resource:

<Lightbox src="/img/docs/collaborate/dbt-explorer/example-recommendations-tab.png" width="80%" title="Example of the Recommendations tab "/>


@@ -0,0 +1,16 @@
---
title: "Update: Extended attributes is GA"
description: "December 2023: The extended attributes feature is now GA in dbt Cloud. It enables you to override dbt adapter YAML attributes at the environment level."
sidebar_label: "Update: Extended attributes is GA"
sidebar_position: 10
tags: [Dec-2023]
date: 2023-12-06
---

The extended attributes feature in dbt Cloud is now GA! It allows for an environment level override on any YAML attribute that a dbt adapter accepts in its `profiles.yml`. You can provide a YAML snippet to add or replace any [profile](/docs/core/connect-data-platform/profiles.yml) value.
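
As an illustrative sketch, an extended attributes snippet that overrides a few common `profiles.yml` fields might look like this (the values shown are hypothetical):

```
dbname: jaffle_shop
schema: dbt_alice
threads: 3
```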

To learn more, refer to [Extended attributes](/docs/dbt-cloud-environments#extended-attributes).

The **Extended Attributes** text box is available from your environment's settings page:

<Lightbox src="/img/docs/dbt-cloud/using-dbt-cloud/extended-attributes.jpg" width="85%" title="Example of the Extended Attributes text box" />
4 changes: 2 additions & 2 deletions website/docs/docs/deploy/retry-jobs.md
@@ -26,7 +26,7 @@ If your dbt job run completed with a status of **Error**, you can rerun it from
<Lightbox src="/img/docs/deploy/native-retry.gif" width="70%" title="Example of the Rerun options in dbt Cloud"/>

## Related content
- - [Retry a failed run for a job](/dbt-cloud/api-v2#/operations/Retry%20a%20failed%20run%20for%20a%20job) API endpoint
+ - [Retry a failed run for a job](/dbt-cloud/api-v2#/operations/Retry%20Failed%20Job) API endpoint
- [Run visibility](/docs/deploy/run-visibility)
- [Jobs](/docs/deploy/jobs)
- [Job commands](/docs/deploy/job-commands)
2 changes: 1 addition & 1 deletion website/docs/guides/bigquery-qs.md
@@ -78,7 +78,7 @@ In order to let dbt connect to your warehouse, you'll need to generate a keyfile
- Click **Next** to create a new service account.
2. Create a service account for your new project from the [Service accounts page](https://console.cloud.google.com/projectselector2/iam-admin/serviceaccounts?supportedpurview=project). For more information, refer to [Create a service account](https://developers.google.com/workspace/guides/create-credentials#create_a_service_account) in the Google Cloud docs. As an example for this guide, you can:
- Type `dbt-user` as the **Service account name**
- - From the **Select a role** dropdown, choose **BigQuery Admin** and click **Continue**
+ - From the **Select a role** dropdown, choose the **BigQuery Job User** and **BigQuery Data Editor** roles and click **Continue**
- Leave the **Grant users access to this service account** fields blank
- Click **Done**
3. Create a service account key for your new project from the [Service accounts page](https://console.cloud.google.com/iam-admin/serviceaccounts?walkthrough_id=iam--create-service-account-keys&start_index=1#step_index=1). For more information, refer to [Create a service account key](https://cloud.google.com/iam/docs/creating-managing-service-account-keys#creating) in the Google Cloud docs. When downloading the JSON file, make sure to use a filename you can easily remember. For example, `dbt-user-creds.json`. For security reasons, dbt Labs recommends that you protect this JSON file like you would your identity credentials; for example, don't check the JSON file into your version control software.
2 changes: 1 addition & 1 deletion website/docs/guides/manual-install-qs.md
@@ -16,7 +16,7 @@ When you use dbt Core to work with dbt, you will be editing files locally using

* To use dbt Core, it's important that you know some basics of the Terminal. In particular, you should understand `cd`, `ls`, and `pwd` to navigate through the directory structure of your computer easily (see the short example after this list).
* Install dbt Core using the [installation instructions](/docs/core/installation-overview) for your operating system.
- * Complete [Setting up (in BigQuery)](/guides/bigquery?step=2) and [Loading data (BigQuery)](/guides/bigquery?step=3).
+ * Complete the appropriate Setting up and Loading data steps in the Quickstart for dbt Cloud series. For example, for BigQuery, complete [Setting up (in BigQuery)](/guides/bigquery?step=2) and [Loading data (BigQuery)](/guides/bigquery?step=3).
* [Create a GitHub account](https://github.com/join) if you don't already have one.
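
For instance, a quick sketch of those Terminal commands in action (the directory name `jaffle_shop` is hypothetical):

```
pwd                # print the current working directory
ls                 # list the files and folders in it
cd jaffle_shop     # change into a project directory named jaffle_shop
```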

### Create a starter project
3 changes: 3 additions & 0 deletions website/docs/reference/artifacts/dbt-artifacts.md
@@ -48,3 +48,6 @@ In the manifest, the `metadata` may also include:
#### Notes:
- The structure of dbt artifacts is canonized by [JSON schemas](https://json-schema.org/), which are hosted at **schemas.getdbt.com**.
- Artifact versions may change in any minor version of dbt (`v1.x.0`). Each artifact is versioned independently.

## Related docs
- [Other artifacts](/reference/artifacts/other-artifacts), such as `index.html` or `graph_summary.json`.