diff --git a/website/blog/2024-11-04-test-smarter-not-harder.md b/website/blog/2024-11-04-test-smarter-not-harder.md new file mode 100644 index 00000000000..58adfb38cb9 --- /dev/null +++ b/website/blog/2024-11-04-test-smarter-not-harder.md @@ -0,0 +1,163 @@ +--- +title: "Test smarter not harder: add the right tests to your dbt project" +description: "Testing your data should drive action, not accumulate alerts. We synthesized countless customer experiences to build a repeatable testing framework." +slug: test-smarter-not-harder + +authors: [faith_mckenna, jerrie_kumalah_kenney] + +tags: [analytics craft] +hide_table_of_contents: false + +date: 2024-11-11 +is_featured: true +--- + + + +The [Analytics Development Lifecycle (ADLC)](https://www.getdbt.com/resources/guides/the-analytics-development-lifecycle) is a workflow for improving data maturity and velocity. Testing is a key phase here. Many dbt developers tend to focus on [primary keys and source freshness.](https://www.getdbt.com/blog/building-a-data-quality-framework-with-dbt-and-dbt-cloud) We think there is a more holistic and in-depth path to tread. Testing is a key piece of the ADLC, and it should drive data quality. + +In this blog, we’ll walk through a plan to define data quality. This will look like: + +- identifying *data hygiene* issues +- identifying *business-focused anomaly* issues +- identifying *stats-focused anomaly* issues + +Once we have *defined* data quality, we’ll move on to *prioritize* those concerns. We will: + +- think through each concern in terms of the breadth of impact +- decide if each concern should be at error or warning severity + + + +### Who are we? + +Let’s start with introductions - we’re Faith and Jerrie, and we work on dbt Labs’s training and services teams, respectively. By working closely with countless companies using dbt, we’ve gained unique perspectives of the landscape. 
+ +The training team collates problems organizations think about today and gauges how our solutions fit. These are shorter engagements, which means we see the data world shift and change in real time. Resident Architects spend much more time with teams to craft much more in-depth solutions, figure out where those solutions are helping, and where problems still need to be addressed. Trainers help identify patterns in the problems data teams face, and Resident Architects dive deep on solutions. + +Today, we’ll guide you through a particularly thorny problem: testing. + +## Why testing? + +Mariah Rogers broke early ground on data quality and testing in her [Coalesce 2022 talk](https://www.youtube.com/watch?v=hxvVhmhWRJA). We’ve seen similar talks again at Coalesce 2024, like [this one](https://www.youtube.com/watch?v=iCG-5vqMRAo) from the data team at Aiven and [this one](https://www.youtube.com/watch?v=5bRG3y9IM4Q&list=PL0QYlrC86xQnWJ72sJlzDqPS0peE7j9Ed&index=71) from the co-founder at Omni Analytics. These talks share a common theme: testing your dbt project too much can get out of control quickly, leading to alert fatigue. + +In our customer engagements, we see *wildly different approaches* to testing data. We’ve definitely seen what Mariah, the Aiven team, and the Omni team have described, which is so many tests that errors and alerts just become noise. We’ve also seen the opposite end of the spectrum: only primary keys being tested. From our field experiences, we believe there’s room for a middle path. +A desire for a better approach to data quality and testing isn’t just anecdotal to Coalesce, or to dbt’s training and services. The dbt community has long called for a more intentional approach to data quality and testing - data quality is on the industry’s mind! 
In fact, [57% of respondents](https://www.getdbt.com/resources/reports/state-of-analytics-engineering-2024) to dbt’s 2024 State of Analytics Engineering survey said that data quality is a predominant issue facing their day-to-day work. + +### What does d@tA qUaL1Ty even mean?! + +High-quality data is *trusted* and *used frequently.* It doesn’t get argued over or endlessly scrutinized for matching to other data. Data *testing* should lead to higher data *quality* and insights, period. + +Best practices in data quality are still nascent. That said, a lot of important baseline work has been done here. There are [case](https://medium.com/@AtheonAnalytics/mastering-data-testing-with-dbt-part-1-689b2a025675) [studies](https://medium.com/@AtheonAnalytics/mastering-data-testing-with-dbt-part-2-c4031af3df18) on implementing dbt testing well. dbt Labs also has an [Advanced Testing](https://learn.getdbt.com/courses/advanced-testing) course, emphasizing that testing should spur action and be focused and informative enough to help address failures. You can even enforce testing best practices and dbt Labs’s own best practices using the [dbt_meta_testing](https://hub.getdbt.com/tnightengale/dbt_meta_testing/latest/) or [dbt_project_evaluator](https://github.com/dbt-labs/dbt-project-evaluator) packages and dbt Explorer’s [Recommendations](https://docs.getdbt.com/docs/collaborate/project-recommendations) page. + +The missing piece is still cohesion and guidance for everyday practitioners to help develop their testing framework. + +To recap, we’re going to start with: + +- identifying *data hygiene* issues +- identifying *business-focused anomaly* issues +- identifying *stats-focused anomaly* issues + +Next, we’ll prioritize. We will: + +- think through each concern in terms of the breadth of impact +- decide if each concern should be at error or warning severity + +Get a pen and paper (or a google doc) and join us in constructing your own testing framework. 
+ +## Identifying data quality issues in your pipeline + +Let’s start our framework by *identifying* types of data quality issues. + +In our daily work with customers, we find that data quality issues tend to fall into one of three broad buckets: *data hygiene, business-focused anomalies,* and *stats-focused anomalies.* Read the bucket descriptions below, and list 2-3 data quality concerns in your own business context that fall into each bucket. + +### Bucket 1: Data hygiene + +*Data hygiene* issues are concerns you address in your [staging layer.](https://docs.getdbt.com/best-practices/how-we-structure/2-staging) Hygienic data meets your expectations around formatting, completeness, and granularity requirements. Here are a few examples. + +- *Granularity:* primary keys are unique and not null. Duplicates throw off calculations. +- *Completeness:* columns that should always contain text, *do.* Incomplete data often has to get excluded, reducing your overall analytical power. +- *Formatting:* email addresses always have a valid domain. Incorrect emails may affect things like marketing outreach. + +### Bucket 2: Business-focused anomalies + +*Business-focused anomalies* catch unexpected behavior. You can flag unexpected behavior by clearly defining *expected* behavior. *Business-focused anomalies* are when aspects of the data differ from what you know to be typical in your business. You’ll know what’s typical either through your own analyses, your colleagues’ analyses, or things your stakeholder homies point out to you. + +Since business-focused anomaly testing is set by a human, it will be fluid and need to be adjusted periodically. Here’s an example. + +Imagine you’re a sales analyst. Generally, you know that if your daily sales amount goes up or down by more than 20% daily, that’s bad. Specifically, it’s usually a warning sign for fraud or the order management system (OMS) dropping orders. 
You set a test in dbt to fail if any given day’s sales amount differs by more than 20% from the previous day. This works for a while. + +Then, you have a stretch of 3 months where your test fails 5 times a week! Every time you investigate, it turns out to be valid consumer behavior. You’re suddenly in hypergrowth, and sales are legitimately increasing that much. + +Your 20%-change fraud and OMS failure detector is no longer valid. You need to investigate anew which sales spikes or drops indicate fraud or OMS problems. Once you figure out a new threshold, you’ll go back and adjust your testing criteria. + +Although your data’s expected behavior will shift over time, you should still commit to defining business-focused anomalies to grow your understanding of what is normal for your data. + +Here’s how to identify potential anomalies. + +Start at your business intelligence (BI) layer. Pick 1-3 dashboards or tables that you *know* are used frequently. List these 1-3 dashboards or tables. For each dashboard or table you have, identify 1-3 “expected” behaviors that your end-users rely on. Here are a few examples to get you thinking: + +- Revenue numbers should not change by more than X% in Y amount of time. This could indicate fraud or OMS problems. +- Monthly active users should not decline more than X% after the initial onboarding period. This might indicate user dissatisfaction, usability issues, or that users are not finding a feature valuable. +- Exam passing rates should stay above Y%. A decline below that threshold may indicate recent content changes or technical issues are affecting understanding or accessibility. + +You should also consider what data issues you have had in the past! Look through recent data incidents and pick out 3 or 4 to guard against next time. These might be in a #data-questions channel or perhaps a DM from a stakeholder. 
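As a rough illustration of the sales example in this bucket, here is a minimal Python sketch of a day-over-day change check. The function name, the 20% default threshold, the input shape, and the sample figures are all assumptions made for illustration; in a dbt project you would more likely express the same rule as a data test in SQL.

```python
from datetime import date


def flag_daily_swings(daily_sales, threshold=0.20):
    """Return (day, pct_change) pairs whose change from the previous day exceeds threshold.

    daily_sales: list of (date, amount) tuples sorted by date (hypothetical input shape).
    threshold: maximum tolerated relative change; 0.20 means 20% (assumed default).
    """
    flagged = []
    # Pair each day with the day before it.
    for (_, prev_amount), (day, amount) in zip(daily_sales, daily_sales[1:]):
        if prev_amount == 0:
            # A relative change is undefined against a zero baseline; skip it here.
            continue
        pct_change = (amount - prev_amount) / prev_amount
        if abs(pct_change) > threshold:
            flagged.append((day, round(pct_change, 4)))
    return flagged


# Made-up sample data, for illustration only.
sales = [
    (date(2024, 11, 1), 1000.0),
    (date(2024, 11, 2), 1100.0),  # +10%: within tolerance
    (date(2024, 11, 3), 1400.0),  # roughly +27%: flagged
    (date(2024, 11, 4), 1000.0),  # roughly -29%: flagged
]
failures = flag_daily_swings(sales)
```

If a period of hypergrowth makes the 20% rule noisy, as in the story above, changing `threshold` is the code-level equivalent of going back and adjusting your testing criteria.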
+ +### Bucket 3: Stats-focused anomalies + +*Stats-focused anomalies* are fluctuations that go against your expected volumes or metrics. Some examples include: + +- Volume anomalies. This could be site traffic amounts that may indicate illicit behavior, or perhaps site traffic dropping one day then doubling the next, indicating that a chunk of data were not loaded properly. +- Dimensional anomalies, like too many product types underneath a particular product line that may indicate incorrect barcodes. +- Column anomalies, like sale values more than a certain number of standard deviations from a mean, that may indicate improper discounting. + +Overall, stats-focused anomalies can indicate system flaws, illicit site behavior, or fraud, depending on your industry. They also tend to require more advanced testing practices than we are covering in this blog. We feel stats-based anomalies are worth exploring once you have a good handle on your data hygiene and business-focused anomalies. We won’t give recommendations on stats-focused anomalies in this post. + +## How to prioritize data quality concerns in your pipeline + +Now, you have a written and categorized list of data hygiene concerns and business-focused anomalies to guard against. It’s time to *prioritize* which quality issues deserve to fail your pipelines. + +To prioritize your data quality concerns, think about real-life impact. A couple of guiding questions to consider are: + +- Are your numbers *customer-facing?* For example, maybe you work with temperature-tracking devices. Your customers rely on these devices to show them average temperatures on perishable goods like strawberries in-transit. What happens if the temperature of the strawberries reads as 300C when they know their refrigerated truck was working just fine? How is your brand perception impacted when the numbers are wrong? 
+- Are your numbers *used to make financial decisions?* For example, is the marketing team relying on your numbers to choose how to spend campaign funds? +- Are your numbers *executive-facing?* Will executives use these numbers to reallocate funds or shift priorities? + +We think these 3 categories above constitute high-impact, pipeline-failing events, and should be your top priorities. Of course, adjust priority order if your business context calls for it. + +Consult your list of data quality issues in the categories we mention above. Decide and mark if any are customer facing, used for financial decisions, or are executive-facing. Mark any data quality issues in those categories as “error”. These are your pipeline-failing events. + +If any data quality concerns fall outside of these 3 categories, we classify them as **nice-to-knows**. **Nice-to-know** data quality testing *can* be helpful. But if you don’t have a *specific action you can immediately take* when a nice-to-know quality test fails, the test *should be a warning, not an error.* + +You could also remove nice-to-know tests altogether. Data testing should drive action. The more alerts you have in your pipeline, the less action you will take. Configure alerts with care! + +However, we do think nice-to-know tests are worth keeping *if and only if* you are gathering evidence for action you plan to take within the next 6 months, like product feature research. In a scenario like that, those tests should still be set to warning. + +### Start your action plan + +Now, your data quality concerns are listed and prioritized. Next, add 1 or 2 initial debugging steps you will take if/when the issues surface. These steps should get added to your framework document. Additionally, consider adding them to a [test’s description.](https://discourse.getdbt.com/t/is-it-possible-to-add-a-description-to-singular-tests/5472/4) + +This step is *important.* Data quality testing should spur action, not accumulate alerts. 
Listing initial debugging steps for each concern will refine your list to the most critical elements. + +If you can't identify an action step for any quality issue, *remove it*. Put it on a backlog and research what you can do when it surfaces later. + +Here are a few examples from our list of unexpected behaviors above. + +- For calculated field X, a value above Y or below Z is not possible. + - *Debugging initial steps:* + - Use dbt test SQL or recent test results in dbt Explorer to find problematic rows + - Check these rows in staging and first transformed model + - Pinpoint where unusual values first appear +- Revenue shouldn’t change by more than X% in Y amount of time. + - *Debugging initial steps:* + - Check recent revenue values in staging model + - Identify transactions near min/max values + - Discuss outliers with sales ops team + +You have now written out a prioritized list of data quality concerns, as well as action steps to take when each concern surfaces. Next, consult [hub.getdbt.com](http://hub.getdbt.com) and find tests that address each of your highest priority concerns. [dbt-expectations](https://hub.getdbt.com/calogica/dbt_expectations/latest/) and [dbt_utils](https://hub.getdbt.com/dbt-labs/dbt_utils/latest/) are great places to start. + +The data tests you’ve marked as “errors” above should get error-level severity. Any concerns falling into that nice-to-know category should either *not get tested* or have their tests *set to warning.* + +Your data quality priorities list is a living reference document. We recommend linking it in your project’s README so that you can go back and edit it as your testing needs evolve. Additionally, developers in your project should have easy access to this document. Maintaining good data quality is everyone’s responsibility! + +As you try these ideas out, come to the dbt Community Slack and let us know what works and what doesn’t. Data is a community of practice, and we are eager to hear what comes out of yours. 
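The prioritization rule from this post can be sketched in a few lines of Python. The `Concern` class, its field names, and the sample concerns are hypothetical and exist only for illustration. The logic is taken from the post: customer-facing, financial, or executive-facing concerns become errors, nice-to-know concerns that gather evidence for a planned action become warnings, and everything else is left untested.

```python
from dataclasses import dataclass


@dataclass
class Concern:
    """One entry in a data quality priorities list (hypothetical structure)."""
    name: str
    customer_facing: bool = False
    financial_decisions: bool = False
    executive_facing: bool = False
    # True when the test gathers evidence for an action planned in the near term.
    evidence_for_planned_action: bool = False


def severity(concern):
    """Return "error", "warn", or None (meaning: do not test this concern)."""
    high_impact = (
        concern.customer_facing
        or concern.financial_decisions
        or concern.executive_facing
    )
    if high_impact:
        return "error"
    if concern.evidence_for_planned_action:
        return "warn"
    return None


# Made-up concerns, for illustration only.
concerns = [
    Concern("temperature reading out of range", customer_facing=True),
    Concern("campaign revenue swing", financial_decisions=True),
    Concern("feature usage drift", evidence_for_planned_action=True),
    Concern("free-text notes column is empty"),
]
plan = {c.name: severity(c) for c in concerns}
```

The values `error` and `warn` mirror the two severity levels a dbt data test can be configured with; a `None` result corresponds to moving the concern to a backlog instead of alerting on it.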
diff --git a/website/blog/authors.yml b/website/blog/authors.yml index 271130a477d..3070ec806b5 100644 --- a/website/blog/authors.yml +++ b/website/blog/authors.yml @@ -214,6 +214,14 @@ euan_johnston: - icon: fa-github url: https://github.com/euanjohnston-dev name: Euan Johnston +faith_mckenna: + image_url: /img/blog/authors/faith_pic.png + job_title: Senior Technical Instructor + links: + - icon: fa-linkedin + url: https://www.linkedin.com/in/faithlierheimer/ + name: Faith McKenna + organization: dbt Labs filip_byrén: image_url: /img/blog/authors/filip-eqt.png job_title: VP and Software Architect @@ -275,6 +283,14 @@ jeremy_cohen: job_title: Product Manager name: Jeremy Cohen organization: dbt Labs +jerrie_kumalah_kenney: + image_url: /img/blog/authors/jerrie.jpg + job_title: Resident Architect + links: + - icon: fa-linkedin + url: https://www.linkedin.com/in/jerriekumalah/ + name: Jerrie Kumalah Kenney + organization: dbt Labs jess_williams: image_url: /img/blog/authors/jess.png job_title: Head of Professional Services @@ -606,4 +622,4 @@ yu_ishikawa: - icon: fa-linkedin url: https://www.linkedin.com/in/yuishikawa0301 name: Yu Ishikawa - organization: Ubie \ No newline at end of file + organization: Ubie diff --git a/website/docs/docs/build/custom-target-names.md b/website/docs/docs/build/custom-target-names.md index ac7036de572..218fec4283d 100644 --- a/website/docs/docs/build/custom-target-names.md +++ b/website/docs/docs/build/custom-target-names.md @@ -24,6 +24,6 @@ To set a custom target name for a job in dbt Cloud, configure the **Target Name* ## dbt Cloud IDE -When developing in dbt Cloud, you can set a custom target name in your development credentials. Go to your account (from the gear menu in the top right hand corner), select the project under **Credentials**, and update the target name. +When developing in dbt Cloud, you can set a custom target name in your development credentials. 
Click your account name above the profile icon in the left panel, select **Account settings**, then go to **Credentials**. Choose the project to update the target name. diff --git a/website/docs/docs/build/python-models.md b/website/docs/docs/build/python-models.md index 811379a0d2c..28136f91e9c 100644 --- a/website/docs/docs/build/python-models.md +++ b/website/docs/docs/build/python-models.md @@ -660,6 +660,40 @@ models: **Docs:** ["Developer Guide: Snowpark Python"](https://docs.snowflake.com/en/developer-guide/snowpark/python/index.html) +#### Third-party Snowflake packages + +To use a third-party Snowflake package that isn't available in Snowflake Anaconda, upload your package by following [this example](https://docs.snowflake.com/en/developer-guide/udf/python/udf-python-packages#importing-packages-through-a-snowflake-stage), and then configure the `imports` setting in the dbt Python model to reference the zip file in your Snowflake stage. + +Here’s a complete example configuration using a zip file, including using `imports` in a Python model: + +```python + +import pandas as pd +import some_external_package # Hypothetical module name: replace with the module packaged in your zip file + +def model(dbt, session): + # Configure the model + dbt.config( + materialized="table", + imports=["@mystage/mycustompackage.zip"], # Specify the external package location + ) + + # Example data transformation using the imported package + # (Assuming `some_external_package` has a function we can call) + data = { + "name": ["Alice", "Bob", "Charlie"], + "score": [85, 90, 88] + } + df = pd.DataFrame(data) + + # Process data with the external package + df["adjusted_score"] = df["score"].apply(lambda x: some_external_package.adjust_score(x)) + + # Return the DataFrame as the model output + return df + +``` + +For more information on using this configuration, refer to [Snowflake's documentation](https://community.snowflake.com/s/article/how-to-use-other-python-packages-in-snowpark) on uploading and using other Python packages in Snowpark not published on Snowflake's Anaconda channel. + +
diff --git a/website/docs/docs/cloud-integrations/set-up-snowflake-native-app.md b/website/docs/docs/cloud-integrations/set-up-snowflake-native-app.md index 49e6f90e41f..ff151d4636e 100644 --- a/website/docs/docs/cloud-integrations/set-up-snowflake-native-app.md +++ b/website/docs/docs/cloud-integrations/set-up-snowflake-native-app.md @@ -45,7 +45,10 @@ The following are the prerequisites for dbt Cloud and Snowflake. Configure dbt Cloud and Snowflake Cortex to power the **Ask dbt** chatbot. 1. In dbt Cloud, browse to your Semantic Layer configurations. - 1. From the gear menu, select **Account settings**. In the left sidebar, select **Projects** and choose your dbt project from the project list. + + 1. Navigate to the left hand side panel and click your account name. From there, select **Account settings**. + 1. In the left sidebar, select **Projects** and choose your dbt project from the project list. + 1. In the **Project details** panel, click the **Edit Semantic Layer Configuration** link (which is below the **GraphQL URL** option). 1. In the **Semantic Layer Configuration Details** panel, identify the Snowflake credentials (which you'll use to access Snowflake Cortex) and the environment against which the Semantic Layer is run. Save the username, role, and the environment in a temporary location to use later on. @@ -67,7 +70,7 @@ Configure dbt Cloud and Snowflake Cortex to power the **Ask dbt** chatbot. ## Configure dbt Cloud Collect the following pieces of information from dbt Cloud to set up the application. -1. From the gear menu in dbt Cloud, select **Account settings**. In the left sidebar, select **API tokens > Service tokens**. Create a service token with access to all the projects you want to access in the dbt Snowflake Native App. Grant these permission sets: +1. Navigate to the left-hand side panel and click your account name. From there, select **Account settings**. Then click **API tokens > Service tokens**. 
Create a service token with access to all the projects you want to access in the dbt Snowflake Native App. Grant these permission sets: - **Manage marketplace apps** - **Job Admin** - **Metadata Only** diff --git a/website/docs/docs/cloud/manage-access/environment-permissions-setup.md b/website/docs/docs/cloud/manage-access/environment-permissions-setup.md index 1a3f2724819..5b41477e456 100644 --- a/website/docs/docs/cloud/manage-access/environment-permissions-setup.md +++ b/website/docs/docs/cloud/manage-access/environment-permissions-setup.md @@ -15,7 +15,7 @@ Environment-level permissions are not the same as account-level [role-based acce In your dbt Cloud account: -1. Open the **gear menu** and select **Account settings**. From the left-side menu, select **Groups & Licenses**. While you can edit existing groups, we recommend not altering the default `Everyone`, `Member`, and `Owner` groups. +1. Click your account name, above your profile icon on the left side panel, then select **Account settings**. From there, select **Groups & Licenses**. While you can edit existing groups, we recommend not altering the default `Everyone`, `Member`, and `Owner` groups. diff --git a/website/docs/docs/cloud/manage-access/set-up-bigquery-oauth.md b/website/docs/docs/cloud/manage-access/set-up-bigquery-oauth.md index 9a356814111..e528e2ebc1f 100644 --- a/website/docs/docs/cloud/manage-access/set-up-bigquery-oauth.md +++ b/website/docs/docs/cloud/manage-access/set-up-bigquery-oauth.md @@ -25,13 +25,14 @@ To use BigQuery in the dbt Cloud IDE, all developers must: ### Locate the redirect URI value To get started, locate the connection's redirect URI for configuring BigQuery OAuth. 
To do so: - - Select the gear menu in the upper left corner and choose **Account settings** + - Navigate to your account name, above your profile icon on the left side panel + - Select **Account settings** from the menu - From the left sidebar, select **Projects** - Choose the project from the list - Select **Connection** to edit the connection details - Locate the **Redirect URI** field under the **OAuth 2.0 Settings** section. Copy this value to your clipboard to use later on. - + ### Creating a BigQuery OAuth 2.0 client ID and secret To get started, you need to create a client ID and secret for [authentication](https://cloud.google.com/bigquery/docs/authentication) with BigQuery. This client ID and secret will be stored in dbt Cloud to manage the OAuth connection between dbt Cloud users and BigQuery. @@ -64,10 +65,12 @@ Now that you have an OAuth app set up in BigQuery, you'll need to add the client ### Authenticating to BigQuery Once the BigQuery OAuth app is set up for a dbt Cloud project, each dbt Cloud user will need to authenticate with BigQuery in order to use the IDE. To do so: -- Select the gear menu in the upper left corner and choose **Profile settings** +- Navigate to your account name, above your profile icon on the left side panel +- Select **Account settings** from the menu - From the left sidebar, select **Credentials** - Choose the project from the list - Select **Authenticate BigQuery Account** + You will then be redirected to BigQuery and asked to approve the drive, cloud platform, and BigQuery scopes, unless the connection is less privileged. 
diff --git a/website/docs/docs/core/connect-data-platform/teradata-setup.md b/website/docs/docs/core/connect-data-platform/teradata-setup.md index 7b964b23b3d..f4ffbe37f35 100644 --- a/website/docs/docs/core/connect-data-platform/teradata-setup.md +++ b/website/docs/docs/core/connect-data-platform/teradata-setup.md @@ -8,7 +8,7 @@ meta: github_repo: 'Teradata/dbt-teradata' pypi_package: 'dbt-teradata' min_core_version: 'v0.21.0' - cloud_support: Not Supported + cloud_support: Supported min_supported_version: 'n/a' slack_channel_name: '#db-teradata' slack_channel_link: 'https://getdbt.slack.com/archives/C027B6BHMT3' @@ -18,6 +18,7 @@ meta: Some core functionality may be limited. If you're interested in contributing, check out the source code in the repository listed in the next section. + import SetUpPages from '/snippets/_setup-pages-intro.md'; @@ -26,17 +27,17 @@ import SetUpPages from '/snippets/_setup-pages-intro.md'; ## Python compatibility -| Plugin version | Python 3.9 | Python 3.10 | Python 3.11 | -| -------------- | ----------- | ----------- | ------------ | -|1.0.0.x | ✅ | ❌ | ❌ -|1.1.x.x | ✅ | ✅ | ❌ -|1.2.x.x | ✅ | ✅ | ❌ -|1.3.x.x | ✅ | ✅ | ❌ -|1.4.x.x | ✅ | ✅ | ✅ -|1.5.x | ✅ | ✅ | ✅ -|1.6.x | ✅ | ✅ | ✅ -|1.7.x | ✅ | ✅ | ✅ -|1.8.x | ✅ | ✅ | ✅ +| Plugin version | Python 3.9 | Python 3.10 | Python 3.11 | Python 3.12 | +|----------------|------------|-------------|-------------|-------------| +| 1.0.0.x | ✅ | ❌ | ❌ | ❌ | +| 1.1.x.x | ✅ | ✅ | ❌ | ❌ | +| 1.2.x.x | ✅ | ✅ | ❌ | ❌ | +| 1.3.x.x | ✅ | ✅ | ❌ | ❌ | +| 1.4.x.x | ✅ | ✅ | ✅ | ❌ | +| 1.5.x | ✅ | ✅ | ✅ | ❌ | +| 1.6.x | ✅ | ✅ | ✅ | ❌ | +| 1.7.x | ✅ | ✅ | ✅ | ❌ | +| 1.8.x | ✅ | ✅ | ✅ | ✅ | ## dbt dependent packages version compatibility @@ -46,6 +47,8 @@ import SetUpPages from '/snippets/_setup-pages-intro.md'; | 1.6.7 | 1.6.7 | 1.1.1 | 1.1.1 | | 1.7.x | 1.7.x | 1.1.1 | 1.1.1 | | 1.8.x | 1.8.x | 1.1.1 | 1.1.1 | +| 1.8.x | 1.8.x | 1.2.0 | 1.2.0 | +| 1.8.x | 1.8.x | 1.3.0 | 1.3.0 | ### Connecting to Teradata 
diff --git a/website/docs/docs/dbt-versions/release-notes.md b/website/docs/docs/dbt-versions/release-notes.md index c3f0bbfbe06..6759a026e02 100644 --- a/website/docs/docs/dbt-versions/release-notes.md +++ b/website/docs/docs/dbt-versions/release-notes.md @@ -24,6 +24,7 @@ Release notes are grouped by month for both multi-tenant and virtual private clo - Improved handling of queries when multiple tables are selected in a data source. - Fixed a bug when an IN filter contained a lot of values. - Better error messaging for queries that can't be parsed correctly. +- **Enhancement**: The dbt Semantic Layer supports creating new credentials for users who don't have permissions to create service tokens. In the **Credentials & service tokens** side panel, the **+Add Service Token** option is unavailable for those users who don't have permission. Instead, the side panel displays a message indicating that the user doesn't have permission to create a service token and should contact their administrator. Refer to [Set up dbt Semantic Layer](/docs/use-dbt-semantic-layer/setup-sl) for more details. ## October 2024 diff --git a/website/docs/docs/dbt-versions/upgrade-dbt-version-in-cloud.md b/website/docs/docs/dbt-versions/upgrade-dbt-version-in-cloud.md index 35758d46afd..cfe27d5e9d7 100644 --- a/website/docs/docs/dbt-versions/upgrade-dbt-version-in-cloud.md +++ b/website/docs/docs/dbt-versions/upgrade-dbt-version-in-cloud.md @@ -23,7 +23,7 @@ To upgrade an environment in the [dbt Cloud Admin API](/docs/dbt-cloud-apis/admi Configure your project to use a different dbt Core version than what's configured in your [development environment](/docs/dbt-cloud-environments#types-of-environments). This _override_ only affects your user account, no one else's. Use this to safely test new dbt features before upgrading the dbt version for your projects. -1. From the gear menu, select **Profile settings**. +1. Click your account name from the left side panel and select **Account settings**. 
1. Choose **Credentials** from the sidebar and select a project. This opens a side panel. 1. In the side panel, click **Edit** and scroll to the **User development settings** section. Choose a version from the **dbt version** dropdown and click **Save**. @@ -41,7 +41,7 @@ Configure your project to use a different dbt Core version than what's configure Each job in dbt Cloud can be configured to inherit parameters from the environment it belongs to. - + The example job seen in the screenshot above belongs to the environment "Prod". It inherits the dbt version of its environment as shown by the **Inherited from ENVIRONMENT_NAME (DBT_VERSION)** selection. You may also manually override the dbt version of a specific job to be any of the current Core releases supported by Cloud by selecting another option from the dropdown. diff --git a/website/docs/docs/deploy/ci-jobs.md b/website/docs/docs/deploy/ci-jobs.md index 12d880d1543..7ab7f65796d 100644 --- a/website/docs/docs/deploy/ci-jobs.md +++ b/website/docs/docs/deploy/ci-jobs.md @@ -95,11 +95,15 @@ Automatically test your semantic nodes (metrics, semantic models, and saved quer To do this, add the command `dbt sl validate --select state:modified+` in the CI job. This ensures the validation of modified semantic nodes and their downstream dependencies. + + +#### Benefits - Testing semantic nodes in a CI job supports deferral and selection of semantic nodes. - It allows you to catch issues early in the development process and deliver high-quality data to your end users. - Semantic validation executes an explain query in the data warehouse for semantic nodes to ensure the generated SQL will execute. - For semantic nodes and models that aren't downstream of modified models, dbt Cloud defers to the production models +### Set up semantic validations in your CI job To learn how to set this up, refer to the following steps: 1. Navigate to the **Job setting** page and click **Edit**. 
diff --git a/website/docs/docs/deploy/job-notifications.md b/website/docs/docs/deploy/job-notifications.md index 62c51461ab2..d898bc813e0 100644 --- a/website/docs/docs/deploy/job-notifications.md +++ b/website/docs/docs/deploy/job-notifications.md @@ -56,7 +56,7 @@ If there has been a change in user roles or Slack permissions where you no longe ::: ### Prerequisites -- You must be an administrator of the Slack workspace. +- You must be a Slack Workspace Owner. - You must be an account admin to configure Slack notifications in dbt Cloud. For more details, refer to [Users and licenses](/docs/cloud/manage-access/seats-and-users). - The integration only supports _public_ channels in the Slack workspace. diff --git a/website/docs/docs/deploy/webhooks.md b/website/docs/docs/deploy/webhooks.md index ffea38b5b84..52ce2a1fe56 100644 --- a/website/docs/docs/deploy/webhooks.md +++ b/website/docs/docs/deploy/webhooks.md @@ -36,7 +36,7 @@ You can also check out the free [dbt Fundamentals course](https://learn.getdbt.c ## Create a webhook subscription {#create-a-webhook-subscription} -From your **Account Settings** in dbt Cloud (using the gear menu in the top right corner), click **Create New Webhook** in the **Webhooks** section. You can find the appropriate dbt Cloud access URL for your region and plan with [Regions & IP addresses](/docs/cloud/about-cloud/access-regions-ip-addresses). +Navigate to **Account settings** in dbt Cloud (by clicking your account name from the left side panel), and click **Create New Webhook** in the **Webhooks** section. You can find the appropriate dbt Cloud access URL for your region and plan with [Regions & IP addresses](/docs/cloud/about-cloud/access-regions-ip-addresses). 
 To configure your new webhook:
diff --git a/website/docs/faqs/Accounts/change-billing.md b/website/docs/faqs/Accounts/change-billing.md
index 11290728c98..2b2aa607c16 100644
--- a/website/docs/faqs/Accounts/change-billing.md
+++ b/website/docs/faqs/Accounts/change-billing.md
@@ -6,6 +6,6 @@ id: change-billing
 ---
-If you want to change your account's credit card details, select the gear menu in the upper right corner of dbt Cloud. Go to Account Settings → Billing → Payment Information. Enter the new credit card details on the respective fields then click on **Update payment information**. Only the _account owner_ can make this change.
+If you want to change your account's credit card details, go to the left side panel, click **Account settings** → **Billing** → scroll to **Payment information**. Enter the new credit card details on the respective fields then click on **Update payment information**. Only the _account owner_ can make this change.
 To change your billing name or location address, send our Support team a message at support@getdbt.com with the newly updated information, and we can make that change for you!
diff --git a/website/docs/guides/azure-synapse-analytics-qs.md b/website/docs/guides/azure-synapse-analytics-qs.md
index 4f0285e6623..94beddfec80 100644
--- a/website/docs/guides/azure-synapse-analytics-qs.md
+++ b/website/docs/guides/azure-synapse-analytics-qs.md
@@ -92,7 +92,7 @@ In this quickstart guide, you'll learn how to use dbt Cloud with [Azure Synapse
 ## Connect dbt Cloud to Azure Synapse Analytics
-1. Create a new project in dbt Cloud. Open the gear menu in the top right corner, select **Account settings** and click **+ New Project**.
+1. Create a new project in dbt Cloud. Click on your account name in the left side menu, select **Account settings**, and click **+ New Project**.
 2. Enter a project name and click **Continue**.
 3. Choose **Synapse** as your connection and click **Next**.
 4. In the **Configure your environment** section, enter the **Settings** for your new project:
diff --git a/website/docs/guides/bigquery-qs.md b/website/docs/guides/bigquery-qs.md
index 19a4ff8fbb0..0820c23934d 100644
--- a/website/docs/guides/bigquery-qs.md
+++ b/website/docs/guides/bigquery-qs.md
@@ -85,7 +85,7 @@ In order to let dbt connect to your warehouse, you'll need to generate a keyfile
 3. Create a service account key for your new project from the [Service accounts page](https://console.cloud.google.com/iam-admin/serviceaccounts?walkthrough_id=iam--create-service-account-keys&start_index=1#step_index=1). For more information, refer to [Create a service account key](https://cloud.google.com/iam/docs/creating-managing-service-account-keys#creating) in the Google Cloud docs. When downloading the JSON file, make sure to use a filename you can easily remember. For example, `dbt-user-creds.json`. For security reasons, dbt Labs recommends that you protect this JSON file like you would your identity credentials; for example, don't check the JSON file into your version control software.
 ## Connect dbt Cloud to BigQuery
-1. Create a new project in [dbt Cloud](/docs/cloud/about-cloud/access-regions-ip-addresses). From **Account settings** (using the gear menu in the top right corner), click **+ New Project**.
+1. Create a new project in [dbt Cloud](/docs/cloud/about-cloud/access-regions-ip-addresses). Navigate to **Account settings** (by clicking on your account name in the left side menu), and click **+ New Project**.
 2. Enter a project name and click **Continue**.
 3. For the warehouse, click **BigQuery** then **Next** to set up your connection.
 4. Click **Upload a Service Account JSON File** in settings.
diff --git a/website/docs/guides/microsoft-fabric-qs.md b/website/docs/guides/microsoft-fabric-qs.md
index 157ab2e6b89..6bacf4177df 100644
--- a/website/docs/guides/microsoft-fabric-qs.md
+++ b/website/docs/guides/microsoft-fabric-qs.md
@@ -101,7 +101,7 @@ In this quickstart guide, you'll learn how to use dbt Cloud with [Microsoft Fabr
 ## Connect dbt Cloud to Microsoft Fabric
-1. Create a new project in dbt Cloud. From **Account settings** (using the gear menu in the top right corner), click **+ New Project**.
+1. Create a new project in dbt Cloud. Navigate to **Account settings** (by clicking on your account name in the left side menu), and click **+ New Project**.
 2. Enter a project name and click **Continue**.
 3. Choose **Fabric** as your connection and click **Next**.
 4. In the **Configure your environment** section, enter the **Settings** for your new project:
diff --git a/website/docs/guides/redshift-qs.md b/website/docs/guides/redshift-qs.md
index 8b950472506..83fafad1d12 100644
--- a/website/docs/guides/redshift-qs.md
+++ b/website/docs/guides/redshift-qs.md
@@ -170,7 +170,7 @@ Now we are going to load our sample data into the S3 bucket that our Cloudformat
 ```
 ## Connect dbt Cloud to Redshift
-1. Create a new project in [dbt Cloud](/docs/cloud/about-cloud/access-regions-ip-addresses). From **Account settings** (using the gear menu in the top right corner), click **+ New Project**.
+1. Create a new project in [dbt Cloud](/docs/cloud/about-cloud/access-regions-ip-addresses). Navigate to **Account settings** (by clicking on your account name in the left side menu), and click **+ New Project**.
 2. Enter a project name and click **Continue**.
 3. For the warehouse, click **Redshift** then **Next** to set up your connection.
 4. Enter your Redshift settings. Reference your credentials you saved from the CloudFormation template.
diff --git a/website/docs/guides/sl-snowflake-qs.md b/website/docs/guides/sl-snowflake-qs.md
index b5a0e559c5b..d9de3f0e5fd 100644
--- a/website/docs/guides/sl-snowflake-qs.md
+++ b/website/docs/guides/sl-snowflake-qs.md
@@ -291,7 +291,7 @@ Using Partner Connect allows you to create a complete dbt account with your [Sno
 5. After you have filled out the form and clicked **Complete Registration**, you will be logged into dbt Cloud automatically.
-6. From your **Account Settings** in dbt Cloud (using the gear menu in the upper right corner), choose the "Partner Connect Trial" project and select **snowflake** in the overview table. Select **Edit** and update the **Database** field to `analytics` and the **Warehouse** field to `transforming`.
+6. Click your account name in the left side menu and select **Account settings**, choose the "Partner Connect Trial" project, and select **snowflake** in the overview table. Select **Edit** and update the **Database** field to `analytics` and the **Warehouse** field to `transforming`.
@@ -301,7 +301,7 @@ Using Partner Connect allows you to create a complete dbt account with your [Sno
-1. Create a new project in dbt Cloud. From **Account settings** (using the gear menu in the top right corner), click **+ New Project**.
+1. Create a new project in dbt Cloud. Navigate to **Account settings** (by clicking on your account name in the left side menu), and click **+ New Project**.
 2. Enter a project name and click **Continue**.
 3. For the warehouse, click **Snowflake** then **Next** to set up your connection.
diff --git a/website/docs/guides/snowflake-qs.md b/website/docs/guides/snowflake-qs.md
index bc27d1e1a4f..1eae3a13fb0 100644
--- a/website/docs/guides/snowflake-qs.md
+++ b/website/docs/guides/snowflake-qs.md
@@ -170,7 +170,7 @@ Using Partner Connect allows you to create a complete dbt account with your [Sno
 5. After you have filled out the form and clicked **Complete Registration**, you will be logged into dbt Cloud automatically.
-6. From your **Account Settings** in dbt Cloud (using the gear menu in the upper right corner), choose the "Partner Connect Trial" project and select **snowflake** in the overview table. Select edit and update the fields **Database** and **Warehouse** to be `analytics` and `transforming`, respectively.
+6. Go to the left side menu and click your account name, then select **Account settings**, choose the "Partner Connect Trial" project, and select **snowflake** in the overview table. Select edit and update the fields **Database** and **Warehouse** to be `analytics` and `transforming`, respectively.
@@ -180,7 +180,7 @@ Using Partner Connect allows you to create a complete dbt account with your [Sno
-1. Create a new project in dbt Cloud. From **Account settings** (using the gear menu in the top right corner), click **+ New Project**.
+1. Create a new project in dbt Cloud. Navigate to **Account settings** (by clicking on your account name in the left side menu), and click **+ New Project**.
 2. Enter a project name and click **Continue**.
 3. For the warehouse, click **Snowflake** then **Next** to set up your connection.
diff --git a/website/snippets/_new-sl-setup.md b/website/snippets/_new-sl-setup.md
index 39cd2b22b9a..eccd6db4c09 100644
--- a/website/snippets/_new-sl-setup.md
+++ b/website/snippets/_new-sl-setup.md
@@ -35,17 +35,22 @@ This credential controls the physical access to underlying data accessed by the
 *If you're on a Team plan and need to add more credentials, consider upgrading to our [Enterprise plan](https://www.getdbt.com/contact). Enterprise users can refer to [Add more credentials](#4-add-more-credentials) for detailed steps on adding multiple credentials.*
-1. After selecting the deployment environment, you should see the **Credentials & service tokens** page.
-2. Click the **Add Semantic Layer credential** button.
-3. In the **1. Add credentials** section, enter the credentials specific to your data platform that you want the Semantic Layer to use.
+#### 1. Select deployment environment
+ - After selecting the deployment environment, you should see the **Credentials & service tokens** page.
+ - Click the **Add Semantic Layer credential** button.
+
+#### 2. Configure credential
+ - In the **1. Add credentials** section, enter the credentials specific to your data platform that you want the Semantic Layer to use.
  - Use credentials with minimal privileges. The Semantic Layer requires read access to the schema(s) containing the dbt models used in your semantic models for downstream applications
-
-4. After adding credentials, scroll to **2. Map new service token**.
-5. Name the token and ensure the permission set includes 'Semantic Layer Only' and 'Metadata Only'.
-6. Click **Save**. Once the token is generated, you won't be able to view this token again so make sure to record it somewhere safe.
+#### 3. Create or link service tokens
+ - If you have permission to create service tokens, you’ll see the [**Map new service token** option](/docs/use-dbt-semantic-layer/setup-sl#map-service-tokens-to-credentials) after adding the credential. Name the token, set permissions to 'Semantic Layer Only' and 'Metadata Only', and click **Save**.
+ - Once the token is generated, you won't be able to view this token again, so make sure to record it somewhere safe.
+ - If you don’t have access to create service tokens, you’ll see a message prompting you to contact your admin to create one for you. Admins can create and link tokens as needed.
+
 :::info
 - Team plans can create multiple service tokens that link to a single underlying credential, but each project can only have one credential.
@@ -67,26 +72,28 @@ dbt Cloud Enterprise plans can optionally add multiple credentials and map them
 We recommend configuring credentials and service tokens to reflect your teams and their roles. For example, create tokens or credentials that align with your team's needs, such as providing access to finance-related schemas to the Finance team.
-Note that:
+
+
 - Admins can link multiple service tokens to a single credential within a project, but each service token can only be linked to one credential per project.
 - When you send a request through the APIs, the service token of the linked credential will follow access policies of the underlying view and tables used to build your semantic layer requests.
-
-
-To add multiple credentials and map them to service tokens:
-
-1. After configuring your environment, on the **Credentials & service tokens** page, click the **Add Semantic Layer credential** button to create multiple credentials and map them to a service token.
-2. In the **1. Add credentials** section, fill in the data platform's credential fields. We recommend using “read-only” credentials.
-
-
-3. In the **2. Map new service token** section, map a service token to the credential you configured in the previous step. dbt Cloud automatically selects the service token permission set you need (Semantic Layer Only and Metadata Only).
-
-4. To add another service token during configuration, click **Add Service Token**.
-5. You can link more service tokens to the same credential later on in the **Semantic Layer Configuration Details** page. To add another service token to an existing Semantic Layer configuration, click **Add service token** under the **Linked service tokens** section.
-6. Click **Save** to link the service token to the credential. Remember to copy and save the service token securely, as it won't be viewable again after generation.
+
+
+#### 1. Add more credentials
+- After configuring your environment, on the **Credentials & service tokens** page, click the **Add Semantic Layer credential** button to create multiple credentials and map them to a service token.
+- In the **1. Add credentials** section, fill in the data platform's credential fields. We recommend using “read-only” credentials.
+
+
+#### 2. Map service tokens to credentials
+- In the **2. Map new service token** section, [map a service token to the credential](/docs/use-dbt-semantic-layer/setup-sl#map-service-tokens-to-credentials) you configured in the previous step. dbt Cloud automatically selects the service token permission set you need (Semantic Layer Only and Metadata Only).
+- To add another service token during configuration, click **Add Service Token**.
+- You can link more service tokens to the same credential later on in the **Semantic Layer Configuration Details** page. To add another service token to an existing Semantic Layer configuration, click **Add service token** under the **Linked service tokens** section.
+- Click **Save** to link the service token to the credential. Remember to copy and save the service token securely, as it won't be viewable again after generation.
-7. To delete a credential, go back to the **Credentials & service tokens** page.
-8. Under **Linked Service Tokens**, click **Edit** and, select **Delete Credential** to remove a credential.
+#### 3. Delete credentials
+- To delete a credential, go back to the **Credentials & service tokens** page.
+- Under **Linked Service Tokens**, click **Edit** and, select **Delete Credential** to remove a credential.
 When you delete a credential, any service tokens mapped to that credential in the project will no longer work and will break for any end users.
@@ -107,6 +114,15 @@ To re-enable the dbt Semantic Layer setup in the future, you will need to recrea
 The following are the additional flexible configurations for Semantic Layer credentials.
+### Map service tokens to credentials
+- After configuring your environment, you can map additional service tokens to the same credential if you have the required [permissions](/docs/cloud/manage-access/about-user-access#permission-sets).
+- Go to the **Credentials & service tokens** page and click the **+Add Service Token** button in the **Linked Service Tokens** section.
+- Type the service token name and select the permission set you need (Semantic Layer Only and Metadata Only).
+- Click **Save** to link the service token to the credential.
+- Remember to copy and save the service token securely, as it won't be viewable again after generation.
+
+
+
 ### Unlink service tokens
 - Unlink a service token from the credential by clicking **Unlink** under the **Linked service tokens** section. If you try to query the Semantic Layer with an unlinked credential, you'll experience an error in your BI tool because no valid token is mapped.
@@ -115,7 +131,7 @@ To re-enable the dbt Semantic Layer setup in the future, you will need to recrea
 - View your Semantic Layer credential directly by navigating to the **API tokens** and then **Service tokens** page.
 - Select the service token to view the credential it's linked to. This is useful if you want to know which service tokens are mapped to credentials in your project.
-**Create a new service token**
+#### Create a new service token
 - From the **Service tokens** page, create a new service token and map it to the credential(s) (assuming the semantic layer permission exists). This is useful if you want to create a new service token and directly map it to a credential in your project.
 - Make sure to select the correct permission set for the service token (Semantic Layer Only and Metadata Only).
diff --git a/website/static/img/blog/authors/faith_pic.png b/website/static/img/blog/authors/faith_pic.png
new file mode 100644
index 00000000000..3635183bba3
Binary files /dev/null and b/website/static/img/blog/authors/faith_pic.png differ
diff --git a/website/static/img/blog/authors/jerrie.jpg b/website/static/img/blog/authors/jerrie.jpg
new file mode 100644
index 00000000000..9ae49d2fffe
Binary files /dev/null and b/website/static/img/blog/authors/jerrie.jpg differ
diff --git a/website/static/img/docs/dbt-cloud/cloud-configuring-dbt-cloud/choosing-dbt-version/job-settings.png b/website/static/img/docs/dbt-cloud/cloud-configuring-dbt-cloud/choosing-dbt-version/job-settings.png
index 8048df4c67a..bbbd852efbe 100644
Binary files a/website/static/img/docs/dbt-cloud/cloud-configuring-dbt-cloud/choosing-dbt-version/job-settings.png and b/website/static/img/docs/dbt-cloud/cloud-configuring-dbt-cloud/choosing-dbt-version/job-settings.png differ
diff --git a/website/static/img/docs/dbt-cloud/deployment/sl-ci-job.png b/website/static/img/docs/dbt-cloud/deployment/sl-ci-job.png
new file mode 100644
index 00000000000..e64822e1fe7
Binary files /dev/null and b/website/static/img/docs/dbt-cloud/deployment/sl-ci-job.png differ
diff --git a/website/static/img/docs/dbt-cloud/semantic-layer/sl-add-service-token.gif b/website/static/img/docs/dbt-cloud/semantic-layer/sl-add-service-token.gif
new file mode 100644
index 00000000000..a27df85e8ec
Binary files /dev/null and b/website/static/img/docs/dbt-cloud/semantic-layer/sl-add-service-token.gif differ
diff --git a/website/static/img/docs/dbt-cloud/semantic-layer/sl-credential-no-service-token.jpg b/website/static/img/docs/dbt-cloud/semantic-layer/sl-credential-no-service-token.jpg
new file mode 100644
index 00000000000..5a6ab83d96b
Binary files /dev/null and b/website/static/img/docs/dbt-cloud/semantic-layer/sl-credential-no-service-token.jpg differ
diff --git a/website/static/img/docs/dbt-cloud/using-dbt-cloud/dbt-cloud-enterprise/BQ-auth/dbt-cloud-bq-id-secret-02.png b/website/static/img/docs/dbt-cloud/using-dbt-cloud/dbt-cloud-enterprise/BQ-auth/dbt-cloud-bq-id-secret-02.png
new file mode 100644
index 00000000000..40d1a6b3be8
Binary files /dev/null and b/website/static/img/docs/dbt-cloud/using-dbt-cloud/dbt-cloud-enterprise/BQ-auth/dbt-cloud-bq-id-secret-02.png differ
diff --git a/website/static/img/docs/dbt-versions/experimental-feats.png b/website/static/img/docs/dbt-versions/experimental-feats.png
index 93764f66b7c..fb1a4dbaf87 100644
Binary files a/website/static/img/docs/dbt-versions/experimental-feats.png and b/website/static/img/docs/dbt-versions/experimental-feats.png differ