Commit c13eb9e

Merge branch 'current' into mirnawong1-patch-22

mirnawong1 authored Jan 25, 2024
2 parents f7b9968 + 191f4fb
Showing 34 changed files with 354 additions and 198 deletions.
2 changes: 1 addition & 1 deletion contributing/developer-blog.md
@@ -6,7 -6,7 @@

The dbt Developer Blog is a place where analytics practitioners can go to share their knowledge with the community. Analytics Engineering is a discipline we’re all building together. The developer blog exists to cultivate the collective knowledge that exists on how to build and scale effective data teams.

We currently have editorial capacity for 10 Community contributed developer blogs per quarter - if we are oversubscribed we suggest you post on another platform or hold off until the editorial team is ready to take on more posts.
We currently have editorial capacity for a few Community contributed developer blogs per quarter - if we are oversubscribed we suggest you post on another platform or hold off until the editorial team is ready to take on more posts.

### What makes a good developer blog post?

@@ -21,13 +21,11 @@ We're not limited to just passing measures through to our metrics, we can also _

```YAML
  - name: food_revenue
    description: The revenue from food in each order.
    label: Food Revenue
    type: simple
    type_params:
      measure: revenue
    filter: |
      {{ Dimension('order__is_food_order') }} = true
    description: The revenue from food in each order.
    label: Food Revenue
    type: simple
    type_params:
      measure: food_revenue
```
- 📝 Now we can set up our ratio metric.
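A sketch of what that ratio metric could look like, assuming the `food_revenue` metric above and an existing `revenue` metric to use as the denominator (the exact names are illustrative):

```YAML
  - name: food_revenue_pct
    description: The percentage of order revenue that comes from food.
    label: Food Revenue %
    type: ratio
    type_params:
      numerator: food_revenue
      denominator: revenue
```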
@@ -8,6 +8,10 @@ id: 1-how-we-style-our-dbt-models
- 👥 Models should be pluralized, for example, `customers`, `orders`, `products`.
- 🔑 Each model should have a primary key.
- 🔑 The primary key of a model should be named `<object>_id`, for example, `account_id`. This makes it easier to know what `id` is being referenced in downstream joined models.
- Use underscores for naming dbt models; avoid dots.
  - ✅ `models_without_dots`
  - ❌ `models.with.dots`
  - Most data platforms use dots to separate `database.schema.object`, so using underscores instead of dots reduces your need for [quoting](/reference/resource-properties/quoting) as well as the risk of issues in certain parts of dbt Cloud. For more background, refer to [this GitHub issue](https://github.com/dbt-labs/dbt-core/issues/3246).
- 🔑 Keys should be string data types.
- 🔑 Consistency is key! Use the same field names across models where possible. For example, a key to the `customers` table should be named `customer_id` rather than `user_id` or `id`.
- ❌ Do not use abbreviations or aliases. Emphasize readability over brevity. For example, do not use `cust` for `customer` or `o` for `orders`.
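To make these conventions concrete, here is a minimal sketch of a model that follows them (the source and column names are hypothetical):

```sql
-- models/customers.sql: pluralized name, underscores only, string primary key named customer_id
select
    cast(id as varchar) as customer_id,  -- primary key follows the <object>_id pattern
    first_name,
    last_name,
    created_at
from {{ source('crm', 'customers') }}
```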
13 changes: 8 additions & 5 deletions website/docs/docs/build/conversion-metrics.md
@@ -32,16 +32,20 @@ The specification for conversion metrics is as follows:
| `constant_properties` | List of constant properties. | List | Optional |
| `base_property` | The property from the base semantic model that you want to hold constant. | Entity or Dimension | Optional |
| `conversion_property` | The property from the conversion semantic model that you want to hold constant. | Entity or Dimension | Optional |
| `fill_nulls_with` | Set the value in your metric definition instead of null (such as zero). | String | Optional |

Refer to [additional settings](#additional-settings) to learn how to customize conversion metrics with settings for null values, calculation type, and constant properties.

The following code example displays the complete specification for conversion metrics and details how they're applied:

```yaml
metrics:
- name: The metric name # Required
description: the metric description # Optional
description: The metric description # Optional
type: conversion # Required
label: # Required
type_params: # Required
fill_nulls_with: Set the value in your metric definition instead of null (such as zero) # Optional
conversion_type_params: # Required
entity: ENTITY # Required
calculation: CALCULATION_TYPE # Optional. default: conversion_rate. options: conversions(buys) or conversion_rate (buys/visits), and more to come.
@@ -89,6 +93,7 @@ Next, define a conversion metric as follows:
type: conversion
label: Visit to Buy Conversion Rate (7-day window)
type_params:
fill_nulls_with: 0
conversion_type_params:
base_measure: visits
conversion_measure: sellers
@@ -117,7 +122,7 @@ inner join (
select *, uuid_string() as uuid from buys -- Adds a uuid column to uniquely identify the different rows
) b
on
v.user_id = b.user_id and v.ds <= b.ds and v.ds > b.ds - interval '7 day'
v.user_id = b.user_id and v.ds <= b.ds and v.ds > b.ds - interval '7 days'
```

The dataset returns the following (note that there are two potential conversion events for the first visit):
@@ -147,7 +152,6 @@ inner join (
) b
on
v.user_id = b.user_id and v.ds <= b.ds and v.ds > b.ds - interval '7 day'
```

The dataset returns the following:
@@ -249,7 +253,7 @@ Use the following additional settings to customize your conversion metrics:
To return zero in the final data set, you can set the value of a null conversion event to zero instead of null. You can add the `fill_nulls_with` parameter to your conversion metric definition like this:

```yaml
- name: vist_to_buy_conversion_rate_7_day_window
- name: visit_to_buy_conversion_rate_7_day_window
description: "Conversion rate from viewing a page to making a purchase"
type: conversion
label: Visit to Seller Conversion Rate (7 day window)
@@ -345,7 +349,6 @@ on
and v.ds <= buy_source.ds
and v.ds > buy_source.ds - interval '7 day'
and buy_source.product_id = v.product_id --Joining on the constant property product_id
```

</TabItem>
27 changes: 17 additions & 10 deletions website/docs/docs/build/cumulative-metrics.md
@@ -20,6 +20,7 @@ This metric is common for calculating things like weekly active users, or month-
| `measure` | The measure you are referencing. | Required |
| `window` | The accumulation window, such as 1 month, 7 days, 1 year. This can't be used with `grain_to_date`. | Optional |
| `grain_to_date` | Sets the accumulation grain. For example, `month` will accumulate data for one month, then restart at the beginning of the next. This can't be used with `window`. | Optional |
| `fill_nulls_with` | Set the value in your metric definition instead of null (such as zero).| Optional |

The following displays the complete specification for cumulative metrics, along with an example:

@@ -30,13 +31,15 @@ metrics:
type: cumulative # Required
label: The value that will be displayed in downstream tools # Required
type_params: # Required
fill_nulls_with: Set the value in your metric definition instead of null (such as zero) # Optional
measure: The measure you are referencing # Required
window: The accumulation window, such as 1 month, 7 days, 1 year. # Optional. Cannot be used with grain_to_date
grain_to_date: Sets the accumulation grain, such as month will accumulate data for one month, then restart at the beginning of the next. # Optional. Cannot be used with window

```

## Limitations

Cumulative metrics are currently under active development and have the following limitations:
- You are required to use [`metric_time` dimension](/docs/build/dimensions#time) when querying cumulative metrics. If you don't use `metric_time` in the query, the cumulative metric will return incorrect results because it won't perform the time spine join. This means you cannot reference time dimensions other than the `metric_time` in the query.
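For example, here's a query sketch that satisfies this requirement by grouping on `metric_time` (the metric name is borrowed from the example further down, and the dbt Core `mf` equivalent is shown alongside it):

```bash
# dbt Cloud CLI
dbt sl query --metrics cumulative_order_total_l1m --group-by metric_time__month

# dbt Core (MetricFlow CLI)
mf query --metrics cumulative_order_total_l1m --group-by metric_time__month
```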

@@ -59,19 +62,22 @@ metrics:
description: The cumulative value of all orders
type: cumulative
type_params:
fill_nulls_with: 0
measure: order_total
- name: cumulative_order_total_l1m
label: Cumulative Order total (L1M)
description: Trailing 1 month cumulative order amount
type: cumulative
type_params:
fill_nulls_with: 0
measure: order_total
window: 1 month
- name: cumulative_order_total_mtd
label: Cumulative Order total (MTD)
description: The month to date value of all orders
type: cumulative
type_params:
fill_nulls_with: 0
measure: order_total
grain_to_date: month
```
@@ -201,16 +207,16 @@ The current method connects the metric table to a timespine table using the prim

``` sql
select
count(distinct distinct_users) as weekly_active_users
, metric_time
count(distinct distinct_users) as weekly_active_users,
metric_time
from (
select
subq_3.distinct_users as distinct_users
, subq_3.metric_time as metric_time
subq_3.distinct_users as distinct_users,
subq_3.metric_time as metric_time
from (
select
subq_2.distinct_users as distinct_users
, subq_1.metric_time as metric_time
subq_2.distinct_users as distinct_users,
subq_1.metric_time as metric_time
from (
select
metric_time
@@ -223,8 +229,8 @@
) subq_1
inner join (
select
distinct_users as distinct_users
, date_trunc('day', ds) as metric_time
distinct_users as distinct_users,
date_trunc('day', ds) as metric_time
from demo_schema.transactions transactions_src_426
where (
(date_trunc('day', ds)) >= cast('1999-12-26' as timestamp)
@@ -241,6 +247,7 @@
) subq_3
)
group by
metric_time
limit 100
metric_time
limit 100;
```
16 changes: 11 additions & 5 deletions website/docs/docs/build/derived-metrics.md
@@ -21,6 +21,7 @@ In MetricFlow, derived metrics are metrics created by defining an expression usi
| `metrics` | The list of metrics used in the derived metrics. | Required |
| `alias` | Optional alias for the metric that you can use in the expr. | Optional |
| `filter` | Optional filter to apply to the metric. | Optional |
| `fill_nulls_with` | Set the value in your metric definition instead of null (such as zero). | Optional |
| `offset_window` | Set the period for the offset window, such as 1 month. This will return the value of the metric one month from the metric time. | Optional |

The following displays the complete specification for derived metrics, along with an example.
@@ -32,6 +33,7 @@ metrics:
type: derived # Required
label: The value that will be displayed in downstream tools #Required
type_params: # Required
fill_nulls_with: Set the value in your metric definition instead of null (such as zero) # Optional
expr: the derived expression # Required
metrics: # The list of metrics used in the derived metrics # Required
- name: the name of the metrics. must reference a metric you have already defined # Required
@@ -49,6 +51,7 @@ metrics:
type: derived
label: Order Gross Profit
type_params:
fill_nulls_with: 0
expr: revenue - cost
metrics:
- name: order_total
@@ -60,6 +63,7 @@ metrics:
description: "The gross profit for each food order."
type: derived
type_params:
fill_nulls_with: 0
expr: revenue - cost
metrics:
- name: order_total
@@ -96,6 +100,7 @@ The following example displays how you can calculate monthly revenue growth usin
description: Percentage of customers that are active now and those active 1 month ago
label: customer_retention
type_params:
fill_nulls_with: 0
expr: (active_customers / active_customers_prev_month)
metrics:
- name: active_customers
@@ -115,6 +120,7 @@ You can query any granularity and offset window combination. The following examp
type: derived
label: d7 Bookings Change
type_params:
fill_nulls_with: 0
expr: bookings - bookings_7_days_ago
metrics:
- name: bookings
@@ -126,10 +132,10 @@ You can query any granularity and offset window combination. The following examp

When you run the query `dbt sl query --metrics d7_booking_change --group-by metric_time__month` for the metric, here's how it's calculated. For dbt Core, you can use the `mf query` prefix.

1. We retrieve the raw, unaggregated dataset with the specified measures and dimensions at the smallest level of detail, which is currently 'day'.
2. Then, we perform an offset join on the daily dataset, followed by performing a date trunc and aggregation to the requested granularity.
1. Retrieve the raw, unaggregated dataset with the specified measures and dimensions at the smallest level of detail, which is currently 'day'.
2. Then, perform an offset join on the daily dataset, followed by performing a date trunc and aggregation to the requested granularity.
For example, to calculate `d7_booking_change` for July 2017:
- First, we sum up all the booking values for each day in July to calculate the bookings metric.
- First, sum up all the booking values for each day in July to calculate the bookings metric.
- The following table displays the range of days that make up this monthly aggregation.

| | Orders | Metric_time |
@@ -139,7 +145,7 @@ When you run the query `dbt sl query --metrics d7_booking_change --group-by met
| | 78 | 2017-07-01 |
| Total | 7438 | 2017-07-01 |

3. Next, we calculate July's bookings with a 7-day offset. The following table displays the range of days that make up this monthly aggregation. Note that the month begins 7 days later (offset by 7 days) on 2017-07-24.
3. Calculate July's bookings with a 7-day offset. The following table displays the range of days that make up this monthly aggregation. Note that the month begins 7 days later (offset by 7 days) on 2017-07-24.

| | Orders | Metric_time |
| - | ---- | -------- |
@@ -148,7 +154,7 @@ When you run the query `dbt sl query --metrics d7_booking_change --group-by met
| | 83 | 2017-06-24 |
| Total | 7252 | 2017-07-01 |

4. Lastly, we calculate the derived metric and return the final result set:
4. Lastly, calculate the derived metric and return the final result set:

```bash
bookings - bookings_7_days_ago would be compiled as 7438 - 7252 = 186.
```
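As an illustrative sketch only (not the SQL MetricFlow actually generates), the offset join described above can be thought of as joining the daily bookings dataset to itself shifted by 7 days, then truncating to month and aggregating. The table and column names here are assumed:

```sql
-- Hypothetical illustration of the d7_booking_change logic walked through above
select
    date_trunc('month', d.metric_time) as metric_time__month,
    sum(d.bookings) - sum(o.bookings) as d7_booking_change
from daily_bookings d
left join daily_bookings o
    on o.metric_time = d.metric_time - interval '7 days'
group by 1
```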
3 changes: 2 additions & 1 deletion website/docs/docs/build/materializations.md
@@ -100,8 +100,9 @@ When using the `table` materialization, your model is rebuilt as a <Term id="tab
- Ephemeral models can help keep your <Term id="data-warehouse" /> clean by reducing clutter (also consider splitting your models across multiple schemas by [using custom schemas](/docs/build/custom-schemas)).
* **Cons:**
* You cannot select directly from this model.
* Operations (e.g. macros called via `dbt run-operation` cannot `ref()` ephemeral nodes)
* [Operations](/docs/build/hooks-operations#about-operations) (for example, macros called using [`dbt run-operation`](/reference/commands/run-operation)) cannot `ref()` ephemeral nodes.
* Overuse of ephemeral materialization can also make queries harder to debug.
* Ephemeral materialization doesn't support [model contracts](/docs/collaborate/govern/model-contracts#where-are-contracts-supported).
* **Advice:** Use the ephemeral materialization for:
* very light-weight transformations that are early on in your DAG
* are only used in one or two downstream models, and
23 changes: 11 additions & 12 deletions website/docs/docs/build/metrics-overview.md
@@ -9,7 +9,7 @@ pagination_next: "docs/build/cumulative"

Once you've created your semantic models, it's time to start adding metrics! Metrics can be defined in the same YAML files as your semantic models, or split into separate YAML files in any other subdirectory (provided that these subdirectories are also within the same dbt project repo).

The keys for metrics definitions are:

| Parameter | Description | Type |
| --------- | ----------- | ---- |
@@ -22,7 +22,6 @@ The keys for metrics definitions are:
| `filter` | You can optionally add a filter string to any metric type, applying filters to dimensions, entities, or time dimensions during metric computation. Consider it as your WHERE clause. | Optional |
| `meta` | Additional metadata you want to add to your metric. | Optional |


Here's a complete example of the metrics spec configuration:

```yaml
@@ -39,14 +38,7 @@ metrics:
null
```
This page explains the different supported metric types you can add to your dbt project.
<!--
- [Cumulative](#cumulative-metrics) — Cumulative metrics aggregate a measure over a given window.
- [Derived](#derived-metrics) — An expression of other metrics, which allows you to do calculations on top of metrics.
- [Expression](#expression-metrics) — Allow measures to be modified using a SQL expression.
- [Measure proxy](#measure-proxy-metrics) — Metrics that refer directly to one measure.
- [Ratio](#ratio-metrics) — Create a ratio out of two measures.
-->
This page explains the different supported metric types you can add to your dbt project.
### Conversion metrics <Lifecycle status='new'/>
@@ -55,10 +47,11 @@
```yaml
metrics:
- name: The metric name # Required
description: the metric description # Optional
description: The metric description # Optional
type: conversion # Required
label: # Required
type_params: # Required
fill_nulls_with: Set the value in your metric definition instead of null (such as zero) # Optional
conversion_type_params: # Required
entity: ENTITY # Required
calculation: CALCULATION_TYPE # Optional. default: conversion_rate. options: conversions(buys) or conversion_rate (buys/visits), and more to come.
@@ -82,9 +75,10 @@ metrics:
- [email protected]
type: cumulative
type_params:
fill_nulls_with: 0
measures:
- distinct_users
#Omitting window will accumulate the measure over all time
# Omitting window will accumulate the measure over all time
window: 7 days

```
@@ -100,6 +94,7 @@ metrics:
type: derived
label: Order Gross Profit
type_params:
fill_nulls_with: 0
expr: revenue - cost
metrics:
- name: order_total
@@ -139,6 +134,7 @@ metrics:
# Define the metrics from the semantic manifest as numerator or denominator
type: ratio
type_params:
fill_nulls_with: 0
numerator: cancellations
denominator: transaction_amount
filter: | # add optional constraint string. This applies to both the numerator and denominator
@@ -157,6 +153,7 @@ metrics:
filter: | # add optional constraint string. This applies to both the numerator and denominator
{{ Dimension('customer__country') }} = 'MX'
```

### Simple metrics

[Simple metrics](/docs/build/simple) point directly to a measure. You may think of it as a function that takes only one measure as the input.
@@ -171,6 +168,7 @@ metrics:
- name: cancellations
type: simple
type_params:
fill_nulls_with: 0
measure: cancellations_usd # Specify the measure you are creating a proxy for.
filter: |
{{ Dimension('order__value') }} > 100 and {{ Dimension('user__acquisition') }}
@@ -187,6 +185,7 @@ filter: |
filter: |
{{ TimeDimension('time_dimension', 'granularity') }}
```

### Further configuration

You can set more metadata for your metrics, which can be used by other tools later on. The way this metadata is used will vary based on the specific integration partner.
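For example, here's a sketch that uses the optional `meta` key from the parameters table above. The keys nested under `meta` are arbitrary, and these particular ones are made up:

```yaml
metrics:
  - name: cancellations
    label: Cancellations
    type: simple
    type_params:
      measure: cancellations_usd
    meta:
      owner: '@finance-team'
      tier: 1
```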
2 changes: 0 additions & 2 deletions website/docs/docs/build/models.md
@@ -6,8 +6,6 @@ pagination_next: "docs/build/sql-models"
pagination_prev: null
---

## Overview

dbt Core and Cloud are composed of different moving parts working harmoniously. All of them are important to what dbt does: transforming data, the 'T' in ELT. When you execute `dbt run`, you are running a model that will transform your data without that data ever leaving your warehouse.

Models are where your developers spend most of their time within a dbt environment. Models are primarily written as a `select` statement and saved as a `.sql` file. While the definition is straightforward, the complexity of the execution will vary from environment to environment. Models will be written and rewritten as needs evolve and your organization finds new ways to maximize efficiency.
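For example, a minimal model is nothing more than a `select` statement saved as a `.sql` file in your models directory (the model and column names below are hypothetical):

```sql
-- models/orders.sql
select
    order_id,
    customer_id,
    order_date,
    amount
from {{ ref('stg_orders') }}
```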