Commit c13eb9e

Merge branch 'current' into mirnawong1-patch-22

mirnawong1 authored Jan 25, 2024
2 parents f7b9968 + 191f4fb
Showing 34 changed files with 354 additions and 198 deletions.
2 changes: 1 addition & 1 deletion contributing/developer-blog.md
@@ -6,7 -6,7 @@

The dbt Developer Blog is a place where analytics practitioners can go to share their knowledge with the community. Analytics Engineering is a discipline we’re all building together. The developer blog exists to cultivate the collective knowledge that exists on how to build and scale effective data teams.

We currently have editorial capacity for 10 Community contributed developer blogs per quarter - if we are oversubscribed we suggest you post on another platform or hold off until the editorial team is ready to take on more posts.
We currently have editorial capacity for a few Community contributed developer blogs per quarter - if we are oversubscribed we suggest you post on another platform or hold off until the editorial team is ready to take on more posts.

### What makes a good developer blog post?

@@ -21,13 +21,11 @@ We're not limited to just passing measures through to our metrics, we can also _

```YAML
  - name: food_revenue
    description: The revenue from food in each order.
    label: Food Revenue
    type: simple
    type_params:
      measure: revenue
    filter: |
      {{ Dimension('order__is_food_order') }} = true
    description: The revenue from food in each order.
    label: Food Revenue
    type: simple
    type_params:
      measure: food_revenue
```
- 📝 Now we can set up our ratio metric.
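A sketch of what that ratio metric could look like, assuming the `food_revenue` metric above and an existing `revenue` metric to use as the denominator (the exact names are illustrative):

```YAML
  - name: food_revenue_pct
    description: The percentage of order revenue that comes from food.
    label: Food Revenue %
    type: ratio
    type_params:
      numerator: food_revenue
      denominator: revenue
```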
@@ -8,6 +8,10 @@ id: 1-how-we-style-our-dbt-models
- 👥 Models should be pluralized, for example, `customers`, `orders`, `products`.
- 🔑 Each model should have a primary key.
- 🔑 The primary key of a model should be named `<object>_id`, for example, `account_id`. This makes it easier to know what `id` is being referenced in downstream joined models.
- Use underscores for naming dbt models; avoid dots.
  - ✅ `models_without_dots`
  - ❌ `models.with.dots`
  - Most data platforms use dots to separate `database.schema.object`, so using underscores instead of dots reduces your need for [quoting](/reference/resource-properties/quoting) as well as the risk of issues in certain parts of dbt Cloud. For more background, refer to [this GitHub issue](https://github.com/dbt-labs/dbt-core/issues/3246).
- 🔑 Keys should be string data types.
- 🔑 Consistency is key! Use the same field names across models where possible. For example, a key to the `customers` table should be named `customer_id` rather than `user_id` or `id`.
- ❌ Do not use abbreviations or aliases. Emphasize readability over brevity. For example, do not use `cust` for `customer` or `o` for `orders`.
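To make these conventions concrete, here is a minimal sketch of a model that follows them (the source and column names are hypothetical):

```sql
-- models/customers.sql: pluralized name, underscores only, string primary key named customer_id
select
    cast(id as varchar) as customer_id,  -- primary key follows the <object>_id pattern
    first_name,
    last_name,
    created_at
from {{ source('crm', 'customers') }}
```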
13 changes: 8 additions & 5 deletions website/docs/docs/build/conversion-metrics.md
@@ -32,16 +32,20 @@ The specification for conversion metrics is as follows:
| `constant_properties` | List of constant properties. | List | Optional |
| `base_property` | The property from the base semantic model that you want to hold constant. | Entity or Dimension | Optional |
| `conversion_property` | The property from the conversion semantic model that you want to hold constant. | Entity or Dimension | Optional |
| `fill_nulls_with` | Set the value in your metric definition instead of null (such as zero). | String | Optional |

Refer to [additional settings](#additional-settings) to learn how to customize conversion metrics with settings for null values, calculation type, and constant properties.

The following code example displays the complete specification for conversion metrics and details how they're applied:

```yaml
metrics:
- name: The metric name # Required
description: the metric description # Optional
description: The metric description # Optional
type: conversion # Required
label: # Required
type_params: # Required
fill_nulls_with: Set the value in your metric definition instead of null (such as zero) # Optional
conversion_type_params: # Required
entity: ENTITY # Required
calculation: CALCULATION_TYPE # Optional. default: conversion_rate. options: conversions(buys) or conversion_rate (buys/visits), and more to come.
@@ -89,6 +93,7 @@ Next, define a conversion metric as follows:
type: conversion
label: Visit to Buy Conversion Rate (7-day window)
type_params:
fill_nulls_with: 0
conversion_type_params:
base_measure: visits
conversion_measure: sellers
@@ -117,7 +122,7 @@ inner join (
select *, uuid_string() as uuid from buys -- Adds a uuid column to uniquely identify the different rows
) b
on
v.user_id = b.user_id and v.ds <= b.ds and v.ds > b.ds - interval '7 day'
v.user_id = b.user_id and v.ds <= b.ds and v.ds > b.ds - interval '7 days'
```

The dataset returns the following (note that there are two potential conversion events for the first visit):
@@ -147,7 +152,6 @@ inner join (
) b
on
v.user_id = b.user_id and v.ds <= b.ds and v.ds > b.ds - interval '7 day'
```

The dataset returns the following:
@@ -249,7 +253,7 @@ Use the following additional settings to customize your conversion metrics:
To return zero in the final data set, you can set the value of a null conversion event to zero instead of null. You can add the `fill_nulls_with` parameter to your conversion metric definition like this:

```yaml
- name: vist_to_buy_conversion_rate_7_day_window
- name: visit_to_buy_conversion_rate_7_day_window
description: "Conversion rate from viewing a page to making a purchase"
type: conversion
label: Visit to Seller Conversion Rate (7 day window)
@@ -345,7 +349,6 @@ on
and v.ds <= buy_source.ds
and v.ds > buy_source.ds - interval '7 day'
and buy_source.product_id = v.product_id --Joining on the constant property product_id
```

</TabItem>
27 changes: 17 additions & 10 deletions website/docs/docs/build/cumulative-metrics.md
@@ -20,6 +20,7 @@ This metric is common for calculating things like weekly active users, or month-
| `measure` | The measure you are referencing. | Required |
| `window` | The accumulation window, such as 1 month, 7 days, 1 year. This can't be used with `grain_to_date`. | Optional |
| `grain_to_date` | Sets the accumulation grain. For example, `month` will accumulate data for one month, then restart at the beginning of the next. This can't be used with `window`. | Optional |
| `fill_nulls_with` | Set the value in your metric definition instead of null (such as zero).| Optional |

The following displays the complete specification for cumulative metrics, along with an example:

@@ -30,13 +31,15 @@ metrics:
type: cumulative # Required
label: The value that will be displayed in downstream tools # Required
type_params: # Required
fill_nulls_with: Set the value in your metric definition instead of null (such as zero) # Optional
measure: The measure you are referencing # Required
window: The accumulation window, such as 1 month, 7 days, 1 year. # Optional. Cannot be used with grain_to_date
grain_to_date: Sets the accumulation grain, such as month will accumulate data for one month, then restart at the beginning of the next. # Optional. Cannot be used with window

```

## Limitations

Cumulative metrics are currently under active development and have the following limitations:
- You are required to use [`metric_time` dimension](/docs/build/dimensions#time) when querying cumulative metrics. If you don't use `metric_time` in the query, the cumulative metric will return incorrect results because it won't perform the time spine join. This means you cannot reference time dimensions other than the `metric_time` in the query.
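For example, here's a query sketch that satisfies this requirement by grouping on `metric_time` (the metric name is borrowed from the example further down, and the dbt Core `mf` equivalent is shown alongside it):

```bash
# dbt Cloud CLI
dbt sl query --metrics cumulative_order_total_l1m --group-by metric_time__month

# dbt Core (MetricFlow CLI)
mf query --metrics cumulative_order_total_l1m --group-by metric_time__month
```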

@@ -59,19 +62,22 @@ metrics:
description: The cumulative value of all orders
type: cumulative
type_params:
fill_nulls_with: 0
measure: order_total
- name: cumulative_order_total_l1m
label: Cumulative Order total (L1M)
description: Trailing 1 month cumulative order amount
type: cumulative
type_params:
fill_nulls_with: 0
measure: order_total
window: 1 month
- name: cumulative_order_total_mtd
label: Cumulative Order total (MTD)
description: The month to date value of all orders
type: cumulative
type_params:
fill_nulls_with: 0
measure: order_total
grain_to_date: month
```
@@ -201,16 +207,16 @@ The current method connects the metric table to a timespine table using the prim

``` sql
select
count(distinct distinct_users) as weekly_active_users
, metric_time
count(distinct distinct_users) as weekly_active_users,
metric_time
from (
select
subq_3.distinct_users as distinct_users
, subq_3.metric_time as metric_time
subq_3.distinct_users as distinct_users,
subq_3.metric_time as metric_time
from (
select
subq_2.distinct_users as distinct_users
, subq_1.metric_time as metric_time
subq_2.distinct_users as distinct_users,
subq_1.metric_time as metric_time
from (
select
metric_time
@@ -223,8 +229,8 @@
) subq_1
inner join (
select
distinct_users as distinct_users
, date_trunc('day', ds) as metric_time
distinct_users as distinct_users,
date_trunc('day', ds) as metric_time
from demo_schema.transactions transactions_src_426
where (
(date_trunc('day', ds)) >= cast('1999-12-26' as timestamp)
@@ -241,6 +247,7 @@
) subq_3
)
group by
metric_time
limit 100
metric_time
limit 100;
```
16 changes: 11 additions & 5 deletions website/docs/docs/build/derived-metrics.md
@@ -21,6 +21,7 @@ In MetricFlow, derived metrics are metrics created by defining an expression usi
| `metrics` | The list of metrics used in the derived metrics. | Required |
| `alias` | Optional alias for the metric that you can use in the expr. | Optional |
| `filter` | Optional filter to apply to the metric. | Optional |
| `fill_nulls_with` | Set the value in your metric definition instead of null (such as zero). | Optional |
| `offset_window` | Set the period for the offset window, such as 1 month. This will return the value of the metric one month from the metric time. | Optional |

The following displays the complete specification for derived metrics, along with an example.
@@ -32,6 +33,7 @@ metrics:
type: derived # Required
label: The value that will be displayed in downstream tools #Required
type_params: # Required
fill_nulls_with: Set the value in your metric definition instead of null (such as zero) # Optional
expr: the derived expression # Required
metrics: # The list of metrics used in the derived metrics # Required
- name: the name of the metrics. must reference a metric you have already defined # Required
@@ -49,6 +51,7 @@ metrics:
type: derived
label: Order Gross Profit
type_params:
fill_nulls_with: 0
expr: revenue - cost
metrics:
- name: order_total
@@ -60,6 +63,7 @@ metrics:
description: "The gross profit for each food order."
type: derived
type_params:
fill_nulls_with: 0
expr: revenue - cost
metrics:
- name: order_total
@@ -96,6 +100,7 @@ The following example displays how you can calculate monthly revenue growth usin
description: Percentage of customers that are active now and those active 1 month ago
label: customer_retention
type_params:
fill_nulls_with: 0
expr: (active_customers / active_customers_prev_month)
metrics:
- name: active_customers
@@ -115,6 +120,7 @@ You can query any granularity and offset window combination. The following examp
type: derived
label: d7 Bookings Change
type_params:
fill_nulls_with: 0
expr: bookings - bookings_7_days_ago
metrics:
- name: bookings
@@ -126,10 +132,10 @@ You can query any granularity and offset window combination. The following examp

When you run the query `dbt sl query --metrics d7_booking_change --group-by metric_time__month` for the metric, here's how it's calculated. For dbt Core, you can use the `mf query` prefix.

1. We retrieve the raw, unaggregated dataset with the specified measures and dimensions at the smallest level of detail, which is currently 'day'.
2. Then, we perform an offset join on the daily dataset, followed by performing a date trunc and aggregation to the requested granularity.
1. Retrieve the raw, unaggregated dataset with the specified measures and dimensions at the smallest level of detail, which is currently 'day'.
2. Then, perform an offset join on the daily dataset, followed by performing a date trunc and aggregation to the requested granularity.
For example, to calculate `d7_booking_change` for July 2017:
- First, we sum up all the booking values for each day in July to calculate the bookings metric.
- First, sum up all the booking values for each day in July to calculate the bookings metric.
- The following table displays the range of days that make up this monthly aggregation.

| | Orders | Metric_time |
@@ -139,7 +145,7 @@ When you run the query `dbt sl query --metrics d7_booking_change --group-by met
| | 78 | 2017-07-01 |
| Total | 7438 | 2017-07-01 |

3. Next, we calculate July's bookings with a 7-day offset. The following table displays the range of days that make up this monthly aggregation. Note that the month begins 7 days later (offset by 7 days) on 2017-07-24.
3. Calculate July's bookings with a 7-day offset. The following table displays the range of days that make up this monthly aggregation. Note that the month begins 7 days later (offset by 7 days) on 2017-07-24.

| | Orders | Metric_time |
| - | ---- | -------- |
@@ -148,7 +154,7 @@ When you run the query `dbt sl query --metrics d7_booking_change --group-by met
| | 83 | 2017-06-24 |
| Total | 7252 | 2017-07-01 |

4. Lastly, we calculate the derived metric and return the final result set:
4. Lastly, calculate the derived metric and return the final result set:

```bash
bookings - bookings_7_days_ago would be compiled as 7438 - 7252 = 186.
```
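As an illustrative sketch only (not the SQL MetricFlow actually generates), the offset join described above can be thought of as joining the daily bookings dataset to itself shifted by 7 days, then truncating to month and aggregating. The table and column names here are assumed:

```sql
-- Hypothetical illustration of the d7_booking_change logic walked through above
select
    date_trunc('month', d.metric_time) as metric_time__month,
    sum(d.bookings) - sum(o.bookings) as d7_booking_change
from daily_bookings d
left join daily_bookings o
    on o.metric_time = d.metric_time - interval '7 days'
group by 1
```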
3 changes: 2 additions & 1 deletion website/docs/docs/build/materializations.md
@@ -100,8 +100,9 @@ When using the `table` materialization, your model is rebuilt as a <Term id="tab
- Ephemeral models can help keep your <Term id="data-warehouse" /> clean by reducing clutter (also consider splitting your models across multiple schemas by [using custom schemas](/docs/build/custom-schemas)).
* **Cons:**
* You cannot select directly from this model.
* Operations (e.g. macros called via `dbt run-operation` cannot `ref()` ephemeral nodes)
* [Operations](/docs/build/hooks-operations#about-operations) (for example, macros called using [`dbt run-operation`](/reference/commands/run-operation)) cannot `ref()` ephemeral nodes.
* Overuse of ephemeral materialization can also make queries harder to debug.
* Ephemeral materialization doesn't support [model contracts](/docs/collaborate/govern/model-contracts#where-are-contracts-supported).
* **Advice:** Use the ephemeral materialization for:
* very light-weight transformations that are early on in your DAG
* are only used in one or two downstream models, and
23 changes: 11 additions & 12 deletions website/docs/docs/build/metrics-overview.md
@@ -9,7 +9,7 @@ pagination_next: "docs/build/cumulative"

Once you've created your semantic models, it's time to start adding metrics! Metrics can be defined in the same YAML files as your semantic models, or split into separate YAML files in any other subdirectory (provided that these subdirectories are also within the same dbt project repo).

The keys for metrics definitions are:

| Parameter | Description | Type |
| --------- | ----------- | ---- |
@@ -22,7 +22,6 @@ The keys for metrics definitions are:
| `filter` | You can optionally add a filter string to any metric type, applying filters to dimensions, entities, or time dimensions during metric computation. Consider it as your WHERE clause. | Optional |
| `meta` | Additional metadata you want to add to your metric. | Optional |


Here's a complete example of the metrics spec configuration:

```yaml
@@ -39,14 +38,7 @@ metrics:
null
```
This page explains the different supported metric types you can add to your dbt project.
<!--
- [Cumulative](#cumulative-metrics) — Cumulative metrics aggregate a measure over a given window.
- [Derived](#derived-metrics) — An expression of other metrics, which allows you to do calculations on top of metrics.
- [Expression](#expression-metrics) — Allow measures to be modified using a SQL expression.
- [Measure proxy](#measure-proxy-metrics) — Metrics that refer directly to one measure.
- [Ratio](#ratio-metrics) — Create a ratio out of two measures.
-->
This page explains the different supported metric types you can add to your dbt project.
### Conversion metrics <Lifecycle status='new'/>
@@ -55,10 +47,11 @@
```yaml
metrics:
- name: The metric name # Required
description: the metric description # Optional
description: The metric description # Optional
type: conversion # Required
label: # Required
type_params: # Required
fill_nulls_with: Set the value in your metric definition instead of null (such as zero) # Optional
conversion_type_params: # Required
entity: ENTITY # Required
calculation: CALCULATION_TYPE # Optional. default: conversion_rate. options: conversions(buys) or conversion_rate (buys/visits), and more to come.
@@ -82,9 +75,10 @@ metrics:
- [email protected]
type: cumulative
type_params:
fill_nulls_with: 0
measures:
- distinct_users
#Omitting window will accumulate the measure over all time
# Omitting window will accumulate the measure over all time
window: 7 days

```
@@ -100,6 +94,7 @@ metrics:
type: derived
label: Order Gross Profit
type_params:
fill_nulls_with: 0
expr: revenue - cost
metrics:
- name: order_total
@@ -139,6 +134,7 @@ metrics:
# Define the metrics from the semantic manifest as numerator or denominator
type: ratio
type_params:
fill_nulls_with: 0
numerator: cancellations
denominator: transaction_amount
filter: | # add optional constraint string. This applies to both the numerator and denominator
@@ -157,6 +153,7 @@ metrics:
filter: | # add optional constraint string. This applies to both the numerator and denominator
{{ Dimension('customer__country') }} = 'MX'
```

### Simple metrics

[Simple metrics](/docs/build/simple) point directly to a measure. You may think of it as a function that takes only one measure as the input.
@@ -171,6 +168,7 @@ metrics:
- name: cancellations
type: simple
type_params:
fill_nulls_with: 0
measure: cancellations_usd # Specify the measure you are creating a proxy for.
filter: |
{{ Dimension('order__value') }} > 100 and {{ Dimension('user__acquisition') }}
@@ -187,6 +185,7 @@ filter: |
filter: |
{{ TimeDimension('time_dimension', 'granularity') }}
```

### Further configuration

You can set more metadata for your metrics, which can be used by other tools later on. The way this metadata is used will vary based on the specific integration partner.
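For example, here's a sketch that uses the optional `meta` key from the parameters table above. The keys nested under `meta` are arbitrary, and these particular ones are made up:

```yaml
metrics:
  - name: cancellations
    label: Cancellations
    type: simple
    type_params:
      measure: cancellations_usd
    meta:
      owner: '@finance-team'
      tier: 1
```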
2 changes: 0 additions & 2 deletions website/docs/docs/build/models.md
@@ -6,8 +6,6 @@ pagination_next: "docs/build/sql-models"
pagination_prev: null
---

## Overview

dbt Core and Cloud are composed of different moving parts working harmoniously. All of them are important to what dbt does: transforming data, the 'T' in ELT. When you execute `dbt run`, you are running a model that will transform your data without that data ever leaving your warehouse.

Models are where your developers spend most of their time within a dbt environment. Models are primarily written as a `select` statement and saved as a `.sql` file. While the definition is straightforward, the complexity of the execution will vary from environment to environment. Models will be written and rewritten as needs evolve and your organization finds new ways to maximize efficiency.
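For example, a minimal model is nothing more than a `select` statement saved as a `.sql` file in your models directory (the model and column names below are hypothetical):

```sql
-- models/orders.sql
select
    order_id,
    customer_id,
    order_date,
    amount
from {{ ref('stg_orders') }}
```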