adds fill_nulls_with to all metric types (#4768)
This PR adds the parameter `fill_nulls_with` to all metric types. It was
already added for conversion metrics, but also needs to be added to
ratio, derived, cumulative, and simple metrics.
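
As a rough sketch of the intended usage on a simple metric: the metric and measure names below are illustrative, and the placement of `fill_nulls_with` directly under `type_params` is an assumption that follows the examples in this diff, so confirm it against the MetricFlow spec for your version.

```yaml
metrics:
  - name: cancellations            # illustrative metric name
    description: Total cancellation value
    type: simple
    label: Cancellations
    type_params:
      fill_nulls_with: 0           # value reported in place of null
      measure: cancellations_usd   # illustrative measure name
```

Zero is only one possible fill value; the description in this diff lists it as an example ("such as zero").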
mirnawong1 authored Jan 23, 2024
2 parents 219c484 + 5c27454 commit faf80d1
Showing 6 changed files with 71 additions and 42 deletions.
7 changes: 6 additions & 1 deletion website/docs/docs/build/conversion-metrics.md
@@ -32,16 +32,20 @@ The specification for conversion metrics is as follows:
| `constant_properties` | List of constant properties. | List | Optional |
| `base_property` | The property from the base semantic model that you want to hold constant. | Entity or Dimension | Optional |
| `conversion_property` | The property from the conversion semantic model that you want to hold constant. | Entity or Dimension | Optional |
| `fill_nulls_with` | Set the value in your metric definition instead of null (such as zero). | String | Optional |

Refer to [additional settings](#additional-settings) to learn how to customize conversion metrics with settings for null values, calculation type, and constant properties.

The following code example displays the complete specification for conversion metrics and details how they're applied:

```yaml
metrics:
- name: The metric name # Required
description: the metric description # Optional
description: The metric description # Optional
type: conversion # Required
label: # Required
type_params: # Required
fill_nulls_with: Set the value in your metric definition instead of null (such as zero) # Optional
conversion_type_params: # Required
entity: ENTITY # Required
calculation: CALCULATION_TYPE # Optional. default: conversion_rate. options: conversions(buys) or conversion_rate (buys/visits), and more to come.
@@ -89,6 +93,7 @@ Next, define a conversion metric as follows:
type: conversion
label: Visit to Buy Conversion Rate (7-day window)
type_params:
fill_nulls_with: 0
conversion_type_params:
base_measure: visits
conversion_measure: sellers
27 changes: 17 additions & 10 deletions website/docs/docs/build/cumulative-metrics.md
@@ -20,6 +20,7 @@ This metric is common for calculating things like weekly active users, or month-
| `measure` | The measure you are referencing. | Required |
| `window` | The accumulation window, such as 1 month, 7 days, 1 year. This can't be used with `grain_to_date`. | Optional |
| `grain_to_date` | Sets the accumulation grain. For example, `month` accumulates data for one month, then restarts at the beginning of the next. This can't be used with `window`. | Optional |
| `fill_nulls_with` | Set the value in your metric definition instead of null (such as zero). | Optional |

The following displays the complete specification for cumulative metrics, along with an example:

@@ -30,13 +31,15 @@ metrics:
type: cumulative # Required
label: The value that will be displayed in downstream tools # Required
type_params: # Required
fill_nulls_with: Set the value in your metric definition instead of null (such as zero) # Optional
measure: The measure you are referencing # Required
window: The accumulation window, such as 1 month, 7 days, 1 year. # Optional. Cannot be used with grain_to_date
grain_to_date: Sets the accumulation grain, such as month will accumulate data for one month, then restart at the beginning of the next. # Optional. Cannot be used with window

```

## Limitations

Cumulative metrics are currently under active development and have the following limitations:
- You are required to use [`metric_time` dimension](/docs/build/dimensions#time) when querying cumulative metrics. If you don't use `metric_time` in the query, the cumulative metric will return incorrect results because it won't perform the time spine join. This means you cannot reference time dimensions other than the `metric_time` in the query.

@@ -59,19 +62,22 @@ metrics:
description: The cumulative value of all orders
type: cumulative
type_params:
fill_nulls_with: 0
measure: order_total
- name: cumulative_order_total_l1m
label: Cumulative Order total (L1M)
description: Trailing 1 month cumulative order amount
type: cumulative
type_params:
fill_nulls_with: 0
measure: order_total
window: 1 month
- name: cumulative_order_total_mtd
label: Cumulative Order total (MTD)
description: The month to date value of all orders
type: cumulative
type_params:
fill_nulls_with: 0
measure: order_total
grain_to_date: month
```
@@ -201,16 +207,16 @@ The current method connects the metric table to a timespine table using the prim

``` sql
select
count(distinct distinct_users) as weekly_active_users
, metric_time
count(distinct distinct_users) as weekly_active_users,
metric_time
from (
select
subq_3.distinct_users as distinct_users
, subq_3.metric_time as metric_time
subq_3.distinct_users as distinct_users,
subq_3.metric_time as metric_time
from (
select
subq_2.distinct_users as distinct_users
, subq_1.metric_time as metric_time
subq_2.distinct_users as distinct_users,
subq_1.metric_time as metric_time
from (
select
metric_time
@@ -223,8 +229,8 @@ from (
) subq_1
inner join (
select
distinct_users as distinct_users
, date_trunc('day', ds) as metric_time
distinct_users as distinct_users,
date_trunc('day', ds) as metric_time
from demo_schema.transactions transactions_src_426
where (
(date_trunc('day', ds)) >= cast('1999-12-26' as timestamp)
@@ -241,6 +247,7 @@ from (
) subq_3
)
group by
metric_time
limit 100
metric_time
limit 100;
```
16 changes: 11 additions & 5 deletions website/docs/docs/build/derived-metrics.md
@@ -21,6 +21,7 @@ In MetricFlow, derived metrics are metrics created by defining an expression usi
| `metrics` | The list of metrics used in the derived metrics. | Required |
| `alias` | Optional alias for the metric that you can use in the expr. | Optional |
| `filter` | Optional filter to apply to the metric. | Optional |
| `fill_nulls_with` | Set the value in your metric definition instead of null (such as zero). | Optional |
| `offset_window` | Set the period for the offset window, such as 1 month. This will return the value of the metric one month from the metric time. | Optional |

The following displays the complete specification for derived metrics, along with an example.
@@ -32,6 +33,7 @@ metrics:
type: derived # Required
label: The value that will be displayed in downstream tools #Required
type_params: # Required
fill_nulls_with: Set the value in your metric definition instead of null (such as zero) # Optional
expr: the derived expression # Required
metrics: # The list of metrics used in the derived metrics # Required
- name: the name of the metrics. must reference a metric you have already defined # Required
@@ -49,6 +51,7 @@ metrics:
type: derived
label: Order Gross Profit
type_params:
fill_nulls_with: 0
expr: revenue - cost
metrics:
- name: order_total
@@ -60,6 +63,7 @@ metrics:
description: "The gross profit for each food order."
type: derived
type_params:
fill_nulls_with: 0
expr: revenue - cost
metrics:
- name: order_total
@@ -96,6 +100,7 @@ The following example displays how you can calculate monthly revenue growth usin
description: Percentage of customers that are active now and those active 1 month ago
label: customer_retention
type_params:
fill_nulls_with: 0
expr: (active_customers/ active_customers_prev_month)
metrics:
- name: active_customers
@@ -115,6 +120,7 @@ You can query any granularity and offset window combination. The following examp
type: derived
label: d7 Bookings Change
type_params:
fill_nulls_with: 0
expr: bookings - bookings_7_days_ago
metrics:
- name: bookings
@@ -126,10 +132,10 @@ You can query any granularity and offset window combination. The following examp

When you run the query `dbt sl query --metrics d7_booking_change --group-by metric_time__month` for the metric, here's how it's calculated. For dbt Core, you can use the `mf query` prefix.

1. We retrieve the raw, unaggregated dataset with the specified measures and dimensions at the smallest level of detail, which is currently 'day'.
2. Then, we perform an offset join on the daily dataset, followed by performing a date trunc and aggregation to the requested granularity.
1. Retrieve the raw, unaggregated dataset with the specified measures and dimensions at the smallest level of detail, which is currently 'day'.
2. Then, perform an offset join on the daily dataset, followed by performing a date trunc and aggregation to the requested granularity.
For example, to calculate `d7_booking_change` for July 2017:
- First, we sum up all the booking values for each day in July to calculate the bookings metric.
- First, sum up all the booking values for each day in July to calculate the bookings metric.
- The following table displays the range of days that make up this monthly aggregation.

| | Orders | Metric_time |
@@ -139,7 +145,7 @@ When you run the query `dbt sl query --metrics d7_booking_change --group-by met
| | 78 | 2017-07-01 |
| Total | 7438 | 2017-07-01 |

3. Next, we calculate July's bookings with a 7-day offset. The following table displays the range of days that make up this monthly aggregation. Note that the month begins 7 days later (offset by 7 days) on 2017-07-24.
3. Calculate July's bookings with a 7-day offset. The following table displays the range of days that make up this monthly aggregation. Note that the month begins 7 days later (offset by 7 days) on 2017-07-24.

| | Orders | Metric_time |
| - | ---- | -------- |
@@ -148,7 +154,7 @@ When you run the query `dbt sl query --metrics d7_booking_change --group-by met
| | 83 | 2017-06-24 |
| Total | 7252 | 2017-07-01 |

4. Lastly, we calculate the derived metric and return the final result set:
4. Lastly, calculate the derived metric and return the final result set:

```bash
bookings - bookings_7_days_ago would be compiled as 7438 - 7252 = 186.
23 changes: 11 additions & 12 deletions website/docs/docs/build/metrics-overview.md
@@ -9,7 +9,7 @@ pagination_next: "docs/build/cumulative"

Once you've created your semantic models, it's time to start adding metrics! Metrics can be defined in the same YAML files as your semantic models, or split into separate YAML files in any other subdirectories (provided that these subdirectories are also within the same dbt project repo).

The keys for metrics definitions are:
The keys for metrics definitions are:

| Parameter | Description | Type |
| --------- | ----------- | ---- |
@@ -22,7 +22,6 @@ The keys for metrics definitions are:
| `filter` | You can optionally add a filter string to any metric type, applying filters to dimensions, entities, or time dimensions during metric computation. Consider it as your WHERE clause. | Optional |
| `meta` | Additional metadata you want to add to your metric. | Optional |


Here's a complete example of the metrics spec configuration:

```yaml
@@ -39,14 +38,7 @@ metrics:
null
```
This page explains the different supported metric types you can add to your dbt project.
<!--
- [Cumulative](#cumulative-metrics) — Cumulative metrics aggregate a measure over a given window.
- [Derived](#derived-metrics) — An expression of other metrics, which allows you to do calculations on top of metrics.
- [Expression](#expression-metrics) — Allow measures to be modified using a SQL expression.
- [Measure proxy](#measure-proxy-metrics) — Metrics that refer directly to one measure.
- [Ratio](#ratio-metrics) — Create a ratio out of two measures.
-->
This page explains the different supported metric types you can add to your dbt project.
### Conversion metrics <Lifecycle status='new'/>
@@ -55,10 +47,11 @@ This page explains the different supported metric types you can add to your dbt
```yaml
metrics:
- name: The metric name # Required
description: the metric description # Optional
description: The metric description # Optional
type: conversion # Required
label: # Required
type_params: # Required
fill_nulls_with: Set the value in your metric definition instead of null (such as zero) # Optional
conversion_type_params: # Required
entity: ENTITY # Required
calculation: CALCULATION_TYPE # Optional. default: conversion_rate. options: conversions(buys) or conversion_rate (buys/visits), and more to come.
@@ -82,9 +75,10 @@ metrics:
- [email protected]
type: cumulative
type_params:
fill_nulls_with: 0
measures:
- distinct_users
#Omitting window will accumulate the measure over all time
# Omitting window will accumulate the measure over all time
window: 7 days

```
@@ -100,6 +94,7 @@ metrics:
type: derived
label: Order Gross Profit
type_params:
fill_nulls_with: 0
expr: revenue - cost
metrics:
- name: order_total
@@ -139,6 +134,7 @@ metrics:
# Define the metrics from the semantic manifest as numerator or denominator
type: ratio
type_params:
fill_nulls_with: 0
numerator: cancellations
denominator: transaction_amount
filter: | # add optional constraint string. This applies to both the numerator and denominator
@@ -157,6 +153,7 @@ metrics:
filter: | # add optional constraint string. This applies to both the numerator and denominator
{{ Dimension('customer__country') }} = 'MX'
```

### Simple metrics

[Simple metrics](/docs/build/simple) point directly to a measure. You may think of it as a function that takes only one measure as the input.
@@ -171,6 +168,7 @@ metrics:
- name: cancellations
type: simple
type_params:
fill_nulls_with: 0
measure: cancellations_usd # Specify the measure you are creating a proxy for.
filter: |
{{ Dimension('order__value')}} > 100 and {{Dimension('user__acquisition')}}
@@ -187,6 +185,7 @@ filter: |
filter: |
{{ TimeDimension('time_dimension', 'granularity') }}
```

### Further configuration

You can set more metadata for your metrics, which can be used by other tools later on. The way this metadata is used will vary based on the specific integration partner.
31 changes: 19 additions & 12 deletions website/docs/docs/build/ratio-metrics.md
@@ -21,6 +21,7 @@ Ratio allows you to create a ratio between two metrics. You simply specify a num
| `denominator` | The name of the metric used for the denominator, or structure of properties. | Required |
| `filter` | Optional filter for the numerator or denominator. | Optional |
| `alias` | Optional alias for the numerator or denominator. | Optional |
| `fill_nulls_with` | Set the value in your metric definition instead of null (such as zero). | Optional |

The following displays the complete specification for ratio metrics, along with an example.

@@ -31,6 +32,7 @@ metrics:
type: ratio # Required
label: The value that will be displayed in downstream tools #Required
type_params: # Required
fill_nulls_with: Set value instead of null (such as zero) # Optional
numerator: The name of the metric used for the numerator, or structure of properties # Required
name: Name of metric used for the numerator # Required
filter: Filter for the numerator # Optional
@@ -50,40 +52,41 @@ metrics:
label: Food Order Ratio
type: ratio
type_params:
fill_nulls_with: 0
numerator: food_orders
denominator: orders

```
## Ratio metrics using different semantic models
When the numerator and denominator of a ratio metric come from different semantic models, the system computes each value in its own sub-query. It then joins the result sets on their common dimensions to calculate the final ratio. Here's an example of the SQL generated for such a ratio metric.
```sql
select
subq_15577.metric_time as metric_time
, cast(subq_15577.mql_queries_created_test as double) / cast(nullif(subq_15582.distinct_query_users, 0) as double) as mql_queries_per_active_user
subq_15577.metric_time as metric_time,
cast(subq_15577.mql_queries_created_test as double) / cast(nullif(subq_15582.distinct_query_users, 0) as double) as mql_queries_per_active_user
from (
select
metric_time
, sum(mql_queries_created_test) as mql_queries_created_test
metric_time,
sum(mql_queries_created_test) as mql_queries_created_test
from (
select
cast(query_created_at as date) as metric_time
, case when query_status in ('PENDING','MODE') then 1 else 0 end as mql_queries_created_test
cast(query_created_at as date) as metric_time,
case when query_status in ('PENDING','MODE') then 1 else 0 end as mql_queries_created_test
from prod_dbt.mql_query_base mql_queries_test_src_2552
) subq_15576
group by
metric_time
) subq_15577
inner join (
select
metric_time
, count(distinct distinct_query_users) as distinct_query_users
metric_time,
count(distinct distinct_query_users) as distinct_query_users
from (
select
cast(query_created_at as date) as metric_time
, case when query_status in ('MODE','PENDING') then email else null end as distinct_query_users
cast(query_created_at as date) as metric_time,
case when query_status in ('MODE','PENDING') then email else null end as distinct_query_users
from prod_dbt.mql_query_base mql_queries_src_2585
) subq_15581
group by
@@ -115,6 +118,7 @@ metrics:
- [email protected]
type: ratio
type_params:
fill_nulls_with: 0
numerator:
name: distinct_purchasers
filter: |
@@ -124,4 +128,7 @@ metrics:
name: distinct_purchasers
```
Note the `filter` and `alias` parameters for the metric referenced in the numerator. Use the `filter` parameter to apply a filter to the metric it's attached to. The `alias` parameter is used to avoid naming conflicts in the rendered SQL queries when the same metric is used with different filters. If there are no naming conflicts, the `alias` parameter can be left out.
Note the `filter` and `alias` parameters for the metric referenced in the numerator.
- Use the `filter` parameter to apply a filter to the metric it's attached to.
- The `alias` parameter is used to avoid naming conflicts in the rendered SQL queries when the same metric is used with different filters.
- If there are no naming conflicts, the `alias` parameter can be left out.
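
The points above can be sketched as a ratio metric that references the same metric in both the numerator and the denominator. This is a minimal illustration, not the exact example from the docs page: the metric name, the dimension inside the filter, and the alias are hypothetical, and `fill_nulls_with` is placed under `type_params` as in the other examples in this diff.

```yaml
metrics:
  - name: frequent_purchaser_ratio   # hypothetical metric name
    description: Fraction of purchasers who are frequent purchasers
    type: ratio
    label: Frequent purchaser ratio
    type_params:
      fill_nulls_with: 0
      numerator:
        name: distinct_purchasers
        # The filter applies only to the numerator's metric.
        filter: |
          {{ Dimension('customer__is_frequent_purchaser') }}
        # The alias keeps the filtered numerator distinct from the
        # unfiltered denominator, which references the same metric.
        alias: frequent_purchasers
      denominator:
        name: distinct_purchasers
```

If the numerator and denominator referenced different metrics, the `alias` line could be dropped.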