adds sl course link and updates fill_nulls_with and join_to_timespine (#4912)

This PR does the following:

- adds links to the newly launched SL video courses in various places
for visual learning
- Updates info related to fill_nulls_with and join_to_timespine on
various pages: the simple and cumulative metrics pages. Adds
join_to_timespine to all metrics pages, and removes fill_nulls_with from the
ratio and derived metrics pages. (see
[linear](https://linear.app/dbt-labs/issue/SL-1676/docs-for-fill-nulls-with-and-join-to-timespine-are-incorrect)
for more details)

This PR also turns the About MetricFlow FAQs into the `detailsToggle`
component.
mirnawong1 authored Feb 21, 2024
2 parents 2463672 + a942f8b commit 23c8c87
Showing 18 changed files with 235 additions and 175 deletions.
71 changes: 35 additions & 36 deletions website/docs/docs/build/about-metricflow.md
@@ -245,43 +245,42 @@ metrics:
## FAQs
<details>
<summary>Do my datasets need to be normalized?</summary>
<div>
<div>Not at all! While a cleaned and well-modeled data set can be extraordinarily powerful and is the ideal input, you can use any dataset from raw to fully denormalized datasets. <br /><br />It's recommended that you apply quality data consistency, such as filtering bad data, normalizing common objects, and data modeling of keys and tables, in upstream applications. The Semantic Layer is more efficient at doing data denormalization instead of normalization. <br /><br />If you have not invested in data consistency, that is okay. The Semantic Layer can take SQL queries or expressions to define consistent datasets.</div>
</div>
</details>
<details>
<summary>Why is normalized data the ideal input?</summary>
<div>
<div> MetricFlow is built to do denormalization efficiently. There are better tools to take raw datasets and accomplish the various tasks required to build data consistency and organized data models. On the other end, by putting in denormalized data you are potentially creating redundancy which is technically challenging to manage, and you are reducing the potential granularity that MetricFlow can use to aggregate metrics.</div>
</div>
</details>
<details>
<summary>Why not just make metrics the same as measures?</summary>
<div>
<div>One principle of MetricFlow is to reduce the duplication of logic sometimes referred to as Don't Repeat Yourself(DRY).<br /><br />Many metrics are constructed from reused measures and in some cases constructed from measures from different semantic models. This allows for metrics to be built breadth-first (metrics that can stand alone) instead of depth-first (where you have multiple metrics acting as functions of each other).<br /><br />Additionally, not all metrics are constructed off of measures. As an example, a conversion metric is likely defined as the presence or absence of an event record after some other event record.</div>
</div>
</details>
<details>
<summary>How does the Semantic Layer handle joins?</summary>
<div>
<div>MetricFlow builds joins based on the types of keys and parameters that are passed to entities. To better understand how joins are constructed see our documentation on join types.<br /><br />Rather than capturing arbitrary join logic, MetricFlow captures the types of each identifier and then helps the user to navigate to appropriate joins. This allows us to avoid the construction of fan out and chasm joins as well as generate legible SQL.</div>
</div>
</details>
<details>
<summary>Are entities and join keys the same thing?</summary>
<div>
<div>If it helps you to think of entities as join keys, that is very reasonable. Entities in MetricFlow have applications beyond joining two tables, such as acting as a dimension.</div>
</div>
</details>
<details>
<summary>Can a table without a primary or unique entities have dimensions?</summary>
<div>
<div>Yes, but because a dimension is considered an attribute of the primary or unique ent of the table, they are only usable by the metrics that are defined in that table. They cannot be joined to metrics from other tables. This is common in event logs.</div>
</div>
</details>
<detailsToggle alt_header="Do my datasets need to be normalized?">
Not at all! While a cleaned and well-modeled dataset can be extraordinarily powerful and is the ideal input, you can use any dataset, from raw to fully denormalized.
It's recommended that you apply data consistency and quality practices, such as filtering bad data, normalizing common objects, and modeling keys and tables, in upstream applications. The Semantic Layer is more efficient at denormalizing data than normalizing it.
If you haven't invested in data consistency, that's okay. The Semantic Layer can take SQL queries or expressions to define consistent datasets.
</detailsToggle>
<detailsToggle alt_header="Why is normalized data the ideal input?">
MetricFlow is built to do denormalization efficiently. There are better tools for taking raw datasets and accomplishing the various tasks required to build data consistency and organized data models. On the other hand, by putting in denormalized data, you potentially create redundancy that is technically challenging to manage, and you reduce the granularity that MetricFlow can use to aggregate metrics.
</detailsToggle>
<detailsToggle alt_header="Why not just make metrics the same as measures?">
One principle of MetricFlow is to reduce the duplication of logic, sometimes referred to as Don't Repeat Yourself (DRY).
Many metrics are constructed from reused measures and in some cases constructed from measures from different semantic models. This allows for metrics to be built breadth-first (metrics that can stand alone) instead of depth-first (where you have multiple metrics acting as functions of each other).
Additionally, not all metrics are constructed from measures. As an example, a conversion metric is likely defined as the presence or absence of an event record after some other event record.
</detailsToggle>
<detailsToggle alt_header="How does the dbt Semantic Layer handle joins?">
The dbt Semantic Layer, powered by MetricFlow, builds joins based on the types of keys and parameters that are passed to entities. To better understand how joins are constructed, see our documentation on join types.
Rather than capturing arbitrary join logic, MetricFlow captures the types of each identifier and then helps the user navigate to appropriate joins. This allows us to avoid constructing fan-out and chasm joins, as well as generate legible SQL.
</detailsToggle>
<detailsToggle alt_header="Are entities and join keys the same thing?">
If it helps you to think of entities as join keys, that is very reasonable. Entities in MetricFlow have applications beyond joining two tables, such as acting as a dimension.
</detailsToggle>
<detailsToggle alt_header="Can a table without primary or unique entities have dimensions?">
Yes, but because a dimension is considered an attribute of the primary or unique entity of the table, dimensions are only usable by the metrics defined in that table. They cannot be joined to metrics from other tables. This is common in event logs.
</detailsToggle>
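The entity behavior described in these FAQs can be sketched in YAML. For example, two semantic models that share a `customer` entity might be defined as follows (a simplified sketch; all model and column names here are illustrative, not from the source):

```yaml
semantic_models:
  - name: transactions
    model: ref('fact_transactions')
    entities:
      - name: customer        # foreign entity; enables joins to `customers`
        type: foreign
    measures:
      - name: transaction_total
        agg: sum
  - name: customers
    model: ref('dim_customers')
    entities:
      - name: customer        # primary entity
        type: primary
    dimensions:
      - name: customer_region
        type: categorical
```

Because `customer` is a primary entity in `customers` and a foreign entity in `transactions`, MetricFlow can group `transaction_total` by `customer_region` without any hand-written join logic.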
## Related docs
- [Joins](/docs/build/join-logic)
5 changes: 1 addition & 4 deletions website/docs/docs/build/build-metrics-intro.md
@@ -11,14 +11,12 @@ pagination_prev: null

Use MetricFlow in dbt to centrally define your metrics. As a key component of the [dbt Semantic Layer](/docs/use-dbt-semantic-layer/dbt-sl), MetricFlow is responsible for SQL query construction and defining specifications for dbt semantic models and metrics. It uses familiar constructs like semantic models and metrics to avoid duplicative coding, optimize your development workflow, ensure data governance for company metrics, and guarantee consistency for data consumers.


MetricFlow allows you to:
- Intuitively define metrics in your dbt project
- Develop from your preferred environment, whether that's the [dbt Cloud CLI](/docs/cloud/cloud-cli-installation), [dbt Cloud IDE](/docs/cloud/dbt-cloud-ide/develop-in-the-cloud), or [dbt Core](/docs/core/installation-overview)
- Use [MetricFlow commands](/docs/build/metricflow-commands) to query and test those metrics in your development environment
- Harness the true magic of the universal dbt Semantic Layer and dynamically query these metrics in downstream tools (Available for dbt Cloud [Team or Enterprise](https://www.getdbt.com/pricing/) accounts only).
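As a quick illustration of the first point, defining a metric is a small YAML addition to your dbt project (a hedged sketch; the `order_total` measure name is hypothetical):

```yaml
metrics:
  - name: order_total
    label: Order total
    type: simple
    type_params:
      measure: order_total
```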


<div className="grid--3-col">

<Card
@@ -57,12 +55,11 @@ MetricFlow allows you to:
link="/docs/use-dbt-semantic-layer/avail-sl-integrations"
icon="dbt-bit"/>


</div> <br />


## Related docs

- [The dbt Semantic Layer: what's next](https://www.getdbt.com/blog/dbt-semantic-layer-whats-next/) blog
- [Get started with MetricFlow](/docs/build/sl-getting-started)
- [dbt Semantic Layer on-demand courses](https://courses.getdbt.com/courses/semantic-layer)
- [dbt Semantic Layer FAQs](/docs/use-dbt-semantic-layer/sl-faqs)
67 changes: 41 additions & 26 deletions website/docs/docs/build/conversion-metrics.md
@@ -16,23 +16,28 @@ Conversion metrics are different from [ratio metrics](/docs/build/ratio) because

The specification for conversion metrics is as follows:

| Parameter | Description | Type | Required/Optional |
| --- | --- | --- | --- |
| `name` | The name of the metric. | String | Required |
| `description` | The description of the metric. | String | Optional |
| `type` | The type of metric (such as derived, ratio, and so on.). In this case, set as 'conversion' | String | Required |
| `label` | Displayed value in downstream tools. | String | Required |
| `type_params` | Specific configurations for each metric type. | List | Required |
| `conversion_type_params` | Additional configuration specific to conversion metrics. | List | Required |
| `entity` | The entity for each conversion event. | Entity | Required |
| `calculation` | Method of calculation. Either `conversion_rate` or `conversions`. Defaults to `conversion_rate`. | String | Optional |
| `base_measure` | The base conversion event measure. | Measure | Required |
| `conversion_measure` | The conversion event measure. | Measure | Required |
| `window` | The time window for the conversion event, such as 7 days, 1 week, 3 months. Defaults to infinity. | String | Optional |
| `constant_properties` | List of constant properties. | List | Optional |
| `base_property` | The property from the base semantic model that you want to hold constant. | Entity or Dimension | Optional |
| `conversion_property` | The property from the conversion semantic model that you want to hold constant. | Entity or Dimension | Optional |
| `fill_nulls_with` | Set the value in your metric definition instead of null (such as zero). | String | Optional |
| Parameter | Description | Required/Optional |
| --- | --- | --- |
| `name` | The name of the metric. | Required |
| `description` | The description of the metric. | Optional |
| `type` | The type of metric (such as derived, ratio, and so on). In this case, set as `conversion`. | Required |
| `label` | Displayed value in downstream tools. | Required |
| `type_params` | Specific configurations for each metric type. | Required |
| `conversion_type_params` | Additional configuration specific to conversion metrics. | Required |
| `entity` | The entity for each conversion event. | Required |
| `calculation` | Method of calculation. Either `conversion_rate` or `conversions`. Defaults to `conversion_rate`. | Optional |
| `base_measure` | A list of base measure inputs. | Required |
| `base_measure:name` | The base conversion event measure. | Required |
| `base_measure:fill_nulls_with` | Set the value in your metric definition instead of null (such as zero). | Optional |
| `base_measure:join_to_timespine` | Boolean that indicates if the aggregated measure should be joined to the time spine table to fill in missing dates. Defaults to `false`. | Optional |
| `conversion_measure` | A list of conversion measure inputs. | Required |
| `conversion_measure:name` | The conversion event measure. | Required |
| `conversion_measure:fill_nulls_with` | Set the value in your metric definition instead of null (such as zero). | Optional |
| `conversion_measure:join_to_timespine` | Boolean that indicates if the aggregated measure should be joined to the time spine table to fill in missing dates. Defaults to `false`. | Optional |
| `window` | The time window for the conversion event, such as 7 days, 1 week, or 3 months. Defaults to infinity. | Optional |
| `constant_properties` | List of constant properties. | Optional |
| `base_property` | The property from the base semantic model that you want to hold constant. | Optional |
| `conversion_property` | The property from the conversion semantic model that you want to hold constant. | Optional |

Refer to [additional settings](#additional-settings) to learn how to customize conversion metrics with settings for null values, calculation type, and constant properties.

@@ -43,14 +48,19 @@
metrics:
- name: The metric name # Required
description: The metric description # Optional
type: conversion # Required
label: # Required
label: YOUR_LABEL # Required
type_params: # Required
conversion_type_params: # Required
entity: ENTITY # Required
calculation: CALCULATION_TYPE # Optional. default: conversion_rate. options: conversions (buys) or conversion_rate (buys/visits), and more to come.
base_measure: MEASURE # Required
conversion_measure: MEASURE # Required
base_measure:
name: The name of the measure # Required
fill_nulls_with: Set the value in your metric definition instead of null (such as zero) # Optional
join_to_timespine: true/false # Boolean that indicates if the aggregated measure should be joined to the time spine table to fill in missing dates. Default `false`. # Optional
conversion_measure:
name: The name of the measure # Required
fill_nulls_with: Set the value in your metric definition instead of null (such as zero) # Optional
join_to_timespine: true/false # Boolean that indicates if the aggregated measure should be joined to the time spine table to fill in missing dates. Default `false`. # Optional
window: TIME_WINDOW # Optional. default: infinity. window to join the two events. Follows a similar format as time windows elsewhere (such as 7 days)
constant_properties: # Optional. List of constant properties default: None
- base_property: DIMENSION or ENTITY # Required. A reference to a dimension/entity of the semantic model linked to the base_measure
@@ -93,10 +103,12 @@ Next, define a conversion metric as follows:
type: conversion
label: Visit to Buy Conversion Rate (7-day window)
type_params:
conversion_type_params:
base_measure:
name: visits
fill_nulls_with: 0
conversion_measure:
name: sellers
entity: user
window: 7 days
```
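The windowed conversion logic that `window` controls can be sketched in plain Python (a simplified illustration only, not MetricFlow's actual implementation; the visit/buy data is hypothetical):

```python
from datetime import datetime, timedelta

def conversion_rate(base_events, conversion_events, window_days):
    """Fraction of base events with a conversion event by the same
    entity at or after the base event and within the window."""
    window = timedelta(days=window_days)
    converted = sum(
        1
        for entity, base_ts in base_events
        if any(e == entity and base_ts <= ts <= base_ts + window
               for e, ts in conversion_events)
    )
    return converted / len(base_events)

visits = [("u1", datetime(2024, 1, 1)),
          ("u2", datetime(2024, 1, 2)),
          ("u3", datetime(2024, 1, 3))]
buys = [("u1", datetime(2024, 1, 5)),    # 4 days after the visit
        ("u2", datetime(2024, 1, 20))]   # 18 days after the visit

print(conversion_rate(visits, buys, 7))  # -> 0.3333333333333333 (only u1 converts)
```

MetricFlow expresses the same idea in SQL over your semantic models; setting `calculation: conversions` would return the raw conversion count instead of the rate.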
@@ -260,7 +272,8 @@ To return zero in the final data set, you can set the value of a null conversion
type_params:
conversion_type_params:
calculation: conversions
base_measure:
name: visits
conversion_measure:
name: buys
fill_nulls_with: 0
@@ -289,7 +302,8 @@ You can change the default to display the number of conversions by setting the `
type_params:
conversion_type_params:
calculation: conversions
base_measure:
name: visits
conversion_measure:
name: buys
fill_nulls_with: 0
@@ -321,7 +335,8 @@ In this case, you want to set `product_id` as the constant property. You can spe
type_params:
conversion_type_params:
calculation: conversions
base_measure:
name: view_item_detail
conversion_measure:
name: purchase
entity: user
window: 1 week