Merge branch 'current' into alexisweill-incremental-microbatch
mirnawong1 authored Oct 11, 2024
2 parents 87e662c + f622de2 commit 9fc3d6e
Showing 17 changed files with 291 additions and 48 deletions.
55 changes: 51 additions & 4 deletions website/docs/docs/build/dimensions.md
Original file line number Diff line number Diff line change
@@ -41,7 +41,7 @@ Refer to the following example to see how dimensions are used in a semantic mode
semantic_models:
- name: transactions
description: A record for every transaction that takes place. Carts are considered multiple transactions for each SKU.
model: {{ ref("fact_transactions") }}
model: {{ ref('fact_transactions') }}
defaults:
agg_time_dimension: order_date
# --- entities ---
@@ -122,7 +122,7 @@ dbt sl query --metrics users_created,users_deleted --group-by metric_time__year
mf query --metrics users_created,users_deleted --group-by metric_time__year --order-by metric_time__year
```

You can set `is_partition` for time to define specific time spans. Additionally, use the `type_params` section to set `time_granularity` to adjust aggregation details (hourly, daily, weekly, and so on).
You can set `is_partition` for time to define specific time spans. Additionally, use the `type_params` section to set `time_granularity` to adjust aggregation details (daily, weekly, and so on).

<Tabs queryString="dimension">

@@ -161,6 +161,8 @@ measures:

<TabItem value="time_gran" label="time_granularity">

<VersionBlock firstVersion="1.9">

`time_granularity` specifies the grain of a time dimension. MetricFlow will transform the underlying column to the specified granularity. For example, if you add hourly granularity to a time dimension column, MetricFlow will run a `date_trunc` function to convert the timestamp to hourly. You can easily change the time grain at query time and aggregate it to a coarser grain, for example, from hourly to monthly. However, you can't go from a coarser grain to a finer grain (monthly to hourly).

Our supported granularities are:
@@ -172,6 +174,7 @@ Our supported granularities are:
* hour
* day
* week
* month
* quarter
* year

@@ -204,6 +207,50 @@ measures:
agg: sum
```

</VersionBlock>

<VersionBlock lastVersion="1.8">

`time_granularity` specifies the grain of a time dimension. MetricFlow will transform the underlying column to the specified granularity. For example, if you add daily granularity to a time dimension column, MetricFlow will run a `date_trunc` function to convert the timestamp to daily. You can easily change the time grain at query time and aggregate it to a coarser grain, for example, from daily to monthly. However, you can't go from a coarser grain to a finer grain (monthly to daily).

Our supported granularities are:
* day
* week
* month
* quarter
* year

Aggregation between metrics with different granularities is possible, with the Semantic Layer returning results at the coarsest granularity by default. For example, when querying two metrics with daily and monthly granularity, the resulting aggregation will be at the monthly level.

```yaml
dimensions:
- name: created_at
type: time
label: "Date of creation"
expr: ts_created # ts_created is the underlying column name from the table
is_partition: True
type_params:
time_granularity: day
- name: deleted_at
type: time
label: "Date of deletion"
expr: ts_deleted # ts_deleted is the underlying column name from the table
is_partition: True
type_params:
time_granularity: day
measures:
- name: users_deleted
expr: 1
agg: sum
agg_time_dimension: deleted_at
- name: users_created
expr: 1
agg: sum
```

</VersionBlock>

</TabItem>

</Tabs>
@@ -313,7 +360,7 @@ Additionally, the entity is tagged as `natural` to differentiate it from a `prim
semantic_models:
- name: sales_person_tiers
description: SCD Type II table of tiers for salespeople
model: {{ref(sales_person_tiers)}}
model: {{ ref('sales_person_tiers') }}
defaults:
agg_time_dimension: tier_start
@@ -355,7 +402,7 @@ semantic_models:
There is a transaction, product, sales_person, and customer id for
every transaction. There is only one transaction id per
transaction. The `metric_time` or date is reflected in UTC.
model: {{ ref(fact_transactions) }}
model: {{ ref('fact_transactions') }}
defaults:
agg_time_dimension: metric_time

164 changes: 160 additions & 4 deletions website/docs/docs/build/metricflow-time-spine.md
@@ -1,13 +1,15 @@
---
title: MetricFlow time spine
id: metricflow-time-spine
description: "MetricFlow expects a default timespine table called metricflow_time_spine"
description: "MetricFlow expects a default time spine table called metricflow_time_spine"
sidebar_label: "MetricFlow time spine"
tags: [Metrics, Semantic Layer]
---
<VersionBlock firstVersion="1.9">

It's common in analytics engineering to have a date dimension or "time spine" table as a base table for different types of time-based joins and aggregations. The structure of this table is typically a base column of daily or hourly dates, with additional columns for other time grains, like fiscal quarters, defined based on the base column. You can join other tables to the time spine on the base column to calculate metrics like revenue at a point in time, or to aggregate to a specific time grain.
<!-- this whole section is for 1.9 and higher + Versionless -->

It's common in analytics engineering to have a date dimension or "time spine" table as a base table for different types of time-based joins and aggregations. The structure of this table is typically a base column of daily or hourly dates, with additional columns for other time grains, like fiscal quarters, defined based on the base column. You can join other tables to the time spine on the base column to calculate metrics like revenue at a point in time, or to aggregate to a specific time grain.

MetricFlow requires you to define at least one dbt model that provides a time spine, and then specify (in YAML) the columns to be used for time-based joins. MetricFlow will join against the time spine model for the following types of metrics and dimensions:

@@ -74,6 +76,7 @@ This example creates a time spine at an hourly grain and a daily grain: `time_sp
<Lightbox src="/img/time_spines.png" width="50%" title="Time spine directory structure" />
<!--
<VersionBlock lastVersion="1.8">
<File name="models/_models.yml">
@@ -98,6 +101,7 @@ models:
</File>
</VersionBlock>
-->
- This example configuration shows two time spine models, `time_spine_hourly` and `time_spine_daily`. It sets the time spine configurations under the `time_spine` key.
- The `standard_granularity_column` is the column that maps to one of our [standard granularities](/docs/build/dimensions?dimension=time_gran). This column must be set under the `columns` key and should have a grain that is finer or equal to any custom granularity columns defined in the same model.
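
The configuration these bullets describe can be sketched roughly as follows. The column names `date_hour` and `date_day` are assumed from the model names, and the `granularity` property on each column is an assumption about how the grain is declared under the `columns` key. Treat this as an illustration rather than the page's exact example.

```yaml
models:
  - name: time_spine_hourly
    time_spine:
      # Assumed column name; must also be listed under `columns` below
      standard_granularity_column: date_hour
    columns:
      - name: date_hour
        granularity: hour # assumed property declaring the column's grain
  - name: time_spine_daily
    time_spine:
      standard_granularity_column: date_day # assumed column name
    columns:
      - name: date_day
        granularity: day
```

Any custom granularity columns added to one of these models would need a grain that is coarser than or equal to that model's `standard_granularity_column`.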
@@ -290,13 +294,165 @@ and date_hour < dateadd(day, 30, current_timestamp())
</File>


</VersionBlock>

<VersionBlock lastVersion="1.8">

<!-- this whole section is for 1.8 and lower -->

MetricFlow uses a time spine table to construct cumulative metrics. By default, MetricFlow expects the time spine table to be named `metricflow_time_spine` and doesn't support using a different name. For supported granularities, refer to the [dimensions](/docs/build/dimensions?dimension=time_gran#time) page.

To create this table, you need to create a model in your dbt project called `metricflow_time_spine` and add the following code:

### Daily

<VersionBlock lastVersion="1.6">
<File name='metricflow_time_spine.sql'>

```sql
{{
config(
materialized = 'table',
)
}}
with days as (
{{
dbt_utils.date_spine(
'day',
"to_date('01/01/2000','mm/dd/yyyy')",
"to_date('01/01/2025','mm/dd/yyyy')"
)
}}
),
final as (
select cast(date_day as date) as date_day
from days
)
select * from final
-- filter the time spine to a specific range
where date_day > dateadd(year, -4, current_timestamp())
and date_day < dateadd(day, 30, current_timestamp())
```
</File>
</VersionBlock>

<VersionBlock firstVersion="1.7">
<File name='metricflow_time_spine.sql'>


```sql
{{
config(
materialized = 'table',
)
}}
with days as (
{{
dbt.date_spine(
'day',
"to_date('01/01/2000','mm/dd/yyyy')",
"to_date('01/01/2025','mm/dd/yyyy')"
)
}}
),
final as (
select cast(date_day as date) as date_day
from days
)
select * from final
where date_day > dateadd(year, -4, current_timestamp())
and date_day < dateadd(day, 30, current_timestamp())
```

</File>
</VersionBlock>

### Daily (BigQuery)

Use this model if you're using BigQuery. BigQuery supports `DATE()` instead of `TO_DATE()`:

<VersionBlock lastVersion="1.6">

<File name="metricflow_time_spine.sql">

```sql
{{config(materialized='table')}}
with days as (
{{dbt_utils.date_spine(
'day',
"DATE(2000,01,01)",
"DATE(2025,01,01)"
)
}}
),
final as (
select cast(date_day as date) as date_day
from days
)
select *
from final
-- filter the time spine to a specific range
where date_day > dateadd(year, -4, current_timestamp())
and date_day < dateadd(day, 30, current_timestamp())
```
</File>
</VersionBlock>

<VersionBlock firstVersion="1.7">

<File name="metricflow_time_spine.sql">

```sql
{{config(materialized='table')}}
with days as (
{{dbt.date_spine(
'day',
"DATE(2000,01,01)",
"DATE(2025,01,01)"
)
}}
),
final as (
select cast(date_day as date) as date_day
from days
)
select *
from final
-- filter the time spine to a specific range
where date_day > dateadd(year, -4, current_timestamp())
and date_day < dateadd(day, 30, current_timestamp())
```

</File>
</VersionBlock>

You only need to include the `date_day` column in the table. MetricFlow can handle broader levels of detail, but finer grains are only supported in versions 1.9 and higher.

</VersionBlock>


## Custom calendar <Lifecycle status="Preview"/>

<VersionBlock lastVersion="1.8">

The ability to configure custom calendars, such as a fiscal calendar, is available in [dbt Cloud Versionless](/docs/dbt-versions/upgrade-dbt-version-in-cloud#versionless) or dbt Core [v1.9 and higher](/docs/dbt-versions/core).
The ability to configure custom calendars, such as a fiscal calendar, is available in [dbt Cloud Versionless](/docs/dbt-versions/versionless-cloud) or dbt Core [v1.9 and higher](/docs/dbt-versions/core).

To access this feature, [upgrade to Versionless](/docs/dbt-versions/upgrade-dbt-version-in-cloud#versionless) or your dbt Core version to v1.9 or higher.

To access this feature, [upgrade to Versionless](/docs/dbt-versions/versionless-cloud) or your dbt Core version to v1.9 or higher.
</VersionBlock>

<VersionBlock firstVersion="1.9">
38 changes: 36 additions & 2 deletions website/docs/docs/build/metrics-overview.md
@@ -92,7 +92,18 @@ import SLCourses from '/snippets/_sl-course.md';
## Default granularity for metrics
It's possible to define a default time granularity for metrics if it's different from the granularity of the default aggregation time dimensions (`metric_time`). This is useful if your time dimension has a very fine grain, like second or hour, but you typically query metrics rolled up at a coarser grain. The granularity can be set using the `time_granularity` parameter on the metric, and defaults to `day`. If day is not available because the dimension is defined at a coarser granularity, it will default to the defined granularity for the dimension.
<VersionBlock lastVersion="1.8">
Default time granularity for metrics is useful if your time dimension has a very fine grain, like second or hour, but you typically query metrics rolled up at a coarser grain.
To set the default time granularity for metrics, you need to be on dbt Cloud Versionless or dbt v1.9 and higher.
</VersionBlock>
<VersionBlock firstVersion="1.9">
It's possible to define a default time granularity for metrics if it's different from the granularity of the default aggregation time dimensions (`metric_time`). This is useful if your time dimension has a very fine grain, like second or hour, but you typically query metrics rolled up at a coarser grain.

The granularity can be set using the `time_granularity` parameter on the metric, and defaults to `day`. If day is not available because the dimension is defined at a coarser granularity, it will default to the defined granularity for the dimension.

### Example
You have a semantic model called `orders` with a time dimension called `order_time`. You want the `orders` metric to roll up to `monthly` by default; however, you want the option to look at these metrics hourly. You can set the `time_granularity` parameter on the `order_time` dimension to `hour`, and then set the `time_granularity` parameter in the metric to `month`.
@@ -117,6 +128,7 @@ semantic_models:
name: orders
time_granularity: month # Optional, defaults to day
```
</VersionBlock>

## Conversion metrics

@@ -270,6 +282,8 @@ A filter is configured using Jinja templating. Use the following syntax to refer

Refer to [Metrics as dimensions](/docs/build/ref-metrics-in-filters) for details on how to use metrics as dimensions with metric filters:

<VersionBlock firstVersion="1.8">

<File name="models/metrics/file_name.yml" >

```yaml
@@ -283,10 +297,30 @@ filter: |
{{ TimeDimension('time_dimension', 'granularity') }}
filter: |
{{ Metric('metric_name', group_by=['entity_name']) }} # Available in v1.8 or with versionless dbt Cloud.
{{ Metric('metric_name', group_by=['entity_name']) }}
```
</File>
</VersionBlock>

<VersionBlock lastVersion="1.7">


<File name="models/metrics/file_name.yml" >

```yaml
filter: |
{{ Entity('entity_name') }}
filter: |
{{ Dimension('primary_entity__dimension_name') }}
filter: |
{{ TimeDimension('time_dimension', 'granularity') }}
```
</File>
</VersionBlock>

For example, if you want to filter for the order date dimension grouped by month, use the following syntax:
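
A minimal sketch of that syntax, built from the `TimeDimension` template above and assuming the dimension is named `order_date` (an illustrative name, not confirmed here):

```yaml
filter: |
  {{ TimeDimension('order_date', 'month') }} # 'order_date' is an assumed dimension name
```

The second argument is the granularity, so swapping `'month'` for another supported grain, such as `'week'`, would change the grouping in the same way.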

Original file line number Diff line number Diff line change
@@ -6,7 +6,7 @@ description: "Import and auto-generate exposures from dashboards and understand
image: /img/docs/cloud-integrations/auto-exposures/explorer-lineage2.jpg
---

# Configure auto-exposures <Lifecycle status='preview' />
# Configure auto-exposures <Lifecycle status="preview,enterprise" />

As a data team, it’s critical that you have context into the downstream use cases and users of your data products. [Auto-exposures](/docs/collaborate/auto-exposures) integrates natively with Tableau and [auto-generates downstream lineage](/docs/collaborate/auto-exposures#view-auto-exposures-in-dbt-explorer) in dbt Explorer for a richer experience.

2 changes: 1 addition & 1 deletion website/docs/docs/collaborate/auto-exposures.md
@@ -7,7 +7,7 @@ pagination_next: "docs/collaborate/data-tile"
image: /img/docs/cloud-integrations/auto-exposures/explorer-lineage.jpg
---

# Auto-exposures <Lifecycle status='preview' />
# Auto-exposures <Lifecycle status="preview,enterprise" />

As a data team, it’s critical that you have context into the downstream use cases and users of your data products. Auto-exposures integrates natively with Tableau (Power BI coming soon) and auto-generates downstream lineage in dbt Explorer for a richer experience.
