Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Allow measures that join to time spine to use agg_time_dim #1005

Merged
merged 21 commits into from
Feb 5, 2024
Merged
Show file tree
Hide file tree
Changes from all commits
Commits
File filter

Filter by extension

Filter by extension

Conversations
Failed to load comments.
Loading
Jump to
Jump to file
Failed to load files.
Loading
Diff view
Diff view
7 changes: 7 additions & 0 deletions .changes/unreleased/Fixes-20240126-145338.yaml
Original file line number Diff line number Diff line change
@@ -0,0 +1,7 @@
kind: Fixes
body: For measures that join to time spine, allow joining when agg_time_dimension
is queried.
time: 2024-01-26T14:53:38.462443-08:00
custom:
Author: courtneyholcomb
Issue: "1009"
30 changes: 19 additions & 11 deletions metricflow/dataflow/builder/dataflow_plan_builder.py
Original file line number Diff line number Diff line change
Expand Up @@ -417,7 +417,7 @@ def _build_base_metric_output_node(
metric_input_measure_spec = self._build_input_measure_spec_for_base_metric(
filter_spec_factory=filter_spec_factory,
metric_reference=metric_reference,
query_contains_metric_time=queried_linkable_specs.contains_metric_time,
queried_linkable_specs=queried_linkable_specs,
child_metric_offset_window=metric_spec.offset_window,
child_metric_offset_to_grain=metric_spec.offset_to_grain,
cumulative_description=(
Expand Down Expand Up @@ -1059,7 +1059,7 @@ def _build_input_measure_spec_for_base_metric(
child_metric_offset_window: Optional[MetricTimeWindow],
child_metric_offset_to_grain: Optional[TimeGranularity],
descendent_filter_specs: Sequence[WhereFilterSpec],
query_contains_metric_time: bool,
queried_linkable_specs: LinkableSpecSet,
cumulative_description: Optional[CumulativeMeasureDescription],
) -> MetricInputMeasureSpec:
"""Return the input measure spec required to compute the base metric.
Expand Down Expand Up @@ -1101,15 +1101,23 @@ def _build_input_measure_spec_for_base_metric(
offset_to_grain=child_metric_offset_to_grain,
)

# Even if the measure is configured to join to time spine, if there's no metric_time in the query,
# there's no need to join to the time spine since all metric_time will be aggregated.
# Even if the measure is configured to join to time spine, if there's no agg_time_dimension in the query,
# there's no need to join to the time spine since all time will be aggregated.
after_aggregation_time_spine_join_description = None
if input_measure.join_to_timespine and query_contains_metric_time:
after_aggregation_time_spine_join_description = JoinToTimeSpineDescription(
join_type=SqlJoinType.LEFT_OUTER,
offset_window=None,
offset_to_grain=None,
)
if input_measure.join_to_timespine:
if (
len(
queried_linkable_specs.included_agg_time_dimension_specs_for_measure(
measure_reference=measure_spec.reference, semantic_model_lookup=self._semantic_model_lookup
)
)
> 0
):
after_aggregation_time_spine_join_description = JoinToTimeSpineDescription(
join_type=SqlJoinType.LEFT_OUTER,
offset_window=None,
offset_to_grain=None,
)

filter_specs: List[WhereFilterSpec] = []
filter_specs.extend(
Expand Down Expand Up @@ -1438,7 +1446,7 @@ def _build_aggregated_measure_from_measure_source_node(
)
return JoinToTimeSpineNode(
parent_node=aggregate_measures_node,
requested_agg_time_dimension_specs=list(queried_linkable_specs.metric_time_specs),
requested_agg_time_dimension_specs=queried_agg_time_dimension_specs,
use_custom_agg_time_dimension=not queried_linkable_specs.contains_metric_time,
join_type=after_aggregation_time_spine_join_description.join_type,
time_range_constraint=time_range_constraint,
Expand Down
20 changes: 20 additions & 0 deletions metricflow/test/integration/test_cases/itest_metrics.yaml
Original file line number Diff line number Diff line change
Expand Up @@ -1194,6 +1194,26 @@ integration_test:
GROUP BY
1
---
integration_test:
name: simple_fill_nulls_with_0_agg_time_dim
description: Test a simple query that joins to time spine and fills nulls, with agg_time_dimension
model: SIMPLE_MODEL
metrics: ["bookings_fill_nulls_with_0"]
group_by_objs: [{"name": "booking__ds__day"}]
check_query: |
SELECT
subq_5.ds AS booking__ds__day
, COALESCE(subq_3.bookings, 0) AS bookings_fill_nulls_with_0
FROM {{ source_schema }}.mf_time_spine subq_5
LEFT OUTER JOIN (
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Ok, so previously this would have just run the query but skipped the join, correct? That seems strange and incorrect to me.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The logic there was that if there's no time dimension to join to, we can't join to time spine anyway.
And then, if there is a time dimension that isn't agg time, should we time spine join to that dimension? Seems inconsistent if we're treating it like any other categorical dimension in all other ways.

SELECT
{{ render_date_trunc("ds", TimeGranularity.DAY) }} AS booking__ds__day
, SUM(1) AS bookings
FROM {{ source_schema }}.fct_bookings bookings_source_src_1
GROUP BY 1
) subq_3
ON subq_5.ds = subq_3.booking__ds__day
---
integration_test:
name: simple_fill_nulls_with_0_with_categorical_dimension
description: Test simple query that fills nulls but doesn't join to time spine (categorical dimension)
Expand Down
Loading