Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Allow measures that join to time spine to use agg_time_dim #1005

Merged
merged 21 commits into from
Feb 5, 2024

Conversation

courtneyholcomb
Copy link
Contributor

@courtneyholcomb courtneyholcomb commented Jan 26, 2024

Resolves #1009

Description

Some measures specify that they should be joined to time spine in the YAML. We only execute this join if metric time is queried, since that's the only time joining to time spine is allowed. Since we're relaxing that restriction, we now can execute the join if agg_time_dimension is queried.

@cla-bot cla-bot bot added the cla:yes label Jan 26, 2024
Copy link

Thank you for your pull request! We could not find a changelog entry for this change. For details on how to document a change, see the contributing guide.

@courtneyholcomb courtneyholcomb force-pushed the court/join-to-time-spine-agg-time-dim branch from 1e43d95 to 8b0a072 Compare January 26, 2024 22:44
@courtneyholcomb courtneyholcomb changed the base branch from main to court/time-offset-agg-time-dim January 26, 2024 22:44
@courtneyholcomb courtneyholcomb force-pushed the court/join-to-time-spine-agg-time-dim branch from 8b0a072 to fd61f3c Compare January 26, 2024 22:49
@courtneyholcomb courtneyholcomb marked this pull request as ready for review January 26, 2024 22:54
@courtneyholcomb courtneyholcomb changed the title Allow measures that join to time spine to use agg_time_dim Allow measures that join to time spine to use agg_time_dim Jan 26, 2024
@courtneyholcomb courtneyholcomb added the Run Tests With Other SQL Engines Runs the test suite against the SQL engines in our target environment label Jan 26, 2024
@courtneyholcomb courtneyholcomb removed the Run Tests With Other SQL Engines Runs the test suite against the SQL engines in our target environment label Jan 26, 2024
Copy link
Contributor

@tlento tlento left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Checking in before approving - are there any cases where this might result in an ambiguous plan?

I'm thinking of a situation where someone has multiple metrics, or some combination nested metric, and they group by this agg_time_dimension and also metric_time. I'm not even sure if that's a valid query ordinarily, honestly.

Comment on lines 1110 to 1119
query_contains_agg_time_dimension = queried_linkable_specs.contains_metric_time
if not query_contains_agg_time_dimension:
valid_agg_time_dimensions = self._semantic_model_lookup.get_agg_time_dimension_specs_for_measure(
measure_spec.reference
)
query_contains_agg_time_dimension = bool(
set(queried_linkable_specs.time_dimension_specs).intersection(set(valid_agg_time_dimensions))
)

if query_contains_agg_time_dimension:
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yeah if we have some kind of helper I think it'll clean up a lot of these conditionals as well, because it'll hide the wonky logic inside of the helper and this can become something like:

if query_contains_agg_time_dimension(queried_linkable_specs, measure_spec):

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Updated!

subq_5.ds AS booking__ds__day
, COALESCE(subq_3.bookings, 0) AS bookings_fill_nulls_with_0
FROM {{ source_schema }}.mf_time_spine subq_5
LEFT OUTER JOIN (
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Ok, so previously this would have just run the query but skipped the join, correct? That seems strange and incorrect to me.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The logic there was that if there's no time dimension to join to, we can't join to time spine anyway.
And then, if there is a time dimension that isn't agg time, should we time spine join to that dimension? Seems inconsistent if we're treating it like any other categorical dimension in all other ways.

@courtneyholcomb
Copy link
Contributor Author

Checking in before approving - are there any cases where this might result in an ambiguous plan?

I'm thinking of a situation where someone has multiple metrics, or some combination nested metric, and they group by this agg_time_dimension and also metric_time. I'm not even sure if that's a valid query ordinarily, honestly.

@tlento This clause should be able to execute for as many measures as are requested, each join will happen in its own subquery. We're checking for the agg_time_dimension specific to the measure here. So we would join to time spine only for measures whose agg_time_dimension was queried (or if metric_time was queried).
Does that answer the question?

@tlento tlento added the Run Tests With Other SQL Engines Runs the test suite against the SQL engines in our target environment label Jan 31, 2024
@tlento tlento temporarily deployed to DW_INTEGRATION_TESTS January 31, 2024 20:44 — with GitHub Actions Inactive
@tlento tlento temporarily deployed to DW_INTEGRATION_TESTS January 31, 2024 20:44 — with GitHub Actions Inactive
@tlento tlento temporarily deployed to DW_INTEGRATION_TESTS January 31, 2024 20:44 — with GitHub Actions Inactive
@tlento tlento temporarily deployed to DW_INTEGRATION_TESTS January 31, 2024 20:44 — with GitHub Actions Inactive
@github-actions github-actions bot removed the Run Tests With Other SQL Engines Runs the test suite against the SQL engines in our target environment label Jan 31, 2024
Copy link
Contributor

@tlento tlento left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This clause should be able to execute for as many measures as are requested, each join will happen in its own subquery.

Duh, of course it does, thank you.

Also, in re-reading that comment, I think the right answer for people duplicating their requested time dimensions across metrics is "don't do that" - I'm sure I'm missing some super critical use case where that's a totally reasonable thing to do, but it doesn't sound reasonable when I read it out loud.

Base automatically changed from court/time-offset-agg-time-dim to main February 3, 2024 02:40
@courtneyholcomb courtneyholcomb merged commit 63e7ffb into main Feb 5, 2024
9 checks passed
@courtneyholcomb courtneyholcomb deleted the court/join-to-time-spine-agg-time-dim branch February 5, 2024 19:30
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
Projects
None yet
Development

Successfully merging this pull request may close these issues.

[SL-1619] [Feature] Allow measures that join to time spine to use agg_time_dimension
2 participants