Support multiple time spines using new time spine configs #1348

Merged
merged 5 commits into main from cout/time-spines on Jul 26, 2024

Conversation

@courtneyholcomb (Contributor) commented Jul 25, 2024

Completes SL-2313

Consume new time spine configs and add support for using multiple time spines at different granularities.

@cla-bot cla-bot bot added the cla:yes label Jul 25, 2024
@courtneyholcomb courtneyholcomb changed the title from "Upgrade DSI" to "Support new time spines" Jul 25, 2024
@dbt-labs dbt-labs deleted a comment from github-actions bot Jul 25, 2024
@courtneyholcomb courtneyholcomb changed the title from "Support new time spines" to "Support multiple time spines using new time spine configs" Jul 25, 2024

linear bot commented Jul 25, 2024

@courtneyholcomb courtneyholcomb marked this pull request as ready for review July 25, 2024 23:38
@tlento (Contributor) left a comment

Seems reasonable so far. I assume we will have more robust tests added as we get some models with custom time spines in place.

I left a bunch of inlines. None of them are merge-blocking, but you may as well clean up whatever you think needs cleaning up before hitting the button.

return (
    self.source_nodes_for_metric_queries + self.source_nodes_for_group_by_item_queries + (self.time_spine_node,)
)
return self.source_nodes_for_metric_queries + self.source_nodes_for_group_by_item_queries
@tlento (Contributor) commented:

So I read below and I see why this is this way, but I think it's a bit weird - this is mainly an artifact of us expecting callers to wire in the time spine nodes as group by item source nodes.

I can't come up with anything obviously better, and it's probably outside the scope of this PR to do that refactor, but we should probably address this. I suspect it'll get confusing with things like custom calendar processing.

Comment on lines 62 to 73
self._time_spine_source_nodes = {
    granularity: MetricTimeDimensionTransformNode.create(
        parent_node=ReadSqlSourceNode.create(
            data_set=data_set_converter.build_time_spine_source_data_set(time_spine_source)
        ),
        aggregation_time_dimension_reference=TimeDimensionReference(
            element_name=time_spine_source.time_column_name
        ),
    )
    for granularity, time_spine_source in TimeSpineSource.create_from_manifest(
        semantic_manifest_lookup.semantic_manifest
    ).items()
}
@tlento (Contributor) commented:

Style nit - I think this would be easier to read if it built the dict via a for loop. That's a lot of function nesting inside a dict comprehension. It'd also be a much smaller change in the diff, since we could just indent the locals inside the loop and replace the assignment with an add to the dict. A sketch of that form follows.
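
For illustration, the for-loop form would look something like this (a sketch reusing the names from the diff above; the intermediate data_set local is added for readability):

self._time_spine_source_nodes = {}
for granularity, time_spine_source in TimeSpineSource.create_from_manifest(
    semantic_manifest_lookup.semantic_manifest
).items():
    # Build the time spine's source data set, then index the transform node
    # by the granularity it serves.
    data_set = data_set_converter.build_time_spine_source_data_set(time_spine_source)
    self._time_spine_source_nodes[granularity] = MetricTimeDimensionTransformNode.create(
        parent_node=ReadSqlSourceNode.create(data_set=data_set),
        aggregation_time_dimension_reference=TimeDimensionReference(
            element_name=time_spine_source.time_column_name
        ),
    )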

# Provides the time spine.
time_spine_node: MetricTimeDimensionTransformNode
# Provides the time spines.
time_spine_nodes: Dict[TimeGranularity, MetricTimeDimensionTransformNode]
@tlento (Contributor) commented:

We might consider typing this as a Mapping so it's not considered mutable.
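
A minimal sketch of that change (assuming Mapping is imported from typing alongside the existing Dict import):

from typing import Mapping

# Mapping is read-only at the type level, so type checkers will flag any
# attempt by consumers to mutate this attribute.
time_spine_nodes: Mapping[TimeGranularity, MetricTimeDimensionTransformNode]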

assert (
    agg_time_dimension_instances
), "Building time spine dataset requires agg_time_dimension_instances, but none were found."
smallest_agg_time_grain = min(dim.spec.time_granularity for dim in agg_time_dimension_instances)
@tlento (Contributor) commented:

Ok, this works, but it'll start failing with custom granularities. Good thing for me to note here - we'll need to be able to go from the agg time dimension instance to the time spine index granularity and check up on that.
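
A hypothetical sketch of what that lookup could look like; custom_grain_bases and base_grain_for are illustrative names, not MetricFlow APIs, and assume each custom granularity records the base grain of its defining time spine:

# Illustrative only: resolve each granularity to a standard time spine grain
# before taking the minimum, so custom grains don't break the comparison.
custom_grain_bases = {"fiscal_quarter": TimeGranularity.DAY}  # assumed config

def base_grain_for(grain):
    # Standard grains pass through; custom grains resolve to the base grain
    # of the time spine that defines them.
    return grain if isinstance(grain, TimeGranularity) else custom_grain_bases[grain]

smallest_agg_time_grain = min(
    base_grain_for(dim.spec.time_granularity) for dim in agg_time_dimension_instances
)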

if not compatible_time_spine_grains:
    raise RuntimeError(
        # TODO: update docs link when new docs are available
        f"No compatible time spine found. This query requires a time spine with granularity {smallest_agg_time_grain} or smaller. See docs to configure a new time spine: https://docs.getdbt.com/docs/build/metricflow-time-spine"
    )
@tlento (Contributor) commented:

Let's include the smallest time spine granularity as well. Something like:

This query requires a time spine with granularity {smallest_agg_time_grain}, but the smallest available time spine granularity is {min(self._time_spine_sources.keys())}, which is too large.
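
Wired into the raise from the diff above, that message would look roughly like this (a sketch; self._time_spine_sources is taken from the comment):

if not compatible_time_spine_grains:
    raise RuntimeError(
        # TODO: update docs link when new docs are available
        f"No compatible time spine found. This query requires a time spine with "
        f"granularity {smallest_agg_time_grain}, but the smallest available time spine "
        f"granularity is {min(self._time_spine_sources.keys())}, which is too large. "
        "See docs to configure a new time spine: "
        "https://docs.getdbt.com/docs/build/metricflow-time-spine"
    )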

time_spine_sources = {
    time_spine.primary_column.time_granularity: TimeSpineSource(
        schema_name=time_spine.node_relation.schema_name,
        table_name=time_spine.node_relation.relation_name,  # is relation name the table name? double check
@tlento (Contributor) commented:

relation_name should be the table-or-view name.

@courtneyholcomb (Contributor, Author) commented:

Whoops, I forgot about this comment 😅 thank you!

# backward compatibility. Ignore it if there is a new time spine config with the same granularity.
legacy_time_spines = semantic_manifest.project_configuration.time_spine_table_configurations
for legacy_time_spine in legacy_time_spines:
    if not time_spine_sources.get(legacy_time_spine.grain):
@tlento (Contributor) commented:

We should use in instead of .get() here:

if legacy_time_spine.grain not in time_spine_sources:

That way you don't get weird bugs when a key maps to a falsy value.
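
A minimal, self-contained illustration of the failure mode (hypothetical values; the real keys are TimeGranularity members):

sources = {"day": {}}  # key present, but its value is falsy

if not sources.get("day"):
    print("treated as missing")  # runs: .get() conflates missing keys and falsy values

if "day" not in sources:
    print("actually missing")  # does not run: membership checks only the key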

@@ -14,7 +15,9 @@ def test_date_spine_date_range(  # noqa: D103
    simple_semantic_manifest_lookup: SemanticManifestLookup,
    create_source_tables: None,
) -> None:
    time_spine_source = TimeSpineSource.create_from_manifest(simple_semantic_manifest_lookup.semantic_manifest)
    time_spine_source = TimeSpineSource.create_from_manifest(simple_semantic_manifest_lookup.semantic_manifest)[
        TimeGranularity.DAY
    ]
@tlento (Contributor) commented Jul 26, 2024:

Oh ok. Technically this is the legacy time spine fallback, right?

@courtneyholcomb (Contributor, Author) commented:

Yeah, honestly, I'm a bit confused about why this test is here. I guess it's essentially testing that the legacy time spine model we have documented somewhere has the values we expect. But this feels like it should live in core or something, not here, since it's not actually related to MF functionality.

@courtneyholcomb (Contributor, Author) commented:

Actually it looks like it's just testing the test data we have in tests_metricflow/fixtures/source_table_snapshots/time_spine_table/mf_time_spine.yaml! I guess that might be useful to keep people from changing it? ¯\_(ツ)_/¯

@dbt-labs dbt-labs deleted a comment from github-actions bot Jul 26, 2024
@courtneyholcomb courtneyholcomb enabled auto-merge (squash) July 26, 2024 05:09
@courtneyholcomb courtneyholcomb merged commit 08016ac into main Jul 26, 2024
15 checks passed
@courtneyholcomb courtneyholcomb deleted the cout/time-spines branch July 26, 2024 05:12