Add test time spines for sub-daily granularity #1358
Changes from 5 commits
9cec4eb
d7ea994
fdcf983
ed9555c
2ef8de8
ae263fd
Original file line number | Diff line number | Diff line change |
---|---|---|
|
@@ -3,6 +3,7 @@ | |
from typing import Mapping | ||
|
||
import pytest | ||
from dbt_semantic_interfaces.type_enums.time_granularity import TimeGranularity | ||
from metricflow_semantics.query.query_parser import MetricFlowQueryParser | ||
from metricflow_semantics.specs.column_assoc import ColumnAssociationResolver | ||
from metricflow_semantics.test_helpers.config_helpers import MetricFlowTestConfiguration | ||
|
@@ -90,7 +91,26 @@ def scd_query_parser( # noqa: D103 | |
|
||
|
||
@pytest.fixture(scope="session") | ||
def time_spine_source( # noqa: D103 | ||
def time_spine_sources( # noqa: D103 | ||
sql_client: SqlClient, mf_test_configuration: MetricFlowTestConfiguration # noqa: F811 | ||
) -> TimeSpineSource: | ||
return TimeSpineSource(schema_name=mf_test_configuration.mf_source_schema, table_name="mf_time_spine") | ||
) -> Mapping[TimeGranularity, TimeSpineSource]: | ||
legacy_time_spine_grain = TimeGranularity.DAY | ||
time_spine_base_table_name = "mf_time_spine" | ||
print("expected schema name:", mf_test_configuration.mf_source_schema) | ||
# Legacy time spine | ||
time_spine_sources = { | ||
legacy_time_spine_grain: TimeSpineSource( | ||
schema_name=mf_test_configuration.mf_source_schema, table_name=time_spine_base_table_name | ||
) | ||
} | ||
Comment on lines +101 to +105

Wait, does this mean that we're no longer triggering the fallback behavior in the runtime, because we're overriding it with an explicit time spine input? What happens if I define an HOUR grain time spine but leave the DAY grain one in the original configuration? Does that fail unceremoniously, do we raise an informative error, or do we just use the HOUR spine for everything? I'm actually fine with raising an informative error or using the HOUR spine, especially for now; I just realized I'm not clear on what happens.

The legacy time spine config will only be included in your manifest if you have the |
||
# Current time spines | ||
for granularity in TimeGranularity: | ||
if granularity.to_int() < legacy_time_spine_grain.to_int(): | ||
time_spine_sources[granularity] = TimeSpineSource( | ||
schema_name=mf_test_configuration.mf_source_schema, | ||
table_name=f"{time_spine_base_table_name}_{granularity.value}", | ||
time_column_name="ts", | ||
time_column_granularity=granularity, | ||
) | ||
|
||
return time_spine_sources |
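To make the effect of the fixture's loop concrete, here is a minimal, self-contained sketch of the same mapping logic. `Grain` and `SpineSource` are hypothetical stand-ins for `TimeGranularity` and `TimeSpineSource` (the real classes have more members and fields), the default column name is an assumption, and the schema name in the usage note is made up. Only the table-naming rule and the "finer than DAY" condition are taken from the diff.

```python
from dataclasses import dataclass
from enum import Enum
from typing import Dict


class Grain(Enum):
    """Hypothetical stand-in for TimeGranularity, declared finest to coarsest."""

    MILLISECOND = "millisecond"
    SECOND = "second"
    MINUTE = "minute"
    HOUR = "hour"
    DAY = "day"
    WEEK = "week"

    def to_int(self) -> int:
        # Position in declaration order, so finer grains get smaller numbers.
        return list(Grain).index(self)


@dataclass(frozen=True)
class SpineSource:
    """Hypothetical stand-in for TimeSpineSource; the defaults are assumptions."""

    schema_name: str
    table_name: str
    time_column_name: str = "ds"
    time_column_granularity: Grain = Grain.DAY


def build_time_spine_sources(
    schema_name: str, base_table_name: str = "mf_time_spine"
) -> Dict[Grain, SpineSource]:
    legacy_grain = Grain.DAY
    # Legacy time spine: un-suffixed table at DAY grain.
    sources = {legacy_grain: SpineSource(schema_name=schema_name, table_name=base_table_name)}
    # Current time spines: one suffixed table per grain finer than DAY.
    for grain in Grain:
        if grain.to_int() < legacy_grain.to_int():
            sources[grain] = SpineSource(
                schema_name=schema_name,
                table_name=f"{base_table_name}_{grain.value}",
                time_column_name="ts",
                time_column_granularity=grain,
            )
    return sources
```

Calling `build_time_spine_sources("test_schema")` under these assumptions yields one entry per sub-daily grain plus the DAY entry, and no entry for grains coarser than DAY.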
Original file line number | Diff line number | Diff line change |
---|---|---|
@@ -0,0 +1,42 @@ | ||
table_snapshot: | ||
table_name: mf_time_spine_hour | ||
column_definitions: | ||
- name: ts | ||
type: TIME | ||
rows: | ||
- ["2020-01-01 01:00:00"] | ||
Huh. I just realized, are we going to date_trunc the time spine input to the specified grain? I'm pretty sure we don't do it today, but there's a type for it (DATE). The spec calls for the end user to configure that correctly, so I'm inclined not to date_trunc right now, but it might be something we need to do. Presumably most people are using packages to build these things, so maybe we just rely on that. If we're worried but not very worried about this, we could also set up a best-effort warehouse validation, or release a validation package for time spine models that people can use if they wish.

We don't apply that |
||
- ["2020-01-01 02:00:00"] | ||
- ["2020-01-01 03:00:00"] | ||
- ["2020-01-01 04:00:00"] | ||
- ["2020-01-01 05:00:00"] | ||
- ["2020-01-01 06:00:00"] | ||
- ["2020-01-01 07:00:00"] | ||
- ["2020-01-01 08:00:00"] | ||
- ["2020-01-01 09:00:00"] | ||
- ["2020-01-01 10:00:00"] | ||
- ["2020-01-01 11:00:00"] | ||
- ["2020-01-01 12:00:00"] | ||
- ["2020-01-02 01:00:00"] | ||
- ["2020-01-02 02:00:00"] | ||
- ["2020-01-02 03:00:00"] | ||
- ["2020-01-02 04:00:00"] | ||
- ["2020-01-02 05:00:00"] | ||
- ["2020-01-02 06:00:00"] | ||
- ["2020-01-02 07:00:00"] | ||
- ["2020-01-02 08:00:00"] | ||
- ["2020-01-02 09:00:00"] | ||
- ["2020-01-02 10:00:00"] | ||
- ["2020-01-02 11:00:00"] | ||
- ["2020-01-02 12:00:00"] | ||
- ["2020-01-03 01:00:00"] | ||
- ["2020-01-03 02:00:00"] | ||
- ["2020-01-03 03:00:00"] | ||
- ["2020-01-03 04:00:00"] | ||
- ["2020-01-03 05:00:00"] | ||
- ["2020-01-03 06:00:00"] | ||
- ["2020-01-03 07:00:00"] | ||
- ["2020-01-03 08:00:00"] | ||
- ["2020-01-03 09:00:00"] | ||
- ["2020-01-03 10:00:00"] | ||
- ["2020-01-03 11:00:00"] | ||
- ["2020-01-03 12:00:00"] |
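The review comment on this file floats a best-effort validation for time spine inputs instead of applying date_trunc. A rough sketch of what such a check could look like on the Python side is below. The function names and the plain-string grain names are hypothetical and not part of MetricFlow; the check only confirms that every field finer than the grain is zero.

```python
from datetime import datetime
from typing import Iterable, List


def is_truncated_to_grain(ts: datetime, grain: str) -> bool:
    """Return True if `ts` has no component finer than `grain`."""
    if grain == "day":
        return (ts.hour, ts.minute, ts.second, ts.microsecond) == (0, 0, 0, 0)
    if grain == "hour":
        return (ts.minute, ts.second, ts.microsecond) == (0, 0, 0)
    if grain == "minute":
        return (ts.second, ts.microsecond) == (0, 0)
    if grain == "second":
        return ts.microsecond == 0
    if grain == "millisecond":
        return ts.microsecond % 1000 == 0
    raise ValueError(f"Unsupported grain for this sketch: {grain}")


def find_misaligned_rows(rows: Iterable[str], grain: str) -> List[str]:
    """Return the raw row values that are not aligned to `grain`."""
    return [
        raw
        for raw in rows
        if not is_truncated_to_grain(datetime.fromisoformat(raw), grain)
    ]
```

For example, `find_misaligned_rows(["2020-01-01 01:00:00", "2020-01-01 02:30:00"], "hour")` flags only the second value. A warehouse-side validation would presumably run an equivalent comparison in SQL rather than pulling rows into Python.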
Original file line number | Diff line number | Diff line change |
---|---|---|
@@ -0,0 +1,37 @@ | ||
table_snapshot: | ||
table_name: mf_time_spine_microsecond | ||
column_definitions: | ||
- name: ts | ||
type: TIME | ||
rows: | ||
- ["2020-01-01 00:00:00.000000"] | ||
- ["2020-01-01 00:00:00.000001"] | ||
- ["2020-01-01 00:00:00.000002"] | ||
- ["2020-01-01 00:00:00.000003"] | ||
- ["2020-01-01 00:00:00.000004"] | ||
- ["2020-01-01 00:00:00.000005"] | ||
- ["2020-01-01 00:00:00.000006"] | ||
- ["2020-01-01 00:00:00.000007"] | ||
- ["2020-01-01 00:00:00.000008"] | ||
- ["2020-01-01 00:00:00.000009"] | ||
- ["2020-01-01 00:00:00.000010"] | ||
- ["2020-01-01 00:00:00.000011"] | ||
- ["2020-01-01 00:00:00.000012"] | ||
- ["2020-01-01 00:00:00.000013"] | ||
- ["2020-01-01 00:00:00.000014"] | ||
- ["2020-01-01 00:00:00.000015"] | ||
- ["2020-01-01 00:00:00.000016"] | ||
- ["2020-01-01 00:00:00.000017"] | ||
- ["2020-01-01 00:00:00.000018"] | ||
- ["2020-01-01 00:00:00.000019"] | ||
- ["2020-01-01 00:00:00.000020"] | ||
- ["2020-01-01 00:00:00.000021"] | ||
- ["2020-01-01 00:00:00.000022"] | ||
- ["2020-01-01 00:00:00.000023"] | ||
- ["2020-01-01 00:00:00.000024"] | ||
- ["2020-01-01 00:00:00.000025"] | ||
- ["2020-01-01 00:00:00.000026"] | ||
- ["2020-01-01 00:00:00.000027"] | ||
- ["2020-01-01 00:00:00.000028"] | ||
- ["2020-01-01 00:00:00.000029"] | ||
We might want values from multiple days, even though they won't be contiguous in the input. I'm not sure if this really matters, but I'm always wary of having test data pegged to a boundary (in this case, a year boundary).

Fair enough - I can update that tomorrow! Shouldn't impact any of the tests, just will need to repopulate the source schemas. |
||
- ["2020-01-01 00:00:00.000030"] |
Original file line number | Diff line number | Diff line change |
---|---|---|
@@ -0,0 +1,36 @@ | ||
table_snapshot: | ||
table_name: mf_time_spine_millisecond | ||
column_definitions: | ||
- name: ts | ||
type: TIME | ||
rows: | ||
- ["2020-01-01 00:00:00.001"] | ||
- ["2020-01-01 00:00:00.002"] | ||
- ["2020-01-01 00:00:00.003"] | ||
- ["2020-01-01 00:00:00.004"] | ||
- ["2020-01-01 00:00:00.005"] | ||
- ["2020-01-01 00:00:00.006"] | ||
- ["2020-01-01 00:00:00.007"] | ||
- ["2020-01-01 00:00:00.008"] | ||
- ["2020-01-01 00:00:00.009"] | ||
- ["2020-01-01 00:00:00.010"] | ||
- ["2020-01-01 00:00:00.011"] | ||
- ["2020-01-01 00:00:00.012"] | ||
- ["2020-01-01 00:00:00.013"] | ||
- ["2020-01-01 00:00:00.014"] | ||
- ["2020-01-01 00:00:00.015"] | ||
- ["2020-01-01 00:00:00.016"] | ||
- ["2020-01-01 00:00:00.017"] | ||
- ["2020-01-01 00:00:00.018"] | ||
- ["2020-01-01 00:00:00.019"] | ||
- ["2020-01-01 00:00:00.020"] | ||
- ["2020-01-01 00:00:00.021"] | ||
- ["2020-01-01 00:00:00.022"] | ||
- ["2020-01-01 00:00:00.023"] | ||
- ["2020-01-01 00:00:00.024"] | ||
- ["2020-01-01 00:00:00.025"] | ||
- ["2020-01-01 00:00:00.026"] | ||
- ["2020-01-01 00:00:00.027"] | ||
- ["2020-01-01 00:00:00.028"] | ||
- ["2020-01-01 00:00:00.029"] | ||
- ["2020-01-01 00:00:00.030"] |
Oh, I think I know why this is. Timestamp literals in Trino have to take the form TIMESTAMP <literal>, so we'd need to do some substitution somewhere. If the error is due to the filter, it means we have to do custom rendering against the filter expr.
Yep. I couldn't come up with an easy way to fix it here and wasn't sure it was worth doing the hard fix for an engine we barely use.
Agreed. I think the right way to deal with this is via a more holistic approach to filter expression inputs. In the meantime, having a small gap in Trino test coverage - particularly one where the main difference essentially boils down to how we go about pasting in the user-provided expression at render time - seems fine to me.
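For illustration only, the engine-specific difference discussed in this thread can be sketched as a small rendering helper. `render_timestamp_literal` and the engine name strings are hypothetical and not part of MetricFlow's renderer. The sketch covers only literals the framework generates itself, not the harder case raised above where the literal is embedded in a user-provided filter expression.

```python
from datetime import datetime


def render_timestamp_literal(value: datetime, engine: str) -> str:
    """Render a timestamp as a SQL literal for the given engine (sketch)."""
    text = value.isoformat(sep=" ", timespec="seconds")
    quoted = f"'{text}'"
    if engine == "trino":
        # Trino expects the typed form: TIMESTAMP '<value>'.
        return f"TIMESTAMP {quoted}"
    # Assumption: other engines accept a bare quoted string in this position.
    return quoted
```

With this helper, `render_timestamp_literal(datetime(2020, 1, 1, 1), "trino")` produces `TIMESTAMP '2020-01-01 01:00:00'`, while any other engine name falls through to the plain quoted string.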