-
Notifications
You must be signed in to change notification settings - Fork 1.6k
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Add CumulativeTypeParams
& sub-daily granularities to semantic manifest
#10350
Conversation
e394800
to
2cd734c
Compare
Codecov ReportAll modified and coverable lines are covered by tests ✅
Additional details and impacted files@@ Coverage Diff @@
## main #10350 +/- ##
==========================================
+ Coverage 88.75% 88.77% +0.01%
==========================================
Files 180 180
Lines 22486 22517 +31
==========================================
+ Hits 19958 19989 +31
Misses 2528 2528
Flags with carried forward coverage won't be shown. Click here to find out more. ☔ View full report in Codecov by Sentry. |
3b286ec
to
40bb8e7
Compare
Pulling out the yaml spec here:
|
CumulativeTypeParams
to semantic manifestCumulativeTypeParams
& sub-daily granularities to semantic manifest
CumulativeTypeParams
& sub-daily granularities to semantic manifestCumulativeTypeParams
to semantic manifest
CumulativeTypeParams
to semantic manifestCumulativeTypeParams
& sub-daily granularities to semantic manifest
} | ||
], | ||
"default": null | ||
"type": "array", |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Not sure what this change is - I don't think it's related to my changes, might be that someone else did not update the schema in their commit?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
🤔 I'm not sure either. I went on an archeological mission, and it looks like it was added in #10096. That PR adds primary_key
to the model resource. However, it appears it was added as a list of strings, never optional. To me, it looks like a mistake, and I think this removal of null is okay 🤷 @dave-connors-3 do you have more context?
@@ -17930,26 +18188,7 @@ | |||
"type": "string" | |||
}, | |||
"resource_type": { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This change also seems unrelated to my PR.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I think this is a fix. Basically, the resource_type
of a saved query isn't actually an enum, it's should always be saved_query
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I think what's happening is we aren't running the scripts/collect-artifact-schema.py
as often as we should. I'll bring this up to the rest of the core team. Maybe we can have a CI check for it.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This looks great! Thank you for doing this work ❤️ I'm not sure on everything happening in manifest/v12.json
but it looks reasonable.
There is one thing we probably should change though. In the DSI counterpart (dbt-labs/dbt-semantic-interfaces#288) we discussed the migration plan. I think from that we are missing 2.b
where in we populate the new paths from the old paths (iff the new paths aren't specified and the old paths are)
"default": "1.8.0a1" | ||
"default": "1.9.0a1" |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
@MichelleArk / @ChenyuLInx This looks correct to me, but I don't know if there'd be any fall out from doing so. This seem okay?
} | ||
], | ||
"default": null | ||
"type": "array", |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
🤔 I'm not sure either. I went on an archeological mission, and it looks like it was added in #10096. That PR adds primary_key
to the model resource. However, it appears it was added as a list of strings, never optional. To me, it looks like a mistake, and I think this removal of null is okay 🤷 @dave-connors-3 do you have more context?
@@ -17930,26 +18188,7 @@ | |||
"type": "string" | |||
}, | |||
"resource_type": { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I think this is a fix. Basically, the resource_type
of a saved query isn't actually an enum, it's should always be saved_query
@@ -17930,26 +18188,7 @@ | |||
"type": "string" | |||
}, | |||
"resource_type": { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I think what's happening is we aren't running the scripts/collect-artifact-schema.py
as often as we should. I'll bring this up to the rest of the core team. Maybe we can have a CI check for it.
), | ||
cumulative_type_params=self._get_optional_cumulative_type_params( | ||
type_params.cumulative_type_params | ||
), |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I think either here or in _get_optional_cumulative_type_params
we want to fall back on the deprecated values type_params.window
and type_params.grain_to_date
to populate the new values type_params.cumulative_type_params.window
and type_params.cululative_type_params.grain_to_date
respectively, if the new paths haven't been specified but the old paths have.
@QMalcolm ah ok! We are running the transformation in MF that should handle this. I didn't realize we would need to handle that in core, too. What's the reason for needing it in both places? |
In a vacuum, if all we care about is the functionality working, then that would be correct. However, that isn't our only goal. Our additional goal is to move in the direction of deprecating and removing the old paths. In order for DSI to remove the old paths, we first need to reasonably guarantee that nearly all semantic manifests being consumed have the new paths set when applicable. There are four possible states for the new paths and old paths:
Currently, any core produced semantic manifest will produce (3). In order for DSI to ever remove the old paths (and the related transformations), we have to guarantee that semantic manifests being consumed by the MetricFlow / SL don't have state (3). So, although we don't technically need to do it now, we need to do it eventually. However, the longer we wait to do it, the longer DSI and Metricflow have to handle both states. Thus, doing it now means the tech debt we incur can be shorter lived and arguably more contained. |
3818e35
to
96b79a8
Compare
@QMalcolm Updated, and added tests for the old paths too. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
LGTM ❤️ Thanks again!
def test_cumulative_metric(self, project): | ||
# initial parse | ||
runner = dbtRunner() | ||
result = runner.invoke(["parse"]) | ||
assert result.success | ||
assert isinstance(result.result, Manifest) | ||
|
||
manifest = get_manifest(project.project_root) | ||
metric_ids = set(manifest.metrics.keys()) | ||
expected_metric_ids_to_cumulative_type_params = { | ||
"metric.test.weekly_visits": CumulativeTypeParams( | ||
window=MetricTimeWindow(count=7, granularity=TimeGranularity.DAY), | ||
period_agg=PeriodAggregation.AVERAGE, |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This could probably be converted to unit test. I don't think it's necessary to do that as part of this PR though. I've been working on being able to unit test the reader clases in schema_yaml_readers.py
, and it should be possible, but I think there is one kink to still work out first.
Hmmm so the |
@QMalcolm I thought that was what I need the schema docs change for? Linked in the PR description but also here! dbt-labs/schemas.getdbt.com#42 |
@courtneyholcomb I looked into the failing check. Basically, if it fails, we know that this change impacts the cloud artifacts team. I've added the label |
@courtneyholcomb Additionally, in talking to the core team about the check. It is specifically not blocking, and we are good to proceed. |
@QMalcolm awesome thank you! Merging now |
@QMalcolm just kidding, I don't have permission to merge 🙃 Would you be able to merge for me? |
### Description Enable validations for `cumulative_type_params` and bump to a new patch version. This version will be added to dbt-core after [this PR](dbt-labs/dbt-core#10350) merges to ensure these fields are validated once they are enabled. ### Checklist - [x] I have read [the contributing guide](https://github.com/dbt-labs/dbt-semantic-interfaces/blob/main/CONTRIBUTING.md) and understand what's expected of me - [x] I have signed the [CLA](https://docs.getdbt.com/docs/contributor-license-agreements) - [x] This PR includes tests, or tests are not required/relevant for this PR - [ ] I have run `changie new` to [create a changelog entry](https://github.com/dbt-labs/dbt-semantic-interfaces/blob/main/CONTRIBUTING.md#adding-a-changelog-entry)
resolves #10360
CumulativeTypeParams
DSI has recently moved fields related to cumulative metrics (specifically,
window
andgrain_to_date
) fromtype_params
totype_params.cumulative_type_params
. This creates better organization in the metric spec, and also adds a new field calledtype_params.cumulative_type_params.period_agg
which is used to re-aggregate cumulative metrics for non-default granularities. This PR adds core support for the new fields.Note that the old fields
type_params.window
andtype_params.grain_to_date
will still work, so this is not a breaking change. There is a transformation that runs in MetricFlow to copy the old fields into the new fields if the new fields are not set, and sets the defaultperiod_agg
value toPeriodAggregation.FIRST
if it's not already set in the manifest.Soon, setting the old fields will trigger a validation warning prompting users to move those values to the new fields. The plan is to eventually deprecate those fields.
Sub-Daily Granularities
Since the new sub-daily granularity options are included in the DSI version upgrade, this PR also adds support for those.
Notes
Checklist