Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Fix incorrect cumulative metric output when start-time and end-time are used #871

Merged
merged 2 commits into from
Nov 15, 2023

Conversation

tlento
Copy link
Contributor

@tlento tlento commented Nov 15, 2023

When users submit CLI queries with the --start-time and --end-time
parameters, MetricFlow will automatically adjust the time constraint
window to include all relevant cumulative metric input.

Unfortunately, we got over-zealous with fencepost adjustments and
introduced an off-by-one error that meant cumulative metric
queries would only be correct when the specified granularity was
DAILY.

We never caught this issue because the error in the rendered SQL is
subtle, and all of our test input data on the integration tests
happened to fit within the artificially shortened constraint windows
in all of the cases where we were testing non-daily granularity
queries.

This change updates the integration test configurations to ensure
boundary conditions will be caught in the future. In addition, the
fix alters the values of the query output snapshot tests, which,
when compared with the relevant input test data, will reveal the
nature of the original bug and the effects of this fix.

Cumulative metric queries are very difficult to reason about,
especially when we do automated date sliding to include all of
the necessary input data.

This commit adds some output tests, which can be compared against
the input data config files for fct_bookings and fct_revenue for
correctness.
When users submit CLI queries with the --start-time and --end-time
parameters, MetricFlow will automatically adjust the time constraint
window to include all relevant cumulative metric input.

Unfortunately, we got over-zealous with fencepost adjustments and
introduced an off-by-one error that meant cumulative metric
queries would only be correct when the specified granularity was
DAILY.

We never caught this issue because the error in the rendered SQL is
subtle, and all of our test input data on the integration tests
happened to fit within the artificially shortened constraint windows
in all of the cases where we were testing non-daily granularity
queries.

This change updates the integration test configurations to ensure
boundary conditions will be caught in the future. In addition, the
fix alters the values of the query output snapshot tests, which,
when compared with the relevant input test data, will reveal the
nature of the original bug and the effects of this fix.
@cla-bot cla-bot bot added the cla:yes label Nov 15, 2023
@@ -80,7 +80,7 @@ def adjust_time_constraint_for_cumulative_metric(
) -> TimeRangeConstraint:
"""Given a time constraint for the overall query, adjust it to cover the time range for this metric."""
if granularity is not None:
return self._adjust_time_constraint_start_by_window(granularity, count - 1)
return self._adjust_time_constraint_start_by_window(granularity, count)
Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This is the actual fix.

@@ -6,7 +6,7 @@ integration_test:
metrics: ["trailing_2_months_revenue"]
group_bys: ["metric_time"]
order_bys: ["metric_time"]
time_constraint: ["2020-01-05", "2021-01-04"]
time_constraint: ["2020-03-05", "2021-01-04"]
Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I left these tests in place for a couple of reasons, but one is to illustrate how the constraint adjustments need to be applied here vs in the where clause.

A number of these were producing correct results on inspection of input data only because the input data didn't start until within the too-short filter window.

1 2021-01-04 2000
2 2021-01-05 2000
3 2021-01-06 2000
0 2021-01-03 3000
Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@tlento tlento linked an issue Nov 15, 2023 that may be closed by this pull request
2 tasks
Copy link
Contributor

@courtneyholcomb courtneyholcomb left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Nice find 🙏 🙏 Thank you for fixing!!

@tlento tlento merged commit 8e69cda into main Nov 15, 2023
22 checks passed
@tlento tlento deleted the fix-incorrect-cumulative-metric-output branch November 15, 2023 03:49
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
Projects
None yet
Development

Successfully merging this pull request may close these issues.

[Bug] --start-time / --end-time with cumulative windows broken
2 participants