Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

[SL-1987] Post-join filter application on time spine queries results in invalid query exceptions #1119

Closed
Tracked by #1010
tlento opened this issue Apr 10, 2024 · 3 comments · Fixed by #1120
Closed
Tracked by #1010
Assignees
Labels
bug Something isn't working High priority Created by Linear-GitHub Sync Metricflow Created by Linear-GitHub Sync

Comments

@tlento
Copy link
Contributor

tlento commented Apr 10, 2024

For details please refer to the user report in slack.

Due to an issue with the time spine join causing an explosion of extra date/time rows in the output results (see #1039 for details), we added a second-stage filter operation in #1056 . Unfortunately, the test cases covered time-only filters, but not filters including time and some other dimension/entity filter.

There are two problems here:

  1. In the case where the dimension is selected the time spine will be an expensive no-op, as we will apply the filter a second time, thereby throwing away any time spine rows that don't match the dimension. So if you did something where you filtered on listing__is_deactivated and you wanted to fill zeros for all the dates where no deactivated listings had activity, you'd just filter out all of those dates because there would be a NULL in the column in the outer query.
  2. In the case where the dimension is not selected, the query will fail as in the linked thread.

There is no clean solution to this problem today. We either have unexpected dates, as described in #1056, or we have inappropriate behavior, as described here.

Happily, predicate pushdown will offer a way out of this mess. If the time-identification components of predicate pushdown will take too long to materialize we should probably just roll back #1056 before our next release, as returning too many rows is something end users can work around while we give them the tools necessary to apply pre-filtering.

SL-1987

@tlento tlento changed the title Post-join filter application on time spine queries results in invalid query exceptions [SL-1987] Post-join filter application on time spine queries results in invalid query exceptions Apr 10, 2024
@tlento tlento added bug Something isn't working High priority Created by Linear-GitHub Sync labels Apr 10, 2024
@tlento
Copy link
Contributor Author

tlento commented Apr 10, 2024

To be clear, the things we need for a complete solution are:

  1. Remove the existing double-filter logic
  2. Enable pushdown of metric_time or agg_time_dimension filters to the time_spine

This issue should be closed when the first of these is complete, and predicate pushdown will account for the time spine in a more natural way. I've linked it to that main tracking task so we don't forget to push to time spine, but that's hardly a thing I'm likely to forget given the layout of the code.

@tlento tlento self-assigned this Apr 10, 2024
@tlento tlento added this to the v.18 milestone Apr 10, 2024
@tlento tlento assigned courtneyholcomb and unassigned tlento Apr 10, 2024
@tlento tlento modified the milestones: v.18, v.19 Apr 23, 2024
@tlento tlento modified the milestones: v.19, v.20 May 7, 2024
@tlento tlento added the Metricflow Created by Linear-GitHub Sync label May 8, 2024
@tlento tlento modified the milestones: v.20, v.21 May 21, 2024
@courtneyholcomb
Copy link
Contributor

@tomkit.lento I don't think we want to remove the double-filter logic, do we? My thought is that we'll end up with the scenario where you've selected a categorical dimension, filtered out nulls, and then we join them back in so you see nulls in the output. I think the customer would expect the where filter to override the join_to_timespine directive.

My proposed solution is to only apply the second filter if the filtered spec is also requested in the group by (since those should be the only ones that might return unexpected results).

@tlento tlento linked a pull request May 23, 2024 that will close this issue
@tlento tlento reopened this May 23, 2024
@tlento tlento closed this as completed May 23, 2024
@tlento tlento reopened this May 23, 2024
@tlento tlento closed this as completed May 23, 2024
@tlento tlento reopened this May 23, 2024
@tlento tlento closed this as completed May 23, 2024
@tlento tlento reopened this May 23, 2024
@tlento tlento closed this as completed May 23, 2024
@tlento tlento reopened this May 23, 2024
@tlento tlento closed this as completed May 23, 2024
@tlento tlento reopened this May 23, 2024
@tlento tlento closed this as completed May 23, 2024
@tlento tlento reopened this May 23, 2024
@tlento tlento closed this as completed May 23, 2024
@tlento tlento reopened this May 23, 2024
@tlento tlento closed this as completed May 23, 2024
@tlento tlento reopened this May 23, 2024
@tlento tlento closed this as completed May 23, 2024
@tlento tlento reopened this May 23, 2024
@tlento tlento closed this as completed May 23, 2024
@tlento tlento reopened this May 23, 2024
@tlento tlento closed this as completed May 23, 2024
@tlento tlento reopened this May 23, 2024
@tlento tlento closed this as completed May 23, 2024
@tlento tlento reopened this May 23, 2024
@tlento tlento closed this as completed May 23, 2024
@tlento tlento reopened this May 23, 2024
@tlento tlento closed this as completed May 23, 2024
@tlento tlento reopened this May 23, 2024
@tlento tlento closed this as completed May 23, 2024
@tlento tlento reopened this May 23, 2024
@tlento tlento closed this as completed May 23, 2024
@tlento tlento reopened this May 23, 2024
@tlento tlento closed this as completed May 23, 2024
@tlento tlento reopened this May 23, 2024
@tlento tlento closed this as completed May 23, 2024
@tlento tlento reopened this May 23, 2024
@tlento tlento closed this as completed May 23, 2024
@tlento tlento reopened this May 23, 2024
@tlento
Copy link
Contributor Author

tlento commented Jun 17, 2024

I'm pretty sure this is fixed and I have no idea why it's showing that I've been opening and closing and opening and closing. Super weird.

@tlento tlento closed this as completed Jun 17, 2024
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
bug Something isn't working High priority Created by Linear-GitHub Sync Metricflow Created by Linear-GitHub Sync
Projects
None yet
Development

Successfully merging a pull request may close this issue.

2 participants