Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Encapsulate time range constraints for predicate pushdown #1215

Merged
merged 1 commit into from
May 24, 2024

Conversation

tlento
Copy link
Contributor

@tlento tlento commented May 15, 2024

MetricFlow currently allows for a limited scope form of filter
predicate pushdown that is particular to time range constraints where
the querying user has provided an explicit time window for us to
query against. The application of the pushdown operation is managed
by threading the time range constraint through the entire dataflow
plan builder and applying the filter operation as appropriate.

This is exactly what we need to do for robust predicate pushdown
evaluation for our expanded set of pushdown operations. Rather than
wire a whole new set of parameters through, we simply encapsulate
the time range constraints inside of a new object that is more
readily extensible to other predicate pushdown handling.

This specific change is as mechanical as possible in order to minimize
confusion. Places that stood out for improvement via the encapsulating
object have been marked for later updates, which will follow shortly.

@tlento tlento requested review from courtneyholcomb and plypaul May 15, 2024 22:22
@cla-bot cla-bot bot added the cla:yes label May 15, 2024
Copy link
Contributor Author

tlento commented May 15, 2024

Copy link

Thank you for your pull request! We could not find a changelog entry for this change. For details on how to document a change, see the contributing guide.

@tlento tlento force-pushed the add-conversion-metric-rendering-tests branch from 876b8fd to 5ee3296 Compare May 15, 2024 22:47
@tlento tlento force-pushed the encapsulate-time-range-constraint-for-pushdown branch from 10a9d01 to bfb75ba Compare May 15, 2024 22:47
@@ -59,6 +59,16 @@ class MultiHopJoinCandidate:
lineage: MultiHopJoinCandidateLineage


@dataclass(frozen=True)
class PredicatePushdownParameters:
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Nit: The issue that I've run into with classes that are plural is that when you have a variable that is a collection of them, it's unclear how to name the variable. I've been using ...ParameterSet instead, but up for other ideas.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Oh yeah, good point. I don't think we'd ever want to collect these but also I never like having the name parameters in a class. It's not utils-level bad but it still basically means "bag of crap."

Maybe PredicatePushdownState? It effectively tracks the known predicates and whether or not to attempt to push them down, so at any point you have a representation of state you can use to apply a constraint (or not).

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Oops I already use PredicatePushdownState as a state enumeration later on. 🤦

PredicatePushdownDirective? PredicatePushdownEvaluationInput? ThingsWeNeedToDoPredicatePushdownCorrectly?

I'm going with directive because it's least verbose. We can rename later if we need to.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@plypaul PredicatePushdownState is now available! Renamed in #1224 - tried to do it here but it was a huge pain.

@tlento tlento force-pushed the add-conversion-metric-rendering-tests branch from fc69e24 to 820afa1 Compare May 23, 2024 23:41
@tlento tlento force-pushed the encapsulate-time-range-constraint-for-pushdown branch from 88723e2 to ece1e07 Compare May 23, 2024 23:41
Copy link
Contributor Author

tlento commented May 24, 2024

Merge activity

  • May 23, 5:33 PM PDT: @tlento started a stack merge that includes this pull request via Graphite.
  • May 23, 5:34 PM PDT: Graphite rebased this pull request as part of a merge.
  • May 23, 5:40 PM PDT: @tlento merged this pull request with Graphite.

Base automatically changed from add-conversion-metric-rendering-tests to main May 24, 2024 00:33
MetricFlow currently allows for a limited scope form of filter
predicate pushdown that is particular to time range constraints where
the querying user has provided an explicit time window for us to
query against. The application of the pushdown operation is managed
by threading the time range constraint through the entire dataflow
plan builder and applying the filter operation as appropriate.

This is exactly what we need to do for robust predicate pushdown
evaluation for our expanded set of pushdown operations. Rather than
wire a whole new set of parameters through, we simply encapsulate
the time range constraints inside of a new object that is more
readily extensible to other predicate pushdown handling.

This specific change is as mechanical as possible in order to minimize
confusion. Places that stood out for improvement via the encapsulating
object have been marked for later updates, which will follow shortly.
@tlento tlento force-pushed the encapsulate-time-range-constraint-for-pushdown branch from ece1e07 to 7e723bc Compare May 24, 2024 00:34
@tlento tlento merged commit b7537bf into main May 24, 2024
15 checks passed
@tlento tlento deleted the encapsulate-time-range-constraint-for-pushdown branch May 24, 2024 00:40
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

Successfully merging this pull request may close these issues.

2 participants