-
Notifications
You must be signed in to change notification settings - Fork 14
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Allow specification of filters as either lists or strings #171
Conversation
Current dependencies on/for this PR:
This comment was auto-generated by Graphite. |
The WhereFilter implementation has a more complicated parser than most other Pydantic implementation objects, because where filters can be expressed as bare strings in the YAML, and for various historical reasons we have had to allow the parser to process either raw string inputs, properly serialized objects, or actual python instances. With the impending switch to WhereFilterIntersection objects, which will need to parse all of this and more, some tests documenting and enforcing the expected behavior for these classes seemed in order.
We are converting the filter: parameters accepting WhereFilter protocols in the semantic manifest to instead accept a set of WhereFilters which will be applied as an intersection. The cleanest way to do this while retaining backwards compatibility is to add a new protocol for a WhereFilterIntersection and a corresponding Pydantic model object implementation that consolidates all of the parsing logic updates necessary to handle legacy semantic manifests. This also has the advantage of allowing for easy extensions to these filters for any other properties we might wish to apply to the full set (e.g., predicate pushdown configuration options). This commit simply adds the protocol, a working implementation, and the additional test cases to demonstrate theoretical backwards compatibility. No direct YAML handling has been tested yet, as the YAML parser has not been hooked up to the new protocol layout.
…ntersection The call_parameter_sets for each of the WhereFilters contained in a WhereFilterIntersection currently have to be accessed one at a time in a list. In addition to making it harder to run sensible validations against an implementation of the WhereFilterIntersection, this also complicates runtime processing for any implementation (e.g., MetricFlow) that needs to access these parameter sets as a collection. This adds a property to the protocol spec for getting a sequence of pairs between the filter expression sql and the call parameter sets it contains, which allows for downstream flexibility for managing the WhereFilter components of a WhereFilterIntersection.
WhereFilterIntersection is now ready for integration into the metric parser and validation logic.
For the sake of consistency, and everybody's sanity, we should have the new things use the same constructs as the old things. While the original Sequence[WhereFilter] type is nominally fine, it would become rather annoying to have to deal with lists in some places and objects containing lists in others, especially when collecting call parameter sets.
579a5ce
to
fb8e8e9
Compare
8a3c6cf
to
12e94cc
Compare
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This is incredibly well put together and detailed. I want my code to look like this. I found no issues or nits on my first or second pass, not even typos. I think the level of detail this goes into to massage things to keep things backwards compatible will make it much easier for me on the core side of things. Ship it 🚀
Glad you appreciate the PR construction effort! The parsing changes were hard for me to reason about so I kind of had to write it all down first. I did that in the docblock just so I wouldn't lose it, which proved to be a good call, made it easy to clean up into documentation. Let's just not talk about how many times I screwed up and had to rewrite my git history along the way.... |
In order to support more advanced forms of filter rendering for things like
predicate pushdown (see dbt-labs/metricflow#747),
we need a way for users to indicate that a given subset of a complex filter
expression can be treated as an independent part of an intersection of
multiple filters.
Providing a construct that can accept, at the config level, a list as well as a string
gives us the flexibility we need to operate on sub-filters as independent entities
while maintaining backwards compatibility with the existing bare string filter
expressions in the wild today.
This PR does all of the necessary work for enabling filter lists in the interface
itself, by adding a filter container protocol class and corresponding Pydantic
implementation, along with test cases and json schema updates to allow
either list or string input in our sample YAML spec.
This does not appear to be backport-eligible, as the output property change
will not be backwards compatible in dbt-core 1.6 even though the Pydantic
object stack that MetricFlow uses internally should cover the discrepancy
in the query runtime.