Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Allow specification of filters as either lists or strings #171

Merged
merged 6 commits into from
Oct 10, 2023

Conversation

tlento
Copy link
Collaborator

@tlento tlento commented Oct 10, 2023

In order to support more advanced forms of filter rendering for things like
predicate pushdown (see dbt-labs/metricflow#747),
we need a way for users to indicate that a given subset of a complex filter
expression can be treated as an independent part of an intersection of
multiple filters.

Providing a construct that can accept, at the config level, a list as well as a string
gives us the flexibility we need to operate on sub-filters as independent entities
while maintaining backwards compatibility with the existing bare string filter
expressions in the wild today.

This PR does all of the necessary work for enabling filter lists in the interface
itself, by adding a filter container protocol class and corresponding Pydantic
implementation, along with test cases and json schema updates to allow
either list or string input in our sample YAML spec.

This does not appear to be backport-eligible, as the output property change
will not be backwards compatible in dbt-core 1.6 even though the Pydantic
object stack that MetricFlow uses internally should cover the discrepancy
in the query runtime.

Copy link
Collaborator Author

tlento commented Oct 10, 2023

Current dependencies on/for this PR:

This comment was auto-generated by Graphite.

tlento added 5 commits October 9, 2023 21:11
The WhereFilter implementation has a more complicated parser than
most other Pydantic implementation objects, because where filters
can be expressed as bare strings in the YAML, and for various
historical reasons we have had to allow the parser to process either
raw string inputs, properly serialized objects, or actual python
instances.

With the impending switch to WhereFilterIntersection objects, which
will need to parse all of this and more, some tests documenting and
enforcing the expected behavior for these classes seemed in order.
We are converting the filter: parameters accepting WhereFilter
protocols in the semantic manifest to instead accept a set of
WhereFilters which will be applied as an intersection.

The cleanest way to do this while retaining backwards compatibility
is to add a new protocol for a WhereFilterIntersection and a corresponding
Pydantic model object implementation that consolidates all of the parsing
logic updates necessary to handle legacy semantic manifests.

This also has the advantage of allowing for easy extensions to these
filters for any other properties we might wish to apply to the full
set (e.g., predicate pushdown configuration options).

This commit simply adds the protocol, a working implementation, and
the additional test cases to demonstrate theoretical backwards compatibility.
No direct YAML handling has been tested yet, as the YAML parser has not been
hooked up to the new protocol layout.
…ntersection

The call_parameter_sets for each of the WhereFilters contained in a
WhereFilterIntersection currently have to be accessed one at a time in a list.
In addition to making it harder to run sensible validations against an implementation
of the WhereFilterIntersection, this also complicates runtime processing for any
implementation (e.g., MetricFlow) that needs to access these parameter sets as
a collection.

This adds a property to the protocol spec for getting a sequence of pairs between
the filter expression sql and the call parameter sets it contains, which allows
for downstream flexibility for managing the WhereFilter components of a
WhereFilterIntersection.
WhereFilterIntersection is now ready for integration into the
metric parser and validation logic.
For the sake of consistency, and everybody's sanity, we should
have the new things use the same constructs as the old things.

While the original Sequence[WhereFilter] type is nominally fine, it
would become rather annoying to have to deal with lists in some places
and objects containing lists in others, especially when collecting
call parameter sets.
@tlento tlento changed the title Add test cases for PydanticWhereFilter parsing Allow specification of filters as either lists or strings Oct 10, 2023
@tlento tlento force-pushed the throw-errors-on-invalid-entity-names branch from 579a5ce to fb8e8e9 Compare October 10, 2023 04:18
@tlento tlento force-pushed the make-where-filters-a-list branch from 8a3c6cf to 12e94cc Compare October 10, 2023 04:18
@tlento tlento requested review from QMalcolm and plypaul October 10, 2023 04:18
Base automatically changed from throw-errors-on-invalid-entity-names to main October 10, 2023 04:46
Copy link
Collaborator

@QMalcolm QMalcolm left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This is incredibly well put together and detailed. I want my code to look like this. I found no issues or nits on my first or second pass, not even typos. I think the level of detail this goes into to massage things to keep things backwards compatible will make it much easier for me on the core side of things. Ship it 🚀

@tlento
Copy link
Collaborator Author

tlento commented Oct 10, 2023

Glad you appreciate the PR construction effort! The parsing changes were hard for me to reason about so I kind of had to write it all down first. I did that in the docblock just so I wouldn't lose it, which proved to be a good call, made it easy to clean up into documentation.

Let's just not talk about how many times I screwed up and had to rewrite my git history along the way....

@tlento tlento merged commit fe071e7 into main Oct 10, 2023
12 checks passed
@tlento tlento deleted the make-where-filters-a-list branch October 10, 2023 17:57
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
Projects
None yet
Development

Successfully merging this pull request may close these issues.

[Feature] Allow metric and input metric/measure filter elements to be lists as well as strings
2 participants