-
Notifications
You must be signed in to change notification settings - Fork 98
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Added SqlWindowFunctionExpression
and MetadataSpec/Instance
#355
Conversation
a796a05
to
b4889ff
Compare
input_data_set: SqlDataSet = node.parent_node.accept(self) | ||
input_data_set_alias = self._next_unique_table_alias() | ||
|
||
row_number_spec = ExtraSpec.from_name("mf_row_number") |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
how should I name this?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I didn't have time to read this thoroughly. Directionally it seems fine to me. A couple of notes:
- ExtraSpec does need a new name, especially in the InstanceSpec interface. Maybe
internal_specs
oradded_metadata_spec
(as I believe @plypaul suggested) - something a bit more obvious that it's a framework internal thing. - You mentioned a new identifier type (
internal_key
orrow_number
or something). It's hard for me to see how that will shake out - on the one hand it eliminates the need for a special container for extra specs. On the other hand, it mixes internal elements with model elements in the instance spec interface, and requires runtime checking to see if one of these things exists anywhere in there. My instinct is to stick with what you have here, but an internal identifier type is an idea worth exploring.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Remind me how the row number is used again?
metricflow/sql/sql_exprs.py
Outdated
def matches(self, other: SqlExpressionNode) -> bool: # noqa: D | ||
if not isinstance(other, SqlWindowFunctionExpression): | ||
return False | ||
order_by_matches = all(x == y for x, y in itertools.zip_longest(self.order_by_args, other.order_by_args)) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Why zip_longest
? Shouldn't we check for same length?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
updated the check to do len(x) == len(y) and set(x) == set(y)
.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
nvm, we should actually just do an equality check since ordering matters for this. So I just made it a simple self.order_by_args == other.order_by_args
row_number_select_column = SqlSelectColumn( | ||
expr=SqlWindowFunctionExpression( | ||
sql_function=SqlWindowFunction.ROW_NUMBER, | ||
order_by_args=[SqlWindowOrderByArg(expr=x) for x in input_sql_column_references], |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
There will have to be a FilterElementsNode
before this node so we don't have a long list of columns here.
f4f71b2
to
9dddf28
Compare
9dddf28
to
27db0e9
Compare
SqlWindowFunctionExpression
and AppendRowNumberColumnNode
SqlWindowFunctionExpression
and MetadataSpec/Instance
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
LGTM
@@ -107,6 +111,24 @@ def qualified_name(self) -> str: | |||
raise NotImplementedError() | |||
|
|||
|
|||
@dataclass(frozen=True) | |||
class MetadataSpec(InstanceSpec): |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This looks fine for now, but there will likely be different types of metadata later on so there might need to be subclasses of this. I'm also thinking whether to make this Visitable
(also for later).
Context
In aid of Conversion Metrics #352, It was best to break it out into multiple PRs. This PR is used to add the sql rendering node for window functions
and the dataflow plan node for appending a row number column. The reason this was broken out into a separate PR was due to the nature of adding a row number column into a data set.Meaning there was no where to define this column in the current framework as this was never defined from anywhere (ie., in the config). In this PR, we introduce a new spec/instance type calledMetadataSpec
andMetadataInstance
respectively. This is used for places that requires an additional spec or instance.Changes
SqlFunctionExpression
is now a parent object that is inherited by all sql function objectsSqlFunctionExpression
->SqlAggregationFunctionExpressionNode
SqlWindowFunctionExpressionNode
MetadataSpec
andMetadataInstance
to SpecSetsAddedAppendRowNumberColumnNode
Note
After further discussion,
AppendRowNumberColumnNode
is unnecessary, but everything else could still be useful so keeping those in.