-
Notifications
You must be signed in to change notification settings - Fork 97
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Add Naming Schemes to Represent Different Input Formats #893
Conversation
Thank you for your pull request! We could not find a changelog entry for this change. For details on how to document a change, see the contributing guide. |
49ea7af
to
03f45dc
Compare
03f45dc
to
6897942
Compare
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Seems reasonable! I left a bunch of minor comments. At the very least, please do clean up the error types before merging.
metricflow/naming/naming_scheme.py
Outdated
If this scheme cannot accommodate the spec, return None. This is needed to handle a case with DatePart in | ||
DunderNamingScheme, but naming schemes should otherwise be complete. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I expect the DunderNamingScheme itself to fall further and further behind due to its inherent limitations.
It would be nice to just get rid of it but I suspect that's never going to happen.
If this scheme cannot accommodate the spec, return None. This is needed to handle a case with DatePart in | |
DunderNamingScheme, but naming schemes should otherwise be complete. | |
If this scheme cannot accommodate the spec, return None. This is needed to handle unsupported cases in | |
DunderNamingScheme, such as DatePart, but naming schemes should otherwise be complete. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Updated.
metricflow/naming/dunder_scheme.py
Outdated
# | ||
if time_dimension_spec.date_part is not None: | ||
return None | ||
names = _DunderNameTransform().transform(LinkableSpecSet.from_specs((instance_spec,))) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
nit:
names = _DunderNameTransform().transform(LinkableSpecSet.from_specs((instance_spec,))) | |
names = _DunderNameTransform().transform(spec_set) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Updated.
for time_dimension_spec in spec_set.time_dimension_specs: | ||
# From existing comment in StructuredLinkableSpecName: | ||
# | ||
# Dunder syntax not supported for querying date_part | ||
# | ||
if time_dimension_spec.date_part is not None: | ||
return None |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This feels like an incredibly roundabout way of getting this value. I guess it's ok for now while we think about how to improve the spec class interfaces.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Yeah, I agree. However, I haven't been able to come up with a better one. Have ideas?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I had an idea a while back but it pushed too much stuff into the common interface. We'll come up with something.
metricflow/naming/dunder_scheme.py
Outdated
@override | ||
def spec_pattern(self, input_str: str) -> EntityLinkPattern: | ||
if not self.input_str_follows_scheme(input_str): | ||
raise RuntimeError(f"{repr(input_str)} does not follow this scheme.") |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
What exception gets thrown in this case today? I assume this is almost always going to be a query input error, which means we may want to throw an exception we can handle and re-cast for things like alert management.
If we don't have a custom exception we can go to here I'd tag this as ValueError since it's an invalid value. RuntimeError is really broad. Also, we use ValueError below.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
In practice, this case should not be hit as we would call input_str_follows_scheme
before running that input through this method. If the input does not follow the scheme, we would create an error issue instead of raising exceptions. Updated to ValueError
though. Also updated the docstring about it.
@override | ||
def spec_pattern(self, input_str: str) -> SpecPattern: | ||
if not self.input_str_follows_scheme(input_str): | ||
raise ValueError( |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Nice, ValueError. Same question about a custom exception applies, and we should be consistent here.
pass | ||
|
||
@abstractmethod | ||
def input_str_follows_scheme(self, input_str: str) -> bool: |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
It might be worth adding an enforcing version of this that's implemented to raise a consistent error, since mismatches are likely to all share the same root cause and error type/response messaging, something like:
def assert_input_str_follows_scheme(self, input_str: str) -> None:
if not self.input_str_follows_scheme(input_str):
raise ....
Then the implementations can just call the assert method when they need it instead of handling the exception info itself.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
In later commits of this set, we avoid raising exceptions in favor of creating query issues so that all errors can be collected and displayed to the user.
|
||
@override | ||
def input_str_follows_scheme(self, input_str: str) -> bool: | ||
if ObjectBuilderNamingScheme._NAME_REGEX.match(input_str) is None: |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Do we need this regex, or can we jump to the parsing? It's not clear to me what the regex is adding, honestly.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I think this was a remnant of an earlier implementation. Removed.
initializer_parameters = [] | ||
entity_link_names = list(entity_link.element_name for entity_link in entity_links) | ||
if len(entity_link_names) > 0: | ||
initializer_parameters.append(repr(entity_link_names[-1] + DUNDER + element_name)) | ||
else: | ||
initializer_parameters.append(repr(element_name)) | ||
if time_granularity is not None: | ||
initializer_parameters.append( | ||
f"'{time_granularity.value}'", | ||
) | ||
if date_part is not None: | ||
initializer_parameters.append(f"date_part_name={repr(date_part.value)}") | ||
if len(entity_link_names) > 1: | ||
initializer_parameters.append(f"entity_path={repr(entity_link_names[:-1])}") | ||
|
||
return ", ".join(initializer_parameters) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I feel like there has to be a better way to do this but without overhauling our spec property interfaces I don't know what it is. The logic here is a little brittle because of the assumption that all the parameters passed are correct for the spec type but since it's all in this one small class I think it's fine.
return EntityLinkPattern( | ||
EntityLinkPatternParameterSet( | ||
element_name=entity_call_parameter_set.entity_reference.element_name, | ||
entity_links=entity_call_parameter_set.entity_path, |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Tangentially related - do we actually allow linked entities, like user__listing
, in a query? Or is this forward-looking for things we might add, or for using this for internal cases where we may need that entity path resolution in an intermediate query step?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Yes, we do allow queries for linked entities like user__listing
right now.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Really? TIL. I don't think I've ever seen it.
return DunderNamingScheme() | ||
|
||
|
||
def test_input_str(dunder_naming_scheme: DunderNamingScheme) -> None: # noqa: D |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
nit: I'm not a fan of tests that do multi-step assertions against different logic triggers, I'd rather they be parameterized on input/output or else split into separate methods - that way when I inevitably change something that breaks a bunch of stuff I don't have to go through this run test, fix bug one, run test, fix bug 2, run test, fix bug 3
workflow because all of the errors are exposed on the first run.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I agree. But given the simplicity of these cases, I ended up bunching them. Could change them later though.
a12e578
to
1db958e
Compare
6897942
to
51ea8bb
Compare
These naming scheme classes will be used in the query parser to convert string inputs into patterns. The patterns will be used later to resolve ambiguous group-by-items.
51ea8bb
to
7992b74
Compare
Description
These naming scheme classes will be used in the query parser to convert string inputs into patterns. The patterns will be used later to resolve ambiguous group-by-items.