Add Naming Schemes to Represent Different Input Formats #893

plypaul · 2023-11-21T19:08:37Z

Description

These naming scheme classes will be used in the query parser to convert string inputs into patterns. The patterns will be used later to resolve ambiguous group-by-items.

github-actions · 2023-11-21T19:08:56Z

Thank you for your pull request! We could not find a changelog entry for this change. For details on how to document a change, see the contributing guide.

tlento

Seems reasonable! I left a bunch of minor comments. At the very least, please do clean up the error types before merging.

tlento · 2023-11-22T19:52:11Z

metricflow/naming/naming_scheme.py

+        If this scheme cannot accommodate the spec, return None. This is needed to handle a case with DatePart in
+        DunderNamingScheme, but naming schemes should otherwise be complete.


I expect the DunderNamingScheme itself to fall further and further behind due to its inherent limitations.

It would be nice to just get rid of it but I suspect that's never going to happen.

Suggested change

-        If this scheme cannot accommodate the spec, return None. This is needed to handle a case with DatePart in
-        DunderNamingScheme, but naming schemes should otherwise be complete.
+        If this scheme cannot accommodate the spec, return None. This is needed to handle unsupported cases in
+        DunderNamingScheme, such as DatePart, but naming schemes should otherwise be complete.

tlento · 2023-11-22T19:55:16Z

metricflow/naming/dunder_scheme.py

+            #
+            if time_dimension_spec.date_part is not None:
+                return None
+        names = _DunderNameTransform().transform(LinkableSpecSet.from_specs((instance_spec,)))


nit:

Suggested change

-        names = _DunderNameTransform().transform(LinkableSpecSet.from_specs((instance_spec,)))
+        names = _DunderNameTransform().transform(spec_set)

tlento · 2023-11-22T19:56:23Z

metricflow/naming/dunder_scheme.py

+        for time_dimension_spec in spec_set.time_dimension_specs:
+            # From existing comment in StructuredLinkableSpecName:
+            #
+            # Dunder syntax not supported for querying date_part
+            #
+            if time_dimension_spec.date_part is not None:
+                return None


This feels like an incredibly roundabout way of getting this value. I guess it's ok for now while we think about how to improve the spec class interfaces.

Yeah, I agree. However, I haven't been able to come up with a better one. Have ideas?

I had an idea a while back but it pushed too much stuff into the common interface. We'll come up with something.
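For readers following the thread, the behaviour under discussion can be sketched in isolation. The classes and function below are hypothetical, simplified stand-ins rather than MetricFlow's spec classes. The only point illustrated is that the dunder format has no slot for a date part, so the scheme returns None instead of producing an incomplete name.

```python
from dataclasses import dataclass
from typing import Optional, Tuple

DUNDER = "__"


@dataclass(frozen=True)
class ToyTimeDimensionSpec:
    """Hypothetical stand-in for a time dimension spec (not the MetricFlow class)."""

    element_name: str
    entity_links: Tuple[str, ...] = ()
    time_granularity: Optional[str] = None
    date_part: Optional[str] = None


def dunder_input_str(spec: ToyTimeDimensionSpec) -> Optional[str]:
    """Return the dunder-style name for the spec, or None if it cannot be represented."""
    # The dunder format cannot express date_part, so signal "unsupported" with None.
    if spec.date_part is not None:
        return None
    parts = list(spec.entity_links) + [spec.element_name]
    if spec.time_granularity is not None:
        parts.append(spec.time_granularity)
    return DUNDER.join(parts)
```

Callers of a function shaped like this have to handle the None case explicitly, which is the roundabout part the comment refers to.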

tlento · 2023-11-22T20:14:54Z

metricflow/naming/dunder_scheme.py

+    @override
+    def spec_pattern(self, input_str: str) -> EntityLinkPattern:
+        if not self.input_str_follows_scheme(input_str):
+            raise RuntimeError(f"{repr(input_str)} does not follow this scheme.")


What exception gets thrown in this case today? I assume this is almost always going to be a query input error, which means we may want to throw an exception we can handle and re-cast for things like alert management.

If we don't have a custom exception to use here, I'd make this a ValueError, since the problem is an invalid value. RuntimeError is really broad, and we already use ValueError below.

In practice, this case should not be hit as we would call input_str_follows_scheme before running that input through this method. If the input does not follow the scheme, we would create an error issue instead of raising exceptions. Updated to ValueError though. Also updated the docstring about it.
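One way to get an exception that can be "handled and re-cast" while staying compatible with existing `except ValueError` handlers is to subclass ValueError. This is only a sketch: `InvalidInputStringError` and `parse_dunder` are hypothetical names, not classes or functions from this PR.

```python
from typing import List


class InvalidInputStringError(ValueError):
    """Hypothetical error for input strings that do not follow a naming scheme."""

    def __init__(self, input_str: str, scheme_name: str) -> None:
        # Keep the offending input and scheme available for callers that re-cast the error.
        self.input_str = input_str
        self.scheme_name = scheme_name
        super().__init__(f"{input_str!r} does not follow the {scheme_name} scheme.")


def parse_dunder(input_str: str) -> List[str]:
    """Toy parser: split on '__' and reject empty segments."""
    parts = input_str.split("__")
    if not all(parts):
        raise InvalidInputStringError(input_str, "dunder")
    return parts
```

Because the custom type is still a ValueError, code that only knows about ValueError keeps working, while alerting code can catch the narrower type.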

tlento · 2023-11-22T20:28:01Z

metricflow/naming/object_builder_scheme.py

+    @override
+    def spec_pattern(self, input_str: str) -> SpecPattern:
+        if not self.input_str_follows_scheme(input_str):
+            raise ValueError(


Nice, ValueError. Same question about a custom exception applies, and we should be consistent here.

tlento · 2023-11-22T20:29:09Z

metricflow/naming/naming_scheme.py

+        pass
+
+    @abstractmethod
+    def input_str_follows_scheme(self, input_str: str) -> bool:


It might be worth adding an enforcing version of this that's implemented to raise a consistent error, since mismatches are likely to all share the same root cause and error type/response messaging, something like:

    def assert_input_str_follows_scheme(self, input_str: str) -> None:
        if not self.input_str_follows_scheme(input_str):
            raise ....

Then the implementations can just call the assert method when they need it instead of handling the exception info itself.

In later commits of this set, we avoid raising exceptions in favor of creating query issues so that all errors can be collected and displayed to the user.
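A minimal sketch of the enforcing method proposed in this comment, with the check implemented once in the base class so every scheme raises the same error type. `ToyNamingScheme` and `ToyDunderScheme` are hypothetical simplifications, and the choice of ValueError is an assumption based on the earlier comments, not the merged code.

```python
from abc import ABC, abstractmethod


class ToyNamingScheme(ABC):
    """Hypothetical, simplified base class (not the MetricFlow interface)."""

    @abstractmethod
    def input_str_follows_scheme(self, input_str: str) -> bool:
        """Return True if the input string can be parsed by this scheme."""
        raise NotImplementedError

    def assert_input_str_follows_scheme(self, input_str: str) -> None:
        """Raise one consistent error type for every scheme on a mismatch."""
        if not self.input_str_follows_scheme(input_str):
            raise ValueError(f"{input_str!r} does not follow {type(self).__name__}.")


class ToyDunderScheme(ToyNamingScheme):
    def input_str_follows_scheme(self, input_str: str) -> bool:
        # Toy rule: non-empty segments separated by double underscores.
        return all(input_str.split("__"))
```

Subclasses only implement the boolean check. The reply above notes that later commits collect query issues instead of raising, in which case a helper like this would return an issue object rather than raise.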

tlento · 2023-11-22T20:37:27Z

metricflow/naming/object_builder_scheme.py

+
+    @override
+    def input_str_follows_scheme(self, input_str: str) -> bool:
+        if ObjectBuilderNamingScheme._NAME_REGEX.match(input_str) is None:


Do we need this regex, or can we jump to the parsing? It's not clear to me what the regex is adding, honestly.

I think this was a remnant of an earlier implementation. Removed.

tlento · 2023-11-22T20:41:42Z

metricflow/naming/object_builder_scheme.py

+        initializer_parameters = []
+        entity_link_names = list(entity_link.element_name for entity_link in entity_links)
+        if len(entity_link_names) > 0:
+            initializer_parameters.append(repr(entity_link_names[-1] + DUNDER + element_name))
+        else:
+            initializer_parameters.append(repr(element_name))
+        if time_granularity is not None:
+            initializer_parameters.append(
+                f"'{time_granularity.value}'",
+            )
+        if date_part is not None:
+            initializer_parameters.append(f"date_part_name={repr(date_part.value)}")
+        if len(entity_link_names) > 1:
+            initializer_parameters.append(f"entity_path={repr(entity_link_names[:-1])}")
+
+        return ", ".join(initializer_parameters)


I feel like there has to be a better way to do this but without overhauling our spec property interfaces I don't know what it is. The logic here is a little brittle because of the assumption that all the parameters passed are correct for the spec type but since it's all in this one small class I think it's fine.
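To make the output of the logic above concrete, here is a standalone sketch that mirrors it with plain strings in place of the spec, granularity, and date part objects. The function name and the string-typed parameters are assumptions for illustration; the branch structure follows the diff.

```python
from typing import List, Optional, Sequence

DUNDER = "__"


def initializer_parameter_str(
    element_name: str,
    entity_link_names: Sequence[str],
    time_granularity: Optional[str] = None,
    date_part: Optional[str] = None,
) -> str:
    """Build the argument string for an object-builder style call."""
    initializer_parameters: List[str] = []
    if len(entity_link_names) > 0:
        # The last entity link is folded into the name; earlier links become entity_path.
        initializer_parameters.append(repr(entity_link_names[-1] + DUNDER + element_name))
    else:
        initializer_parameters.append(repr(element_name))
    if time_granularity is not None:
        initializer_parameters.append(f"'{time_granularity}'")
    if date_part is not None:
        initializer_parameters.append(f"date_part_name={date_part!r}")
    if len(entity_link_names) > 1:
        initializer_parameters.append(f"entity_path={list(entity_link_names[:-1])!r}")
    return ", ".join(initializer_parameters)


print(initializer_parameter_str("created_at", ["user", "listing"], "month"))
# → 'listing__created_at', 'month', entity_path=['user']
```

The brittleness mentioned in the comment shows up here as well: nothing stops a caller from passing a granularity for an element that is not a time dimension.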

tlento · 2023-11-22T20:42:28Z

metricflow/naming/object_builder_scheme.py

+            return EntityLinkPattern(
+                EntityLinkPatternParameterSet(
+                    element_name=entity_call_parameter_set.entity_reference.element_name,
+                    entity_links=entity_call_parameter_set.entity_path,


Tangentially related - do we actually allow linked entities, like user__listing, in a query? Or is this forward-looking for things we might add, or for using this for internal cases where we may need that entity path resolution in an intermediate query step?

Yes, we do allow queries for linked entities like user__listing right now.

Really? TIL. I don't think I've ever seen it.
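As a rough illustration of what a linked entity name such as user__listing carries, the toy function below splits a dunder name into an entity path and an element name. `split_dunder_name` is a hypothetical helper, not the parsing used in MetricFlow.

```python
from typing import Tuple


def split_dunder_name(input_str: str) -> Tuple[Tuple[str, ...], str]:
    """Split 'a__b__c' into the entity path ('a', 'b') and the element name 'c'."""
    parts = input_str.split("__")
    return tuple(parts[:-1]), parts[-1]


print(split_dunder_name("user__listing"))
# → (('user',), 'listing')
```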

tlento · 2023-11-22T20:47:08Z

metricflow/test/naming/test_dunder_naming_scheme.py

+    return DunderNamingScheme()
+
+
+def test_input_str(dunder_naming_scheme: DunderNamingScheme) -> None:  # noqa: D


nit: I'm not a fan of tests that do multi-step assertions against different logic triggers. I'd rather they be parameterized on input/output or split into separate methods. That way, when I inevitably change something that breaks a bunch of stuff, all of the errors are exposed on the first run, and I don't have to go through a run test, fix bug 1, run test, fix bug 2, run test, fix bug 3 workflow.

I agree. But given the simplicity of these cases, I ended up bunching them. Could change them later though.
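A table-driven version of such a test might look like the sketch below. It uses the standard library's `unittest` with `subTest` so that it runs without extra dependencies; in the repository's pytest setup, `pytest.mark.parametrize` over the same case list would give the same effect. The scheme rule in `follows_toy_dunder_scheme` is a hypothetical placeholder, not the rule implemented by DunderNamingScheme.

```python
import unittest


def follows_toy_dunder_scheme(input_str: str) -> bool:
    """Hypothetical rule for illustration: non-empty lowercase segments joined by '__'."""
    segments = input_str.split("__")
    return all(seg and seg == seg.lower() for seg in segments)


# Each case is (input, expected result); adding a case does not add another assertion block.
CASES = [
    ("listing__country", True),
    ("listing__ds__month", True),
    ("listing____country", False),
    ("Listing__Country", False),
]


class TestInputStr(unittest.TestCase):
    def test_input_str_follows_scheme(self) -> None:
        for input_str, expected in CASES:
            # subTest records each failing case separately instead of stopping at the first.
            with self.subTest(input_str=input_str):
                self.assertEqual(follows_toy_dunder_scheme(input_str), expected)
```

With this shape, a change that breaks several cases reports all of them in one run.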

cla-bot bot added the cla:yes label Nov 21, 2023

plypaul added the Skip Changelog label Nov 21, 2023

plypaul force-pushed the plypaul--58.8--naming-scheme branch from 49ea7af to 03f45dc Compare November 21, 2023 19:17

plypaul marked this pull request as ready for review November 21, 2023 19:26

plypaul requested review from tlento, courtneyholcomb and WilliamDee November 21, 2023 19:26

plypaul force-pushed the plypaul--58.8--naming-scheme branch from 03f45dc to 6897942 Compare November 22, 2023 05:30

tlento approved these changes Nov 22, 2023

plypaul force-pushed the plypaul--58.7--spec-pattern branch 2 times, most recently from a12e578 to 1db958e Compare November 27, 2023 21:38

Base automatically changed from plypaul--58.7--spec-pattern to main November 27, 2023 22:04

plypaul force-pushed the plypaul--58.8--naming-scheme branch from 6897942 to 51ea8bb Compare November 27, 2023 22:10

Add naming schemes.

7992b74

These naming scheme classes will be used in the query parser to convert string inputs into patterns. The patterns will be used later to resolve ambiguous group-by-items.

plypaul force-pushed the plypaul--58.8--naming-scheme branch from 51ea8bb to 7992b74 Compare November 27, 2023 22:17

plypaul merged commit 955b897 into main Nov 28, 2023
8 checks passed

plypaul deleted the plypaul--58.8--naming-scheme branch November 28, 2023 00:06
