Support `label` attribute #158

QMalcolm · 2023-09-27T00:36:14Z

Resolves #143

Description

In this PR we:

- Add an optional `label` attribute to the following protocols:
  - Metric
  - SavedQuery
  - SemanticModel
  - Dimension
  - Entity
  - Measure
- Begin supporting the optional `label` attribute in the corresponding pydantic implementations and parsing
- Add validations which ensure that:
  - metric labels are unique among metrics
  - saved query labels are unique among saved queries
  - semantic model labels are unique among semantic models
  - dimension labels on a semantic model are unique among the dimensions on the same semantic model
  - entity labels on a semantic model are unique among the entities on the same semantic model
  - measure labels on a semantic model are unique among the measures on the same semantic model
  - entities with the same name across semantic models have the same label (or are `None`)
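
For readers who want a concrete picture of the validation rules listed above, here is a minimal sketch. It is not the code in `dbt_semantic_interfaces/validations/labels.py`: the `Entity` and `SemanticModel` dataclasses and both helper functions are hypothetical stand-ins. It also assumes that unset (`None`) labels are ignored by the uniqueness checks, and that a mix of a set label and `None` for the same entity name counts as inconsistent. Neither assumption is confirmed by this PR description.

```python
from collections import Counter, defaultdict
from dataclasses import dataclass, field
from typing import Dict, List, Optional, Sequence, Set


@dataclass
class Entity:  # hypothetical stand-in, not the real Entity protocol
    name: str
    label: Optional[str] = None


@dataclass
class SemanticModel:  # hypothetical stand-in, not the real SemanticModel protocol
    name: str
    label: Optional[str] = None
    entities: List[Entity] = field(default_factory=list)


def duplicate_labels(labels: Sequence[Optional[str]]) -> List[str]:
    """Return labels used more than once. Assumption: None labels are skipped."""
    counts = Counter(label for label in labels if label is not None)
    return sorted(label for label, count in counts.items() if count > 1)


def inconsistent_entity_labels(
    models: Sequence[SemanticModel],
) -> Dict[str, Set[Optional[str]]]:
    """Map entity name -> labels seen, for names carrying more than one distinct label."""
    seen: Dict[str, Set[Optional[str]]] = defaultdict(set)
    for model in models:
        for entity in model.entities:
            seen[entity.name].add(entity.label)
    return {name: labels for name, labels in seen.items() if len(labels) > 1}


models = [
    SemanticModel("orders", label="Orders", entities=[Entity("user", "User")]),
    SemanticModel("visits", label="Orders", entities=[Entity("user", "Customer")]),
]

# Two semantic models share the label "Orders", and the entity "user"
# carries two different labels across the models.
model_label_issues = duplicate_labels([m.label for m in models])
entity_label_issues = inconsistent_entity_labels(models)
```

The same `duplicate_labels` helper could be applied per semantic model to the dimension, entity, and measure labels, which is how the "within a semantic model" rules are scoped.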

Checklist

I have read the contributing guide and understand what's expected of me
I have signed the CLA
This PR includes tests, or tests are not required/relevant for this PR
I have run changie new to create a changelog entry

This probably should have been part of #148, but we forgot. I'm adding these tests here because I plan to add `label` to the `SavedQuery` protocol in the coming commits. With that, I'll want to update the parsing tests to check it. To do that, the tests need to exist.

tlento

Not done but I figured you might want to look at the cat gif earlier....

tlento · 2023-09-28T01:16:35Z

tests/parsing/test_semantic_model_parsing.py

@@ -334,6 +420,37 @@ def test_semantic_model_primary_time_dimension_parsing() -> None:
    assert dimension.type_params is not None


+def test_base_semantic_model_dimension_parsing() -> None:
+    """Test parsing base attributes of PydanticDimension object."""
+    description = "Test sematic_model dimension description"


This is the cutest copy/paste gif I could find.....

Not marking comment as resolved because cat gif 🐈

tlento · 2023-09-28T01:27:13Z

tests/test_implements_satisfy_protocols.py

-    # assert isinstance(categorical_dim, RuntimeCheckableDimension)
+@given(builds(PydanticDimension))
+def test_dimension_protocol(dimesnion: PydanticDimension) -> None:  # noqa: D
+    assert isinstance(dimesnion, RuntimeCheckableDimension)


How is this not subject to the same problem with categorical dimension types as the old test?

That's a good question. So I ran the test a few different ways after seeing your comment in this review.

(1)

    import pdb

    @given(builds(PydanticDimension))
    def test_dimension_protocol(dimension: PydanticDimension) -> None:  # noqa: D
        if dimension.type == DimensionType.CATEGORICAL:
            pdb.set_trace()
        assert isinstance(dimension, RuntimeCheckableDimension)

(2)

    import pdb

    @given(builds(PydanticDimension))
    def test_dimension_protocol(dimension: PydanticDimension) -> None:  # noqa: D
        if dimension.type == DimensionType.TIME:
            pdb.set_trace()
        assert isinstance(dimension, RuntimeCheckableDimension)

(3)

    import pdb

    @given(builds(PydanticDimension))
    def test_dimension_protocol(dimension: PydanticDimension) -> None:  # noqa: D
        if dimension.type_params is not None:
            pdb.set_trace()
        assert isinstance(dimension, RuntimeCheckableDimension)

Variations (1) and (2) hit the breakpoint when the test was run, but (3) never did. Thus it appears the type_params aren't getting automatically generated by hypothesis. I'm gonna do some more investigating and see if this is happening to all nested objects. If so I'll have to build better strategies. I haven't found a setting in hypothesis to handle this better by default. It looks like we might be able to register strategies for custom types, which is maybe what we have to do?

Digging in further, this might be a pydantic class defaulting issue (HypothesisWorks/hypothesis#3218), which would explain why I was seeing it on some things but not on others.

So I improved our strategies, but that was a separate fix. The reason the tests no longer have the issue with categorical dimensions is that we removed the assertion causing the issue in commit 83cc44d.
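
For anyone following along, the "defaulted fields never get generated" behaviour in this thread can be reproduced in a small sketch without pydantic. `builds()` only infers strategies for required arguments; an argument with a default keeps that default unless a strategy is passed explicitly or `...` is passed to request inference from the type hint. The `Dimension` dataclass below is a hypothetical stand-in for `PydanticDimension`, and whether the pydantic 1.x issue linked above has exactly the same mechanism is an assumption.

```python
from dataclasses import dataclass
from typing import Optional

from hypothesis import find, given, settings, strategies as st


# Hypothetical stand-in for PydanticDimension; a plain dataclass keeps the
# sketch independent of pydantic.
@dataclass
class Dimension:
    name: str
    label: Optional[str] = None  # has a default, so builds() will not infer it


# Only the required `name` argument is inferred; `label` keeps its default.
DEFAULT_ONLY = st.builds(Dimension)

# Passing `...` asks builds() to infer a strategy from the type hint as well.
WITH_LABEL = st.builds(Dimension, label=...)


@settings(max_examples=50, database=None)
@given(DEFAULT_ONLY)
def check_label_always_default(dimension: Dimension) -> None:
    # Every generated instance still carries the default label.
    assert dimension.label is None


check_label_always_default()

# find() returns a minimal example satisfying the predicate, which shows
# that WITH_LABEL can produce non-None labels.
labelled = find(
    WITH_LABEL,
    lambda d: d.label is not None,
    settings=settings(database=None),
)
```

Explicitly passing strategies for defaulted fields, as the strategies in this PR do, is the same idea applied field by field.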

tlento · 2023-09-28T01:27:59Z

tests/test_implements_satisfy_protocols.py

+SIMPLE_METRIC_STRATEGY = builds(
+    PydanticMetric,
+    type=just(MetricType.SIMPLE),
+    type_params=builds(PydanticMetricTypeParams, measure=builds(PydanticMetricInputMeasure)),
+)
+
+SEMANTIC_MODEL_STRATEGY = builds(
+    PydanticSemanticModel,
+    dimensions=lists(builds(PydanticDimension)),
+    entities=lists(builds(PydanticEntity)),
+    measures=lists(builds(PydanticMeasure)),


Oh neat, it infers values based on type hints, I guess recursively all the way down.

It does it pretty well! Although it sometimes chokes on nested lists of structured objects and just does an empty list instead. Hence why I explicitly tell it to generate lists of dimensions/entities/measures

It also might be choking on nested objects in general. Or maybe it's only nested optional objects 🤔

tlento · 2023-09-28T01:30:34Z

dbt_semantic_interfaces/validations/labels.py

@@ -0,0 +1,43 @@
+import logging


I got to here..... will look over validations tomorrow!

QMalcolm · 2023-09-28T03:10:57Z

Gonna do commits to address comments. I'll fix them up into the relevant commits before merging, but after reviewing is finished 🙂

tlento

Looks good to me, I have one open question about the dimension uniqueness rule.

tlento · 2023-09-28T17:26:59Z

dbt_semantic_interfaces/validations/labels.py

+                issues.append(
+                    ValidationError(
+                        context=FileContext.from_metadata(semantic_model.metadata),
+                        message=f"Dimension labels must be unique within a semantic model. The label `{label}` was "


Do we want them to be globally unique for dimensions? I can see people setting user__country and sales_rep__country to both be Country, which isn't great.

For those following along, we discussed this in a call. The decision for now was to not require labels for dimensions or measures to be globally unique. If people end up consistently running into disambiguation problems, we'll look into being stricter
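
To make the decision above concrete, here is a small sketch contrasting the rule that was kept (labels unique within one semantic model) with the stricter global rule that was deferred. It assumes `user__country` and `sales_rep__country` are dimensions living in two different semantic models; the model names `users` and `sales_reps` and all function names are hypothetical.

```python
from collections import Counter
from typing import Dict, List, Optional, Tuple

DimensionsByModel = Dict[str, List[Tuple[str, Optional[str]]]]

# (dimension name, label) pairs per semantic model. Dimension names come from
# the review comment; the semantic model names are made up for illustration.
DIMENSIONS_BY_MODEL: DimensionsByModel = {
    "users": [("user__country", "Country")],
    "sales_reps": [("sales_rep__country", "Country")],
}


def repeated(labels: List[Optional[str]]) -> List[str]:
    """Labels that occur more than once; None labels are skipped (assumption)."""
    counts = Counter(label for label in labels if label is not None)
    return sorted(label for label, n in counts.items() if n > 1)


def per_model_duplicates(dims_by_model: DimensionsByModel) -> Dict[str, List[str]]:
    """The rule as decided: labels only need to be unique within one semantic model."""
    result: Dict[str, List[str]] = {}
    for model, dims in dims_by_model.items():
        dupes = repeated([label for _, label in dims])
        if dupes:
            result[model] = dupes
    return result


def global_duplicates(dims_by_model: DimensionsByModel) -> List[str]:
    """The stricter alternative that was discussed and deferred."""
    return repeated([label for dims in dims_by_model.values() for _, label in dims])


within_model = per_model_duplicates(DIMENSIONS_BY_MODEL)  # no issues reported
across_models = global_duplicates(DIMENSIONS_BY_MODEL)  # "Country" would be flagged
```

Under the per-model rule both dimensions may be labelled Country, which is the ambiguity raised in the review; the global check shows what a stricter validation would report.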

tlento

Per discussion today the current validation looks good!

QMalcolm · 2023-10-06T03:28:09Z

Rebasing to fixup commit 9373115 into 59357a1

To really do protocol satisfaction testing well, we have to write really complex instantiations of the implementations. We want to cover when things are optional vs not. Some implementations require certain attributes to be set depending on parent attributes during instantiation. And the protocols are still growing (new attributes are still getting added often). This made how we were doing these tests really brittle, and we weren't testing all the cases we wanted to. By using hypothesis, each test will be run many times with different generated instantiations of the main built object. Additionally, hypothesis does shrinking, such that it will try to find the minimal failing case. This gives us better guarantees that we're actually hitting all possible variations.
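
As a rough illustration of what a protocol satisfaction check does, the sketch below uses a reduced, hypothetical protocol (`HasLabel`) and two hypothetical dataclasses rather than the real protocols and pydantic implementations. An `isinstance` check against a `runtime_checkable` protocol only verifies that the required attributes are present, not that their types match.

```python
from dataclasses import dataclass
from typing import Optional, Protocol, runtime_checkable


@runtime_checkable
class HasLabel(Protocol):
    """Hypothetical, reduced protocol: a name plus an optional label."""

    @property
    def name(self) -> str:
        ...

    @property
    def label(self) -> Optional[str]:
        ...


@dataclass(frozen=True)
class LabelledThing:  # hypothetical implementation that satisfies the protocol
    name: str
    label: Optional[str] = None


@dataclass(frozen=True)
class UnlabelledThing:  # hypothetical implementation missing `label`
    name: str


# The runtime check passes only when every protocol attribute is present.
satisfies = isinstance(LabelledThing("revenue"), HasLabel)
missing_label = isinstance(UnlabelledThing("revenue"), HasLabel)
```

Here `satisfies` is `True` and `missing_label` is `False`, so adding `label` to a protocol makes the check fail for any implementation that has not been updated.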

…action

pydantic 1.x doesn't play nicely with hypothesis: defaulted and optional fields don't get strategies generated for them automatically. We therefore have to set strategies for these properties explicitly. I tried to reduce duplication by composing strategies as much as possible.

QMalcolm added enhancement New feature or request Protocol Spec Change labels Sep 27, 2023

QMalcolm requested review from tlento and plypaul September 27, 2023 00:36

cla-bot bot added the cla:yes label Sep 27, 2023

QMalcolm force-pushed the qmalcolm--143-support-label-attribute branch from e9dc17d to a96638c Compare September 27, 2023 00:40

tlento reviewed Sep 28, 2023

tlento approved these changes Sep 29, 2023

QMalcolm added 10 commits October 5, 2023 20:29

Update parsing tests to check for label attr parsing

d21f99a

Add label attribute to protocols

1c936db

Specifically we added the attribute to the following protocols:
- Metric
- SavedQuery
- SemanticModel
- Dimension
- Entity
- Measure

Update pydantic implementations to support label attr

cea5169

Update jsonschema to support label attr

6559812

Update tests asserting pydantic implementations satisfy the protocol to check `label`

d0079c6

Add validation rule ensuring metric labels are unique

6247725

Add validation rule checking that labels are unique on semantic models

b3f443e

Add validation rule ensuring entities with the same name have the same `label` (or `None`)

b96ca84

QMalcolm force-pushed the qmalcolm--143-support-label-attribute branch from d807b48 to ff2d7e5 Compare October 6, 2023 03:29

QMalcolm merged commit b306a5b into main Oct 6, 2023
9 checks passed

QMalcolm deleted the qmalcolm--143-support-label-attribute branch October 6, 2023 03:33
