-
Notifications
You must be signed in to change notification settings - Fork 14
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Support label
attribute
#158
Conversation
This probably should have been part of #148, but we forgot. I'm adding these tests here because I plan to add `label` to the `SavedQuery` protocol in the coming commits. With that, I'll want to updated the parsing tests to check it. To do that, the tests need to exist.
e9dc17d
to
a96638c
Compare
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Not done but I figured you might want to look at the cat gif earlier....
@@ -334,6 +420,37 @@ def test_semantic_model_primary_time_dimension_parsing() -> None: | |||
assert dimension.type_params is not None | |||
|
|||
|
|||
def test_base_semantic_model_dimension_parsing() -> None: | |||
"""Test parsing base attributes of PydanticDimension object.""" | |||
description = "Test sematic_model dimension description" |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Not marking comment as resolved because cat gif 🐈
# assert isinstance(categorical_dim, RuntimeCheckableDimension) | ||
@given(builds(PydanticDimension)) | ||
def test_dimension_protocol(dimesnion: PydanticDimension) -> None: # noqa: D | ||
assert isinstance(dimesnion, RuntimeCheckableDimension) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
How is this not subject to the same problem with categorical dimension types as the old test?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Thats a good question. So I ran the test a few different ways after seeing your comment in this review.
import pdb
@given(builds(PydanticDimension))
def test_dimension_protocol(dimension: PydanticDimension) -> None: # noqa: D
if dimension.type == DimensionType.CATEGORICAL:
pdb.set_trace()
assert isinstance(dimension, RuntimeCheckableDimension)
import pdb
@given(builds(PydanticDimension))
def test_dimension_protocol(dimension: PydanticDimension) -> None: # noqa: D
if dimension.type == DimensionType.TIME:
pdb.set_trace()
assert isinstance(dimension, RuntimeCheckableDimension)
import pdb
@given(builds(PydanticDimension))
def test_dimension_protocol(dimension: PydanticDimension) -> None: # noqa: D
if dimension.type_params is not None:
pdb.set_trace()
assert isinstance(dimension, RuntimeCheckableDimension)
Variation (1) and (2) hit the break point when the test was ran, but (3) never was. Thus it appears the type_params aren't getting automatically generated by hypothesis. I'm gonna do some more investigating and see if this is happening to all nested objects. If so I'll have to build better strategies. I haven't found a setting in hypothesis to handle this better by default. It looks like we might have be able to register strategies for custom types, which is maybe what we have to do?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Digging in further, this might be a pydantic class defaulting issue HypothesisWorks/hypothesis#3218. Which makes sense why I was seeing it on some things, but not on others.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
So I improved our strategies, but that was a separate fix. Why the tests no longer have the issue with categorical dimensions is we removed the assertion causing the issue in this commit 83cc44d
SIMPLE_METRIC_STRATEGY = builds( | ||
PydanticMetric, | ||
type=just(MetricType.SIMPLE), | ||
type_params=builds(PydanticMetricTypeParams, measure=builds(PydanticMetricInputMeasure)), | ||
) | ||
|
||
SEMANTIC_MODEL_STRATEGY = builds( | ||
PydanticSemanticModel, | ||
dimensions=lists(builds(PydanticDimension)), | ||
entities=lists(builds(PydanticEntity)), | ||
measures=lists(builds(PydanticMeasure)), |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Oh neat, it infers values based on type hints, I guess recursively all the way down.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
It does it pretty well! Although it sometimes chokes on nested lists of structured objects and just does an empty list instead. Hence why I explicitly tell it to generate lists of dimensions/entities/measures
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
It also might be choking on nested objects in general. Or maybe it's only nested optional objects 🤔
@@ -0,0 +1,43 @@ | |||
import logging |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I got to here..... will look over validations tomorrow!
Gonna do commits to address comments. I'll fix them up into the relevant commits before merging, but after reviewing is finished 🙂 |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Looks good to me, I have one open question about the dimension uniqueness rule.
issues.append( | ||
ValidationError( | ||
context=FileContext.from_metadata(semantic_model.metadata), | ||
message=f"Dimension labels must be unique within a semantic model. The label `{label}` was " |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Do we want them to be globally unique for dimensions? I can see people setting user__country
and sales_rep__country
to both be Country
, which isn't great.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
For those following along, we discussed this in a call. The decision for now was to not require labels for dimensions or measures to be globally unique. If people end up consistently running into disambiguation problems, we'll look into being stricter
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Per discussion today the current validation looks good!
Specifically we added the attribute to the following protocols: - Metric - SavedQuery - SemanticModel - Dimension - Entity - Measure
To really do protocol satisfaction testing well, we have to write really complex instantiations of the implementations. We want to cover when things are optional vs not. Some implementations require certain attributes to be set depending on paraent attributes during instantiation. And the protocols are still growing (new attributes are still getting added often). This makes how we were during these tests really brittle, and we weren't testing all the cases we wanted to. By using hypothesis each test will be ran many times with different generated instantiations of the main built object. Additionally, hypothesis does shrinking, such that it will try to find the minimal failing case. This gives us better guarantees that we're actually hitting all possible variations.
…e `label` (or `None`)
…action So pydantic 1.x doesn't play nice with hypothesis in that all defaulted and optional fields won't have strategies automatically generated for them. Therefore we need to explicitly set strategies for these properties unfortunately. I tried to reduce as much duplication as possible by composing strategies as much as possible.
d807b48
to
ff2d7e5
Compare
Resolves #143
Description
In this PR we
Add optional
label
attribute to the following protocols:Begin supporting optional
label
attribute on the corresponding pydantic implementations and parsingAdd validations which ensure
None
)Checklist
changie new
to create a changelog entry