ENH: Use Pydantic models to validate metadata in aggregation.py and remove drop_nones #750
Conversation
-        # use first as template but filter away invalid entries first:
-        template = _utils.filter_validate_metadata(self.source_metadata[0])
+        # use first as template
+        template = copy.deepcopy(self.source_metadata[0])
As far as I could see from the docs, Pydantic defaults to "ignore" for unknown fields, so it should be okay to remove this filtering method. Is this correct?
Yes, plus we expect that incoming metadata is already valid and will be further validated once we put it into the Pydantic model.
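For reference, this matches Pydantic v2's default behaviour: unless a model configures extra fields otherwise, unknown keys are silently dropped during validation. A minimal sketch, assuming Pydantic v2 and a purely illustrative model (not the project's actual schema):

```python
from typing import Optional

from pydantic import BaseModel


class FileMeta(BaseModel):
    # Hypothetical model used only to illustrate the default behaviour.
    relative_path: str
    checksum_md5: Optional[str] = None


# The unknown key "bogus" is silently dropped, because Pydantic's
# default handling of extra fields is "ignore".
meta = FileMeta.model_validate(
    {"relative_path": "share/results/map.gri", "checksum_md5": None, "bogus": 1}
)
print(meta.model_dump())
# {'relative_path': 'share/results/map.gri', 'checksum_md5': None}
```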
@@ -62,7 +64,7 @@ class AggregatedData:
     tagname: str = ""
     verbosity: str = "DEPRECATED" # keep for while

-    _metadata: dict = field(default_factory=dict, init=False)
+    _metadata: internal.DataClassMeta = field(init=False)
Had to change the type of _metadata to internal.DataClassMeta here in order to pass the mypy checks. Not sure if that is what we wanted, or if there is a better way to do it?
This is what we wanted and the right way to do it. We might not want to use the internal class, but that situation is a bit confusing and can be addressed outside of this PR.
Nice work 👍
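To illustrate the typing point above: annotating the private field with the model type rather than a plain dict is what lets mypy reason about attribute access on it. A minimal sketch with made-up class and field names, not the project's actual code:

```python
from dataclasses import dataclass, field

from pydantic import BaseModel


class MetaModel(BaseModel):
    # Stand-in for a metadata model such as internal.DataClassMeta;
    # the fields here are illustrative only.
    name: str = ""
    tagname: str = ""


@dataclass
class ExampleAggregation:
    name: str = ""
    tagname: str = ""
    # Typed with the model instead of `dict`, so mypy knows which
    # attributes `_metadata` exposes once it has been assigned.
    _metadata: MetaModel = field(init=False)

    def _set_metadata(self) -> None:
        self._metadata = MetaModel(name=self.name, tagname=self.tagname)
```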
-        self._generate_aggrd_metadata(obj, real_ids, uuids, compute_md5)
-        if skip_null:
-            self._metadata = _utils.drop_nones(self._metadata)
+        self._set_metadata(obj, real_ids, uuids, compute_md5)
I think eventually we should move to call _set_metadata() in __post_init__(), but that can be taken in a separate PR.
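A rough sketch of that idea, with simplified, hypothetical fields (the real _set_metadata() takes additional arguments, which would have to become part of the instance state first):

```python
from dataclasses import dataclass, field
from typing import Any


@dataclass
class ExamplePostInit:
    tagname: str = ""
    compute_md5: bool = True
    _metadata: dict[str, Any] = field(init=False, default_factory=dict)

    def __post_init__(self) -> None:
        # Build the metadata as part of construction instead of
        # relying on a separate, later call.
        self._set_metadata()

    def _set_metadata(self) -> None:
        self._metadata = {
            "tagname": self.tagname,
            "compute_md5": self.compute_md5,
        }
```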
        a deprecation warning will be raised.
    **kwargs: See AggregatedData() arguments; initial will be overridden by
We would eventually like to get away from this pattern too, where a class instance has its state updated so it can be reused. **kwargs is present in a few other places as well.
Resolves #730 and #693
Now using Pydantic models in aggregation.py to validate metadata. In addition, the following refactoring and changes have been made:

- _generate_aggrd_metadata has been renamed to _set_metadata
- filter_validate_metadata in _utils has been removed, as validation will be taken care of by Pydantic
- skip_null has been deprecated and will always be set to True. A deprecation warning will be raised if it is being used.
- Tests have been updated/fixed to match the new changes and use the correct types as enforced by Pydantic.
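As a rough sketch of the skip_null deprecation described above (the helper name and message are illustrative, not the actual implementation):

```python
import warnings
from typing import Optional


def _warn_if_skip_null_used(skip_null: Optional[bool]) -> None:
    # Hypothetical helper: warn callers that still pass skip_null,
    # which no longer has any effect and is always treated as True.
    if skip_null is not None:
        warnings.warn(
            "'skip_null' is deprecated and is always treated as True.",
            DeprecationWarning,
            stacklevel=2,
        )
```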