Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

ENH: Use Pydantic models to validate metadata in aggregation.py and remove drop_nones #750

Conversation

slangeveld
Copy link
Contributor

Resolves #730 and #693

Now using Pydantic models in aggregation.py to validate metadata. In addition, the following refactoring and changes has been made:

  • The internal method _generate_aggrd_metadata has been renamed to _set_metadata
  • Remove the method filter_validate_metadata in _utils as validation will be taken care of by Pydantic
  • If the validation of the metadata with the Pydantic models fail, we raise an exception
  • We do the model_dump only when the metadata is returned. The internal metadata object will be stored as a Pydantic base model
  • The input parameter skip_null has been deprecated and will always be set to True. A deprecation warning will be raised if it is being used.

Tests have been updated/fixed to match the new changes and use the correct types as enforced by Pydantic.

# use first as template but filter away invalid entries first:
template = _utils.filter_validate_metadata(self.source_metadata[0])
# use first as template
template = copy.deepcopy(self.source_metadata[0])
Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

As far as I could see from the doc, Pydantic should default to "Ignore" on unknown fields so it should be okay to remove this filtering method. Is this correct?

Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yes, plus we expect that incoming metadata is already valid and will be further validated once we put it into the Pydantic model

@@ -62,7 +64,7 @@ class AggregatedData:
tagname: str = ""
verbosity: str = "DEPRECATED" # keep for while

_metadata: dict = field(default_factory=dict, init=False)
_metadata: internal.DataClassMeta = field(init=False)
Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Had to change the type of _metadata to internal.DataClassMeta here in order to pass the mypy checks. Not sure if that is what we wanted, or if there is a better way to do it?

Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This is what we wanted and the right way to do it. We might not want to use the internal class -- but that situation is a bit confusing and can addressed outside of this pr.

Copy link
Collaborator

@mferrera mferrera left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Nice work 👍

self._generate_aggrd_metadata(obj, real_ids, uuids, compute_md5)
if skip_null:
self._metadata = _utils.drop_nones(self._metadata)
self._set_metadata(obj, real_ids, uuids, compute_md5)
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think eventually we should move to call _set_metadata() in __post_init__() but that can be taken in a separate PR.

Comment on lines +309 to 310
a deprecation warning will be raised.
**kwargs: See AggregatedData() arguments; initial will be overridden by
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

We would eventually like to get away from this pattern too, where a class instance has its state updated so it can be reused. **kwargs is present in a few other places as well.

@mferrera mferrera merged commit c52c63c into equinor:main Jul 18, 2024
13 checks passed
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

Have aggregation.py validate metadata with Pydantic model
2 participants