Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

mixin class for equality and equivalence #133

Merged
merged 5 commits into from
Feb 5, 2024

Conversation

DaniBodor
Copy link
Member

@DaniBodor DaniBodor commented Dec 1, 2023

This PR does the following:

  • additions to docstring of slicing module (refactor: slicing mixin class #132 ) + add init file to mixins subpackage
  • create a generic __eq__ function for alive data structures
  • create a generic isequivalent function for alive data structures
  • apply isequivalent throughout

@DaniBodor DaniBodor changed the base branch from main to 129a_slicing_mixin_dbodor December 1, 2023 17:31
@DaniBodor DaniBodor mentioned this pull request Dec 1, 2023
4 tasks
@DaniBodor
Copy link
Member Author

DaniBodor commented Dec 2, 2023

@psomhorst, in 30efe49, you say:

Before this commit, check_equivalence() had to be called as a class
method. Two variants a and b could be checked for equivalence by
calling Variant.check_equivalence(a, b) or
a.check_equivalence(a, b). The last command made little sense.

By removing the classmethod decorator Variant.check_equivalence(a, b)
still works. The second command is changed to a.check_equivalence(b),
which makes more sense.

However, I think that as a static method (the way you did it for Variant), the command a.check_equivalence(b) doesn't work, where as a standard instance method Variant.check_equivalence(a, b) doesn't work. Not sure if there's a way around either or whether I'm missing something.
We can also revert to class method, as the a.check_equivalence(a, b) command is kind of odd, but not problematic. The default way to call it would still be Variant.check_equivalence(a, b).

What's your opinion on this?

@DaniBodor
Copy link
Member Author

Consider renaming check_equvalence to isequivalent, to be more in line with issubclass and isinstance, etc.
The behavior is slightly different, though, as this can optionally raise an error as an alternative output to returning a bool. So might not be a great idea to name it analogously to those builtins.

@psomhorst
Copy link
Contributor

However, I think that as a static method (the way you did it for Variant), the command a.check_equivalence(b) doesn't work, where as a standard instance method Variant.check_equivalence(a, b) doesn't work. Not sure if there's a way around either or whether I'm missing something. We can also revert to class method, as the a.check_equivalence(a, b) command is kind of odd, but not problematic. The default way to call it would still be Variant.check_equivalence(a, b).

What's your opinion on this?

Then it should not be a static method but a normal method with self as first argument and other as second argument. You can call it as a.check_equivalence(b) (a becomes self, because the method is bound to the instance a) as Variant.check_equivalence(a, b) (a becomes self because it's not bound to an instance).

@psomhorst
Copy link
Contributor

Consider renaming check_equvalence to isequivalent, to be more in line with issubclass and isinstance, etc. The behavior is slightly different, though, as this can optionally raise an error as an alternative output to returning a bool. So might not be a great idea to name it analogously to those builtins.

I agree that this is a better name. If raise_=False by default, the normal behaviour is to return a boolean. This is also in line with e.g. numpy, scipy and pandas.

@DaniBodor DaniBodor force-pushed the 129a_slicing_mixin_dbodor branch 2 times, most recently from 81ab152 to a4515dd Compare December 6, 2023 12:02
@DaniBodor DaniBodor force-pushed the 129b_eq_dbodor branch 11 times, most recently from df95c03 to a8464f1 Compare December 7, 2023 22:10
@DaniBodor DaniBodor force-pushed the 129a_slicing_mixin_dbodor branch from e216214 to 6dc006b Compare December 8, 2023 11:11
Base automatically changed from 129a_slicing_mixin_dbodor to 181_refactor_sequence_psomhorst December 8, 2023 11:21
@DaniBodor DaniBodor force-pushed the 129b_eq_dbodor branch 5 times, most recently from 241f054 to 6cb6b48 Compare December 8, 2023 15:10
@DaniBodor DaniBodor marked this pull request as ready for review December 8, 2023 15:14
@DaniBodor DaniBodor requested a review from psomhorst December 8, 2023 15:14
@DaniBodor DaniBodor force-pushed the 129b_eq_dbodor branch 5 times, most recently from 220dd2b to 7ba4d40 Compare December 13, 2023 22:01
eitprocessing/mixins/equality.py Outdated Show resolved Hide resolved
eitprocessing/mixins/equality.py Outdated Show resolved Hide resolved
eitprocessing/mixins/slicing.py Outdated Show resolved Hide resolved
eitprocessing/eit_data/__init__.py Outdated Show resolved Hide resolved
eitprocessing/mixins/equality.py Outdated Show resolved Hide resolved
@@ -51,6 +51,9 @@ class VariantCollection(dict, Generic[V]):

variant_type: type[V]

def __eq__(self, other):
return Equivalence.__eq__(self, other)
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Suggested change
return Equivalence.__eq__(self, other)
return self.variant_type == other.variant_type and super().__eq__(other)

f"Variant types don't match: {self.variant_type}, {other.variant_type}": self.variant_type == other.variant_type,
f"VariantCollections do not contain the same variants: {self.keys()=}, {other.keys()=}": set(self.keys()) == set(other.keys()),
}
for key in self.keys():
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This raises a KeyError if self contains keys that other does not. The second check in the dictionary above does check for this, but that does not result in an actually throws exception, just the value being False. So this code is being evaluated, even if set(self.keys()) != set(other.keys()).

Suggested change
for key in self.keys():
for key in set(self.keys()) & set(other.keys()):

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Even more problematic is that self and other both can be objects that do not implement keys(). This method raises AttributeError if other is not a VariantCollection, just like the lines above.

eitprocessing/mixins/slicing.py Outdated Show resolved Hide resolved
) -> bool:
# fmt: off
checks = {
f"Variant types don't match: {self.variant_type}, {other.variant_type}": self.variant_type == other.variant_type,
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This check leads to issues. This check runs regardless of the type of other (or even self, as it can be called as a class method). When testing this with other as something different from a VariantCollection, this raises an AttributeError because other.variant_type does not exist. These checks should ideally run after isequivalent() checks whether the type of other equals the type of self.

Elsewhere I proposed to make a separate method that performs these checks, which is called from isequivalent(). This method can be called after isequivalent() makes sure the type matches.

Alternatively, this check should first check whether other has a variant type.

Suggested change
f"Variant types don't match: {self.variant_type}, {other.variant_type}": self.variant_type == other.variant_type,
f"Variant types don't match: {self.variant_type}, {other.variant_type if hasattr(other, "variant_type") else ""}": hasattr(other, "variant_type") and self.variant_type == other.variant_type,

This becomes quite verbose quite quickly. Error messages do not properly convey the issue at hand.

# fmt: off
checks = {
f"Variant types don't match: {self.variant_type}, {other.variant_type}": self.variant_type == other.variant_type,
f"VariantCollections do not contain the same variants: {self.keys()=}, {other.keys()=}": set(self.keys()) == set(other.keys()),
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This line has the same problem as above.

Suggested change
f"VariantCollections do not contain the same variants: {self.keys()=}, {other.keys()=}": set(self.keys()) == set(other.keys()),
f"VariantCollections do not contain the same variants: {self.keys()=}, {other.keys() if hasattr(other, 'keys') else ''=}": hasattr(self, "keys") and hasattr(other, "keys") and set(self.keys()) == set(other.keys()),

@DaniBodor
Copy link
Member Author

@psomhorst , I've implemented a few of the changes you suggested, mainly on the core workings of the Equivalence class. The ones that are left now are kind of specific to some of the (more complicated) classes that will hold the data.

To me, the main objective of this PR is to create a structure/class that can be used generically for testing equality and equivalence. I think I re-coded all the separate checks that previously were written per subclass into this structure. Currently it is a bit hard for me to judge exactly how each class will work and what needs to be checked for each of them, and it might also still be a bit fluent.

I suggest that in this PR, we focus on the core of the Equivalence class, and then resolve the details for each daughter class as we are finalizing those classes. What do you think of that?
Alternatively, maybe we can have a call to fix the ones where you still have open comments, or discuss it during our next meeting, as right now I find it hard to see exactly what is going on in each.

Note that these are not functional for the `_Variant`-type classes, as these will likely be dropped.
These are not functional and will need to be overwritten when implementing real tests for this.
Copy link
Contributor

@psomhorst psomhorst left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think this looks good! I have not looked at testing this extensively, but we'll run into problems when implementing classes using Equivalence anyway.

self._check_equivalence: list[str]
for attr in self._check_equivalence:
if (s := getattr(self, attr)) != (o := getattr(other, attr)):
raise f"{attr.capitalize()}s don't match: {s}, {o}"
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Suggested change
raise f"{attr.capitalize()}s don't match: {s}, {o}"
msg = f"{attr.capitalize()}s don't match: {s}, {o}"
raise EquivalenceError(msg)

@DaniBodor DaniBodor force-pushed the 129b_eq_dbodor branch 3 times, most recently from a443a81 to fee1ddf Compare February 5, 2024 10:42
@DaniBodor
Copy link
Member Author

I think this looks good! I have not looked at testing this extensively, but we'll run into problems when implementing classes using Equivalence anyway.

Since your review, I've rebased the entire branch to make the commits more self-sufficient, but functionally nothing has changed. I will merge as is.

As to your suggestion for the error message, I agree, but will do it in #143, as it is part of the ruff formatting, and will immediately tackle all existing error messages as well.

@DaniBodor DaniBodor merged commit 2617164 into 181_refactor_sequence_psomhorst Feb 5, 2024
@DaniBodor DaniBodor deleted the 129b_eq_dbodor branch February 5, 2024 10:49
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

2 participants