Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Create mixins for common methods #129

Open
2 of 4 tasks
DaniBodor opened this issue Nov 29, 2023 · 4 comments
Open
2 of 4 tasks

Create mixins for common methods #129

DaniBodor opened this issue Nov 29, 2023 · 4 comments
Assignees
Labels
refactoring Improve clarity or readability of code

Comments

@DaniBodor
Copy link
Member

DaniBodor commented Nov 29, 2023

Make mix-in classes for:

The more I think about this, the more I feel like it would make sense to have a super-class with all these methods, rather than a whole bunch of mixins. To me, mixins make sense if we want to pick and choose which class uses which of these functionalities, but it seems to me that all classes will use all the methods.
Anyway, for now I will create them as mixins and then it should be fairly straightforward to refactor it into a superclass if we decide to go that route.

@DaniBodor DaniBodor self-assigned this Nov 29, 2023
@DaniBodor DaniBodor added this to the Major restructuring milestone Nov 29, 2023
@DaniBodor DaniBodor added the refactoring Improve clarity or readability of code label Nov 29, 2023
@psomhorst
Copy link
Contributor

psomhorst commented Nov 29, 2023

@DaniBodor In different variants, data is stored in differently named attributes. E.g. EITDataVariant has an attribute pixel_impedance that stores data, while a ContinuousDataVariant has an attribute values that stores data. For concatenation it is important to be able to address the data attribute. There are a few ways I can think of to address this.

  • Unify the name. For most Variants, values is the logical name. For EITDataVariant, this is different. Both pixel_impedance and global_impedance are often used. It would not be clear what data values would hold. When initialising EITDataVariant, it is more clear to explicitly state what data you're passing on (pixel impedance).
  • Assume the fourth field of the dataclass is the data storing attribute. Variant has three fields, so any subclass can implement the fourth as the data storage. This currently works for EITDataVariant and ContinuousDataVariant, but is not a strong assumption to be making.
  • Add an attribute that points to the name of the data storage attribute. E.g. EITDataVariant._data_storage = "pixel_impedance". This requires no assumptions, but is a bit ugly to use: getattr(variant, variant._data_storage).
  • Add a property _data_storage that returns the proper attribute. See example below. It's better to use than the previous one: variant._data_storage.
class EITDataVariant(Variant):
    pixel_impedance: NDArray

    @property
    def _data_storage(self) -> NDarray:
        return self.pixel_impedance

What do you think? I think the last option is the neatest.

@psomhorst
Copy link
Contributor

I think __len__() can be added to the mixin list. The length (number of time steps) of each dataset should be the length of the first axis, e.g. the 0th element of the shape. E.g.:

class EITDataVariant:
    def __len__(self) -> int:
        return self.pixel_impedance.shape[0]

This requires a uniform way to access the proper attribute; see previous comment.

@DaniBodor
Copy link
Member Author

@DaniBodor In different variants, data is stored in differently named attributes. E.g. EITDataVariant has an attribute pixel_impedance that stores data, while a ContinuousDataVariant has an attribute values that stores data. For concatenation it is important to be able to address the data attribute. There are a few ways I can think of to address this.

* Unify the name. For most Variants, `values` is the logical name. For EITDataVariant, this is different. Both `pixel_impedance` and `global_impedance` are often used. It would not be clear what data `values` would hold. When initialising EITDataVariant, it is more clear to explicitly state what data you're passing on (pixel impedance).

* Assume the fourth field of the dataclass is the data storing  attribute. Variant has three fields, so any subclass can implement the fourth as the data storage. This currently works for EITDataVariant and ContinuousDataVariant, but is not a strong assumption to be making.

* Add an attribute that points to the name of the data storage attribute. E.g. `EITDataVariant._data_storage = "pixel_impedance"`. This requires no assumptions, but is a bit ugly to use: `getattr(variant, variant._data_storage)`.

* Add a property `_data_storage` that returns the proper attribute. See example below. It's better to use than the previous one: `variant._data_storage`.
class EITDataVariant(Variant):
    pixel_impedance: NDArray

    @property
    def _data_storage(self) -> NDarray:
        return self.pixel_impedance

What do you think? I think the last option is the neatest.

Thanks for sharing these considerations. I'm not sure what I prefer at this point, as it's still a bit abstract for me. I will think about it and play with it as I start implementing it and get back to you with my thoughts then.

@DaniBodor
Copy link
Member Author

@DaniBodor In different variants, data is stored in differently named attributes. E.g. EITDataVariant has an attribute pixel_impedance that stores data, while a ContinuousDataVariant has an attribute values that stores data. For concatenation it is important to be able to address the data attribute. There are a few ways I can think of to address this.

  • Unify the name. For most Variants, values is the logical name. For EITDataVariant, this is different. Both pixel_impedance and global_impedance are often used. It would not be clear what data values would hold. When initialising EITDataVariant, it is more clear to explicitly state what data you're passing on (pixel impedance).

  • Assume the fourth field of the dataclass is the data storing attribute. Variant has three fields, so any subclass can implement the fourth as the data storage. This currently works for EITDataVariant and ContinuousDataVariant, but is not a strong assumption to be making.

  • Add an attribute that points to the name of the data storage attribute. E.g. EITDataVariant._data_storage = "pixel_impedance". This requires no assumptions, but is a bit ugly to use: getattr(variant, variant._data_storage).

  • Add a property _data_storage that returns the proper attribute. See example below. It's better to use than the previous one: variant._data_storage.

class EITDataVariant(Variant):
    pixel_impedance: NDArray

    @property
    def _data_storage(self) -> NDarray:
        return self.pixel_impedance

What do you think? I think the last option is the neatest.

OK, I now have a better understanding of the overall structure. Do I understand correctly that any Variant (bet it EITData, ContinuousData, or SparseData) will only have 1 attribute that should be concatenated?
If so, I agree that your last suggestion, giving it a property) is the neatest. The property can then also be used by __len__. The way I envision it now is that a new object can be create with all/most arguments identical to self apart from this _data_storage property (and the label/name)? The arguments that don't match need to be stated in the child method, similar to how I did the equivalence checks in #133.
The one thing is that Sequence wouldn't inherit the generic method, but would need to loop over all stored data structures and concatenate them separately (which is not an issue).

If not, then I'm not sure there's a neat unified way to do concatenation, and we may as well stick to separate methods for each.

@psomhorst psomhorst removed this from the Major restructuring milestone Jun 7, 2024
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
refactoring Improve clarity or readability of code
Projects
None yet
Development

No branches or pull requests

2 participants