Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Add array type handling for normalization #835

Merged
merged 14 commits into from
Jan 6, 2025

Conversation

eroell
Copy link
Collaborator

@eroell eroell commented Dec 4, 2024

Description of changes

Implements different array types for normalization functions which serve as example and guidance on how we implement this for all other functions.

Technical details

Uses singledispatch. Inspired from e.g. scanpy's scale; but less performance engineering, rather focusing on leveraging frameworks such as dask-ml.

@eroell eroell changed the title initial suggestions of array type checks on example of scale_norm initial suggestions of array type handling on example of scale_norm Dec 4, 2024
@eroell eroell changed the title initial suggestions of array type handling on example of scale_norm initial suggestions of array type handling on example of normalization methods Dec 4, 2024
@eroell
Copy link
Collaborator Author

eroell commented Dec 6, 2024

Very happy about your input on any of the questions I raised in the comments @flying-sheep :)

@eroell eroell requested a review from nicolassidoux December 6, 2024 17:10
@eroell
Copy link
Collaborator Author

eroell commented Dec 6, 2024

Also interested what @nicolassidoux thinks here! :)

Copy link
Member

@Zethson Zethson left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

  1. You added lots of NotImplementedError messages. I wonder whether we can DRY this more easily?
  2. Those error messages should also always suggest which types are support for that function as a sort of solution.
  3. adata_to_norm_casted = adata_to_norm.copy() why do we always need those copies? Doesn't the fixture already create new copies?

ehrapy/preprocessing/_normalization.py Outdated Show resolved Hide resolved
ehrapy/preprocessing/_normalization.py Show resolved Hide resolved
tests/preprocessing/test_normalization.py Outdated Show resolved Hide resolved
@eroell
Copy link
Collaborator Author

eroell commented Dec 9, 2024

You added lots of NotImplementedError messages. I wonder whether we can DRY this more easily?

Yess

Those error messages should also always suggest which types are support for that function as a sort of solution.

Good point, the dynamic registries help here again

adata_to_norm_casted = adata_to_norm.copy() why do we always need those copies? Doesn't the fixture already create new copies?

because I just not paying attention to this yet, I'll remove :)

@github-actions github-actions bot added the chore label Dec 12, 2024
@Zethson
Copy link
Member

Zethson commented Jan 6, 2025

I addressed the feedback raised in the discussions. Hope I didn't miss any. I'll merge this now but if I forgot something @eroell please make me aware. Thank you!

@Zethson Zethson marked this pull request as ready for review January 6, 2025 16:05
@Zethson Zethson changed the title initial suggestions of array type handling on example of normalization methods Add array type handling for normalization Jan 6, 2025
@Zethson Zethson merged commit 15c6315 into main Jan 6, 2025
11 checks passed
@Zethson Zethson deleted the enhancement/normalization-array-format-harmonization branch January 6, 2025 16:10
@Zethson Zethson mentioned this pull request Jan 6, 2025
16 tasks
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
Projects
None yet
Development

Successfully merging this pull request may close these issues.

3 participants