-
Notifications
You must be signed in to change notification settings - Fork 190
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Add methods to sparsify and densify waveforms to ChannelSparsity
#1985
Add methods to sparsify and densify waveforms to ChannelSparsity
#1985
Conversation
src/spikeinterface/core/sparsity.py
Outdated
def __repr__(self): | ||
ratio = np.mean(self.mask) | ||
txt = f"ChannelSparsity - units: {self.unit_ids.size} - channels: {self.channel_ids.size} - ratio: {ratio:0.2f}" | ||
sparsity = 1 - np.mean(self.mask) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
So here you rather have ratio = np.mean(self.mask) as it was before and just add the clarification that P(x=1), right?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
+1 for me
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Done.
src/spikeinterface/core/sparsity.py
Outdated
@@ -119,6 +125,65 @@ def unit_id_to_channel_indices(self): | |||
self._unit_id_to_channel_indices[unit_id] = channel_inds | |||
return self._unit_id_to_channel_indices | |||
|
|||
def sparsify_waveforms(self, waveforms: np.ndarray, unit_index: int) -> np.ndarray: |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Add some checks for dimensions.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Done.
src/spikeinterface/core/sparsity.py
Outdated
waveforms : np.array | ||
The sparsified waveforms array of shape (num_units, num_samples, num_active_channels). |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This is waveforms_or_template
, right? It can be 3d (waveforms) or 2d (templates) (as in the tests). Maybe we should change the docstring and variable names accordingly. What do you think @h-mayorquin ?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I think this is a good suggesiton. I think on templates as central / canonical / representative forms of waveforms so I would like to keep the name of the variable waveforms as it is the general type. But I think that the docstring should clarify this and make it explicit.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I made some changes to the docstring. Let me know what you think.
If you think that templates should be a variable name it would still prefer to have instead a method called sparsify_template
and densify_template
that take care of the 2 dimensional case and separate the functions. Maybe that's even better ...
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I think it's ok for now! I agree that templates are "mean" waveforms. Thanks for clarifying the docstring!
@@ -119,6 +125,85 @@ def unit_id_to_channel_indices(self): | |||
self._unit_id_to_channel_indices[unit_id] = channel_inds | |||
return self._unit_id_to_channel_indices | |||
|
|||
def sparsify_waveforms(self, waveforms: np.ndarray, unit_id: str) -> np.ndarray: |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
unit_id is not always a str can be int
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Could be usefull also to have unit_index entry somtimes.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
good point!
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Yes, I forgot to do this. In fact, mea culpa, I kind of new it when I did it but I ... wished that unit_ids were only string and intentionally forgot : P
I will push a small fix for this.
I would not like to do have a method with two different type of inputs, then 1/4 of the code is used to validate the input with if cases.
|
||
return sparsified_waveforms | ||
|
||
def densify_waveforms(self, waveforms: np.ndarray, unit_id: str) -> np.ndarray: |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
same here
assert_msg = ( | ||
"Waveforms do not seem to be be in the sparsity shape of this unit_id. The number of active channels is " | ||
f"{len(non_zero_indices)} but the waveform has {waveforms.shape[-1]} active channels." | ||
) | ||
assert self.are_waveforms_sparse(waveforms=waveforms, unit_id=unit_id), assert_msg |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
very little detail as a genral partice: here you format the message even before you made the assert.
But if the assert pass there is no need to make the text formatting. And so putting the message after the assert ...,
is better no ?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I do feel that waste of computation but I prefer this for readability. This should not have any really impact on performance anywhere.
As an aside to your aside, it is usually considered a bad practice to use asserts the way that we do here (but most people, doing this is very popular!) because you can dissable assertions in optimize code:
The recommended way to do this would be to do and if statement and then raise the appropiate exception but I think that most people does not like this? What do you think?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I think this is a science vs industry thing. Every piece of science code I've ever read uses assert instead of if-exception. I think with the assumption that scientists aren't going to shut off assertions to increase speed. And using the asserts does empower the user such that if they don't want the protections designed by the developer they can shut them off to get the speed boost. But just my two cents.
I have been wanting to add something like this for a while and it will be useful in the context of #1982.
The idea is that if we have channel sparsity, it should be possible to use the class to compute a sparse or dense representation of a waveform or a template. Now the
ChannelSparsity
object is part of WaveformExtractor but it does not have a way of changing the representation at a waveform at will in the spot.Instead, we do sparsifying operations all over the place directly with the mask:
https://github.com/h-mayorquin/spikeinterface/blob/aa5248454462084dcd7a05296ef82c0bb717a285/src/spikeinterface/core/waveform_extractor.py#L946-L950
https://github.com/h-mayorquin/spikeinterface/blob/aa5248454462084dcd7a05296ef82c0bb717a285/src/spikeinterface/core/waveform_extractor.py#L893-L909
I think it is better to centralize this operation where it semantically make sense (a sparsity class should sparsify!) and test it appropriately.