Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Add an option for count_num_spikes_per_unit #2209

Merged
merged 9 commits into from
Nov 22, 2023

Conversation

samuelgarcia
Copy link
Member

This should be merge after #2198

DradeAW and others added 4 commits November 13, 2023 15:42
Spike vector should be the default, as computing 1 spike train will mean
the cache is available (but doesn't mean it's faster to use the cached
spike trains)
@samuelgarcia
Copy link
Member Author

With this, some part of the code using count_num_spikes_per_unit() can be rewritten with array in mind and should be better than looping over a dict. (for instance using np.any, np.all, ...)

should I do this in the same PR ?

@alejoe91 alejoe91 added enhancement New feature or request core Changes to core module labels Nov 15, 2023
@h-mayorquin
Copy link
Collaborator

@samuelgarcia

should I do this in the same PR ?

I vote not, one PR per feature as much as we can.

@@ -269,35 +269,40 @@ def get_total_num_spikes(self):
)
return self.count_num_spikes_per_unit()

def count_num_spikes_per_unit(self) -> dict:
def count_num_spikes_per_unit(self, output="dict"):
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think it would be better to just make another function for this behavior.

What is the advantage of doing it with a keyword?
The disavatnages are:

  • Harder to discover
  • More complicated output type which makes harder to reason about code.
  • Adds complexity to the function.

Functions should do one thing and do it well. Specially at the core. Or make two private methods that do the specific task and make this a dispatcher with options.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

For me the main advantage is internal use of sorting.count_num_spikes_per_unit(output="array")
replacing at several places the uggly np.array(list(sorting.count_num_spikes_per_unit().values()))

Copy link
Collaborator

@h-mayorquin h-mayorquin Nov 22, 2023

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

But you can just have another method that will allow you to do exactly the same:

sorting.count_num_spikes_per_unit_array instead of np.array(list(sorting.count_num_spikes_per_unit().values()))

I think the method is a good idea.
I think mixing this new logic in the old method is not for the reasons stated above.

I am asking: why do you want to mix the logic to produce counts as arrays and counts as dict in the same fuction? What's the advantage of it? Is than an aesthethical thing? You dislike having more functions? You personally find keywords easier to discover than functions? I am looking for your reasoning here.

"Functions should do only one thing and do it well" seems like a really good principle to me that is more or less widely accepted. Design rules can -and sometimes should- be broken, but I think we should have a good reason to do it. Which one is here?

propagate the outputs option at several place when it make sens.
@samuelgarcia
Copy link
Member Author

@DradeAW : can you chekc this ?

@DradeAW
Copy link
Contributor

DradeAW commented Nov 17, 2023

I think the way you implemented it is the right way: using spike trains if they are all cached or if the spike_vector isn't cached.

Also love the idea of getting an array or a dict!
I never know the return type and often need to convert it depending on my needs, I like the option :)

I'll make some test to make sure it makes things faster on my code.

@DradeAW
Copy link
Contributor

DradeAW commented Nov 17, 2023

Yep I do have a noticeable speedup (several seconds)!

Thanks @samuelgarcia :)

Copy link
Collaborator

@zm711 zm711 left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Although I do think that separating out functions is likely better, I added docstring/error clarifications with the assumption that this PR will remain as is. Feel free to ignore if you plan to change the structure.

src/spikeinterface/core/basesorting.py Outdated Show resolved Hide resolved
src/spikeinterface/core/basesorting.py Outdated Show resolved Hide resolved
src/spikeinterface/core/basesorting.py Outdated Show resolved Hide resolved
@alejoe91 alejoe91 merged commit 5d7b64e into SpikeInterface:main Nov 22, 2023
11 checks passed
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
core Changes to core module enhancement New feature or request
Projects
None yet
Development

Successfully merging this pull request may close these issues.

5 participants