-
Notifications
You must be signed in to change notification settings - Fork 190
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Add an option for count_num_spikes_per_unit #2209
Conversation
Spike vector should be the default, as computing 1 spike train will mean the cache is available (but doesn't mean it's faster to use the cached spike trains)
…rface into count_spike_array
With this, some part of the code using should I do this in the same PR ? |
I vote not, one PR per feature as much as we can. |
@@ -269,35 +269,40 @@ def get_total_num_spikes(self): | |||
) | |||
return self.count_num_spikes_per_unit() | |||
|
|||
def count_num_spikes_per_unit(self) -> dict: | |||
def count_num_spikes_per_unit(self, output="dict"): |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I think it would be better to just make another function for this behavior.
What is the advantage of doing it with a keyword?
The disavatnages are:
- Harder to discover
- More complicated output type which makes harder to reason about code.
- Adds complexity to the function.
Functions should do one thing and do it well. Specially at the core. Or make two private methods that do the specific task and make this a dispatcher with options.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
For me the main advantage is internal use of sorting.count_num_spikes_per_unit(output="array")
replacing at several places the uggly np.array(list(sorting.count_num_spikes_per_unit().values()))
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
But you can just have another method that will allow you to do exactly the same:
sorting.count_num_spikes_per_unit_array
instead of np.array(list(sorting.count_num_spikes_per_unit().values()))
I think the method is a good idea.
I think mixing this new logic in the old method is not for the reasons stated above.
I am asking: why do you want to mix the logic to produce counts as arrays and counts as dict in the same fuction? What's the advantage of it? Is than an aesthethical thing? You dislike having more functions? You personally find keywords easier to discover than functions? I am looking for your reasoning here.
"Functions should do only one thing and do it well" seems like a really good principle to me that is more or less widely accepted. Design rules can -and sometimes should- be broken, but I think we should have a good reason to do it. Which one is here?
propagate the outputs option at several place when it make sens.
@DradeAW : can you chekc this ? |
I think the way you implemented it is the right way: using spike trains if they are all cached or if the spike_vector isn't cached. Also love the idea of getting an array or a dict! I'll make some test to make sure it makes things faster on my code. |
Yep I do have a noticeable speedup (several seconds)! Thanks @samuelgarcia :) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Although I do think that separating out functions is likely better, I added docstring/error clarifications with the assumption that this PR will remain as is. Feel free to ignore if you plan to change the structure.
Co-authored-by: Zach McKenzie <[email protected]>
for more information, see https://pre-commit.ci
This should be merge after #2198