
Improvement when counting num spikes #2198

Closed
DradeAW wants to merge 2 commits

Conversation

@DradeAW (Contributor) commented Nov 13, 2023

Spike vector should be the default, as computing 1 spike train will mean the cache is available (but doesn't mean it's faster to use the cached spike trains)

@alejoe91 (Member)

Have you profiled it? In the spike trains approach, you just count arrays that are already in memory. With the spike vector approach you need a np.unique call, which sounds slower. Of course, if neither spike trains nor spike vectors are cached, the latter is probably better. But if both are cached, I suspect that spike trains is faster!
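
(For reference, the two approaches boil down to something like the sketch below. It assumes cached spike trains are plain NumPy arrays and that the spike vector is a structured array with a "unit_index" field; the toy data is purely illustrative.)

import numpy as np

# Toy stand-ins for the real cached objects (illustrative only)
unit_ids = np.array(["u0", "u1", "u2"])
cached_spike_trains = {u: np.sort(np.random.randint(0, 30000, size=100)) for u in unit_ids}
spike_vector = np.zeros(300, dtype=[("sample_index", "int64"), ("unit_index", "int64")])
spike_vector["unit_index"] = np.random.randint(0, 3, size=300)

# Spike trains approach: one len() per cached array, no scan over the spikes
counts_from_trains = {unit_id: len(train) for unit_id, train in cached_spike_trains.items()}

# Spike vector approach: a single np.unique pass over the whole vector
unit_indices, counts = np.unique(spike_vector["unit_index"], return_counts=True)
counts_from_vector = {unit_ids[idx]: n for idx, n in zip(unit_indices, counts)}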

@alejoe91 added the core (Changes to core module) and performance (Performance issues/improvements) labels on Nov 14, 2023
@DradeAW (Contributor, Author) commented Nov 14, 2023

The problem is that if you call get_unit_spike_train on one unit_id, the old version computes the spike trains of all units, which is very slow.

np.unique is very fast, even for very big vectors. I don't think a human can tell the difference!

@alejoe91 (Member)

Which "old" version? If they are all cached it should still be faster, but I agree that they might not be all cached and that unique is fast. So good for me ;)

@DradeAW (Contributor, Author) commented Nov 14, 2023

By "old" I meant current (not this PR)
Maybe instead of reversing we can check whether they are all cached or not.
But checking whether the cache exists is not sufficient.

@samuelgarcia (Member)

@DradeAW: I will build, on top of this PR, an option to return either a dict or a numpy.array when counting spikes.
In many situations the array will be easier (and maybe faster).
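
(Roughly, the difference between the two output formats, with made-up counts; the array form is assumed to be ordered like sorting.unit_ids.)

import numpy as np

unit_ids = ["u0", "u1", "u2"]
counts_array = np.array([120, 85, 240])  # hypothetical counts, ordered like unit_ids

# dict output: easy lookup by unit id
counts_dict = dict(zip(unit_ids, counts_array))

# array output: plays nicely with vectorized operations, e.g. selecting units
keep_mask = counts_array >= 100  # units with at least 100 spikes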

@samuelgarcia (Member)

See #2209, which is built on top of this one.

@DradeAW (Contributor, Author) commented Nov 15, 2023

Yep makes sense!

Another useful option would be to choose between returning counts per segment or the sum over all segments.
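
(Something like the sketch below, reusing the existing get_unit_spike_train segment_index argument on a sorting object like the one in the benchmark further down; the shapes of the return values are just a suggestion.)

# Per-segment counts: one {unit_id: count} dict per segment
counts_per_segment = [
    {unit_id: len(sorting.get_unit_spike_train(unit_id, segment_index=seg))
     for unit_id in sorting.unit_ids}
    for seg in range(sorting.get_num_segments())
]

# Sum over all segments
total_counts = {unit_id: sum(seg[unit_id] for seg in counts_per_segment)
                for unit_id in sorting.unit_ids}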

@h-mayorquin (Collaborator)

I agree with @alejoe91 that you should profile this.

> np.unique is very fast, even for very big vectors. I don't think a human can tell the difference!

Really, optimization should not be based on guesses or expected behavior.

@DradeAW (Contributor, Author) commented Nov 15, 2023

Here is a benchmark on a big sorting (a NumpyFolderSorting with 531 units, for a total of 11,298,565 spikes):

import time
import numpy as np
import spikeinterface.core as si

sorting = si.load_extractor("/mnt/raid0/data/MEArec/1h_3000cells/analyses/ks2_5_pj7-3/sorting")
N = 100
time_spike_trains = np.empty(N, dtype=np.float32)
time_spike_vector = np.empty(N, dtype=np.float32)

# Cache both the spike vector and all spike trains up front
sorting.to_spike_vector(use_cache=True)
for unit_id in sorting.unit_ids:
    sorting.get_unit_spike_train(unit_id, use_cache=True)

# Benchmark
for i in range(N):
    t1 = time.perf_counter()
    sorting.count_num_spikes_per_unit(method="spike_trains")  # I added the argument 'method' to benchmark
    t2 = time.perf_counter()
    sorting.count_num_spikes_per_unit(method="spike_vector")
    t3 = time.perf_counter()

    time_spike_trains[i] = t2 - t1
    time_spike_vector[i] = t3 - t2

print(f"Spike trains: {1e3 * np.mean(time_spike_trains):.1f} +- {1e3 * np.std(time_spike_trains):.1f} ms")
print(f"Spike vector: {1e3 * np.mean(time_spike_vector):.1f} +- {1e3 * np.std(time_spike_vector):.1f} ms")

Output:

Spike trains: 0.5 +- 0.1 ms
Spike vector: 263.8 +- 6.4 ms

So there is a clear difference, but both are clearly under a second.
Keep in mind, though, that for a fair comparison you should also add the time needed to check whether the spike train cache is complete.
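
(Such a check might look like the sketch below; the _cached_spike_trains attribute and its per-segment dict layout are an assumption about the internals, so the real name and structure may differ.)

def spike_train_cache_is_complete(sorting):
    # Assumes a per-segment dict of cached trains, i.e.
    # sorting._cached_spike_trains[segment_index][unit_id] (hypothetical layout)
    return all(
        unit_id in sorting._cached_spike_trains.get(segment_index, {})
        for segment_index in range(sorting.get_num_segments())
        for unit_id in sorting.unit_ids
    )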

@h-mayorquin (Collaborator)

The git diff made this very confusing to me.

You changed the method to use the spike_vector instead of the dictionary to count spikes when you have the cache. But you are showing us timings for when you have both the cached spike trains and the spike vector. What if you don't have the spike vector to begin with?

Also, why did you change the else branch?

@DradeAW (Contributor, Author) commented Nov 16, 2023

I didn't change the else branch, I just switched which case comes first.

The idea is that, right now, I don't know how to check whether all the spike trains have been computed, and the spike trains approach is only faster if you have all of them (as shown in my other PR, computing the spike trains can be very slow).

If you don't have the spike vector to begin with, then it defaults to looking at the spike trains, which is the correct behaviour.

@samuelgarcia (Member)

#2209, built on top of this one, now makes a better choice.

@DradeAW (Contributor, Author) commented Nov 17, 2023

Agreed, I'll close this PR :)

@DradeAW closed this on Nov 17, 2023
Labels: core (Changes to core module), performance (Performance issues/improvements)