Use multiple processors #7
Comments
How do you want to use multiple cores exactly? On Monday 27 April 2015, Dan Goodman [email protected] wrote:
|
The main one is in the E-step. We have a key loop which, for each cluster, involves iterating over all spikes. I use an OpenMP parallel for over this inner loop over spikes in the C++ version. I'd like to do the equivalent in the Python version. |
Maybe we can use this feature to implement a parallel for loop with Numba? |
Note to self: to do this in Cython using OpenMP, we don't have access to the keyword that makes a per-thread copy of a variable, but we can allocate the copies in a list/array and then access them using the thread index. |
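The per-thread storage idea in the note above can be sketched in plain Python. This is only an analogy, not the Cython/OpenMP code used in KK2: the function name `parallel_sum_of_squares`, the `num_threads` argument, and the use of `ThreadPoolExecutor` are all assumptions made for illustration. It shows the layout (one slot per worker, indexed by the worker number) rather than any real speed-up.

```python
from concurrent.futures import ThreadPoolExecutor

import numpy as np


def parallel_sum_of_squares(values, num_threads=4):
    """Sum values**2 using one private accumulator slot per worker.

    Hypothetical helper: it mimics the "array of per-thread variables"
    workaround, it is not taken from KlustaKwik.
    """
    values = np.asarray(values, dtype=np.float64)
    # One slot per worker stands in for an OpenMP thread-private variable.
    partial = np.zeros(num_threads, dtype=np.float64)
    # Split the index range into contiguous chunks, one per worker.
    bounds = np.linspace(0, len(values), num_threads + 1).astype(int)

    def work(thread_index):
        start, stop = bounds[thread_index], bounds[thread_index + 1]
        # Each worker writes only to its own slot, so no lock is needed.
        partial[thread_index] = np.sum(values[start:stop] ** 2)

    with ThreadPoolExecutor(max_workers=num_threads) as pool:
        # Consume the iterator so any worker exception is raised here.
        list(pool.map(work, range(num_threads)))
    # Combine the per-worker results once all workers have finished.
    return float(partial.sum())
```

Because each worker touches only `partial[thread_index]`, the final reduction is the only step that reads every slot, which is the same structure the Cython version would need.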
Do you think Numba will let us use multiple CPUs here? |
I think it can be done, but it might be simpler using Cython. I'm happy to switch to Numba, but since everything is in Cython at the moment I'll stick with that for now. The big advantage of Numba for me would be that I wouldn't have to type all the variables explicitly, and we could mix and match arrays with different dtypes (e.g. float32, float64, int16, int32, int64). This is possible in Cython but gets complicated when you have multiple arrays, each of which could have a different dtype. |
OK this is done for the E-step now and it works pretty well. I'll leave it open in case we want to do the M-step too, but the E-step is most of the work. |
Is it possible to set the number of threads that klustakwik will use? Right now it's using all of my physical and virtual CPUs; I'd like to be able to specify how many, if possible. I'm using it through phy and have OMP_NUM_THREADS=1 set. Thanks! |
I'll look into this. I created a new issue, #67, that you can follow if you want. |
OK I fixed this. It was indeed ignoring OMP_NUM_THREADS but it was by design (long story). I've added a new parameter |
Great. Just to make sure I understand: to use this, I add "num_cpus=12" to the klustakwik2 dictionary of my prm? |
Yes, if you have the latest version of KK2. On 15/07/2015 21:15, Chris Wilson wrote:
|
Note that others have reported a bug in phy where KK2 params were not properly taken into account. It should be fixed this week. |
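For reference, the setting confirmed above might look like the following in a prm file. This is a sketch that assumes the prm file is plain Python defining a `klustakwik2` dictionary; only the `num_cpus` key and the value 12 come from this thread, and any other keys a real file contains are omitted here.

```python
# Hypothetical excerpt of a prm parameter file (assumed to be plain Python).
klustakwik2 = {
    # Number of CPUs KK2 should use, as discussed in this thread.
    'num_cpus': 12,
}
```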
In the old KlustaKwik there was some benefit, though not a huge one, to using multiple cores, because the problem was memory-bandwidth limited. However, in KK2 the memory usage is reduced by orders of magnitude (especially for larger problems), so we might well see much better speed improvements from multiple processors.
There is a technical issue. As far as I know, Numba does not support multiple processors except through the vectorize decorator (and then only in the 'pro' version), which is not something we can use in KK2. I don't see any way around this. This might mean we have to stick to Cython.
@rossant any thoughts?