Optimizing job_kwargs / providing default job_kwargs #2232
Also something I have had in mind for a long time: we almost always do `n_jobs=-1` and `max_threads_per_process=1`. This is not optimal in many cases. Maybe `n_jobs=0.25` and `max_threads_per_process=4` would be more optimal. Something like : What do you think?
I think that would be great. I currently also need to change my n_jobs because my computer freaks out with -1, so I always just leave one core behind, but if there can be a smart/auto way to do it, that would be perfect. And it would fix both issues (with the added bonus that if the winter is chilly I can just set the optimizer to

As far as I can guess, #1063 also mentions running out of storage (which is a big problem since I can't currently easily switch drives), but I don't know if checking storage would go here or would need to be a different utility function.
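The fractional `n_jobs` idea above could be sketched roughly like this. This is a hypothetical helper (`resolve_n_jobs` is not an existing spikeinterface function), and counting physical rather than logical cores would additionally need `psutil`, as discussed further down:

```python
import os


def resolve_n_jobs(n_jobs):
    """Resolve an n_jobs spec to a concrete worker count.

    -1              -> all logical cores
    float in (0, 1) -> that fraction of the logical cores (at least 1)
    positive int    -> used as-is, capped at the logical core count
    """
    # os.cpu_count() returns *logical* cores; physical cores would
    # need psutil, e.g. psutil.cpu_count(logical=False).
    total = os.cpu_count() or 1
    if n_jobs == -1:
        return total
    if isinstance(n_jobs, float) and 0 < n_jobs < 1:
        return max(1, int(total * n_jobs))
    if isinstance(n_jobs, int) and n_jobs > 0:
        return min(n_jobs, total)
    raise ValueError(f"Invalid n_jobs: {n_jobs!r}")
```

With this scheme, `n_jobs=0.25` on an 8-core machine would resolve to 2 workers, leaving headroom for `max_threads_per_process > 1`.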
Just moving some of the conversation from #2029 here, some points that came up:

I am adding these to the discussion as this
spikeinterface/src/spikeinterface/preprocessing/phase_shift.py, lines 119 to 141 at cec72f4

clearly!
Github Issues
Here is a non-exhaustive list of issues that either were directly related to `job_kwargs` (`n_jobs` being the most common issue) or to the potential benefit of additional guardrails in spikeinterface. (I haven't directly linked any PRs for this section.)
#1063
#2026
#2029
#2217
#2202
#1922
#1845
Discussion Issues
To keep this issue manageable I'm only including two topics: how to optimize kwargs, and n_jobs specifically.
Optimizing kwargs
It has come up on other occasions (the Cambridge Neurotech talk, for example) that people are unsure how to optimize the kwargs themselves. For example, they know they can change `n_jobs` to a different number, but they don't know how to pick an appropriate one. Or how does `chunk_size` really affect things? Should the default help with small or big datasets, or do I need to set it based on my RAM, etc.? Part of this can be explained by documentation, but the fact that people are still asking means either 1) the docs are unclear or 2) that part of the docs is hard to find.
n_jobs
The default for this is `n_jobs=-1`, which means all available (logical) cores. As we began to discuss in #2218, it might be nice to change this default to something that gives the OS a little breathing room when doing multiprocessing. Heberto pointed out to me that both Intel and AMD do in fact have the logical-processor concept (I still need to test my Mac, but I think they do not). I'm not sure if that actually influences this or not. So if we set `n_jobs=0.9` as @alejoe91 suggested, it should still leave at least one logical processor for OS tasks, so I think it would be safer; but maybe it is better to leave a whole physical core free. That I'm not sure of. Unfortunately, `os` does not currently provide a way to distinguish logical from physical cores, so it would require adding `psutil` to core in order to check this, if the cutoff should be decided based on logical vs. physical cores.
progress_bar
This is very small, but tqdm is not working on Windows, similar to what was seen in #2122.
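Going back to the `chunk_size` / RAM question above, here is a rough sketch of how a RAM-aware default could be derived. This is hypothetical, not spikeinterface's actual logic; in practice `available_bytes` would come from something like `psutil.virtual_memory().available`, which is another reason `psutil` might be worth adding:

```python
def suggest_chunk_frames(available_bytes, num_channels, dtype_size, n_jobs,
                         memory_fraction=0.25):
    """Pick a per-chunk frame count so that n_jobs chunks held in memory
    at once together use at most `memory_fraction` of the available RAM.

    available_bytes : RAM currently available (e.g. psutil.virtual_memory().available)
    num_channels    : channels in the recording
    dtype_size      : bytes per sample (e.g. 2 for int16)
    n_jobs          : number of workers processing chunks concurrently
    """
    budget = available_bytes * memory_fraction
    bytes_per_frame = num_channels * dtype_size
    # Divide the budget across the concurrent workers; never go below 1 frame.
    return max(1, int(budget / (n_jobs * bytes_per_frame)))
```

For example, with 8 GiB available, 384 int16 channels, and 8 jobs, this suggests about 350k frames per chunk. The `memory_fraction` knob is the same kind of guardrail as the fractional `n_jobs` idea: leave headroom rather than consume everything.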